From: Pieter Praet <pieter@praet.org>
To: Samuel Wales <samologist@gmail.com>
Cc: Org Mode <emacs-orgmode@gnu.org>,
	Marcelo de Moraes Serpa <celoserpa@gmail.com>
Subject: Re: [OT] Scanning for archiving
Date: Sun, 06 Nov 2011 22:59:01 +0100
Message-ID: <87hb2ghpzu.fsf@praet.org>
In-Reply-To: <CAJcAo8v7jDramKw2kObT9p9ys8R=ytpHW4FejgScTW9KmF2WVA@mail.gmail.com>

On Sat, 5 Nov 2011 16:35:11 -0700, Samuel Wales <samologist@gmail.com> wrote:
> I used to find that 8-bit 75dpi was legible and small.
> 

True.

It all depends on why you're scanning them in the first place.

75dpi is fine when scanning with collaboration/quick-reference in mind,
but for archival/backup purposes (i.e. absolute peace of mind when your
whole collection of dead trees burns, drowns, or is simply disposed of)
or OCR, you'll want to go with 600dpi and beyond.

If using DjVu instead of PDF, the storage overhead will be negligible.
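
For a rough sense of the gap between the two resolutions, here is a
back-of-the-envelope sketch of the raw (uncompressed) size of one
8-bit page.  The A4 page size and the helper name raw_scan_bytes are
assumptions for illustration only; actual DjVu or PDF files will be
smaller, by an amount that depends on the content and the encoder.

```python
# Assumed page size: A4 (210 mm x 297 mm); the thread does not state one.
A4_MM = (210.0, 297.0)
MM_PER_INCH = 25.4


def raw_scan_bytes(dpi, bits_per_pixel=8, page_mm=A4_MM):
    """Return the uncompressed size in bytes of one scanned page."""
    width_px = round(page_mm[0] / MM_PER_INCH * dpi)
    height_px = round(page_mm[1] / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8


for dpi in (75, 600):
    size_mb = raw_scan_bytes(dpi) / 1e6
    print(f"{dpi:>3} dpi: {size_mb:.1f} MB uncompressed")
```

Going from 75dpi to 600dpi multiplies the pixel count by 8 in each
direction, so roughly 64 times overall, which is why the choice of
container format matters at archival resolutions.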

> What ADF scanners are out there for Linux that have high quality
> reliable ADF, [...]

I wish I knew...  If anyone on this list can think of a scanner whose
ADF doesn't require constant babysitting, I'm betting it won't have a
consumer-grade price tag.

> [...] are fast, [...]

Pretty much all of them, these days.

> and work well with CLI tools?
> 

As long as it's supported by SANE [1], rats are entirely optional.

> Is OCR at the point where it is feasible using CLI? [...]

Depends on how "fancy" the document layout is.  For most documents worth
scanning (let alone OCR'ing), it always has been.  Also see OCRopus [2].

> [...] Combining that
> with a new feature to have the Org agenda work with indexers (I
> participated in a discussion on that here a long while back) would be
> interesting.
> 

If you don't intend to create a perfect ASCII copy of the document, and
your index is restricted to word occurrence/frequency, OCR will do just fine.
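
As a minimal sketch of what such a word occurrence/frequency index
could look like, assuming the OCR output is already available as plain
text (the function name and the sample string are made up for
illustration, and this is not how any particular indexer works):

```python
import re
from collections import Counter


def word_frequencies(ocr_text):
    """Count lowercase alphabetic words in (possibly noisy) OCR output."""
    words = re.findall(r"[a-z]+", ocr_text.lower())
    return Counter(words)


# Hypothetical OCR output with one misrecognized word ("sc@nner").
sample = "Invoice 2011: scanner, Scanner; sc@nner and invoice."
freq = word_frequencies(sample)

for word, count in freq.most_common(3):
    print(word, count)
```

The misread word splits into junk fragments, but the correctly
recognized occurrences are still counted, which is all a frequency
index needs.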

> On 2011-11-05, Pieter Praet <pieter@praet.org> wrote:
> > NOTE: When attempting something like this, a fast scanner with a *reliable*
> > automatic document feeder will help prevent premature hair loss ;)
> 
> ...
> 
> > [1] http://djvu.org/resources/whatisdjvu.php
> > [2] http://gscan2pdf.sourceforge.net/


Peace

-- 
Pieter

[1] http://www.sane-project.org/sane-supported-devices.html
[2] http://code.google.com/p/ocropus/

