emacs-orgmode@gnu.org archives
From: Samuel Wales <samologist@gmail.com>
To: Karl Maihofer <ignoramus@gmx.de>
Cc: emacs-orgmode@gnu.org
Subject: Re: Searching inside of attachments (pdf, odt)?
Date: Tue, 13 Oct 2009 10:09:10 -0700
Message-ID: <20524da70910131009oe24948m2fcab864e2c4229a@mail.gmail.com>
In-Reply-To: <20091013100924.147106zsin75yyt0@webmail.df.eu>

Hi,

My idea is to keep it simple at first.  Everybody will come
up with great ways to integrate with his favorite IR tool.

Here I want to focus on the org interface.

The org interface can be the same as any other agenda
search, with all the same controls.  The back end can use
special-purpose textifiers like pdftotext (or whatever) or
general-purpose textifiers from IR tools.  Doesn't matter.
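
A minimal sketch of that split, in Python purely for
illustration (a real implementation would presumably live in
Emacs Lisp).  The names TEXTIFIERS and search_attachments are
hypothetical, and the PDF entry assumes poppler's pdftotext
is on PATH.  The search loop does not care which textifier
produced the text:

```python
import re
import subprocess
from pathlib import Path
from typing import Callable, Dict, Iterator, Tuple

Textifier = Callable[[Path], str]


def plain_text(path: Path) -> str:
    """Textifier for files that are already text."""
    return path.read_text(encoding="utf-8", errors="replace")


def pdf_text(path: Path) -> str:
    """Textifier for PDFs; assumes poppler's `pdftotext` is installed."""
    # "-" asks pdftotext to write the extracted text to stdout.
    result = subprocess.run(
        ["pdftotext", str(path), "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical registry: file extension -> textifier.
TEXTIFIERS: Dict[str, Textifier] = {
    ".txt": plain_text,
    ".org": plain_text,
    ".pdf": pdf_text,
}


def search_attachments(
    directory: Path,
    pattern: str,
    textifiers: Dict[str, Textifier] = TEXTIFIERS,
) -> Iterator[Tuple[Path, str]]:
    """Yield (file, matching line) for every textifiable attachment."""
    regexp = re.compile(pattern)
    for path in sorted(directory.iterdir()):
        textify = textifiers.get(path.suffix.lower())
        if textify is None or not path.is_file():
            continue  # no textifier registered for this kind of file
        for line in textify(path).splitlines():
            if regexp.search(line):
                yield path, line
```

Swapping in a general-purpose textifier from an IR tool would
only mean registering a different function; the interface
stays the same.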

Later, the mechanism can get fancier if desired.  But
first, we should match the existing search behavior.  I
often move things to attachments merely because they are
large.  I don't want search to work differently just because
I did that.  Search should IMO work the same as it does for
outline bodies.

This includes regexp syntax.  If we use anything other than
Emacs, we risk one regexp syntax for attachments and another
for outline bodies.  That makes me shudder.
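
To make the worry concrete: Emacs regexps write grouping and
alternation as \( \| \), whereas most other engines use bare
( | ).  The sketch below uses Python's re module only as a
stand-in for "anything other than Emacs"; the file name is
made up.  The same pattern string means something different
in each engine:

```python
import re

# In Emacs, this is a group matching "pdf" or "odt".
emacs_style = r"\(pdf\|odt\)"
# The equivalent pattern in Python's syntax.
python_style = r"(pdf|odt)"

# Python reads the Emacs-style backslashes as literal "(", "|", ")",
# so the pattern no longer matches a plain file name.
print(re.search(emacs_style, "report.pdf"))              # None
print(re.search(emacs_style, "(pdf|odt)") is not None)   # True
print(re.search(python_style, "report.pdf").group(0))    # pdf
```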

Later, we can use the fancier IR tools, or use reverse
indexes.  But not everybody has IR tools installed, and
reverse indexes might be premature optimization.

If you're worried about speed, this is a perfect, simple
application for caching.  I'd try it before concluding that
it is too slow.  If it is, we will have a good foundation
into which we can hook your favorite IR tool.
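
One way such a cache might look, sketched in Python for
illustration only.  The class name TextCache and the choice
to key on modification time plus size are assumptions, not
anything org provides.  The textifier runs again only when
the attachment has changed:

```python
from pathlib import Path
from typing import Callable, Dict, Tuple

Stamp = Tuple[int, int]  # (modification time in ns, size in bytes)


class TextCache:
    """Remember textified attachments until the file changes on disk."""

    def __init__(self, textify: Callable[[Path], str]) -> None:
        self._textify = textify
        self._entries: Dict[Path, Tuple[Stamp, str]] = {}
        self.misses = 0  # how often the (slow) textifier actually ran

    def text(self, path: Path) -> str:
        stat = path.stat()
        stamp = (stat.st_mtime_ns, stat.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == stamp:
            return cached[1]  # file unchanged: reuse the stored text
        self.misses += 1
        text = self._textify(path)
        self._entries[path] = (stamp, text)
        return text
```

Repeated searches over unchanged attachments then cost one
stat call per file instead of one textifier run.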

I don't think there's a downside to achieving compatibility
and full agenda integration first, then only after that
doing the fancy stuff.

Have you tried the agenda search feature yet?  If not,
perhaps trying it first will help ground the discussion.

Thread overview: 6+ messages
2009-10-12 13:40 Searching inside of attachments (pdf, odt)? Karl Maihofer
2009-10-12 22:59 ` Samuel Wales
2009-10-13  8:09   ` Karl Maihofer
2009-10-13 14:31     ` Tim O'Callaghan
2009-10-13 17:09     ` Samuel Wales [this message]
2009-10-14 16:47       ` Karl Maihofer