From: Erik Hetzner <egh@e6h.org>
To: Org Mode <emacs-orgmode@gnu.org>
Cc: John Kitchin <jkitchin@andrew.cmu.edu>
Subject: Re: John's amazing indexing posts
Date: Sun, 26 Jul 2015 22:16:57 -0700
Message-ID: <55b5bed5.4731460a.680f0.ffffb455@mx.google.com>
In-Reply-To: <m2d1zwcenw.fsf@andrew.cmu.edu>

Hi all,

I previously hooked up org with recoll with pretty good results. I’ve
written this up for worg, but I have my ssh key on a different
machine, so I can’t push now. Here is the info for the record.

** Recoll
To index org files with the [[http://www.lesbonscomptes.com/recoll/][recoll]] search engine, first map the org
file suffixes to a MIME type by adding the following to your
=~/.recoll/mimemap= file:

#+BEGIN_SRC
.org  = text/x-org
.org_archive  = text/x-org
#+END_SRC

You will also need a shell script that converts an org mode file to
HTML in batch mode. The script takes the file to convert as its only
argument and prints the HTML to stdout. Save it as an executable file
(the configuration below uses =/home/egh/.recoll/rclorg=). Here is an
example:

#+BEGIN_SRC sh
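#!/bin/sh
# Usage: rclorg FILE. The file name is passed to Emacs as an argument, not
# pasted into the Lisp string, so quotes in the name cannot break the command.
emacs --batch --eval "(progn (find-file (car command-line-args-left)) (org-html-export-as-html) (set-buffer \"*Org HTML Export*\") (princ (buffer-string)))" "$1"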
#+END_SRC
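
Before wiring the script into recoll, it may be worth checking by hand that it really prints HTML. The following is only a sketch: =check_filter= is a hypothetical helper name, the sample file is made up, and the script path is assumed to be =~/.recoll/rclorg=.

```sh
#!/bin/sh
# check_filter CONVERTER FILE
# Succeeds if CONVERTER, called with FILE as its only argument,
# prints something that looks like HTML on stdout.
check_filter() {
    "$1" "$2" 2>/dev/null | grep -qi '<html'
}

# Hypothetical sample document; any small org file will do.
sample_dir=$(mktemp -d)
printf '* Recoll test\nSome body text.\n' > "$sample_dir/sample.org"

# Assumed script location; adjust to wherever you saved the script.
if check_filter "$HOME/.recoll/rclorg" "$sample_dir/sample.org"; then
    echo "filter OK"
else
    echo "filter produced no HTML" >&2
fi
```

The same helper works for any command that takes a file name and writes to stdout, so it can also be pointed at a different converter.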

You will also need the following in your =~/.recoll/mimeconf=:

#+BEGIN_SRC
[index]
text/x-org = exec /home/egh/.recoll/rclorg ; mimetype = text/html
#+END_SRC

Now rebuild your recoll index. Org mode files should be converted to
HTML and indexed. Indexing will take some time, because a new Emacs
process is launched for every file. A faster alternative is to let
[[http://pandoc.org][pandoc]] do the conversion. It can be configured as follows in your
=~/.recoll/mimeconf= file:

#+BEGIN_SRC
[index]
text/x-org = exec pandoc -s -f org -t html5 ; mimetype = text/html
#+END_SRC
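
If the same configuration is shared between machines that may or may not have pandoc, the two approaches can be combined in one filter script. This is a sketch under assumptions, not something I have run against recoll: =org_to_html= is a hypothetical function name, and the Emacs branch is the batch export from the script above with the file name passed as an argument.

```sh
#!/bin/sh
# org_to_html FILE: print FILE as HTML on stdout.
# Prefers pandoc (quick to start); falls back to a batch Emacs export.
org_to_html() {
    if command -v pandoc >/dev/null 2>&1; then
        pandoc -s -f org -t html5 "$1"
    else
        emacs --batch --eval "(progn (find-file (car command-line-args-left)) (org-html-export-as-html) (set-buffer \"*Org HTML Export*\") (princ (buffer-string)))" "$1"
    fi
}

# When saved as a script, convert the file named on the command line.
if [ "$#" -gt 0 ]; then
    org_to_html "$1"
fi
```

Saved as the =rclorg= script, this would be used with the first =mimeconf= entry rather than the pandoc one.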

Optionally, you can adjust the pandoc template or the org mode HTML
export so that the output contains =meta= tags that recoll
recognizes. See
http://www.lesbonscomptes.com/recoll/usermanual/RCL.PROGRAM.html#RCL.PROGRAM.FILTERS.HTML
for details.
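
With either filter configured, the rebuild step mentioned above could look roughly like this. It is a sketch: =reindex_and_search= is a hypothetical helper name, and it assumes the =recollindex= and =recollq= command-line tools from your recoll installation are on your =PATH=.

```sh
#!/bin/sh
# reindex_and_search TERM
# Rebuilds the recoll index from scratch, then lists org files matching TERM.
reindex_and_search() {
    if ! command -v recollindex >/dev/null 2>&1; then
        echo "recollindex not found in PATH" >&2
        return 1
    fi
    recollindex -z || return 1    # -z: reset the index before indexing
    recollq "ext:org $1"          # ext: restricts results by file extension
}
```

For example, =reindex_and_search agenda= would rebuild the index and then search the indexed org files for "agenda". A full rebuild is only needed after changing the filter configuration; later runs can use plain =recollindex=.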

On Mon, 13 Jul 2015 07:31:31 -0700,
John Kitchin <jkitchin@andrew.cmu.edu> wrote:
> 
> Thanks Matt,
> 
> That is also my impression of where this will go. Eventually this will
> move towards a database search engine, e.g. like Oleg's project at
> https://github.com/wvxvw/sphinx-mode. I am not sure precisely which
> direction though. Swish-e is nice, but at the moment you cannot
> incrementally update the database, and full indexing is required every
> time. I am not sure that is fixable, and swish-e does not do
> unicode. There are half a dozen or so candidates to go forward on, and
> they all have some pros and cons to think about.
> 
> It has a lot of other applications in org too, e.g. a file-system wide
> agenda, tag search, etc...
> 
> 
> Matt Price writes:
> 
> > Not sure if everyone has seen John's latest post about indexing org files
> > with swish-e:
> >
> > http://kitchingroup.cheme.cmu.edu/blog/2015/07/06/Indexing-headlines-in-org-files-with-swish-e-with-laser-sharp-results/
> >
> > It's very impressive.  It strikes me as a step towards an incredibly
> > ambitious project that would bring file indexing inside of Emacs -- so it
> > would no longer be necessary to go out to a shell or a Desktop Search tool
> > in order to find files that contain particular search terms.  I'm looking
> > forward to your next steps, John!
> >
> > Matt
> 
> --
> Professor John Kitchin
> Doherty Hall A207F
> Department of Chemical Engineering
> Carnegie Mellon University
> Pittsburgh, PA 15213
> 412-268-7803
> @johnkitchin
> http://kitchingroup.cheme.cmu.edu
> 
> 
