From mboxrd@z Thu Jan 1 00:00:00 1970
From: John Kitchin
Subject: Re: Anyone use 3rd party search tools w/org-mode?
Date: Wed, 06 Nov 2019 15:09:30 -0500
References: <87wocduh6o.fsf@gmail.com> <878sosdgfq.fsf@ericabrahamsen.net>
In-reply-to: <878sosdgfq.fsf@ericabrahamsen.net>
List-Id: "General discussions about Org-mode."
To: emacs-orgmode@gnu.org

The way I got Swish to index org files was to create a script that
generated an xml file
(https://kitchingroup.cheme.cmu.edu/blog/2015/07/06/Indexing-headlines-in-org-files-with-swish-e-with-laser-sharp-results/)
or html
(http://kitchingroup.cheme.cmu.edu/blog/2015/07/03/Using-swish-e-to-index-org-files-as-html/)
that it could index. This is probably a general strategy for these tools.

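A minimal sketch of that general strategy in Python, for illustration only. It is not the script from the posts linked above, and the element names (org-file, headline, title, todo, tag, property) are made up here rather than taken from any schema an indexer requires. It handles only headlines, the TODO/DONE keywords, tags, and property drawers; body text and timestamps are ignored.

```python
import re
import xml.etree.ElementTree as ET

# Headline: stars, optional TODO/DONE keyword, title, optional :tag1:tag2:
HEADLINE = re.compile(
    r"^(\*+)\s+(?:(TODO|DONE)\s+)?(.*?)(?:\s+:([\w@#%:]+):)?\s*$"
)
# Property drawer entry, e.g. ":CATEGORY: work"
PROPERTY = re.compile(r"^\s*:([\w-]+):\s+(.*\S)\s*$")


def org_to_xml(text, filename="notes.org"):
    """Convert Org text to a flat XML string with one element per headline.

    The XML layout is hypothetical; adapt it to whatever fields the
    indexer is configured to read.
    """
    root = ET.Element("org-file", {"name": filename})
    current = None      # headline element currently being filled in
    in_drawer = False   # True while inside a :PROPERTIES: drawer

    for line in text.splitlines():
        m = HEADLINE.match(line)
        if m:
            stars, todo, title, tags = m.groups()
            current = ET.SubElement(
                root, "headline", {"level": str(len(stars))}
            )
            ET.SubElement(current, "title").text = title
            if todo:
                ET.SubElement(current, "todo").text = todo
            for tag in (tags or "").split(":"):
                if tag:
                    ET.SubElement(current, "tag").text = tag
            in_drawer = False
            continue

        if current is None:
            continue  # text before the first headline is skipped

        stripped = line.strip()
        if stripped == ":PROPERTIES:":
            in_drawer = True
        elif stripped == ":END:":
            in_drawer = False
        elif in_drawer:
            p = PROPERTY.match(line)
            if p:
                prop = ET.SubElement(
                    current, "property", {"name": p.group(1)}
                )
                prop.text = p.group(2)

    return ET.tostring(root, encoding="unicode")
```

The resulting string can be written to a file next to the Org source, and the indexer pointed at the generated files, with tags and properties mapped to whatever searchable fields it supports.
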
Eric Abrahamsen writes:

> Roland Everaert writes:
>
>> Hello all,
>>
>> I am interested in a search/indexing engine targeting the org format,
>> too.
>>
>> My interest comes from the fact that I have a growing number of org
>> files and as org-mode has no file archiving feature, AFAIK, searching
>> needs more and more time to complete.
>>
>> Moving files, that are no more necessary, outside of my org-directories,
>> can be tedious and prone to moving the wrong file to the wrong location.
>>
>> Hence, an indexer could comes in handy, especially if it is optimised
>> for the Org format (i.e.: it knows what are categories, tags,
>> properties, etc in an Org file).
>
> I think this last point is key. Most full-text search engines provide
> config options for defining fields, or "facets", which in theory we
> could set up to parse tags/properties/timestamps. My guess is that any
> of the major contenders (solr, xapian, lucene) would work pretty much as
> well as any of the others -- for our purposes, they probably only differ
> in the details. Xapian might be considered "in the family" from a
> license standpoint, but I don't know that that matters too much.
>
> It would be fun to provide an Org indexing config for one of these
> engines, and then build the Agenda on top of it.

--
Professor John Kitchin
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu