From mboxrd@z Thu Jan 1 00:00:00 1970
From: Russell Adams
Subject: Re: Anyone use 3rd party search tools w/org-mode?
Date: Tue, 12 Nov 2019 14:01:31 +0100
Message-ID: <20191112130131.GA28797@volibear>
References: <87wocduh6o.fsf@gmail.com> <878sosdgfq.fsf@ericabrahamsen.net> <87h83f8vod.fsf@ericabrahamsen.net> <87eeyisea1.fsf@gmail.com> <20191108135147.GK27044@volibear> <87d0e2sb2n.fsf@gmail.com> <87k185s4ze.fsf@gmail.com>
In-Reply-To: <87k185s4ze.fsf@gmail.com>
List-Id: "General discussions about Org-mode."
To: emacs-orgmode@gnu.org

To further explain my setup, I have three libraries of files: Personal,
Technical, and Business. Personal is all personal data including Org
files, Technical is all whitepapers and vendor documentation, and
Business is Org projects and other business matters. Recoll is used to
search all of them.

In my shell profile I have a few functions to access each library, and
to file away new documents (e.g. I downloaded a whitepaper and just
want to slap it into a dated directory in the library).

#+BEGIN_EXAMPLE
# For recoll and library
_FileRecoll()  { DEST="$HOME/Library/$1/$(date +%Y/%m/%d)" ; mkdir -p "$DEST" ; mv -i "$2" "$DEST" ; }
FileTech()     { _FileRecoll "Technical" "$1" ; }
FilePersonal() { _FileRecoll "Personal"  "$1" ; }
FileBiz()      { _FileRecoll "Business"  "$1" ; }

recollt() { RECOLL_CONFDIR=~/Library/.recoll-Technical ~/scripts/recolltui.sh "$@" ; }
recollp() { RECOLL_CONFDIR=~/Library/.recoll-Personal  ~/scripts/recolltui.sh "$@" ; }
recollb() { RECOLL_CONFDIR=~/Library/.recoll-Business  ~/scripts/recolltui.sh "$@" ; }
#+END_EXAMPLE

I have a daily cronjob to index those directories:

#+BEGIN_EXAMPLE
# Recoll
00 2 * * * /usr/bin/recollindex -c ${HOME}/Library/.recoll-Personal/ >> "${HOME}/Library/.recoll-Personal/recollindex.log" 2>&1
00 3 * * * /usr/bin/recollindex -c ${HOME}/Library/.recoll-Technical/ >> "${HOME}/Library/.recoll-Technical/recollindex.log" 2>&1
00 4 * * * /usr/bin/recollindex -c ${HOME}/Library/.recoll-Business/ >> "${HOME}/Library/.recoll-Business/recollindex.log" 2>&1
#+END_EXAMPLE

Then I have a simple TUI shell script which wraps dialog around
recoll's CLI. This puts the filename in my clipboard for command line
pasting, and opens PDFs in Firefox.

#+BEGIN_EXAMPLE
#!/bin/sh
# ~/scripts/recolltui.sh

# requires recollq optional cli binary to be present from recoll package
# uses base64, xsel, and dialog

DB=$(mktemp)
MENU=$(mktemp)
trap 'rm -f -- "${DB}" "${MENU}"' INT TERM HUP EXIT

# Make sure to customize RECOLL_CONFDIR (e.g. ~/Library/.recoll-Technical) if needed

# query recoll, save the base64 output to $DB as 3 space separated columns: row #, title, url
recollq -e -F "title url" "$@" 2>/dev/null | nl > "$DB"

# copy header into menu
head -n 2 "$DB" | while read -r num rest ; do
    echo "= \"$rest\"" >> "$MENU"
done

# Convert results to dialog menu using row # and title + filename as list item
# skip first two lines of results, they are not base64
tail -n +3 "$DB" | while read -r num title url ; do
    echo "$num \"$(echo "$title" | base64 -w0 -d ) : $(basename "$(echo "$url" | base64 -w0 -d | sed 's,file://,,g')")\"" >> "$MENU"
done

# ask the user which results to view
SEL=$(dialog --menu "Search results" 0 0 0 --file "$MENU" --stdout)

# if a choice was made, open the url in firefox AND copy it to the clipboard
[ $? -eq 0 ] && {
    URL="$(awk "\$1 == $SEL {print \$3}" "$DB" | base64 -w0 -d)"
    echo "$URL" | sed 's,file://,,g' | xsel
    firefox "$URL"
}
#+END_EXAMPLE

I've often thought that the dialog script could easily be replaced by
an Emacs interface, but I haven't taken the time to try to write one.

I've found that recoll's indexing in Xapian is excellent. I can
frequently find my search terms in technical documentation very
rapidly. Its support for many file types makes it index well. I think
my most frequent formats are text (including Org), PDF, and DOC.

I used to have a "Scrapbook" extension in Firefox which would
instantly save a webpage being viewed into my Personal library.
Unfortunately that isn't supported on modern Firefox versions, so I
need to find a replacement for that functionality.

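For anyone adapting the TUI script above, its base64 handling can be
tried without a recoll index. This is only a minimal sketch: the
sample title and path are made up, and decode_field is a helper name
introduced here for illustration. It assumes, as the script does, that
recollq -F prints each requested field base64-encoded.

```shell
#!/bin/sh
# Hypothetical sample values; real ones would come from
# `recollq -F "title url"`, one base64 field per column.
title_b64=$(printf '%s' 'Storage Whitepaper' | base64)
url_b64=$(printf '%s' 'file:///tmp/Library/storage.pdf' | base64)

# decode_field: illustrative helper, decodes one base64 field
decode_field() {
    printf '%s' "$1" | base64 -d
}

title=$(decode_field "$title_b64")
# strip the file:// scheme so the result is a plain path
path=$(decode_field "$url_b64" | sed 's,^file://,,')

# same "title : filename" shape the script uses for menu entries
printf '%s : %s\n' "$title" "$(basename "$path")"
```

With the sample values this prints "Storage Whitepaper : storage.pdf".
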
On Tue, Nov 12, 2019 at 12:34:29PM +0100, Roland Everaert wrote:
> I had a quick look at the recoll and I notice that there is a python API
> to update/create index.
>
> Maybe something could be developed using the python package recently
> released by Karl Voit, to feed a recoll index with org data.
>
> Roland.

------------------------------------------------------------------
Russell Adams                            RLAdams@AdamsInfoServ.com

PGP Key ID:     0x1160DCB3           http://www.adamsinfoserv.com/

Fingerprint:    1723 D8CA 4280 1EC9 557F  66E8 1154 E018 1160 DCB3