From: Eric Schulte
Subject: Re: building tagcloud datastructure in elisp
Date: Wed, 12 Sep 2012 12:58:40 -0600
Message-ID: <87sjanrrtr.fsf@gmx.com>
In-Reply-To: (Marcelo de Moraes Serpa's message of "Wed, 12 Sep 2012 13:41:35 -0500")
List-Id: "General discussions about Org-mode."
To: Marcelo de Moraes Serpa
Cc: Org Mode

Marcelo de Moraes Serpa writes:

> Hi list,
>
> How hard would it be to parse a bunch of org files and build an elisp data
> structure (Hash?) that represents a tagcloud? All tags in all headlines and
> subtrees should be taken into account (for all org files that are parsed).
> Could I use org-element to help me parse this or is there a better way?
>
> I'm just learning the org API, and I've only done a bunch of elisp hacks,
> so any insight would be greatly appreciated!
>
> Thanks,
>
> - Marcelo.

My favorite method of getting word frequencies from text files is the
following. Sometimes it is easier to just treat Org-mode files as plain
text files rather than to use elisp.

# -*- shell-script -*-
many=20 # to print the 20 most popular words
cat org-file.org \
    |tr -cs A-Za-z '\n' \
    |tr A-Z a-z \
    |sort \
    |uniq -c \
    |sort -rn \
    |sed ${many}q \
    |sed 's/^ *//' \
    |sed 's/\([^ ]*\) \([^ ]*\)/\2:\1/' \
    |tr '\n' ' ' \
    |sed 's/ $/\n/'

Adapted from http://www.leancrew.com/all-this/2011/12/more-shell-less-egg/

Best,

-- 
Eric Schulte
http://cs.unm.edu/~eschulte
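
The pipeline above counts every word in the file, whereas the original question asked about tags. A minimal sketch of how the same sort/uniq idea might be narrowed to Org tags follows. It is not part of the original message: the function name `tag_counts` is hypothetical, and the sketch assumes tags appear at the end of a headline in the form `:tag1:tag2:` and consist of letters, digits, `_`, `@`, `#` and `%`. It does not handle inherited tags or `#+FILETAGS` lines.

```shell
#!/bin/sh
# Hypothetical helper (not from the original message): count the tags on
# Org headlines in the given files and print "tag:count", most frequent first.
# Assumption: tags close a headline as ":tag1:tag2:" preceded by whitespace.
tag_counts() {
    # Keep only the trailing tag block of each headline; -n plus the p flag
    # drops every line that is not a tagged headline.
    sed -nE 's/^\*+[[:space:]](.*[[:space:]])?(:[[:alnum:]_@#%:]+:)[[:space:]]*$/\2/p' "$@" \
        | tr ':' '\n' \
        | grep -v '^$' \
        | sort \
        | uniq -c \
        | sort -rn \
        | sed 's/^ *//' \
        | sed 's/\([^ ]*\) \([^ ]*\)/\2:\1/'
}
```

Called as `tag_counts *.org`, this prints one `tag:count` pair per line. The tail of the original pipeline (`sed ${many}q`, `tr '\n' ' '`) could be appended to limit the list or join it onto a single line.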