From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nicolas Goaziou
Subject: Re: Word-counting in a subtree
Date: Tue, 22 Aug 2017 11:24:03 +0200
Message-ID: <87pobo2bws.fsf@nicolasgoaziou.fr>
References: <1709997.Ids7SHHmVx@bl4ckspoons> <87valgu71m.fsf@t3610> <4070576.EzaRnAn6cA@bl4ckspoons>
In-Reply-To: <4070576.EzaRnAn6cA@bl4ckspoons> (Jacopo De Simoi's message of "Mon, 21 Aug 2017 17:34:39 -0400")
List-Id: "General discussions about Org-mode."
To: Jacopo De Simoi
Cc: Org Mode List

Hello,

Jacopo De Simoi writes:

> this has the drawback that it also counts words in the header line; except
> this, could work as a temporary solution, but I'd still like to cook up some
> tag integrated with org

What about something like this (untested):

  (require 'cl-lib)
  (require 'org-element)
  (require 'org-footnote)

  (defun my-count-words (&optional ignore-headings beg end)
    "Count words between BEG and END, defaulting to the accessible buffer.
  When IGNORE-HEADINGS is non-nil, skip text in headlines and inlinetasks."
    (save-restriction
      (narrow-to-region (or beg (point-min)) (or end (point-max)))
      (let ((count 0)
            (count-words-in-string
             (lambda (s) (length (split-string s "[^[:word:]]" t)))))
        (org-element-map (org-element-parse-buffer)
            '(clock code entity example-block fixed-width footnote-reference
                    inline-src-block latex-fragment latex-environment link
                    macro plain-text src-block timestamp verbatim)
          (lambda (datum)
            (pcase (org-element-type datum)
              (`plain-text
               (unless (and ignore-headings
                            (memq (org-element-type
                                   (org-element-property :parent datum))
                                  '(headline inlinetask)))
                 (cl-incf count (funcall count-words-in-string datum))))
              ;; Count words in contents.
              ((or `code `example-block `fixed-width `latex-environment
                   `latex-fragment `src-block `verbatim)
               (cl-incf count
                        (funcall count-words-in-string
                                 (org-element-property :value datum))))
              ;; Objects counting as a single word.
              ((or `entity `inline-src-block `macro `timestamp)
               (cl-incf count))
              (`link
               ;; Links with contents are handled recursively by
               ;; `org-element-map'.
               (unless (org-element-contents datum) (cl-incf count)))
              (`footnote-reference
               (pcase (org-footnote-get-definition
                       (org-element-property :label datum))
                 (`(,_ ,_ ,_ ,definition)
                  (cl-incf count
                           (funcall count-words-in-string definition)))))))
          ;; Do not blindly count footnote definitions and inline
          ;; references contents.  We want to limit ourselves to
          ;; definitions actually referenced in the part of the document
          ;; we're parsing.
          nil nil '(footnote-definition footnote-reference) t)
        ;; Return words count.
        count)))

Note that some parts are really arbitrary (e.g., how to count a src
block...).

Regards,

-- 
Nicolas Goaziou
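
Since the thread is about counting within a single subtree, here is a minimal sketch of one way to feed subtree bounds to `my-count-words'. The name `my-count-words-in-subtree' is hypothetical (it is not part of Org), and the sketch assumes `my-count-words' from above has already been evaluated:

```elisp
(require 'org)

;; Hypothetical helper, not part of Org.  It only computes the bounds of
;; the subtree at point and hands them to `my-count-words'.
(defun my-count-words-in-subtree (&optional ignore-headings)
  "Return the word count of the Org subtree at point.
With non-nil IGNORE-HEADINGS (a prefix argument interactively),
headline text is skipped.  Assumes `my-count-words' is defined."
  (interactive "P")
  (save-excursion
    ;; Jump to the heading owning point; signals an error when point
    ;; is before the first heading.
    (org-back-to-heading t)
    (let* ((beg (point))
           ;; `org-end-of-subtree' leaves point at the end of the subtree.
           (end (progn (org-end-of-subtree t) (point)))
           (count (my-count-words ignore-headings beg end)))
      (when (called-interactively-p 'interactive)
        (message "Subtree: %d words" count))
      count)))
```

With point anywhere inside a subtree, `M-x my-count-words-in-subtree' would report the count, and `C-u M-x my-count-words-in-subtree' would leave headline text out. Note that `[[:word:]]' follows the syntax table of the current buffer, so the exact totals can depend on the mode.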