From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paul Sexton
Subject: Re: Context-sensitive word count in org mode (elisp)
Date: Wed, 16 Feb 2011 23:28:14 +0000 (UTC)
To: emacs-orgmode@gnu.org
List-Id: "General discussions about Org-mode."

Thanks for all the suggestions. Here is version 2.

Improvements:
- ignores source code blocks
- ignores tags and TODO keywords in headings
- ignores footnotes by default (option to force counting them)
- skips any sections tagged as not for export
- option to count words in latex macro arguments (they are ignored by
  default)

I would still like to count hyperlink descriptions but am not sure how --
is there a function that fetches the description of the hyperlink at point?

Paul

-----------------------------------------------------------------------

(require 'cl)  ; provides `intersection' and `incf', used below

(defun org-word-count (beg end
                           &optional count-latex-macro-args?
                           count-footnotes?)
  "Report the number of words in the Org mode buffer or selected region.
Ignores:
- comments
- tables
- source code blocks (#+BEGIN_SRC ... #+END_SRC, and inline blocks)
- hyperlinks
- tags, priorities, and TODO keywords in headers
- sections tagged as 'not for export'.

The text of footnote definitions is ignored, unless the optional argument
COUNT-FOOTNOTES? is non-nil.

If the optional argument COUNT-LATEX-MACRO-ARGS? is non-nil, the word count
includes LaTeX macro arguments (the material between {curly braces}).
Otherwise, and by default, every LaTeX macro counts as 1 word regardless
of its arguments."
  (interactive "r")
  ;; With no active region, count the whole buffer.
  (unless mark-active
    (setf beg (point-min)
          end (point-max)))
  (let ((wc 0)
        (latex-macro-regexp "\\\\[A-Za-z]+\\(\\[[^]]*\\]\\|\\){\\([^}]*\\)}"))
    (save-excursion
      (goto-char beg)
      ;; Step over one word (plus trailing non-word characters) at a time.
      ;; The search is bounded by END and returns nil instead of signalling
      ;; an error when no further word is found.
      (while (and (< (point) end)
                  (re-search-forward "\\w+\\W*" end t))
        (cond
         ;; Ignore comments and tables.
         ((or (org-in-commented-line) (org-at-table-p))
          nil)
         ;; Ignore hyperlinks.
         ;; TODO need to count text of the link's description.
         ((looking-at org-any-link-re)
          (goto-char (match-end 0)))
         ;; Ignore source code blocks.
         ((org-in-regexps-block-p "^#\\+BEGIN_SRC\\W" "^#\\+END_SRC\\W")
          nil)
         ;; Ignore inline source blocks, counting them as 1 word.
         ;; (Adds 2, presumably because the word matched just before the
         ;; block is otherwise not counted in this branch.)
         ((save-excursion
            (backward-char)
            (looking-at org-babel-inline-src-block-regexp))
          (goto-char (match-end 0))
          (setf wc (+ 2 wc)))
         ;; Count latex macros as 1 word, ignoring their arguments.
         ;; (Adds 2 for the same presumed reason as above.)
         ((save-excursion
            (backward-char)
            (looking-at latex-macro-regexp))
          (goto-char (if count-latex-macro-args?
                         (match-beginning 2)
                       (match-end 0)))
          (setf wc (+ 2 wc)))
         ;; Ignore footnotes.
         ((and (not count-footnotes?)
               (or (org-footnote-at-definition-p)
                   (org-footnote-at-reference-p)))
          nil)
         (t
          (let ((contexts (org-context)))
            (cond
             ;; Ignore tags and TODO keywords, etc.
             ((or (assoc :todo-keyword contexts)
                  (assoc :priority contexts)
                  (assoc :keyword contexts)
                  (assoc :checkbox contexts))
              nil)
             ;; Ignore sections marked with tags that are
             ;; excluded from export.
             ((assoc :tags contexts)
              (if (intersection (org-get-tags-at) org-export-exclude-tags
                                :test 'equal)
                  (org-forward-same-level 1)
                nil))
             (t
              (incf wc))))))))
    (message "%d words in %s." wc
             (if mark-active "region" "buffer"))))
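
A short usage sketch, assuming the definition above has been evaluated and an Org buffer is current. The key binding C-c w is only an illustrative choice and does not come from this thread. Because the interactive spec is "r", the two optional arguments can only be supplied by calling the function from Lisp.

```elisp
;; Hypothetical key binding (C-c w is an arbitrary choice for illustration).
(add-hook 'org-mode-hook
          (lambda ()
            (local-set-key (kbd "C-c w") 'org-word-count)))

;; Count the whole buffer, including the text of footnotes.
(org-word-count (point-min) (point-max) nil t)

;; Count the whole buffer, including LaTeX macro arguments.
(org-word-count (point-min) (point-max) t)
```

Interactively, M-x org-word-count counts the active region, or the whole buffer when no region is active.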