emacs-orgmode@gnu.org archives
From: Nicolas Goaziou <mail@nicolasgoaziou.fr>
To: Jacopo De Simoi <wilderkde@gmail.com>
Cc: Org Mode List <emacs-orgmode@gnu.org>
Subject: Re: Word-counting in a subtree
Date: Tue, 22 Aug 2017 11:24:03 +0200
Message-ID: <87pobo2bws.fsf@nicolasgoaziou.fr>
In-Reply-To: <4070576.EzaRnAn6cA@bl4ckspoons> (Jacopo De Simoi's message of "Mon, 21 Aug 2017 17:34:39 -0400")

Hello,

Jacopo De Simoi <wilderkde@gmail.com> writes:

> this has the drawback that it also counts words in the header line; except 
> this, could work as a temporary solution, but I'd still like to cook up some 
> tag integrated with org

What about something like this (untested):

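  (require 'cl-lib)       ; for `cl-incf'
  (require 'org-element)

  (defun my-count-words (&optional ignore-headings beg end)
    "Count words between BEG and END, defaulting to the accessible buffer.
  When IGNORE-HEADINGS is non-nil, skip text in headlines and inlinetasks."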
    (save-restriction
      (narrow-to-region (or beg (point-min)) (or end (point-max)))
      (let ((count 0)
            (count-words-in-string
             (lambda (s) (length (split-string s "[^[:word:]]" t)))))
        (org-element-map
            (org-element-parse-buffer)
            '(clock code entity example-block fixed-width footnote-reference
                    inline-src-block latex-fragment latex-environment
                    link macro plain-text src-block timestamp verbatim)
          (lambda (datum)
            (pcase (org-element-type datum)
              (`plain-text
               (unless (and ignore-headings
                            (memq (org-element-type
                                   (org-element-property :parent datum))
                                  '(headline inlinetask)))
                 (cl-incf count (funcall count-words-in-string datum))))
              ;; Elements and objects whose value is counted word by word.
              ((or `code `example-block `fixed-width `latex-environment
                   `latex-fragment `src-block `verbatim)
               (cl-incf count (funcall count-words-in-string
                                       (org-element-property :value datum))))
              ;; Objects counting as a single word.
              ((or `entity `inline-src-block `macro `timestamp)
               (cl-incf count))
              (`link
               ;; Links with contents are handled recursively by
               ;; `org-element-map'.
               (unless (org-element-contents datum) (cl-incf count)))
              (`footnote-reference
               (pcase (org-footnote-get-definition
                       (org-element-property :label datum))
                 (`(,_ ,_ ,_ ,definition)
                  (cl-incf count (funcall count-words-in-string definition)))))))
          ;; Do not blindly count the contents of footnote definitions
          ;; and inline footnote references.  We want to limit ourselves
          ;; to definitions actually referenced in the part of the
          ;; document we're parsing.
          nil nil '(footnote-definition footnote-reference) t)
        ;; Return words count.
        count)))
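
The word-splitting helper used above can be tried on its own. This is only
a sketch: `my-count-words-in-string' is a name made up here for
illustration, and the result depends on the current buffer's syntax table,
since `[:word:]' matches characters with word syntax.

```elisp
;; Hypothetical standalone version of the lambda bound to
;; `count-words-in-string' in the function above.
(defun my-count-words-in-string (s)
  "Return the number of runs of word-constituent characters in string S."
  ;; Split on every non-word character and drop the empty strings.
  (length (split-string s "[^[:word:]]" t)))

(my-count-words-in-string "Hello, Org world!") ; => 3
```

Punctuation and whitespace act as separators here, so a hyphenated
compound counts as several words wherever the hyphen lacks word syntax.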

Note that some parts are really arbitrary (e.g., how to count a src
block...).
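
Since the original question was about a subtree, one possible way to call
the function is to pass it the subtree boundaries. The command below is an
untested sketch, not part of the function above: the name
`my-count-words-in-subtree' is made up, and it assumes `my-count-words' is
already defined as shown.

```elisp
(require 'org)

;; Hypothetical wrapper; assumes `my-count-words' is already defined.
(defun my-count-words-in-subtree (&optional ignore-headings)
  "Display the number of words in the Org subtree at point.
With a prefix argument, IGNORE-HEADINGS is non-nil and heading text
is not counted."
  (interactive "P")
  (save-excursion
    ;; Signals an error when point is before the first heading.
    (org-back-to-heading t)
    (let ((beg (point))
          ;; Move past the last line of the subtree, children included.
          (end (progn (org-end-of-subtree t t) (point))))
      (message "%d words in subtree"
               (my-count-words ignore-headings beg end)))))
```

With point anywhere inside a subtree, M-x my-count-words-in-subtree
reports the count in the echo area; C-u M-x my-count-words-in-subtree
leaves out the heading lines.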

Regards,

-- 
Nicolas Goaziou
