From mboxrd@z Thu Jan  1 00:00:00 1970
From: Sebastian Miele <sebastian.miele@gmail.com>
Subject: Re: Get the text of a node
Date: Wed, 23 Oct 2019 22:43:12 +0000
Message-ID: <878spbaxrz.fsf@gmail.com>
References: <877e4vj0zf.fsf@fastmail.fm>
Reply-To: sebastian.miele@gmail.com
Mime-Version: 1.0
Content-Type: text/plain
In-reply-to: <877e4vj0zf.fsf@fastmail.fm>
To: emacs-orgmode@gnu.org

Joost Kremers <joostkremers@fastmail.fm> writes:

> I was wondering if there's a way to programmatically get the text of a
> node in an Org buffer. Basically, I have a buffer that looks something
> like this:
>
> #+BEGIN_SRC org
> * Top header
> ** Subheader
>   :PROPERTIES:
>   :Custom_ID: some_id
>   :END:
>
>   Text starts here, possibly with additional subheaders
> #+END_SRC
>
> What I would like to extract is the text below "Subheader", but
> without the :PROPERTIES: block.
>
> I've looked at the org-element library, but I haven't been able to
> figure out how to use it to extract just the plain text.

You may not yet be aware of dev/org-element-api.org in Worg. It is a
very good introduction to, and systematic overview of, the Element API.
It is not mentioned at the top of org-element.el.

> I use the :Custom_ID: property to find the relevant subheading and I
> know I can use (org-back-to-heading) to get point to the Subheader
> containing the relevant :PROPERTIES: block. Obviously, I could then
> narrow the buffer to the subheader, use a text search to move point
> past the line containing :END: and then extract the text from there
> until (point-max).
>
> I'm just wondering if this may break in unexpected circumstances and
> whether there's a better way.

One robust way that I see is the following. The first two steps may be
optional; they only limit how much of the buffer
org-element-parse-buffer has to parse in the last step. They could also
be extended slightly to exclude any subheadings from that parse.

1. Call org-element-at-point on the heading. The resulting element has
:begin and :end properties. They contain the buffer positions of the
beginning of the headline and the end of everything that belongs to the
headline, including paragraphs and subheadings.

2. Call narrow-to-region on those positions.

3. Call org-element-parse-buffer.

See dev/org-element-api.org for what that returns and why that works.
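
The three steps above could be sketched roughly as follows. This is
only a minimal sketch, not tested against your setup: the function name
my/org-heading-body-text is made up for illustration, and picking the
heading's own section out of the parse tree and skipping the
property-drawer and planning elements is my assumption about what "just
the text" should mean for you.

#+BEGIN_SRC emacs-lisp
(require 'org)
(require 'org-element)
(require 'seq)

;; Hypothetical helper; the name is not part of Org.
(defun my/org-heading-body-text ()
  "Return the text directly under the heading at point.
Property drawers, planning lines and subheadings are left out."
  (save-excursion
    (save-restriction
      (org-back-to-heading t)
      ;; Step 1: the element at point is now the headline itself.
      (let ((heading (org-element-at-point)))
        ;; Step 2: restrict parsing to the headline and its subtree.
        (narrow-to-region (org-element-property :begin heading)
                          (org-element-property :end heading))
        ;; Step 3: parse the accessible part of the buffer.
        (let* ((tree (org-element-parse-buffer))
               ;; The first headline in the tree is the one we narrowed to.
               (headline (org-element-map tree 'headline #'identity nil t))
               ;; Its section holds everything before the first subheading.
               (section (seq-find
                         (lambda (child)
                           (eq (org-element-type child) 'section))
                         (org-element-contents headline)))
               ;; Assumption: only these two element types are unwanted.
               (body (seq-remove
                      (lambda (el)
                        (memq (org-element-type el)
                              '(property-drawer planning)))
                      (org-element-contents section))))
          ;; Positions in the parse tree are buffer positions, so the
          ;; original text can be copied straight out of the buffer.
          (mapconcat (lambda (el)
                       (buffer-substring-no-properties
                        (org-element-property :begin el)
                        (org-element-property :end el)))
                     body ""))))))
#+END_SRC

Call it with point anywhere inside the subheading's entry, for example
after you have located the entry via its :Custom_ID: property. The
result keeps the original indentation and line breaks, because it is
copied from the buffer rather than rebuilt from the parsed elements.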

Best wishes
Sebastian