From: Sebastian Miele
Subject: Re: Get the text of a node
Date: Wed, 23 Oct 2019 22:43:12 +0000
Message-ID: <878spbaxrz.fsf@gmail.com>
References: <877e4vj0zf.fsf@fastmail.fm>
In-reply-to: <877e4vj0zf.fsf@fastmail.fm>
To: emacs-orgmode@gnu.org

Joost Kremers writes:

> I was wondering if there's a way to programmatically get the text of a
> node in an Org buffer. Basically, I have a buffer that looks something
> like this:
>
> #+BEGIN_SRC org
> * Top header
> ** Subheader
> :PROPERTIES:
> :Custom_ID: some_id
> :END:
>
> Text starts here, possibly with additional subheaders
> #+END_SRC
>
> What I would like to extract is the text below "Subheader", but
> without the :PROPERTIES: block.
>
> I've looked at the org-element library, but I haven't been able to
> figure out how to use it to extract just the plain text.

You may not yet be aware of dev/org-element-api.org in Worg. It is a
very good introduction to, and systematic overview of, the element API.
It is not mentioned at the top of org-element.el.

> I use the :Custom_ID: property to find the relevant subheading and I
> know I can use (org-back-to-heading) to get point to the Subheader
> containing the relevant :PROPERTIES: block. Obviously, I could then
> narrow the buffer to the subheader, use a text search to move point
> past the line containing :END: and then extract the text from there
> until (point-max).
>
> I'm just wondering if this may break in unexpected circumstances and
> whether there's a better way.

One robust approach that I see is the following. The first two steps
may be optional. Alternatively, they could be extended slightly so that
org-element-parse-buffer in the last step does not even have to process
possible subheadings.

1. Call org-element-at-point on the heading. The resulting element has
   :begin and :end properties. They hold the buffer positions of the
   beginning of the headline and of the end of everything that belongs
   to the headline, including paragraphs and subheadings.

2. Call narrow-to-region on those positions.

3. Call org-element-parse-buffer. See dev/org-element-api.org for what
   that returns and why that works.

Best wishes
Sebastian
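
P.S. The three steps above can be sketched roughly as follows. This is
only a minimal, untested-across-versions sketch: the function name
my/org-heading-paragraphs is hypothetical, and collecting the contents
of every paragraph element with org-element-map is my assumption about
what "the text" should mean. Paragraphs under subheadings are included
as well, while the property drawer is skipped because it is parsed as
its own element type rather than as a paragraph.

```elisp
(require 'org)
(require 'org-element)

(defun my/org-heading-paragraphs ()
  "Return the paragraph text under the heading at point as one string.
Hypothetical helper, a sketch of the three steps only.  The property
drawer is not a paragraph element, so it is left out."
  (save-excursion
    (org-back-to-heading t)
    ;; Step 1: the headline element carries :begin and :end.
    (let* ((heading (org-element-at-point))
           (beg (org-element-property :begin heading))
           (end (org-element-property :end heading)))
      (save-restriction
        ;; Step 2: restrict parsing to this heading and its subtree.
        (narrow-to-region beg end)
        ;; Step 3: parse the narrowed buffer and walk the parse tree.
        (let ((tree (org-element-parse-buffer)))
          (mapconcat
           #'identity
           (org-element-map tree 'paragraph
             (lambda (paragraph)
               ;; Positions in the parse tree are buffer positions.
               (buffer-substring-no-properties
                (org-element-property :contents-begin paragraph)
                (org-element-property :contents-end paragraph))))
           ""))))))
```

With point anywhere inside the "Subheader" entry of the example buffer,
(my/org-heading-paragraphs) should return the "Text starts here ..."
paragraph without the :PROPERTIES: block. To leave out subheadings, one
could narrow only up to the first child headline instead of :end.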