emacs-orgmode@gnu.org archives
* Get the text of a node
@ 2019-10-23  8:54 Joost Kremers
  2019-10-23 17:01 ` Jeff Filipovits
  2019-10-23 22:43 ` Sebastian Miele
  0 siblings, 2 replies; 3+ messages in thread
From: Joost Kremers @ 2019-10-23  8:54 UTC
  To: emacs-orgmode

Hi all,

I was wondering if there's a way to programmatically get the text 
of a node in an Org buffer. Basically, I have a buffer that looks 
something like this:

#+BEGIN_SRC org
* Top header
** Subheader
   :PROPERTIES:
   :Custom_ID: some_id
   :END:

   Text starts here, possibly with additional subheaders
#+END_SRC

What I would like to extract is the text below "Subheader", but 
without the :PROPERTIES: block.

I've looked at the org-element library, but I haven't been able to 
figure out how to use it to extract just the plain text.

I use the :Custom_ID: property to find the relevant subheading and 
I know I can use (org-back-to-heading) to get point to the 
Subheader containing the relevant :PROPERTIES: block. Obviously, I 
could then narrow the buffer to the subheader, use a text search 
to move point past the line containing :END: and then extract the 
text from there until (point-max).

I'm just wondering if this may break in unexpected circumstances 
and whether there's a better way.
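
For illustration, a minimal sketch of the narrowing approach described 
above. It is only one possible reading: the helper name 
my/org-node-text-at-point is made up, and instead of a text search for 
:END: it relies on org-end-of-meta-data, which skips the planning line 
and the property drawer and does nothing harmful when no drawer is 
present. A plain search for :END: could instead land on the end of some 
other drawer further down the subtree.

#+BEGIN_SRC emacs-lisp
(require 'org)

(defun my/org-node-text-at-point ()
  "Return the text of the subtree at point, without heading and metadata.
Subheadings are included.  Hypothetical helper, not part of Org."
  (save-excursion
    (save-restriction
      ;; Move to the heading that owns point and restrict to its subtree.
      (org-back-to-heading t)
      (org-narrow-to-subtree)
      ;; Skip the heading line, planning line, and property drawer.
      (org-end-of-meta-data)
      ;; Everything from here to the end of the subtree is the text.
      (buffer-substring-no-properties (point) (point-max)))))
#+END_SRC

The returned string keeps leading blank lines and indentation, so a 
caller may want to trim it.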

TIA

Joost



-- 
Joost Kremers
Life has its moments

* Re: Get the text of a node
  2019-10-23  8:54 Get the text of a node Joost Kremers
@ 2019-10-23 17:01 ` Jeff Filipovits
  2019-10-23 22:43 ` Sebastian Miele
  1 sibling, 0 replies; 3+ messages in thread
From: Jeff Filipovits @ 2019-10-23 17:01 UTC
  To: Joost Kremers; +Cc: emacs-orgmode

Sometimes giving a bad answer inspires someone else to give a better one,
so here goes:

It seems like the best way to get the contents programmatically is using
org-dp (https://github.com/tj64/org-dp). I don't see a way in org-element.

The data returned will include the property drawer of the heading. It does
not include subheadings.

I wrote a quick and ugly function to strip out the property drawer (it also
has to remove the properties list associated with the section element,
hence excluding :begin), and then returns a string.

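(defun get-contents (data)
  "Return DATA as a string, without the property drawer.
DATA is the data returned by (org-dp-contents)."
  (let ((contents)
        ;; Skip the property drawer element and the section's own
        ;; property list, which starts with :begin.
        (exclusions '(property-drawer :begin)))
    (dolist (element (cdar data))
      (unless (memq (car-safe element) exclusions)
        (push element contents)))
    ;; `push' collected the elements in reverse order; restore it.
    (org-element-interpret-data (reverse contents))))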


I am skeptical that this is a better way than the alternative you
described, but I do not know. Hopefully someone else can assist.

* Re: Get the text of a node
  2019-10-23  8:54 Get the text of a node Joost Kremers
  2019-10-23 17:01 ` Jeff Filipovits
@ 2019-10-23 22:43 ` Sebastian Miele
  1 sibling, 0 replies; 3+ messages in thread
From: Sebastian Miele @ 2019-10-23 22:43 UTC
  To: emacs-orgmode

Joost Kremers <joostkremers@fastmail.fm> writes:

> I was wondering if there's a way to programmatically get the text of a
> node in an Org buffer. Basically, I have a buffer that looks something
> like this:
>
> #+BEGIN_SRC org
> * Top header
> ** Subheader
>   :PROPERTIES:
>   :Custom_ID: some_id
>   :END:
>
>   Text starts here, possibly with additional subheaders
> #+END_SRC
>
> What I would like to extract is the text below "Subheader", but
> without the :PROPERTIES: block.
>
> I've looked at the org-element library, but I haven't been able to
> figure out how to use it to extract just the plain text.

You are probably not yet aware of dev/org-element-api.org in Worg. It
is a very good introduction to and systematic overview of the element
API. It is not mentioned at the top of org-element.el.

> I use the :Custom_ID: property to find the relevant subheading and I
> know I can use (org-back-to-heading) to get point to the Subheader
> containing the relevant :PROPERTIES: block. Obviously, I could then
> narrow the buffer to the subheader, use a text search to move point
> past the line containing :END: and then extract the text from there
> until (point-max).
>
> I'm just wondering if this may break in unexpected circumstances and
> whether there's a better way.

A robust way that I see is the following. The first two steps may be
optional. Or they could be expanded slightly in order to even exclude
possible subheadings from the work of org-element-parse-buffer in the
last step.

1. Call org-element-at-point on the heading. The resulting element has
:begin and :end properties. They contain the buffer positions of the
beginning of the headline and the end of everything that belongs to the
headline, including paragraphs and subheadings.

2. Call narrow-to-region on those positions.

3. Call org-element-parse-buffer.

See dev/org-element-api.org for what that returns and why that works.
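
The three steps above can be sketched roughly as follows. This is an
illustration under assumptions, not tested against every Org version:
the name my/org-node-body-via-parser is made up, and the last part,
which picks the heading's own section out of the parse tree and drops
the property drawer, goes beyond the three steps and is only one way to
use the result. Unlike a plain buffer-substring over the subtree, it
leaves out subheadings.

#+BEGIN_SRC emacs-lisp
(require 'org)
(require 'org-element)
(require 'cl-lib)

(defun my/org-node-body-via-parser ()
  "Return the body text of the heading at point, without its property drawer.
Subheadings are left out.  Hypothetical helper, not part of Org."
  (save-excursion
    (save-restriction
      (org-back-to-heading t)
      ;; Steps 1 and 2: get the extent of the subtree and narrow to it.
      (let ((headline (org-element-at-point)))
        (narrow-to-region (org-element-property :begin headline)
                          (org-element-property :end headline)))
      ;; Step 3: parse only the narrowed region.
      (let* ((tree (org-element-parse-buffer))
             (headline (car (org-element-contents tree)))
             ;; The heading's own text lives in its `section' child.
             (section (cl-find-if
                       (lambda (el) (eq (org-element-type el) 'section))
                       (org-element-contents headline)))
             ;; Drop metadata elements; keep paragraphs, lists, etc.
             (body (cl-remove-if
                    (lambda (el)
                      (memq (org-element-type el)
                            '(planning property-drawer)))
                    (and section (org-element-contents section)))))
        (if body
            ;; Copy the text between the first and last remaining element.
            (buffer-substring-no-properties
             (org-element-property :begin (car body))
             (org-element-property :end (car (last body))))
          "")))))
#+END_SRC

Copying by buffer position keeps the original indentation; passing the
remaining elements to org-element-interpret-data instead would
regenerate the text from the parse tree.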

Best wishes
Sebastian
