emacs-orgmode@gnu.org archives
From: Ihor Radchenko <yantar92@gmail.com>
To: "Przemysław Kamiński" <pk@intrepidus.pl>, emacs-orgmode@gnu.org
Subject: Re: official orgmode parser
Date: Wed, 16 Sep 2020 20:27:36 +0800
Message-ID: <87bli5nbyf.fsf@localhost>
In-Reply-To: <fb792b49-7387-7a43-640c-5e76b91b50b1@intrepidus.pl>

FYI: You may find https://github.com/ndwarshuis/org-ml helpful.


Przemysław Kamiński <pk@intrepidus.pl> writes:

> On 9/15/20 2:37 PM, tomas@tuxteam.de wrote:
>> On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:
>> 
>> [...]
>> 
>>> There's the org-json (or ox-json) package but for some reason I
>>> wasn't able to run it successfully. I guess export to S-exps would
>>> be best here. But yes I'll check that out.
>> 
>> If that's your route, perhaps the "Org element API" [1] might be
>> helpful. Especially `org-element-parse-buffer' gives you a Lisp
>> data structure which is supposed to be a parse of your Org buffer.
>> 
>> From there, getting to an S-expression can be trivial (e.g. `print'
>> or `pp'), depending on what you want to do.
>> 
>> Walking the structure should be nice in Lisp, too.
>> 
>> The topic of (non-Emacs) parsing of Org comes up regularly, and
>> there is a good (but AFAIK not-quite-complete) Org syntax spec
>> in Worg [2], but there are a couple of difficulties to be mastered
>> before such a thing can become really enjoyable and useful.
>> 
>> The loose specification of Org's format (arguably its second
>> or third strongest asset, the first two being its incredible
>> community and Emacs itself) is something which makes this
>> problem "interesting". People have invented lots of usages
>> which might be broken should Org change to a strict formal
>> spec. You don't want to break those people.
>> 
>> But yes, perhaps some day someone nails it. Perhaps it's you :)
>> 
>> Cheers
>> 
>> [1] https://orgmode.org/worg/dev/org-element-api.html
>> [2] https://orgmode.org/worg/dev/org-syntax.html
>> 
>>   - t
>> 
>
> So I looked at (pp (org-element-parse-buffer)); however, it prints out
> recursive structures which other Schemes have trouble parsing.
>
> My code looks more or less like this:
>
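> (defun org-parse (f)
>   "Parse Org file F and pretty-print its elements without :parent links."
>   (with-temp-buffer
>     ;; Read the file into the temp buffer; `find-file' would switch to
>     ;; (and leave open) a separate buffer visiting F.
>     (insert-file-contents f)
>     (org-mode)
>     (let* ((parsed (org-element-parse-buffer))
>            (all (append org-element-all-elements org-element-all-objects))
>            (mapped (org-element-map parsed all
>                      (lambda (item)
>                        (strip-parent item)))))
>       (pp mapped))))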
>
>
> strip-parent is basically (plist-put props :parent nil) applied to each
> element's properties. However, it turns out there are more recursive
> objects, like
>
> :title
>            #("Headline 1" 0 10
>              (:parent
>               (headline #2
>                             (section
>
> So I'm wondering: do I have to do it by hand for all cases, or is there
> some way to output only a simple AST without those nested objects?
>
> Best,
> Przemek
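
As for the :parent back-references inside :title, here is a minimal sketch
of one way to clear them using only the Org element API. It assumes that
:parent is the only back-reference in the tree, and the function name
my/org-parse-without-parents is hypothetical (it is not part of Org or
org-ml). The idea is to include `plain-text' in the types passed to
`org-element-map', so that the strings inside secondary values such as
:title are visited as well.

```elisp
(require 'org)
(require 'org-element)

(defun my/org-parse-without-parents (file)
  "Parse Org FILE and return its tree with every :parent link cleared.
Hypothetical helper for illustration, not part of Org."
  (with-temp-buffer
    (insert-file-contents file)
    (org-mode)
    (let ((tree (org-element-parse-buffer)))
      ;; Visit every element, object and plain-text string.  Because the
      ;; type list contains objects, `org-element-map' also walks
      ;; secondary strings such as a headline's :title.
      (org-element-map tree
          (append '(plain-text)
                  org-element-all-elements
                  org-element-all-objects)
        (lambda (item)
          ;; Works for elements, objects and strings alike; for strings
          ;; it overwrites the :parent text property.
          (org-element-put-property item :parent nil)
          ;; Return nil so that nothing is accumulated.
          nil)
        ;; INFO, FIRST-MATCH, NO-RECURSION are nil; WITH-AFFILIATED is t
        ;; so that affiliated keywords (e.g. CAPTION) are walked too.
        nil nil nil t)
      tree)))
```

Something like (pp (my/org-parse-without-parents "notes.org")) should then
print without the #N back-references shown above. Strings still print in
the #("..." ...) form because they keep a (now nil) :parent text property;
`substring-no-properties' can drop that if the consumer needs bare strings.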


