Subject: Re: official orgmode parser
To: emacs-orgmode@gnu.org
From: Przemysław Kamiński
Date: Wed, 16 Sep 2020 14:09:42 +0200
References: <68dc1ea1-52e8-7d9e-fb2d-bcf08c111eca@intrepidus.pl> <87d02n2yyr.fsf@gmail.com> <482cea5c-4214-57ac-dfeb-1e305180fee5@intrepidus.pl> <20200915095548.GP20869@maokai> <20200915123722.GA20532@tuxteam.de>
In-Reply-To: <20200915123722.GA20532@tuxteam.de>
List-Id: "General discussions about Org-mode."

On 9/15/20 2:37 PM, tomas@tuxteam.de wrote:
> On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:
>
> [...]
>
>> There's the org-json (or ox-json) package but for some reason I
>> wasn't able to run it successfully. I guess export to S-exps would
>> be best here. But yes I'll check that out.
>
> If that's your route, perhaps the "Org element API" [1] might be
> helpful. Especially `org-element-parse-buffer' gives you a Lisp
> data structure which is supposed to be a parse of your Org buffer.
>
> From there to S-expression can be trivial (e.g. `print' or `pp'),
> depending on what you want to do.
>
> Walking the structure should be nice in Lisp, too.
>
> The topic of (non-Emacs) parsing of Org comes up regularly, and
> there is a good (but AFAIK not-quite-complete) Org syntax spec
> in Worg [2], but there are a couple of difficulties to be mastered
> before such a thing can become really enjoyable and useful.
>
> The loose specification of Org's format (arguably its second
> or third strongest asset, the first two being its incredible
> community and Emacs itself) is something which makes this
> problem "interesting". People have invented lots of usages
> which might be broken should Org change to a strict formal
> spec. You don't want to break those people.
>
> But yes, perhaps some day someone nails it. Perhaps it's you :)
>
> Cheers
>
> [1] https://orgmode.org/worg/dev/org-element-api.html
> [2] https://orgmode.org/worg/dev/org-syntax.html
>
> - t
>

So I looked at (pp (org-element-parse-buffer)); however, it prints
recursive structures, which other Schemes have trouble parsing.

My code looks more or less like this:

    (defun org-parse (f)
      (with-temp-buffer
        (find-file f)
        (let* ((parsed (org-element-parse-buffer))
               (all (append org-element-all-elements
                            org-element-all-objects))
               (mapped (org-element-map parsed all
                         (lambda (item) (strip-parent item)))))
          (pp mapped))))

strip-parent is basically (plist-put props :parent nil) applied to each
element's properties.

However, it turns out there are more recursive objects, such as:

    :title #("Headline 1" 0 10 (:parent (headline #2 (section

So I'm wondering: do I have to handle all of these cases by hand, or is
there some way to output only a simple AST without those nested objects?

Best,
Przemek
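
For illustration, here is a minimal sketch of doing the stripping by hand
in a single recursive pass, rather than per element type. It is only a
sketch under stated assumptions: the names my/org-strip-parents and
my/org-file-to-sexp are hypothetical (not part of Org), and it assumes
nodes are plain lists of the form (TYPE PROPS CONTENTS...) whose only
back-references are :parent plist entries and :parent text properties on
strings. Org releases that represent nodes differently would need a
different walker.

```elisp
;; Hypothetical helper, not part of Org.
(defun my/org-strip-parents (data)
  "Return a copy of DATA with every :parent back-reference removed.
Assumes nodes are plain lists of the form (TYPE PROPS CONTENTS...)."
  (cond
   ;; Strings carry :parent as a text property; return a bare copy.
   ((stringp data) (substring-no-properties data))
   ((consp data)
    (let (result)
      (while (consp data)
        (if (eq (car data) :parent)
            ;; Drop the key and its value without descending into the
            ;; value; never following :parent keeps the walk finite.
            (setq data (cdr-safe (cdr data)))
          (push (my/org-strip-parents (pop data)) result)))
      ;; DATA is now the final cdr: nil for a proper list, an atom
      ;; for a dotted pair.
      (nconc (nreverse result) (my/org-strip-parents data))))
   ;; Symbols, numbers and anything else are returned unchanged.
   (t data)))

;; Hypothetical wrapper: parse FILE and return the stripped tree.
(defun my/org-file-to-sexp (file)
  "Parse FILE as Org and return its tree without :parent entries."
  (require 'org-element)
  (with-temp-buffer
    (insert-file-contents file)
    (org-mode)
    (my/org-strip-parents (org-element-parse-buffer))))
```

With this, (pp (my/org-file-to-sexp "notes.org")) would print the tree
with plain strings in place of the #("..." ...) forms; "notes.org" is a
placeholder file name. The result is a copy, so the original parse tree
is left untouched.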