From: Brett Viren
Subject: Re: Parsing Org-mode in Python
Date: Tue, 07 Jan 2014 11:09:35 -0500
To: news1142@Karl-Voit.at
Cc: emacs-orgmode@gnu.org
In-Reply-To: <2014-01-06T11-23-40@devnull.Karl-Voit.at> (Karl Voit's message of "Mon, 6 Jan 2014 11:44:40 +0100")

Hi Karl,

Karl Voit writes:

> Hi!
>
> * Daniel Clemente wrote:
>>>
>>> I dream of having a general Python parser for Org mode files, knowing
>>> every bit about the current syntax for Org files, surrounded by enough
>>> Python machinery to make it useful.
>
> Oh, this would be great since there are way more Python-coders out
> there as ELISP coders.

I agree.  I'm also (slowly) working toward some Python-based org
processing.  My strategy is to produce an intermediate file in JSON
format which is designed to capture the full org document structure.
I am calling this a "shunt" export, as it is meant to do as little
interpretation of the document as possible.

If this is interesting to you and you haven't already seen it, please
check the thread from December where I got a lot of help to output
this JSON via the new org export mechanism (I'm a LISP newbie).  Here
is the concluding post with a working example:

  http://permalink.gmane.org/gmane.emacs.orgmode/79838

Besides any eventual Python-side development, one remaining gap in my
plan is how to produce some kind of schema description using the org
exporter machinery.  I want to have this description generated
automatically so that any future changes to the org format can be
accommodated with some level of automation.

So, my current thinking is to find a way to exploit the org export
machinery to generate this schema (call it a "meta-shunt" export?).
If I can find one, I'll output the schema as another JSON file.

Then, on the Python side, I will read this schema file in and generate
instances of collections.namedtuple.  Finally, a reader of the JSON org
document will be developed to produce objects of these namedtuple
classes.  At the end of the day one will have a DOM-style data
structure representing the initial org document.

-Brett.
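
For illustration only, the Python side of the plan described above
might look roughly like the sketch below.  The schema layout (element
type mapped to a list of field names), the "type" key used to tag
nodes, and the sample document are all assumptions made up for this
example; they are not the actual output of the shunt export.

```python
import json
from collections import namedtuple

# Hypothetical schema file content: element type -> list of field names.
SCHEMA_JSON = """
{
  "headline": ["level", "title", "children"],
  "paragraph": ["text"]
}
"""

# Hypothetical org document in JSON form; each node carries a "type" tag.
DOC_JSON = """
{"type": "headline", "level": 1, "title": "Notes",
 "children": [{"type": "paragraph", "text": "Hello org"}]}
"""


def build_classes(schema):
    """Create one namedtuple class per element type named in the schema."""
    return {name: namedtuple(name, fields) for name, fields in schema.items()}


def build_node(data, classes):
    """Recursively turn decoded JSON into namedtuple instances."""
    if isinstance(data, list):
        return [build_node(item, classes) for item in data]
    if isinstance(data, dict) and "type" in data:
        cls = classes[data["type"]]
        # Fields missing from the JSON node are filled with None.
        values = {f: build_node(data.get(f), classes) for f in cls._fields}
        return cls(**values)
    return data  # plain strings, numbers, None


classes = build_classes(json.loads(SCHEMA_JSON))
doc = build_node(json.loads(DOC_JSON), classes)
print(doc.title, doc.children[0].text)
```

Since the classes are built from the schema at run time, a change in
the schema file changes the Python classes without hand editing.  Real
org element types such as plain-list contain hyphens and would first
have to be mapped to valid Python identifiers.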