From: Brett Viren
Subject: Re: Parsing Org-mode in Python
Date: Thu, 09 Jan 2014 09:13:39 -0500
To: Daniel Clemente
Cc: François Pinard, emacs-orgmode@gnu.org

Hi Daniel,

Daniel Clemente writes:

> Are there already Python parsers for it?

Parsing generic JSON is fairly trivial in Python.

    import json

    # Parse the JSON file into Python lists and dicts
    with open('file.json') as f:
        data = json.load(f)

The resulting "data" is then a bunch of Python lists and/or dicts
matching whatever structure was output from org and is in the .json
file. The schema in these three contexts (org, the .json file and
Python) is (will be) identical.

At this point, Pythonistas can do what they want with "data".

Although, as I mentioned, I'd like to put another layer on this "raw"
data structure which expresses/enforces the org schema as understood by
the org-exporter. If I can figure out how to dump a representation of
this schema from org I'll express it as a set of generated
collections.namedtuple instances. We'll see.

> Should ox-json's output be as raw as possible (e.g. what your code
> produces now) or transformed to simpler JSON?
> (I think both formats should coexist).

I suppose it may be useful to "winnow down" the structure. One thing
I'm thinking about here is the narrowing done to support the "blog
from anywhere" feature of Karl's lazyblorg mentioned in this thread.
That can be done either on the emacs side or the Python side (or both,
in principle).

However, my intention is to do as little modification of the org
document structure as possible on the emacs side, in order to preserve
details that may be interesting on the Python side in the future.
Also, I'm still learning LISP but know Python fairly well, so I would
rather do as much processing as possible on the Python side. :)

So far the only things I see that need to be stripped are the :parent
property and the :structure property (which really should be resolved
as a copy instead of being stripped). Both cause the emacs-side data
structure to become a Circular Object and thus break the emacs JSON
dumper.

I just noticed that Python's JSON dumper can do this kind of stripping
implicitly and in general. It might be nice if someone were to add
such a feature to the emacs JSON dumper but I don't plan to try this.

-Brett.
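
A minimal sketch of the namedtuple layer described above, for illustration only. It assumes each node arrives as a JSON list of the form [type, properties, child, ...]; that shape, the Node name and the to_node helper are hypothetical placeholders, not the exporter's actual schema.

```python
import json
from collections import namedtuple

# Hypothetical node shape: [type, properties, child, child, ...].
# The real layout depends on what the org exporter writes out.
Node = namedtuple("Node", ["type", "properties", "children"])


def to_node(raw):
    """Recursively wrap a raw JSON list in a Node; return other values unchanged."""
    if not isinstance(raw, list):
        return raw
    node_type, properties, *children = raw
    return Node(node_type, properties or {}, tuple(to_node(c) for c in children))


# Stand-in for the contents of a .json file produced from org.
text = '["headline", {":level": 1, ":raw-value": "Hello"}, ["paragraph", {}, "Some text"]]'
tree = to_node(json.loads(text))
print(tree.type, tree.properties[":raw-value"], tree.children[0].children[0])
```

Keeping the properties as a plain dict means nothing from the org side is dropped at this stage; any narrowing can happen later on the Python side.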