Hi!

I wrote a Python script that parses an Org-mode file in order to
generate a VCard 2.1 compatible output file I am using to import to
my Android 4.4 device:

https://github.com/novoid/org-contacts2vcard

The reason I wrote it in Python is that I don't know ELISP well
enough. The reason I wrote the script instead of using existing
export methods: I only want to export a small sub-set (names, phone
numbers, email addresses, contact image) due to privacy reasons.

So far, it is a one-direction approach and no synchronization
solution.

By the way: does somebody know of any somewhat intelligent tool that
is able to compare two different VCard files? The main issue here is
the fact that VCard order and property order within a single VCard
can be different but the VCard file could still contain the same
information. So line-by-line comparisons like diff do not work here.

--
mail|git|SVN|photos|postings|SMS|phonecalls|RSS|CSV|XML to Org-mode:
> get Memacs from https://github.com/novoid/Memacs <
https://github.com/novoid/extract_pdf_annotations_to_orgmode + more on github
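
One way to approach the comparison question above, as a rough sketch
rather than a finished tool: turn every card into an order-independent
value and compare sets. The function names (parse_vcards, same_contacts,
diff_contacts) are made up for this illustration, and the parser assumes
simple, unfolded "NAME:value" lines. Folded lines, quoted-printable
values and multi-line base64 photos would need extra handling.

```python
def parse_vcards(text):
    """Return a set of cards; each card is a frozenset of (NAME, value) pairs."""
    cards = set()
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        upper = line.upper()
        if upper == "BEGIN:VCARD":
            current = set()
        elif upper == "END:VCARD":
            if current is not None:
                cards.add(frozenset(current))
            current = None
        elif current is not None:
            # Property names are compared case-insensitively, values as-is.
            name, _sep, value = line.partition(":")
            current.add((name.upper(), value))
    return cards


def same_contacts(text_a, text_b):
    """True if both files hold the same cards, ignoring card and property order."""
    return parse_vcards(text_a) == parse_vcards(text_b)


def diff_contacts(text_a, text_b):
    """Return (only_in_a, only_in_b) as sets of cards."""
    a, b = parse_vcards(text_a), parse_vcards(text_b)
    return a - b, b - a
```

Because cards are stored in a set, exact duplicate cards within one file
collapse into one. That is a deliberate simplification here.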
On Friday 22 November 2013 17:37:01 Karl Voit wrote:
> The reason I wrote it in Python is that I don't know ELISP well
> enough. The reason I wrote the script instead of using existing
> export methods: I only want to export a small sub-set (names, phone
> numbers, email addresses, contact image) due to privacy reasons.
That should be possible with the existing VCard export. See
`org-contacts-ignore-property' to ignore specific properties. And
`org-contacts-export-as-vcard' takes a NAME parameter to limit the names.
Regards,
Rüdiger
* Rüdiger Sonderfeld <ruediger@c-plusplus.de> wrote:
> On Friday 22 November 2013 17:37:01 Karl Voit wrote:
>> The reason I wrote it in Python is that I don't know ELISP well
>> enough. The reason I wrote the script instead of using existing
>> export methods: I only want to export a small sub-set (names, phone
>> numbers, email addresses, contact image) due to privacy reasons.
>
> That should be possible with the existing VCard export. See
> `org-contacts-ignore-property' to ignore specific properties. And
> `org-contacts-export-as-vcard' takes a NAME parameter to limit the names.

Fair enough :-)

However, I did additional things like checks, filtering, and so
forth that were important to my data-set. E.g., my contact template
does contain "0043/" as a pre-filled content for phone numbers. I
wanted to ignore those fields that got only this template and not a
complete phone number. I also wanted to get warnings in case some
data does not fulfill certain other requirements.

I have to admit that I don't know the feature-set of the Org-mode
export. I would be very surprised, if the Org-mode export method is
able to follow my custom "photo:" link I am using, grab the image
file, test if it has an image format that works with VCard
2.1 on Android, and encode it in base64 accordingly.

You see: I want to have ways to tweak the export process. And as
long as I don't know ELISP that well, I stick to the tools I know.

A side remark of mine: a couple of months ago I tried to find out
how to store address information, phone numbers, and so on in
org-contact properties. AFAIR I could not find anything except the
:EMAIL: property. Is there a standard out there that answers
questions like "separate street from house number?", "how to cope
with multiple addresses for one contact?", and so forth? I created
something on my own as you can see on [1].

I am happy if you can benefit from my little project and I am
also happy when Org-mode offers a great export functionality for the
rest of us :-)

1. https://raw.github.com/novoid/org-contacts2vcard/master/testdata/testcontacts.org
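
The two checks described in this message (skipping phone fields that
hold only the "0043/" template, and testing a photo before encoding it in
base64) might look roughly like the sketch below. It is not taken from
org-contacts2vcard. The function names are invented, and treating JPEG as
the one accepted format is an assumption made for the example. The
line folding that some vCard 2.1 importers expect for base64 data is left
out.

```python
import base64

PHONE_TEMPLATE = "0043/"  # pre-filled value from the contact template


def usable_phones(values, template=PHONE_TEMPLATE):
    """Split raw phone fields into (kept, warnings)."""
    kept, warnings = [], []
    for value in values:
        number = value.strip()
        if not number or number == template:
            continue  # empty field or untouched template: skip silently
        if not any(ch.isdigit() for ch in number.replace(template, "", 1)):
            warnings.append("no digits after prefix: %r" % value)
            continue
        kept.append(number)
    return kept, warnings


def photo_line(image_bytes):
    """Return a PHOTO line for JPEG data, or None for any other format."""
    if not image_bytes.startswith(b"\xff\xd8\xff"):  # JPEG magic number
        return None
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "PHOTO;ENCODING=BASE64;TYPE=JPEG:" + encoded
```

Returning the warnings instead of printing them keeps the filter usable
from both a command-line script and a test.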
On Friday 22 November 2013 18:09:42 Karl Voit wrote:
> I have to admit that I don't know the feature-set of the Org-mode
> export. I would be very surprised, if the Org-mode export method is
> able to follow my custom "photo:" link I am using, grab the image
> file, test if it has an image format that works with VCard
> 2.1 on Android, and encode it in base64 accordingly.

Org-contacts has an :ICON: property and supports Gravatar. It doesn't seem to
be handled in the VCard export though.

> You see: I want to have ways to tweak the export process. And as
> long as I don't know ELISP that well, I stick to the tools I know.

I understand that and it solved your problem for now. But having an
external tool in a different programming language is usually not a good
idea to solve the problem in the long run. The code bases of
org-contacts and your tool are at risk of diverging quickly. If it's in
org-contacts then it is maintained in one piece and easily accessible to
other users.

So my point is you should take a look at elisp. It's a lot of fun to
use and if you are using org-mode and Emacs then you will have to learn
it sooner or later.

> A side remark of mine: a couple of months ago I tried to find out
> how to store address information, phone numbers, and so on in
> org-contact properties. AFAIR I could not find anything except the
> :EMAIL: property. Is there a standard out there that answers
> questions like "separate street from house number?", "how to cope
> with multiple addresses for one contact?", and so forth? I created
> something on my own as you can see on [1].

I have to admit the org-contacts format is pretty much ad hoc and not
really well designed. It is documented a bit in the file itself
(contrib/lisp/org-contacts.el). M-x customize-group RET org-contacts RET
should also tell you more about the options.

Your format choice is not fully compatible with the existing
org-contacts. Right now multiple entries are separated by space (which
sadly breaks for addresses) and different entry names are used.

However I'd look forward to some new ideas and improvements. Right now
it's not an ideal solution.

Regards,
Rüdiger
Karl Voit <devnull@Karl-Voit.at> writes:

> Hi!
>
> I wrote a Python script that parses an Org-mode file in order to
> generate a VCard 2.1 compatible output file I am using to import to
> my Android 4.4 device:
>
> https://github.com/novoid/org-contacts2vcard
>
> The reason I wrote it in Python is that I don't know ELISP well
> enough. The reason I wrote the script instead of using existing
> export methods: I only want to export a small sub-set (names, phone
> numbers, email addresses, contact image) due to privacy reasons.

The function below exports only name, phones and email:

#+begin_src
(defun org-contacts-vcard-format (contact)
  "Formats CONTACT in VCard 3.0 format."
  (let* ((properties (caddr contact))
         (name (org-contacts-vcard-escape (car contact)))
         (n (org-contacts-vcard-encode-name name))
         (email (cdr (assoc-string org-contacts-email-property properties)))
         (tel (cdr (assoc-string org-contacts-tel-property properties)))
         (ignore-list (cdr (assoc-string org-contacts-ignore-property properties)))
         (ignore-list (when ignore-list
                        (org-contacts-split-property ignore-list)))
         (head (format "BEGIN:VCARD\nVERSION:3.0\nN:%s\nFN:%s\n" n name))
         emails-list result phones-list)
    (concat head
            ;; One EMAIL line per address that is not on the ignore list.
            (when email
              (progn
                (setq emails-list (org-contacts-remove-ignored-property-values
                                   ignore-list (org-contacts-split-property email)))
                (setq result "")
                (while emails-list
                  (setq result (concat result "EMAIL:" (org-contacts-strip-link (car emails-list)) "\n"))
                  (setq emails-list (cdr emails-list)))
                result))
            ;; One TEL line per phone number that is not on the ignore list.
            (when tel
              (progn
                (setq phones-list (org-contacts-remove-ignored-property-values
                                   ignore-list (org-contacts-split-property tel)))
                (setq result "")
                (while phones-list
                  (setq result (concat result "TEL:" (org-link-unescape (org-contacts-strip-link (car phones-list))) "\n"))
                  (setq phones-list (cdr phones-list)))
                result))
            "END:VCARD\n\n")))
#+end_src

> So far, it is a one-direction approach and no synchronization
> solution.
>
> By the way: does somebody know of any somewhat intelligent tool that
> is able to compare two different VCard files? The main issue here is
> the fact that VCard order and property order within a single VCard
> can be different but the VCard file could still contain the same
> information. So line-by-line comparisons like diff do not work here.

This may be difficult. I use org-contacts and an elisp function to merge
all the contacts which have the same name, then export the contacts to a
vcard file.
Karl Voit <devnull@Karl-Voit.at> writes:

> * Rüdiger Sonderfeld <ruediger@c-plusplus.de> wrote:
>> On Friday 22 November 2013 17:37:01 Karl Voit wrote:
>>> The reason I wrote it in Python is that I don't know ELISP well
>>> enough. The reason I wrote the script instead of using existing
>>> export methods: I only want to export a small sub-set (names, phone
>>> numbers, email addresses, contact image) due to privacy reasons.
>>
>> That should be possible with the existing VCard export. See
>> `org-contacts-ignore-property' to ignore specific properties. And
>> `org-contacts-export-as-vcard' takes a NAME parameter to limit the names.
>
> Fair enough :-)
>
> However, I did additional things like checks, filtering, and so
> forth that were important to my data-set. E.g., my contact template
> does contain "0043/" as a pre-filled content for phone numbers. I
> wanted to ignore those fields that got only this template and not a
> complete phone number. I also wanted to get warnings in case some
> data does not fulfill certain other requirements.

Use

  (replace-regexp-in-string "^[0-9]\\{4,4\\}/" "" "0043/333/333")

#+begin_src
(defun org-contacts-vcard-format (contact)
  "Formats CONTACT in VCard 3.0 format."
  (let* ((properties (caddr contact))
         (name (org-contacts-vcard-escape (car contact)))
         (n (org-contacts-vcard-encode-name name))
         (email (cdr (assoc-string org-contacts-email-property properties)))
         (tel (cdr (assoc-string org-contacts-tel-property properties)))
         (ignore-list (cdr (assoc-string org-contacts-ignore-property properties)))
         (ignore-list (when ignore-list
                        (org-contacts-split-property ignore-list)))
         (note (cdr (assoc-string org-contacts-note-property properties)))
         (bday (org-contacts-vcard-escape (cdr (assoc-string org-contacts-birthday-property properties))))
         (addr (cdr (assoc-string org-contacts-address-property properties)))
         (nick (org-contacts-vcard-escape (cdr (assoc-string org-contacts-nickname-property properties))))
         (head (format "BEGIN:VCARD\nVERSION:3.0\nN:%s\nFN:%s\n" n name))
         emails-list result phones-list)
    (concat head
            (when email
              (progn
                (setq emails-list (org-contacts-remove-ignored-property-values
                                   ignore-list (org-contacts-split-property email)))
                (setq result "")
                (while emails-list
                  (setq result (concat result "EMAIL:" (org-contacts-strip-link (car emails-list)) "\n"))
                  (setq emails-list (cdr emails-list)))
                result))
            (when addr
              (format "ADR:;;%s\n" (replace-regexp-in-string "\\, ?" ";" addr)))
            (when tel
              (progn
                (setq phones-list (org-contacts-remove-ignored-property-values
                                   ignore-list (org-contacts-split-property tel)))
                (setq result "")
                (while phones-list
                  (setq result (concat result "TEL:"
                                       (replace-regexp-in-string
                                        "^[0-9]\\{4,4\\}/" ""
                                        (org-link-unescape (org-contacts-strip-link (car phones-list))))
                                       "\n"))
                  (setq phones-list (cdr phones-list)))
                result))
            (when bday
              (let ((cal-bday (calendar-gregorian-from-absolute (org-time-string-to-absolute bday))))
                (format "BDAY:%04d-%02d-%02d\n"
                        (calendar-extract-year cal-bday)
                        (calendar-extract-month cal-bday)
                        (calendar-extract-day cal-bday))))
            (when nick (format "NICKNAME:%s\n" nick))
            (when note (format "NOTE:%s\n" note))
            "END:VCARD\n\n")))
#+end_src

> I have to admit that I don't know the feature-set of the Org-mode
> export. I would be very surprised, if the Org-mode export method is
> able to follow my custom "photo:" link I am using, grab the image
> file, test if it has an image format that works with VCard
> 2.1 on Android, and encode it in base64 accordingly.
>
> You see: I want to have ways to tweak the export process. And as
> long as I don't know ELISP that well, I stick to the tools I know.
>
> A side remark of mine: a couple of months ago I tried to find out
> how to store address information, phone numbers, and so on in
> org-contact properties. AFAIR I could not find anything except the
> :EMAIL: property. Is there a standard out there that answers
> questions like "separate street from house number?", "how to cope
> with multiple addresses for one contact?", and so forth? I created
> something on my own as you can see on [1].
>
> I am happy if you can benefit from my little project and I am
> also happy when Org-mode offers a great export functionality for the
> rest of us :-)
>
> 1. https://raw.github.com/novoid/org-contacts2vcard/master/testdata/testcontacts.org
Executive summary of this rather long email: I am aware that ELISP
is the language of choice for Org-mode features/tools. Here, I
describe my motivation behind using Python instead.

* Rüdiger Sonderfeld <ruediger@c-plusplus.de> wrote:
> On Friday 22 November 2013 18:09:42 Karl Voit wrote:
>
> Org-contacts has an :ICON: property and supports Gravatar. It doesn't seem to
> be handled in the VCard export though.

:ICON:, I see. Thanks.

>> You see: I want to have ways to tweak the export process. And as
>> long as I don't know ELISP that well, I stick to the tools I know.
>
> I understand that and it solved your problem for now.

Exactly.

> But having an external tool in a different programming language is
> usually not a good idea to solve the problem in the long run. The
> code bases of org-contacts and your tool are at risk of
> diverging quickly. If it's in org-contacts then it is maintained
> in one piece and easily accessible to other users.

Don't worry: I totally agree. :-)

> So my point is you should take a look at elisp. It's a lot of fun
> to use and if you are using org-mode and Emacs then you will have
> to learn it sooner or later.

I tried but I could not make decent progress implementing the
features I want to use. It takes a rather high learning effort.

I am not only referring to ELISP as a language. The basics are not
that hard to learn. However, the more important part is to get into
the existing libraries and their feature-set.

For me, I could not get into it or I am not patient any more :-) It
might be laziness or my brain might not be compatible with the world
of functional programming languages.

Therefore, I develop all my Org-mode tools with Python which I am
comfortable with. I have done various things and put them on
http://github.com/novoid

I agree that implementing this stuff in ELISP would have been better
for the community. However, as long as I don't have an ELISP
code-monkey that implements my ideas and wishes, I have to stick to
Python, which works well for me and spares me a couple of
weeks/months of not being that productive.

You must not forget that I am not a programmer - I am an advanced
user who is tweaking his personal set-up in a small sub-set of his
spare time.

If the features of my tools are implemented in Org-mode as well, I
feel happy about it. I don't want to write "please add this highly
sophisticated feature to Org-mode"-messages on the ML and wait for
somebody to implement it. I can do it on my own (in Python) and I am
able to do it the way I need/want and I am able to *use* it right
away. Works for me.

Additionally, I would never be able to implement Memacs (see sig)
without the help of several students of mine. And here is the next
thing: I could get several students with Python-knowledge and no
one(!) with (E)LISP knowledge. Sad but true. I have the feeling that
ELISP knowledge is found only among a small set of experts.

Therefore: I did it in Python and I am aware that this is not the
best thing to do. However, if somebody finds my stuff handy, she/he
can grab it from github. If somebody re-implements it in ELISP, I am
fine as well.

It is even "worse" than that: I totally insist on writing a complete
stand-alone blog system which parses my Org-mode files and generates
(static) HTML5: https://github.com/novoid/lazyblorg

Bam! Worst case scenario! :-)

I tried to get other people infected with my thoughts [1] on an IMHO
perfect blog system. So far, it seems that everybody is happy with
the blog generating systems we do have now. If I stick to my
current development velocity of lazyblorg, it will be finished right
for the Christmas season ... of 2014 ;-)

> I have to admit the org-contacts format is pretty much ad hoc and
> not really well designed. It is documented a bit in the file
> itself (contrib/lisp/org-contacts.el). M-x customize-group RET
> org-contacts RET should also tell you more about the options.

Thanks for the pointer. However, I consider my template a bit more
elaborate since I want to distinguish between things like, e.g., mobile
phone, work phone, land-line phone, and so forth.

> Your format choice is not fully compatible with the existing
> org-contacts. Right now multiple entries are separated by space
> (which sadly breaks for addresses) and different entry names are
> used.
>
> However I'd look forward to some new ideas and improvements.
> Right now it's not an ideal solution.

I am glad to help here as well if my help is needed.

The current examples in org-contacts.el were not able to suit my
personal requirements. Therefore, I did my own definitions. In
future, I will derive my complete mailserver whitelist directly from
my Org-mode contacts and more.

1. http://article.gmane.org/gmane.emacs.orgmode/49747/
Karl Voit <devnull@Karl-Voit.at> writes:
> Therefore, I develop all my Org-mode tools with Python which I am
> comfortable with. I have done various things and put them on
> http://github.com/novoid
FWIW, I think it's good to develop tools for Org not only in Elisp but
also in other languages: Org is not just an Emacs module, it's also a
format, used outside Emacs. E.g. .org files on github.
--
Bastien
Bastien <bzg@gnu.org> writes:

> Karl Voit <devnull@Karl-Voit.at> writes:
>> Therefore, I develop all my Org-mode tools with Python which I am
>> comfortable with. I have done various things and put them on
>> http://github.com/novoid

> FWIW, I think it's good to develop tools for Org not only in Elisp but
> also in other languages: Org is not just an Emacs module, it's also a
> format, used outside Emacs. E.g. .org files on github.

I dream of having a general Python parser for Org mode files, knowing
every bit about the current syntax for Org files, surrounded by enough
Python machinery to make it useful.

One non-negligible problem is that such a tool, to be very complete,
would need an Emacs Lisp interpreter, which is quite an undertaking in
itself. Maybe some half-hearted compromise could be developed? A
hundredth-hearted compromise is likely the most I could do! :-)

François
> I dream of having a general Python parser for Org mode files, knowing
> every bit about the current syntax for Org files, surrounded by enough
> Python machinery to make it useful.

Try PyOrgMode (https://github.com/bjonnh/PyOrgMode), it works for some
files (but still needs corrections: it crashes with date formats, with
bold markers, etc.).

You don't need a Lisp interpreter written in Python, only Python code
that understands org syntax without getting confused.
Hi!

* Daniel Clemente <n142857@gmail.com> wrote:
>>
>> I dream of having a general Python parser for Org mode files, knowing
>> every bit about the current syntax for Org files, surrounded by enough
>> Python machinery to make it useful.

Oh, this would be great since there are way more Python coders out
there than ELISP coders.

> Try PyOrgMode (https://github.com/bjonnh/PyOrgMode), it works for
> some files (but still needs corrections: it crashes with date
> formats, with bold markers, etc.).

For the blogging system I am implementing [4], I did some
research on current Org parsers in Python. My notes about PyOrgMode
(2013-05) say that there is not much documentation on how to use it
properly and that its list of open todos still contains rather basic
things, so I did not consider it mature enough.

I consider my own Python parser [1] the most advanced
Python parser so far (unfortunately). However, I am completely aware of
its downsides:

- it's a very primitive line-by-line parser and not using any
  classical parsing tool at all (works for me so far!)
- it's currently limited to a few Org-mode elements so that I can
  continue to develop my blogging system
- more Org-mode elements (not all!) will be added when my blogging
  system gets stable enough to add Org-mode syntax features such
  as tables.
- it's not written with the premise to be a stand-alone Org-mode
  parser since I only need it for my blogging system
- feel free to use it and modify it to be a stand-alone parser

I do think that for a more general approach, somebody should develop
an Org-mode Python parser with classical parsing engines.

I do have some experience with ply [2]. Unfortunately, I have to say
that using ply feels a bit awkward in Python. I did not get the
impression that this is a parsing engine that is done the Python
way. A lot of things are done by convention (naming stuff, and so
on) which has certain limitations in details. And AFAIR there were
more things that puzzled me. However, it got my (simple) job [3]
done back then.

> You don't need a Lisp interpreter written in Python, only Python
> code that understands org syntax without getting confused.

I am no expert in this. I do feel that if you are going to use an
ELISP interpreter to parse Org-mode syntax for Python, this should
completely re-use the original Org-parser and nothing else. I have
no idea if this is possible or not.

If you have to implement a parser on your own, you probably should
stick to Python-only. In order to avoid confusion, your own Python
parser should implement only a very well defined and documented sub-set
of Org-mode syntax and accept/parse everything else as ordinary
text (content). IMHO. HTH.

1. https://github.com/novoid/lazyblorg/blob/master/lib/orgparser.py
2. http://www.dabeaz.com/ply/
3. https://github.com/novoid/2011-04-tagstore-formal-experiment/tree/master/analysis_and_derived_data/scripts
4. https://github.com/novoid/lazyblorg
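
The approach described in this message (a line-by-line parser that knows
a small, documented sub-set and treats everything else as ordinary text)
could look roughly like the sketch below. It is not the code from
lazyblorg's orgparser.py. The names and the returned structure are
invented for illustration, and only headings with an optional TODO/DONE
keyword and trailing tags are recognised.

```python
import re

# "* TODO Title :tag1:tag2:" -> stars, optional keyword, title, optional tags
HEADING = re.compile(
    r"^(?P<stars>\*+)\s+"
    r"(?:(?P<todo>TODO|DONE)\s+)?"
    r"(?P<title>.*?)"
    r"(?:\s+(?P<tags>:[\w@:]+:))?\s*$"
)


def parse_org(text):
    """Return (preamble_lines, headings); unknown lines become body text."""
    preamble = []
    entries = []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            tags = match.group("tags")
            entries.append({
                "level": len(match.group("stars")),
                "todo": match.group("todo"),
                "title": match.group("title"),
                "tags": tags.strip(":").split(":") if tags else [],
                "body": [],
            })
        elif entries:
            # Anything that is not a heading is kept verbatim as content.
            entries[-1]["body"].append(line)
        else:
            preamble.append(line)
    return preamble, entries
```

Tables, drawers, blocks and inline markup all end up in "body" unchanged,
which is the "everything else is ordinary text" rule from above.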
Daniel Clemente <n142857@gmail.com> writes:

>> I dream of having a general Python parser for Org mode files, knowing
>> every bit about the current syntax for Org files, surrounded by enough
>> Python machinery to make it useful.

> Try PyOrgMode (https://github.com/bjonnh/PyOrgMode), it works for some
> files (but still needs corrections: it crashes with date formats, with
> bold markers, etc.).

Hi, Daniel. As Karl points out (in a kind way), PyOrgMode is rather far
from "knowing every bit about the current syntax for Org files". My
feeling is that this effort should be restarted afresh.

> You don't need a Lisp interpreter written in Python, only Python code
> that understands org syntax without getting confused.

Well, I would prefer a Python-only solution, rather than requiring Emacs
and using it behind the scenes.

François
Karl Voit <devnull@Karl-Voit.at> writes:

> I did not get the impression that [ply] is a parsing engine that is
> done the Python way.

PLY has pros and cons. SPARK[1] always attracted me as being more
elegant. While it accepts a wider set of grammars than PLY, SPARK can
become quite slow on grammars which are less "natural" (admittedly a
very fuzzy, subjective term).

For simpler grammars, recursive descent does the job at good enough
speed, and often, grammars can be rearranged a bit so the lexer could
cleverly help the parser. Of course, it looks like more work writing a
recursive descent parser, yet many times in my experience, the
programmer is amply repaid with simplicity and clarity.

>> You don't need a Lisp interpreter written in Python, only Python
>> code that understands org syntax without getting confused.

> if you are going to use an ELISP interpreter to parse Org-mode syntax
> for Python, this should completely re-use the original Org-parser and
> nothing else. I have no idea if this is possible or not. If you have
> to implement a parser on your own, you probably should stick to
> Python-only.

Hey hey, it's fun! :-) You misunderstood me, but this is constructive
actually, as you raise good points.

In my dreams, a pure Python parser parses Org mode files. However, here
and there in the parsed files, as data, we can see bits of Emacs Lisp
code, or even Calc syntax at some places. That Emacs Lisp code could be
mere constants or identifiers, but sometimes more complex, evalable
S-expressions. A parser is probably of limited use if it does not come
with some extra tools covering the most frequent use cases around the
syntax, and I guess that pressure will develop to have some kind of
Emacs Lisp interpreter, hardly complete, probably only mild or even
ridiculous.

The interesting idea in your comments is that, *if* we had an Emacs Lisp
interpreter of serious quality, that interpreter could use "the original
Org-parser and nothing else". That would solve maintenance, as the
parser would be wholly external, to be found in the Org mode
distribution, all standard. But this avenue is quite unlikely: it looks
like a major undertaking to me, and while such a parser would be useful
on small data excerpts within an Org file, it might be inordinately slow
if it had to interpret a lot of Lisp code while deciphering big Org
files.

Worse, keeping a Python parser in sync with the true Emacs Lisp parser
would require much energy, maybe only once in a while, but extended over
a long period of time. Unless a great enthusiasm exists, spread among
many people, such projects are always doomed to fail. Not many people
are ready to commit themselves for life to the required maintenance.

François

---------------
[1] http://pages.cpsc.ucalgary.ca/~aycock/spark/
Hi Karl,

Karl Voit <devnull@Karl-Voit.at> writes:

> Hi!
>
> * Daniel Clemente <n142857@gmail.com> wrote:
>>>
>>> I dream of having a general Python parser for Org mode files, knowing
>>> every bit about the current syntax for Org files, surrounded by enough
>>> Python machinery to make it useful.
>
> Oh, this would be great since there are way more Python coders out
> there than ELISP coders.

I agree.

I'm also (slowly) working toward some Python-based org processing. My
strategy is to produce an intermediate file in JSON format which is
designed to capture the full org document structure. I am calling this
a "shunt" export as it is meant to do as little interpretation of the
document as possible.

If this is interesting to you and you haven't already seen it please
check the thread from December where I got a lot of help to output this
JSON via the new org export mechanism (I'm a LISP newbie). Here is the
concluding post with a working example:

http://permalink.gmane.org/gmane.emacs.orgmode/79838

Besides any eventual Python-side development, one remaining gap in my
plan is how to produce some kind of schema description using the org
exporter machinery. I want to have this description generated
automatically so that any future changes to the org format can be
accommodated with some level of automation.

So, my current thinking is to find a way to exploit org export machinery
to generate this schema (call it a "meta-shunt" export?). If I can find
that I'll output it as another JSON file. Then, on the Python-side, I
will read this schema file in and generate instances of
collections.namedtuple. Finally a reader of the JSON org document will
be developed to produce objects of these namedtuple classes.

At the end of the day one will have a DOM-style data structure
representing the initial org document.

-Brett.
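
To make the last step of this plan a little more concrete, here is a
minimal sketch of the Python side. Everything about the data layout is an
assumption made for the example: the schema is a hand-written dict, and
each node is taken to be a JSON object with "type", "properties" and
"children" keys. The real shunt output may well be shaped differently.

```python
from collections import namedtuple

# Hypothetical schema: node type -> property names. In the plan above this
# would be read from a generated JSON schema file instead.
SCHEMA = {
    "headline": ["level", "title"],
    "paragraph": [],
}


def build_classes(schema):
    """Create one namedtuple class per node type, each with a 'children' field."""
    return {
        node_type: namedtuple(node_type.replace("-", "_"), list(fields) + ["children"])
        for node_type, fields in schema.items()
    }


def to_node(data, classes):
    """Recursively convert decoded JSON into namedtuple instances."""
    if not isinstance(data, dict):
        return data  # plain text leaf
    cls = classes[data["type"]]
    props = data.get("properties", {})
    # All fields except the trailing 'children' come from the properties.
    values = [props.get(name) for name in cls._fields[:-1]]
    children = tuple(to_node(child, classes) for child in data.get("children", []))
    return cls(*values, children=children)
```

A node type that is missing from the schema raises KeyError here, which
is one simple way of having the schema "enforced" on the Python side.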
Brett Viren <bv@bnl.gov> writes:

> I'm also (slowly) working toward some Python-based org processing. My
> strategy is to produce an intermediate file in JSON format which is
> designed to capture the full org document structure. I am calling
> this a "shunt" export as it is meant to do as little interpretation of
> the document as possible.

Might be interesting, indeed!

> http://permalink.gmane.org/gmane.emacs.orgmode/79838

This yields:

,----
| Not Found
|
| The requested URL /gmane.emacs.orgmode/79838 was not found on this server.
`----

> At the end of the day one will have a DOM-style data structure
> representing the initial org document.

Keep me (us!) posted! :-)

François
François Pinard <pinard@iro.umontreal.ca> writes:

> Brett Viren <bv@bnl.gov> writes:
>
>> http://permalink.gmane.org/gmane.emacs.orgmode/79838
>
> This yields:
>
> ,----
> | Not Found
> |
> | The requested URL /gmane.emacs.orgmode/79838 was not found on this server.
> `----

Huh, maybe a transient failure? It's there for me right now. Here is
the same message from GNU's archive:

http://lists.gnu.org/archive/html/emacs-orgmode/2013-12/msg00415.html

In any case, here is the salient chunk:

#+BEGIN_SRC elisp
(require 'json)
(let* ((tree (org-element-parse-buffer 'object nil)))
  (org-element-map tree (append org-element-all-elements
                                org-element-all-objects '(plain-text))
    (lambda (x)
      (if (org-element-property :parent x)
          (org-element-put-property x :parent "none"))
      (if (org-element-property :structure x)
          (org-element-put-property x :structure "none"))))
  (write-region
   (json-encode tree)
   nil "foo.dat"))
#+END_SRC

This test is meant to run from inside an org-mode buffer which itself
provides the fodder for the test. But, it shows the steps that I'll
need to integrate into some new org export mechanism. The important
part is nulling out the :parent and :structure (and maybe others?)
properties in order to break their circular references. The heavy
lifting is all in org-element-parse-buffer and json-encode.

>> At the end of the day one will have a DOM-style data structure
>> representing the initial org document.
>
> Keep me (us!) posted! :-)

Definitely!

-Brett.
2014/1/8 Brett Viren <bv@bnl.gov>

> Huh, maybe a transient failure? It's there for me right now. Here is
> the same message from GNU's archive:
>
> http://lists.gnu.org/archive/html/emacs-orgmode/2013-12/msg00415.html

Got it, thanks! :-)

--
François Pinard   http://pinard.progiciels-bpi.ca
On Wed, 08 Jan 2014 10:42:17 -0500 Brett Viren wrote:
>
> http://lists.gnu.org/archive/html/emacs-orgmode/2013-12/msg00415.html
>
> In any case, here is the salient chunk:
>
> #+BEGIN_SRC elisp
> (require 'json)
> (let* ((tree (org-element-parse-buffer 'object nil)))
>   (org-element-map tree (append org-element-all-elements
>                                 org-element-all-objects '(plain-text))
>     (lambda (x)
>       (if (org-element-property :parent x)
>           (org-element-put-property x :parent "none"))
>       (if (org-element-property :structure x)
>           (org-element-put-property x :structure "none"))))
>   (write-region
>    (json-encode tree)
>    nil "foo.dat"))
> #+END_SRC
>
I like this very much. This output is much easier to parse than the source .org file, and it's still using the original Elisp parser (so you don't need a Python parser).
I hope ox-json.el gets into org-mode some day.
Are there already Python parsers for it?
Should ox-json's output be as raw as possible (e.g. what your code produces now) or transformed to simpler JSON?
(I think both formats should coexist).
Hi Daniel,

Daniel Clemente <n142857@gmail.com> writes:

> Are there already Python parsers for it?

Parsing generic JSON is fairly trivial in Python.

  import json
  with open('file.json') as f:
      data = json.load(f)

The resulting "data" is then a bunch of Python lists and/or dicts
matching whatever structure was output from org and is in the .json
file. The schema in these three contexts is (will be) identical.

At this point, Pythonistas can do what they want with "data".

Although, as I mentioned, I'd like to put another layer on this "raw"
data structure which expresses/enforces the org schema as understood by
the org-exporter. If I can figure out how to dump a representation of
this schema from org I'll express it as a set of generated
collections.namedtuple instances. We'll see.

> Should ox-json's output be as raw as possible (e.g. what your code
> produces now) or transformed to simpler JSON?
> (I think both formats should coexist).

I suppose it may be useful to "winnow down" the structure. One thing
I'm thinking about here is the narrowing done to support the "blog from
anywhere" feature of Karl's lazyblorg mentioned in this thread. That
can be done either on the emacs side or Python side (or both, in
principle).

However, my intention is to do as little modification of the org
document structure as possible on the emacs side in order to preserve
details that may possibly be interesting on the Python side in the
future. Also, I'm still learning LISP but know Python fairly well so
would rather do as much processing as possible on the Python side. :)

So far the only thing I see that needs to be stripped is the :parent
property (and the :structure, which really should be resolved as a copy
instead of being stripped) which cause the emacs-side data structure to
become a Circular Object and thus break the emacs JSON dumper. I just
noticed that Python's JSON dumper can do this kind of stripping
implicitly and in general. It might be nice if someone were to add such
a feature to the emacs JSON dumper but I don't plan to try this.

-Brett.
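
For what it's worth, here is a sketch of how back-references could be
removed on the Python side if they ever showed up there. As far as I
know, json.dumps with its default settings reports a circular structure
with a ValueError rather than dropping it, so the sketch strips the
offending keys explicitly before dumping. The key names "parent" and
"structure" and the dict layout are assumptions for this example, not
the actual shunt format.

```python
import json


def strip_keys(node, drop=("parent", "structure")):
    """Return a copy of a nested dict/list structure without the keys in DROP."""
    if isinstance(node, dict):
        # Dropped keys are filtered out before their values are visited,
        # so a back-reference stored under them is never followed.
        return {k: strip_keys(v, drop) for k, v in node.items() if k not in drop}
    if isinstance(node, list):
        return [strip_keys(item, drop) for item in node]
    return node


# Small circular structure: the child points back at its parent.
parent = {"type": "headline", "children": []}
child = {"type": "paragraph", "parent": parent, "children": ["text"]}
parent["children"].append(child)

clean_json = json.dumps(strip_keys(parent))
```

This only breaks cycles that go through the listed keys. A cycle reached
through any other key would still recurse without end.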