* official orgmode parser @ 2020-09-15 7:58 Przemysław Kamiński 2020-09-15 8:44 ` Gerry Agbobada ` (2 more replies) 0 siblings, 3 replies; 45+ messages in thread From: Przemysław Kamiński @ 2020-09-15 7:58 UTC (permalink / raw) To: emacs-orgmode Hello, I oftentimes find myself needing to parse org files with some external tools (to generate reports for customers or sum up clock times for given month, etc). Looking through the list https://orgmode.org/worg/org-tools/ and having tested some of these, I must say they are lacking. The Haskell ones seem to be done best, but then the compile overhead of Haskell and difficulty in embedding this into other languages is a drawback. I think it might benefit the community when such an official parser would exist (and maybe could be hooked into org mode directly). I was thinking picking some scheme like chicken or guile, which could be later easily embedded into C or whatever. Then use that parser in org mode itself. This way some important part of org mode would be outside of the small world of elisp. This is just an idea, what do you think? :) Best, Przemek ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 7:58 official orgmode parser Przemysław Kamiński @ 2020-09-15 8:44 ` Gerry Agbobada 2020-09-16 16:36 ` Matt Huszagh 2020-09-23 8:09 ` Bastien 2020-09-15 9:03 ` Tim Cross 2020-09-23 8:09 ` Bastien 2 siblings, 2 replies; 45+ messages in thread From: Gerry Agbobada @ 2020-09-15 8:44 UTC (permalink / raw) To: emacs-orgmode [-- Attachment #1: Type: text/plain, Size: 1686 bytes --] Hi, I'm currently toying with the idea of trying a tree-sitter parser for Org. The very static nature of a shared object parser (knowing TODO keywords are pretty dynamic for example) is a challenge I'm not sure to overcome ; to be honest even without that I can't say I'll manage to do it. Having a tree-sitter parser would be really great in my opinion, at least it's a clearer way to "freeze" the syntax with some tests describing the syntax tree with S-expressions. And tree-sitter seems to be the popular sought after solution to slowness in parsing (and incremental parsing of org files would help with big files in my opinion) On Tue, Sep 15, 2020, at 09:58, Przemysław Kamiński wrote: > Hello, > > I oftentimes find myself needing to parse org files with some external > tools (to generate reports for customers or sum up clock times for given > month, etc). Looking through the list > > https://orgmode.org/worg/org-tools/ > > and having tested some of these, I must say they are lacking. The > Haskell ones seem to be done best, but then the compile overhead of > Haskell and difficulty in embedding this into other languages is a drawback. > > I think it might benefit the community when such an official parser > would exist (and maybe could be hooked into org mode directly). > > I was thinking picking some scheme like chicken or guile, which could be > later easily embedded into C or whatever. Then use that parser in org > mode itself. This way some important part of org mode would be outside > of the small world of elisp. 
> > This is just an idea, what do you think? :) > > Best, > Przemek > > Gerry Agbobada [-- Attachment #2: Type: text/html, Size: 2369 bytes --] ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 8:44 ` Gerry Agbobada @ 2020-09-16 16:36 ` Matt Huszagh 2020-09-23 8:09 ` Bastien 1 sibling, 0 replies; 45+ messages in thread From: Matt Huszagh @ 2020-09-16 16:36 UTC (permalink / raw) To: Gerry Agbobada, emacs-orgmode "Gerry Agbobada" <emacs-orgmode@gagbo.net> writes: > I'm currently toying with the idea of trying a tree-sitter parser for Org. The very static nature of a shared object parser (knowing TODO keywords are pretty dynamic for example) is a challenge I'm not sure to overcome ; to be honest even without that I can't say I'll manage to do it. A tree-sitter parser for org would be great! Please keep this list posted on any developments you make on this front. I made some minimal attempts at this a while back, but didn't get very far. Matt ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 8:44 ` Gerry Agbobada 2020-09-16 16:36 ` Matt Huszagh @ 2020-09-23 8:09 ` Bastien 1 sibling, 0 replies; 45+ messages in thread From: Bastien @ 2020-09-23 8:09 UTC (permalink / raw) To: Gerry Agbobada; +Cc: emacs-orgmode Hi Gerry, "Gerry Agbobada" <emacs-orgmode@gagbo.net> writes: > Having a tree-sitter parser would be really great in my opinion 1+ Thanks for working on this, let us know how it goes! -- Bastien ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 7:58 official orgmode parser Przemysław Kamiński 2020-09-15 8:44 ` Gerry Agbobada @ 2020-09-15 9:03 ` Tim Cross 2020-09-15 9:17 ` Przemysław Kamiński 2020-09-23 8:09 ` Bastien 2 siblings, 1 reply; 45+ messages in thread From: Tim Cross @ 2020-09-15 9:03 UTC (permalink / raw) To: emacs-orgmode Przemysław Kamiński <pk@intrepidus.pl> writes: > Hello, > > I oftentimes find myself needing to parse org files with some external > tools (to generate reports for customers or sum up clock times for given > month, etc). Looking through the list > > https://orgmode.org/worg/org-tools/ > > and having tested some of these, I must say they are lacking. The > Haskell ones seem to be done best, but then the compile overhead of > Haskell and difficulty in embedding this into other languages is a drawback. > > I think it might benefit the community when such an official parser > would exist (and maybe could be hooked into org mode directly). > > I was thinking picking some scheme like chicken or guile, which could be > later easily embedded into C or whatever. Then use that parser in org > mode itself. This way some important part of org mode would be outside > of the small world of elisp. > > This is just an idea, what do you think? :) > The problem with this idea is maintenance. It is also partly why external tools are not terribly reliable/good. Org mode is constantly being enhanced and improved. It is very hard for external tools to keep pace with org-mode development, so they soon get out of date or stop working correctly. Org mode IS an elisp application. This is the main goal. The reason it works so well is because elisp is largely a DSL that focuses on text manipulation and is therefore ideally suited for a text based organiser. 
This means if you want to implement parsing of org files in any other language, there is a lot of fundamental functionality which will need to be implemented that is not necessary when using elisp as it is already built-in. Not only that, it is also 'battle hardened' and well tested. The other problem would be in selecting another language which behaves consistently across all the platforms Emacs and org-mode are supported on. As org-mode is a standard part of Emacs, it also needs to be implemented in something which is also available on all the platforms Emacs is on without needing the user to install additional software. The other issue is that you would need another skill in order to maintain/extend org-mode. In addition to elisp, you will also need to know whatever the parser implementation language is. A third negative is that if the parser was in a different language to elisp, the interface between the rest of org mode (in elisp) and the parser would become an issue. At the moment, there are far fewer barriers as it is all elisp. However, if part of the system is in another language, you are now restricted to whatever defined interface exists. This would likely also have performance issues and overheads associated with translating from one format to another etc. So, in short, the chances of org mode using a parser written in something other than elisp are pretty close to 0. This leaves you with 2 options - 1. Implement another external tool which can parse org-files. As mentioned above, this is a non-trivial task and will likely be difficult to maintain. Probably not the best first choice. 2. Provide some details about your workflow where you believe you need to use external tools to process the org-files. It is very likely there are alternative approaches to give you the result you want, but without the need to do external parsing of org-files. There aren't sufficient details in the examples you mention to provide any specific details. 
However, I have used org-mode for reporting, invoicing, time tracking, documentation, issue/request tracking, project planning and project management and never needed to parse my org files with an external tool. I have exported the data in different formats which have then been processed by other tools and I have tweaked my setup to support various enterprise/corporate standards or requirements (logos, corporate colours, report formats, etc). Sometimes these tweaks are trivial and others require more extensive effort. Often, others have had to do something the same or similar and have working examples etc. So my recommendation is post some messages to this list with details on what you need to try and do and see what others can suggest. I would keep each post to a single item rather than one long post with multiple requests. From watching this list, I've often seen someone post a "How can I ..." question only to get the answer "Oh, that is already built-in, just do .....". Org is a large application with lots of sophisticated power that isn't always obvious from just reading the manual. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 9:03 ` Tim Cross @ 2020-09-15 9:17 ` Przemysław Kamiński 2020-09-15 9:55 ` Russell Adams ` (2 more replies) 0 siblings, 3 replies; 45+ messages in thread From: Przemysław Kamiński @ 2020-09-15 9:17 UTC (permalink / raw) To: emacs-orgmode On 9/15/20 11:03 AM, Tim Cross wrote: > > Przemysław Kamiński <pk@intrepidus.pl> writes: > >> Hello, >> >> I oftentimes find myself needing to parse org files with some external >> tools (to generate reports for customers or sum up clock times for given >> month, etc). Looking through the list >> >> https://orgmode.org/worg/org-tools/ >> >> and having tested some of these, I must say they are lacking. The >> Haskell ones seem to be done best, but then the compile overhead of >> Haskell and difficulty in embedding this into other languages is a drawback. >> >> I think it might benefit the community when such an official parser >> would exist (and maybe could be hooked into org mode directly). >> >> I was thinking picking some scheme like chicken or guile, which could be >> later easily embedded into C or whatever. Then use that parser in org >> mode itself. This way some important part of org mode would be outside >> of the small world of elisp. >> >> This is just an idea, what do you think? :) >> > > The problem with this idea is maintenance. It is also partly why > external tools are not terribly reliable/good. Org mode is constantly > being enhanced and improved. It is very hard for external tools to keep > pace with org-mode development, so they soon get out of date or stop > working correctly. > > Org mode IS an elisp application. This is the main goal. The reason it > works so well is because elisp is largely a DSL that focuses on text > manipulation and is therefore ideally suited for a text based organiser. 
> > This means if you want to implement parsing of org files in any > other language, there is a lot of fundamental functionality which will > need to be implemented that is not necessary when using elisp as it is > already built-in. Not only that, it is also 'battle hardened' and well > tested. The other problem would be in selecting another language which > behaves consistently across all the platforms Emacs and org-mode are > supported on. As org-mode is a standard part of Emacs, it also needs to > be implemented in something which is also available on all the platforms > Emacs is on without needing the user to install additional software. > > The other issue is that you would need another skill in order to > maintain/extend org-mode. In addition to elisp, you will also need to > know whatever the parser implementation language is. > > A third negative is that if the parser was in a different language to > elisp, the interface between the rest of org mode (in elisp) and the > parser would become an issue. At the moment, there are far fewer > barriers as it is all elisp. However, if part of the system is in > another language, you are now restricted to whatever defined interface > exists. This would likely also have performance issues and overheads > associated with translating from one format to another etc. > > So, in short, the chances of org mode using a parser written in > something other than elisp are pretty close to 0. This leaves you with 2 > options - > > 1. Implement another external tool which can parse org-files. As > mentioned above, this is a non-trivial task and will likely be difficult > to maintain. Probably not the best first choice. > > 2. Provide some details about your workflow where you believe you need > to use external tools to process the org-files. It is very likely there > are alternative approaches to give you the result you want, but without > the need to do external parsing of org-files. 
There aren't sufficient > details in the examples you mention to provide any specific details. > However, I have used org-mode for reporting, invoicing, time tracking, > documentation, issue/request tracking, project planning and project > management and never needed to parse my org files with an external tool. > I have exported the data in different formats which have then been > processed by other tools and I have tweaked my setup to support various > enterprise/corporate standards or requirements (logos, corporate > colours, report formats, etc). Sometimes these tweaks are trivial and > others require more extensive effort. Often, others have had to do > something the same or similar and have working examples etc. > > So my recommendation is post some messages to this list with details on > what you need to try and do and see what others can suggest. I would > keep each post to a single item rather than one long post with multiple > requests. From watching this list, I've often seen someone post a "How > can I ..." question only to get the answer "Oh, that is already > built-in, just do .....". Org is a large application with lots of > sophisticated power that isn't always obvious from just reading the > manual. > > So, I keep clock times for work in org mode, this is very handy. However, my customers require that I use their service to provide the times. They do offer API. So basically I'm using elisp to parse org, make API calls, and at the same time generate CSV reports with a Python interop with org babel (because my elisp is just too bad to do that). If I had access to some org parser, I'd pick a language that would be more comfortable for me to get the job done. I guess it can all be done in elisp, however this is just a tool for me alone and I have limited time resources on hacking things for myself :) Another one is generating total hours report for day/week/month to put into my awesomewm toolbar. 
I ended up using orgstat https://github.com/volhovM/orgstat however the author is creating his own DSL in YAML and I guess things were much better off if it all stayed in some Scheme :) Anyways, my parser needs aren't that sophisticated: just parse the file, return headings with clock drawers. I tried the common lisp library but got frustrated after fiddling with it for a couple of hours. Best, Przemek ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 9:17 ` Przemysław Kamiński @ 2020-09-15 9:55 ` Russell Adams 2020-09-15 11:15 ` Przemysław Kamiński 2020-09-16 0:16 ` Tim Cross 2020-09-16 7:24 ` Marcin Borkowski 2 siblings, 1 reply; 45+ messages in thread From: Russell Adams @ 2020-09-15 9:55 UTC (permalink / raw) To: emacs-orgmode On Tue, Sep 15, 2020 at 11:17:57AM +0200, Przemysław Kamiński wrote: > > Org mode IS an elisp application. This is the main goal. The reason it > > works so well is because elisp is largely a DSL that focuses on text > > manipulation and is therefore ideally suited for a text based organiser. > > So, I keep clock times for work in org mode, this is very handy. > However, my customers require that I use their service to provide the > times. They do offer API. So basically I'm using elisp to parse org, > make API calls, and at the same time generate CSV reports with a Python > interop with org babel (because my elisp is just too bad to do > that). Please consider this is a very specialized use case. > If I had access to some org parser, I'd pick a language that would > be more comfortable for me to get the job done. I guess it can all > be done in elisp, however this is just a tool for me alone and I > have limited time resources on hacking things for myself :) Maintainer time is limited too. Maintaining a parser library outside of Emacs would be difficult for the reasons already given. I'd encourage you to pick up some more Elisp, which I am also trying to do. > Anyways, my parser needs aren't that sophisticated: just parse the file, > return headings with clock drawers. I tried the common lisp library but > got frustrated after fiddling with it for a couple of hours. If it's that small you could always do that in Python with regexps for your usage if you're more comfortable in Python. Org's plain text format means you can read it with anything. I suspect grep might even pull headlines and clocks successfully. 
I haven't looked at the elisp parser much, but I do wonder if someone couldn't write an exporter that exports a programmatic version of your org file data (ie: to xml). Then other tools could ingest those xml files. That'd certainly be a contrib module and not in the core, but might be worth your while to explore the idea if you really want to work with Org data outside of Emacs. ------------------------------------------------------------------ Russell Adams RLAdams@AdamsInfoServ.com PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/ Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3 ^ permalink raw reply [flat|nested] 45+ messages in thread
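Editorial sketch of the regexp approach suggested above: a minimal Python example, not a full Org parser. It assumes closed clock lines in the shape Org normally writes them (`CLOCK: [start]--[end] =>  H:MM`) and attributes each clock to the nearest preceding headline. The function name `clocked_minutes` and the sample data are made up for illustration; TODO keywords and tags are left in the title, running (unclosed) clocks are ignored, and headlines with identical titles are merged.

```python
import re
from collections import defaultdict

# Headline: one or more stars, whitespace, then the title text.
HEADLINE_RE = re.compile(r"^(\*+)\s+(.*\S)\s*$")
# Closed clock line, ending in "=> H:MM".
CLOCK_RE = re.compile(
    r"^\s*CLOCK:\s*\[[^\]]+\]--\[[^\]]+\]\s*=>\s*(\d+):(\d{2})\s*$"
)


def clocked_minutes(org_text):
    """Return {headline title: minutes clocked directly under it}."""
    totals = defaultdict(int)
    current = None  # title of the most recent headline seen
    for line in org_text.splitlines():
        headline = HEADLINE_RE.match(line)
        if headline:
            current = headline.group(2)
            continue
        clock = CLOCK_RE.match(line)
        if clock and current is not None:
            hours, minutes = int(clock.group(1)), int(clock.group(2))
            totals[current] += hours * 60 + minutes
    return dict(totals)


# Hypothetical sample file contents.
SAMPLE = """\
* Customer A
** Write report
   :LOGBOOK:
   CLOCK: [2020-09-14 Mon 09:00]--[2020-09-14 Mon 10:30] =>  1:30
   CLOCK: [2020-09-15 Tue 13:00]--[2020-09-15 Tue 13:45] =>  0:45
   :END:
** Fix bug
   CLOCK: [2020-09-15 Tue 14:00]--[2020-09-15 Tue 16:00] =>  2:00
"""

if __name__ == "__main__":
    for title, minutes in clocked_minutes(SAMPLE).items():
        print(f"{title}: {minutes // 60}:{minutes % 60:02d}")
```

This only covers the "headings with clocks" case described in the thread; anything that depends on Org's real syntax rules (custom TODO keywords, inherited properties, clocks inside blocks) would still need the Emacs parser.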
* Re: official orgmode parser 2020-09-15 9:55 ` Russell Adams @ 2020-09-15 11:15 ` Przemysław Kamiński 2020-09-15 12:37 ` tomas 0 siblings, 1 reply; 45+ messages in thread From: Przemysław Kamiński @ 2020-09-15 11:15 UTC (permalink / raw) To: emacs-orgmode On 9/15/20 11:55 AM, Russell Adams wrote: > On Tue, Sep 15, 2020 at 11:17:57AM +0200, Przemysław Kamiński wrote: >>> Org mode IS an elisp application. This is the main goal. The reason it >>> works so well is because elisp is largely a DSL that focuses on text >>> manipulation and is therefore ideally suited for a text based organiser. >> >> So, I keep clock times for work in org mode, this is very handy. >> However, my customers require that I use their service to provide the >> times. They do offer API. So basically I'm using elisp to parse org, >> make API calls, and at the same time generate CSV reports with a Python >> interop with org babel (because my elisp is just too bad to do >> that). > > Please consider this is a very specialized use case. > >> If I had access to some org parser, I'd pick a language that would >> be more comfortable for me to get the job done. I guess it can all >> be done in elisp, however this is just a tool for me alone and I >> have limited time resources on hacking things for myself :) > > Maintainer time is limited too. Maintaining a parser library outside > of Emacs would be difficult for the reasons already given. I'd > encourage you to pick up some more Elisp, which I am also trying to > do. > >> Anyways, my parser needs aren't that sophisticated: just parse the file, >> return headings with clock drawers. I tried the common lisp library but >> got frustrated after fiddling with it for a couple of hours. > > If it's that small you could always do that in Python with regexps for > your usage if you're more comfortable in Python. Org's plain text > format means you can read it with anything. I suspect grep might even > pull headlines and clocks successfully. 
> > > > I haven't looked at the elisp parser much, but I do wonder if someone > couldn't write an exporter that exports a programmatic version of your > org file data (ie: to xml). Then other tools could ingest those xml > files. That'd certainly be a contrib module and not in the core, but > might be worth your while to explore the idea if you really want to > work with Org data outside of Emacs. > > > ------------------------------------------------------------------ > Russell Adams RLAdams@AdamsInfoServ.com > > PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/ > > Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3 > There's the org-json (or ox-json) package but for some reason I wasn't able to run it successfully. I guess export to S-exps would be best here. But yes I'll check that out. Przemek ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 11:15 ` Przemysław Kamiński @ 2020-09-15 12:37 ` tomas 2020-09-15 18:09 ` Diego Zamboni 2020-09-16 12:09 ` Przemysław Kamiński 0 siblings, 2 replies; 45+ messages in thread From: tomas @ 2020-09-15 12:37 UTC (permalink / raw) To: Przemysław Kamiński; +Cc: emacs-orgmode [-- Attachment #1: Type: text/plain, Size: 1427 bytes --] On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote: [...] > There's the org-json (or ox-json) package but for some reason I > wasn't able to run it successfully. I guess export to S-exps would > be best here. But yes I'll check that out. If that's your route, perhaps the "Org element API" [1] might be helpful. Especially `org-element-parse-buffer' gives you a Lisp data structure which is supposed to be a parse of your Org buffer. From there to S-expression can be trivial (e.g. `print' or `pp'), depending on what you want to do. Walking the structure should be nice in Lisp, too. The topic of (non-Emacs) parsing of Org comes up regularly, and there is a good (but AFAIK not-quite-complete) Org syntax spec in Worg [2], but there are a couple of difficulties to be mastered before such a thing can become really enjoyable and useful. The loose specification of Org's format (arguably its second or third strongest asset, the first two being its incredible community and Emacs itself) is something which makes this problem "interesting". People have invented lots of usages which might be broken should Org change to a strict formal spec. You don't want to break those people. But yes, perhaps some day someone nails it. Perhaps it's you :) Cheers [1] https://orgmode.org/worg/dev/org-element-api.html [2] https://orgmode.org/worg/dev/org-syntax.html - t [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 12:37 ` tomas @ 2020-09-15 18:09 ` Diego Zamboni 2020-09-16 12:09 ` Przemysław Kamiński 1 sibling, 0 replies; 45+ messages in thread From: Diego Zamboni @ 2020-09-15 18:09 UTC (permalink / raw) To: tomas; +Cc: Przemysław Kamiński, Org-mode [-- Attachment #1: Type: text/plain, Size: 1692 bytes --] There's also org-ql (https://github.com/alphapapa/org-ql), which also provides a query-based API against Org structures. --Diego On Tue, Sep 15, 2020 at 2:59 PM <tomas@tuxteam.de> wrote: > On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote: > > [...] > > > There's the org-json (or ox-json) package but for some reason I > > wasn't able to run it successfully. I guess export to S-exps would > > be best here. But yes I'll check that out. > > If that's your route, perhaps the "Org element API" [1] might be > helpful. Especially `org-element-parse-buffer' gives you a Lisp > data structure which is supposed to be a parse of your Org buffer. > > From there to S-expression can be trivial (e.g. `print' or `pp'), > depending on what you want to do. > > Walking the structure should be nice in Lisp, too. > > The topic of (non-Emacs) parsing of Org comes up regularly, and > there is a good (but AFAIK not-quite-complete) Org syntax spec > in Worg [2], but there are a couple of difficulties to be mastered > before such a thing can become really enjoyable and useful. > > The loose specification of Org's format (arguably its second > or third strongest asset, the first two being its incredible > community and Emacs itself) is something which makes this > problem "interesting". People have invented lots of usages > which might be broken should Org change to a strict formal > spec. You don't want to break those people. > > But yes, perhaps some day someone nails it. 
Perhaps it's you :) > > Cheers > > [1] https://orgmode.org/worg/dev/org-element-api.html > [2] https://orgmode.org/worg/dev/org-syntax.html > > - t > [-- Attachment #2: Type: text/html, Size: 2393 bytes --] ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 12:37 ` tomas 2020-09-15 18:09 ` Diego Zamboni @ 2020-09-16 12:09 ` Przemysław Kamiński 2020-09-16 12:20 ` tomas 2020-09-16 12:27 ` Ihor Radchenko 1 sibling, 2 replies; 45+ messages in thread From: Przemysław Kamiński @ 2020-09-16 12:09 UTC (permalink / raw) To: emacs-orgmode On 9/15/20 2:37 PM, tomas@tuxteam.de wrote: > On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote: > > [...] > >> There's the org-json (or ox-json) package but for some reason I >> wasn't able to run it successfully. I guess export to S-exps would >> be best here. But yes I'll check that out. > > If that's your route, perhaps the "Org element API" [1] might be > helpful. Especially `org-element-parse-buffer' gives you a Lisp > data structure which is supposed to be a parse of your Org buffer. > > From there to S-expression can be trivial (e.g. `print' or `pp'), > depending on what you want to do. > > Walking the structure should be nice in Lisp, too. > > The topic of (non-Emacs) parsing of Org comes up regularly, and > there is a good (but AFAIK not-quite-complete) Org syntax spec > in Worg [2], but there are a couple of difficulties to be mastered > before such a thing can become really enjoyable and useful. > > The loose specification of Org's format (arguably its second > or third strongest asset, the first two being its incredible > community and Emacs itself) is something which makes this > problem "interesting". People have invented lots of usages > which might be broken should Org change to a strict formal > spec. You don't want to break those people. > > But yes, perhaps some day someone nails it. Perhaps it's you :) > > Cheers > > [1] https://orgmode.org/worg/dev/org-element-api.html > [2] https://orgmode.org/worg/dev/org-syntax.html > > - t > So I looked at (pp (org-element-parse-buffer)) however it does print out recursive stuff which other schemes have trouble parsing. 
My code looks more or less like this:

(defun org-parse (f)
  (with-temp-buffer
    (find-file f)
    (let* ((parsed (org-element-parse-buffer))
           (all (append org-element-all-elements org-element-all-objects))
           (mapped (org-element-map parsed all
                     (lambda (item)
                       (strip-parent item)))))
      (pp mapped))))

strip-parent is basically (plist-put props :parent nil) for elements properties. However it turns out there are more recursive objects, like

:title
#("Headline 1" 0 10
  (:parent
   (headline #2
    (section

So I'm wondering do I have to do it by hand for all cases or is there some way to output only a simple AST without those nested objects?

Best, Przemek ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-16 12:09 ` Przemysław Kamiński @ 2020-09-16 12:20 ` tomas 2020-09-16 12:27 ` Ihor Radchenko 1 sibling, 0 replies; 45+ messages in thread From: tomas @ 2020-09-16 12:20 UTC (permalink / raw) To: emacs-orgmode [-- Attachment #1: Type: text/plain, Size: 783 bytes --]

On Wed, Sep 16, 2020 at 02:09:42PM +0200, Przemysław Kamiński wrote:

[...]

> So I looked at (pp (org-element-parse-buffer)) however it does print
> out recursive stuff which other schemes have trouble parsing.
>
> My code looks more or less like this:
>
> (defun org-parse (f)
>   (with-temp-buffer
>     (find-file f)
>     (let* ((parsed (org-element-parse-buffer))
>            (all (append org-element-all-elements org-element-all-objects))
>            (mapped (org-element-map parsed all
>                      (lambda (item)
>                        (strip-parent item)))))
>       (pp mapped))))

Actually I'd tend to not modify the result, but to walk it. See `pcase' for a powerful pattern matcher which might help you there.

Cheers - t [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-16 12:09 ` Przemysław Kamiński 2020-09-16 12:20 ` tomas @ 2020-09-16 12:27 ` Ihor Radchenko 1 sibling, 0 replies; 45+ messages in thread From: Ihor Radchenko @ 2020-09-16 12:27 UTC (permalink / raw) To: Przemysław Kamiński, emacs-orgmode FYI: You may find https://github.com/ndwarshuis/org-ml helpful. Przemysław Kamiński <pk@intrepidus.pl> writes: > On 9/15/20 2:37 PM, tomas@tuxteam.de wrote: >> On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote: >> >> [...] >> >>> There's the org-json (or ox-json) package but for some reason I >>> wasn't able to run it successfully. I guess export to S-exps would >>> be best here. But yes I'll check that out. >> >> If that's your route, perhaps the "Org element API" [1] might be >> helpful. Especially `org-element-parse-buffer' gives you a Lisp >> data structure which is supposed to be a parse of your Org buffer. >> >> From there to S-expression can be trivial (e.g. `print' or `pp'), >> depending on what you want to do. >> >> Walking the structure should be nice in Lisp, too. >> >> The topic of (non-Emacs) parsing of Org comes up regularly, and >> there is a good (but AFAIK not-quite-complete) Org syntax spec >> in Worg [2], but there are a couple of difficulties to be mastered >> before such a thing can become really enjoyable and useful. >> >> The loose specification of Org's format (arguably its second >> or third strongest asset, the first two being its incredible >> community and Emacs itself) is something which makes this >> problem "interesting". People have invented lots of usages >> which might be broken should Org change to a strict formal >> spec. You don't want to break those people. >> >> But yes, perhaps some day someone nails it. 
Perhaps it's you :)
>>
>> Cheers
>>
>> [1] https://orgmode.org/worg/dev/org-element-api.html
>> [2] https://orgmode.org/worg/dev/org-syntax.html
>>
>> - t
>>
>
> So I looked at (pp (org-element-parse-buffer)) however it does print out
> recursive stuff which other schemes have trouble parsing.
>
> My code looks more or less like this:
>
> (defun org-parse (f)
>   (with-temp-buffer
>     (find-file f)
>     (let* ((parsed (org-element-parse-buffer))
>            (all (append org-element-all-elements org-element-all-objects))
>            (mapped (org-element-map parsed all
>                      (lambda (item)
>                        (strip-parent item)))))
>       (pp mapped))))
>
>
> strip-parent is basically (plist-put props :parent nil) for elements
> properties. However it turns out there are more recursive objects, like
>
> :title
> #("Headline 1" 0 10
>   (:parent
>    (headline #2
>     (section
>
> So I'm wondering do I have to do it by hand for all cases or is there
> some way to output only a simple AST without those nested objects?
>
> Best,
> Przemek

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 9:17 ` Przemysław Kamiński 2020-09-15 9:55 ` Russell Adams @ 2020-09-16 0:16 ` Tim Cross 2020-09-16 7:24 ` Marcin Borkowski 2 siblings, 0 replies; 45+ messages in thread From: Tim Cross @ 2020-09-16 0:16 UTC (permalink / raw) To: emacs-orgmode Przemysław Kamiński <pk@intrepidus.pl> writes: > > So, I keep clock times for work in org mode, this is very handy. > However, my customers require that I use their service to provide the > times. They do offer API. So basically I'm using elisp to parse org, > make API calls, and at the same time generate CSV reports with a Python > interop with org babel (because my elisp is just too bad to do that). If > I had access to some org parser, I'd pick a language that would be more > comfortable for me to get the job done. I guess it can all be done in > elisp, however this is just a tool for me alone and I have limited time > resources on hacking things for myself :) > I would probably use org's org-export-table command to export the clock table as a CSV and then just use a simple script to read in that CSV and do the API calls. > Another one is generating total hours report for day/week/month to put > into my awesomewm toolbar. I ended up using orgstat > https://github.com/volhovM/orgstat > however the author is creating his own DSL in YAML and I guess things > were much better off if it all stayed in some Scheme :) > Sounds like you have a solution. I would probably just setup a hook to generate the updated table and export it when the file is saved and then have something consume that exported file to update the taskbar. -- Tim Cross ^ permalink raw reply [flat|nested] 45+ messages in thread
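Editorial sketch of the "export the clock table as CSV, then script the rest" workflow described above, in Python. It assumes a simplified two-column export (headline, duration as `H:MM`) with a header row and a bold `*Total time*` row; a real clock table may have more columns depending on its settings, and the table export command in current Org is, as far as the editor knows, `org-table-export`. The customer API is not described in the thread, so the script only builds JSON payloads; the function names and sample data are hypothetical.

```python
import csv
import io
import json


def to_minutes(duration):
    """Convert an 'H:MM' duration (optionally wrapped in *bold*) to minutes."""
    hours, minutes = duration.strip().strip("*").split(":")
    return int(hours) * 60 + int(minutes)


def time_entries(csv_text):
    """Yield one dict per task row, skipping the header and total rows."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 2 or not row[1].strip():
            continue  # blank or incomplete row
        title = row[0].strip()
        if title.strip("*").lower().startswith("total"):
            continue  # summary row, not a task
        yield {"task": title, "minutes": to_minutes(row[1])}


# Hypothetical exported clock table.
SAMPLE_CSV = """\
Headline,Time
*Total time*,*4:15*
Write report,2:15
Fix bug,2:00
"""

if __name__ == "__main__":
    # A real script would send each payload to the customer's API here;
    # that endpoint is unknown, so this sketch only prints the JSON.
    for entry in time_entries(SAMPLE_CSV):
        print(json.dumps(entry))
```

The point of this split is that Org itself does the parsing and aggregation, so the external script never has to understand Org syntax, only a flat table.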
* Re: official orgmode parser 2020-09-15 9:17 ` Przemysław Kamiński 2020-09-15 9:55 ` Russell Adams 2020-09-16 0:16 ` Tim Cross @ 2020-09-16 7:24 ` Marcin Borkowski 2020-09-16 7:56 ` Ihor Radchenko 2 siblings, 1 reply; 45+ messages in thread From: Marcin Borkowski @ 2020-09-16 7:24 UTC (permalink / raw) To: Przemysław Kamiński; +Cc: emacs-orgmode On 2020-09-15, at 11:17, Przemysław Kamiński <pk@intrepidus.pl> wrote: > So, I keep clock times for work in org mode, this is very > handy. However, my customers require that I use their service to > provide the times. They do offer API. So basically I'm using elisp to > parse org, make API calls, and at the same time generate CSV reports > with a Python interop with org babel (because my elisp is just too bad > to do that). If I had access to some org parser, I'd pick a language > that would be more comfortable for me to get the job done. I guess it > can all be done in elisp, however this is just a tool for me alone and > I have limited time resources on hacking things for myself :) I was in the exact same situation - I use Org-mode clocking, and we use Toggl at our company, so I wrote a simple tool to fire API requests to Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl It's a bit more than 200 lines of Elisp, so you might try to look into it and adapt it to whatever tool your employer is using. > Another one is generating total hours report for day/week/month to put > into my awesomewm toolbar. I ended up using orgstat > https://github.com/volhovM/orgstat > however the author is creating his own DSL in YAML and I guess things > were much better off if it all stayed in some Scheme :) Wow, another awesomewm user here; could you share your code? Best, -- Marcin Borkowski http://mbork.pl ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-16 7:24 ` Marcin Borkowski @ 2020-09-16 7:56 ` Ihor Radchenko 2020-09-16 11:36 ` Przemysław Kamiński 0 siblings, 1 reply; 45+ messages in thread From: Ihor Radchenko @ 2020-09-16 7:56 UTC (permalink / raw) To: Marcin Borkowski, Przemysław Kamiński; +Cc: emacs-orgmode > Wow, another awesomewm user here; could you share your code? Are you interested in something particular about awesome WM integration? I am using simple textbox widgets to show currently clocked in task and weighted summary of clocked time. See the attachments. Best, Ihor [-- Attachment #2: statusbar1.png --] [-- Type: image/png, Size: 23389 bytes --] [-- Attachment #3: statusbar2.png --] [-- Type: image/png, Size: 8747 bytes --] ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-16 7:56 ` Ihor Radchenko @ 2020-09-16 11:36 ` Przemysław Kamiński 2020-09-16 12:02 ` Ihor Radchenko 0 siblings, 1 reply; 45+ messages in thread From: Przemysław Kamiński @ 2020-09-16 11:36 UTC (permalink / raw) To: emacs-orgmode On 9/16/20 9:56 AM, Ihor Radchenko wrote: >> Wow, another awesomewm user here; could you share your code? > > Are you interested in something particular about awesome WM integration? > > I am using simple textbox widgets to show currently clocked in task and > weighted summary of clocked time. See the attachments. > > Best, > Ihor I don't have interesting code, just a standard awesomewm setup. I run a periodic script to output data computed by orgstat and show it in the taskbar (it uses the shellout_widget). However, what Ihor presented is interesting. Do you use a similar approach with shellout and 'emacs -batch' to show the currently running task, or do you 'push' data from Emacs to show it in the taskbar? P. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-16 11:36 ` Przemysław Kamiński @ 2020-09-16 12:02 ` Ihor Radchenko 2020-09-16 12:15 ` Przemysław Kamiński 0 siblings, 1 reply; 45+ messages in thread From: Ihor Radchenko @ 2020-09-16 12:02 UTC (permalink / raw) To: Przemysław Kamiński, emacs-orgmode > However what Ihor presented is interesting. Do you use similar approach > with shellout and 'emacs -batch' to show currently running task or you > 'push' data from emacs to show it in the taskbar? I prefer to avoid querying Emacs too often, for performance reasons. Instead, I only update the clocking info when I clock in/out in Emacs. Then, the clocked-in time is dynamically updated by an independent bash script. The scheme is the following:
1. Org clock in/out in Emacs triggers writing clocking info into the ~/.org-clock-in status file.
2. A bash script periodically monitors the file and calculates the clocked-in time according to the contents and the time since the last modification.
3. The script updates a simple textbox widget using awesome-client.
4. The script also warns me (notify-send) when the weighted clocked-in time is negative (meaning that I should switch to some more productive activity).
Best, Ihor ^ permalink raw reply [flat|nested] 45+ messages in thread
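The status-file scheme Ihor describes (Emacs writes a small file on clock in/out; an external script derives the running total from the file's contents plus the time since it was last modified) can be sketched as below. Ihor's own script is bash and is not shown in the thread, so everything here is an assumption: the two-line file format (task name, then seconds already clocked when the file was written) and the function names are hypothetical, and the widget update and notify-send steps are omitted.

```python
import os
import time


def clocked_seconds(path, now=None):
    """Return (task, seconds clocked so far) from a status file.

    Assumed (hypothetical) format: line 1 is the task name, line 2 the
    seconds already clocked at the moment Emacs wrote the file.
    """
    now = time.time() if now is None else now
    with open(path, encoding="utf-8") as handle:
        task = handle.readline().strip()
        base = float(handle.readline().strip() or 0)
    # Time elapsed since Emacs last touched the file counts as clocked time.
    elapsed = max(0.0, now - os.path.getmtime(path))
    return task, base + elapsed


def format_hm(seconds):
    """Format a number of seconds as H:MM for display in a status bar."""
    minutes = int(seconds // 60)
    return f"{minutes // 60}:{minutes % 60:02d}"
```

The appeal of this design is that Emacs is only involved at clock in/out; the periodic refresh costs one stat call and one small file read.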
* Re: official orgmode parser 2020-09-16 12:02 ` Ihor Radchenko @ 2020-09-16 12:15 ` Przemysław Kamiński 2020-09-17 1:18 ` Ihor Radchenko 0 siblings, 1 reply; 45+ messages in thread From: Przemysław Kamiński @ 2020-09-16 12:15 UTC (permalink / raw) To: emacs-orgmode On 9/16/20 2:02 PM, Ihor Radchenko wrote: >> However what Ihor presented is interesting. Do you use similar approach >> with shellout and 'emacs -batch' to show currently running task or you >> 'push' data from emacs to show it in the taskbar? > > I prefer to avoid querying emacs too often for performance reasons. > Instead, I only update the clocking info when I clock in/out in emacs. > Then, the clocked in time is dynamically updated by independent bash > script. > > The scheme is the following: > 1. org clock in/out in Emacs trigger writing clocking info into > ~/.org-clock-in status file > 2. bash script periodically monitors the file and calculates the clocked > in time according to the contents and time from last modification > 3. the script updates simple textbox widget using awesome-client > 4. the script also warns me (notify-send) when the weighted clocked in > time is negative (meaning that I should switch to some more > productive activity) > > Best, > Ihor So basically this is what this thread is about. One needs a working Emacs instance and work in "push" mode to export any Org data. This requires dealing with temporary files, as described above, and some ad-hoc formats to keep whatever data I need to pull from org. 
"Pull" mode would be preferred. I could then, say, write a script in Guile, execute 'emacs -batch' to export org data (I'm ok with that), then parse the S-expressions to get what I need. P. ^ permalink raw reply [flat|nested] 45+ messages in thread
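For the "pull" workflow described above (run 'emacs -batch', then read the printed S-expressions in another language), the reading side is fairly small. Przemek mentions Guile; the sketch below uses Python instead, purely for illustration. It is a deliberately minimal reader and rests on assumptions: it handles lists, double-quoted strings, numbers and symbols only, so the Emacs side would need to print a plain tree first (no propertized strings in #("..." ...) syntax, no #N circular markers). The Symbol class and function names are made up for this example.

```python
import re

# One token: a paren, a double-quoted string (with backslash escapes), or an atom.
TOKEN = re.compile(r'\s*(?:([()])|"((?:\\.|[^"\\])*)"|([^\s()"]+))')


class Symbol(str):
    """A Lisp symbol or keyword, kept distinct from string literals."""


def tokenize(text):
    text = text.strip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"cannot tokenize at offset {pos}")
        pos = match.end()
        paren, string, atom = match.groups()
        if paren:
            yield paren
        elif string is not None:
            # Undo backslash escapes such as \" and \\ .
            yield ("str", re.sub(r"\\(.)", r"\1", string))
        else:
            yield ("atom", atom)


def to_atom(text):
    """Turn an atom token into an int, a float, or a Symbol."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return Symbol(text)


def parse_sexp(text):
    """Parse all top-level forms in `text` into nested Python lists."""
    stack = [[]]
    for token in tokenize(text):
        if token == "(":
            stack.append([])
        elif token == ")":
            if len(stack) == 1:
                raise SyntaxError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            kind, value = token
            stack[-1].append(value if kind == "str" else to_atom(value))
    if len(stack) != 1:
        raise SyntaxError("unbalanced '('")
    return stack[0]
```

Since Symbol subclasses str, a comparison such as forms[0][0] == "headline" works directly on the result.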
* Re: official orgmode parser 2020-09-16 12:15 ` Przemysław Kamiński @ 2020-09-17 1:18 ` Ihor Radchenko 2020-09-17 15:24 ` Przemysław Kamiński 0 siblings, 1 reply; 45+ messages in thread From: Ihor Radchenko @ 2020-09-17 1:18 UTC (permalink / raw) To: Przemysław Kamiński, emacs-orgmode > So basically this is what this thread is about. One needs a working > Emacs instance and work in "push" mode to export any Org data. This > requires dealing with temporary files, as described above, and some > ad-hoc formats to keep whatever data I need to pull from org. > "Pull" mode would be preferred. I could then, say, write a script in > Guile, execute 'emacs -batch' to export org data (I'm ok with that), > then parse the S-expressions to get what I need. My choice to use "push" mode is just for performance reasons. Nothing prevents you from writing a function called from emacs --batch that converts parsed org data into whatever format your Guile script prefers. That function may be either on Emacs side or on Guile side. Probably, Emacs has more capabilities when dealing with s-expressions though. You can even directly push the information from Emacs to API server. You may find https://github.com/tkf/emacs-request useful for this task. Finally, you may also consider clock tables to create clock summaries using existing org-mode functionality. The tables can be named and accessed using any programming language via babel. Best, Ihor ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-17 1:18 ` Ihor Radchenko @ 2020-09-17 15:24 ` Przemysław Kamiński 0 siblings, 0 replies; 45+ messages in thread From: Przemysław Kamiński @ 2020-09-17 15:24 UTC (permalink / raw) To: emacs-orgmode On 9/17/20 3:18 AM, Ihor Radchenko wrote: > My choice to use "push" mode is just for performance reasons. Nothing > prevents you from writing a function called from emacs --batch that > converts parsed org data into whatever format your Guile script prefers. > That function may be either on Emacs side or on Guile side. Probably, > Emacs has more capabilities when dealing with s-expressions though. > > You can even directly push the information from Emacs to API server. > You may find https://github.com/tkf/emacs-request useful for this task. > > Finally, you may also consider clock tables to create clock summaries > using existing org-mode functionality. The tables can be named and > accessed using any programming language via babel. > > Best, > Ihor OK so this is what I got so far https://gitlab.com/cgenie/org-parse I stole the simple test.org file from ox-json test suite. Guile seems to correctly parse that output. At least something to start with :) Any comments are welcome :) Best, Przemek ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-15 7:58 official orgmode parser Przemysław Kamiński 2020-09-15 8:44 ` Gerry Agbobada 2020-09-15 9:03 ` Tim Cross @ 2020-09-23 8:09 ` Bastien 2020-09-23 17:46 ` Przemysław Kamiński ` (2 more replies) 2 siblings, 3 replies; 45+ messages in thread From: Bastien @ 2020-09-23 8:09 UTC (permalink / raw) To: Przemysław Kamiński; +Cc: emacs-orgmode Hi Przemysław, Przemysław Kamiński <pk@intrepidus.pl> writes: > I oftentimes find myself needing to parse org files with some external > tools (to generate reports for customers or sum up clock times for > given month, etc). Looking through the list > > https://orgmode.org/worg/org-tools/ Can you help on making the above page more useful to anyone? Perhaps we can have a separate worg page just for parsers, reporting the ones that seem to fully work. I disagree that a parser is too difficult to maintain because Org is a moving target. Org core syntax is not moving anymore, a parser can reasonably target it. That's what is done with the Ruby parser, in use in this small project called github.com :) So I'd say: - let's enhance Worg's documentation - yes, please go for enhancing parsing tools I don't think we need official tools. The official Org parser exists, it is Org itself. Thanks, -- Bastien ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-23 8:09 ` Bastien @ 2020-09-23 17:46 ` Przemysław Kamiński 2020-09-23 19:50 ` rey-coyrehourcq 2020-10-24 21:12 ` Daniele Nicolodi 2020-10-26 11:23 ` Ken Mankoff 2 siblings, 1 reply; 45+ messages in thread From: Przemysław Kamiński @ 2020-09-23 17:46 UTC (permalink / raw) To: emacs-orgmode On 9/23/20 10:09 AM, Bastien wrote: > Hi Przemysław, > > Przemysław Kamiński <pk@intrepidus.pl> writes: > >> I oftentimes find myself needing to parse org files with some external >> tools (to generate reports for customers or sum up clock times for >> given month, etc). Looking through the list >> >> https://orgmode.org/worg/org-tools/ > > Can you help on making the above page more useful to anyone? > > Perhaps we can have a separate worg page just for parsers, reporting > the ones that seem to fully work. > > I disagree that a parser is too difficult to maintain because Org is > a moving target. Org core syntax is not moving anymore, a parser can > reasonably target it. That's what is done with the Ruby parser, in > use in this small project called github.com :) > > So I'd say: > > - let's enhance Worg's documentation > - yes, please go for enhancing parsing tools > > I don't think we need official tools. The official Org parser exists, > it is Org itself. > > Thanks, > Hello Bastien, Thank you for your remarks. I updated the README, hopefully it's more usable now. Przemek ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-23 17:46 ` Przemysław Kamiński @ 2020-09-23 19:50 ` rey-coyrehourcq 2020-11-11 8:58 ` Bastien 0 siblings, 1 reply; 45+ messages in thread From: rey-coyrehourcq @ 2020-09-23 19:50 UTC (permalink / raw) To: Przemysław Kamiński, emacs-orgmode Hi Przemysław, Some partial Org parsers (AST or regex...) I found on the web, as a recent state of the art:
* org-js https://github.com/mooz/org-js
* orgajs: Orga is a flexible org-mode syntax parser. It parses org content into AST (Abstract Syntax Tree) https://github.com/orgapp/orgajs
* orgparse
* org-mode-parser https://github.com/daitangio/org-mode-parser
* org-rs https://github.com/org-rs/org-rs
* org-ruby https://github.com/wallyqs/org-ruby
* org-swift https://github.com/orgapp/swift-org
* organice https://github.com/200ok-ch/organice
* organum https://github.com/seylerius/organum
* clj org https://github.com/eigenhombre/clj-org
* orgmode-parse https://github.com/ixmatus/orgmode-parse
* org-mode https://www.fosskers.ca/ https://hackage.haskell.org/package/org-mode
* orgize https://github.com/PoiScript/orgize https://www.worthe-it.co.za/blog.html
Best regards, -- Sébastien Rey-Coyrehourcq Research Engineer UMR IDEES 02.35.14.69.30 {Stronger security for your email, follow EFF tutorial : https://ssd.eff.org/} ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-23 19:50 ` rey-coyrehourcq @ 2020-11-11 8:58 ` Bastien 0 siblings, 0 replies; 45+ messages in thread From: Bastien @ 2020-11-11 8:58 UTC (permalink / raw) To: rey-coyrehourcq; +Cc: Przemysław Kamiński, emacs-orgmode Hi Sébastien, rey-coyrehourcq <sebastien.rey-coyrehourcq@univ-rouen.fr> writes: > Some partial org Parsers (AST or regex...) i found on the web for a > recent state of the art : Thanks -- I've updated https://orgmode.org/worg/org-tools/ with this information. Best, -- Bastien ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-23 8:09 ` Bastien 2020-09-23 17:46 ` Przemysław Kamiński @ 2020-10-24 21:12 ` Daniele Nicolodi 2020-10-24 21:35 ` Tom Gillespie 2020-11-11 9:15 ` Bastien 2020-10-26 11:23 ` Ken Mankoff 2 siblings, 2 replies; 45+ messages in thread From: Daniele Nicolodi @ 2020-10-24 21:12 UTC (permalink / raw) To: emacs-orgmode On 23/09/2020 10:09, Bastien wrote: > I disagree that a parser is too difficult to maintain because Org is > a moving target. Org core syntax is not moving anymore, a parser can > reasonably target it. That's what is done with the Ruby parser, in > use in this small project called github.com :) (Just an aside: which Ruby org-mode parser does Github use? I sometimes find instances where Github does not render an org-mode file correctly and I would be happy to file bugs to have them corrected). > So I'd say: > > - let's enhance Worg's documentation > - yes, please go for enhancing parsing tools > > I don't think we need official tools. The official Org parser exists, > it is Org itself. Would it make sense to have one "official" (or a set of) org-mode test files and the corresponding syntax tree as parsed by org-elements (maybe in a format easier to read from other programming languages than s-expressions, json maybe?) to make testing other parsers against the reference implementation easier? Maybe the org-mode test suite already has something like this. I haven't looked for it yet. Cheers, Dan ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-24 21:12 ` Daniele Nicolodi @ 2020-10-24 21:35 ` Tom Gillespie 2020-11-11 9:13 ` Bastien 2020-11-11 9:15 ` Bastien 1 sibling, 1 reply; 45+ messages in thread From: Tom Gillespie @ 2020-10-24 21:35 UTC (permalink / raw) To: Daniele Nicolodi; +Cc: emacs-orgmode > which Ruby org-mode parser does Github use? I'm pretty sure that github uses https://github.com/wallyqs/org-ruby. It is ... not compliant, shall we say. I have making some fixes to the footnote parsing section on my todo list, but I don't expect to get to it any time in the near future. Tom ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-24 21:35 ` Tom Gillespie @ 2020-11-11 9:13 ` Bastien 2020-11-12 17:14 ` Tom Gillespie 0 siblings, 1 reply; 45+ messages in thread From: Bastien @ 2020-11-11 9:13 UTC (permalink / raw) To: Tom Gillespie; +Cc: emacs-orgmode, Daniele Nicolodi Hi Tom, Tom Gillespie <tgbugs@gmail.com> writes: >> which Ruby org-mode parser does Github use? > > I'm pretty sure that github uses https://github.com/wallyqs/org-ruby. > It is ... not compliant, shall we say. I have making some fixes to the > footnote parsing section on my todo list, but I don't expect to get to > it any time in the near future. Can you contact GitHub and see what they use? Whatever they use, I suggest we ask them to support the org library they use to let their users display Org files. Maybe the same should be done with gitlab.com, since they also parse Org files somehow. -- Bastien ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-11-11 9:13 ` Bastien @ 2020-11-12 17:14 ` Tom Gillespie 0 siblings, 0 replies; 45+ messages in thread From: Tom Gillespie @ 2020-11-12 17:14 UTC (permalink / raw) To: Bastien; +Cc: emacs-orgmode, waldemar.quevedo, Daniele Nicolodi Hi Bastien, I agree it would be great to ask them to contribute to whichever ruby library they are using. I will see if I can get in touch, but I have no idea of where to start if we really want to get to the folks who could make a decision. It looks like gitlab uses the same org-ruby library as well https://gitlab.com/gitlab-org/gitlab-foss/-/blob/master/Gemfile#L156. They may be easier to reach out to. I have also cced Wally to see if he has any insights here. Best! Tom ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-24 21:12 ` Daniele Nicolodi 2020-10-24 21:35 ` Tom Gillespie @ 2020-11-11 9:15 ` Bastien 2020-11-11 13:05 ` Daniele Nicolodi 2020-11-28 19:19 ` Gerry Agbobada 1 sibling, 2 replies; 45+ messages in thread From: Bastien @ 2020-11-11 9:15 UTC (permalink / raw) To: Daniele Nicolodi; +Cc: emacs-orgmode Hi Daniele, Daniele Nicolodi <daniele@grinta.net> writes: > Would it make sense to have one "official" (or a set of) org-mode test > files and the corresponding syntax tree as parsed by org-elements (maybe > in a format easier to read from other programming languages than > s-expressions, json maybe?) to make testing other parser against the > reference implementation easier? I think it is a very good idea. The example file would be also good to help users track for small syntactic changes, when they happen. Would you like to work on such a file? -- Bastien ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-11-11 9:15 ` Bastien @ 2020-11-11 13:05 ` Daniele Nicolodi 2020-11-28 19:19 ` Gerry Agbobada 1 sibling, 0 replies; 45+ messages in thread From: Daniele Nicolodi @ 2020-11-11 13:05 UTC (permalink / raw) To: emacs-orgmode On 11/11/2020 10:15, Bastien wrote: > Hi Daniele, > > Daniele Nicolodi <daniele@grinta.net> writes: > >> Would it make sense to have one "official" (or a set of) org-mode test >> files and the corresponding syntax tree as parsed by org-elements (maybe >> in a format easier to read from other programming languages than >> s-expressions, json maybe?) to make testing other parser against the >> reference implementation easier? > > I think it is a very good idea. > > The example file would be also good to help users track for small > syntactic changes, when they happen. > > Would you like to work on such a file? I don't have enough motivation to see this climb high enough in my TODO list to see any meaningful progress in a reasonable time frame. I am more than happy to contribute to Org, but it is more effective to keep these contributions related to my daily use of Org. Cheers, Dan ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-11-11 9:15 ` Bastien 2020-11-11 13:05 ` Daniele Nicolodi @ 2020-11-28 19:19 ` Gerry Agbobada 1 sibling, 0 replies; 45+ messages in thread From: Gerry Agbobada @ 2020-11-28 19:19 UTC (permalink / raw) To: emacs-orgmode [-- Attachment #1: Type: text/plain, Size: 570 bytes --] Hello, On Wed, Nov 11, 2020, at 10:15, Bastien wrote: > > The example file would be also good to help users track for small > syntactic changes, when they happen. > > When I mistakenly thought I could use an EBNF parser to parse Org-mode, I wrote a few small examples to get going (I never went past headings, as I'm not really good with parsing things): https://github.com/gagbo/LuaOrgParser/tree/master/tests/test-files/headings Maybe it could be used as a base. I wasn't really sure how to handle test cases or how to create good ones. Best regards, Gerry Agbobada [-- Attachment #2: Type: text/html, Size: 1172 bytes --] ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-09-23 8:09 ` Bastien 2020-09-23 17:46 ` Przemysław Kamiński 2020-10-24 21:12 ` Daniele Nicolodi @ 2020-10-26 11:23 ` Ken Mankoff 2020-10-26 14:21 ` Nicolas Goaziou 2 siblings, 1 reply; 45+ messages in thread From: Ken Mankoff @ 2020-10-26 11:23 UTC (permalink / raw) To: Bastien; +Cc: Przemysław Kamiński, emacs-orgmode Hello, On 2020-09-23 at 01:09 -07, Bastien <bzg@gnu.org> wrote... > I disagree that a parser is too difficult to maintain because Org is a > moving target. Org core syntax is not moving anymore, a parser can > reasonably target it. That's what is done with the Ruby parser, in use > in this small project called github.com :) Do you think it would be useful (or possible) to represent the current Org syntax in EBNF form so that people can use the EBNF to build parsers or graphically understand the form? I'm thinking of a nice page of railroad diagrams from this tool: https://github.com/GuntherRademacher/rr I question if this is possible because EBNF is for context-free grammars, but I *think* Org syntax is context-free. Even if not, I think those railroad diagrams might be useful for parser-writers and can still describe 99 % of the syntax, even if a few extra sentences are needed to clarify some edge case. -k. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-26 11:23 ` Ken Mankoff @ 2020-10-26 14:21 ` Nicolas Goaziou 2020-10-26 16:17 ` Ken Mankoff 0 siblings, 1 reply; 45+ messages in thread From: Nicolas Goaziou @ 2020-10-26 14:21 UTC (permalink / raw) To: Ken Mankoff; +Cc: Bastien, Przemysław Kamiński, emacs-orgmode Hello, Ken Mankoff <mankoff@gmail.com> writes: > I question if this is possible because EBNF is for context-free > grammars, but I *think* Org syntax is context-free. It's not as explained in a footnote in the Org syntax document. Regards, -- Nicolas Goaziou ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-26 14:21 ` Nicolas Goaziou @ 2020-10-26 16:17 ` Ken Mankoff 2020-10-26 16:24 ` Nicolas Goaziou 2020-11-11 9:00 ` Bastien 0 siblings, 2 replies; 45+ messages in thread From: Ken Mankoff @ 2020-10-26 16:17 UTC (permalink / raw) To: Nicolas Goaziou; +Cc: Bastien, Przemysław Kamiński, emacs-orgmode [-- Attachment #1: Type: text/plain, Size: 1889 bytes --]

On 2020-10-26 at 07:21 -07, Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote...
> Ken Mankoff <mankoff@gmail.com> writes:
>
>> I question if this is possible because EBNF is for context-free
>> grammars, but I *think* Org syntax is context-free.
>
> It's not as explained in a footnote in the Org syntax document.

Yes, I meant to write that I think Org syntax is maybe *not* context-free, and therefore EBNF can't capture all of it. But it could still be very helpful and capture most of it.

But the more I think about it, the more I think Org may be context-free.

For the footnotes, I'm not sure that "(1) In particular, the parser requires stars at column 0 to be quoted by a comma when they do not define a headline" violates context. An "*" in the first column defines a header. It can be escaped by anything else too (" *" works too). If ",*" has a special meaning, that can be captured elsewhere in the syntax.

I'm also not sure (2) violates context-freeness, at least in the EBNF sense where a context can include a newline. See for example:

    section ::= "*"+ string (tag+) newline (planning newline)? (property_drawer newline)?
    planning ::= ("SCHEDULED:" "<" date_or_time ">")? ("DEADLINE:" "<" date_or_time ">")?
    property_drawer ::= ":PROPERTIES:" newline drawer_contents newline ":END:"
    drawer_contents ::= ":" property ":" whitespace string

Where the first line, "section" is represented graphically as the attached image.

I guess I'm not 100% clear what "context-free" means. EBNF can represent a language where a for loop has an opening and closing brace. The closing brace is context-dependent, just as the planning or property drawers are. I recently used EBNF to represent a CSV file with header, and I was unable to capture the requirement that the header column must have the same number of fields or commas as the data section. I think that is context-free.

[-- Attachment #2: tmp_20201026_090940.png --] [-- Type: image/png, Size: 8197 bytes --] [-- Attachment #3: Type: text/plain, Size: 7 bytes --]

-k.

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-26 16:17 ` Ken Mankoff @ 2020-10-26 16:24 ` Nicolas Goaziou 2020-10-26 16:47 ` Ken Mankoff 2020-11-11 9:00 ` Bastien 1 sibling, 1 reply; 45+ messages in thread From: Nicolas Goaziou @ 2020-10-26 16:24 UTC (permalink / raw) To: Ken Mankoff; +Cc: Bastien, Przemysław Kamiński, emacs-orgmode

Ken Mankoff <mankoff@gmail.com> writes:

> Yes, I meant to write that I think Org syntax is maybe *not*
> context-free, and therefore EBNF can't capture all of it. But it could
> still be very helpful and capture most of it.

I'm not arguing about the usefulness of a partial EBNF description. I'm merely pointing out that the syntax is not context-free. Here is an example:

    # This is a comment (1)

    #+begin_example
    # This is not a comment (2)
    #+end_example

AFAICT, you cannot distinguish between lines (1) and (2) with EBNF.

Regards,

^ permalink raw reply [flat|nested] 45+ messages in thread
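To make the example in this message concrete, here is a minimal Python sketch of a line classifier that carries one piece of state (whether it is currently inside an example block). The function name `classify_lines` and the label strings are hypothetical, and the rules are deliberately simplified assumptions: real Org also handles other block types, comma escaping, and unterminated blocks, none of which are modelled here.

```python
def classify_lines(lines):
    """Label each line; the label of a '# ...' line depends on block state."""
    labels = []
    in_example = False  # the "context" that a line-local rule cannot see
    for line in lines:
        stripped = line.strip()
        lowered = stripped.lower()
        if in_example:
            if lowered == "#+end_example":
                in_example = False
                labels.append("block-end")
            else:
                # Everything inside the block is verbatim, even '# ...'.
                labels.append("verbatim")
        elif lowered.startswith("#+begin_example"):
            in_example = True
            labels.append("block-begin")
        elif stripped == "#" or stripped.startswith("# "):
            labels.append("comment")
        else:
            labels.append("text")
    return labels


if __name__ == "__main__":
    sample = [
        "# This is a comment (1)",
        "",
        "#+begin_example",
        "# This is not a comment (2)",
        "#+end_example",
    ]
    for label, line in zip(classify_lines(sample), sample):
        print(f"{label:12} | {line}")
```

The two `# ...` lines are textually the same kind of line, yet they receive different labels only because of the `in_example` flag, which is the point lines (1) and (2) illustrate.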
* Re: official orgmode parser 2020-10-26 16:24 ` Nicolas Goaziou @ 2020-10-26 16:47 ` Ken Mankoff 2020-10-26 17:59 ` Tom Gillespie 2020-11-11 8:59 ` Bastien 0 siblings, 2 replies; 45+ messages in thread From: Ken Mankoff @ 2020-10-26 16:47 UTC (permalink / raw) To: Nicolas Goaziou; +Cc: Bastien, Przemysław Kamiński, emacs-orgmode On 2020-10-26 at 09:24 -07, Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote... > # This is a comment (1) > > #+begin_example > # This is not a comment (2) > #+end_example > > AFAICT, you cannot distinguish between lines (1) and (2) with EBNF. I agree. I think this is a better (correct?) example than the footnotes on Org Syntax page. -k. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-26 16:47 ` Ken Mankoff @ 2020-10-26 17:59 ` Tom Gillespie 2020-10-26 20:26 ` Ken Mankoff 2020-11-11 8:59 ` Bastien 1 sibling, 1 reply; 45+ messages in thread From: Tom Gillespie @ 2020-10-26 17:59 UTC (permalink / raw) To: Ken Mankoff Cc: Bastien, Przemysław Kamiński, emacs-orgmode, Nicolas Goaziou I started writing down Org's grammar as an EBNF (with Racket's #lang brag) on Saturday. There is indeed a layer of Org grammar that can be implemented via EBNF, but it is fairly minimal. You can identify headlines, but you can't identify nesting level; the arbitrary nesting depth means that you have to have a stack to keep track. There is a similar issue with the indentation level in order to correctly interpret plain lists. If the canonical representation of an org document was required to use org-adapt-indentation: nil; org-edit-src-content-indentation: 0 and there was a canonical normalization function some of these issues would go away, but not all of them, and I'm fairly certain that it is not possible to implement a safe normalization function that won't mangle someone's formatting. Another example of something that requires a stack is the greater blocks, where you have #+begin_{name} and #+end_{name}, and the names must match. If there was a closed set of names you could sort of do it by hand, but since name can be any string that does not contain whitespace, you have to have a stack to track which block you are in. So, you can identify things that are heads, you can identify things that are block start lines and block end lines, but you need stacks to keep track of heading level, indentation, plain list level, and block name. I might be missing a few other places where stacks are required, but those are the big ones. Best, Tom On Mon, Oct 26, 2020 at 12:48 PM Ken Mankoff <mankoff@gmail.com> wrote: > > > On 2020-10-26 at 09:24 -07, Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote... 
> > # This is a comment (1) > > > > #+begin_example > > # This is not a comment (2) > > #+end_example > > > > AFAICT, you cannot distinguish between lines (1) and (2) with EBNF. > > I agree. I think this is a better (correct?) example than the footnotes on Org Syntax page. > > -k. > > ^ permalink raw reply [flat|nested] 45+ messages in thread
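The block-name point in the message above (an open set of names in #+begin_{name} / #+end_{name} that must match) can be sketched with an explicit stack. This is an illustrative Python sketch only: `match_blocks`, `BEGIN_RE` and `END_RE` are hypothetical names, and the code assumes blocks may nest freely, which is a simplification (for instance, the contents of example or src blocks are verbatim in Org and would not be scanned for nested blocks).

```python
import re

# Assumed simplification: a block name is any run of non-whitespace characters.
BEGIN_RE = re.compile(r"^\s*#\+begin_(\S+)", re.IGNORECASE)
END_RE = re.compile(r"^\s*#\+end_(\S+)", re.IGNORECASE)


def match_blocks(lines):
    """Return (name, start_index, end_index) for every properly closed block."""
    stack = []   # currently open blocks as (name, start_index)
    blocks = []  # closed blocks, in the order their end lines appear
    for index, line in enumerate(lines):
        begin = BEGIN_RE.match(line)
        end = END_RE.match(line)
        if begin:
            stack.append((begin.group(1).lower(), index))
        elif end and stack and stack[-1][0] == end.group(1).lower():
            name, start = stack.pop()
            blocks.append((name, start, index))
        # An end line whose name does not match the innermost open block
        # is ignored here; a real parser would treat it as ordinary text.
    return blocks


if __name__ == "__main__":
    doc = [
        "#+begin_quote",
        "#+begin_center",
        "some text",
        "#+end_center",
        "#+end_quote",
    ]
    print(match_blocks(doc))
```

Because the name is captured at run time and compared against the top of the stack, the set of names does not need to be known in advance, which is exactly what a fixed EBNF rule cannot express.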
* Re: official orgmode parser 2020-10-26 17:59 ` Tom Gillespie @ 2020-10-26 20:26 ` Ken Mankoff 2020-10-26 21:00 ` Tom Gillespie 0 siblings, 1 reply; 45+ messages in thread From: Ken Mankoff @ 2020-10-26 20:26 UTC (permalink / raw) To: Tom Gillespie Cc: Bastien, Przemysław Kamiński, emacs-orgmode, Nicolas Goaziou

On 2020-10-26 at 10:59 -07, Tom Gillespie <tgbugs@gmail.com> wrote...
> You can identify headlines, but you can't identify nesting level;

Do you need to? This is valid as an entire Org file, I think:

    *** foo
    * bar
    ***** baz

And that can be represented in EBNF. I'm not aware of places where behavior is indent-level specific, except inline tasks, and that edge case can be represented.

> There is a similar issue with the indentation level in
> order to correctly interpret plain lists.

    list ::= ('+' string newline)+ sublist?
    sublist ::= (indent list)+

I think this captures lists?

> Another example of something that requires a stack is the greater
> blocks, where you have #+begin_{name} and #+end_{name}, and the names
> must match.

Definitely not able to be represented in EBNF, unless as you say {name} is a limited vocabulary.

-k.

^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-26 20:26 ` Ken Mankoff @ 2020-10-26 21:00 ` Tom Gillespie 2020-10-26 21:37 ` Ken Mankoff 2020-10-27 5:42 ` Przemysław Kamiński 0 siblings, 2 replies; 45+ messages in thread From: Tom Gillespie @ 2020-10-26 21:00 UTC (permalink / raw) To: Ken Mankoff Cc: Bastien, Przemysław Kamiński, emacs-orgmode, Nicolas Goaziou Here is an attempt to clarify my own confusion around the nested structures in org. In short: each node in the headline tree and the plain list tree can be parsed using the EBNF, the nesting level cannot, which means that certain useful operations, such as folding, require additional rules beyond the grammar. More inline. Best! Tom > Do you need to? This is valid as an entire Org file, I think: > > *** foo > * bar > ***** baz > > And that can be represented in EBNF. I'm not aware of places where behavior is indent-level specific, except inline tasks, and that edge case can be represented. You are correct, and as long as the heading depth doesn't change some interpretation then this is a non-issue. The reason I mentioned this though is because it means that you cannot determine how to correctly fold an org file from the grammar alone. To make sure I understand. It is possible to determine the number of leading stars (and thus the level), but I think that it is not possible to identify the end of a section. For example * a *** b ** c * d You can parse out a 1, b 3, c 2, d 1, but if you want to be able to nest b and c inside a but not nest d inside a, then you need a stack in there somewhere. You can't have a rule such as section : headline content content : text | section because the parse would incorrectly nest sections at the same level, you would have to write section-level-1 : headline-1 content-1 content-1 : text | section-level-2-n but since we have an arbitrary number of levels the grammar would have to be infinite. 
This is only if you want your grammar to be able to encode that the content of sections can include other more deeply nested sections, which in this context we almost certainly do not (as you point out). > > There is a similar issue with the indentation level in > > order to correctly interpret plain lists. > > list ::= ('+' string newline)+ sublist? > sublist ::= (indent list)+ > > I think this captures lists? Ah yes, I see my mistake here. In order for this to work the parser has to implement significant whitespace, so whitespace cannot be parsed into a single token. I think everything works out after that. > Definitely not able to be represented in EBNF, unless as you say {name} is a limited vocabulary. Darn those pesky open sets! ^ permalink raw reply [flat|nested] 45+ messages in thread
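The a/b/c/d example from this message can be turned into a tree with a small post-processing pass over the flat list of headlines. The following Python sketch is an illustration under stated assumptions, not how org-element does it: `build_outline` and `HEADLINE_RE` are hypothetical names, a headline is assumed to be one or more stars followed by a space, and everything that is not a headline is skipped.

```python
import re

# Assumed simplification: stars at column 0, a single space, then the title.
HEADLINE_RE = re.compile(r"^(\*+) (.*)$")


def build_outline(lines):
    """Nest headlines by star count using a stack of currently open sections."""
    root = {"level": 0, "title": None, "children": []}
    stack = [root]  # root has level 0, so it is never popped
    for line in lines:
        match = HEADLINE_RE.match(line)
        if not match:
            continue  # section body text is ignored in this sketch
        level = len(match.group(1))
        node = {"level": level, "title": match.group(2), "children": []}
        # Close every open section that is at the same level or deeper.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


if __name__ == "__main__":
    outline = build_outline(["* a", "*** b", "** c", "* d"])
    for top in outline["children"]:
        print(top["title"], [child["title"] for child in top["children"]])
```

With the input from the message, `b` and `c` end up inside `a` while `d` becomes a sibling of `a`. The grammar only has to recognise individual headlines; the stack supplies the "end of section" information that the flat parse lacks.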
* Re: official orgmode parser 2020-10-26 21:00 ` Tom Gillespie @ 2020-10-26 21:37 ` Ken Mankoff 2020-10-26 22:19 ` Tom Gillespie 2020-10-27 5:42 ` Przemysław Kamiński 1 sibling, 1 reply; 45+ messages in thread From: Ken Mankoff @ 2020-10-26 21:37 UTC (permalink / raw) To: Tom Gillespie Cc: Bastien, Przemysław Kamiński, emacs-orgmode, Nicolas Goaziou On 2020-10-26 at 14:00 -07, Tom Gillespie <tgbugs@gmail.com> wrote... >> list ::= ('+' string newline)+ sublist? >> sublist ::= (indent list)+ >> >> I think this captures lists? > > Ah yes, I see my mistake here. In order for this to work the parser > has to implement significant whitespace, so whitespace cannot be > parsed into a single token. I think everything works out after that. If we agree that the syntax above captures lists and sublists, then I think we could apply the same methods to the issue of headlines and sub-headlines? -k. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-26 21:37 ` Ken Mankoff @ 2020-10-26 22:19 ` Tom Gillespie 0 siblings, 0 replies; 45+ messages in thread From: Tom Gillespie @ 2020-10-26 22:19 UTC (permalink / raw) To: Ken Mankoff Cc: Bastien, Przemysław Kamiński, emacs-orgmode, Nicolas Goaziou Even if this did work for plain lists it won't work for headlines because headlines have an arbitrary number of stars and thus it is not possible for the grammar to know what is a sub-headline vs "the next headline". For a similar reason I'm fairly sure that the sublist approach will not work due to issues with relative indent. Here is the quote from the current draft syntax. > An item ends before the next item, the first line less or equally indented > than its starting line, or two consecutive empty lines. Indentation of lines > within other greater elements do not count, neither do inlinetasks boundaries. The "the first line less or equally indented than its starting line" section is what prevents your approach from working because you have to know the relative indentation in order to figure out which list contains a nested list. As written your grammar will parse a nested list into a flat list. This is because there are an arbitrary number of distinct tokens that could be =indent= in your grammar and the EBNF can't specify an ordering for them, so you can't say that one indent is greater than another. For list termination the rule seems to be two new lines followed by not a list element. As a result of this, my inclination is to only parse plain list elements and reconstruct the whole "list" only as an internal semantic. Check the behavior of 1. to 1. see 1. what 1. I 1. mean 1. 1. 1. ^ permalink raw reply [flat|nested] 45+ messages in thread
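The approach suggested at the end of this message (parse items individually, then rebuild the list structure afterwards) could look roughly like the Python sketch below. It is an assumption-laden illustration: `nest_items` is a hypothetical name, only "-" and "+" bullets indented with spaces are recognised, and continuation lines, numbered bullets, and the "two consecutive empty lines" termination rule from the quoted syntax draft are all ignored.

```python
def nest_items(lines):
    """Rebuild list nesting from each item's indentation, compared numerically."""
    root = {"indent": -1, "text": None, "children": []}
    stack = [root]  # root's indent of -1 is below any real indent
    for line in lines:
        stripped = line.lstrip(" ")
        if not stripped.startswith(("- ", "+ ")):
            continue  # not a list item in this simplified sketch
        indent = len(line) - len(stripped)
        node = {"indent": indent, "text": stripped[2:].strip(), "children": []}
        # An item ends at the first item that is less or equally indented.
        while stack[-1]["indent"] >= indent:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


if __name__ == "__main__":
    tree = nest_items(["- a", "  - b", "      - c", "  - d", "- e"])

    def show(node, depth=0):
        for child in node["children"]:
            print("  " * depth + child["text"])
            show(child, depth + 1)

    show(tree)
```

The comparison `stack[-1]["indent"] >= indent` is the ordering on indents that the message says an EBNF token cannot express: the indent widths are arbitrary (2 and 6 columns here), and only their relative order decides the nesting.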
* Re: official orgmode parser 2020-10-26 21:00 ` Tom Gillespie 2020-10-26 21:37 ` Ken Mankoff @ 2020-10-27 5:42 ` Przemysław Kamiński 1 sibling, 0 replies; 45+ messages in thread From: Przemysław Kamiński @ 2020-10-27 5:42 UTC (permalink / raw) To: emacs-orgmode I'm no expert in parsing but I would expect org's parser to be quite similar to the multitude of markdown or CommonMark [1] parsers. There isn't that much difference in syntax, except maybe org is more versatile and has more syntax elements, like drawers. Searching for "EBNF Markdown" I stumbled upon [2]. [1] https://commonmark.org/ [2] http://roopc.net/posts/2014/markdown-cfg/ On 10/26/20 10:00 PM, Tom Gillespie wrote: > Here is an attempt to clarify my own confusion around the nested > structures in org. In short: each node in the headline tree and the > plain list tree can be parse using the EBNF, the nesting level cannot, > which means that certain useful operations such as folding, require > additional rules beyond the grammar. More in line. Best! > Tom > >> Do you need to? This is valid as an entire Org file, I think: >> >> *** foo >> * bar >> ***** baz >> >> And that can be represented in EBNF. I'm not aware of places where behavior is indent-level specific, except inline tasks, and that edge case can be represented. > > You are correct, and as long as the heading depth doesn't change some > interpretation then this is a non-issue. The reason I mentioned this > though is > because it means that you cannot determine how to correctly fold an > org file from the grammar alone. > > To make sure I understand. It is possible to determine the number of > leading stars (and thus the level), but I think that it is not > possible to identify the end of a section. > For example > > * a > *** b > ** c > * d > > You can parse out a 1, b 3, c 2, d 1, but if you want to be able to > nest b and c inside a but not nest d inside a, then you need a stack > in there somewhere. 
You > can't have a rule such as > > section : headline content > content : text | section > > because the parse would incorrectly nest sections at the same level, > you would have to write > > section-level-1 : headline-1 content-1 > content-1 : text | section-level-2-n > > but since we have an arbitrary number of levels the grammar would have > to be infinite. > This is only if you want your grammar to be able to encode that the > content of sections > can include other more deeply nested sections, which in this context > we almost certainly > do not (as you point out). > >>> There is a similar issue with the indentation level in >>> order to correctly interpret plain lists. >> >> list ::= ('+' string newline)+ sublist? >> sublist ::= (indent list)+ >> >> I think this captures lists? > > Ah yes, I see my mistake here. In order for this to work the parser > has to implement significant whitespace, > so whitespace cannot be parsed into a single token. I think everything > works out after that. > >> Definitely not able to be represented in EBNF, unless as you say {name} is a limited vocabulary. > > Darn those pesky open sets! > ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-26 16:47 ` Ken Mankoff 2020-10-26 17:59 ` Tom Gillespie @ 2020-11-11 8:59 ` Bastien 1 sibling, 0 replies; 45+ messages in thread From: Bastien @ 2020-11-11 8:59 UTC (permalink / raw) To: Ken Mankoff; +Cc: Przemysław Kamiński, emacs-orgmode, Nicolas Goaziou Hi Ken, Ken Mankoff <mankoff@gmail.com> writes: > On 2020-10-26 at 09:24 -07, Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote... >> # This is a comment (1) >> >> #+begin_example >> # This is not a comment (2) >> #+end_example >> >> AFAICT, you cannot distinguish between lines (1) and (2) with EBNF. > > I agree. I think this is a better (correct?) example than the > footnotes on Org Syntax page. Can you suggest a patch? -- Bastien ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: official orgmode parser 2020-10-26 16:17 ` Ken Mankoff 2020-10-26 16:24 ` Nicolas Goaziou @ 2020-11-11 9:00 ` Bastien 1 sibling, 0 replies; 45+ messages in thread From: Bastien @ 2020-11-11 9:00 UTC (permalink / raw) To: Ken Mankoff; +Cc: Przemysław Kamiński, emacs-orgmode, Nicolas Goaziou Hi Ken, Ken Mankoff <mankoff@gmail.com> writes: > Yes, I meant to write that I think Org syntax is maybe *not* > context-free, and therefore EBNF can't capture all of it. But it could > still be very helpful and capture most of it. Perhaps. Are you willing to give it a try and report here? -- Bastien ^ permalink raw reply [flat|nested] 45+ messages in thread