emacs-orgmode@gnu.org archives
* official orgmode parser
@ 2020-09-15  7:58 Przemysław Kamiński
  2020-09-15  8:44 ` Gerry Agbobada
                   ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Przemysław Kamiński @ 2020-09-15  7:58 UTC (permalink / raw)
  To: emacs-orgmode

Hello,

I oftentimes find myself needing to parse org files with some external 
tools (to generate reports for customers or sum up clock times for given 
month, etc). Looking through the list

https://orgmode.org/worg/org-tools/

and having tested some of these, I must say they are lacking. The 
Haskell ones seem to be done best, but then the compile overhead of 
Haskell and difficulty in embedding this into other languages is a drawback.

I think it might benefit the community if such an official parser
existed (and maybe it could be hooked into org mode directly).

I was thinking of picking some Scheme like Chicken or Guile, which could
later be easily embedded into C or whatever, and then using that parser
in org mode itself. This way an important part of org mode would live
outside of the small world of elisp.

This is just an idea, what do you think? :)

Best,
Przemek


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-09-15  7:58 official orgmode parser Przemysław Kamiński
@ 2020-09-15  8:44 ` Gerry Agbobada
  2020-09-16 16:36   ` Matt Huszagh
  2020-09-23  8:09   ` Bastien
  2020-09-15  9:03 ` Tim Cross
  2020-09-23  8:09 ` Bastien
  2 siblings, 2 replies; 45+ messages in thread
From: Gerry Agbobada @ 2020-09-15  8:44 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1686 bytes --]

Hi,

I'm currently toying with the idea of trying a tree-sitter parser for Org. The very static nature of a shared-object parser (given that TODO keywords are pretty dynamic, for example) is a challenge I'm not sure I can overcome; to be honest, even without that I can't say I'll manage to do it.

Having a tree-sitter parser would be really great in my opinion; at the very least it's a clearer way to "freeze" the syntax, with some tests describing the syntax tree as S-expressions. Tree-sitter also seems to be the popular, sought-after solution to slow parsing (and incremental parsing would help with big Org files, in my opinion).

On Tue, Sep 15, 2020, at 09:58, Przemysław Kamiński wrote:
> Hello,
> 
> I oftentimes find myself needing to parse org files with some external 
> tools (to generate reports for customers or sum up clock times for given 
> month, etc). Looking through the list
> 
> https://orgmode.org/worg/org-tools/
> 
> and having tested some of these, I must say they are lacking. The 
> Haskell ones seem to be done best, but then the compile overhead of 
> Haskell and difficulty in embedding this into other languages is a drawback.
> 
> I think it might benefit the community when such an official parser 
> would exist (and maybe could be hooked into org mode directly).
> 
> I was thinking picking some scheme like chicken or guile, which could be 
> later easily embedded into C or whatever. Then use that parser in org 
> mode itself. This way some important part of org mode would be outside 
> of the small world of elisp.
> 
> This is just an idea, what do you think? :)
> 
> Best,
> Przemek
> 
> 

Gerry Agbobada



* Re: official orgmode parser
  2020-09-15  7:58 official orgmode parser Przemysław Kamiński
  2020-09-15  8:44 ` Gerry Agbobada
@ 2020-09-15  9:03 ` Tim Cross
  2020-09-15  9:17   ` Przemysław Kamiński
  2020-09-23  8:09 ` Bastien
  2 siblings, 1 reply; 45+ messages in thread
From: Tim Cross @ 2020-09-15  9:03 UTC (permalink / raw)
  To: emacs-orgmode


Przemysław Kamiński <pk@intrepidus.pl> writes:

> Hello,
>
> I oftentimes find myself needing to parse org files with some external 
> tools (to generate reports for customers or sum up clock times for given 
> month, etc). Looking through the list
>
> https://orgmode.org/worg/org-tools/
>
> and having tested some of these, I must say they are lacking. The 
> Haskell ones seem to be done best, but then the compile overhead of 
> Haskell and difficulty in embedding this into other languages is a drawback.
>
> I think it might benefit the community when such an official parser 
> would exist (and maybe could be hooked into org mode directly).
>
> I was thinking picking some scheme like chicken or guile, which could be 
> later easily embedded into C or whatever. Then use that parser in org 
> mode itself. This way some important part of org mode would be outside 
> of the small world of elisp.
>
> This is just an idea, what do you think? :)
>

The problem with this idea is maintenance. It is also partly why
external tools are not terribly reliable/good. Org mode is constantly
being enhanced and improved. It is very hard for external tools to keep
pace with org-mode development, so they soon get out of date or stop
working correctly. 

Org mode IS an elisp application. This is the main goal. The reason it
works so well is because elisp is largely a DSL that focuses on text
manipulation and is therefore ideally suited for a text based organiser. 

This means if you want to implement parsing of org files in any
other language, there is a lot of fundamental functionality which will
need to be implemented that is not necessary when using elisp as it is
already built-in. Not only that, it is also 'battle hardened' and well
tested. The other problem would be in selecting another language which
behaves consistently across all the platforms Emacs and org-mode are
supported on. As org-mode is a standard part of Emacs, it also needs to
be implemented in something which is also available on all the platforms
emacs is on without needing the user to install additional software. 

The other issue is that you would need another skill in order to
maintain/extend org-mode. In addition to elisp, you will also need to
know whatever the parser implementation language is.

A third negative is that if the parser was in a different language to
elisp, the interface between the rest of org mode (in elisp) and the
parser would become an issue. At the moment, there are far fewer
barriers as it is all elisp. However, if part of the system is in
another language, you are now restricted to whatever defined interface
exists. This would likely also have performance issues and overheads
associated with translating from one format to another etc.

So, in short, the chances of org mode using a parser written in
something other than elisp are pretty close to 0. This leaves you with 2
options -

1. Implement another external tool which can parse org-files. As
mentioned above, this is a non-trivial task and will likely be difficult
to maintain. Probably not the best first choice.

2. Provide some details about your workflow where you believe you need
to use external tools to process the org-files. It is very likely there
are alternative approaches to give you the result you want, but without
the need to do external parsing of org-files. There aren't sufficient
details in the examples you mention to provide any specific suggestions.
However, I have used org-mode for reporting, invoicing, time tracking,
documentation, issue/request tracking, project planning and project
management and never needed to parse my org files with an external tool.
I have exported the data in different formats which have then been
processed by other tools and I have tweaked my setup to support various
enterprise/corporate standards or requirements (logos, corporate
colours, report formats, etc). Sometimes these tweaks are trivial and
others require more extensive effort. Often, others have had to do
something the same or similar and have working examples etc.

So my recommendation is post some messages to this list with details on
what you need to try and do and see what others can suggest. I would
keep each post to a single item rather than one long post with multiple
requests. From watching this list, I've often seen someone post a "How
can I ..." question only to get the answer "Oh, that is already
built-in, just do .....". Org is a large application with lots of
sophisticated power that isn't always obvious from just reading the
manual. 




* Re: official orgmode parser
  2020-09-15  9:03 ` Tim Cross
@ 2020-09-15  9:17   ` Przemysław Kamiński
  2020-09-15  9:55     ` Russell Adams
                       ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Przemysław Kamiński @ 2020-09-15  9:17 UTC (permalink / raw)
  To: emacs-orgmode

On 9/15/20 11:03 AM, Tim Cross wrote:
> 
> Przemysław Kamiński <pk@intrepidus.pl> writes:
> 
>> Hello,
>>
>> I oftentimes find myself needing to parse org files with some external
>> tools (to generate reports for customers or sum up clock times for given
>> month, etc). Looking through the list
>>
>> https://orgmode.org/worg/org-tools/
>>
>> and having tested some of these, I must say they are lacking. The
>> Haskell ones seem to be done best, but then the compile overhead of
>> Haskell and difficulty in embedding this into other languages is a drawback.
>>
>> I think it might benefit the community when such an official parser
>> would exist (and maybe could be hooked into org mode directly).
>>
>> I was thinking picking some scheme like chicken or guile, which could be
>> later easily embedded into C or whatever. Then use that parser in org
>> mode itself. This way some important part of org mode would be outside
>> of the small world of elisp.
>>
>> This is just an idea, what do you think? :)
>>
> 
> The problem with this idea is maintenance. It is also partly why
> external tools are not terribly reliable/good. Org mode is constantly
> being enhanced and improved. It is very hard for external tools to keep
> pace with org-mode development, so they soon get out of date or stop
> working correctly.
> 
> Org mode IS an elisp application. This is the main goal. The reason it
> works so well is because elisp is largely a DSL that focuses on text
> manipulation and is therefore ideally suited for a text based organiser.
> 
> This means if you want to implement parsing of org files in any
> other language, there is a lot of fundamental functionality which will
> need to be implemented that is not necessary when using elisp as it is
> already built-in. Not only that, it is also 'battle hardened' and well
> tested. The other problem would be in selecting another language which
> behaves consistently across all the platforms Emacs and org-mode is
> supported on. As org-mode is a standard part of Emacs, it also needs to
> be implemented in something which is also available on all the platforms
> emacs is on without needing the user to install additional software.
> 
> The other issue is that you would need another skill in order to
> maintain/extend org-mode. In addition to elisp, you will also need to
> know whatever the parser implementation language is.
> 
> A third negative is that if the parser was in a different language to
> elisp, the interface between the rest of org mode (in elisp) and the
> parser would become an issue. At the moment, there are far fewer
> barriers as it is all elisp. However, if part of the system is in
> another language, you are now restricted to whatever defined interface
> exists. This would likely also have performance issues and overheads
> associated with translating from one format to another etc.
> 
> So, in short, the chances of org mode using a parser written in
> something other than elisp is pretty close to 0. This leaves you with 2
> options -
> 
> 1. Implement another external tool which can parse org-files. As
> mentioned above, this is a non-trivial task and will likely be difficult
> to maintain. Probably not the best first choice.
> 
> 2. Provide some details about your workflow where you believe you need
> to use external tools to process the org-files. It is very likely there
> are alternative approaches to give you the result you want, but without
> the need to do external parsing of org-files. There isn't sufficient
> details in the examples you mention to provide any specific details.
> However, I have used org-mode for reporting, invoicing, time tracking,
> documentation, issue/request tracking, project planning and project
> management and never needed to parse my org files with an external tool.
> I have exported the data in different formats which have then been
> processed by other tools and I have tweaked my setup to support various
> enterprise/corporate standards or requirements (logos, corporate
> colours, report formats, etc). Sometimes these tweaks are trivial and
> others require more extensive effort. Often, others have had to do
> something the same or similar and have working examples etc.
> 
> So my recommendation is post some messages to this list with details on
> what you need to try and do and see what others can suggest. I would
> keep each post to a single item rather than one long post with multiple
> requests. From watching this list, I've often seen someone post a "How
> can I ..." question only to get the answer "Oh, that is already
> built-in, just do .....". Org is a large application with lots of
> sophisticated power that isn't always obvious from just reading the
> manual.
> 
> 

So, I keep clock times for work in org mode, this is very handy. 
However, my customers require that I use their service to provide the 
times. They do offer an API. So basically I'm using elisp to parse org,
make API calls, and at the same time generate CSV reports with a Python 
interop with org babel (because my elisp is just too bad to do that). If 
I had access to some org parser, I'd pick a language that would be more 
comfortable for me to get the job done. I guess it can all be done in 
elisp, however this is just a tool for me alone and I have limited time 
resources on hacking things for myself :)

Another one is generating total hours report for day/week/month to put 
into my awesomewm toolbar. I ended up using orgstat
https://github.com/volhovM/orgstat
however the author is creating his own DSL in YAML and I guess things 
would be much better off if it all stayed in some Scheme :)
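
For what it's worth, day/week/month totals of this kind can be roughly
approximated straight from the CLOCK lines. The following is only a sketch,
not how orgstat works: the function names are made up for illustration, it
assumes closed clock entries of the usual form
"CLOCK: [2020-09-15 Tue 09:00]--[2020-09-15 Tue 10:30] =>  1:30", and it
books the whole interval on the start date.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed shape of a closed clock entry (running clocks are ignored):
# CLOCK: [2020-09-15 Tue 09:00]--[2020-09-15 Tue 10:30] =>  1:30
CLOCK_RE = re.compile(
    r"CLOCK:\s*\[(\d{4}-\d{2}-\d{2})[^\]\d]*(\d{1,2}:\d{2})\]"
    r"--\[(\d{4}-\d{2}-\d{2})[^\]\d]*(\d{1,2}:\d{2})\]"
)


def _stamp(date, time):
    """Combine the date and time parts of an Org timestamp."""
    return datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M")


def period_key(moment, period):
    """Bucket label for a datetime: per day, ISO week, or month."""
    if period == "day":
        return moment.strftime("%Y-%m-%d")
    if period == "week":
        year, week, _ = moment.isocalendar()
        return f"{year}-W{week:02d}"
    if period == "month":
        return moment.strftime("%Y-%m")
    raise ValueError(f"unknown period: {period}")


def clocked_minutes(org_text, period="day"):
    """Sum clocked minutes per period; each interval counts on its start date."""
    totals = defaultdict(int)
    for match in CLOCK_RE.finditer(org_text):
        start = _stamp(match.group(1), match.group(2))
        end = _stamp(match.group(3), match.group(4))
        minutes = int((end - start).total_seconds() // 60)
        totals[period_key(start, period)] += minutes
    return dict(totals)
```

A periodic script could call clocked_minutes(open(path).read(), "week") and
print the result for the toolbar widget; clocks that span midnight would
need splitting if per-day accuracy matters.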

Anyways, my parser needs aren't that sophisticated: just parse the file, 
return headings with clock drawers. I tried the common lisp library but 
got frustrated after fiddling with it for a couple of hours.

Best,
Przemek



* Re: official orgmode parser
  2020-09-15  9:17   ` Przemysław Kamiński
@ 2020-09-15  9:55     ` Russell Adams
  2020-09-15 11:15       ` Przemysław Kamiński
  2020-09-16  0:16     ` Tim Cross
  2020-09-16  7:24     ` Marcin Borkowski
  2 siblings, 1 reply; 45+ messages in thread
From: Russell Adams @ 2020-09-15  9:55 UTC (permalink / raw)
  To: emacs-orgmode

On Tue, Sep 15, 2020 at 11:17:57AM +0200, Przemysław Kamiński wrote:
> > Org mode IS an elisp application. This is the main goal. The reason it
> > works so well is because elisp is largely a DSL that focuses on text
> > manipulation and is therefore ideally suited for a text based organiser.
>
> So, I keep clock times for work in org mode, this is very handy.
> However, my customers require that I use their service to provide the
> times. They do offer API. So basically I'm using elisp to parse org,
> make API calls, and at the same time generate CSV reports with a Python
> interop with org babel (because my elisp is just too bad to do
> that).

Please consider that this is a very specialized use case.

> If I had access to some org parser, I'd pick a language that would
> be more comfortable for me to get the job done. I guess it can all
> be done in elisp, however this is just a tool for me alone and I
> have limited time resources on hacking things for myself :)

Maintainer time is limited too. Maintaining a parser library outside
of Emacs would be difficult for the reasons already given. I'd
encourage you to pick up some more Elisp, which I am also trying to
do.

> Anyways, my parser needs aren't that sophisticated: just parse the file,
> return headings with clock drawers. I tried the common lisp library but
> got frustrated after fiddling with it for couple of hours.

If it's that small you could always do that in Python with regexps for
your usage if you're more comfortable in Python. Org's plain text
format means you can read it with anything. I suspect grep might even
pull headlines and clocks successfully.
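
A minimal sketch of that regexp approach, under stated assumptions: headings
are lines starting with one or more stars, closed clock entries look like
"CLOCK: [2020-09-15 Tue 09:00]--[2020-09-15 Tue 10:30] =>  1:30", and the
name parse_clocks is purely illustrative. It does not handle TODO keywords,
tags, or running clocks specially, so it is no substitute for Org's own
parser.

```python
import re
from datetime import datetime

HEADING_RE = re.compile(r"^(\*+)\s+(.*)$")
CLOCK_RE = re.compile(
    r"^\s*CLOCK:\s*\[(\d{4}-\d{2}-\d{2})[^\]\d]*(\d{1,2}:\d{2})\]"
    r"--\[(\d{4}-\d{2}-\d{2})[^\]\d]*(\d{1,2}:\d{2})\]"
)


def _stamp(date, time):
    """Combine the date and time parts of an Org timestamp."""
    return datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M")


def parse_clocks(org_text):
    """Return [(heading, [(start, end), ...])] for headings that have clocks.

    The heading text is kept verbatim, including any TODO keyword or tags.
    """
    entries = []
    current = None
    for line in org_text.splitlines():
        heading = HEADING_RE.match(line)
        if heading:
            current = (heading.group(2).strip(), [])
            entries.append(current)
            continue
        clock = CLOCK_RE.match(line)
        if clock and current is not None:
            start = _stamp(clock.group(1), clock.group(2))
            end = _stamp(clock.group(3), clock.group(4))
            current[1].append((start, end))
    # Keep only the headings that actually carry clock entries.
    return [(title, clocks) for title, clocks in entries if clocks]
```

Each clock is attached to the nearest heading above it, which is how clock
lines in a LOGBOOK drawer sit in the file.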



I haven't looked at the elisp parser much, but I do wonder if someone
couldn't write an exporter that exports a programmatic version of your
org file data (e.g. to XML). Then other tools could ingest those XML
files. That'd certainly be a contrib module and not in the core, but
might be worth your while to explore the idea if you really want to
work with Org data outside of Emacs.


------------------------------------------------------------------
Russell Adams                            RLAdams@AdamsInfoServ.com

PGP Key ID:     0x1160DCB3           http://www.adamsinfoserv.com/

Fingerprint:    1723 D8CA 4280 1EC9 557F  66E8 1154 E018 1160 DCB3



* Re: official orgmode parser
  2020-09-15  9:55     ` Russell Adams
@ 2020-09-15 11:15       ` Przemysław Kamiński
  2020-09-15 12:37         ` tomas
  0 siblings, 1 reply; 45+ messages in thread
From: Przemysław Kamiński @ 2020-09-15 11:15 UTC (permalink / raw)
  To: emacs-orgmode

On 9/15/20 11:55 AM, Russell Adams wrote:
> On Tue, Sep 15, 2020 at 11:17:57AM +0200, Przemysław Kamiński wrote:
>>> Org mode IS an elisp application. This is the main goal. The reason it
>>> works so well is because elisp is largely a DSL that focuses on text
>>> manipulation and is therefore ideally suited for a text based organiser.
>>
>> So, I keep clock times for work in org mode, this is very handy.
>> However, my customers require that I use their service to provide the
>> times. They do offer API. So basically I'm using elisp to parse org,
>> make API calls, and at the same time generate CSV reports with a Python
>> interop with org babel (because my elisp is just too bad to do
>> that).
> 
> Please consider this is a very specialized use case.
> 
>> If I had access to some org parser, I'd pick a language that would
>> be more comfortable for me to get the job done. I guess it can all
>> be done in elisp, however this is just a tool for me alone and I
>> have limited time resources on hacking things for myself :)
> 
> Maintainer time is limited too. Maintaining a parser library outside
> of Emacs would be difficult for the reasons already given. I'd
> encourage you to pick up some more Elisp, which I am also trying to
> do.
> 
>> Anyways, my parser needs aren't that sophisticated: just parse the file,
>> return headings with clock drawers. I tried the common lisp library but
>> got frustrated after fiddling with it for couple of hours.
> 
> If it's that small you could always do that in Python with regexps for
> your usage if you're more comfortable in Python. Org's plain text
> format means you can read it with anything. I suspect grep might even
> pull headlines and clocks successfully.
> 
> 
> 
> I haven't looked at the elisp parser much, but I do wonder if someone
> couldn't write an exporter that exports a programmatic version of your
> org file data (ie: to xml). Then other tools could ingest those xml
> files. That'd certainly be a contrib module and not in the core, but
> might be worth your while to explore the idea if you really want to
> work with Org data outside of Emacs.
> 
> 
> ------------------------------------------------------------------
> Russell Adams                            RLAdams@AdamsInfoServ.com
> 
> PGP Key ID:     0x1160DCB3           http://www.adamsinfoserv.com/
> 
> Fingerprint:    1723 D8CA 4280 1EC9 557F  66E8 1154 E018 1160 DCB3
> 

There's the org-json (or ox-json) package but for some reason I wasn't 
able to run it successfully. I guess export to S-exps would be best 
here. But yes I'll check that out.

Przemek



* Re: official orgmode parser
  2020-09-15 11:15       ` Przemysław Kamiński
@ 2020-09-15 12:37         ` tomas
  2020-09-15 18:09           ` Diego Zamboni
  2020-09-16 12:09           ` Przemysław Kamiński
  0 siblings, 2 replies; 45+ messages in thread
From: tomas @ 2020-09-15 12:37 UTC (permalink / raw)
  To: Przemysław Kamiński; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1427 bytes --]

On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:

[...]

> There's the org-json (or ox-json) package but for some reason I
> wasn't able to run it successfully. I guess export to S-exps would
> be best here. But yes I'll check that out.

If that's your route, perhaps the "Org element API" [1] might be
helpful. Especially `org-element-parse-buffer' gives you a Lisp
data structure which is supposed to be a parse of your Org buffer.

From there to S-expression can be trivial (e.g. `print' or `pp'),
depending on what you want to do.

Walking the structure should be nice in Lisp, too.

The topic of (non-Emacs) parsing of Org comes up regularly, and
there is a good (but AFAIK not-quite-complete) Org syntax spec
in Worg [2], but there are a couple of difficulties to be mastered
before such a thing can become really enjoyable and useful.

The loose specification of Org's format (arguably its second
or third strongest asset, the first two being its incredible
community and Emacs itself) is something which makes this
problem "interesting". People have invented lots of usages
which might be broken should Org change to a strict formal
spec. You don't want to break those people.

But yes, perhaps some day someone nails it. Perhaps it's you :)

Cheers

[1] https://orgmode.org/worg/dev/org-element-api.html
[2] https://orgmode.org/worg/dev/org-syntax.html

 - t

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


* Re: official orgmode parser
  2020-09-15 12:37         ` tomas
@ 2020-09-15 18:09           ` Diego Zamboni
  2020-09-16 12:09           ` Przemysław Kamiński
  1 sibling, 0 replies; 45+ messages in thread
From: Diego Zamboni @ 2020-09-15 18:09 UTC (permalink / raw)
  To: tomas; +Cc: Przemysław Kamiński, Org-mode

[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]

There's also org-ql (https://github.com/alphapapa/org-ql), which also
provides a query-based API against Org structures.

--Diego


On Tue, Sep 15, 2020 at 2:59 PM <tomas@tuxteam.de> wrote:

> On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:
>
> [...]
>
> > There's the org-json (or ox-json) package but for some reason I
> > wasn't able to run it successfully. I guess export to S-exps would
> > be best here. But yes I'll check that out.
>
> If that's your route, perhaps the "Org element API" [1] might be
> helpful. Especially `org-element-parse-buffer' gives you a Lisp
> data structure which is supposed to be a parse of your Org buffer.
>
> From there to S-expression can be trivial (e.g. `print' or `pp'),
> depending on what you want to do.
>
> Walking the structure should be nice in Lisp, too.
>
> The topic of (non-Emacs) parsing of Org comes up regularly, and
> there is a good (but AFAIK not-quite-complete) Org syntax spec
> in Worg [2], but there are a couple of difficulties to be mastered
> before such a thing can become really enjoyable and useful.
>
> The loose specification of Org's format (arguably its second
> or third strongest asset, the first two being its incredible
> community and Emacs itself) is something which makes this
> problem "interesting". People have invented lots of usages
> which might be broken should Org change to a strict formal
> spec. You don't want to break those people.
>
> But yes, perhaps some day someone nails it. Perhaps it's you :)
>
> Cheers
>
> [1] https://orgmode.org/worg/dev/org-element-api.html
> [2] https://orgmode.org/worg/dev/org-syntax.html
>
>  - t
>



* Re: official orgmode parser
  2020-09-15  9:17   ` Przemysław Kamiński
  2020-09-15  9:55     ` Russell Adams
@ 2020-09-16  0:16     ` Tim Cross
  2020-09-16  7:24     ` Marcin Borkowski
  2 siblings, 0 replies; 45+ messages in thread
From: Tim Cross @ 2020-09-16  0:16 UTC (permalink / raw)
  To: emacs-orgmode


Przemysław Kamiński <pk@intrepidus.pl> writes:

>
> So, I keep clock times for work in org mode, this is very handy. 
> However, my customers require that I use their service to provide the 
> times. They do offer API. So basically I'm using elisp to parse org, 
> make API calls, and at the same time generate CSV reports with a Python 
> interop with org babel (because my elisp is just too bad to do that). If 
> I had access to some org parser, I'd pick a language that would be more 
> comfortable for me to get the job done. I guess it can all be done in 
> elisp, however this is just a tool for me alone and I have limited time 
> resources on hacking things for myself :)
>

I would probably use org's org-table-export command to export the clock
table as a CSV and then just use a simple script to read in that CSV and
do the API calls. 
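
The "simple script" half could look roughly like the sketch below. It
assumes the exported CSV has "Headline" and "Time" columns with times in
H:MM form, skips the clock table's total row, and only builds payload
dictionaries; the payload field names are hypothetical, and the actual
request depends entirely on the customer's API.

```python
import csv
import io


def hmm_to_minutes(value):
    """Convert an 'H:MM' duration (optionally wrapped in *bold*) to minutes."""
    hours, minutes = value.strip().strip("*").split(":")
    return int(hours) * 60 + int(minutes)


def rows_to_payloads(csv_text):
    """Turn exported clock-table rows into dicts ready to send to an API.

    Assumes a header row with 'Headline' and 'Time' columns.
    """
    payloads = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        headline = (row.get("Headline") or "").strip()
        time = (row.get("Time") or "").strip()
        # Skip empty rows and the summary row of the clock table.
        if not headline or not time or "Total time" in headline:
            continue
        payloads.append(
            {"description": headline, "minutes": hmm_to_minutes(time)}
        )
    return payloads
```

Durations that include days (for example "1d 2:30") are not handled here and
would need extra parsing.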

> Another one is generating total hours report for day/week/month to put 
> into my awesomewm toolbar. I ended up using orgstat
> https://github.com/volhovM/orgstat
> however the author is creating his own DSL in YAML and I guess things 
> were much better off if it all stayed in some Scheme :)
>

Sounds like you have a solution. I would probably just set up a hook to
generate the updated table and export it when the file is saved and then
have something consume that exported file to update the taskbar. 

-- 
Tim Cross



* Re: official orgmode parser
  2020-09-15  9:17   ` Przemysław Kamiński
  2020-09-15  9:55     ` Russell Adams
  2020-09-16  0:16     ` Tim Cross
@ 2020-09-16  7:24     ` Marcin Borkowski
  2020-09-16  7:56       ` Ihor Radchenko
  2 siblings, 1 reply; 45+ messages in thread
From: Marcin Borkowski @ 2020-09-16  7:24 UTC (permalink / raw)
  To: Przemysław Kamiński; +Cc: emacs-orgmode


On 2020-09-15, at 11:17, Przemysław Kamiński <pk@intrepidus.pl> wrote:

> So, I keep clock times for work in org mode, this is very
> handy. However, my customers require that I use their service to
> provide the times. They do offer API. So basically I'm using elisp to
> parse org, make API calls, and at the same time generate CSV reports
> with a Python interop with org babel (because my elisp is just too bad
> to do that). If I had access to some org parser, I'd pick a language
> that would be more comfortable for me to get the job done. I guess it
> can all be done in elisp, however this is just a tool for me alone and
> I have limited time resources on hacking things for myself :)

I was in the exact same situation - I use Org-mode clocking, and we use
Toggl at our company, so I wrote a simple tool to fire API requests to
Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
It's a bit more than 200 lines of Elisp, so you might try to look into
it and adapt it to whatever tool your employer is using.

> Another one is generating total hours report for day/week/month to put
> into my awesomewm toolbar. I ended up using orgstat
> https://github.com/volhovM/orgstat
> however the author is creating his own DSL in YAML and I guess things
> were much better off if it all stayed in some Scheme :)

Wow, another awesomewm user here; could you share your code?

Best,

-- 
Marcin Borkowski
http://mbork.pl



* Re: official orgmode parser
  2020-09-16  7:24     ` Marcin Borkowski
@ 2020-09-16  7:56       ` Ihor Radchenko
  2020-09-16 11:36         ` Przemysław Kamiński
  0 siblings, 1 reply; 45+ messages in thread
From: Ihor Radchenko @ 2020-09-16  7:56 UTC (permalink / raw)
  To: Marcin Borkowski, Przemysław Kamiński; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 278 bytes --]

> Wow, another awesomewm user here; could you share your code?

Are you interested in something particular about awesome WM integration?

I am using simple textbox widgets to show the currently clocked-in task
and a weighted summary of clocked time. See the attachments.

Best,
Ihor


[-- Attachment #2: statusbar1.png --]
[-- Type: image/png, Size: 23389 bytes --]

[-- Attachment #3: statusbar2.png --]
[-- Type: image/png, Size: 8747 bytes --]

[-- Attachment #4: Type: text/plain, Size: 1561 bytes --]



Marcin Borkowski <mbork@mbork.pl> writes:

> On 2020-09-15, at 11:17, Przemysław Kamiński <pk@intrepidus.pl> wrote:
>
>> So, I keep clock times for work in org mode, this is very
>> handy. However, my customers require that I use their service to
>> provide the times. They do offer API. So basically I'm using elisp to
>> parse org, make API calls, and at the same time generate CSV reports
>> with a Python interop with org babel (because my elisp is just too bad
>> to do that). If I had access to some org parser, I'd pick a language
>> that would be more comfortable for me to get the job done. I guess it
>> can all be done in elisp, however this is just a tool for me alone and
>> I have limited time resources on hacking things for myself :)
>
> I was in the exact same situation - I use Org-mode clocking, and we use
> Toggl at our company, so I wrote a simple tool to fire API requests to
> Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
> It's a bit more than 200 lines of Elisp, so you might try to look into
> it and adapt it to whatever tool your employer is using.
>
>> Another one is generating total hours report for day/week/month to put
>> into my awesomewm toolbar. I ended up using orgstat
>> https://github.com/volhovM/orgstat
>> however the author is creating his own DSL in YAML and I guess things
>> were much better off if it all stayed in some Scheme :)
>
> Wow, another awesomewm user here; could you share your code?
>
> Best,
>
> -- 
> Marcin Borkowski
> http://mbork.pl


* Re: official orgmode parser
  2020-09-16  7:56       ` Ihor Radchenko
@ 2020-09-16 11:36         ` Przemysław Kamiński
  2020-09-16 12:02           ` Ihor Radchenko
  0 siblings, 1 reply; 45+ messages in thread
From: Przemysław Kamiński @ 2020-09-16 11:36 UTC (permalink / raw)
  To: emacs-orgmode

On 9/16/20 9:56 AM, Ihor Radchenko wrote:
>> Wow, another awesomewm user here; could you share your code?
> 
> Are you interested in something particular about awesome WM integration?
> 
> I am using simple textbox widgets to show currently clocked in task and
> weighted summary of clocked time. See the attachments.
> 
> Best,
> Ihor
> 
> 
> 
> 
> Marcin Borkowski <mbork@mbork.pl> writes:
> 
>> On 2020-09-15, at 11:17, Przemysław Kamiński <pk@intrepidus.pl> wrote:
>>
>>> So, I keep clock times for work in org mode, this is very
>>> handy. However, my customers require that I use their service to
>>> provide the times. They do offer API. So basically I'm using elisp to
>>> parse org, make API calls, and at the same time generate CSV reports
>>> with a Python interop with org babel (because my elisp is just too bad
>>> to do that). If I had access to some org parser, I'd pick a language
>>> that would be more comfortable for me to get the job done. I guess it
>>> can all be done in elisp, however this is just a tool for me alone and
>>> I have limited time resources on hacking things for myself :)
>>
>> I was in the exact same situation - I use Org-mode clocking, and we use
>> Toggl at our company, so I wrote a simple tool to fire API requests to
>> Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
>> It's a bit more than 200 lines of Elisp, so you might try to look into
>> it and adapt it to whatever tool your employer is using.
>>
>>> Another one is generating total hours report for day/week/month to put
>>> into my awesomewm toolbar. I ended up using orgstat
>>> https://github.com/volhovM/orgstat
>>> however the author is creating his own DSL in YAML and I guess things
>>> were much better off if it all stayed in some Scheme :)
>>
>> Wow, another awesomewm user here; could you share your code?
>>
>> Best,
>>
>> -- 
>> Marcin Borkowski
>> http://mbork.pl


I don't have any interesting code, just a standard awesomewm setup. I
run a periodic script that outputs the data computed by orgstat and
shows it in the taskbar (it uses the shellout_widget).

However, what Ihor presented is interesting. Do you use a similar
approach, with shellout and 'emacs -batch', to show the currently
running task, or do you 'push' data from Emacs to show it in the
taskbar?

P.



* Re: official orgmode parser
  2020-09-16 11:36         ` Przemysław Kamiński
@ 2020-09-16 12:02           ` Ihor Radchenko
  2020-09-16 12:15             ` Przemysław Kamiński
  0 siblings, 1 reply; 45+ messages in thread
From: Ihor Radchenko @ 2020-09-16 12:02 UTC (permalink / raw)
  To: Przemysław Kamiński, emacs-orgmode

> However what Ihor presented is interesting. Do you use similar approach 
> with shellout and 'emacs -batch' to show currently running task or you 
> 'push' data from emacs to show it in the taskbar?

I prefer to avoid querying Emacs too often, for performance reasons.
Instead, I only update the clocking info when I clock in/out in Emacs.
The clocked-in time is then updated dynamically by an independent bash
script.

The scheme is the following:
1. org clock in/out in Emacs triggers writing the clocking info into
   the ~/.org-clock-in status file
2. a bash script periodically monitors the file and calculates the
   clocked-in time from its contents and the time since its last
   modification
3. the script updates a simple textbox widget using awesome-client
4. the script also warns me (notify-send) when the weighted clocked-in
   time is negative (meaning that I should switch to some more
   productive activity)
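
Step 2 of the scheme above could be sketched as follows, in Python
rather than the bash of the original setup. The format of the status
file is not described in this thread, so the sketch assumes
(hypothetically) that Emacs writes the task title on the first line at
clock-in, that the file's modification time therefore marks the
clock-in time, and that an empty or missing file means no clock is
running. The names clock_status and format_elapsed are illustrative.

```python
import time
from pathlib import Path
from typing import Optional


def format_elapsed(seconds: float) -> str:
    """Render a duration as H:MM, similar to how Org shows clock sums."""
    minutes = max(0, int(seconds)) // 60
    return f"{minutes // 60}:{minutes % 60:02d}"


def clock_status(status_file: Path, now: Optional[float] = None) -> str:
    """Return the widget text for the task named in the status file.

    Assumption: the first line is the task title and the file's mtime
    is the moment of clock-in (the file is rewritten on clock in/out).
    """
    try:
        task = status_file.read_text(encoding="utf-8").strip()
        started = status_file.stat().st_mtime
    except FileNotFoundError:
        return "[no clock]"
    if not task:
        return "[no clock]"
    if now is None:
        now = time.time()
    return f"{task.splitlines()[0]} [{format_elapsed(now - started)}]"
```

The returned string is what a periodic job would hand to the widget;
how it gets there (awesome-client in the setup described above) is left
out of the sketch.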

Best,
Ihor


* Re: official orgmode parser
  2020-09-15 12:37         ` tomas
  2020-09-15 18:09           ` Diego Zamboni
@ 2020-09-16 12:09           ` Przemysław Kamiński
  2020-09-16 12:20             ` tomas
  2020-09-16 12:27             ` Ihor Radchenko
  1 sibling, 2 replies; 45+ messages in thread
From: Przemysław Kamiński @ 2020-09-16 12:09 UTC (permalink / raw)
  To: emacs-orgmode

On 9/15/20 2:37 PM, tomas@tuxteam.de wrote:
> On Tue, Sep 15, 2020 at 01:15:56PM +0200, Przemysław Kamiński wrote:
> 
> [...]
> 
>> There's the org-json (or ox-json) package but for some reason I
>> wasn't able to run it successfully. I guess export to S-exps would
>> be best here. But yes I'll check that out.
> 
> If that's your route, perhaps the "Org element API" [1] might be
> helpful. Especially `org-element-parse-buffer' gives you a Lisp
> data structure which is supposed to be a parse of your Org buffer.
> 
>  From there to S-expression can be trivial (e.g. `print' or `pp'),
> depending on what you want to do.
> 
> Walking the structure should be nice in Lisp, too.
> 
> The topic of (non-Emacs) parsing of Org comes up regularly, and
> there is a good (but AFAIK not-quite-complete) Org syntax spec
> in Worg [2], but there are a couple of difficulties to be mastered
> before such a thing can become really enjoyable and useful.
> 
> The loose specification of Org's format (arguably its second
> or third strongest asset, the first two being its incredible
> community and Emacs itself) is something which makes this
> problem "interesting". People have invented lots of usages
> which might be broken should Org change to a strict formal
> spec. You don't want to break those people.
> 
> But yes, perhaps some day someone nails it. Perhaps it's you :)
> 
> Cheers
> 
> [1] https://orgmode.org/worg/dev/org-element-api.html
> [2] https://orgmode.org/worg/dev/org-syntax.html
> 
>   - t
> 

So I looked at (pp (org-element-parse-buffer)); however, it prints out
recursive structures which Scheme readers have trouble parsing.

My code looks more or less like this:

(defun org-parse (f)
   (with-temp-buffer
     (find-file f)
     (let* ((parsed (org-element-parse-buffer))
            (all (append org-element-all-elements org-element-all-objects))
            (mapped (org-element-map parsed all
                      (lambda (item)
                        (strip-parent item)))))
       (pp mapped))))


strip-parent is basically (plist-put props :parent nil) applied to an
element's properties. However, it turns out there are more recursive
objects, such as

:title
           #("Headline 1" 0 10
             (:parent
              (headline #2
                            (section

So I'm wondering: do I have to do it by hand for all cases, or is there
some way to output only a simple AST without those nested objects?

Best,
Przemek



* Re: official orgmode parser
  2020-09-16 12:02           ` Ihor Radchenko
@ 2020-09-16 12:15             ` Przemysław Kamiński
  2020-09-17  1:18               ` Ihor Radchenko
  0 siblings, 1 reply; 45+ messages in thread
From: Przemysław Kamiński @ 2020-09-16 12:15 UTC (permalink / raw)
  To: emacs-orgmode

On 9/16/20 2:02 PM, Ihor Radchenko wrote:
>> However what Ihor presented is interesting. Do you use similar approach
>> with shellout and 'emacs -batch' to show currently running task or you
>> 'push' data from emacs to show it in the taskbar?
> 
> I prefer to avoid querying emacs too often for performance reasons.
> Instead, I only update the clocking info when I clock in/out in emacs.
> Then, the clocked in time is dynamically updated by independent bash
> script.
> 
> The scheme is the following:
> 1. org clock in/out in Emacs trigger writing clocking info into
>     ~/.org-clock-in status file
> 2. bash script periodically monitors the file and calculates the clocked
>     in time according to the contents and time from last modification
> 3. the script updates simple textbox widget using awesome-client
> 4. the script also warns me (notify-send) when the weighted clocked in
>     time is negative (meaning that I should switch to some more
>     productive activity)
> 
> Best,
> Ihor
> 
> Przemysław Kamiński <pk@intrepidus.pl> writes:
> 
>> On 9/16/20 9:56 AM, Ihor Radchenko wrote:
>>>> Wow, another awesomewm user here; could you share your code?
>>>
>>> Are you interested in something particular about awesome WM integration?
>>>
>>> I am using simple textbox widgets to show currently clocked in task and
>>> weighted summary of clocked time. See the attachments.
>>>
>>> Best,
>>> Ihor
>>>
>>>
>>>
>>>
>>> Marcin Borkowski <mbork@mbork.pl> writes:
>>>
>>>> On 2020-09-15, at 11:17, Przemysław Kamiński <pk@intrepidus.pl> wrote:
>>>>
>>>>> So, I keep clock times for work in org mode, this is very
>>>>> handy. However, my customers require that I use their service to
>>>>> provide the times. They do offer API. So basically I'm using elisp to
>>>>> parse org, make API calls, and at the same time generate CSV reports
>>>>> with a Python interop with org babel (because my elisp is just too bad
>>>>> to do that). If I had access to some org parser, I'd pick a language
>>>>> that would be more comfortable for me to get the job done. I guess it
>>>>> can all be done in elisp, however this is just a tool for me alone and
>>>>> I have limited time resources on hacking things for myself :)
>>>>
>>>> I was in the exact same situation - I use Org-mode clocking, and we use
>>>> Toggl at our company, so I wrote a simple tool to fire API requests to
>>>> Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
>>>> It's a bit more than 200 lines of Elisp, so you might try to look into
>>>> it and adapt it to whatever tool your employer is using.
>>>>
>>>>> Another one is generating total hours report for day/week/month to put
>>>>> into my awesomewm toolbar. I ended up using orgstat
>>>>> https://github.com/volhovM/orgstat
>>>>> however the author is creating his own DSL in YAML and I guess things
>>>>> were much better off if it all stayed in some Scheme :)
>>>>
>>>> Wow, another awesomewm user here; could you share your code?
>>>>
>>>> Best,
>>>>
>>>> -- 
>>>> Marcin Borkowski
>>>> http://mbork.pl
>>
>>
>> I don't have interesting code, just standard awesomevm setup. I run
>> periodic script to output data computed by orgstat and show it in the
>> taskbar (uses the shellout_widget).
>>
>> However what Ihor presented is interesting. Do you use similar approach
>> with shellout and 'emacs -batch' to show currently running task or you
>> 'push' data from emacs to show it in the taskbar?
>>
>> P.


So basically this is what this thread is about. One needs a working
Emacs instance operating in "push" mode to export any Org data. This
requires dealing with temporary files, as described above, and some
ad-hoc formats to hold whatever data I need to pull from Org.

"Pull" mode would be preferable. I could then, say, write a script in
Guile, execute 'emacs -batch' to export the Org data (I'm OK with that),
and then parse the S-expressions to get what I need.
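
The consuming side of such a "pull" setup might look like the sketch
below, written in Python for illustration rather than Guile. It is a
deliberately small S-expression reader, not a full Emacs Lisp reader:
it assumes the Emacs side has already printed plain lists, strings,
integers and symbols, with no #(...) propertized strings and no
circular references. Symbol and parse are illustrative names.

```python
import re
from typing import Iterator, List, Tuple, Union


class Symbol(str):
    """A bare Lisp symbol (headline, :title, nil), distinct from strings."""


SExp = Union[Symbol, str, int, List["SExp"]]

# One token per match: a parenthesis, a double-quoted string, or an atom.
_TOKEN = re.compile(r'\s*(?:([()])|"((?:[^"\\]|\\.)*)"|([^\s()"]+))', re.S)


def tokenize(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (kind, value) pairs; kind is "(", ")", "str" or "atom"."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at offset {pos}")
        pos = match.end()
        paren, string, atom = match.groups()
        if paren is not None:
            yield paren, paren
        elif string is not None:
            yield "str", string
        else:
            yield "atom", atom


def _read(tokens: List[Tuple[str, str]], i: int) -> Tuple[SExp, int]:
    """Read one expression starting at tokens[i]; return it and the next index."""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    kind, value = tokens[i]
    if kind == "(":
        items: List[SExp] = []
        i += 1
        while i < len(tokens) and tokens[i][0] != ")":
            item, i = _read(tokens, i)
            items.append(item)
        if i >= len(tokens):
            raise ValueError("missing closing parenthesis")
        return items, i + 1
    if kind == ")":
        raise ValueError("unexpected closing parenthesis")
    if kind == "str":
        # Drop the backslash from escaped characters such as \" and \\.
        return re.sub(r"\\(.)", r"\1", value, flags=re.S), i + 1
    try:
        return int(value), i + 1
    except ValueError:
        return Symbol(value), i + 1


def parse(text: str) -> SExp:
    """Parse exactly one S-expression from text."""
    tokens = list(tokenize(text))
    expr, end = _read(tokens, 0)
    if end != len(tokens):
        raise ValueError("trailing data after the first expression")
    return expr
```

Lists come back as Python lists, so a property list such as
(:title "Foo" :level 1) can be walked pairwise; isinstance(x, Symbol)
tells a symbol apart from a string with the same text.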

P.



* Re: official orgmode parser
  2020-09-16 12:09           ` Przemysław Kamiński
@ 2020-09-16 12:20             ` tomas
  2020-09-16 12:27             ` Ihor Radchenko
  1 sibling, 0 replies; 45+ messages in thread
From: tomas @ 2020-09-16 12:20 UTC (permalink / raw)
  To: emacs-orgmode

On Wed, Sep 16, 2020 at 02:09:42PM +0200, Przemysław Kamiński wrote:

[...]

> So I looked at (pp (org-element-parse-buffer)) however it does print
> out recursive stuff which other schemes have trouble parsing.
> 
> My code looks more or less like this:
> 
> (defun org-parse (f)
>   (with-temp-buffer
>     (find-file f)
>     (let* ((parsed (org-element-parse-buffer))
>            (all (append org-element-all-elements org-element-all-objects))
>            (mapped (org-element-map parsed all
>                      (lambda (item)
>                        (strip-parent item)))))
>       (pp mapped))))

Actually I'd tend to not modify the result, but to walk
it.

See `pcase' for a powerful pattern matcher which might
help you there.

Cheers
 - t


* Re: official orgmode parser
  2020-09-16 12:09           ` Przemysław Kamiński
  2020-09-16 12:20             ` tomas
@ 2020-09-16 12:27             ` Ihor Radchenko
  1 sibling, 0 replies; 45+ messages in thread
From: Ihor Radchenko @ 2020-09-16 12:27 UTC (permalink / raw)
  To: Przemysław Kamiński, emacs-orgmode

FYI: You may find https://github.com/ndwarshuis/org-ml helpful.



* Re: official orgmode parser
  2020-09-15  8:44 ` Gerry Agbobada
@ 2020-09-16 16:36   ` Matt Huszagh
  2020-09-23  8:09   ` Bastien
  1 sibling, 0 replies; 45+ messages in thread
From: Matt Huszagh @ 2020-09-16 16:36 UTC (permalink / raw)
  To: Gerry Agbobada, emacs-orgmode

"Gerry Agbobada" <emacs-orgmode@gagbo.net> writes:

> I'm currently toying with the idea of trying a tree-sitter parser for Org. The very static nature of a shared object parser (knowing TODO keywords are pretty dynamic for example) is a challenge I'm not sure to overcome ; to be honest even without that I can't say I'll manage to do it.

A tree-sitter parser for org would be great! Please keep this list
posted on any developments you make on this front. I made some minimal
attempts at this a while back, but didn't get very far.

Matt



* Re: official orgmode parser
  2020-09-16 12:15             ` Przemysław Kamiński
@ 2020-09-17  1:18               ` Ihor Radchenko
  2020-09-17 15:24                 ` Przemysław Kamiński
  0 siblings, 1 reply; 45+ messages in thread
From: Ihor Radchenko @ 2020-09-17  1:18 UTC (permalink / raw)
  To: Przemysław Kamiński, emacs-orgmode

> So basically this is what this thread is about. One needs a working 
> Emacs instance and work in "push" mode to export any Org data. This 
> requires dealing with temporary files, as described above, and some 
> ad-hoc formats to keep whatever data I need to pull from org.

> "Pull" mode would be preferred. I could then, say, write a script in 
> Guile, execute 'emacs -batch' to export org data (I'm ok with that), 
> then parse the S-expressions to get what I need.

My choice to use "push" mode is just for performance reasons. Nothing
prevents you from writing a function called from emacs --batch that
converts parsed org data into whatever format your Guile script prefers.
That function may be either on Emacs side or on Guile side. Probably,
Emacs has more capabilities when dealing with s-expressions though.
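
One way to drive such a batch export from an external script is
sketched below in Python. It assumes an emacs binary on PATH with Org
available; the environment variable ORG_PULL_FILE and the helper names
are made up for this example. The file name travels through the
environment so that it never has to be quoted inside the Elisp form,
and only headline titles are printed, which keeps the output free of
:parent back-references.

```python
import os
import subprocess
from typing import Dict, List, Tuple

# Elisp evaluated by Emacs: load the file named by $ORG_PULL_FILE into a
# temporary Org buffer, parse it, and print the headline titles as one
# S-expression on standard output.
ELISP = """
(progn
  (require 'org)
  (require 'org-element)
  (with-temp-buffer
    (insert-file-contents (getenv "ORG_PULL_FILE"))
    (org-mode)
    (prin1 (org-element-map (org-element-parse-buffer) 'headline
             (lambda (h) (org-element-property :raw-value h))))))
""".strip()


def build_command(org_file: str, emacs: str = "emacs") -> Tuple[List[str], Dict[str, str]]:
    """Return the argv and environment for one batch export."""
    env = dict(os.environ, ORG_PULL_FILE=os.path.abspath(org_file))
    return [emacs, "--batch", "--eval", ELISP], env


def pull_headlines(org_file: str) -> str:
    """Run Emacs in batch mode and return what it printed on stdout."""
    argv, env = build_command(org_file)
    done = subprocess.run(argv, env=env, capture_output=True, text=True, check=True)
    return done.stdout
```

Each call starts a fresh Emacs process, so this suits occasional
reports better than a widget that refreshes every few seconds.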

You can even push the information directly from Emacs to the API server.
You may find https://github.com/tkf/emacs-request useful for this task.

Finally, you may also consider clock tables to create clock summaries
using existing org-mode functionality. The tables can be named and
accessed using any programming language via babel.
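
For simple clock summaries outside Emacs, a full parser is not always
needed. The sketch below is a plain regex scan in Python, not Org's
clock-table machinery: it assumes CLOCK lines in Org's usual
"CLOCK: [start]--[end] => H:MM" form, skips clocks that are still
running, and books each interval under the month in which it started.
monthly_totals is an illustrative name.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Tuple

# A closed clock line: CLOCK: [YYYY-MM-DD Day HH:MM]--[YYYY-MM-DD Day HH:MM]
_CLOCK = re.compile(
    r"CLOCK:\s*\[(\d{4}-\d{2}-\d{2})[^\]\d]*(\d{1,2}:\d{2})\]"
    r"--\[(\d{4}-\d{2}-\d{2})[^\]\d]*(\d{1,2}:\d{2})\]"
)


def _stamp(day: str, clock: str) -> datetime:
    """Combine the date and time parts of an Org timestamp."""
    return datetime.strptime(f"{day} {clock}", "%Y-%m-%d %H:%M")


def monthly_totals(org_text: str) -> Dict[Tuple[int, int], timedelta]:
    """Sum closed CLOCK intervals per (year, month) of their start time."""
    totals: Dict[Tuple[int, int], timedelta] = defaultdict(timedelta)
    for day1, time1, day2, time2 in _CLOCK.findall(org_text):
        start, end = _stamp(day1, time1), _stamp(day2, time2)
        totals[(start.year, start.month)] += end - start
    return dict(totals)
```

The scan ignores headline structure, tags and TODO states entirely, so
anything beyond per-month totals would need a real parser or Org
itself.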

Best,
Ihor



* Re: official orgmode parser
  2020-09-17  1:18               ` Ihor Radchenko
@ 2020-09-17 15:24                 ` Przemysław Kamiński
  0 siblings, 0 replies; 45+ messages in thread
From: Przemysław Kamiński @ 2020-09-17 15:24 UTC (permalink / raw)
  To: emacs-orgmode

On 9/17/20 3:18 AM, Ihor Radchenko wrote:
>> So basically this is what this thread is about. One needs a working
>> Emacs instance and work in "push" mode to export any Org data. This
>> requires dealing with temporary files, as described above, and some
>> ad-hoc formats to keep whatever data I need to pull from org.
> 
>> "Pull" mode would be preferred. I could then, say, write a script in
>> Guile, execute 'emacs -batch' to export org data (I'm ok with that),
>> then parse the S-expressions to get what I need.
> 
> My choice to use "push" mode is just for performance reasons. Nothing
> prevents you from writing a function called from emacs --batch that
> converts parsed org data into whatever format your Guile script prefers.
> That function may be either on Emacs side or on Guile side. Probably,
> Emacs has more capabilities when dealing with s-expressions though.
> 
> You can even directly push the information from Emacs to API server.
> You may find https://github.com/tkf/emacs-request useful for this task.
> 
> Finally, you may also consider clock tables to create clock summaries
> using existing org-mode functionality. The tables can be named and
> accessed using any programming language via babel.
> 
> Best,
> Ihor
> 
> 
> Przemysław Kamiński <pk@intrepidus.pl> writes:
> 
>> On 9/16/20 2:02 PM, Ihor Radchenko wrote:
>>>> However what Ihor presented is interesting. Do you use similar approach
>>>> with shellout and 'emacs -batch' to show currently running task or you
>>>> 'push' data from emacs to show it in the taskbar?
>>>
>>> I prefer to avoid querying emacs too often for performance reasons.
>>> Instead, I only update the clocking info when I clock in/out in emacs.
>>> Then, the clocked in time is dynamically updated by independent bash
>>> script.
>>>
>>> The scheme is the following:
>>> 1. org clock in/out in Emacs trigger writing clocking info into
>>>      ~/.org-clock-in status file
>>> 2. bash script periodically monitors the file and calculates the clocked
>>>      in time according to the contents and time from last modification
>>> 3. the script updates simple textbox widget using awesome-client
>>> 4. the script also warns me (notify-send) when the weighted clocked in
>>>      time is negative (meaning that I should switch to some more
>>>      productive activity)
>>>
>>> Best,
>>> Ihor
>>>
>>> Przemysław Kamiński <pk@intrepidus.pl> writes:
>>>
>>>> On 9/16/20 9:56 AM, Ihor Radchenko wrote:
>>>>>> Wow, another awesomewm user here; could you share your code?
>>>>>
>>>>> Are you interested in something particular about awesome WM integration?
>>>>>
>>>>> I am using simple textbox widgets to show currently clocked in task and
>>>>> weighted summary of clocked time. See the attachments.
>>>>>
>>>>> Best,
>>>>> Ihor
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Marcin Borkowski <mbork@mbork.pl> writes:
>>>>>
>>>>>> On 2020-09-15, at 11:17, Przemysław Kamiński <pk@intrepidus.pl> wrote:
>>>>>>
>>>>>>> So, I keep clock times for work in org mode, this is very
>>>>>>> handy. However, my customers require that I use their service to
>>>>>>> provide the times. They do offer API. So basically I'm using elisp to
>>>>>>> parse org, make API calls, and at the same time generate CSV reports
>>>>>>> with a Python interop with org babel (because my elisp is just too bad
>>>>>>> to do that). If I had access to some org parser, I'd pick a language
>>>>>>> that would be more comfortable for me to get the job done. I guess it
>>>>>>> can all be done in elisp, however this is just a tool for me alone and
>>>>>>> I have limited time resources on hacking things for myself :)
>>>>>>
>>>>>> I was in the exact same situation - I use Org-mode clocking, and we use
>>>>>> Toggl at our company, so I wrote a simple tool to fire API requests to
>>>>>> Toggl on clock start/cancel/end: https://github.com/mbork/org-toggl
>>>>>> It's a bit more than 200 lines of Elisp, so you might try to look into
>>>>>> it and adapt it to whatever tool your employer is using.
>>>>>>
>>>>>>> Another one is generating total hours report for day/week/month to put
>>>>>>> into my awesomewm toolbar. I ended up using orgstat
>>>>>>> https://github.com/volhovM/orgstat
>>>>>>> however the author is creating his own DSL in YAML and I guess things
>>>>>>> were much better off if it all stayed in some Scheme :)
>>>>>>
>>>>>> Wow, another awesomewm user here; could you share your code?
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> -- 
>>>>>> Marcin Borkowski
>>>>>> http://mbork.pl
>>>>
>>>>
>>>> I don't have interesting code, just standard awesomevm setup. I run
>>>> periodic script to output data computed by orgstat and show it in the
>>>> taskbar (uses the shellout_widget).
>>>>
>>>> However what Ihor presented is interesting. Do you use similar approach
>>>> with shellout and 'emacs -batch' to show currently running task or you
>>>> 'push' data from emacs to show it in the taskbar?
>>>>
>>>> P.
>>
>>
>> So basically this is what this thread is about. One needs a working
>> Emacs instance and work in "push" mode to export any Org data. This
>> requires dealing with temporary files, as described above, and some
>> ad-hoc formats to keep whatever data I need to pull from org.
>>
>> "Pull" mode would be preferred. I could then, say, write a script in
>> Guile, execute 'emacs -batch' to export org data (I'm ok with that),
>> then parse the S-expressions to get what I need.
>>
>> P.
> 

OK, so this is what I have so far:
https://gitlab.com/cgenie/org-parse
I stole the simple test.org file from the ox-json test suite.
Guile seems to parse that output correctly. At least it's something to
start with :)
Any comments are welcome :)

Best,
Przemek



* Re: official orgmode parser
  2020-09-15  7:58 official orgmode parser Przemysław Kamiński
  2020-09-15  8:44 ` Gerry Agbobada
  2020-09-15  9:03 ` Tim Cross
@ 2020-09-23  8:09 ` Bastien
  2020-09-23 17:46   ` Przemysław Kamiński
                     ` (2 more replies)
  2 siblings, 3 replies; 45+ messages in thread
From: Bastien @ 2020-09-23  8:09 UTC (permalink / raw)
  To: Przemysław Kamiński; +Cc: emacs-orgmode

Hi Przemysław,

Przemysław Kamiński <pk@intrepidus.pl> writes:

> I oftentimes find myself needing to parse org files with some external
> tools (to generate reports for customers or sum up clock times for
> given month, etc). Looking through the list
>
> https://orgmode.org/worg/org-tools/

Can you help make the above page more useful to everyone?

Perhaps we can have a separate worg page just for parsers, reporting
the ones that seem to fully work.

I disagree that a parser is too difficult to maintain because Org is
a moving target.  Org's core syntax is not moving anymore, so a parser
can reasonably target it.  That's what is done with the Ruby parser,
in use in this small project called github.com :)

So I'd say:

- let's enhance Worg's documentation
- yes, please go for enhancing parsing tools

I don't think we need official tools.  The official Org parser already
exists: it is Org itself.

Thanks,

-- 
 Bastien



* Re: official orgmode parser
  2020-09-15  8:44 ` Gerry Agbobada
  2020-09-16 16:36   ` Matt Huszagh
@ 2020-09-23  8:09   ` Bastien
  1 sibling, 0 replies; 45+ messages in thread
From: Bastien @ 2020-09-23  8:09 UTC (permalink / raw)
  To: Gerry Agbobada; +Cc: emacs-orgmode

Hi Gerry,

"Gerry Agbobada" <emacs-orgmode@gagbo.net> writes:

> Having a tree-sitter parser would be really great in my opinion

1+

Thanks for working on this, let us know how it goes!

-- 
 Bastien



* Re: official orgmode parser
  2020-09-23  8:09 ` Bastien
@ 2020-09-23 17:46   ` Przemysław Kamiński
  2020-09-23 19:50     ` rey-coyrehourcq
  2020-10-24 21:12   ` Daniele Nicolodi
  2020-10-26 11:23   ` Ken Mankoff
  2 siblings, 1 reply; 45+ messages in thread
From: Przemysław Kamiński @ 2020-09-23 17:46 UTC (permalink / raw)
  To: emacs-orgmode

On 9/23/20 10:09 AM, Bastien wrote:
> Hi Przemysław,
> 
> Przemysław Kamiński <pk@intrepidus.pl> writes:
> 
>> I oftentimes find myself needing to parse org files with some external
>> tools (to generate reports for customers or sum up clock times for
>> given month, etc). Looking through the list
>>
>> https://orgmode.org/worg/org-tools/
> 
> Can you help on making the above page more useful to anyone?
> 
> Perhaps we can have a separate worg page just for parsers, reporting
> the ones that seem to fully work.
> 
> I disagree that a parser is too difficult to maintain because Org is
> a moving target.  Org core syntax is not moving anymore, a parser can
> reasonably target it.  That's what is done with the Ruby parser, in
> use in this small project called github.com :)
> 
> So I'd say:
> 
> - let's enhance Worg's documentation
> - yes, please go for enhancing parsing tools
> 
> I don't think we need official tools.  The official Org parser exists,
> it is Org itself.
> 
> Thanks,
> 

Hello Bastien,

Thank you for your remarks.

I updated the README, hopefully it's more usable now.

Przemek


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-09-23 17:46   ` Przemysław Kamiński
@ 2020-09-23 19:50     ` rey-coyrehourcq
  2020-11-11  8:58       ` Bastien
  0 siblings, 1 reply; 45+ messages in thread
From: rey-coyrehourcq @ 2020-09-23 19:50 UTC (permalink / raw)
  To: Przemysław Kamiński, emacs-orgmode

Hi Przemysław,

Here are some partial Org parsers (AST- or regex-based...) that I found on the web while surveying the current state of the art:

* org-js
https://github.com/mooz/org-js

* orgajs
Orga is a flexible org-mode syntax parser. It parses org content into AST (Abstract Syntax Tree)
https://github.com/orgapp/orgajs
* orgparse
* org-mode-parser
https://github.com/daitangio/org-mode-parser
* org-rs
https://github.com/org-rs/org-rs
* org-ruby
https://github.com/wallyqs/org-ruby
* org-swift
https://github.com/orgapp/swift-org
* organice
https://github.com/200ok-ch/organice
* organum
https://github.com/seylerius/organum
* clj org
https://github.com/eigenhombre/clj-org
* orgmode-parse
https://github.com/ixmatus/orgmode-parse
* org-mode
https://www.fosskers.ca/
https://hackage.haskell.org/package/org-mode
* orgize
https://github.com/PoiScript/orgize
https://www.worthe-it.co.za/blog.html

Best regards,

-- 


Sébastien Rey-Coyrehourcq
Research Engineer UMR IDEES
02.35.14.69.30

{Stronger security for your email, follow EFF tutorial : https://ssd.eff.org/}





^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-09-23  8:09 ` Bastien
  2020-09-23 17:46   ` Przemysław Kamiński
@ 2020-10-24 21:12   ` Daniele Nicolodi
  2020-10-24 21:35     ` Tom Gillespie
  2020-11-11  9:15     ` Bastien
  2020-10-26 11:23   ` Ken Mankoff
  2 siblings, 2 replies; 45+ messages in thread
From: Daniele Nicolodi @ 2020-10-24 21:12 UTC (permalink / raw)
  To: emacs-orgmode

On 23/09/2020 10:09, Bastien wrote:
> I disagree that a parser is too difficult to maintain because Org is 
> a moving target.  Org core syntax is not moving anymore, a parser can
> reasonably target it.  That's what is done with the Ruby parser, in
> use in this small project called github.com :)

(Just an aside: which Ruby org-mode parser does GitHub use? I sometimes
find instances where GitHub does not render an org-mode file correctly
and I would be happy to file bugs to have them corrected).

> So I'd say:
> 
> - let's enhance Worg's documentation
> - yes, please go for enhancing parsing tools
> 
> I don't think we need official tools.  The official Org parser exists,
> it is Org itself.

Would it make sense to have one "official" org-mode test file (or a set
of them) and the corresponding syntax tree as parsed by org-element (maybe
in a format easier to read from other programming languages than
s-expressions, JSON maybe?) to make testing other parsers against the
reference implementation easier?

Maybe the org-mode test suite already has something like this. I haven't
looked for it yet.
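
To make the idea concrete, here is a minimal Python sketch of how a
third-party parser could be checked against such a reference tree. The
JSON layout (keys "type", "level", "title", "children") is purely
hypothetical; no such export format or test corpus exists in Org today
as far as this thread establishes.

```python
import json

def first_difference(expected, actual, path="$"):
    """Return the path of the first mismatch between two JSON-like trees, or None."""
    if type(expected) is not type(actual):
        return path
    if isinstance(expected, dict):
        if expected.keys() != actual.keys():
            return path
        for key in sorted(expected):
            diff = first_difference(expected[key], actual[key], f"{path}.{key}")
            if diff:
                return diff
        return None
    if isinstance(expected, list):
        if len(expected) != len(actual):
            return path
        for i, (e, a) in enumerate(zip(expected, actual)):
            diff = first_difference(e, a, f"{path}[{i}]")
            if diff:
                return diff
        return None
    return None if expected == actual else path

# Hypothetical reference tree for the one-line Org file "* a".
reference = json.loads(
    '{"type": "headline", "level": 1, "title": "a", "children": []}'
)
```

A parser under test would produce its own tree in the same layout, and
first_difference(reference, produced) would point at the first node
where the two disagree.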

Cheers,
Dan


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-24 21:12   ` Daniele Nicolodi
@ 2020-10-24 21:35     ` Tom Gillespie
  2020-11-11  9:13       ` Bastien
  2020-11-11  9:15     ` Bastien
  1 sibling, 1 reply; 45+ messages in thread
From: Tom Gillespie @ 2020-10-24 21:35 UTC (permalink / raw)
  To: Daniele Nicolodi; +Cc: emacs-orgmode

> which Ruby org-mode parser does Github use?

I'm pretty sure that github uses https://github.com/wallyqs/org-ruby.
It is ... not compliant, shall we say. I have making some fixes to the
footnote parsing section on my todo list, but I don't expect to get to
it any time in the near future.

Tom


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-09-23  8:09 ` Bastien
  2020-09-23 17:46   ` Przemysław Kamiński
  2020-10-24 21:12   ` Daniele Nicolodi
@ 2020-10-26 11:23   ` Ken Mankoff
  2020-10-26 14:21     ` Nicolas Goaziou
  2 siblings, 1 reply; 45+ messages in thread
From: Ken Mankoff @ 2020-10-26 11:23 UTC (permalink / raw)
  To: Bastien; +Cc: Przemysław Kamiński, emacs-orgmode

Hello,

On 2020-09-23 at 01:09 -07, Bastien <bzg@gnu.org> wrote...
> I disagree that a parser is too difficult to maintain because Org is a
> moving target. Org core syntax is not moving anymore, a parser can
> reasonably target it. That's what is done with the Ruby parser, in use
> in this small project called github.com :)

Do you think it would be useful (or possible) to represent the current Org syntax in EBNF form so that people can use the EBNF to build parsers or graphically understand the form? I'm thinking of a nice page of railroad diagrams from this tool: https://github.com/GuntherRademacher/rr

I question if this is possible because EBNF is for context-free grammars, but I *think* Org syntax is context-free. Even if not, I think those railroad diagrams might be useful for parser-writers and can still describe 99 % of the syntax, even if a few extra sentences are needed to clarify some edge case.

  -k.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 11:23   ` Ken Mankoff
@ 2020-10-26 14:21     ` Nicolas Goaziou
  2020-10-26 16:17       ` Ken Mankoff
  0 siblings, 1 reply; 45+ messages in thread
From: Nicolas Goaziou @ 2020-10-26 14:21 UTC (permalink / raw)
  To: Ken Mankoff; +Cc: Bastien, Przemysław Kamiński, emacs-orgmode

Hello,

Ken Mankoff <mankoff@gmail.com> writes:

> I question if this is possible because EBNF is for context-free
> grammars, but I *think* Org syntax is context-free.

It's not, as explained in a footnote in the Org syntax document.

Regards,
-- 
Nicolas Goaziou


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 14:21     ` Nicolas Goaziou
@ 2020-10-26 16:17       ` Ken Mankoff
  2020-10-26 16:24         ` Nicolas Goaziou
  2020-11-11  9:00         ` Bastien
  0 siblings, 2 replies; 45+ messages in thread
From: Ken Mankoff @ 2020-10-26 16:17 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Bastien, Przemysław Kamiński, emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1889 bytes --]


On 2020-10-26 at 07:21 -07, Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote...
> Ken Mankoff <mankoff@gmail.com> writes:
>
>> I question if this is possible because EBNF is for context-free
>> grammars, but I *think* Org syntax is context-free.
>
> It's not, as explained in a footnote in the Org syntax document.

Yes, I meant to write that I think Org syntax is maybe *not* context-free, and therefore EBNF can't capture all of it. But it could still be very helpful and capture most of it.

But the more I think about it, the more I think Org may be context-free.

For the footnotes, I'm not sure that "(1) In particular, the parser requires stars at column 0 to be quoted by a comma when they do not define a headline" violates context. An "*" in the first column defines a header. It can be escaped by anything else too (" *" works too). If ",*" has a special meaning, that can be captured elsewhere in the syntax.

I'm also not sure (2) violates context-freeness, at least in the EBNF sense where a context can include a newline. See for example:

section ::= "*"+ string (tag+) newline (planning newline)? (property_drawer newline)?

planning ::= ("SCHEDULED:" "<" date_or_time ">")? ("DEADLINE:" "<" date_or_time ">")?

property_drawer ::= ":PROPERTIES:" newline drawer_contents newline ":END:"

drawer_contents ::= ":" property ":" whitespace string

Where the first line, "section" is represented graphically as the attached image.

I guess I'm not 100% clear what "context-free" means. EBNF can represent a language where a for loop has an opening and closing brace. The closing brace is context-dependent, just as the planning or property drawers are.

I recently used EBNF to represent a CSV file with header, and I was unable to capture the requirement that the header column must have the same number of fields or commas as the data section. I think that is context-free. 



[-- Attachment #2: tmp_20201026_090940.png --]
[-- Type: image/png, Size: 8197 bytes --]

[-- Attachment #3: Type: text/plain, Size: 7 bytes --]


  -k.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 16:17       ` Ken Mankoff
@ 2020-10-26 16:24         ` Nicolas Goaziou
  2020-10-26 16:47           ` Ken Mankoff
  2020-11-11  9:00         ` Bastien
  1 sibling, 1 reply; 45+ messages in thread
From: Nicolas Goaziou @ 2020-10-26 16:24 UTC (permalink / raw)
  To: Ken Mankoff; +Cc: Bastien, Przemysław Kamiński, emacs-orgmode

Ken Mankoff <mankoff@gmail.com> writes:

> Yes, I meant to write that I think Org syntax is maybe *not*
> context-free, and therefore EBNF can't capture all of it. But it could
> still be very helpful and capture most of it.

I'm not arguing about the usefulness of a partial EBNF description. I'm
merely pointing out that the syntax is not context-free. Here is an
example:

    # This is a comment (1)

    #+begin_example
    # This is not a comment (2)
    #+end_example

AFAICT, you cannot distinguish between lines (1) and (2) with EBNF.
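
A small Python sketch of the same point: to label the two lines
differently, a line-by-line classifier has to remember whether it is
inside a block. The regexps and label names below are illustrative
assumptions, not what org-element uses, and the sketch ignores details
such as unclosed blocks.

```python
import re

BEGIN = re.compile(r"^\s*#\+begin_(\S+)", re.IGNORECASE)
END = re.compile(r"^\s*#\+end_(\S+)", re.IGNORECASE)
COMMENT = re.compile(r"^\s*#(\s|$)")

def classify(lines):
    """Label each line, remembering whether we are inside a block."""
    open_block = None  # name of the block we are currently inside, if any
    labels = []
    for line in lines:
        if open_block is not None:
            m = END.match(line)
            if m and m.group(1).lower() == open_block:
                labels.append("block-end")
                open_block = None
            else:
                # Inside a block, a leading "#" does not start a comment.
                labels.append("block-content")
            continue
        m = BEGIN.match(line)
        if m:
            open_block = m.group(1).lower()
            labels.append("block-begin")
        elif COMMENT.match(line):
            labels.append("comment")
        else:
            labels.append("text")
    return labels
```

Fed the five lines of the example above, line (1) comes out as
"comment" and line (2) as "block-content", purely because of the
remembered state.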

Regards,


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 16:24         ` Nicolas Goaziou
@ 2020-10-26 16:47           ` Ken Mankoff
  2020-10-26 17:59             ` Tom Gillespie
  2020-11-11  8:59             ` Bastien
  0 siblings, 2 replies; 45+ messages in thread
From: Ken Mankoff @ 2020-10-26 16:47 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Bastien, Przemysław Kamiński, emacs-orgmode


On 2020-10-26 at 09:24 -07, Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote...
>     # This is a comment (1)
>
>     #+begin_example
>     # This is not a comment (2)
>     #+end_example
>
> AFAICT, you cannot distinguish between lines (1) and (2) with EBNF.

I agree. I think this is a better (correct?) example than the footnotes on Org Syntax page.

  -k.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 16:47           ` Ken Mankoff
@ 2020-10-26 17:59             ` Tom Gillespie
  2020-10-26 20:26               ` Ken Mankoff
  2020-11-11  8:59             ` Bastien
  1 sibling, 1 reply; 45+ messages in thread
From: Tom Gillespie @ 2020-10-26 17:59 UTC (permalink / raw)
  To: Ken Mankoff
  Cc: Bastien, Przemysław Kamiński, emacs-orgmode,
	Nicolas Goaziou

I started writing down Org's grammar as an EBNF (with Racket's #lang
brag) on Saturday. There is indeed a layer of Org grammar that can be
implemented via EBNF, but it is fairly minimal.

You can identify headlines, but you can't identify nesting level; the
arbitrary nesting depth means that you have to have a stack to keep
track. There is a similar issue with the indentation level in order to
correctly interpret plain lists. If the canonical representation of an
Org document were required to use org-adapt-indentation: nil;
org-edit-src-content-indentation: 0 and there were a canonical
normalization function, some of these issues would go away, but not all
of them, and I'm fairly certain that it is not possible to implement a
safe normalization function that won't mangle someone's formatting.

Another example of something that requires a stack is the greater
blocks, where you have #+begin_{name} and #+end_{name}, and the names
must match. If there were a closed set of names you could sort of do it
by hand, but since name can be any string that does not contain
whitespace, you have to have a stack to track which block you are in.

So, you can identify things that are heads, you can identify things
that are block start lines and block end lines, but you need stacks to
keep track of heading level, indentation, plain list level, and block
name. I might be missing a few other places where stacks are required,
but those are the big ones. Best,
Tom
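
As a rough illustration of the block-name case, here is a Python sketch
that pairs #+begin_{name} and #+end_{name} lines with an explicit stack.
It is a simplification under stated assumptions: every block is treated
as nestable (real Org treats the contents of example and src blocks
verbatim), and the function name pair_blocks is made up for this
example.

```python
import re

BLOCK_LINE = re.compile(r"^\s*#\+(begin|end)_(\S+)", re.IGNORECASE)

def pair_blocks(lines):
    """Return (name, start, end) line-index triples for matched blocks."""
    stack = []   # (name, start index) of blocks that are still open
    pairs = []
    for i, line in enumerate(lines):
        m = BLOCK_LINE.match(line)
        if not m:
            continue
        kind, name = m.group(1).lower(), m.group(2).lower()
        if kind == "begin":
            stack.append((name, i))
        elif stack and stack[-1][0] == name:
            _, start = stack.pop()
            pairs.append((name, start, i))
        # An #+end_ line that does not match the innermost open block
        # is ignored in this sketch.
    return pairs
```

Because the name is an open set, the stack entry has to carry the
actual string; there is no fixed list of rules to enumerate instead.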



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 17:59             ` Tom Gillespie
@ 2020-10-26 20:26               ` Ken Mankoff
  2020-10-26 21:00                 ` Tom Gillespie
  0 siblings, 1 reply; 45+ messages in thread
From: Ken Mankoff @ 2020-10-26 20:26 UTC (permalink / raw)
  To: Tom Gillespie
  Cc: Bastien, Przemysław Kamiński, emacs-orgmode,
	Nicolas Goaziou


On 2020-10-26 at 10:59 -07, Tom Gillespie <tgbugs@gmail.com> wrote...
> You can identify headlines, but you can't identify nesting level;

Do you need to? This is valid as an entire Org file, I think:

*** foo
* bar
***** baz

And that can be represented in EBNF. I'm not aware of places where behavior is indent-level specific, except inline tasks, and that edge case can be represented.

> There is a similar issue with the indentation level in
> order to correctly interpret plain lists.

list ::= ('+' string newline)+ sublist?
sublist ::= (indent list)+

I think this captures lists?

> Another example of something that requires a stack is the greater
> blocks, where you have #+begin_{name} and #+end_{name}, and the names
> must match.

Definitely not able to be represented in EBNF, unless as you say {name} is a limited vocabulary.

  -k.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 20:26               ` Ken Mankoff
@ 2020-10-26 21:00                 ` Tom Gillespie
  2020-10-26 21:37                   ` Ken Mankoff
  2020-10-27  5:42                   ` Przemysław Kamiński
  0 siblings, 2 replies; 45+ messages in thread
From: Tom Gillespie @ 2020-10-26 21:00 UTC (permalink / raw)
  To: Ken Mankoff
  Cc: Bastien, Przemysław Kamiński, emacs-orgmode,
	Nicolas Goaziou

Here is an attempt to clarify my own confusion around the nested
structures in Org. In short: each node in the headline tree and the
plain list tree can be parsed using the EBNF, but the nesting level
cannot, which means that certain useful operations, such as folding,
require additional rules beyond the grammar. More inline. Best!
Tom

> Do you need to? This is valid as an entire Org file, I think:
>
> *** foo
> * bar
> ***** baz
>
> And that can be represented in EBNF. I'm not aware of places where behavior is indent-level specific, except inline tasks, and that edge case can be represented.

You are correct, and as long as the heading depth doesn't change some
interpretation then this is a non-issue. The reason I mentioned this,
though, is because it means that you cannot determine how to correctly
fold an Org file from the grammar alone.

To make sure I understand: it is possible to determine the number of
leading stars (and thus the level), but I think that it is not
possible to identify the end of a section.
For example:

* a
*** b
** c
* d

You can parse out a 1, b 3, c 2, d 1, but if you want to be able to
nest b and c inside a but not nest d inside a, then you need a stack
in there somewhere. You can't have a rule such as

section : headline content
content : text | section

because the parse would incorrectly nest sections at the same level,
you would have to write

section-level-1 : headline-1 content-1
content-1 : text | section-level-2-n

but since we have an arbitrary number of levels the grammar would have
to be infinite. This is only an issue if you want your grammar to be
able to encode that the content of sections can include other more
deeply nested sections, which in this context we almost certainly do
not (as you point out).
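
For what it's worth, the "stack in there somewhere" is small once it
lives outside the grammar. A minimal Python sketch, assuming headlines
are lines of the form "stars, space, title" and using a made-up dict
layout for the nodes:

```python
import re

HEADLINE = re.compile(r"^(\*+) (.*)$")

def build_tree(lines):
    """Nest headlines by star count using a stack of open ancestors."""
    root = {"level": 0, "title": None, "children": []}
    stack = [root]
    for line in lines:
        m = HEADLINE.match(line)
        if not m:
            continue
        node = {"level": len(m.group(1)), "title": m.group(2), "children": []}
        # Close every open headline that is at the same level or deeper.
        while stack[-1]["level"] >= node["level"]:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root
```

On the four-headline example above this puts b and c under a and keeps
d as a sibling of a, which is the folding structure the grammar alone
cannot express.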

> > There is a similar issue with the indentation level in
> > order to correctly interpret plain lists.
>
> list ::= ('+' string newline)+ sublist?
> sublist ::= (indent list)+
>
> I think this captures lists?

Ah yes, I see my mistake here. In order for this to work the parser
has to implement significant whitespace, so whitespace cannot be
parsed into a single token. I think everything works out after that.

> Definitely not able to be represented in EBNF, unless as you say {name} is a limited vocabulary.

Darn those pesky open sets!


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 21:00                 ` Tom Gillespie
@ 2020-10-26 21:37                   ` Ken Mankoff
  2020-10-26 22:19                     ` Tom Gillespie
  2020-10-27  5:42                   ` Przemysław Kamiński
  1 sibling, 1 reply; 45+ messages in thread
From: Ken Mankoff @ 2020-10-26 21:37 UTC (permalink / raw)
  To: Tom Gillespie
  Cc: Bastien, Przemysław Kamiński, emacs-orgmode,
	Nicolas Goaziou


On 2020-10-26 at 14:00 -07, Tom Gillespie <tgbugs@gmail.com> wrote...
>> list ::= ('+' string newline)+ sublist?
>> sublist ::= (indent list)+
>>
>> I think this captures lists?
>
> Ah yes, I see my mistake here. In order for this to work the parser
> has to implement significant whitespace, so whitespace cannot be
> parsed into a single token. I think everything works out after that.

If we agree that the syntax above captures lists and sublists, then I think we could apply the same methods to the issue of headlines and sub-headlines?

  -k.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 21:37                   ` Ken Mankoff
@ 2020-10-26 22:19                     ` Tom Gillespie
  0 siblings, 0 replies; 45+ messages in thread
From: Tom Gillespie @ 2020-10-26 22:19 UTC (permalink / raw)
  To: Ken Mankoff
  Cc: Bastien, Przemysław Kamiński, emacs-orgmode,
	Nicolas Goaziou

Even if this did work for plain lists it won't work for headlines
because headlines have an arbitrary number of stars and thus it is not
possible for the grammar to know what is a sub-headline vs "the next
headline". For a similar reason I'm fairly sure that the sublist
approach will not work due to issues with relative indent. Here is the
quote from the current draft syntax.

> An item ends before the next item, the first line less or equally indented
> than its starting line, or two consecutive empty lines. Indentation of lines
> within other greater elements do not count, neither do inlinetasks boundaries.

The "the first line less or equally indented than its starting line"
section is what prevents your approach from working because you have
to know the relative indentation in order to figure out which list
contains a nested list. As written your grammar will parse a nested
list into a flat list. This is because there are an arbitrary number
of distinct tokens that could be =indent= in your grammar, and the EBNF
can't specify an ordering for them, so you can't say that one
indent is greater than another.

For list termination the rule seems to be two new lines followed by
not a list element. As a result of this, my inclination is to only
parse plain list elements and reconstruct the whole "list" only as an
internal semantic.

Check the behavior of
                                                 1. to
                                                        1. see
                                          1. what
                                              1. I
                                   1. mean
                           1.
                                                    1.
                  1.
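
A sketch of the relative-indent rule in Python, to show where the
comparison between indents comes in. It only looks at item lines and
applies the "less or equally indented" part of the rule; blank-line
termination, "*" bullets and continuation lines are left out, and the
function name nest_items is an invention for this example.

```python
import re

ITEM = re.compile(r"^( *)(?:[-+]|\d+[.)])(?: +(.*))?$")

def nest_items(lines):
    """Return (depth, text) pairs, with depth derived from relative indent."""
    stack = []   # indentation widths of the currently open items
    result = []
    for line in lines:
        m = ITEM.match(line)
        if not m:
            continue
        indent = len(m.group(1))
        # An item ends at the first line less or equally indented.
        while stack and stack[-1] >= indent:
            stack.pop()
        result.append((len(stack), m.group(2) or ""))
        stack.append(indent)
    return result
```

The while loop is the part an EBNF cannot say: it compares the width of
the current indent with the widths of all the indents still open.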


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 21:00                 ` Tom Gillespie
  2020-10-26 21:37                   ` Ken Mankoff
@ 2020-10-27  5:42                   ` Przemysław Kamiński
  1 sibling, 0 replies; 45+ messages in thread
From: Przemysław Kamiński @ 2020-10-27  5:42 UTC (permalink / raw)
  To: emacs-orgmode

I'm no expert in parsing but I would expect org's parser to be quite 
similar to the multitude of markdown or CommonMark [1] parsers. There 
isn't that much difference in syntax, except maybe org is more versatile 
and has more syntax elements, like drawers.

Searching for "EBNF Markdown" I stumbled upon [2].

[1] https://commonmark.org/
[2] http://roopc.net/posts/2014/markdown-cfg/




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-09-23 19:50     ` rey-coyrehourcq
@ 2020-11-11  8:58       ` Bastien
  0 siblings, 0 replies; 45+ messages in thread
From: Bastien @ 2020-11-11  8:58 UTC (permalink / raw)
  To: rey-coyrehourcq; +Cc: Przemysław Kamiński, emacs-orgmode

Hi Sébastien,

rey-coyrehourcq <sebastien.rey-coyrehourcq@univ-rouen.fr> writes:

> Here are some partial Org parsers (AST- or regex-based...) that I found
> on the web while surveying the current state of the art:

Thanks -- I've updated https://orgmode.org/worg/org-tools/ with this
information. 

Best,

-- 
 Bastien


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 16:47           ` Ken Mankoff
  2020-10-26 17:59             ` Tom Gillespie
@ 2020-11-11  8:59             ` Bastien
  1 sibling, 0 replies; 45+ messages in thread
From: Bastien @ 2020-11-11  8:59 UTC (permalink / raw)
  To: Ken Mankoff; +Cc: Przemysław Kamiński, emacs-orgmode, Nicolas Goaziou

Hi Ken,

Ken Mankoff <mankoff@gmail.com> writes:

> On 2020-10-26 at 09:24 -07, Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote...
>>     # This is a comment (1)
>>
>>     #+begin_example
>>     # This is not a comment (2)
>>     #+end_example
>>
>> AFAICT, you cannot distinguish between lines (1) and (2) with EBNF.
>
> I agree. I think this is a better (correct?) example than the
> footnotes on Org Syntax page.

Can you suggest a patch?

-- 
 Bastien


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-26 16:17       ` Ken Mankoff
  2020-10-26 16:24         ` Nicolas Goaziou
@ 2020-11-11  9:00         ` Bastien
  1 sibling, 0 replies; 45+ messages in thread
From: Bastien @ 2020-11-11  9:00 UTC (permalink / raw)
  To: Ken Mankoff; +Cc: Przemysław Kamiński, emacs-orgmode, Nicolas Goaziou

Hi Ken,

Ken Mankoff <mankoff@gmail.com> writes:

> Yes, I meant to write that I think Org syntax is maybe *not*
> context-free, and therefore EBNF can't capture all of it. But it could
> still be very helpful and capture most of it.

Perhaps.  Are you willing to give it a try and report here?

-- 
 Bastien


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-24 21:35     ` Tom Gillespie
@ 2020-11-11  9:13       ` Bastien
  2020-11-12 17:14         ` Tom Gillespie
  0 siblings, 1 reply; 45+ messages in thread
From: Bastien @ 2020-11-11  9:13 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: emacs-orgmode, Daniele Nicolodi

Hi Tom,

Tom Gillespie <tgbugs@gmail.com> writes:

>> which Ruby org-mode parser does Github use?
>
> I'm pretty sure that github uses https://github.com/wallyqs/org-ruby.
> It is ... not compliant, shall we say. I have making some fixes to the
> footnote parsing section on my todo list, but I don't expect to get to
> it any time in the near future.

Can you contact GitHub and see what they use?

Whatever they use, I suggest we ask them to support the org library
they use to let their users display Org files.

Maybe the same should be done with gitlab.com, since they also parse
Org files somehow.

-- 
 Bastien


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-10-24 21:12   ` Daniele Nicolodi
  2020-10-24 21:35     ` Tom Gillespie
@ 2020-11-11  9:15     ` Bastien
  2020-11-11 13:05       ` Daniele Nicolodi
  2020-11-28 19:19       ` Gerry Agbobada
  1 sibling, 2 replies; 45+ messages in thread
From: Bastien @ 2020-11-11  9:15 UTC (permalink / raw)
  To: Daniele Nicolodi; +Cc: emacs-orgmode

Hi Daniele,

Daniele Nicolodi <daniele@grinta.net> writes:

> Would it make sense to have one "official" org-mode test file (or a set
> of them) and the corresponding syntax tree as parsed by org-element (maybe
> in a format easier to read from other programming languages than
> s-expressions, JSON maybe?) to make testing other parsers against the
> reference implementation easier?

I think it is a very good idea.

The example file would also be good to help users track small
syntactic changes, when they happen.

Would you like to work on such a file?

-- 
 Bastien


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-11-11  9:15     ` Bastien
@ 2020-11-11 13:05       ` Daniele Nicolodi
  2020-11-28 19:19       ` Gerry Agbobada
  1 sibling, 0 replies; 45+ messages in thread
From: Daniele Nicolodi @ 2020-11-11 13:05 UTC (permalink / raw)
  To: emacs-orgmode

On 11/11/2020 10:15, Bastien wrote:
> Hi Daniele,
> 
> Daniele Nicolodi <daniele@grinta.net> writes:
> 
>> Would it make sense to have one "official" org-mode test file (or a set
>> of them) and the corresponding syntax tree as parsed by org-element (maybe
>> in a format easier to read from other programming languages than
>> s-expressions, JSON maybe?) to make testing other parsers against the
>> reference implementation easier?
> 
> I think it is a very good idea.
> 
> The example file would also be good to help users track small
> syntactic changes, when they happen.
> 
> Would you like to work on such a file?

I don't have enough motivation to see this climb high enough in my TODO
list to see any meaningful progress in a reasonable time frame.  I am
more than happy to contribute to Org, but it is more effective to keep
these contributions related to my daily use of Org.

Cheers,
Dan


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-11-11  9:13       ` Bastien
@ 2020-11-12 17:14         ` Tom Gillespie
  0 siblings, 0 replies; 45+ messages in thread
From: Tom Gillespie @ 2020-11-12 17:14 UTC (permalink / raw)
  To: Bastien; +Cc: emacs-orgmode, waldemar.quevedo, Daniele Nicolodi

Hi Bastien,
     I agree it would be great to ask them to contribute to whichever
ruby library they are using. I will see if I can get in touch, but I
have no idea of where to start if we really want to get to the folks
who could make a decision. It looks like gitlab uses the same org-ruby
library as well
https://gitlab.com/gitlab-org/gitlab-foss/-/blob/master/Gemfile#L156.
They may be easier to reach out to. I have also cced Wally to see if
he has any insights here. Best!
Tom


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: official orgmode parser
  2020-11-11  9:15     ` Bastien
  2020-11-11 13:05       ` Daniele Nicolodi
@ 2020-11-28 19:19       ` Gerry Agbobada
  1 sibling, 0 replies; 45+ messages in thread
From: Gerry Agbobada @ 2020-11-28 19:19 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 570 bytes --]

Hello,

On Wed, Nov 11, 2020, at 10:15, Bastien wrote:
> 
> The example file would also be good to help users track small
> syntactic changes, when they happen.
> 
> 

When I mistakenly thought I could use an EBNF parser to parse Org-mode, I wrote a few small examples to get going (I never got past headings, as I'm not really good with parsing things):
https://github.com/gagbo/LuaOrgParser/tree/master/tests/test-files/headings

Maybe it could be used as a base. I wasn't really sure of how to handle test cases and creating good ones.

Best regards,


Gerry Agbobada

[-- Attachment #2: Type: text/html, Size: 1172 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2020-11-28 19:23 UTC | newest]

Thread overview: 45+ messages
2020-09-15  7:58 official orgmode parser Przemysław Kamiński
2020-09-15  8:44 ` Gerry Agbobada
2020-09-16 16:36   ` Matt Huszagh
2020-09-23  8:09   ` Bastien
2020-09-15  9:03 ` Tim Cross
2020-09-15  9:17   ` Przemysław Kamiński
2020-09-15  9:55     ` Russell Adams
2020-09-15 11:15       ` Przemysław Kamiński
2020-09-15 12:37         ` tomas
2020-09-15 18:09           ` Diego Zamboni
2020-09-16 12:09           ` Przemysław Kamiński
2020-09-16 12:20             ` tomas
2020-09-16 12:27             ` Ihor Radchenko
2020-09-16  0:16     ` Tim Cross
2020-09-16  7:24     ` Marcin Borkowski
2020-09-16  7:56       ` Ihor Radchenko
2020-09-16 11:36         ` Przemysław Kamiński
2020-09-16 12:02           ` Ihor Radchenko
2020-09-16 12:15             ` Przemysław Kamiński
2020-09-17  1:18               ` Ihor Radchenko
2020-09-17 15:24                 ` Przemysław Kamiński
2020-09-23  8:09 ` Bastien
2020-09-23 17:46   ` Przemysław Kamiński
2020-09-23 19:50     ` rey-coyrehourcq
2020-11-11  8:58       ` Bastien
2020-10-24 21:12   ` Daniele Nicolodi
2020-10-24 21:35     ` Tom Gillespie
2020-11-11  9:13       ` Bastien
2020-11-12 17:14         ` Tom Gillespie
2020-11-11  9:15     ` Bastien
2020-11-11 13:05       ` Daniele Nicolodi
2020-11-28 19:19       ` Gerry Agbobada
2020-10-26 11:23   ` Ken Mankoff
2020-10-26 14:21     ` Nicolas Goaziou
2020-10-26 16:17       ` Ken Mankoff
2020-10-26 16:24         ` Nicolas Goaziou
2020-10-26 16:47           ` Ken Mankoff
2020-10-26 17:59             ` Tom Gillespie
2020-10-26 20:26               ` Ken Mankoff
2020-10-26 21:00                 ` Tom Gillespie
2020-10-26 21:37                   ` Ken Mankoff
2020-10-26 22:19                     ` Tom Gillespie
2020-10-27  5:42                   ` Przemysław Kamiński
2020-11-11  8:59             ` Bastien
2020-11-11  9:00         ` Bastien

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
