Thank you for the Haskell fix!

I found another issue (not a bug but could be handled better):

Running =org-lint= on an Org file containing

#+NAME:Hello
#+BEGIN_SRC emacs-lisp :exports code
#+END_SRC

I get the following error:
#+begin_quote
Debugger entered--Lisp error: (search-failed "^[ \11]*#\\+[A-Za-z]+: +Hello *$")
#+end_quote

The code is faulty because there should be a space between #+NAME: and
Hello, like so:

#+NAME: Hello
#+BEGIN_SRC emacs-lisp :exports code
#+END_SRC

However, this should probably be reported by =org-lint= as an Org syntax
error, and not lead to an error when executing =org-lint=.

What do you think?

Thank you,
Dominik
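The suggestion above (report the missing space as a warning rather than fail) can be sketched outside Emacs. The Python below is only an illustration under stated assumptions: the pattern `MISSING_SPACE` and the function `lint_keyword_spacing` are hypothetical names, and this is not how =org-lint= is implemented.

```python
import re

# Hypothetical pattern: "#+KEYWORD:" immediately followed by a
# non-whitespace character, i.e. no space after the colon.
MISSING_SPACE = re.compile(r"^[ \t]*#\+([A-Za-z_]+):(?=\S)")


def lint_keyword_spacing(text):
    """Collect (line_number, message) warnings instead of raising."""
    warnings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = MISSING_SPACE.match(line)
        if match:
            warnings.append((number, f"no space after #+{match.group(1)}:"))
    return warnings


faulty = "#+NAME:Hello\n#+BEGIN_SRC emacs-lisp :exports code\n#+END_SRC\n"
print(lint_keyword_spacing(faulty))
```

The check only flags the line and moves on, so a malformed keyword does not stop the rest of the file from being linted.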
Dominik Schrempf <dominik.schrempf@gmail.com> writes:
> Running =org-lint= on an Org file containing
>
> #+NAME:Hello
> #+BEGIN_SRC emacs-lisp :exports code
> #+END_SRC
>
> I get the following error:
> #+begin_quote
> Debugger entered--Lisp error: (search-failed "^[ \11]*#\\+[A-Za-z]+: +Hello *$")
> #+end_quote
Confirmed.
This one is tricky. The linter (org-lint-duplicate-name) expects that
NAME keyword must have space before value. However, the actual Org
parser (org-element--collect-affiliated-keywords) does not care about
space. My intuition says that the parser behaviour is
unintentional. However, not requiring a whitespace may also be a valid
syntax.
Dear Orgers,
Should we allow syntax like #+KEYWORD:value to be correct or do we
require a whitespace/space after colon all the time?
Best,
Ihor
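The mismatch described above can be reproduced with plain regular expressions. This is a minimal sketch in Python: `STRICT` is the pattern from the search-failed error, transliterated from Emacs Lisp regexp syntax, while `LENIENT` is a hypothetical variant written for illustration and is not the regexp that org-element actually uses.

```python
import re

# Pattern from the search-failed error, transliterated from Emacs Lisp
# ("\11" is a tab, "\\+" is a literal plus sign). It requires at least
# one space after the colon.
STRICT = re.compile(r"^[ \t]*#\+[A-Za-z]+: +Hello *$", re.MULTILINE)

# Hypothetical lenient variant (an assumption, not org-element's regexp):
# zero or more spaces after the colon.
LENIENT = re.compile(r"^[ \t]*#\+[A-Za-z]+: *Hello *$", re.MULTILINE)


def finds_name(pattern, text):
    """Return True if the pattern locates the keyword line in text."""
    return pattern.search(text) is not None


no_space = "#+NAME:Hello\n#+BEGIN_SRC emacs-lisp :exports code\n#+END_SRC\n"
with_space = "#+NAME: Hello\n#+BEGIN_SRC emacs-lisp :exports code\n#+END_SRC\n"

for label, text in (("no space", no_space), ("with space", with_space)):
    print(label, finds_name(STRICT, text), finds_name(LENIENT, text))
```

A parser that accepts the name and a search that insists on the space disagree only on the first input, which is the case that triggered the error.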
> Should we allow syntax like #+KEYWORD:value to be correct or do we
> require a whitespace/space after colon all the time?

The spec as written is ambiguous/silent on this issue. In my work on
laundry tokenizer and grammar I have found keyword syntax to be a
thorny issue, and I strongly suggest that for the time being we either
make no ruling on this or we state that the colon that ends the
keyword should be followed by a space as a precautionary measure.
The safe thing to do is to always require whitespace after the colon
because it guarantees correct interpretation.

Requiring whitespace after the colon simplifies the grammar, however
it means that you can't compact keyword lines, and it induces an
annoying failure mode where lines with a missing space are no longer
keywords.

However, it is technically possible to make keywords work without
the whitespace, so long as there is at least one whitespace prior to
the next colon (but not contained in square brackets, e.g.
#+key:lol[ a b c ]:value is a well formed keyword under a
slightly generalized grammar). The problem is that we would like to
make keyword syntax fully closed, and I need a bit more time to get
that worked out before any definitive conclusions are drawn.

The complexity of the generalized keyword syntax can be seen here
https://github.com/tgbugs/laundry/blob/5a396bef98d9a3cd9ee929f21cd47612dd6cb1ac/laundry/lex-abbrev.rkt#L107-L249

Best,
Tom
Ihor Radchenko <yantar92@gmail.com> writes:
> This one is tricky. The linter (org-lint-duplicate-name) expects that
> NAME keyword must have space before value. However, the actual Org
> parser (org-element--collect-affiliated-keywords) does not care about
> space. My intuition says that the parser behaviour is
> unintentional. However, not requiring a whitespace may also be a valid
> syntax.
For the time being, let's prefer what org-element does over the linter.
I have pushed the fix to bugfix as bd0493eda.
Best,
Ihor
Tom Gillespie <tgbugs@gmail.com> writes:
>> Should we allow syntax like #+KEYWORD:value to be correct or do we
>> require a whitespace/space after colon all the time?
>
> The spec as written is ambiguous/silent on this issue. In my work on
> laundry tokenizer and grammar I have found keyword syntax to be a
> thorny issue, and I strongly suggest that for the time being we either
> make no ruling on this or we state that the colon that ends the
> keyword should be followed by a space as a precautionary measure.
> The safe thing to do is to always require whitespace after the colon
> because it guarantees correct interpretation.

By the way, wouldn't it be better to use tree-sitter rather than
something else for the format grammar? At least, there is some work on
integrating tree-sitter into Emacs core [1,2].

[1] https://lists.gnu.org/archive/html/emacs-devel/2021-08/msg00268.html
[2] https://archive.casouri.cat/note/2021/emacs-tree-sitter/#Feedback

Best,
Ihor
> By the way, wouldn't it be better to use tree-sitter rather than
> something else for the format grammar?

Not really since we are going to need more than one implementation
using a parser generator to avoid baking implementation specific
details into the spec by accident. This is true for more than just
the grammar as well. The complexity of tokenization, parsing,
expanding, etc, for Org means that we are going to need multiple
implementations to nail the behavior for any formal spec.

That said, we definitely want a TS implementation at some point.
See https://github.com/tgbugs/laundry/issues/1 for a recent discussion
about ways forward. The implementation I'm working on should translate
to TS without too much work since both brag and tree sitter describe
LR variants. There may be some subtle differences, but nothing
fundamental.

The issue for me is that I don't have the bandwidth to get started
with a full tree sitter implementation, especially because it is going
to need a custom scanner, and because you're effectively on your
own when it comes to reconstructing the output of the AST into the
actual internal representation of an Org file. I also have no idea how
to deal with nested parsers in tree sitter. I have some ideas about
how it might be done, but nothing concrete (see the linked issue
for more on that).

Best,
Tom
Hi Tom,

> The issue for me is that I don’t have the bandwidth to get started
> with a full tree sitter implementation, especially because it is going
> to need a custom scanner, and because you’re effectively on your
> own when it comes to reconstructing the output of the AST into the
> actual internal representation of an Org file. I also have no idea how
> to deal with nested parsers in tree sitter. I have some ideas about
> how it might be done, but nothing concrete (see the linked issue
> for more on that).

orgmode.nvim is developing a tree-sitter parser, perhaps a dialog with
them could be productive?

All the best,
Timothy
Thanks for the pointer! The actual point of contact seems to be
https://github.com/milisims/tree-sitter-org. Good to find
another group that is working on this.

Best,
Tom