emacs-orgmode@gnu.org archives
* Org lint and named source blocks
@ 2021-09-21  7:39 Dominik Schrempf
  2021-09-21 13:18 ` Ihor Radchenko
  0 siblings, 1 reply; 8+ messages in thread
From: Dominik Schrempf @ 2021-09-21  7:39 UTC (permalink / raw)
  To: Emacs Org Mode Mailing List

Thank you for the Haskell fix! I found another issue (not a bug but could be
handled better):

Running =org-lint= on an Org file containing

#+NAME:Hello
#+BEGIN_SRC emacs-lisp :exports code
#+END_SRC

I get the following error:
#+begin_quote
Debugger entered--Lisp error: (search-failed "^[ \11]*#\\+[A-Za-z]+: +Hello *$")
#+end_quote

The code is faulty because there should be a space between #+NAME: and Hello,
like so:

#+NAME: Hello
#+BEGIN_SRC emacs-lisp :exports code
#+END_SRC

However, this should probably be reported by =org-lint= as an Org syntax error,
and not lead to an error when executing =org-lint=.
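
As a rough illustration of the requested behaviour, here is a minimal Python
sketch (not org-lint itself, which is written in Emacs Lisp) of a check that
reports the missing space as a finding instead of signalling an error. The
pattern and the name `check_keyword_spacing` are hypothetical and exist only
for this example.

```python
import re

# Hypothetical pattern: a "#+KEY:" immediately followed by a non-space character.
KEYWORD_NO_SPACE = re.compile(r"^[ \t]*#\+([A-Za-z_]+):(\S.*)$")


def check_keyword_spacing(text):
    """Return (line_number, message) pairs for keywords lacking a space after the colon."""
    reports = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = KEYWORD_NO_SPACE.match(line)
        if m:
            # Report the problem as a finding; do not raise.
            reports.append((lineno, f"Missing space after #+{m.group(1)}:"))
    return reports


faulty = "#+NAME:Hello\n#+BEGIN_SRC emacs-lisp :exports code\n#+END_SRC\n"
for lineno, message in check_keyword_spacing(faulty):
    print(f"{lineno}: {message}")
```

On the faulty example this yields one report for line 1; on the corrected
version (with the space) it yields none.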

What do you think?

Thank you,
Dominik



* Re: Org lint and named source blocks
  2021-09-21  7:39 Org lint and named source blocks Dominik Schrempf
@ 2021-09-21 13:18 ` Ihor Radchenko
  2021-09-21 20:32   ` Tom Gillespie
  2021-10-04  6:16   ` Ihor Radchenko
  0 siblings, 2 replies; 8+ messages in thread
From: Ihor Radchenko @ 2021-09-21 13:18 UTC (permalink / raw)
  To: Dominik Schrempf; +Cc: Emacs Org Mode Mailing List

Dominik Schrempf <dominik.schrempf@gmail.com> writes:

> Running =org-lint= on an Org file containing
>
> #+NAME:Hello
> #+BEGIN_SRC emacs-lisp :exports code
> #+END_SRC
>
> I get the following error:
> #+begin_quote
> Debugger entered--Lisp error: (search-failed "^[ \11]*#\\+[A-Za-z]+: +Hello *$")
> #+end_quote

Confirmed.

This one is tricky. The linter (org-lint-duplicate-name) expects the
NAME keyword to have a space before its value. However, the actual Org
parser (org-element--collect-affiliated-keywords) does not care about
the space. My intuition says that the parser behaviour is
unintentional. However, omitting the whitespace may also be valid
syntax.
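
The mismatch can be sketched in Python. This is an illustration only: the real
functions are Emacs Lisp, and both patterns below are assumptions modelled on
the regexp in the error message, not copies of the Org sources. A lenient
pattern collects the name, and a strict search built from that name then fails
to find the line it came from.

```python
import re

# Lenient pattern standing in for the parser: whitespace after the colon is optional.
PARSER_NAME = re.compile(
    r"^[ \t]*#\+NAME:[ \t]*(\S+)[ \t]*$", re.MULTILINE | re.IGNORECASE
)


def strict_search_pattern(name):
    """Python rendering of the regexp from the error message; note the ': +'."""
    return re.compile(
        r"^[ \t]*#\+[A-Za-z]+: +" + re.escape(name) + r" *$", re.MULTILINE
    )


def locate_names(text):
    """Map each collected name to the offset found by the strict search, or None."""
    result = {}
    for m in PARSER_NAME.finditer(text):
        name = m.group(1)
        hit = strict_search_pattern(name).search(text)
        # None is the case where the Emacs search would signal search-failed.
        result[name] = hit.start() if hit else None
    return result


doc = (
    "#+NAME:Hello\n#+BEGIN_SRC emacs-lisp :exports code\n#+END_SRC\n\n"
    "#+NAME: World\n#+BEGIN_SRC emacs-lisp\n#+END_SRC\n"
)
print(locate_names(doc))
```

Here "Hello" is collected but cannot be located afterwards, while "World"
(written with the space) is found by both patterns.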

Dear Orgers,

Should we accept syntax like #+KEYWORD:value as correct, or should we
always require whitespace after the colon?

Best,
Ihor



* Re: Org lint and named source blocks
  2021-09-21 13:18 ` Ihor Radchenko
@ 2021-09-21 20:32   ` Tom Gillespie
  2021-10-04  6:19     ` Ihor Radchenko
  2021-10-04  6:16   ` Ihor Radchenko
  1 sibling, 1 reply; 8+ messages in thread
From: Tom Gillespie @ 2021-09-21 20:32 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Dominik Schrempf, Emacs Org Mode Mailing List

> Should we accept syntax like #+KEYWORD:value as correct, or should we
> always require whitespace after the colon?

The spec as written is ambiguous/silent on this issue. In my work on
the laundry tokenizer and grammar I have found keyword syntax to be a
thorny issue, and I strongly suggest that for the time being we either
make no ruling on this or we state that the colon that ends the
keyword should be followed by a space as a precautionary measure.
The safe thing to do is to always require whitespace after the colon
because it guarantees correct interpretation.

Requiring whitespace after the colon simplifies the grammar; however,
it means that you can't compact keyword lines, and it induces an
annoying failure mode where lines with a missing space are no longer
parsed as keywords.
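
A small Python sketch of that trade-off follows. Both patterns and the
function name `parse_keyword` are hypothetical, and the key is simplified to
letters and underscores, which is narrower than real Org keywords. Under the
strict rule a line without the space is simply not a keyword; under the
lenient rule it is.

```python
import re

# Strict: the colon must be followed by whitespace and a value, or by end of line.
STRICT = re.compile(r"^[ \t]*#\+([A-Za-z_]+):(?:[ \t]+(.*))?$")
# Lenient: whitespace after the colon is optional.
LENIENT = re.compile(r"^[ \t]*#\+([A-Za-z_]+):[ \t]*(.*)$")


def parse_keyword(line, require_space=True):
    """Return (KEY, value) if the line is a keyword under the chosen rule, else None."""
    pattern = STRICT if require_space else LENIENT
    m = pattern.match(line)
    if m is None:
        return None
    return (m.group(1).upper(), (m.group(2) or "").strip())


for line in ("#+NAME: Hello", "#+NAME:Hello"):
    print(line, parse_keyword(line, True), parse_keyword(line, False))
```

With `require_space=True` the second line falls through and would be treated
as ordinary text, which is the failure mode described above.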

However, it is technically possible to make keywords work without the
whitespace, so long as there is at least one whitespace character prior
to the next colon (but not contained in square brackets). For example,
#+key:lol[ a b c ]:value is a well-formed keyword under a slightly
generalized grammar. The problem is that we would like to make keyword
syntax fully closed, and I need a bit more time to get that worked out
before any definitive conclusions are drawn.

The complexity of the generalized keyword syntax can be seen here
https://github.com/tgbugs/laundry/blob/5a396bef98d9a3cd9ee929f21cd47612dd6cb1ac/laundry/lex-abbrev.rkt#L107-L249

Best,
Tom



* Re: Org lint and named source blocks
  2021-09-21 13:18 ` Ihor Radchenko
  2021-09-21 20:32   ` Tom Gillespie
@ 2021-10-04  6:16   ` Ihor Radchenko
  1 sibling, 0 replies; 8+ messages in thread
From: Ihor Radchenko @ 2021-10-04  6:16 UTC (permalink / raw)
  To: Dominik Schrempf; +Cc: Emacs Org Mode Mailing List

Ihor Radchenko <yantar92@gmail.com> writes:

> This one is tricky. The linter (org-lint-duplicate-name) expects the
> NAME keyword to have a space before its value. However, the actual Org
> parser (org-element--collect-affiliated-keywords) does not care about
> the space. My intuition says that the parser behaviour is
> unintentional. However, omitting the whitespace may also be valid
> syntax.

For the time being, let's prefer what org-element does over the linter.
I have pushed the fix to bugfix as bd0493eda.

Best,
Ihor



* Re: Org lint and named source blocks
  2021-09-21 20:32   ` Tom Gillespie
@ 2021-10-04  6:19     ` Ihor Radchenko
  2021-10-04  8:02       ` Tom Gillespie
  0 siblings, 1 reply; 8+ messages in thread
From: Ihor Radchenko @ 2021-10-04  6:19 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: Dominik Schrempf, Emacs Org Mode Mailing List

Tom Gillespie <tgbugs@gmail.com> writes:

>> Should we accept syntax like #+KEYWORD:value as correct, or should we
>> always require whitespace after the colon?
>
> The spec as written is ambiguous/silent on this issue. In my work on
> the laundry tokenizer and grammar I have found keyword syntax to be a
> thorny issue, and I strongly suggest that for the time being we either
> make no ruling on this or we state that the colon that ends the
> keyword should be followed by a space as a precautionary measure.
> The safe thing to do is to always require whitespace after the colon
> because it guarantees correct interpretation.

By the way, wouldn't it be better to use tree-sitter rather than
something else for the format grammar? At least, there is some work on
integrating tree-sitter into Emacs core [1,2].

[1] https://lists.gnu.org/archive/html/emacs-devel/2021-08/msg00268.html
[2] https://archive.casouri.cat/note/2021/emacs-tree-sitter/#Feedback

Best,
Ihor



* Re: Org lint and named source blocks
  2021-10-04  6:19     ` Ihor Radchenko
@ 2021-10-04  8:02       ` Tom Gillespie
  2021-10-04  8:21         ` Timothy
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Gillespie @ 2021-10-04  8:02 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Dominik Schrempf, Emacs Org Mode Mailing List

> By the way, wouldn't it be better to use tree-sitter rather than
> something else for the format grammar?

Not really, since we are going to need more than one implementation
using a parser generator to avoid baking implementation-specific
details into the spec by accident. This is true for more than just
the grammar as well. The complexity of tokenization, parsing,
expanding, etc, for Org means that we are going to need multiple
implementations to nail the behavior for any formal spec.

That said, we definitely want a TS implementation at some point.
See https://github.com/tgbugs/laundry/issues/1 for a recent
discussion about ways forward.

The implementation I'm working on should translate to TS without
too much work since both brag and tree sitter describe LR variants.
There may be some subtle differences, but nothing fundamental.

The issue for me is that I don't have the bandwidth to get started
with a full tree sitter implementation, especially because it is going
to need a custom scanner, and because you're effectively on your
own when it comes to reconstructing the output of the AST into the
actual internal representation of an Org file. I also have no idea how
to deal with nested parsers in tree sitter. I have some ideas about
how it might be done, but nothing concrete (see the linked issue
for more on that).

Best,
Tom



* Re: Org lint and named source blocks
  2021-10-04  8:02       ` Tom Gillespie
@ 2021-10-04  8:21         ` Timothy
  2021-10-04  9:14           ` Tom Gillespie
  0 siblings, 1 reply; 8+ messages in thread
From: Timothy @ 2021-10-04  8:21 UTC (permalink / raw)
  To: emacs-orgmode


Hi Tom,

> The issue for me is that I don’t have the bandwidth to get started
> with a full tree sitter implementation, especially because it is going
> to need a custom scanner, and because you’re effectively on your
> own when it comes to reconstructing the output of the AST into the
> actual internal representation of an Org file. I also have no idea how
> to deal with nested parsers in tree sitter. I have some ideas about
> how it might be done, but nothing concrete (see the linked issue
> for more on that).

orgmode.nvim is developing a tree-sitter parser; perhaps a dialog with them
could be productive?

All the best,
Timothy


* Re: Org lint and named source blocks
  2021-10-04  8:21         ` Timothy
@ 2021-10-04  9:14           ` Tom Gillespie
  0 siblings, 0 replies; 8+ messages in thread
From: Tom Gillespie @ 2021-10-04  9:14 UTC (permalink / raw)
  To: Timothy; +Cc: emacs-orgmode

Thanks for the pointer! The actual point of contact seems to be
https://github.com/milisims/tree-sitter-org. Good to find another
group that is working on this.

Best,
Tom


