emacs-orgmode@gnu.org archives
* Discrepancy between documentation and implementation regarding comments
@ 2019-10-27 10:07 Thibault Polge
  2019-10-27 16:13 ` Robert Pluim
  2019-10-28 11:05 ` Nicolas Goaziou
  0 siblings, 2 replies; 10+ messages in thread
From: Thibault Polge @ 2019-10-27 10:07 UTC (permalink / raw)
  To: emacs-orgmode

Hello,

According to Org-Mode documentation[1],

> Lines starting with zero or more whitespace characters followed by one
> ‘#’ and a whitespace are treated as comments and, as such, are not
> exported.

The actual implementation differs in one subtle detail: Org-Mode also
treats a line where the pound sign is immediately followed by \n as a
comment.  I believe this is the intended behavior, since it allows
commenting out multiple paragraphs and keeps working even after
`delete-trailing-whitespace` has been run.
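
As a rough illustration of the gap described above, here is a minimal
Python sketch.  The two patterns are assumptions: DOCUMENTED is a strict
reading of the manual's sentence (space or tab required after the pound
sign) and OBSERVED adds the end-of-line case reported here.  Neither is
taken from Org's or Pandoc's source.

```python
import re

# Strict reading of the manual: '#' must be followed by a space or a tab.
DOCUMENTED = re.compile(r"^[ \t]*#[ \t]")
# Behaviour reported in this message: '#' directly at end of line also counts.
OBSERVED = re.compile(r"^[ \t]*#(?:[ \t]|$)")


def classify(line):
    """Return (documented, observed) verdicts for one line without its newline."""
    return bool(DOCUMENTED.match(line)), bool(OBSERVED.match(line))


results = {line: classify(line)
           for line in ["# comment", "  # indented", "#", "#Alpha"]}
for line, verdict in results.items():
    print(repr(line), verdict)
```

Only the bare "#" line is classified differently by the two readings.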

I'm asking because Pandoc strictly follows the Org documentation,
and treats a line containing only a pound sign as text.  I opened a bug
about this[2], where I've been asked –reasonably– to first make sure the
bug isn't actually in Org Mode.

[1] https://orgmode.org/manual/Comment-lines.html

[2] https://github.com/jgm/pandoc/issues/5856

All the best,

--
Thibault

* Re: Discrepancy between documentation and implementation regarding comments
  2019-10-27 10:07 Discrepancy between documentation and implementation regarding comments Thibault Polge
@ 2019-10-27 16:13 ` Robert Pluim
  2019-10-27 20:09   ` Adam Porter
  2019-10-28 11:05 ` Nicolas Goaziou
  1 sibling, 1 reply; 10+ messages in thread
From: Robert Pluim @ 2019-10-27 16:13 UTC (permalink / raw)
  To: Thibault Polge; +Cc: emacs-orgmode

>>>>> On Sun, 27 Oct 2019 11:07:20 +0100, Thibault Polge <thibault@thb.lt> said:

    Thibault> Hello,
    Thibault> According to Org-Mode documentation[1],

    >> Lines starting with zero or more whitespace characters followed by one
    >> ‘#’ and a whitespace are treated as comments and, as such, are not
    >> exported.

'whitespace' in Emacs normally covers newline as well. Of course Org
might mean 'at least one space or tab', but as you say, thatʼs not
what the implementation does. E.g. in Org 9.2.6, org-fill-element does

    (re-search-backward "^[ \t]*#[ \t]*$" begin t)

However org-at-comment-p does

    (looking-at "^[ \t]*# ")

so thereʼs some possible inconsistency there.
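
To make the inconsistency concrete, the two regexps quoted above can be
transcribed into Python and run on a few sample lines.  This is only a
sketch: the transcription assumes re.MULTILINE behaves like Emacs's
line-based ^ and $, and the names FILL_ELEMENT and AT_COMMENT_P are
labels chosen here, not Org identifiers.

```python
import re

# Transcriptions of the two Emacs regexps quoted in this message.
FILL_ELEMENT = re.compile(r"^[ \t]*#[ \t]*$", re.MULTILINE)  # org-fill-element
AT_COMMENT_P = re.compile(r"^[ \t]*# ", re.MULTILINE)        # org-at-comment-p

SAMPLES = ["# comment", "#", "#   ", "#\tx"]


def verdicts(line):
    """Return (fill_element_matches, at_comment_p_matches) for one line."""
    return bool(FILL_ELEMENT.match(line)), bool(AT_COMMENT_P.match(line))


table = {line: verdicts(line) for line in SAMPLES}
for line, verdict in table.items():
    print(repr(line), verdict)
```

The first pattern only recognises comment lines with no text, while the
second rejects a bare "#" and, as quoted, a tab after the "#".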

FWIW, Iʼd vote for expressing it as 'zero or more whitespace followed
by one # followed by zero or more whitespace'

Robert

* Re: Discrepancy between documentation and implementation regarding comments
  2019-10-27 16:13 ` Robert Pluim
@ 2019-10-27 20:09   ` Adam Porter
  2019-10-27 21:02     ` Samuel Wales
  0 siblings, 1 reply; 10+ messages in thread
From: Adam Porter @ 2019-10-27 20:09 UTC (permalink / raw)
  To: emacs-orgmode

I agree with Robert that "whitespace" includes newlines in "Emacsland."
For example, with this document (the second "#" has a newline
immediately after, no spaces or tabs):

#+BEGIN_SRC org
foo

# comment

bar

#

buzz
#+END_SRC

This code matches both lines that begin with "#":

  (re-search-forward (rx bol "#" (1+ space)))

But this code only matches the first one, because "blank" only matches
"horizontal whitespace":

  (re-search-forward (rx bol "#" (1+ blank)))

So I think Pandoc is technically at fault here.  However, outside of
Emacs's own context, I can see how the documentation could be
misinterpreted in this case, so it's hard to fault them too much.  :)
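
For readers outside Emacs, a similar contrast can be sketched in Python.
This is an approximation under stated assumptions: \s stands in for rx
`space` (which in Emacs depends on the buffer's syntax table) and [ \t]
stands in for rx `blank` restricted to ASCII.

```python
import re

DOC = "foo\n\n# comment\n\nbar\n\n#\n\nbuzz\n"

# \s includes newlines, roughly like rx `space`.
ANY_WS = re.compile(r"^#\s+", re.MULTILINE)
# [ \t] is horizontal whitespace only, roughly like rx `blank`.
HORIZONTAL = re.compile(r"^#[ \t]+", re.MULTILINE)


def matching_lines(pattern, text):
    """Return the 1-based line numbers where the pattern matches."""
    return [text.count("\n", 0, m.start()) + 1 for m in pattern.finditer(text)]


any_ws_lines = matching_lines(ANY_WS, DOC)
horizontal_lines = matching_lines(HORIZONTAL, DOC)
print(any_ws_lines, horizontal_lines)
```

The first pattern finds both "#" lines, the second only "# comment".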

* Re: Discrepancy between documentation and implementation regarding comments
  2019-10-27 20:09   ` Adam Porter
@ 2019-10-27 21:02     ` Samuel Wales
  0 siblings, 0 replies; 10+ messages in thread
From: Samuel Wales @ 2019-10-27 21:02 UTC (permalink / raw)
  To: Adam Porter; +Cc: emacs-orgmode

Beware of a # at end of buffer (EOB) with no trailing newline.
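
A small Python sketch of that edge case, with the end of the string
standing in for end of buffer.  Both patterns are illustrative
assumptions, not Org's or Pandoc's actual code.

```python
import re

# Requires an actual whitespace character (newline included) after '#'.
NEEDS_WS_CHAR = re.compile(r"^[ \t]*#\s", re.MULTILINE)
# Accepts a space/tab or the end of the line; $ also matches at the very
# end of the string, so no trailing newline is needed.
WS_OR_EOL = re.compile(r"^[ \t]*#(?:[ \t]|$)", re.MULTILINE)

with_newline = "text\n#\n"
at_eob = "text\n#"  # the final line has no trailing newline


def has_comment(pattern, text):
    """Return True if any line of text matches the pattern."""
    return pattern.search(text) is not None


checks = {
    "needs_ws_char/with_newline": has_comment(NEEDS_WS_CHAR, with_newline),
    "needs_ws_char/at_eob": has_comment(NEEDS_WS_CHAR, at_eob),
    "ws_or_eol/with_newline": has_comment(WS_OR_EOL, with_newline),
    "ws_or_eol/at_eob": has_comment(WS_OR_EOL, at_eob),
}
print(checks)
```

A rule phrased as "followed by a whitespace character" misses the final
"#" when nothing follows it; "whitespace or end of line" does not.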

On 10/27/19, Adam Porter <adam@alphapapa.net> wrote:
> I agree with Robert that "whitespace" includes newlines in "Emacsland."
> For example, with this document (the second "#" has a newline
> immediately after, no spaces or tabs):
>
> #+BEGIN_SRC org
> foo
>
> # comment
>
> bar
>
> #
>
> buzz
> #+END_SRC
>
> This code matches both lines that begin with "#":
>
>   (re-search-forward (rx bol "#" (1+ space)))
>
> But this code only matches the first one, because "blank" only matches
> "horizontal whitespace":
>
>   (re-search-forward (rx bol "#" (1+ blank)))
>
> So I think Pandoc is technically at fault here.  However, outside of
> Emacs's own context, I can see how the documentation could be
> misinterpreted in this case, so it's hard to fault them too much.  :)
>
>
>


-- 
The Kafka Pandemic

What is misopathy?
https://thekafkapandemic.blogspot.com/2013/10/why-some-diseases-are-wronged.html

The disease DOES progress. MANY people have died from it. And ANYBODY
can get it at any time.

* Re: Discrepancy between documentation and implementation regarding comments
  2019-10-27 10:07 Discrepancy between documentation and implementation regarding comments Thibault Polge
  2019-10-27 16:13 ` Robert Pluim
@ 2019-10-28 11:05 ` Nicolas Goaziou
  2019-10-28 11:42   ` Thibault Polge
  1 sibling, 1 reply; 10+ messages in thread
From: Nicolas Goaziou @ 2019-10-28 11:05 UTC (permalink / raw)
  To: Thibault Polge; +Cc: emacs-orgmode

Hello,

Thibault Polge <thibault@thb.lt> writes:

> According to Org-Mode documentation[1],

See <https://orgmode.org/worg/dev/org-syntax.html#Comments> (with a nice
typo...)

Regards,

-- 
Nicolas Goaziou

* Re: Discrepancy between documentation and implementation regarding comments
  2019-10-28 11:05 ` Nicolas Goaziou
@ 2019-10-28 11:42   ` Thibault Polge
  2019-10-28 16:16     ` Nicolas Goaziou
  0 siblings, 1 reply; 10+ messages in thread
From: Thibault Polge @ 2019-10-28 11:42 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

Nicolas Goaziou writes:
> See <https://orgmode.org/worg/dev/org-syntax.html#Comments> (with a nice
> typo...)

Thanks Nicolas, just a small detail though: unless this is a planned
(breaking) change, I believe the description you linked should read:

A “comment line” starts with *zero or more whitespace characters,
followed by* a hash sign, followed by a whitespace character or an end
of line.

Another detail: it could be nice to have a small appendix somewhere
mapping character names to codepoints, since Unicode has no less than
three “number signs” (from Wikipedia):

 - U+0023 # NUMBER SIGN (HTML &#35;). Other attested names in Unicode are: pound sign, hash, crosshatch, octothorpe.
 - U+FF03 ＃ FULLWIDTH NUMBER SIGN (HTML &#65283;)
 - U+FE5F ﹟ SMALL NUMBER SIGN (HTML &#65119;)
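
The three code points can be told apart programmatically.  The sketch
below uses Python's unicodedata module and assumes, as the regexps
quoted earlier in this thread suggest, that only U+0023 starts a
comment.  The COMMENT pattern is illustrative, not Org's own.

```python
import re
import unicodedata

CANDIDATES = ["\u0023", "\uff03", "\ufe5f"]

# Map each code point to its official Unicode character name.
names = {f"U+{ord(c):04X}": unicodedata.name(c) for c in CANDIDATES}

# Illustrative comment pattern using the ASCII number sign only.
COMMENT = re.compile(r"^[ \t]*#(?:[ \t]|$)")
matches = [bool(COMMENT.match(c + " note")) for c in CANDIDATES]

print(names)
print(matches)
```

Only the line starting with U+0023 matches; the fullwidth and small
variants are treated as ordinary text by this pattern.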

Regards,
Thibault

* Re: Discrepancy between documentation and implementation regarding comments
  2019-10-28 11:42   ` Thibault Polge
@ 2019-10-28 16:16     ` Nicolas Goaziou
  2019-10-29 13:52       ` Robert Pluim
  0 siblings, 1 reply; 10+ messages in thread
From: Nicolas Goaziou @ 2019-10-28 16:16 UTC (permalink / raw)
  To: Thibault Polge; +Cc: emacs-orgmode

Hello,

Thibault Polge <thibault@thb.lt> writes:

> Thanks Nicolas, just a small detail though: unless this is a planned
> (breaking) change, I believe the description you linked should read:
>
> A “comment line” starts with *zero or more whitespace characters,
> followed by* a hash sign, followed by a whitespace character or an end
> of line.

True. I fixed that.

> Another detail: it could be nice to have a small appendix somewhere
> mapping character names to codepoints, since Unicode has no less than
> three “number signs” (from Wikipedia):
>
>  - U+0023 # NUMBER SIGN (HTML &#35;). Other attested names in Unicode are: pound sign, hash, crosshatch, octothorpe.
>  - U+FF03 ＃ FULLWIDTH NUMBER SIGN (HTML &#65283;)
>  - U+FE5F ﹟ SMALL NUMBER SIGN (HTML &#65119;)

This is left as an exercise to the reader. ;)

Regards,

-- 
Nicolas Goaziou

* Re: Discrepancy between documentation and implementation regarding comments
  2019-10-28 16:16     ` Nicolas Goaziou
@ 2019-10-29 13:52       ` Robert Pluim
  2019-10-29 14:14         ` Thibault Polge
  0 siblings, 1 reply; 10+ messages in thread
From: Robert Pluim @ 2019-10-29 13:52 UTC (permalink / raw)
  To: Thibault Polge; +Cc: emacs-orgmode

>>>>> On Mon, 28 Oct 2019 17:16:55 +0100, Nicolas Goaziou <mail@nicolasgoaziou.fr> said:

    Nicolas> Hello,
    Nicolas> Thibault Polge <thibault@thb.lt> writes:

    >> Thanks Nicolas, just a small detail though: unless this is a planned
    >> (breaking) change, I believe the description you linked should read:
    >> 
    >> A “comment line” starts with *zero or more whitespace characters,
    >> followed by* a hash sign, followed by a whitespace character or an end
    >> of line.

    Nicolas> True. I fixed that.

end of line *is* a whitespace character, but Iʼm not going to argue
that. Iʼm going to argue that this doesnʼt cover the case of a '#' at
EOB without a newline, hence saying 'zero or more' would be better.

(and if it really is *one* whitespace character, thatʼs a breaking
change from at least org-9.2.6, which allows zero-or-more).

Robert

* Re: Discrepancy between documentation and implementation regarding comments
  2019-10-29 13:52       ` Robert Pluim
@ 2019-10-29 14:14         ` Thibault Polge
  2019-10-29 14:34           ` Robert Pluim
  0 siblings, 1 reply; 10+ messages in thread
From: Thibault Polge @ 2019-10-29 14:14 UTC (permalink / raw)
  To: Robert Pluim; +Cc: emacs-orgmode

Robert Pluim writes:

> end of line *is* a whitespace character, but Iʼm not going to argue
> that. Iʼm going to argue that this doesnʼt cover the case of a '#' at
> EOB without a newline, hence saying 'zero or more' would be better.

But zero-or-more would mean that this line:

#Alpha

Is a comment, along with:

#+TITLE: My Org document

And virtually all Org meta-lines. I've thought about the \n#<EOB>
issue, but I haven't tested how the current implementation behaves in
this regard.  I think the recent changes in Pandoc would parse it as a
comment.
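
The objection can be checked with a quick Python sketch.  Both patterns
are assumptions made for illustration: NAIVE is the literal "zero or
more whitespace after the hash" reading with nothing anchoring the end,
and WS_OR_EOL is the "whitespace character or end of line" wording.

```python
import re

# Literal "zero or more whitespace after '#'", with no end anchor.
NAIVE = re.compile(r"^[ \t]*#[ \t]*")
# "A whitespace character or an end of line" after '#'.
WS_OR_EOL = re.compile(r"^[ \t]*#(?:[ \t]|$)")

LINES = ["#Alpha", "#+TITLE: My Org document", "# real comment", "#"]

naive = [bool(NAIVE.match(line)) for line in LINES]
strict = [bool(WS_OR_EOL.match(line)) for line in LINES]

for line, n, s in zip(LINES, naive, strict):
    print(repr(line), n, s)
```

Under the naive reading every line is a comment, keyword lines
included; the second pattern accepts only the last two.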

Regards,
Thibault

* Re: Discrepancy between documentation and implementation regarding comments
  2019-10-29 14:14         ` Thibault Polge
@ 2019-10-29 14:34           ` Robert Pluim
  0 siblings, 0 replies; 10+ messages in thread
From: Robert Pluim @ 2019-10-29 14:34 UTC (permalink / raw)
  To: Thibault Polge; +Cc: emacs-orgmode

>>>>> On Tue, 29 Oct 2019 15:14:37 +0100, Thibault Polge <thibault@thb.lt> said:

    Thibault> Robert Pluim writes:
    >> end of line *is* a whitespace character, but Iʼm not going to argue
    >> that. Iʼm going to argue that this doesnʼt cover the case of a '#' at
    >> EOB without a newline, hence saying 'zero or more' would be better.

    Thibault> But zero-or-more would mean that this line:

    Thibault> #Alpha

Thatʼs the problem with human language, itʼs imprecise. I meant

^[ \t]*#[ \t]*$

Robert

end of thread, other threads:[~2019-10-29 14:34 UTC | newest]

Thread overview: 10+ messages
2019-10-27 10:07 Discrepancy between documentation and implementation regarding comments Thibault Polge
2019-10-27 16:13 ` Robert Pluim
2019-10-27 20:09   ` Adam Porter
2019-10-27 21:02     ` Samuel Wales
2019-10-28 11:05 ` Nicolas Goaziou
2019-10-28 11:42   ` Thibault Polge
2019-10-28 16:16     ` Nicolas Goaziou
2019-10-29 13:52       ` Robert Pluim
2019-10-29 14:14         ` Thibault Polge
2019-10-29 14:34           ` Robert Pluim

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git