Hello,

According to Org-Mode documentation[1],

> Lines starting with zero or more whitespace characters followed by one
> ‘#’ and a whitespace are treated as comments and, as such, are not
> exported.

The actual implementation differs on a subtle detail: Org-Mode will
treat a line where the pound sign is immediately followed by \n as a
comment. I believe this is expected behavior, since it allows
commenting out multiple paragraphs, and behaves as expected even when
using `delete-trailing-whitespace`.

I'm asking this because Pandoc strictly follows the Org documentation,
and treats a line containing only a pound sign as text. I opened a bug
about this[2], where I've been asked, reasonably, to first make sure
the bug isn't actually in Org Mode.

[1] https://orgmode.org/manual/Comment-lines.html
[2] https://github.com/jgm/pandoc/issues/5856

All the best,

--
Thibault
>>>>> On Sun, 27 Oct 2019 11:07:20 +0100, Thibault Polge <thibault@thb.lt> said:
Thibault> Hello,
Thibault> According to Org-Mode documentation[1],
>> Lines starting with zero or more whitespace characters followed by one
>> ‘#’ and a whitespace are treated as comments and, as such, are not
>> exported.
'whitespace' in emacs normally covers newline as well. Of course org
might mean 'at least one space or tab', but as you say, thatʼs not
what the implementation does. eg in org 9.2.6, org-fill-element does
(re-search-backward "^[ \t]*#[ \t]*$" begin t)
However org-at-comment-p does
(looking-at "^[ \t]*# ")
so thereʼs some possible inconsistency there.
FWIW, Iʼd vote for expressing it as 'zero or more whitespace followed
by one # followed by zero or more whitespace'
Robert
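To make the possible inconsistency concrete, here is a minimal sketch that translates the two regexps quoted above into Python's `re` syntax. This is an assumption-laden illustration, not Org's code: it uses Python rather than Emacs Lisp, `re.MULTILINE` stands in for Emacs's line-anchored `^` and `$`, and the names `FILL_RE`, `AT_COMMENT_RE`, `classify` and the sample lines are made up for the example.

```python
import re

# Python translations of the two Emacs regexps quoted above (assumed
# equivalent for these ASCII-only samples).
FILL_RE = re.compile(r"^[ \t]*#[ \t]*$", re.MULTILINE)  # from org-fill-element
AT_COMMENT_RE = re.compile(r"^[ \t]*# ")                # from org-at-comment-p

# Hypothetical sample lines, given without their trailing newline.
SAMPLES = ["# comment", "#", "  # ", "#Alpha", "#+TITLE: x"]


def classify(line):
    """Return (matches FILL_RE, matches AT_COMMENT_RE) for one line."""
    return (bool(FILL_RE.match(line)), bool(AT_COMMENT_RE.match(line)))


for line in SAMPLES:
    print(repr(line), classify(line))
```

In this sketch a bare "#" satisfies only the first pattern and "# comment" only the second, which is the mismatch described above; neither pattern accepts "#Alpha" or "#+TITLE: x".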
I agree with Robert that "whitespace" includes newlines in "Emacsland."
For example, with this document (the second "#" has a newline
immediately after, no spaces or tabs):

#+BEGIN_SRC org
foo

# comment

bar

#

buzz
#+END_SRC

This code matches both lines that begin with "#":

  (re-search-forward (rx bol "#" (1+ space)))

But this code only matches the first one, because "blank" only matches
"horizontal whitespace":

  (re-search-forward (rx bol "#" (1+ blank)))

So I think Pandoc is technically at fault here. However, outside of
Emacs's own context, I can see how the documentation could be
misinterpreted in this case, so it's hard to fault them too much. :)
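For readers without Emacs at hand, the same distinction can be sketched in Python. The stand-ins are assumptions rather than exact equivalents: `\s` plays the role of rx `space` (whitespace including newline) and `[ \t]` the role of rx `blank` (horizontal whitespace only). The helper `match_lines` is hypothetical.

```python
import re

# The sample document from the message above; the second "#" is followed
# directly by a newline.
DOC = "foo\n\n# comment\n\nbar\n\n#\n\nbuzz\n"

# Rough Python stand-ins (assumptions, not exact equivalents):
#   rx `space` -> \s      (whitespace, newline included)
#   rx `blank` -> [ \t]   (horizontal whitespace only)
SPACE_RE = re.compile(r"^#\s+", re.MULTILINE)
BLANK_RE = re.compile(r"^#[ \t]+", re.MULTILINE)


def match_lines(pattern, text):
    """Return the 1-based line numbers on which `pattern` starts a match."""
    return [text.count("\n", 0, m.start()) + 1 for m in pattern.finditer(text)]


print(match_lines(SPACE_RE, DOC))
print(match_lines(BLANK_RE, DOC))
```

With these stand-ins the first pattern finds both "#" lines (lines 3 and 7 of the sample) and the second finds only line 3.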
beware # at eob with no newline.

On 10/27/19, Adam Porter <adam@alphapapa.net> wrote:
> I agree with Robert that "whitespace" includes newlines in "Emacsland."
> For example, with this document (the second "#" has a newline
> immediately after, no spaces or tabs):
>
> #+BEGIN_SRC org
> foo
>
> # comment
>
> bar
>
> #
>
> buzz
> #+END_SRC
>
> This code matches both lines that begin with "#":
>
> (re-search-forward (rx bol "#" (1+ space)))
>
> But this code only matches the first one, because "blank" only matches
> "horizontal whitespace":
>
> (re-search-forward (rx bol "#" (1+ blank)))
>
> So I think Pandoc is technically at fault here. However, outside of
> Emacs's own context, I can see how the documentation could be
> misinterpreted in this case, so it's hard to fault them too much. :)
Hello,

Thibault Polge <thibault@thb.lt> writes:

> According to Org-Mode documentation[1],

See <https://orgmode.org/worg/dev/org-syntax.html#Comments> (with a nice
typo...)

Regards,

--
Nicolas Goaziou
Nicolas Goaziou writes:

> See <https://orgmode.org/worg/dev/org-syntax.html#Comments> (with a nice
> typo...)

Thanks Nicolas, just a small detail though: unless this is a planned
(breaking) change, I believe the description you linked should read:

  A “comment line” starts with *zero or more whitespace characters,
  followed by* a hash sign, followed by a whitespace character or an end
  of line.

Another detail: it could be nice to have a small appendix somewhere
mapping character names to codepoints, since Unicode has no less than
three “number signs” (from Wikipedia):

- U+0023 # NUMBER SIGN (HTML &#35;). Other attested names in Unicode
  are: pound sign, hash, crosshatch, octothorpe.
- U+FF03 ＃ FULLWIDTH NUMBER SIGN (HTML &#65283;)
- U+FE5F ﹟ SMALL NUMBER SIGN (HTML &#65119;)

Regards,
Thibault
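As a small illustration of the name-to-codepoint mapping suggested above, the following sketch looks up the official Unicode names of the three code points using Python's standard `unicodedata` module. The list `NUMBER_SIGNS` and the helper `describe` are hypothetical names introduced for the example.

```python
import unicodedata

# The three code points listed above.
NUMBER_SIGNS = [0x0023, 0xFF03, 0xFE5F]


def describe(codepoint):
    """Return a 'U+XXXX NAME' string for one code point."""
    return f"U+{codepoint:04X} {unicodedata.name(chr(codepoint))}"


for cp in NUMBER_SIGNS:
    print(describe(cp))
```

The three characters are distinct code points, so a byte-for-byte comparison against "#" only ever matches U+0023.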
Hello,

Thibault Polge <thibault@thb.lt> writes:

> Thanks Nicolas, just a small detail though: unless this is a planned
> (breaking) change, I believe the description you linked should read:
>
> A “comment line” starts with *zero or more whitespace characters,
> followed by* a hash sign, followed by a whitespace character or an end
> of line.

True. I fixed that.

> Another detail: it could be nice to have a small appendix somewhere
> mapping character names to codepoints, since Unicode has no less than
> three “number signs” (from Wikipedia):
>
> - U+0023 # NUMBER SIGN (HTML &#35;). Other attested names in Unicode
>   are: pound sign, hash, crosshatch, octothorpe.
> - U+FF03 ＃ FULLWIDTH NUMBER SIGN (HTML &#65283;)
> - U+FE5F ﹟ SMALL NUMBER SIGN (HTML &#65119;)

This is left as an exercise to the reader. ;)

Regards,

--
Nicolas Goaziou
>>>>> On Mon, 28 Oct 2019 17:16:55 +0100, Nicolas Goaziou <mail@nicolasgoaziou.fr> said:
Nicolas> Hello,
Nicolas> Thibault Polge <thibault@thb.lt> writes:
>> Thanks Nicolas, just a small detail though: unless this is a planned
>> (breaking) change, I believe the description you linked should read:
>>
>> A “comment line” starts with *zero or more whitespace characters,
>> followed by* a hash sign, followed by a whitespace character or an end
>> of line.
Nicolas> True. I fixed that.
end of line *is* a whitespace character, but Iʼm not going to argue
that. Iʼm going to argue that this doesnʼt cover the case of a '#' at
EOB without a newline, hence saying 'zero or more' would be better.
(and if it really is *one* whitespace character, thatʼs a breaking
change from at least org-9.2.6, which allows zero-or-more).
Robert
Robert Pluim writes:

> end of line *is* a whitespace character, but Iʼm not going to argue
> that. Iʼm going to argue that this doesnʼt cover the case of a '#' at
> EOB without a newline, hence saying 'zero or more' would be better.

But zero-or-more would mean that this line:

  #Alpha

is a comment, along with:

  #+TITLE: My Org document

and virtually all Org meta-lines.

I've thought about the \n#<EOB> issue, but I haven't tested how the
current implementation behaves in this regard. I think the recent
changes in Pandoc would parse it as a comment.

Regards,
Thibault
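One way to probe these cases is a sketch of the revised wording as a Python regexp. This is only one possible reading, not Org's or Pandoc's implementation: "whitespace character" is assumed to mean space or tab, "end of line" is expressed with `$`, and `COMMENT_RE`, `comment_lines` and the sample buffer are hypothetical.

```python
import re

# One reading of the revised wording: optional leading spaces/tabs, a "#",
# then either a space/tab or the end of the line. Treating "whitespace
# character" as space-or-tab is an assumption.
COMMENT_RE = re.compile(r"^[ \t]*#(?:[ \t]|$)", re.MULTILINE)


def comment_lines(text):
    """Return the lines of `text` that the rule classifies as comments."""
    return [line for line in text.split("\n") if COMMENT_RE.match(line)]


# Hypothetical buffer that ends with a bare "#" and no final newline.
BUFFER = "#+TITLE: My Org document\n#Alpha\n# comment\n#"

print(comment_lines(BUFFER))
```

Under this reading the meta-line and "#Alpha" are left alone, while "# comment" and the trailing bare "#" are both treated as comments, because Python's `$` also matches at the very end of the text. How Org itself behaves at end of buffer is not settled by this sketch.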
>>>>> On Tue, 29 Oct 2019 15:14:37 +0100, Thibault Polge <thibault@thb.lt> said:
Thibault> Robert Pluim writes:
>> end of line *is* a whitespace character, but Iʼm not going to argue
>> that. Iʼm going to argue that this doesnʼt cover the case of a '#' at
>> EOB without a newline, hence saying 'zero or more' would be better.
Thibault> But zero-or-more would mean that this line:
Thibault> #Alpha
Thatʼs the problem with human language, itʼs imprecise. I meant
^[ \t]*#[ \t]*$
Robert