emacs-orgmode@gnu.org archives
From: Ihor Radchenko <yantar92@posteo.net>
To: Tommy Kelly <tommy.kelly@verilab.com>, Bastien <bzg@gnu.org>
Cc: emacs-orgmode@gnu.org
Subject: [BUG] Null character in block/drawer regexps (but not in org-element parser) (was: BUG? Null character prevents org-babel-tangle from tangling a block)
Date: Sat, 12 Nov 2022 12:59:40 +0000
Message-ID: <875yfk9vlv.fsf@localhost>
In-Reply-To: <CAMg28Ov+fn3uLBb1XTYDnxeNkL1M14edpVnsj6A-cGoPDP=9gg@mail.gmail.com>

Tommy Kelly <tommy.kelly@verilab.com> writes:

> The attached .org file describes a simple test to demonstrate the problem.
> I've also attached a .zip version, in case the NULL character in the test
> doesn't survive the gmailing process. (The null is in BLOCK 2, two
> characters after the '3' in ';; line3'. If it's there, you should see the
> usual ^@ (as a single character) placeholder.)

Confirmed.

This is because `org-babel-src-block-regexp' explicitly prohibits null
characters in the body. The situation is similar with `org-block-regexp',
`org-clock-drawer-re', `org-latex-regexps', and a number of other places
in the Org sources.
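
To illustrate the mechanism, here is a minimal sketch in Python rather
than Emacs Lisp. The pattern is a simplified, hypothetical stand-in and
not the actual definition of `org-babel-src-block-regexp'; it only
assumes that the body is matched with an "anything except NUL" character
class, which is enough to reproduce the symptom:

```python
import re

# Hypothetical, simplified src-block pattern (NOT the real Org regexp).
# The body group accepts any character except NUL, so a NUL anywhere
# in the body prevents the whole block from matching.
SRC_BLOCK = re.compile(
    r"^[ \t]*#\+begin_src[^\n]*\n"   # opening line with language/headers
    r"((?:[^\x00]*?\n)??)"            # body: any characters except NUL
    r"[ \t]*#\+end_src",              # closing line
    re.MULTILINE | re.IGNORECASE,
)

clean = "#+begin_src elisp\n;; line3\n#+end_src\n"
with_nul = "#+begin_src elisp\n;; line3 \x00\n#+end_src\n"

clean_match = SRC_BLOCK.search(clean)    # matches; group(1) is the body
nul_match = SRC_BLOCK.search(with_nul)   # None: the NUL breaks the body match
```

With a pattern of this shape, the block containing the null character is
simply never found, so anything that relies on the regexp to locate
blocks would skip it silently.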

At least for src blocks, prohibiting the null character is inconsistent with
the org-element parser. I am not sure what the rationale is behind not
allowing the null character. I see no clues in the git history, only a single
possibly relevant comment in `org-latex-regexps':

  ;; \000 in the following regexp is needed for org-inside-LaTeX-fragment-p

However, `org-inside-LaTeX-fragment-p' itself is outdated and needs
to be replaced with org-element machinery.

So, we should probably remove zero-width shenanigans from the code.

Unless I am missing something.

Bastien, maybe you recall something about the presence of the null
character in these regexps?

P.S. If we decide to remove the null character, I'd prefer to do it
after the release: this change may affect a lot of code, and the bug is
not major enough to justify the risk of breakage.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



Thread overview: 4+ messages
2022-11-12  2:13 BUG? Null character prevents org-babel-tangle from tangling a block Tommy Kelly
2022-11-12 12:59 ` Ihor Radchenko [this message]
2022-11-12 15:22   ` [BUG] Null character in block/drawer regexps (but not in org-element parser) Bastien Guerry
2023-04-25 19:12     ` Ihor Radchenko
