I'm starting to figure out tangle by wrapping chunks of my Emacs init.el in #+begin_src/#+end_src and then hitting C-c C-v t. It has been working fine, but one block simply refused to participate.

I eventually tracked the problem down to a stray NULL character that had found its way into some of the Lisp comments in that particular chunk of my init.el. It completely disabled tangling of the entire block. Blocks before and after that one, however, all tangle nicely.

The attached .org file describes a simple test to demonstrate the problem. I've also attached a .zip version, in case the NULL character in the test doesn't survive the gmailing process. (The null is in BLOCK 2, two characters after the '3' in ';; line3'. If it's there, you should see the usual ^@ (as a single character) placeholder.)

[-- Attachment #2: test-effect-of-null.org --]
[-- Type: application/octet-stream, Size: 947 bytes --]

[-- Attachment #3: test-effect-of-null.org.zip --]
[-- Type: application/zip, Size: 999 bytes --]
Tommy Kelly <tommy.kelly@verilab.com> writes:

> The attached .org file describes a simple test to demonstrate the problem.
> I've also attached a .zip version, in case the NULL character in the test
> doesn't survive the gmailing process. (The null is in BLOCK 2, two
> characters after the '3' in ';; line3'. If it's there, you should see the
> usual ^@ (as a single character) placeholder.)

Confirmed.

This is because `org-babel-src-block-regexp' explicitly prohibits null characters in the body. The situation is similar with `org-block-regexp', `org-clock-drawer-re', `org-latex-regexps', and a number of other places in the Org sources.

At least for src blocks, prohibiting the null character is inconsistent with the org-element parser.

I am not sure what the rationale is behind not allowing the null character. I see no clues in the git history, and only a single possibly relevant comment in `org-latex-regexps':

    ;; \000 in the following regexp is needed for org-inside-LaTeX-fragment-p

However, `org-inside-LaTeX-fragment-p' itself is outdated and needs to be replaced with org-element machinery.

So, we should probably remove zero-width shenanigans from the code. Unless I miss something.

Bastien, maybe you recall something about the presence of the null character in regexps?

P.S. If we decide to remove the null character, I'd prefer to do it after the release: this change may affect a lot of code, and the bug is not major enough to risk breakage.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
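
The effect described above can be reproduced in isolation. What follows is only a minimal sketch: `my/demo-block-re' and `my/demo-block-matches-p' are hypothetical names, and the regexp is a deliberately simplified stand-in, not the actual `org-babel-src-block-regexp'. The only property it shares with the real one is a body group written as [^\000], which excludes the NUL character.

```elisp
;; Toy regexp, NOT the real `org-babel-src-block-regexp'.
;; The body group is [^\000]*?: any run of characters except NUL.
(defvar my/demo-block-re
  "^#\\+begin_src.*\n\\([^\000]*?\\)\n#\\+end_src"
  "Simplified src-block regexp whose body group excludes NUL.")

(defun my/demo-block-matches-p (body)
  "Return non-nil if a toy src block wrapping BODY matches the regexp."
  (let ((case-fold-search t))
    (string-match-p
     my/demo-block-re
     (concat "#+begin_src emacs-lisp\n" body "\n#+end_src"))))

;; A clean body matches (the result is the match position, non-nil).
(my/demo-block-matches-p ";; line3\n(message \"ok\")")

;; The same body with a stray NUL after the comment returns nil,
;; so anything searching with this regexp skips the whole block.
(my/demo-block-matches-p ";; line3 \000\n(message \"ok\")")
```

Because the body group cannot cross the NUL, the match fails for that one block only, which is consistent with the neighbouring blocks tangling normally.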
Hi Ihor,

Ihor Radchenko <yantar92@posteo.net> writes:

> So, we should probably remove zero-width shenanigans from the code.

+1.

> Unless I miss something.
>
> Bastien, maybe you recall something about the presence of the null
> character in regexps?

No, I don't.

> P.S. If we decide to remove the null character, I'd prefer to do it
> after the release: this change may affect a lot of code, and the bug is
> not major enough to risk breakage.

Yes, that's more prudent.

-- 
 Bastien
Bastien Guerry <bzg@gnu.org> writes:

> Ihor Radchenko <yantar92@posteo.net> writes:
>
>> So, we should probably remove zero-width shenanigans from the code.
>
> +1.

On closer inspection, it appears to me that [^\000] was simply used to match "anything, including newline" in most places.

Fixed, on main.
https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=eaf274909

-- 
Ihor Radchenko // yantar92
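
For readers wondering what "anything, including newline" can look like without excluding NUL, here is a hedged sketch. It uses the alternation \(?:.\|\n\), one common Emacs regexp idiom for this; whether the commit above uses this exact form is an assumption that has not been checked here. `my/demo-any-char-re' and `my/demo-block-body' are hypothetical names, and the regexp is a toy, not one of Org's.

```elisp
;; Toy regexp, not taken from the Org sources.
;; In Emacs regexps `.' matches any character except newline (NUL
;; included), so \(?:.\|\n\) matches any single character at all.
(defvar my/demo-any-char-re
  "^#\\+begin_src.*\n\\(\\(?:.\\|\n\\)*?\\)\n#\\+end_src"
  "Simplified src-block regexp whose body group accepts any character.")

(defun my/demo-block-body (text)
  "Return the body of the first toy src block in TEXT, or nil."
  (let ((case-fold-search t))
    (when (string-match my/demo-any-char-re text)
      (match-string 1 text))))

;; The body is returned intact even though it contains a NUL.
(my/demo-block-body
 "#+begin_src emacs-lisp\n;; line3 \000\n(message \"ok\")\n#+end_src")
```

The non-greedy *? keeps the body from running past the first #+end_src, which is the job the original [^\000]*? form was doing apart from its accidental exclusion of NUL.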