emacs-orgmode@gnu.org archives
* [BUG] Underlined text in parentheses is not exported correctly
@ 2021-12-31 10:14 Juan Manuel Macías
  2021-12-31 11:08 ` Juan Manuel Macías
  2021-12-31 11:13 ` Ihor Radchenko
  0 siblings, 2 replies; 6+ messages in thread
From: Juan Manuel Macías @ 2021-12-31 10:14 UTC
  To: orgmode

Hi all,

I don't know if this is a known issue...

Consider the text:

(_underline_)

When exported to LaTeX we get:

(\textsubscript{underline}\_)

And to HTML:

(<sub>underline</sub>_)

The same result with:

(_underline_ text)

LaTeX:

(\textsubscript{underline}\_ text)

But this:

(this word is _underline_)

is exported correctly:

(this word is \uline{underline})

If I do M-: (occur org-match-substring-regexp)

I get:

     10:(_underline_)
     22:(_underline_ text)
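
For anyone who wants to reproduce the export results without a full
document, a minimal sketch follows. It assumes `ox-latex' and `ox-html'
can be loaded and relies on `org-export-string-as'; the helper name
`my/org-export-snippet' is made up for illustration, and the exact
output will depend on the Org version and on local export settings.

#+begin_src emacs-lisp
  (require 'ox-latex)
  (require 'ox-html)

  ;; Hypothetical helper: export a short Org string, body only.
  (defun my/org-export-snippet (text backend)
    "Export the Org string TEXT with BACKEND and return the body only."
    (org-export-string-as text backend t))

  ;; The two cases from this report: evaluate each form and compare.
  (my/org-export-snippet "(_underline_)" 'latex)
  (my/org-export-snippet "(this word is _underline_)" 'latex)
  (my/org-export-snippet "(_underline_)" 'html)
#+end_src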

Best regards, and happy New Year,

Juan Manuel 



* Re: [BUG] Underlined text in parentheses is not exported correctly
From: Juan Manuel Macías @ 2021-12-31 11:08 UTC
  To: orgmode

Juan Manuel Macías writes:

> If I do M-: (occur org-match-substring-regexp)
>
> I get:
>
>      10:(_underline_)
>      22:(_underline_ text)

Well, in my case the temporary workaround was to force super/subscripts
with braces:

#+begin_src emacs-lisp
  (defun my-org-element-subscript-with-braces-parser ()
    "Parse a subscript at point, but only when it is written with braces."
    (save-excursion
      (unless (bolp) (backward-char))
      (when (looking-at org-match-substring-with-braces-regexp)
	(let ((bracketsp (match-beginning 4))
	      (begin (match-beginning 2))
	      (contents-begin (or (match-beginning 4)
				  (match-beginning 3)))
	      (contents-end (or (match-end 4) (match-end 3)))
	      (post-blank (progn (goto-char (match-end 0))
				 (skip-chars-forward " \t")))
	      (end (point)))
	  (list 'subscript
		(list :begin begin
		      :end end
		      :use-brackets-p bracketsp
		      :contents-begin contents-begin
		      :contents-end contents-end
		      :post-blank post-blank))))))

(advice-add 'org-element-subscript-parser :override #'my-org-element-subscript-with-braces-parser)
#+end_src
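
If the override ever needs to be undone (for example, to compare
against the stock parser), a small sketch using `advice-remove' with the
same symbol and function as above should be enough:

#+begin_src emacs-lisp
  ;; Remove the :override advice installed above, restoring the
  ;; original `org-element-subscript-parser'.
  (advice-remove 'org-element-subscript-parser
                 #'my-org-element-subscript-with-braces-parser)
#+end_src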



* Re: [BUG] Underlined text in parentheses is not exported correctly
From: Ihor Radchenko @ 2021-12-31 11:13 UTC
  To: Juan Manuel Macías; +Cc: orgmode

Juan Manuel Macías <maciaschain@posteo.net> writes:

> I don't know if this is a known issue...
>
> Consider the text:
>
> (_underline_)

I am not sure if it is an actual issue.

Note that (_u can be interpreted as a subscript.
Org prioritises subscript over underline.

Looking at the code:

(?_ (or (and (memq 'subscript restriction)
	   (org-element-subscript-parser))
      (and (memq 'underline restriction)
	   (org-element-underline-parser))))

The priority appears to be intentional.
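
To illustrate why the order inside `or' matters, here is a toy sketch.
It does not call the real parsers: `my/toy-lex' is a made-up function,
and the two boolean arguments merely stand in for "this parser would
match here".

#+begin_src emacs-lisp
  ;; Hypothetical stand-in for the `?_' branch quoted above.
  ;; `or' returns the first non-nil value, so the earlier clause wins
  ;; whenever both parsers would match.
  (defun my/toy-lex (restriction subscript-matches underline-matches)
    "Mimic the `or' dispatch using booleans instead of real parsers."
    (or (and (memq 'subscript restriction) subscript-matches 'subscript)
        (and (memq 'underline restriction) underline-matches 'underline)))

  (my/toy-lex '(subscript underline) t t)   ; => subscript
  (my/toy-lex '(subscript underline) nil t) ; => underline
  (my/toy-lex '(underline) t t)             ; => underline
#+end_src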

Unless Nicolas (the author of this code) sees anything wrong here, I
recommend inserting a zero-width space in front of the first _ to make
sure that you get an underline, not a subscript (see
https://orgmode.org/manual/Escape-Character.html#Escape-Character).
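
A zero-width space can be typed with C-x 8 RET zero width space RET. As
a convenience, a small command along the following lines could be used
instead; the name `my/insert-zero-width-space' is only an example, and
binding it to a key is left to taste.

#+begin_src emacs-lisp
  ;; Hypothetical convenience command, not part of Org.
  (defun my/insert-zero-width-space ()
    "Insert a zero-width space (U+200B) at point."
    (interactive)
    (insert-char #x200B))
#+end_src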

P.S.
Note that the fontification you observe in Org is wrong. It is not how
the actual exporter sees the text. I am sorry about this. Fixing the
fontification is a work-in-progress.

Best,
Ihor




* Re: [BUG] Underlined text in parentheses is not exported correctly
From: Juan Manuel Macías @ 2021-12-31 11:31 UTC
  To: Ihor Radchenko; +Cc: orgmode

Ihor Radchenko writes:

> I am not sure if it is an actual issue.
>
> Note that (_u can be interpreted as a subscript.
> Org prioritises subscript over underline.
>
> Looking at the code:
>
> (?_ (or (and (memq 'subscript restriction)
> 	   (org-element-subscript-parser))
>       (and (memq 'underline restriction)
> 	   (org-element-underline-parser))))
>
> The priority appears to be intentional.

I see. But then the compatibility with the rest of the emphasis is
broken. I mean, the user would expect things like (_underline_) to be
exported as (\uline{underline}), in the same way that (/emphasis/) is
exported as (\emph{emphasis}). I would say there is a slight
inconsistency in the syntax here.

Anyway, in my case I have solved it by always requiring brackets for
super/subscripts, overriding `org-element-subscript-parser' (I never
use them without brackets), as I mentioned in my previous message.

Best regards,

Juan Manuel 



* [PATCH] Re: [BUG] Underlined text in parentheses is not exported correctly
From: Ihor Radchenko @ 2021-12-31 14:43 UTC
  To: Juan Manuel Macías; +Cc: orgmode


Juan Manuel Macías <maciaschain@posteo.net> writes:

>> The priority appears to be intentional.
>
> I see. But then the compatibility with the rest of the emphasis is
> broken. I mean, the user would expect things like (_underline_) to be
> exported as (\uline{underline}), in the same way that (/emphasis/) is
> exported as (\emph{emphasis}). I would say there is a slight
> inconsistency in the syntax here.

I agree with you. I think that the initial intention was to avoid
parsing things like (x+y)_1+x_2 as underline.

However, thinking about it more, I feel that prioritising underline
should work better. The underline parser recently got changed into a
stricter version. Now, only underlines that start after a space, -, (, ', ",
or { are recognised as underlines.
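
As a rough illustration of that rule (not the actual code in
org-element.el), the check on the preceding character could be sketched
like this. The names `my/underline-pre-chars' and
`my/underline-may-start-after-p' are invented here, the character list
is simply the one given in this message plus tab and newline as
"spaces", and the beginning-of-line case is not modelled.

#+begin_src emacs-lisp
  ;; Illustrative only: characters after which an underline may start,
  ;; taken from the list in this message.
  (defconst my/underline-pre-chars (string-to-list " \t\n-('\"{")
    "Characters assumed to be allowed right before an opening `_'.")

  (defun my/underline-may-start-after-p (char)
    "Return t if CHAR is in `my/underline-pre-chars', nil otherwise."
    (and (memql char my/underline-pre-chars) t))

  (my/underline-may-start-after-p ?\() ; => t, as in "(_underline_)"
  (my/underline-may-start-after-p ?x)  ; => nil, as in "x_1"
#+end_src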

So, the attached patch changes the parsing priority.
Maybe Nicolas knows some tricky cases where the patch breaks things,
but any such cases are not covered by the tests.

Best,
Ihor


[-- Attachment #2: 0001-Fix-underline-parser-inside-parenthesis.patch --]
[-- Type: text/x-diff, Size: 2454 bytes --]

From 12272f1ea89c169dcbece009c3a227e354019366 Mon Sep 17 00:00:00 2001
Message-Id: <12272f1ea89c169dcbece009c3a227e354019366.1640961654.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Fri, 31 Dec 2021 22:39:03 +0800
Subject: [PATCH] Fix underline parser inside parenthesis

* lisp/org-element.el (org-element--object-lex): prioritise underline
parser over subscript.  `org-element-underline-parser' is more strict
compared to `org-element-subscript-parser'.
* testing/lisp/test-org-element.el (test-org-element/underline-parser):
Add test.

Fixes https://list.orgmode.org/87v8z52eom.fsf@posteo.net/T/#t
---
 lisp/org-element.el              |  8 ++++----
 testing/lisp/test-org-element.el | 11 ++++++++++-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 45ddc79b7..c9d1d80bb 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -4850,10 +4850,10 @@ (defun org-element--object-lex (restriction)
 		    (pcase (char-after)
 		      (?^ (and (memq 'superscript restriction)
 			       (org-element-superscript-parser)))
-		      (?_ (or (and (memq 'subscript restriction)
-				   (org-element-subscript-parser))
-			      (and (memq 'underline restriction)
-				   (org-element-underline-parser))))
+		      (?_ (or (and (memq 'underline restriction)
+				   (org-element-underline-parser))
+                              (and (memq 'subscript restriction)
+				   (org-element-subscript-parser))))
 		      (?* (and (memq 'bold restriction)
 			       (org-element-bold-parser)))
 		      (?/ (and (memq 'italic restriction)
diff --git a/testing/lisp/test-org-element.el b/testing/lisp/test-org-element.el
index 338204eab..b58d71c8c 100644
--- a/testing/lisp/test-org-element.el
+++ b/testing/lisp/test-org-element.el
@@ -2661,7 +2661,16 @@ (ert-deftest test-org-element/underline-parser ()
      (org-test-with-temp-text "_first line\nsecond line_"
        (org-element-map
 	   (org-element-parse-buffer) 'underline #'identity nil t)))
-    '("first line\nsecond line"))))
+    '("first line\nsecond line")))
+  ;; Starting after non-blank
+  (should
+   (eq 'underline
+       (org-test-with-temp-text "(_under<point>line_)"
+         (org-element-type (org-element-context)))))
+  (should-not
+   (eq 'underline
+       (org-test-with-temp-text "x_under<point>line_)"
+         (org-element-type (org-element-context))))))
 
 
 ;;;; Verbatim
-- 
2.32.0



* Re: [PATCH] Re: [BUG] Underlined text in parentheses is not exported correctly
From: Juan Manuel Macías @ 2022-01-01 11:28 UTC
  To: Ihor Radchenko; +Cc: orgmode

Ihor Radchenko writes:

> However, thinking about it more, I feel that prioritising underline
> should work better. The underline parser recently got changed into a
> stricter version. Now, only underlines that start after a space, -, (, ', ",
> or { are recognised as underlines.
>
> So, the attached patch changes the parsing priority.
> Maybe Nicolas knows some tricky cases where the patch breaks things,
> but any such cases are not covered by the tests.

Great! I vote for this patch; I think it is a necessary change. I have
not found any errors in my own use so far.
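
For others who want to check the patched behaviour without the test
suite, a sketch along these lines mirrors the two cases added to
`test-org-element/underline-parser'. It uses a plain temporary buffer
instead of `org-test-with-temp-text'; the helper name
`my/org-object-type-at' is made up, and the result depends on whether
the patch is applied.

#+begin_src emacs-lisp
  (require 'org)
  (require 'org-element)

  ;; Hypothetical helper: report what the parser sees at a position.
  (defun my/org-object-type-at (text pos)
    "Insert TEXT in a temporary Org buffer and return the type at POS."
    (with-temp-buffer
      (org-mode)
      (insert text)
      (goto-char pos)
      (org-element-type (org-element-context))))

  ;; Position 4 is inside the word "underline" in both strings.
  ;; Per the tests in the patch, the first form should yield
  ;; `underline' and the second should not.
  (my/org-object-type-at "(_underline_)" 4)
  (my/org-object-type-at "x_underline_)" 4)
#+end_src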

Best regards,

Juan Manuel 



