Remember to cover the basics, that is, what you expected to happen and
what in fact did happen. You don't know how to make a good report? See

https://orgmode.org/manual/Feedback.html#Feedback

Your bug report will be posted to the Org mailing list.
------------------------------------------------------------------------

To reproduce:

- create an org-file with the following content:
  /Foo [[https://taz.de/!5843294/][link with a bang]]/
- M-x org-html-export-to-html

Expected: The HTML-file contains an italic link named "link with a bang".

Actual: The HTML-file contains a broken link with only the domain:
<i>Foo [[<a href="https://taz.de">https://taz.de</a></i>!5843294/][link with a bang]]/</p>

Best wishes,
Arne

Emacs  : GNU Emacs 27.2 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.30, cairo version 1.16.0)
Package: Org mode version 9.5.2 (N/A @ /gnu/store/89yvbijwnvsbpa5h33mvbgh1gy9w30n2-emacs-org-9.5.2/share/emacs/site-lisp/org-9.5.2/)
--
Being apolitical means being political without noticing it.
draketo.de
"Dr. Arne Babenhauserheide" <arne_bab@web.de> writes: > To reproduce: > > - create an org-file with the following content: > /Foo [[https://taz.de/!5843294/][link with a bang]]/ > - M-x org-html-export-to-html > > Expected: The HTML-file contains an italic link named "link with a bang". > > Actual: The HTML-file contains a broken link with only the domain: > <i>Foo [[<a href="https://taz.de">https://taz.de</a></i>!5843294/][link with a bang]]/</p> Confirmed. But with a caveat. Despite intuition, your example can be treated in two ways: 1. <begin italic>/Foo [[https://taz.de<end italic>/!5843294/][link witha bang]]/ 2. <begin italic>/Foo <begin link>[[...]]/<end italic> Org mode always chooses the first case as it prioritise markup that starts early and ends early. To force Org mode not treat internal /! as italics ending, you can insert a zero-width space before "/": <zero width space>/! So, what you see is not exactly a bug, but non-intuitive behaviour of Org. (I do not like it, but we have reasons why Org parser behaves this way). On the other hand, the example link could be inserted using org-insert-link. If one does the following: 1. emacs -Q /tmp/test.org 2. Type "/Begin italic " 3. C-c C-l https://taz.de/!5843294/ <RET> <test> <RET> 4. The inserted text is not a link because the problematic /! is not fixed automatically. I consider the above to be at least a bug in org-insert-link. Best, Ihor
On 30/04/2022 16:37, Ihor Radchenko wrote:
> "Dr. Arne Babenhauserheide" <arne_bab@web.de> writes:
>
>> To reproduce:
>>
>> - create an org-file with the following content:
>>   /Foo [[https://taz.de/!5843294/][link with a bang]]/
>> - M-x org-html-export-to-html
>>
>> Expected: The HTML-file contains an italic link named "link with a bang".
>>
>> Actual: The HTML-file contains a broken link with only the domain:
>> <i>Foo [[<a href="https://taz.de">https://taz.de</a></i>!5843294/][link with a bang]]/</p>
>
> Confirmed.

Nicolas clearly expressed that it is a feature of the Org parser though.

Moreover, this is a duplicate of another item already tracked on
updates.orgmode.org:

2021-09-03 5:17 Dr. Arne Babenhauserheide Bug: PDF Export of Link fails
https://list.orgmode.org/87pmtqp79s.fsf@web.de/T/#u

The following markup should be used instead:

/Foo/ [[https://taz.de/!5843294/][/link with a bang/]]

> To force Org mode not to treat internal /! as italics ending, you can
> insert a zero-width space before "/": <zero width space>/!

Unfortunately it requires an additional export filter to remove zero
width spaces.

> On the other hand, the example link could be inserted using
> org-insert-link.
>
> If one does the following:
> 1. emacs -Q /tmp/test.org
> 2. Type "/Begin italic "
> 3. C-c C-l https://taz.de/!5843294/ <RET> <test> <RET>
> 4. The inserted text is not a link because the problematic /! is not
>    fixed automatically.
>
> I consider the above to be at least a bug in org-insert-link.

Timothy suggested to fix `org-insert-link' somehow in that thread.

P.S. Actually I like the behavior of pandoc

printf '%s' '/Foo [[https://taz.de/!5843294/][link with a bang]]/' |
  pandoc -f org -t html

<p><em>Foo <a href="https://taz.de/!5843294/">link with a bang</a></em></p>

Juan Manuel Macías to emacs-orgmode. Pandoc and nested emphases. Fri, 18 Feb 2022 00:47:18 +0000.
https://list.orgmode.org/87sfshgfvt.fsf@posteo.net
Max Nikulin <manikulin@gmail.com> writes:

>> Confirmed.
>
> Nicolas clearly expressed that it is a feature of the Org parser though.
>
> Moreover, this is a duplicate of another item already tracked on
> updates.orgmode.org:
>
> 2021-09-03 5:17 Dr. Arne Babenhauserheide Bug: PDF Export of Link fails
> https://list.orgmode.org/87pmtqp79s.fsf@web.de/T/#u
>
> The following markup should be used instead:
>
> /Foo/ [[https://taz.de/!5843294/][/link with a bang/]]
> ...
>> I consider the above to be at least a bug in org-insert-link.
>
> Timothy suggested to fix `org-insert-link' somehow in that thread.

Yeah. I recall a number of bug reports related to this behaviour.
Though I wanted to focus on org-insert-link here.

We can expect users to change the markup if they type a problematic link
manually, but not when specialised functions like org-insert-link are
used. In this scenario, org-insert-link should take care not to mess up
the existing markup.

>> To force Org mode not to treat internal /! as italics ending, you can
>> insert a zero-width space before "/": <zero width space>/!
>
> Unfortunately it requires an additional export filter to remove zero
> width spaces.

Yeah. Right. It should even be an easy patch, which would be welcome :)

> P.S. Actually I like the behavior of pandoc
>
> printf '%s' '/Foo [[https://taz.de/!5843294/][link with a bang]]/' |
>   pandoc -f org -t html
>
> <p><em>Foo <a href="https://taz.de/!5843294/">link with a
> bang</a></em></p>

I also like such behaviour, but it would require multi-pass parsing or
parser tree branching. Nicolas opposed it.

Best,
Ihor
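Editorial illustration of what a filter that removes zero width spaces would do. This is a minimal Python sketch applied to already exported text. It is not the patch discussed above and not part of ox.el. Inside Org the same step would presumably be an export filter function written in Emacs Lisp. The sample HTML string is hypothetical.

```python
ZERO_WIDTH_SPACE = "\u200b"


def strip_zero_width_spaces(exported: str) -> str:
    """Remove every U+200B from already exported text."""
    return exported.replace(ZERO_WIDTH_SPACE, "")


# Hypothetical export result in which the escape character leaked
# into both the link target and the visible text.
html = '<i>Foo <a href="https://taz.de\u200b/!5843294/">link\u200b with a bang</a></i>'
clean = strip_zero_width_spaces(html)
print(clean)
```

Note that this removes every U+200B, including any that the author wanted to keep in the output, so a real filter might need a way to preserve some of them.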
On 30/04/2022 19:34, Ihor Radchenko wrote:
> Max Nikulin writes:
>
>> 2021-09-03 5:17 Dr. Arne Babenhauserheide Bug: PDF Export of Link fails
>> https://list.orgmode.org/87pmtqp79s.fsf@web.de/T/#u
>>
>> Timothy suggested to fix `org-insert-link' somehow in that thread.
>
> Yeah. I recall a number of bug reports related to this behaviour.
> Though I wanted to focus on org-insert-link here.

Then the older bug may be cancelled as a duplicate.

> We can expect users to change the markup if they type a problematic link
> manually, but not when specialised functions like org-insert-link are
> used. In this scenario, org-insert-link should take care not to mess up
> the existing markup.
>
>>> To force Org mode not to treat internal /! as italics ending, you can
>>> insert a zero-width space before "/": <zero width space>/!
>>
>> Unfortunately it requires an additional export filter to remove zero
>> width spaces.
>
> Yeah. Right. It should even be an easy patch, which would be welcome :)

I meant a custom user filter. I consider zero width spaces as the last
resort. Nicolas considered making zero width spaces an official part of
syntax stripped during export and a way to preserve some of them.

In the case of links I still prefer breaking emphasis at the link
borders. `org-insert-link' may check after inserting the markup if it
is parsed as a link and add more markers if necessary. Unfortunately
that is not always possible. In the following case

/inter[[https://orgmode.org/?oops=1][word]]link/

additional markers would not work (unless augmented by zero width
spaces, but at least they would be outside of the link target)

/inter/[[https://orgmode.org/?oops=1][/word/]]/link/

However, it is mostly a decent workaround since links are usually
surrounded by spaces.

At a certain moment I was surprised that emphasis markers are not
recognized at the borders of export snippets and that they are active
only at one side of links.
I am afraid that zero width spaces in link targets may lead to confusion
of users since in most cases e.g. http: URLs may be pasted to
external applications as is.
Max Nikulin <manikulin@gmail.com> writes:

>>> 2021-09-03 5:17 Dr. Arne Babenhauserheide Bug: PDF Export of Link fails
>>> https://list.orgmode.org/87pmtqp79s.fsf@web.de/T/#u
>>>
>>> Timothy suggested to fix `org-insert-link' somehow in that thread.
>>
>> Yeah. I recall a number of bug reports related to this behaviour.
>> Though I wanted to focus on org-insert-link here.
>
> Then the older bug may be cancelled as a duplicate.

Not sure. Even a fix to org-insert-link would not solve the problem with
unexpected export if the link is typed in manually. So, I'd rather keep
both the reports for the time being.

Or someone may go through all the related bugs and create a single giant
discussion to avoid scattering things around. In my notes, I have at
least 6 discussions related to edge cases of Org markup.

>>> Unfortunately it requires an additional export filter to remove zero
>>> width spaces.
>>
>> Yeah. Right. It should even be an easy patch, which would be welcome :)
>
> I meant a custom user filter. I consider zero width spaces as the last
> resort. Nicolas considered making zero width spaces an official part of
> syntax stripped during export and a way to preserve some of them.

I think it is already kind of official. At least, we directly suggest
using zero width spaces in
https://orgmode.org/manual/Escape-Character.html#Escape-Character

The other thing is that ox.el does not do anything about zero width
spaces.

> In the case of links I still prefer breaking emphasis at the link
> borders. `org-insert-link' may check after inserting the markup if it
> is parsed as a link and add more markers if necessary.
> ...
> /inter/[[https://orgmode.org/?oops=1][/word/]]/link/

I do not like this idea. It is fine when inserting a link into existing
emphasis, but what if an emphasis is applied around the link later? We
would also need to update org-emphasize and still have an issue because
many users simply type the emphasis markers manually.
> I am afraid that zero width spaces in link targets may lead to confusion
> of users since in most cases e.g. http: URLs may be pasted to
> external applications as is.

We already escape '\', '[', and ']' in links. Zero width spaces will not
make things much different. Of course, org-link-escape and
org-link-unescape will need to be updated.

Note that even copying URLs directly can be worked around using
filter-buffer-substring-function.

Best,
Ihor
On 01/05/2022 10:27, Ihor Radchenko wrote:
> Max Nikulin writes:
>
>>>> 2021-09-03 5:17 Dr. Arne Babenhauserheide Bug: PDF Export of Link fails
>>>> https://list.orgmode.org/87pmtqp79s.fsf@web.de/T/#u
>>
>> Then the older bug may be cancelled as a duplicate.
>
> Not sure. Even a fix to org-insert-link would not solve the problem with
> unexpected export if the link is typed in manually. So, I'd rather keep
> both the reports for the time being.

I would not insist any more. My point was: the same reporter, the same
case of punctuation after a slash in the link target, the same idea to
make `org-insert-link' smarter.

> Or someone may go through all the related bugs and create a single giant
> discussion to avoid scattering things around. In my notes, I have at
> least 6 discussions related to edge cases of Org markup.

I have some notes as well. Though I think it should be either a FAQ
entry or a separate document describing limitations of the parser (and
a test data set for the parser).

> I think it is already kind of official. At least, we directly suggest
> using zero width spaces in
> https://orgmode.org/manual/Escape-Character.html#Escape-Character

Things are more complicated. Without a filter (which is not mentioned
there) it may cause undesired line breaks (the primary purpose of zero
width space). Fortunately pdfLaTeX ignores them.

Tom Gillespie. On zero width spaces and Org syntax. Fri, 3 Dec 2021 20:04:28 -0800.
https://list.orgmode.org/CA+G3_PM4cxHa8bU+3QG541UiOauLNAQFZQu-+UKczx3itOeTHg@mail.gmail.com
suggested word joiner U+2060, but this character is not a space for
regular expressions.

I experimented a bit, but I can not provide a summary yet; my notes are
in an early draft stage. The "Escape Character" section should be
expanded to discuss more use cases.

>> In the case of links I still prefer breaking emphasis at the link
>> borders. `org-insert-link' may check after inserting the markup if it
>> is parsed as a link and add more markers if necessary.
>> ...
>> /inter/[[https://orgmode.org/?oops=1][/word/]]/link/
>
> I do not like this idea. It is fine when inserting a link into existing
> emphasis, but what if an emphasis is applied around the link later? We
> would also need to update org-emphasize and still have an issue because
> many users simply type the emphasis markers manually.

Emphasis around other inline objects can easily be broken anyway. Try to
make the whole string bold:

begin =middle* verbatim= end

It may be useful to add a checker to `org-lint' that issues warnings for
confusing link targets.

I believe that zero width space does not belong to "plain text markup"
since it is invisible (at least by default). I see that printable ASCII
characters are already in use, but I still think that U+200B should be
used as rarely as possible. You are aware of my opinion now and I do not
need more. You are free to ignore it since I can not offer anything
better.
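Editorial illustration of the lint idea mentioned above. This is a minimal Python sketch and not an `org-lint' checker: it flags bracket links whose target contains an emphasis marker that is followed by a character allowed after a closing marker, so that the marker could end an emphasis opened before the link. MARKERS and POST_CHARS are assumptions modelled on Org 9.5 defaults, and the names LINK_RE and confusing_link_targets are hypothetical.

```python
import re

# Assumptions modelled on Org 9.5 defaults, not taken from org-lint.
MARKERS = set("*/_=~+")
POST_CHARS = set(" \t\n-.,:!?;'\")}\\[")

# Matches [[target]] and [[target][description]] on a single line.
LINK_RE = re.compile(r"\[\[([^\]\n]+)\](?:\[[^\]\n]*\])?\]")


def confusing_link_targets(text: str):
    """Return (target, offset, marker) for each marker inside a link
    target that could be read as the end of an emphasis span."""
    warnings = []
    for match in LINK_RE.finditer(text):
        target = match.group(1)
        # The last character is skipped: it is followed by "]", which
        # is not in POST_CHARS.
        for i in range(1, len(target) - 1):
            ch = target[i]
            if ch not in MARKERS:
                continue
            if not target[i - 1].isspace() and target[i + 1] in POST_CHARS:
                warnings.append((target, i, ch))
    return warnings


doc = (
    "/Foo [[https://taz.de/!5843294/][link with a bang]]/\n"
    "/inter[[https://orgmode.org/?oops=1][word]]link/\n"
    "See [[https://orgmode.org/manual/][the manual]].\n"
)
for target, offset, marker in confusing_link_targets(doc):
    print(f"{target}: '{marker}' at offset {offset} may end emphasis")
```

The sketch warns about the first two links from this thread and stays silent for the third. It deliberately ignores whether an emphasis is actually open before the link, so it over-reports, which may be acceptable for a warning.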