* Org-syntax: Intra-word markup @ 2021-12-02 10:50 Denis Maier 2021-12-02 11:18 ` Ihor Radchenko ` (2 more replies) 0 siblings, 3 replies; 69+ messages in thread From: Denis Maier @ 2021-12-02 10:50 UTC (permalink / raw) To: Org Mode List Hi everyone, while we're at discussing org syntax anyway, I thought it's time to bring up another syntax question: Currently, org syntax doesn't officially seem to support intra-word emphasis. Am I missing something? If the assessment is correct: Is there a reason for this? And, shouldn't that be officially added? Best, Denis ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier @ 2021-12-02 11:18 ` Ihor Radchenko 2021-12-02 11:30 ` Juan Manuel Macías 2021-12-02 11:58 ` Timothy 2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin 2 siblings, 1 reply; 69+ messages in thread From: Ihor Radchenko @ 2021-12-02 11:18 UTC (permalink / raw) To: Denis Maier; +Cc: Org Mode List Denis Maier <denismaier@mailbox.org> writes: > Currently, org syntax doesn't officially seem to support intra-word > emphasis. Am I missing something? intra-*word* works just fine for me. Best, Ihor ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 11:18 ` Ihor Radchenko @ 2021-12-02 11:30 ` Juan Manuel Macías 2021-12-02 11:36 ` Denis Maier ` (2 more replies) 0 siblings, 3 replies; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-02 11:30 UTC (permalink / raw) To: Ihor Radchenko; +Cc: orgmode, Denis Maier Hi Denis and Ihor, Ihor Radchenko writes: > Denis Maier <denismaier@mailbox.org> writes: > >> Currently, org syntax doesn't officially seem to support intra-word >> emphasis. Am I missing something? > > intra-*word* works just fine for me. > > Best, > Ihor I think what Denis is referring to is a construction of the type *intra*word, which, if I'm not mistaken, is not supported and can only be achieved by inserting a zero width space. Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 11:30 ` Juan Manuel Macías @ 2021-12-02 11:36 ` Denis Maier 2021-12-02 12:01 ` Ihor Radchenko 2021-12-02 11:42 ` Marco Wahl 2021-12-02 12:00 ` Ihor Radchenko 2 siblings, 1 reply; 69+ messages in thread From: Denis Maier @ 2021-12-02 11:36 UTC (permalink / raw) To: Juan Manuel Macías, Ihor Radchenko; +Cc: orgmode Yes, Juan Manuel. That's it. See for reference: https://stackoverflow.com/questions/1218238/how-to-make-part-of-a-word-bold-in-org-mode Best, Denis On 02.12.2021 at 12:30, Juan Manuel Macías wrote: > Hi Denis and Ihor, > > Ihor Radchenko writes: > >> Denis Maier <denismaier@mailbox.org> writes: >> >>> Currently, org syntax doesn't officially seem to support intra-word >>> emphasis. Am I missing something? >> intra-*word* works just fine for me. >> >> Best, >> Ihor > I think what Denis is referring to is a construction of the type > *intra*word, which, if I'm not mistaken, is not supported and can only > be achieved by inserting a zero width space. > > Best regards, > > Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 11:36 ` Denis Maier @ 2021-12-02 12:01 ` Ihor Radchenko 0 siblings, 0 replies; 69+ messages in thread From: Ihor Radchenko @ 2021-12-02 12:01 UTC (permalink / raw) To: Denis Maier; +Cc: Juan Manuel Macías, orgmode Denis Maier <denismaier@mailbox.org> writes: > Yes, Juan Manuel. That's it. > > See for reference: > https://stackoverflow.com/questions/1218238/how-to-make-part-of-a-word-bold-in-org-mode Please, do not use that stackoverflow answer. It is not officially supported, breaks exporting, and will not work anymore in future Org versions. Best, Ihor ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 11:30 ` Juan Manuel Macías 2021-12-02 11:36 ` Denis Maier @ 2021-12-02 11:42 ` Marco Wahl 2021-12-02 11:50 ` Denis Maier 2021-12-02 12:02 ` Ihor Radchenko 2021-12-02 12:00 ` Ihor Radchenko 2 siblings, 2 replies; 69+ messages in thread From: Marco Wahl @ 2021-12-02 11:42 UTC (permalink / raw) To: Juan Manuel Macías; +Cc: orgmode, Ihor Radchenko, Denis Maier Hi! >>> Currently, org syntax doesn't officially seem to support intra-word >>> emphasis. Am I missing something? >> >> intra-*word* works just fine for me. >> >> Best, >> Ihor > > I think what Denis is referring to is a construction of the type > *intra*word, which, if I'm not mistaken, is not supported and can only > be achieved by inserting a zero width space. Is there a recommended way to insert a zero width space? BTW occasionally I use

(defun mw-insert-zero-width-whitespace ()
  "Insert a space with zero width."
  (interactive)
  ;; U+200B is ZERO WIDTH SPACE.
  (insert ?\x200B))

Thanks and ciao, -- Marco ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 11:42 ` Marco Wahl @ 2021-12-02 11:50 ` Denis Maier 2021-12-02 12:10 ` Ihor Radchenko 2021-12-02 12:02 ` Ihor Radchenko 1 sibling, 1 reply; 69+ messages in thread From: Denis Maier @ 2021-12-02 11:50 UTC (permalink / raw) To: Marco Wahl, Juan Manuel Macías; +Cc: orgmode, Ihor Radchenko On 02.12.2021 at 12:42, Marco Wahl wrote: > Hi! > >>>> Currently, org syntax doesn't officially seem to support intra-word >>>> emphasis. Am I missing something? >>> >>> intra-*word* works just fine for me. >>> >>> Best, >>> Ihor >> >> I think what Denis is referring to is a construction of the type >> *intra*word, which, if I'm not mistaken, is not supported and can only >> be achieved by inserting a zero width space. > > Is there a recommended way to insert a zero width space? > > BTW occasionally I use > > (defun mw-insert-zero-width-whitespace () > "Insert a space with zero width." > (interactive) > (insert ?\x200B)) > > > Thanks and ciao, Just a further remark: while zero width spaces can be used as a workaround, they may create problems in some export formats. E.g., they will mess up hyphenation in LaTeX. I think I read somewhere that they can be removed with hooks or filters, but I think that shouldn't be necessary. Denis ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 11:50 ` Denis Maier @ 2021-12-02 12:10 ` Ihor Radchenko 2021-12-02 12:40 ` Denis Maier 2021-12-02 12:48 ` Max Nikulin 0 siblings, 2 replies; 69+ messages in thread From: Ihor Radchenko @ 2021-12-02 12:10 UTC (permalink / raw) To: Denis Maier; +Cc: Juan Manuel Macías, Marco Wahl, orgmode Denis Maier <denismaier@mailbox.org> writes: > > Just a further remark: while zero width spaces can be used as a > workaround, they may create problems in some export formats. E.g., they > will mess up hyphenation in LaTeX. I think I read somewhere that they > can be removed with hooks or filters, but I think that shouldn't be > necessary. Can you create an example of such a scenario and post it as a bug? Probably, we just need to strip all zero width spaces at the basic ox.el level. Best, Ihor ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 12:10 ` Ihor Radchenko @ 2021-12-02 12:40 ` Denis Maier 2021-12-02 12:54 ` Ihor Radchenko 2021-12-02 12:48 ` Max Nikulin 1 sibling, 1 reply; 69+ messages in thread From: Denis Maier @ 2021-12-02 12:40 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Marco Wahl, Juan Manuel Macías, orgmode On 02.12.2021 at 13:10, Ihor Radchenko wrote: > Denis Maier<denismaier@mailbox.org> writes: > >> Just a further remark: while zero width spaces can be used as a >> workaround, they may create problems in some export formats. E.g., they >> will mess up hyphenation in LaTeX. I think I read somewhere that they >> can be removed with hooks or filters, but I think that shouldn't be >> necessary. > Can you create an example of such a scenario and post it as a bug? > Probably, we just need to strip all zero width spaces at the basic ox.el > level. To be clear: that's not an Org bug. It's just that LaTeX won't be able to hyphenate such a word. If | is a zero width space, the word "hyphen|ation" is not the same as "hyphenation". 1. hyphenation 2. hyphen|ation Best, Denis ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 12:40 ` Denis Maier @ 2021-12-02 12:54 ` Ihor Radchenko 2021-12-02 13:14 ` Juan Manuel Macías 0 siblings, 1 reply; 69+ messages in thread From: Ihor Radchenko @ 2021-12-02 12:54 UTC (permalink / raw) To: Denis Maier; +Cc: Juan Manuel Macías, Marco Wahl, orgmode Denis Maier <denismaier@mailbox.org> writes: >> Can you create an example of such a scenario and post it as a bug? >> Probably, we just need to strip all zero width spaces at the basic ox.el >> level. > To be clear: that's not an Org bug. It's just that LaTeX won't be able > to hyphenate such a word. If | is a zero width space, the word "hyphen|ation" is not > the same as "hyphenation". > 1. hyphenation > 2. hyphen|ation You are right for your example, but if we force the user to write *hyphen*|ation to create bold emphasis, the result should not be any different from @@latex:\textbf{hyphen}ation@@. Meanwhile, *hyphen*|ation gets exported as \textbf{hyphen}|ation, keeping the zero width space. Best, Ihor ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 12:54 ` Ihor Radchenko @ 2021-12-02 13:14 ` Juan Manuel Macías 2021-12-02 13:28 ` Denis Maier 0 siblings, 1 reply; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-02 13:14 UTC (permalink / raw) To: Ihor Radchenko; +Cc: orgmode, denismaier Ihor Radchenko writes: > Denis Maier <denismaier@mailbox.org> writes: > >>> Can you create an example of such a scenario and post it as a bug? >>> Probably, we just need to strip all zero width spaces at the basic ox.el >>> level. >> To be clear: that's not an Org bug. It's just that LaTeX won't be able >> to hyphenate such a word. If | is a zero width space, the word "hyphen|ation" is not >> the same as "hyphenation". >> 1. hyphenation >> 2. hyphen|ation > > You are right for your example, but if we force the user to write > *hyphen*|ation to create bold emphasis, the result should not be any different > from @@latex:\textbf{hyphen}ation@@. Meanwhile, *hyphen*|ation > gets exported as \textbf{hyphen}|ation, keeping the zero width space.

I would say these are rather sporadic cases, and therefore difficult to reproduce. In the 'hyphenation' example, if we load the showhyphens package, we see that /hyphen/ation (with a zero width space) and \emph{hyphen}ation are hyphenated in the same way, but differently from hyphenation without emphasis (compiled with LuaTeX). Anyway, I have come across some curious cases. For example, a long time ago I defined a macro for text in other languages:

#+MACRO: lg (eval (if (org-export-derived-backend-p org-export-current-backend 'latex) (concat "@@latex:\\foreignlanguage{@@" $1 "@@latex:}{@@" "\u200B" $2 "\u200B" "@@latex:}@@") $2))

I needed to add a zero width space before and after the text, but doing so altered the shape of the text. That can be reproduced with this example:

#+LaTeX_Header: \usepackage{showhyphens}
#+LaTeX_Header:\usepackage{lipsum,multicol}
#+LaTeX_Header:\usepackage[spanish]{babel}
#+LaTeX_Header: \def\example{\lipsum[1]}
#+LaTeX_Header: \def\zwsp{\char"200B{}}
#+OPTIONS: toc:nil

@@latex:\begin{multicols}{2}@@

@@latex:\foreignlanguage{italian}{\zwsp\example\zwsp}@@

@@latex:\foreignlanguage{italian}{\example}@@

@@latex:\end{multicols}@@

Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 13:14 ` Juan Manuel Macías @ 2021-12-02 13:28 ` Denis Maier 0 siblings, 0 replies; 69+ messages in thread From: Denis Maier @ 2021-12-02 13:28 UTC (permalink / raw) To: Juan Manuel Macías, Ihor Radchenko; +Cc: orgmode On 02.12.2021 at 14:14, Juan Manuel Macías wrote: > Ihor Radchenko writes: > >> Denis Maier<denismaier@mailbox.org> writes: >> >>>> Can you create an example of such a scenario and post it as a bug? >>>> Probably, we just need to strip all zero width spaces at the basic ox.el >>>> level. >>> To be clear: that's not an Org bug. It's just that LaTeX won't be able >>> to hyphenate such a word. If | is a zero width space, the word "hyphen|ation" is not >>> the same as "hyphenation". >>> 1. hyphenation >>> 2. hyphen|ation >> You are right for your example, but if we force the user to write >> *hyphen*|ation to create bold emphasis, the result should not be any different >> from @@latex:\textbf{hyphen}ation@@. Meanwhile, *hyphen*|ation >> gets exported as \textbf{hyphen}|ation, keeping the zero width space. > I would say these are rather sporadic cases, and therefore > difficult to reproduce. In the 'hyphenation' example, if we load the > showhyphens package, we see that /hyphen/ation (with a zero width space) > and \emph{hyphen}ation are hyphenated in the same way, but differently > from hyphenation without emphasis (compiled with LuaTeX). Anyway, I > have come across some curious cases. For example, a long time ago I > defined a macro for text in other languages: #+MACRO: lg (eval (if > (org-export-derived-backend-p org-export-current-backend 'latex) > (concat "@@latex:\\foreignlanguage{@@" $1 "@@latex:}{@@" "\u200B" $2 > "\u200B" "@@latex:}@@") $2)) I needed to add a zero width space before > and after the text, but doing so altered the shape of the text. That can > be reproduced with this example: #+LaTeX_Header: > \usepackage{showhyphens} #+LaTeX_Header:\usepackage{lipsum,multicol} > #+LaTeX_Header:\usepackage[spanish]{babel} #+LaTeX_Header: > \def\example{\lipsum[1]} #+LaTeX_Header: \def\zwsp{\char"200B{}} > #+OPTIONS: toc:nil @@latex:\begin{multicols}{2}@@ > @@latex:\foreignlanguage{italian}{\zwsp\example\zwsp}@@ > @@latex:\foreignlanguage{italian}{\example}@@ > @@latex:\end{multicols}@@ Best regards, Juan Manuel Thanks Juan Manuel. I should have tried that first. Hyphenation is the same for both /hyphen/ation (with a zero width space) and \emph{hyphen}ation. (Maybe I can nudge Hans Hagen to add some low-level trickery in ConTeXt that removes the groups before doing the hyphenation... but that's a different story.) Anyway, as Juan Manuel shows, there can be cases where zero width spaces cause problems. Denis ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 12:10 ` Ihor Radchenko 2021-12-02 12:40 ` Denis Maier @ 2021-12-02 12:48 ` Max Nikulin 1 sibling, 0 replies; 69+ messages in thread From: Max Nikulin @ 2021-12-02 12:48 UTC (permalink / raw) To: emacs-orgmode On 02/12/2021 19:10, Ihor Radchenko wrote: > Denis Maier writes: > >> Just a further remark: while zero width spaces can be used as a >> workaround, they may create problems in some export formats. E.g., they >> will mess up hyphenation in LaTeX. I think I read somewhere that they >> can be removed with hooks or filters, but I think that shouldn't be >> necessary. > > Probably, we just need to strip all zero width spaces at the basic ox.el > level. I think legitimate cases where zero width spaces should be preserved in a document may exist, so unconditionally stripping them is not a perfect solution. I am afraid that regexps detecting the start and end of emphasis are like a short blanket. They will always fail for some cases, especially since verbatim text, URLs and similar contexts (which differ significantly from prose with respect to punctuation) do not have higher priority for the parser. An extensive test set is required for tuning the heuristics. Failures should be reported in such a way that overall quality can be estimated before and after a change. Ideally, the format of the file with such tests should allow the *same* input data to be used by other tools like ruby-org. ^ permalink raw reply [flat|nested] 69+ messages in thread
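[Editorial sketch] The stripping idea from this sub-thread can be tried locally without changing ox.el. The following is a minimal sketch, assuming a user-side export filter is acceptable. The function name my/org-latex-strip-zwsp is hypothetical; org-export-filter-final-output-functions and org-export-derived-backend-p are existing ox.el facilities. It removes every U+200B from LaTeX output only, so it would also remove a zero width space that was inserted on purpose, which is exactly the concern raised above.

```emacs-lisp
;; Sketch only: opt-in removal of zero width spaces from LaTeX export output.
(require 'ox)
(require 'ox-latex)

(defun my/org-latex-strip-zwsp (text backend _info)
  "Return TEXT without U+200B when BACKEND is derived from LaTeX.
Return nil for other backends so that their output is left unchanged."
  (when (org-export-derived-backend-p backend 'latex)
    (replace-regexp-in-string "\u200B" "" text t t)))

;; Final-output filters receive the complete exported string.
(add-to-list 'org-export-filter-final-output-functions
             #'my/org-latex-strip-zwsp)
```

Because the filter is installed by the user, documents that rely on deliberate zero width spaces are unaffected unless their author opts in.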
* Re: Org-syntax: Intra-word markup 2021-12-02 11:42 ` Marco Wahl 2021-12-02 11:50 ` Denis Maier @ 2021-12-02 12:02 ` Ihor Radchenko 1 sibling, 0 replies; 69+ messages in thread From: Ihor Radchenko @ 2021-12-02 12:02 UTC (permalink / raw) To: Marco Wahl; +Cc: Juan Manuel Macías, orgmode, Denis Maier Marco Wahl <marcowahlsoft@gmail.com> writes: > Is there a recommended way to insert a zero width space? C-x 8 <RET> ^ permalink raw reply [flat|nested] 69+ messages in thread
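[Editorial sketch] For reference, C-x 8 RET runs insert-char, which prompts for a character name or a hexadecimal code point; entering ZERO WIDTH SPACE or 200b inserts U+200B. If that is needed often, a dedicated command may be more convenient. The sketch below is one possible setup; the command name and the C-c z binding are hypothetical choices, not something Org provides.

```emacs-lisp
;; Sketch only: a command and key binding for inserting U+200B in Org buffers.
(require 'org)

(defun my/org-insert-zero-width-space ()
  "Insert U+200B (ZERO WIDTH SPACE) at point."
  (interactive)
  (insert-char #x200B))

;; Hypothetical binding; pick any key that is free in your setup.
(define-key org-mode-map (kbd "C-c z") #'my/org-insert-zero-width-space)
```

Typing *intra*, then C-c z, then word would produce the workaround discussed in this thread.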
* Re: Org-syntax: Intra-word markup 2021-12-02 11:30 ` Juan Manuel Macías 2021-12-02 11:36 ` Denis Maier 2021-12-02 11:42 ` Marco Wahl @ 2021-12-02 12:00 ` Ihor Radchenko [not found] ` <87r1avtdjy.fsf@ucl.ac.uk> 2021-12-02 12:28 ` Denis Maier 2 siblings, 2 replies; 69+ messages in thread From: Ihor Radchenko @ 2021-12-02 12:00 UTC (permalink / raw) To: Juan Manuel Macías; +Cc: orgmode, Denis Maier Juan Manuel Macías <maciaschain@posteo.net> writes: >> intra-*word* works just fine for me. >> > I think what Denis is referring to is a construction of the type > *intra*word, which, if I'm not mistaken, is not supported and can only > be achieved by inserting a zero width space. I see. We had a discussion about emphasis issues in https://orgmode.org/list/8735nnq73n.fsf@localhost The conclusion from there is that supporting such scenarios will introduce various edge cases. We would need to make the emphasis parser more and more complex, inevitably introducing errors. An alternative may be some kind of "forced" emphasis syntax where Org does not have to guess about the emphasis using non-transparent rules. But that is what the zero width space is for, and it is what we recommend in the Org manual. Best, Ihor ^ permalink raw reply [flat|nested] 69+ messages in thread
[parent not found: <87r1avtdjy.fsf@ucl.ac.uk>]
* Re: Org-syntax: Intra-word markup [not found] ` <87r1avtdjy.fsf@ucl.ac.uk> @ 2021-12-02 12:27 ` Denis Maier 2021-12-02 13:06 ` Eric S Fraga 0 siblings, 1 reply; 69+ messages in thread From: Denis Maier @ 2021-12-02 12:27 UTC (permalink / raw) To: Org Mode List Hi Eric, On 02.12.2021 at 13:08, Eric S Fraga wrote: > My solution, in these cases, is to fall back to LaTeX using @@latex:...@@ > (and the equivalent for HTML, if desired). Not pretty, but I need this so > seldom that I am happy with the Org emphasis support generally. > This works if your target is just LaTeX, but not if you have multiple targets, right? Denis ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 12:27 ` Denis Maier @ 2021-12-02 13:06 ` Eric S Fraga 0 siblings, 0 replies; 69+ messages in thread From: Eric S Fraga @ 2021-12-02 13:06 UTC (permalink / raw) To: Denis Maier; +Cc: Org Mode List On Thursday, 2 Dec 2021 at 13:27, Denis Maier wrote: > This works if your target is just latex, but not if you have multiple > targets, right? Multiple targets are possible: @@latex:\textbf{@@@@html:<strong>@@intra@@latex:}@@@@html:</strong>@@word. Just very ugly! 🤣 Of course, if you do this more than once, a macro can help... -- : Eric S Fraga, with org release_9.5.1-231-g6766c4 in Emacs 29.0.50 : Latest paper written in org: https://arxiv.org/abs/2106.05096 ^ permalink raw reply [flat|nested] 69+ messages in thread
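[Editorial sketch] The macro Eric alludes to could look roughly like this. It is a sketch under stated assumptions: the macro name "b" is hypothetical, only the LaTeX and HTML backends are covered, and org-export-global-macros (an existing ox.el option, available since Org 9.1) is used so that the definition lives in the Emacs configuration rather than in each file. The same template could instead be put on a #+MACRO: line in the document.

```emacs-lisp
;; Sketch only: a global export macro wrapping its argument in bold
;; for LaTeX and HTML by means of export snippets.
(require 'ox)

;; Hypothetical macro name "b"; $1 is the text to set in bold.
(add-to-list
 'org-export-global-macros
 '("b" . "@@latex:\\textbf{@@@@html:<strong>@@$1@@latex:}@@@@html:</strong>@@"))

;; In an Org file this would then be written as:
;;   {{{b(intra)}}}word
```

Export snippets for other backends are ignored, so a backend other than LaTeX or HTML would receive the plain text without emphasis.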
* Re: Org-syntax: Intra-word markup 2021-12-02 12:00 ` Ihor Radchenko [not found] ` <87r1avtdjy.fsf@ucl.ac.uk> @ 2021-12-02 12:28 ` Denis Maier 2021-12-02 12:55 ` Ihor Radchenko 1 sibling, 1 reply; 69+ messages in thread From: Denis Maier @ 2021-12-02 12:28 UTC (permalink / raw) To: Ihor Radchenko, Juan Manuel Macías; +Cc: orgmode On 02.12.2021 at 13:00, Ihor Radchenko wrote: > Juan Manuel Macías <maciaschain@posteo.net> writes: > >>> intra-*word* works just fine for me. >>> >> I think what Denis is referring to is a construction of the type >> *intra*word, which, if I'm not mistaken, is not supported and can only >> be achieved by inserting a zero width space. > I see. We had a discussion about emphasis issues in > https://orgmode.org/list/8735nnq73n.fsf@localhost > > The conclusion from there is that supporting such scenarios will > introduce various edge cases. We would need to make the emphasis parser > more and more complex, inevitably introducing errors. Thanks, I'll try to read that thread in due time. > > An alternative may be some kind of "forced" emphasis syntax where Org > does not have to guess about the emphasis using non-transparent rules. > But that is what the zero width space is for, and it is what we recommend in the > Org manual. As for the forced syntax: what do you think about the AsciiDoc solution? Denis ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 12:28 ` Denis Maier @ 2021-12-02 12:55 ` Ihor Radchenko 0 siblings, 0 replies; 69+ messages in thread From: Ihor Radchenko @ 2021-12-02 12:55 UTC (permalink / raw) To: Denis Maier; +Cc: Juan Manuel Macías, orgmode Denis Maier <denismaier@mailbox.org> writes: >> An alternative may be some kind of "forced" emphasis syntax where Org >> does not have to guess about the emphasis using non-transparent rules. >> But it's what zero width space is for and it is what we recommend in the >> Org manual. > As for the forced syntax. What do you think about the asciidoc solution? Can you elaborate? ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier 2021-12-02 11:18 ` Ihor Radchenko @ 2021-12-02 11:58 ` Timothy 2021-12-02 12:26 ` Denis Maier 2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin 2 siblings, 1 reply; 69+ messages in thread From: Timothy @ 2021-12-02 11:58 UTC (permalink / raw) To: Denis Maier; +Cc: emacs-orgmode Hi Denis, > Currently, org syntax doesn’t officially seem to support intra-word emphasis. Am > I missing something? I’d describe it as supported via zero width spaces. You may be interested in <https://blog.tecosaur.com/tmio/2021-05-31-async.html#easy-zero-width>. > If the assessment is correct: Is there a reason for this? And, shouldn’t that > be officially added? Do you happen to have any ideas on how this could be achieved? I’d rather not resort to having to do things like `\ast{}' and `\tilde{}' too much. All the best, Timothy ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 11:58 ` Timothy @ 2021-12-02 12:26 ` Denis Maier 2021-12-02 13:07 ` Ihor Radchenko 0 siblings, 1 reply; 69+ messages in thread From: Denis Maier @ 2021-12-02 12:26 UTC (permalink / raw) To: Timothy; +Cc: emacs-orgmode Hi Timothy, On 02.12.2021 at 12:58, Timothy wrote: > Hi Denis, > >> Currently, org syntax doesn’t officially seem to support intra-word emphasis. Am >> I missing something? > I’d describe it as supported via zero width spaces. > > You may be interested in <https://blog.tecosaur.com/tmio/2021-05-31-async.html#easy-zero-width>. Thanks, that's helpful. > >> If the assessment is correct: Is there a reason for this? And, shouldn’t that >> be officially added? > Do you happen to have any ideas on how this could be achieved? I’d rather not > resort to having to do things like `\ast{}' and `\tilde{}' too much. Well, not really. I just don't understand why /intra/word shouldn't mean \emph{intra}word. Pandoc's Markdown supports *intra*word; AsciiDoc supports it via unconstrained formatting pairs: https://docs.asciidoctor.org/asciidoc/latest/text/#unconstrained; so __intra__word. And, as Org syntax is said to be the superior markup language, I thought that must be possible ;-) I understand zero width spaces are the official workaround, but I don't really like having invisible characters in my documents. Automatically removing all of them on export might also introduce problems. Perhaps some have been added on purpose, and not just to help Org? As for suggestions: if just using /intra/word creates ambiguities, what about the AsciiDoc solution? So //intra//word? In fact, I'd even use raw LaTeX for these things. It's true, they are rare enough. So I wouldn't mind an occasional `\emph{}`. Best, Denis ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 12:26 ` Denis Maier @ 2021-12-02 13:07 ` Ihor Radchenko 2021-12-02 15:51 ` Max Nikulin 2021-12-02 19:03 ` Nicolas Goaziou 0 siblings, 2 replies; 69+ messages in thread From: Ihor Radchenko @ 2021-12-02 13:07 UTC (permalink / raw) To: Denis Maier; +Cc: emacs-orgmode, Nicolas Goaziou, Timothy Denis Maier <denismaier@mailbox.org> writes: > As for suggestions: If just using /intra/word creates ambiguities, what > about the asciidoc solution? So //intra//word? I do like this idea. Though I would also like to hear Nicolas' opinion. Best, Ihor ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 13:07 ` Ihor Radchenko @ 2021-12-02 15:51 ` Max Nikulin 2021-12-02 18:11 ` Tom Gillespie 2021-12-02 19:03 ` Nicolas Goaziou 1 sibling, 1 reply; 69+ messages in thread From: Max Nikulin @ 2021-12-02 15:51 UTC (permalink / raw) To: emacs-orgmode On 02/12/2021 20:07, Ihor Radchenko wrote: > >> As for suggestions: If just using /intra/word creates ambiguities, what >> about the asciidoc solution? So //intra//word? > > I do like this idea. - Some //text <https://orgmode.org/> surprise// - ++another ~i++~ problem++ First wins... ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 15:51 ` Max Nikulin @ 2021-12-02 18:11 ` Tom Gillespie 2021-12-02 19:09 ` Juan Manuel Macías ` (3 more replies) 0 siblings, 4 replies; 69+ messages in thread From: Tom Gillespie @ 2021-12-02 18:11 UTC (permalink / raw) To: emacs-orgmode I don't mean to be a wet blanket, but the edge cases for the current markup syntax are already hard enough to implement correctly, to the point where different parts of Org mode are inconsistent. Intra-word markup isn't viable because there simply isn't any sane way to parse something like *hello world*/hrm/oh no*. The other issue is that this will degrade parsing performance because almost every character could precede the start of a markup section. I recommend anyone suggesting solutions try to implement something that can parse the markup unambiguously with lots of nasty test cases. You will likely find that it is impossible to consistently tokenize markup, and that you have to hand write a whole bunch of heuristics, making Org syntax even harder to implement correctly. Any solution that suggests extending how =/*~+_ can be used gets a hard no from me. I could see teaching other exporters how to interpret \emph{hello}world, but trying to have any sane behavior for something like why *hello*world oh no a wild asterisk* is not worth it. Best, Tom ^ permalink raw reply [flat|nested] 69+ messages in thread
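[Editorial sketch] The ambiguity Tom describes can be made concrete with a toy enumeration. This is only an illustration, not how org-element parses emphasis: the helper names are hypothetical, and the code simply lists every pair of marker positions that could delimit an emphasis span once markers are allowed to open or close anywhere inside a word.

```emacs-lisp
;; Illustration only: enumerate candidate emphasis spans when a marker
;; may open or close at any position (no word-boundary constraints).
(require 'cl-lib)

(defun my/marker-positions (string marker)
  "Return the indices in STRING at which the character MARKER occurs."
  (cl-loop for ch across string
           for i from 0
           when (eq ch marker) collect i))

(defun my/candidate-spans (string marker)
  "Return every (OPEN . CLOSE) index pair of MARKER in STRING, OPEN < CLOSE."
  (let ((positions (my/marker-positions string marker)))
    ;; Pair each marker position with every later marker position.
    (cl-loop for (open . rest) on positions
             append (mapcar (lambda (close) (cons open close)) rest))))
```

For Tom's example, (my/candidate-spans "*hello world*/hrm/oh no*" ?*) yields three candidate pairs for the three asterisks, and nothing in the text itself says which pair the author meant. The current pre- and post-character rules exist to prune such candidates.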
* Re: Org-syntax: Intra-word markup 2021-12-02 18:11 ` Tom Gillespie @ 2021-12-02 19:09 ` Juan Manuel Macías 2021-12-04 13:07 ` Org-syntax: emphasis and not English punctuation Max Nikulin 2021-12-02 20:47 ` Org-syntax: Intra-word markup Denis Maier ` (2 subsequent siblings) 3 siblings, 1 reply; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-02 19:09 UTC (permalink / raw) To: Tom Gillespie; +Cc: orgmode Tom Gillespie writes: > I don't mean to be a wet blanket, but the edge cases for > the current markup syntax are already hard enough to > implement correctly, to the point where different parts of > Org mode are inconsistent. Intra-word markup isn't viable > because there simply isn't any sane way to parse something > like *hello world*/hrm/oh no*. The other issue is that this will > degrade parsing performance because almost every > character could precede the start of a markup section. > > I recommend anyone suggesting solutions try to implement > something that can parse the markup unambiguously with > lots of nasty test cases. You will likely find that it is impossible > to consistently tokenize markup, and that you have to hand > write a whole bunch of heuristics, making Org syntax even > harder to implement correctly. > > Any solution that suggests extending how =/*~+_ can be > used gets a hard no from me. I could see teaching other > exporters how to interpret \emph{hello}world, but trying > to have any sane behavior for something like > why *hello*world oh no a wild asterisk* > is not worth it. I believe that emphasis marks are a part of Org that can be very surprising to new users. I mean, there is a series of behaviors that seem obvious and trivial for emphasized text, but that are not possible in Org out of the box, unless you configure `org-emphasis-regexp-components'. Three quick examples. This is not possible in Org out of the box:

#+begin_example
[/emphasis/]
¡/emphasis/!
¿/Emphasis/?
#+end_example

Nor is it possible, out of the box, to extend emphasis beyond a certain number of lines. New users who come from other forms of markup may expect the obvious to be something like: some-text begin-emphasis whatever-is-in-between end-emphasis more-text Over time one ends up seeing these things more as a feature than as a bug :-) But those little inconsistencies make the Org syntax a bit ugly, IMHO. I can't think of how to improve that, though. Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: emphasis and not English punctuation 2021-12-02 19:09 ` Juan Manuel Macías @ 2021-12-04 13:07 ` Max Nikulin 2021-12-04 16:42 ` Juan Manuel Macías 0 siblings, 1 reply; 69+ messages in thread From: Max Nikulin @ 2021-12-04 13:07 UTC (permalink / raw) To: emacs-orgmode On 03/12/2021 02:09, Juan Manuel Macías wrote: > > I believe that emphasis marks are a part of Org that can be very > surprising to new users. I mean, there is a series of behaviors that seem > obvious and trivial for emphasized text, but that are not possible in Org > out of the box, unless you configure > `org-emphasis-regexp-components'. Three quick examples. This is > not possible in Org out of the box: > > #+begin_example > [/emphasis/] > ¡/emphasis/! > ¿/Emphasis/? > #+end_example Maybe this issue should be considered independently of intra-word emphasis. The second and third examples look like they should be supported. Ihor mentioned treating punctuation in a more general way. That requires a rich test set to estimate the effect of changes in the heuristics. I suspect some problems, since the start and end patterns are not symmetric, and I have not found a way to specify in a regexp only the punctuation marks that normally appear in front of words. Square brackets likely should be excluded somehow as well, since they are part of Org syntax. I am unsure whether it is possible to use just a regexp without additional checks of the candidates. Ihor Radchenko. [PATCH] Re: c47b535bb origin/main org-element: Remove dependency on ‘org-emphasis-regexp-components’ Sun, 21 Nov 2021 17:28:57 +0800. https://list.orgmode.org/87v90lzwkm.fsf@localhost ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: emphasis and not English punctuation 2021-12-04 13:07 ` Org-syntax: emphasis and not English punctuation Max Nikulin @ 2021-12-04 16:42 ` Juan Manuel Macías 0 siblings, 0 replies; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-04 16:42 UTC (permalink / raw) To: Max Nikulin; +Cc: orgmode Max Nikulin writes: > Maybe this issue should be considered independently of intra-word emphasis. Yes, I agree. Apologies for mixing this topic into the discussion about intra-word emphasis... > The second and third examples look like they should be supported. Ihor > mentioned treating punctuation in a more general way. That requires a rich > test set to estimate the effect of changes in the heuristics. I suspect some problems, > since the start and end patterns are not symmetric, and I have not found a > way to specify in a regexp only the punctuation marks that normally appear > in front of words. Square brackets likely should be excluded somehow > as well, since they are part of Org syntax. I am unsure whether it is > possible to use just a regexp without additional checks of the candidates. Ihor's idea seems interesting to me, although I understand the possible problems you mention. By the way, I'm afraid initial inverted punctuation marks (¡¿) are only used in Castilian Spanish and other languages of Spain, such as Galician or Asturian, due to the Castilian influence (we go in the opposite direction from the rest of the world ;-): https://en.wikipedia.org/wiki/Inverted_question_and_exclamation_marks > Ihor Radchenko. [PATCH] Re: c47b535bb origin/main org-element: Remove > dependency on ‘org-emphasis-regexp-components’ > Sun, 21 Nov 2021 17:28:57 +0800. > https://list.orgmode.org/87v90lzwkm.fsf@localhost I see. I believe it's a sensible decision to get rid of the dependency on org-emphasis-regexp-components. Do I understand correctly that everything related to the structure of emphasis is now the responsibility of org-element? Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 18:11 ` Tom Gillespie 2021-12-02 19:09 ` Juan Manuel Macías @ 2021-12-02 20:47 ` Denis Maier 2021-12-02 22:44 ` Samuel Wales 2021-12-03 14:53 ` Max Nikulin 2021-12-03 23:51 ` Tim Cross 3 siblings, 1 reply; 69+ messages in thread From: Denis Maier @ 2021-12-02 20:47 UTC (permalink / raw) To: Tom Gillespie, emacs-orgmode Am 02.12.2021 um 19:11 schrieb Tom Gillespie: > I don't mean to be a wet blanket, but the edge cases for > the current markup syntax are already hard enough to > implement correctly, to the point where different parts of > Org mode are inconsistent. Intra-word markup isn't viable > because there simply isn't any sane way to parse something > like *hello world*/hrm/oh no*. The other issue is that this will > degrade parsing performance because almost every > character could precede the start of a markup section. > > I recommend anyone suggesting solutions try to implement > something that can parse the markup unambiguously with > lots of nasty test cases. You will likely find that it is impossible > to consistently tokenize markup, and that you have to hand > write a whole bunch of heuristics, making Org syntax even > harder to implement correctly. > > Any solution that suggests extending how =/*~+_ can be > used gets a hard no from me. I could see teaching other > exporters how to interpret \emph{hello}world, but trying for > to have any sane behavior for something like > why *hello*world oh no a wild askterisk* > is not worth it. As I've said before, I could well live with \emph{what}ever or something similar. Denis ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 20:47 ` Org-syntax: Intra-word markup Denis Maier @ 2021-12-02 22:44 ` Samuel Wales 0 siblings, 0 replies; 69+ messages in thread From: Samuel Wales @ 2021-12-02 22:44 UTC (permalink / raw) To: Denis Maier; +Cc: Tom Gillespie, emacs-orgmode a silly question. don't we already use something kinda similar to \emph{what}ever for all backends? could we do so? On 12/2/21, Denis Maier <denismaier@mailbox.org> wrote: > Am 02.12.2021 um 19:11 schrieb Tom Gillespie: >> I don't mean to be a wet blanket, but the edge cases for >> the current markup syntax are already hard enough to >> implement correctly, to the point where different parts of >> Org mode are inconsistent. Intra-word markup isn't viable >> because there simply isn't any sane way to parse something >> like *hello world*/hrm/oh no*. The other issue is that this will >> degrade parsing performance because almost every >> character could precede the start of a markup section. >> >> I recommend anyone suggesting solutions try to implement >> something that can parse the markup unambiguously with >> lots of nasty test cases. You will likely find that it is impossible >> to consistently tokenize markup, and that you have to hand >> write a whole bunch of heuristics, making Org syntax even >> harder to implement correctly. >> >> Any solution that suggests extending how =/*~+_ can be >> used gets a hard no from me. I could see teaching other >> exporters how to interpret \emph{hello}world, but trying for >> to have any sane behavior for something like >> why *hello*world oh no a wild askterisk* >> is not worth it. > > As I've said before, I could well live with \emph{what}ever or something > similar. > > Denis > > -- The Kafka Pandemic Please learn what misopathy is. https://thekafkapandemic.blogspot.com/2013/10/why-some-diseases-are-wronged.html ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 18:11 ` Tom Gillespie 2021-12-02 19:09 ` Juan Manuel Macías 2021-12-02 20:47 ` Org-syntax: Intra-word markup Denis Maier @ 2021-12-03 14:53 ` Max Nikulin 2021-12-03 23:51 ` Tim Cross 3 siblings, 0 replies; 69+ messages in thread From: Max Nikulin @ 2021-12-03 14:53 UTC (permalink / raw) To: emacs-orgmode On 03/12/2021 01:11, Tom Gillespie wrote: > > I recommend anyone suggesting solutions try to implement > something that can parse the markup unambiguously with > lots of nasty test cases. You will likely find that it is impossible > to consistently tokenize markup, and that you have to hand > write a whole bunch of heuristics, making Org syntax even > harder to implement correctly. Tom, I see and share your point, however sometimes more specific and convincing arguments are necessary. Why does unconstrained markup ("//") not cause problems in asciidoc? Maybe it does but they are not immediately obvious. I don't know since I have never used asciidoc. Maybe the parser behaves in a different way than org-element. Maybe plain text links are not allowed at all. Almost any URL contains such a pair of markers: https://orgmode.org/, so it should be addressed somehow. Examples of corner cases that are used for tests should be more visible to users, otherwise it is hard to use such samples in discussions. They should be annotated (arbitrary examples from recent discussions):

- input: [[https://first/-/url/][pre]] text [[https://second-url/?][post]]
  parsed: ( (link :target "https://first/-/url/" :description "pre") " text " (link :target "https://second-url/?" :description "post"))
  comment: "Regexp-based syntax highlighting falsely finds italic text because URLs have slashes similar to the start and end markers of italics"
- input: A _b =c_ d= e_ f
  parsed: ( "A " (underline "b =c") " d= e_ f")
  comment: "Users of markdown may falsely expect that c_ is protected by verbatim markers and underlined text is ended at e_"

^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 18:11 ` Tom Gillespie ` (2 preceding siblings ...) 2021-12-03 14:53 ` Max Nikulin @ 2021-12-03 23:51 ` Tim Cross 2021-12-04 15:01 ` Max Nikulin 2021-12-05 23:37 ` Russell Adams 3 siblings, 2 replies; 69+ messages in thread From: Tim Cross @ 2021-12-03 23:51 UTC (permalink / raw) To: emacs-orgmode Tom Gillespie <tgbugs@gmail.com> writes: > I don't mean to be a wet blanket, but the edge cases for > the current markup syntax are already hard enough to > implement correctly, to the point where different parts of > Org mode are inconsistent. Intra-word markup isn't viable > because there simply isn't any sane way to parse something > like *hello world*/hrm/oh no*. The other issue is that this will > degrade parsing performance because almost every > character could precede the start of a markup section. > > I recommend anyone suggesting solutions try to implement > something that can parse the markup unambiguously with > lots of nasty test cases. You will likely find that it is impossible > to consistently tokenize markup, and that you have to hand > write a whole bunch of heuristics, making Org syntax even > harder to implement correctly. > > Any solution that suggests extending how =/*~+_ can be > used gets a hard no from me. I could see teaching other > exporters how to interpret \emph{hello}world, but trying for > to have any sane behavior for something like > why *hello*world oh no a wild askterisk* > is not worth it. > +infinity! Please, please can we stop trying to satisfy every edge case or extend the markup to satisfy every possible scenario. Org's big strength is in its simplicity. This comes at a price - limitations in what can be done. If those limitations are unacceptable, then use a richer markup format like Latex, XML, HTML etc. The point about back end exporter support is very relevant. The 'richer' the markup, the harder it is to get a consistent mapping for back end exporters. 
Things quickly become more complex and difficult to maintain. In 18 years, I've seen requests for inner word markup less than 4 times. This is not a feature we should even be considering adding to the markup syntax. Org provides a light weight markup, not a fully flexible rich markup designed to meet any need. It makes the easy stuff simple. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-03 23:51 ` Tim Cross @ 2021-12-04 15:01 ` Max Nikulin 2021-12-05 23:34 ` Russell Adams 2021-12-05 23:37 ` Russell Adams 1 sibling, 1 reply; 69+ messages in thread From: Max Nikulin @ 2021-12-04 15:01 UTC (permalink / raw) To: emacs-orgmode On 04/12/2021 06:51, Tim Cross wrote: > > Please, please can we stop trying to satisfy every edge case or extend > the markup to satisfy every possible scenario. > > Org's big strength is in its simplicity. This comes at a price - > limitations in what can be done. If those limitations are unacceptable, > then use a richer markup format like Latex, XML, HTML etc. It is ridiculous to throw away a nice tool and start to struggle with another bunch of problems when a small missing feature is really required. > The point about back end exporter support is very relevant. Notice that this particular feature does not require extending the underlying intermediate representation. There may be some subtle points but generally export backends are ready for intra-word markup. > In 18 years, I've seen requests for inner word markup less than 4 times. > this is not a feature we should even be considering adding to the markup > syntax. > > Org provides a light weight markup, not a fully flexible rich markup > designed to meet any need. It makes the easy stuff simple. Different users wish to have different minor features. It would be great to have a way to include a fragment with more verbose markup that allows expressing special needs unsupported by lightweight markup. I am discussing a more general solution, not a syntax extension specifically for intra-word markup. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-04 15:01 ` Max Nikulin @ 2021-12-05 23:34 ` Russell Adams 0 siblings, 0 replies; 69+ messages in thread From: Russell Adams @ 2021-12-05 23:34 UTC (permalink / raw) To: emacs-orgmode On Sat, Dec 04, 2021 at 10:01:15PM +0700, Max Nikulin wrote: > On 04/12/2021 06:51, Tim Cross wrote: > > > > Please, please can we stop trying to satisfy every edge case or extend > > the markup to satisfy every possible scenario. > > It is ridiculous to throw away a nice tool and start to struggle with > another bunch of problems when a small missed feature is really required. I think this is a problem of expectations. I don't expect Org to export perfect documents in every language. I expect Org to make a simple subset of features available consistently. With HTML or Latex you can create those words, and you can insert that code into your Org document. Why does the Org syntax need to be further extended to support this? Part of the reason Org is a nice tool is that it is simple, and we should be cautious trying to make it any more complex. ------------------------------------------------------------------ Russell Adams RLAdams@AdamsInfoServ.com PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/ Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3 ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-03 23:51 ` Tim Cross 2021-12-04 15:01 ` Max Nikulin @ 2021-12-05 23:37 ` Russell Adams 2021-12-06 1:39 ` Samuel Wales 1 sibling, 1 reply; 69+ messages in thread From: Russell Adams @ 2021-12-05 23:37 UTC (permalink / raw) To: emacs-orgmode On Sat, Dec 04, 2021 at 10:51:47AM +1100, Tim Cross wrote: > > Tom Gillespie <tgbugs@gmail.com> writes: > > > I don't mean to be a wet blanket... I'd like to be a wet blanket. > +infinity! > > Please, please can we stop trying to satisfy every edge case or extend > the markup to satisfy every possible scenario. +infinity^2 I've often thought Org needs to hit the brakes and stop adding features, or cut out features that have a high support/maintenance cost. We need to respect our maintainers' time. ------------------------------------------------------------------ Russell Adams RLAdams@AdamsInfoServ.com PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/ Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3 ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-05 23:37 ` Russell Adams @ 2021-12-06 1:39 ` Samuel Wales 0 siblings, 0 replies; 69+ messages in thread From: Samuel Wales @ 2021-12-06 1:39 UTC (permalink / raw) To: emacs-orgmode i think i can't add much useful to these threads, i agree with the simplicity, but, a nuance, want for org to have had a bit more consistency growing up. e.g. quoting/escaping, demarcation, and applicability of features in different contexts. sort of a "mentally factored user interface" where the user's expectation is pretty straightforwardly met. e.g. works here so should also work there. or, there is only one rule for doing this. that kind of thing. orthogonality also. few exceptions. it is understandable in context that inconsistencies exist, and that might apply to various maintenance-over-heavy things users want. if we are to remove features as suggested below, then i suggest, where possible, consistency be a desideratum for final result. On 12/5/21, Russell Adams <RLAdams@adamsinfoserv.com> wrote: > On Sat, Dec 04, 2021 at 10:51:47AM +1100, Tim Cross wrote: >> >> Tom Gillespie <tgbugs@gmail.com> writes: >> >> > I don't mean to be a wet blanket... > > I'd like to be a wet blanket. > >> +infinity! >> >> Please, please can we stop trying to satisfy every edge case or extend >> the markup to satisfy every possible scenario. > > +infinity^2 > > I've often thought Org needs to hit the brakes and stop adding > features, or cut out features that have a high support/maintenance > cost. We need to respect our maintainers' time. > > ------------------------------------------------------------------ > Russell Adams RLAdams@AdamsInfoServ.com > > PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/ > > Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3 > > -- The Kafka Pandemic A blog about science, health, human rights, and misopathy: https://thekafkapandemic.blogspot.com ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 13:07 ` Ihor Radchenko 2021-12-02 15:51 ` Max Nikulin @ 2021-12-02 19:03 ` Nicolas Goaziou 2021-12-02 19:34 ` Juan Manuel Macías 2021-12-03 14:24 ` Max Nikulin 1 sibling, 2 replies; 69+ messages in thread From: Nicolas Goaziou @ 2021-12-02 19:03 UTC (permalink / raw) To: Ihor Radchenko; +Cc: Timothy, emacs-orgmode, Denis Maier Hello, Ihor Radchenko <yantar92@gmail.com> writes: > Denis Maier <denismaier@mailbox.org> writes: > >> As for suggestions: If just using /intra/word creates ambiguities, what >> about the asciidoc solution? So //intra//word? > > I do like this idea. > > Though I would also like to hear Nicolas' opinion. I sympathize with the idea of intra-word emphasis, but the syntax above is going to cause some ambiguous situations. I do think the marker + zero-width space is one way to go. We could, as an improvement, consider zero-width spaces around emphasis markers to be part of the markup, and remove them during export. Another solution is to introduce a less subtle, but less ambiguity-prone, syntax, e.g., /{bold}/markup or /|bold|/markup where /{ }/ or /| |/ become "extended" markers. I find the zero-width space solution much more elegant. It also doesn't change current syntax, which is a big advantage. Regards, -- Nicolas Goaziou ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 19:03 ` Nicolas Goaziou @ 2021-12-02 19:34 ` Juan Manuel Macías 2021-12-02 23:05 ` Nicolas Goaziou 2021-12-03 14:24 ` Max Nikulin 1 sibling, 1 reply; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-02 19:34 UTC (permalink / raw) To: Nicolas Goaziou; +Cc: Denis Maier, emacs-orgmode, Ihor Radchenko, Timothy Hi Nicolas and all, Nicolas Goaziou writes: > I find zero-with spaces solution much more elegant. It also doesn't > change current syntax, which is a big advantage. I agree that zero width spaces work fine as a solution, but I think they should not be understood as part of the syntax but as a punctual (temporal?) remedy to certain scenarios. As mentioned before, in LaTeX zero width spaces can produce unexpected effects and modify the final form of the text (at least in luatex). I also don't know if it would be useful to remove all zero width spaces in the export process, because in some cases the user may want to keep them, as I think Maxim commented in a previous message. As for the solution of using complementary marks ("//...//", etc.), I think it would undermine consistency, as those marks would only be to fix exceptions. It's a tricky subject... Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 19:34 ` Juan Manuel Macías @ 2021-12-02 23:05 ` Nicolas Goaziou 2021-12-02 23:24 ` Juan Manuel Macías 0 siblings, 1 reply; 69+ messages in thread From: Nicolas Goaziou @ 2021-12-02 23:05 UTC (permalink / raw) To: Juan Manuel Macías Cc: Timothy, emacs-orgmode, Ihor Radchenko, Denis Maier Hello, Juan Manuel Macías <maciaschain@posteo.net> writes: > I agree that zero width spaces work fine as a solution, but I think they > should not be understood as part of the syntax but as a punctual > (temporal?) remedy to certain scenarios. As mentioned before, in LaTeX > zero width spaces can produce unexpected effects and modify the final > form of the text (at least in luatex). I also don't know if it would be > useful to remove all zero width spaces in the export process, because in > some cases the user may want to keep them, as I think Maxim commented in > a previous message. We may be misunderstanding each other. I'm suggesting to remove only the zero-width spaces contiguous to emphasis markers. Therefore the LaTeX process would not see them. Other zero-width spaces, e.g., inserted by the user, are kept. AFAICT, the last two points you mention are not relevant to my proposal. Besides, they are already part of the syntax, in some way. So that ship has sailed long ago. Regards, -- Nicolas Goaziou ^ permalink raw reply [flat|nested] 69+ messages in thread
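The rule described above (drop only the zero-width spaces that touch an emphasis marker, keep all others) can be sketched roughly as follows. This is a Python illustration under loud assumptions, not Org's exporter: the `ITALIC` pattern is a deliberately simplified stand-in for Org's real emphasis rules, `export_italic_html` is a hypothetical name, and only `/italic/` with HTML output is handled.

```python
import re

ZWSP = "\u200b"  # ZERO WIDTH SPACE

# Hypothetical, simplified pattern: "/.../" italics whose body neither starts
# nor ends with whitespace or a slash, optionally wrapped in zero-width spaces.
# Org's actual emphasis rules are stricter than this.
ITALIC = re.compile(f"{ZWSP}?/([^/\\s](?:[^/]*[^/\\s])?)/{ZWSP}?")


def export_italic_html(text):
    """Turn /italic/ spans into <i>...</i>, dropping adjacent zero-width spaces."""
    return ITALIC.sub(r"<i>\1</i>", text)


# The two zero-width spaces around /word/ disappear with the markers;
# the one between "a" and "b" was put there by the user and survives.
sample = f"intra{ZWSP}/word/{ZWSP}markup and a{ZWSP}b"
print(repr(export_italic_html(sample)))
```

Note that this naive pattern would also treat the slashes of a plain URL as italics, which is one of the false positives raised elsewhere in the thread; a real implementation would presumably work on already-parsed emphasis objects rather than on raw text.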
* Re: Org-syntax: Intra-word markup 2021-12-02 23:05 ` Nicolas Goaziou @ 2021-12-02 23:24 ` Juan Manuel Macías 0 siblings, 0 replies; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-02 23:24 UTC (permalink / raw) To: Nicolas Goaziou; +Cc: orgmode Nicolas Goaziou writes: > I'm suggesting to remove zero-width spaces contiguous to emphasis > markers only. Therefore LaTeX process would npot see them. Other zero > width spaces, e.g., inserted by user, are kept. AFAICT, the two last > points you mention are not relevant with my proposal. > > Besides, they already part of the syntax, in some way. So that ship has > sailed long ago. I understand that it is too late to change certain things, but that is not an impediment for me to continue to think that using the character U+200B as a part (at least /de facto/) of the syntax is still shocking and weird. On the other hand, what was expected in Org would have been to have the emphasis marks and at the same time have a universal escape character for those emphasis marks. In the same way as I can write in markdown: *foo* AND \*foo\*. In Org we have the emphasis marks but not the escape character. That was probably the cause of many issues that are being discussed here. But that means also entering the realm of assumptions. Still, I wanted to leave an opinion on this question in particular. Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-02 19:03 ` Nicolas Goaziou 2021-12-02 19:34 ` Juan Manuel Macías @ 2021-12-03 14:24 ` Max Nikulin 2021-12-03 15:01 ` Juan Manuel Macías 2021-12-04 15:57 ` Denis Maier 1 sibling, 2 replies; 69+ messages in thread From: Max Nikulin @ 2021-12-03 14:24 UTC (permalink / raw) To: emacs-orgmode On 03/12/2021 02:03, Nicolas Goaziou wrote: >> Denis Maier writes: >> >>> As for suggestions: If just using /intra/word creates ambiguities, what >>> about the asciidoc solution? So //intra//word? > > I sympathize to the idea of intra-word emphasis, but the syntax above is > going to cause some ambiguous situations. I suppose some more general solution is required. > I do think the marker + zero-width space is one way to go. We could, as > an improvement, consider zero-width spaces around emphasis markers to be > part of the markup, and replace them along during export. Handling only zero-width spaces adjacent to emphasis markers is a better idea than replacing every zero-width space. However I agree with Juan Manuel that white space characters, especially completely invisible ones (I am not Eli who sees such special characters by moving cursor through them), should not be overloaded. From my point of view, it is acceptable to use zero-width spaces as a workaround but they should not become an official part of Org syntax. > Another solution is to introduce a less-subtle, but less prone to > ambiguity, syntax, e.g., > > /{bold}/markup or /|bold|/markup > > where /{ }/ or /| |/ become "extended" markers. More explicit markup leaves less room for ambiguities, and I like the idea for this reason. On the other hand it diverges from the principle of lightweight markup. Almost the only special character in TeX is "\"; HTML has three, "&<>", with simple escape rules. Org uses many special characters to avoid verbosity and requires some tricks to escape them. Markers like "\{" make Org more verbose but do not make it more strict; a lot of things still rely on heuristics. 
I have an idea of what can be done when some special markup is required that does not fit into the current syntax. Unfortunately some new constructs should be introduced anyway: inline objects and multiline elements that represent a simplified result of parsed Org structures: ((italic "intra") "word") wrapped with some markup. It should satisfy any special needs (and should even allow creating invalid, otherwise impossible constructs). Maybe the idea of combining lightweight markup and low-level blocks better suits some other project with a more expressive internal representation. In Org it may become the most hated feature. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-03 14:24 ` Max Nikulin @ 2021-12-03 15:01 ` Juan Manuel Macías 2021-12-04 15:57 ` Denis Maier 1 sibling, 0 replies; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-03 15:01 UTC (permalink / raw) To: Max Nikulin; +Cc: orgmode Hi Maxim, Max Nikulin writes: > More explicit markup leaves less room for ambiguities, and I like the > idea due to this reason. On the other hand it diverges from principle > of lightweight markup. The almost only special character in TeX is > "\", HTML has three ones "&<>" with simple escape rules. Org uses many > special characters to avoid verbosity and requires some tricks to > escape them. Markers like "\{" make Org more verbose but do not make > it more strict, a lot of things still rely on heuristics. Excellent explanation. Thanks for the clarification. > I have an idea what can be done when some special markup is required > that is not fit into current syntax. Unfortunately some new constructs > should be introduced anyway: inline objects and multiline elements > that represent simplified result of parsed Org structures: > > ((italic "intra") "word") > > wrapped with some markup. It should satisfy any special needs (and > even should allow to create invalid impossible constructs). Maybe idea > of combination of lightweight markup and low-level blocks better suits > for some other project with more expressive internal representation. > In Org it may become the most hated feature. I really would like a solution in this direction. In LaTeX there is a command called \protect (which has nothing to do with this topic and is used for other things, but I like the 'protection' concept); we could perhaps think of a type of mark to protect the 'usual' marks when syntax consistency is compromised in some way by the context. 
Maybe something like enclosing the normal marks between two double single quotes ''...'' ---or a single set of single quotes before the leading marker--- as I proposed in another thread: #+begin_example ''*protected emphasis*'' #+end_example Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-03 14:24 ` Max Nikulin 2021-12-03 15:01 ` Juan Manuel Macías @ 2021-12-04 15:57 ` Denis Maier 2021-12-04 17:53 ` Tom Gillespie 1 sibling, 1 reply; 69+ messages in thread From: Denis Maier @ 2021-12-04 15:57 UTC (permalink / raw) To: Max Nikulin, emacs-orgmode Am 03.12.2021 um 15:24 schrieb Max Nikulin: > On 03/12/2021 02:03, Nicolas Goaziou wrote: >>> Denis Maier writes: >>> >>>> As for suggestions: If just using /intra/word creates ambiguities, what >>>> about the asciidoc solution? So //intra//word? >> >> I sympathize to the idea of intra-word emphasis, but the syntax above is >> going to cause some ambiguous situations. > > I suppose, some more general solution is required. > >> I do think the marker + zero-width space is one way to go. We could, as >> an improvement, consider zero-width spaces around emphasis markers to be >> part of the markup, and replace them along during export. > > Zero-space characters adjacent to emphasis markers is a better idea than > replacing any zero space. However I agree with Juan Manuel that white > space characters, especially completely invisible (I am not Eli who sees > such special characters by moving cursor through them) should not be > overloaded. From my point of view, it is acceptable to use zero width > spaces as a workaround but they should not become official part of Org > syntax. > >> Another solution is to introduce a less-subtle, but less prone to >> ambiguity, syntax, e.g., >> >> /{bold}/markup or /|bold|/markup >> >> where /{ }/ or /| |/ become "extended" markers. > > More explicit markup leaves less room for ambiguities, and I like the > idea due to this reason. On the other hand it diverges from principle of > lightweight markup. The almost only special character in TeX is "\", > HTML has three ones "&<>" with simple escape rules. Org uses many > special characters to avoid verbosity and requires some tricks to escape > them. 
Markers like "\{" make Org more verbose but do not make it more > strict, a lot of things still rely on heuristics. > > I have an idea what can be done when some special markup is required > that is not fit into current syntax. Unfortunately some new constructs > should be introduced anyway: inline objects and multiline elements that > represent simplified result of parsed Org structures: > > ((italic "intra") "word") > > wrapped with some markup. It should satisfy any special needs (and even > should allow to create invalid impossible constructs). Maybe idea of > combination of lightweight markup and low-level blocks better suits for > some other project with more expressive internal representation. In Org > it may become the most hated feature. I have to admit I like this idea. That brings a lot of flexibility to accommodate even the most obscure needs, yet it makes the discussion about escape characters or new symbols much less pressing. After all, most markup languages face the same problem, i.e., special characters are limited, and beyond the usual /*_ the meaning of characters becomes much less obvious. This idea reminds me a bit of Scribble/Racket where every document is just inverted code, which makes it possible to insert arbitrary Racket code in your prose... Denis > > > ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-04 15:57 ` Denis Maier @ 2021-12-04 17:53 ` Tom Gillespie 2021-12-04 18:37 ` John Kitchin ` (2 more replies) 0 siblings, 3 replies; 69+ messages in thread From: Tom Gillespie @ 2021-12-04 17:53 UTC (permalink / raw) To: emacs-orgmode Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, Denis Maier Hi all, After a bunch of rambling (see below if interested), I think I have a solution that should work for everyone. The key realization is that what we really want is the ability to have a "parse me separately" type of syntax. This meets the intra-word syntax needs and might meet some other needs as well. The solution is to make @@org:...@@ a "parse me separately" block! It nearly works that way already too! To minimize typing we could have @@:...@@, the empty type, default to org. This seems like a winner to me. The syntax for it already exists and won't conflict. It requires relatively minimal additional typing, the implication is clear, and there are other places where such behavior could be useful. The syntax would look like this: @@org:/hello/@@world @@:/hello/@@world You can also do things like #+begin_src org I want a number in this number@@org:src_elisp{(+ 1 2)}@@word! #+end_src Which would render to #+begin_src org I want a number in this number3word! #+end_src Thoughts? Best! Tom --------------- rambling below ------------- > This idea reminds me a bit of Scribble/Racket where every document is > just inverted code, which makes it possible to insert arbitrary Racket > code in your prose... I will say, despite some of my comments elsewhere, that I think exploring certain features of Scribble syntax for use in Org mode would simplify certain parts of the syntax immensely. For example various inline blocks are an absolute pain to parse because they allow nested delimiters /if they are matched/. 
The implementation of the /if they are matched/ clause is currently a nasty hack which generates a regular expression that can only actually handle nesting to depth 3. Actually implementing the recursive grammar adds a lot of complexity to the syntax and is hard to get right. It would be vastly simpler to use Scribble's |<{hello }} world}>| style syntax and always terminate at the first matching delimiter. I'm sure that this would break some Org files, but it would make dealing with latex fragments and inline source blocks and inline footnotes SO much simpler. Matching an arbitrary number of angle brackets does add some complexity, but it is tiny compared to the complexity of enforcing matched parens and their failure cases, especially because many of the places where nesting is required probably only see use of the nesting feature in a tiny fraction of all cases. One other reason why this is attractive is that all the instances where nested delimiters can appear on a line are preceded by some non-whitespace character. This means that using the pipe syntax does not conflict with table syntax! Now the question comes. If we could implement this for delimiters, could we also implement something similar for markup? The issue with the proposed markup outside delimiter inside approach is that it will change existing behavior for files that want the delimiters to be included in the markup, i.e. /{oops}/ becoming /oops/ is bad. A second issue is that putting the delimiter inside the markup cannot work for verbatim and code: ={oops}= is ={oops}= no matter what. Therefore the solution is not uniform across all types of markup. We need another solution that works for all types of markup. What if we put the "start arbitrary markup" char outside the markup? Say something like |/ital/|icks? 
Or what if we went whole hog and used |{/ital/}|ics and made the |{...}| syntax trigger a generalized feature where the contents of the |{...}| block are parsed by themselves and can abut any other text? This would be generally useful in a variety of situations beyond just intra-word markup. What are the issues with this approach? The first issue is that there is a conflict with table syntax if we were to use the pipe character because markup can appear at the start of a line. The second issue is that it might be confusing for users if |{}| also worked like {} when in the context of latex elements or inline src blocks, or maybe that is ok because |{}| never renders as text. Hrm. Ok. Second issue resolved, but what to do about the first? If we want generalized "parse this by itself" syntax so that we can write hello|{/world/}|ok, then we need a solution that can appear at the start of a line. So we can't use pipe because that is always a table line even if a zero width space is put before it ;). What other options do we have? How about #+|{/hello/}|world for the start of a line? As long as there is no trailing colon it isn't a keyword, so it could work ... except that if someone reflows the text and it is no longer at the start of a line then the syntax breaks. That is to say using #+| at the start of a line is not uniform, so we can't take that approach. What other chars do we have at our disposal? Hrm. How about @@? Could we use that? What happens if we use @@org:/hello/@@world? Or maybe if we want to minimize the number of chars we could do @@:/hello/@@world and have the empty prefix in @@ blocks mean org? ^ permalink raw reply [flat|nested] 69+ messages in thread
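To make the "parse me separately" idea a little more concrete, here is a rough Python sketch of just the splitting step. It is one possible reading of the proposal, not an existing Org feature: the names `SNIPPET` and `split_fragments` are invented for illustration, and the fragments are only isolated, not actually parsed as Org.

```python
import re

# Assumed reading of the proposal: "@@org:...@@" or "@@:...@@" marks a
# fragment to be parsed on its own. Snippets for other backends, such as
# "@@html:...@@", do not match and are left in the surrounding text.
SNIPPET = re.compile(r"@@(?:org)?:(.*?)@@")


def split_fragments(text):
    """Return (kind, string) pairs: 'text' for plain runs, 'org' for fragments."""
    parts = []
    pos = 0
    for match in SNIPPET.finditer(text):
        if match.start() > pos:
            parts.append(("text", text[pos:match.start()]))
        parts.append(("org", match.group(1)))
        pos = match.end()
    if pos < len(text):
        parts.append(("text", text[pos:]))
    return parts


print(split_fragments("@@:/hello/@@world"))
print(split_fragments("number@@org:src_elisp{(+ 1 2)}@@word!"))
```

Each 'org' fragment would then be handed to the normal object parser as if it stood alone, so the text that abuts it never takes part in the decision about where the markup starts or ends.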
* Re: Org-syntax: Intra-word markup 2021-12-04 17:53 ` Tom Gillespie @ 2021-12-04 18:37 ` John Kitchin 2021-12-04 21:16 ` Juan Manuel Macías 2021-12-06 10:57 ` Raw Org AST snippets for "impossible" markup Max Nikulin 2021-12-04 19:04 ` Org-syntax: Intra-word markup Timothy 2021-12-06 11:01 ` Denis Maier 2 siblings, 2 replies; 69+ messages in thread From: John Kitchin @ 2021-12-04 18:37 UTC (permalink / raw) To: Tom Gillespie Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, emacs-orgmode, Denis Maier [-- Attachment #1: Type: text/plain, Size: 7120 bytes --] Along these lines (and combining the s-exp suggestion from Max) , you can achieve something like this with links. This is lightly tested, and I am not thrilled with the eval for exporting, but I couldn't get a macro to work on the export function to avoid it, and this is just a proof of concept idea. This might only be suitable for individual solutions, since you have to define this markup yourself. #+BEGIN_SRC emacs-lisp :results silent (defun italic (s) (pcase backend ;; lexical ('latex (format "{\\textit{%s}}" s)) ('html (format "<i>%s</i>" s)) (_ s))) (defun @@-export (path desc backend) (eval `(concat ,@(read path)))) (org-link-set-parameters "@@" :export #'@@-export) #+END_SRC In org, it would look like Here is a [[@@:((italic "part") "ial")]] markup. And in exports this is what this implementation does. #+BEGIN_SRC emacs-lisp (org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]] markup." 'latex t) #+END_SRC #+RESULTS: : Here is a {\textit{part}}ial markup. #+BEGIN_SRC emacs-lisp (org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]] markup." 'html t) #+END_SRC #+RESULTS: : <p> : Here is a <i>part</i>ial markup.</p> #+BEGIN_SRC emacs-lisp (org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]] markup." 'ascii t) #+END_SRC #+RESULTS: : Here is a partial markup. 
Of course, you are free to do what you want with the path, including parse it yourself to generate the output, and since it is a link, you could do all kinds of things to make it look the way you want with faces, overlays, etc. John ----------------------------------- Professor John Kitchin (he/him/his) Doherty Hall A207F Department of Chemical Engineering Carnegie Mellon University Pittsburgh, PA 15213 412-268-7803 @johnkitchin http://kitchingroup.cheme.cmu.edu On Sat, Dec 4, 2021 at 12:54 PM Tom Gillespie <tgbugs@gmail.com> wrote: > Hi all, > After a bunch of rambling (see below if interested), I think I have > a solution that should work for everyone. The key realization is that > what we really want is the ability to have a "parse me separately" > type of syntax. This meets the intra-word syntax needs and might > meet some other needs as well. > > The solution is to make @@org:...@@ "parse me separately" > block! It nearly works that way already too! To minimize typing > we could have @@:...@@ the empty type default to org. > > This seems like a winner to me. The syntax for it already exists > and won't conflict. It requires relatively minimal additional typing > the implication is clear, and there are other places where such > behavior could be useful. > > This syntax seems like a winner to me > @@org:/hello/@@world > @@:/hello/@@world > > You can also do things like > #+begin_src org > I want a number in this number@@org:src_elisp{(+ 1 2)}@@word! > #+end_src > > Which would render to > #+begin_src org > I want a number in this number3word! > #+end_src > > Thoughts? > > Best! > Tom > > --------------- rambling below ------------- > > > > This idea reminds me a bit of Scribble/Racket where every document is > > just inverted code, which makes it possible to insert arbitrary Racket > > code in your prose... 
> > I will say, despite some of my comments elsewhere, that I think > exploring certain features of Scribble syntax for use in Org mode > would simplify certain parts of the syntax immensely. > > For example > various inline blocks are an absolute pain to parse because they > allow nested delimiters /if they are matched/. The implementation > of the /if they are matched/ clause is currently a nasty hack which > generates a regular expression that can only actually handle nesting > to depth 3. Actually implementing the recursive grammar add a lot > of complexity to the syntax and is hard to get right. > > It would be vastly simpler to use Scribble's |<{hello }} world}>| > style syntax and always terminate at the first matching delimiter. > I'm sure that this would break some Org files, but it would make > dealing with latex fragments and inline source blocks and inline > footnotes SO much simpler. Matching an arbitrary number of > angle brackets does add some complexity, but it is tiny compared > to the complexity of enforcing matched parens and their failure cases > especially because many of the places where nesting is required > probably only see use of the nesting feature in a tiny fraction of > all cases. > > One other reason why this is attractive is that all the instances > where nested delimiters can appear on a line are preceded by > some non-whitespace character. This means that using the > pipe syntax does not conflict with table syntax! > > Now the question comes. If we could implement this for > delimiters, could we also implement something similar > for markup? The issue with the proposed markup outside > delimiter inside approach is that it will change existing > behavior for files that want the delimiters to be included > in the markup, i.e. /{oops}/ becoming /oops/ is bad. A > second issue is that putting the delimiter inside the markup > cannot work for verbatim and code ={oops}= is ={oops}= no > matter what. 
Therefore the solution is not uniform across all > types of markup. We need another solution that works for > all types of markup. > > What if we put the "start arbitrary markup" char outside > the markup? Say something like |/ital/|icks? Or what if > we went whole hog and used |{/ital/}|ics and made the > |{...}| syntax trigger a generalized feature where the > contents of the |{...}| block are parsed by themselves > and can abutt any other text? This would be generally > useful in a variety of situations beyond just intra-word > markup. > > What are the issues with this approach? The first issue > is that there is a conflict with table syntax if we were to > use the pipe character because markup can appear at > the start of a line. The second issue is that it might be > confusing for users if |{}| also worked like {} when in the > context of latex elements or inline src blocks, or maybe > that is ok because |{}| never renders as text. Hrm. Ok. > Second issue resolved, but what to do about the first? > > If we want generalized "parse this by itself" syntax so > that we can write hello|{/world/}|ok, then we need a > solution that can appear at the start of a line. So we > can't use pipe because that is always a table line even > if a zero width space is put before it ;). What other > options do we have? How about #+|{/hello/}|world for > the start of a line? As long as there is no trailing colon > it isn't a keyword, so it could work ... except that if > someone reflows the text and it is no longer a the > start of a line then the syntax breaks. That is to say > using #+| at the start of a line is not uniform, so we > can't take that approach. > > What other chars to we have at our disposal? Hrm. > How about @@? Could we use that? What happens > if we use @@org:/hello/@@world? Or maybe if we > want to minimize the number of chars we could do > @@:/hello/@@world and have the empty prefix in > @@ blocks mean org? 
> > [-- Attachment #2: Type: text/html, Size: 8579 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-04 18:37 ` John Kitchin @ 2021-12-04 21:16 ` Juan Manuel Macías 2021-12-06 10:57 ` Raw Org AST snippets for "impossible" markup Max Nikulin 1 sibling, 0 replies; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-04 21:16 UTC (permalink / raw) To: John Kitchin; +Cc: orgmode Hi John, John Kitchin writes: > Along these lines (and combining the s-exp suggestion from Max) , you > can achieve something like this with links. I like this idea of merging Maxim's proposal with the power of links. In any case, this and other workarounds provided here make it clear that in Org we do not lack good and useful resources. I usually use macros (taking advantage of the fact that macros are expanded early). For example (only in this case with the LaTeX backend): #+MACRO: emph (eval (when (org-export-derived-backend-p org-export-current-backend 'latex) (concat "@@latex:\\emph{@@" $1 "@@latex:}@@"))) Defined this way, the macro also lets me nest emphasis both ways: #+begin_src example {{{emph(lorem *ipsum* /dolor/ {{{emph(sit)}}} amet)}}} #+end_src ==> \emph{lorem \textbf{ipsum} \emph{dolor} \emph{sit} amet} Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Raw Org AST snippets for "impossible" markup 2021-12-04 18:37 ` John Kitchin 2021-12-04 21:16 ` Juan Manuel Macías @ 2021-12-06 10:57 ` Max Nikulin 2021-12-06 15:45 ` Juan Manuel Macías 1 sibling, 1 reply; 69+ messages in thread From: Max Nikulin @ 2021-12-06 10:57 UTC (permalink / raw) To: emacs-orgmode On 05/12/2021 01:37, John Kitchin wrote: > Along these lines (and combining the s-exp suggestion from Max) , you > can achieve something like this with links. > > #+BEGIN_SRC emacs-lisp :results silent > (defun italic (s) > (pcase backend ;; lexical > ('latex (format "{\\textit{%s}}" s)) > ('html (format "<i>%s</i>" s)) > (_ s))) > > (defun @@-export (path desc backend) > (eval `(concat ,@(read path)))) > > (org-link-set-parameters > "@@" > :export #'@@-export) > #+END_SRC John, thank you for reminding me of Juan Manuel's idea that everything missing in Org may be polyfilled (ab)using links. It is enough for proof of concept, special markers may be introduced later. After some time spent exercising in monkey-typing, I have got some code that illustrates my idea. So the goal is to mitigate demand to extend current syntax. While simple cases should be easy, special cases should not be impossible. - Raw AST snippets should be processed without ~eval~ to give other tools such as =pandoc= a chance to support the feature. If you desperately need ~eval~ then you can use source blocks. - The idea is to use existing backends by passing structures similar to ones generated by ~org-element~ parser. - I would prefer to avoid "@@" for link prefix since such sequences are already a part of Org syntax. 
In the following example the export snippet is prematurely terminated by such a link: #+begin_src elisp :results pp (org-element-parse-secondary-string "@@latex:[[@@:(italics \"i\")]]@@" (org-element-restriction 'paragraph)) #+end_src #+RESULTS: : ((export-snippet : (:back-end "latex" :value "[[" :begin 1 :end 13 :post-blank 0 :parent #0)) : #(":(italics \"i\")]]@@" 0 18 : (:parent #0))) Let's take some link prefix that makes it clear that the proposal is a draft and a sane variant will be chosen later when agreement concerning details of such feature is achieved. Till that moment it is named "orgia". #+begin_src elisp :results silent (defun orgia-export (path desc backend) (if (not (eq ?\( (aref path 0))) path (let ((tree (read path)) (info (org-export-get-environment backend nil nil))) (org-no-properties (org-export-data-with-backend tree backend info))))) (org-link-set-parameters "orgia" :export #'orgia-export) #+end_src Either [[orgia:("inter" (bold () "word"))]] or <orgia:((italic () "inter") "word")> links may be used. Certainly plain text may be outside: #+begin_src elisp (org-export-string-as "A <orgia:(italic () \"inter\")>word" 'html t) #+end_src #+RESULTS: : <p> : A <i>inter</i>word</p> - Error handling is required. - Elements (blocks) should be considered as an error in object (inline) context. - The passed tree should be preprocessed to glue together strings that were split to avoid having them interpreted as terminating an outer construct or the link itself (=]]= =][= should be ="]" "]"= ="]" "["= inside bracket links). It is especially important for property values. - For convenience a =parse= element may be added to parse a string according to Org markup. - There should be a similar element (block-level markup structure). - Symbols and structures used by ~org-element~ become part of the public API, but they already are, since they are used by export backends. - ~org-cite~ will likely be a problem. ^ permalink raw reply [flat|nested] 69+ messages in thread
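The "glue split strings" preprocessing step listed in the message above can be sketched in plain Emacs Lisp. This is only an illustration under stated assumptions: nodes have the (TYPE PROPERTIES CHILD...) shape used in the examples, a list whose first item is not a symbol is treated as a sequence of nodes, and the names ~my-orgia-glue-strings~ and ~my-orgia--glue-list~ are hypothetical. Property lists are copied untouched; only adjacent string children are concatenated.

#+begin_src emacs-lisp
;; Hypothetical helpers, for illustration only; not part of Org.
(defun my-orgia--glue-list (items)
  "Return a copy of ITEMS with runs of adjacent strings concatenated.
Non-string items are processed recursively."
  (let (result)
    (dolist (item items)
      (if (and (stringp item) (stringp (car result)))
          ;; Extend the previous string instead of adding a new item.
          (setcar result (concat (car result) item))
        (push (my-orgia-glue-strings item) result)))
    (nreverse result)))

(defun my-orgia-glue-strings (tree)
  "Return a copy of TREE in which adjacent string children are merged.
TREE is either a node (TYPE PROPERTIES CHILD...) or a list of nodes
and strings."
  (cond
   ((not (consp tree)) tree)
   ;; A node: keep TYPE and PROPERTIES, glue only the children.
   ;; Note that a node without properties gains an explicit nil.
   ((and (symbolp (car tree)) (car tree))
    (append (list (car tree) (cadr tree))
            (my-orgia--glue-list (cddr tree))))
   ;; A plain sequence of nodes and strings.
   (t (my-orgia--glue-list tree))))
#+end_src

Under these assumptions a path written as =("]" "]" (bold () "x" "y"))= would be turned into =("]]" (bold nil "xy"))= before being handed to the export backend, so the author can keep =]]= out of the link text without changing the exported result.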
* Re: Raw Org AST snippets for "impossible" markup 2021-12-06 10:57 ` Raw Org AST snippets for "impossible" markup Max Nikulin @ 2021-12-06 15:45 ` Juan Manuel Macías 2021-12-06 16:56 ` Juan Manuel Macías 2021-12-08 13:09 ` Max Nikulin 0 siblings, 2 replies; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-06 15:45 UTC (permalink / raw) To: Max Nikulin; +Cc: orgmode Max Nikulin writes: > John, thank you for the reminding me of Juan Manuel's idea that > everything missed in Org may be polyfilled (ab)using links. > It is enough for proof of concept, special markers may be introduced > later. After some time spent exercising in monkey-typing, > I have got some code that illustrates my idea. > > So the goal is to mitigate demand to extend current syntax. > While simple cases should be easy, > special cases should not be impossible. > > - Raw AST snippets should be processed without ~eval~ to give > other tools such as =pandoc= a chance to support the feature. > If you desperately need ~eval~ then you can use source blocks. > - The idea is to use existing backends by passing structures > similar to ones generated by ~org-element~ parser. > - I would prefer to avoid "@@" for link prefix since such sequences > are already a part of Org syntax. In the following example > export snippet is preliminary terminated by such link: > > #+begin_src elisp :results pp > (org-element-parse-secondary-string > "@@latex:[[@@:(italics \"i\")]]@@" > (org-element-restriction 'paragraph)) > #+end_src > > > #+RESULTS: > : ((export-snippet > : (:back-end "latex" :value "[[" :begin 1 :end 13 :post-blank 0 > :parent #0)) > : #(":(italics \"i\")]]@@" 0 18 > : (:parent #0))) > > Let's take some link prefix that makes it clear that the proposal > is a draft and a sane variant will be chosen later when agreement > concerning details of such feature is achieved. Till that moment > it is named "orgia". 
> > #+begin_src elisp :results silent > (defun orgia-export (path desc backend) > (if (not (eq ?\( (aref path 0))) > path > (let ((tree (read path)) > (info (org-export-get-environment backend nil nil))) > (org-no-properties > (org-export-data-with-backend tree backend info))))) > > (org-link-set-parameters > "orgia" > :export #'orgia-export) > #+end_src > > > Either [[orgia:("inter" (bold () "word"))]] > or <orgia:((italic () "inter") "word")> > links may be used. Certainly plain text may be outside: > > #+begin_src elisp > (org-export-string-as "A <orgia:(italic () \"inter\")>word" 'html t) > #+end_src > > #+RESULTS: > : <p> > : A <i>inter</i>word</p> > > - Error handling is required. > - Elements (blocks) should be considered as an error > in object (inline) context. > - Passed tree should be preprocessed to glue strings split to > avoid interpreting them as terminating outer construct or link itself > (=]]= =][= should be ="]" "]"= ="]" "["= inside bracket links). > It is especially important for property values. > - For convenience =parse= element may be added to parse a string > accordingly to Org markup. > - There should be a similar element (block-level markup structure). > - Symbols and structures used by ~org-element~ becomes a part of > public API, but they are already are since they are used > by export backends. > - ~org-cite~ is likely will be a problem. Hi Maxim, I understand that with this method the emphases could be nested, which also seems very productive. I like it. I would suggest, however, not to use the term 'italics', since it is a 'typographic' term, but a term that is agnostic of format and typography, something like 'emphasis' or 'emph'. For example, in a format agnostic environment like Org, which is concerned only with structure, an emphasis is always an emphasis. But in a typographic environment that emphasis may or may not be in italics. 
That is why in LaTeX you can write constructions like: #+begin_src latex \emph{The Making Off of \emph{Star Wars}} #+end_src In this context 'Star Wars' would appear in upright font. Naturally, these things are only possible in LaTeX, but it's nice to keep in Org a typographic agnosticism. Anyway, I find all this very interesting as proof of concept, although in my workflow I prefer to use macros for these types of scenarios (yes, a rare case where I don't use links! :-D): #+begin_src emacs-lisp (defun my-macro-emph (arg) (cond ((org-export-derived-backend-p org-export-current-backend 'latex) (concat "@@latex:\\emph{@@" arg "@@latex:}@@")) ((org-export-derived-backend-p org-export-current-backend 'html) (concat "@@html:<em>@@" arg "@@html:</em>@@")) ((org-export-derived-backend-p org-export-current-backend 'odt) (concat "@@odt:<text:span text:style-name=\"Emphasis\">@@" arg "@@odt:</text:span>@@")))) (setq org-export-global-macros '(("emph" . "(eval (my-macro-emph $1))"))) #+end_src {{{emph(The Making Off of {{{emph(Star Wars)}}})}}} Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup 2021-12-06 15:45 ` Juan Manuel Macías @ 2021-12-06 16:56 ` Juan Manuel Macías 2021-12-08 13:09 ` Max Nikulin 1 sibling, 0 replies; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-06 16:56 UTC (permalink / raw) To: Max Nikulin; +Cc: orgmode Juan Manuel Macías writes: > I would suggest, however, not to use the term 'italics [...blah blah...]' Sorry for the noise! I think I messed myself up... Naturally, 'italic' (or 'bold') is required: (italic () \"inter\") Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup 2021-12-06 15:45 ` Juan Manuel Macías 2021-12-06 16:56 ` Juan Manuel Macías @ 2021-12-08 13:09 ` Max Nikulin 2021-12-08 23:19 ` Juan Manuel Macías 1 sibling, 1 reply; 69+ messages in thread From: Max Nikulin @ 2021-12-08 13:09 UTC (permalink / raw) To: emacs-orgmode On 06/12/2021 22:45, Juan Manuel Macías wrote: > > I understand that with this method the emphases could be nested, which > it seems also very productive. I like it. > > I would suggest, however, not to use the term 'italics', since is a > 'typographic' term, but a term that is agnostic of format and > typography, something like as 'emphasis' or 'emph'. For example, in a > format agnostic environment like Org, which is concerned only with > structure, an emphasis is always an emphasis. But in a typographic > environment that emphasis may or may not be be in italics. That is why > in LaTeX you can write constructions like: As you have guessed, it is not my choice; it is the interface of ox.el and org-element.el. However, if you strongly want to use proper terminology in markup, you may try to trade it for +your soul+ compatibility and portability issues. The following almost works: #+begin_src elisp :results silent (defun orgia-link (link-data desc info) (let* ((backend-struct (plist-get info :back-end)) (backend-name (org-export-backend-name backend-struct))) (or (org-export-custom-protocol-maybe link-data desc backend-name info) (let* ((parent (org-export-backend-parent backend-struct)) (transcoders-alist (org-export-get-all-transcoders parent)) (link-transcoder (alist-get 'link transcoders-alist))) (if link-transcoder (funcall link-transcoder link-data desc info) desc))))) (defun evilatex-emph (_emph content info) ;; I have no idea yet why newline is appended. (format "\\textit{%s}%%" content)) (org-export-define-derived-backend 'evilatex 'latex :translate-alist '((emph . evilatex-emph) (link . 
orgia-link))) #+end_src #+begin_src elisp (let ((org-export-with-broken-links 'mark)) (org-export-string-as "An [[orgia:(italic () \"ex\")]]ample of <orgia:(emph () \"inter\")>word and [[http://te.st][link]] [[unknown:prefix][desc]]!" 'evilatex t)) #+end_src #+RESULTS: : An \emph{ex}ample of \textit{inter}% : word and \href{http://te.st}{link} [BROKEN LINK: unknown:prefix]! Actually, I believe that something like the orgia-link code should be added by `org-export-define-derived-backend' if "link" is missing from translate-alist. I suspect that `org-export-get-all-transcoders' may be avoided. > (setq org-export-global-macros > '(("emph" . "(eval (my-macro-emph $1))"))) Sorry, I have not yet prepared a better variant to solve the comma-in-macro problem. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup 2021-12-08 13:09 ` Max Nikulin @ 2021-12-08 23:19 ` Juan Manuel Macías 2021-12-08 23:35 ` John Kitchin 0 siblings, 1 reply; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-08 23:19 UTC (permalink / raw) To: Max Nikulin; +Cc: orgmode Max Nikulin writes: > As you have guessed, It is not my choice, it is interface of ox.el and > org-element.el. Indeed. Sorry for my haste: it's the consequence of not reading the code carefully :-) Of course, your orgia-link-procedure could be extended to more org elements. I can't think of what kind of scenario that might fit in, but as a proof of concept I find it really stimulating. E.g: #+begin_src elisp (org-export-string-as "<orgia:(verse-block () \"Lorem\\nipsum\\ndolor\")>" 'html t) #+end_src #+RESULTS: : <p> : <p class="verse"> : Lorem<br /> : ipsum<br /> : dolor</p> : </p> #+begin_src elisp (org-export-string-as "<orgia:(quote-block (:attr_latex (\":environment foreigndisplayquote :options {greek}\")) \"Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲν Ἀρταξέρξης, νεώτερος δὲ Κῦρος·\")>" 'latex t) #+end_src #+RESULTS: : \begin{foreigndisplayquote}{greek} : Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲνἈρταξέρξης, νεώτερος δὲ Κῦρος· : \end{foreigndisplayquote} > However if you strongly want to use proper terminology in markup, you > may try to trade it for +your soul+ compatibility and portability > issues. The following almost works: Interesting, thank you. Yes, the newline added in `evilatex-emph' is strange ... I have no idea why that happens. Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup 2021-12-08 23:19 ` Juan Manuel Macías @ 2021-12-08 23:35 ` John Kitchin 2021-12-09 7:01 ` Juan Manuel Macías 0 siblings, 1 reply; 69+ messages in thread From: John Kitchin @ 2021-12-08 23:35 UTC (permalink / raw) To: Juan Manuel Macías; +Cc: Max Nikulin, orgmode [-- Attachment #1: Type: text/plain, Size: 2220 bytes --] Have you seen https://github.com/tj64/org-dp? It seems to do a lot with creating and manipulating org elements. It might either be handy or lead to some inspiration. On Wed, Dec 8, 2021 at 6:20 PM Juan Manuel Macías <maciaschain@posteo.net> wrote: > Max Nikulin writes: > > > As you have guessed, It is not my choice, it is interface of ox.el and > > org-element.el. > > Indeed. Sorry for my haste: it's the consequences of not read the code > carefully :-) > > Of course, your orgia-link-procedure could be extended to more org > elements. > I can't think of what kind of scenario that might fit in, but as a proof > of concept I find it really stimulating. E.g: > > #+begin_src elisp > (org-export-string-as "<orgia:(verse-block () > \"Lorem\\nipsum\\ndolor\")>" 'html t) > #+end_src > > #+RESULTS: > : <p> > : <p class="verse"> > : Lorem<br /> > : ipsum<br /> > : dolor</p> > : </p> > > #+begin_src elisp > (org-export-string-as "<orgia:(quote-block (:attr_latex > (\":environment foreigndisplayquote :options {greek}\")) > \"Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲν > Ἀρταξέρξης, νεώτερος δὲ Κῦρος·\")>" 'latex t) > #+end_src > > #+RESULTS: > : \begin{foreigndisplayquote}{greek} > : Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲνἈρταξέρξης, > νεώτερος δὲ Κῦρος· > : \end{foreigndisplayquote} > > > > However if you strongly want to use proper terminology in markup, you > > may try to trade it for +your soul+ compatibility and portability > > issues. The following almost works: > > Interesting, thank you. > > Yes, it is strange the new line added in `evilatex-emph' ... 
I have no > idea why that happens. > > Best regards, > > Juan Manuel > -- John ----------------------------------- Professor John Kitchin (he/him/his) Doherty Hall A207F Department of Chemical Engineering Carnegie Mellon University Pittsburgh, PA 15213 412-268-7803 @johnkitchin http://kitchingroup.cheme.cmu.edu [-- Attachment #2: Type: text/html, Size: 3121 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup 2021-12-08 23:35 ` John Kitchin @ 2021-12-09 7:01 ` Juan Manuel Macías 2021-12-09 14:56 ` Max Nikulin 0 siblings, 1 reply; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-09 7:01 UTC (permalink / raw) To: John Kitchin; +Cc: Maxim Nikulin, orgmode John Kitchin writes: > Have you seen > https://github.com/tj64/org-dp? It seems to do a lot with creating and > manipulating org elements. It might either be handy or lead to some > inspiration. Interesting package. Thanks for sharing. It gave me an idea, also borrowing part of Maxim's code, but evaluating in this case the path. To continue playing with links... The goal is to obtain a link with this structure `[[quote-lang:lang][quote]]': #+BEGIN_SRC emacs-lisp :results silent (org-link-set-parameters "quote-lang" :display 'full :export (lambda (path desc bck) (let* ((bck org-export-current-backend) (attr (list (format ":environment foreigndisplayquote :options {%s}" path))) (info (org-export-get-environment bck nil nil))) (org-no-properties (org-export-data-with-backend `(quote-block (:attr_latex ,attr) ,desc) bck info))))) #+END_SRC #+begin_src emacs-lisp (setq backends '(latex html odt)) (setq results nil) (mapc (lambda (backend) (add-to-list 'results (org-export-string-as "[[quote-lang:spanish][Publicamos nuestro libros para librarnos de ellos, para no pasar el resto de nuestras vidas corrigiendo borradores.]]" backend t) t)) backends) (mapconcat 'identity results "\n") #+end_src #+RESULTS: #+begin_example \begin{foreigndisplayquote}{spanish} Publicamos nuestro libros para librarnos de ellos, para no pasar el resto de nuestras vidas corrigiendo borradores. \end{foreigndisplayquote} <p> <blockquote> Publicamos nuestro libros para librarnos de ellos, para no pasar el resto de nuestras vidas corrigiendo borradores. 
</blockquote> </p> <text:p text:style-name="Text_20_body">Publicamos nuestro libros para librarnos de ellos, para no pasar el resto de nuestras vidas corrigiendo borradores.</text:p> #+end_example ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup 2021-12-09 7:01 ` Juan Manuel Macías @ 2021-12-09 14:56 ` Max Nikulin 2021-12-09 16:11 ` Juan Manuel Macías 0 siblings, 1 reply; 69+ messages in thread From: Max Nikulin @ 2021-12-09 14:56 UTC (permalink / raw) To: emacs-orgmode On 09/12/2021 14:01, Juan Manuel Macías wrote: > John Kitchin writes: > >> Have you seen >> https://github.com/tj64/org-dp? It seems to do a lot with creating and >> manipulating org elements. It might either be handy or lead to some >> inspiration. > > Interesting package. Thanks for sharing. Either I missed something or its purpose is completely different. It maps Org markup to Org markup. I am experimenting with fragments that should allow to get something that is really tricky or even impossible with established syntax, so it has to run immediately before exporters. > It gave me an idea, also borrowing part of Maxim's code, but evaluating > in this case the path. To continue playing with links... The goal is > to obtain a link with this structure `[[quote-lang:lang][quote]]': > > #+BEGIN_SRC emacs-lisp :results silent > (org-link-set-parameters > "quote-lang" > :display 'full > :export (lambda (path desc bck) > (let* ((bck org-export-current-backend) > (attr (list (format > ":environment foreigndisplayquote :options {%s}" > path))) > (info (org-export-get-environment > bck nil nil))) > (org-no-properties > (org-export-data-with-backend > `(quote-block (:attr_latex ,attr) > ,desc) > bck info))))) > #+END_SRC Looking into your code I have realized that it should be implemented using filter, not through :export property of links. Maybe without working proof of concept with link exporters, this session of monkey-typing would not be successful. #+begin_src elisp :results silent (defun orgia-element-replace (current new destructive?) (if (eq current new) current (let* ((lst? (and (listp new) (not (symbolp (car new))))) (new-lst (if lst? (if destructive? 
(nconc new) (reverse new)) (list new)))) (dolist (element new-lst) (org-element-insert-before element current))) (org-element-extract-element current) new)) (defun orgia--transform-link (data) (if (not (string-equal "orgia" (org-element-property :type data))) data (let* ((path (org-element-property :path data))) (if (not (eq ?\( (aref path 0))) (or path (org-element-contents data)) (read path))))) (defun orgia-parse-tree-filter (data _backend info) (org-element-map data 'link (lambda (data) (orgia-element-replace data (orgia--transform-link data) t)) info nil nil t) data) #+end_src #+begin_src elisp :results silent (add-to-list 'org-export-filter-parse-tree-functions #'orgia-parse-tree-filter) (org-link-set-parameters "orgia") #+end_src #+begin_src elisp (org-export-string-as "An <orgia:(\"in\" (italic () \"ter\"))>word" 'html t) #+end_src #+RESULTS: : <p> : An in<i>ter</i>word</p> ^ permalink raw reply [flat|nested] 69+ messages in thread
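The ~orgia--transform-link~ function in the message above calls ~(aref path 0)~ and ~read~ directly on the link path, so an empty or unbalanced path would signal an error during export. As a hedged sketch of the "error handling is required" point raised earlier in the thread, a more defensive reader might look like the following. The name ~my-orgia-read-path~ is hypothetical, and returning nil on failure is an assumption about how a caller would want to fall back to the raw path.

#+begin_src emacs-lisp
;; Hypothetical helper, for illustration only; not part of Org.
(defun my-orgia-read-path (path)
  "Return the list encoded in PATH, or nil when PATH cannot be read.
PATH must be a non-empty string starting with an opening
parenthesis.  The form is only read, never evaluated."
  (when (and (stringp path)
             (> (length path) 0)
             (eq ?\( (aref path 0)))
    (condition-case nil
        ;; `read-from-string' returns (OBJECT . END-POSITION).
        (let ((form (car (read-from-string path))))
          (and (listp form) form))
      ;; Unbalanced parentheses and similar reader problems end up here.
      (error nil))))
#+end_src

A caller could then use ~(or (my-orgia-read-path path) path)~ so that malformed snippets are exported as plain text instead of aborting the whole export.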
* Re: Raw Org AST snippets for "impossible" markup 2021-12-09 14:56 ` Max Nikulin @ 2021-12-09 16:11 ` Juan Manuel Macías 2021-12-09 22:27 ` Juan Manuel Macías 0 siblings, 1 reply; 69+ messages in thread From: Juan Manuel Macías @ 2021-12-09 16:11 UTC (permalink / raw) To: Max Nikulin; +Cc: orgmode Max Nikulin writes: > Looking into your code I have realized that it should be implemented > using filter, not through :export property of links. Maybe without > working proof of concept with link exporters, this session of > monkey-typing would not be successful. Jumping into the "real world", how about these two examples of nested emphasis? #+begin_src org :results latex :results replace [[orgia:(italic () "The English versions of the " (italic () "Iliad") " and the " (italic () "Odyssey"))]] #+end_src #+RESULTS: #+begin_export latex \emph{The English versions of the \emph{Iliad} and the \emph{Odyssey}} #+end_export This one more complex: #+begin_src org :results latex :results replace [[orgia:(italic () "The English versions of the " (bold () (italic () "Iliad")) " and the " (bold () (italic () "Odyssey")))]] #+end_src #+RESULTS: #+begin_export latex \emph{The English versions of the \textbf{\emph{Iliad}} and the \textbf{\emph{Odyssey}}} #+end_export ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Raw Org AST snippets for "impossible" markup
From: Juan Manuel Macías @ 2021-12-09 22:27 UTC
To: Maxim Nikulin; +Cc: orgmode

Juan Manuel Macías writes:

> Jumping into the "real world", how about these two examples of nested emphasis?

By the way, what do you think about allowing the use of some kind of
aliases, so that the aspect is less verbose? Maybe something like "(i::"
instead of "(italic () ..."?

I came up with this hasty sketch over your latest code, *just* to see
how it looks (I don't know if I prefer it to stay verbose):

#+begin_src emacs-lisp :results silent
  (setq orgia-alias-alist '(("i" "italic")
                            ("b" "bold")
                            ("u" "underline")
                            ("s" "strike-through")))

  (defun orgia-replace (before after)
    (interactive)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward before nil t)
        (replace-match after t nil))))

  (defun orgia--transform-path (path)
    (with-temp-buffer
      (insert path)
      (mapc (lambda (el)
              (orgia-replace (concat "(" (car el) "::")
                             (concat "(" (cadr el) " () ")))
            orgia-alias-alist)
      (buffer-string)))

  (defun orgia--transform-link (data)
    (if (not (string-equal "orgia" (org-element-property :type data)))
        data
      (let* ((path (org-element-property :path data)))
        (if (not (eq ?\( (aref path 0)))
            (or path (org-element-contents data))
          (read (orgia--transform-path path)))))) ;; <====
  ;;;;;;;;;;;;;;;;;;
#+end_src

#+begin_src elisp
(org-export-string-as "An <orgia:(\"in\" (s:: \"ter\"))>word" 'odt t)
#+end_src

#+RESULTS:
:
: <text:p text:style-name="Text_20_body">An in<text:span text:style-name="Strikethrough">ter</text:span>word</text:p>

#+begin_src org :results latex :results replace
[[orgia:(i:: "The English versions of the " (b:: (i:: "Iliad")) " and the " (b:: (i:: "Odyssey")))]]
#+end_src

#+RESULTS:
#+begin_export latex
\emph{The English versions of the \textbf{\emph{Iliad}} and the
\textbf{\emph{Odyssey}}}
#+end_export

------------------------------------------------------
Juan Manuel Macías

https://juanmanuelmacias.com/

* Re: Raw Org AST snippets for "impossible" markup
From: Max Nikulin @ 2022-01-03 14:34 UTC
To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1899 bytes --]

On 10/12/2021 05:27, Juan Manuel Macías wrote:
> Juan Manuel Macías writes:
>
>> Jumping into the "real world", how about these two examples of nested emphasis?
>
> By the way, what do you think about allowing the use of some kind of
> aliases, so that the aspect is less verbose?

I have no particular opinion concerning aliases, but certainly they
should not work through string search and replace when a parsed tree is
available.

>    (defun orgia--transform-path (path)
>      (with-temp-buffer
>        (insert path)
>        (mapc (lambda (el)
>                (orgia-replace (concat "(" (car el) "::") (concat "(" (cadr el) " () ")))

By the way, is there any problem with `replace-regexp-in-string'?

See the attached file for definitions of some helper functions. Final
setup:

#+begin_src elisp :results silent
  (setq orgia-demo-alias-alist
        '((b . bold)
          (i . italic)
          (s . strike-through)
          (_ . underline)))
  (defun orgia-demo-alias-post-filter (node &optional _children)
    (when (listp node)
      (let ((sym (and (symbolp (car node))
                      (assq (car node) orgia-demo-alias-alist))))
        (when sym
          (setcar node (cdr sym)))))
    node)
  (defun orgia-demo-alias (tree)
    (orgia-transform-tree-deep tree nil #'orgia-demo-alias-post-filter))
#+end_src

#+begin_src elisp :results silent
  (require 'ox)
  (add-to-list 'org-export-filter-parse-tree-functions
               #'orgia-parse-tree-filter)
  (org-link-set-parameters "orgia")
  (require 'ob-org)
  (add-to-list 'orgia-transform-functions #'orgia-demo-alias)
#+end_src

And your test sample, slightly modified:

#+begin_src org :results latex :results replace
[[orgia:(i nil "The English versions of the " (b nil (i () "Iliad")) " and the " (b () (i () "Odyssey")))]]
#+end_src

#+RESULTS:
#+begin_export latex
\emph{The English versions of the \textbf{\emph{Iliad}} and the \textbf{\emph{Odyssey}}}
#+end_export

[-- Attachment #2: orgia-draft.el --]
[-- Type: text/x-emacs-lisp, Size: 2080 bytes --]

(defvar orgia-transform-functions nil)

(defun orgia-default-pre-filter (node)
  "Returns (node . children)"
  (if (listp node)
      (cons node node)
    (cons node nil)))

(defun orgia-transform-tree-deep (tree &optional pre-filter post-filter)
  "Deep-first walk."
  ;; Queue items: ((node-cell . children) . next-list)
  (let* ((pre-filter (or pre-filter #'orgia-default-pre-filter))
         (top (list tree))
         (queue (list (cons (cons top top) top))))
    (while queue
      (let* ((item (pop queue))
             (next-list (cdr item)))
        (if (not next-list)
            ;; post; skip POST-FILTER for the list wrapping TREE
            (when (and queue post-filter)
              (let* ((node-cell-children (car item))
                     (children (cdr node-cell-children)))
                (setcar (car node-cell-children)
                        (funcall post-filter
                                 (caar node-cell-children)
                                 children))))
          ;; pre
          (setcdr item (cdr next-list))
          (push item queue)
          (let* ((node-children (funcall pre-filter (car next-list)))
                 (node (car node-children))
                 (children (cdr node-children)))
            (setcar next-list node)
            (push (cons (cons next-list children) children) queue)))))
    (car top)))

(defun orgia-element-replace (current new destructive?)
  (if (eq current new)
      current
    (let* ((lst? (and (listp new) (not (symbolp (car new)))))
           (new-lst (if lst?
                        (if destructive? (nconc new) (reverse new))
                      (list new))))
      (dolist (element new-lst)
        (org-element-insert-before element current)))
    (org-element-extract-element current)
    new))

(defun orgia--transform-link (data)
  (if (not (string-equal "orgia" (org-element-property :type data)))
      data
    (let* ((path (org-element-property :path data)))
      (if (not (eq ?\( (aref path 0)))
          (or path (org-element-contents data))
        (let ((tree (read path)))
          (dolist (f orgia-transform-functions tree)
            (setq tree (funcall f tree))))))))

(defun orgia-parse-tree-filter (data _backend info)
  (org-element-map data 'link
    (lambda (data)
      (orgia-element-replace data (orgia--transform-link data) t))
    info nil nil t)
  data)

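The point made above, that aliases are better expanded on the parsed
tree than by string search and replace, can be illustrated without any
Org machinery. The following is only a minimal sketch, not part of
orgia-draft.el: the names my/orgia-aliases and my/orgia-expand-aliases
are hypothetical, and the function simply copies a nested list while
replacing alias symbols found in head position.

#+begin_src elisp
  ;; Hypothetical alias table, mirroring `orgia-demo-alias-alist' above.
  (defvar my/orgia-aliases
    '((b . bold) (i . italic) (s . strike-through) (_ . underline)))

  (defun my/orgia-expand-aliases (tree)
    "Return a copy of TREE with alias symbols in head position expanded.
  Strings, nil and other atoms are returned unchanged.  TREE is assumed
  to consist of proper lists only."
    (if (not (consp tree))
        tree
      (let ((alias (and (symbolp (car tree))
                        (assq (car tree) my/orgia-aliases))))
        (cons (if alias
                  (cdr alias)
                (my/orgia-expand-aliases (car tree)))
              (mapcar #'my/orgia-expand-aliases (cdr tree))))))

  (my/orgia-expand-aliases '(i nil "The " (b nil (i () "Iliad"))))
  ;; → (italic nil "The " (bold nil (italic nil "Iliad")))
#+end_src

Because only symbols are compared, text such as "(i::" inside a string
is never touched, which is the main risk of the string-based approach.
A list that merely happens to start with one of the alias symbols would
still be rewritten, so this remains a sketch rather than a full design.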
* Re: Org-syntax: Intra-word markup 2021-12-04 17:53 ` Tom Gillespie 2021-12-04 18:37 ` John Kitchin @ 2021-12-04 19:04 ` Timothy 2021-12-04 21:48 ` Tom Gillespie 2021-12-06 11:01 ` Denis Maier 2 siblings, 1 reply; 69+ messages in thread From: Timothy @ 2021-12-04 19:04 UTC (permalink / raw) To: Tom Gillespie Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, emacs-orgmode, Denis Maier [-- Attachment #1: Type: text/plain, Size: 872 bytes --] Hi Tom, > After a bunch of rambling (see below if interested), I think I have > a solution that should work for everyone. The key realization is that > what we really want is the ability to have a “parse me separately” > type of syntax. This meets the intra-word syntax needs and might > meet some other needs as well. > > The solution is to make “parse me separately” > block! It nearly works that way already too! To minimize typing > we could have @@:…@@ the empty type default to org. > > Thoughts? This isn’t quite as succinct as the ascii-doc inspired suggestions, but it’s barely an extension on the current syntax — I like it! Since org is a valid export backend though, perhaps this behaviour should be reserved for @@:…@@, i.e. no export backend, which I think semantically fits fairly nicely. All the best, Timothy ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2021-12-04 19:04 ` Org-syntax: Intra-word markup Timothy @ 2021-12-04 21:48 ` Tom Gillespie 2021-12-06 10:59 ` Max Nikulin 2022-01-28 14:52 ` Max Nikulin 0 siblings, 2 replies; 69+ messages in thread From: Tom Gillespie @ 2021-12-04 21:48 UTC (permalink / raw) To: Timothy; +Cc: emacs-orgmode > Since org is a valid export backend though, perhaps this behaviour should be > reserved for @@:…@@, i.e. no export backend, which I think semantically fits > fairly nicely. This ends up being even more convenient than I initially realized. The current spec for export snippets is ambiguous when it says "NAME can contain any alpha-numeric character and hyphens" but the implementation behavior requires that "any" means "at least one" and is implemented using the + regex operator. What this means is that @@:...@@ syntax is not actually used in Org at all at the moment and renders as plain text. I agree that we need to avoid @@org:..@@ because it has legitimate uses. Making a back-end of empty string valid for parse separately syntax thus makes @@ syntax more regular overall, and allows @@:...@@ to be processed separately because it currently never enters the export snippet processing. This is important because export snippets do not seem to be easily accessible to earlier phases of the org-export machinery, i.e. there isn't a nice centralized place to preprocess @@org:...@@ even if we wanted to. On the other hand @@:...@@ isn't processed at all. I could be missing something in the org export code though. It will take a bit of work to get this behavior implemented I think, but it doesn't seem to have any conflicts. Some users may have set the empty backend to expand manually via org-export-snippet-translation-alist, but as long as we give org-export-snippet-translation-alist priority and warn people that setting "" manually will disable the new functionality then there shouldn't be any disruption. 
The behavior also sort of matches what we would want the empty string to be in this case, which is "all backends" and of course the only markup that makes sense for "all backends" is org itself! Best, Tom ^ permalink raw reply [flat|nested] 69+ messages in thread
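The remark above about the + operator can be checked outside Org with a
small sketch. The two regexps below are assumptions modeled on the
description in this message (NAME made of alphanumeric characters and
hyphens), not quotations from org-element.el, and the my/ names are
hypothetical. The only difference between them is "+" (at least one
character) versus "*" (possibly empty).

#+begin_src elisp
  ;; Assumed shape of the current opener: NAME needs at least one character.
  (defconst my/snippet-opener-strict "\\`@@\\([-A-Za-z0-9]+\\):")
  ;; Hypothetical relaxed variant that also accepts an empty NAME.
  (defconst my/snippet-opener-relaxed "\\`@@\\([-A-Za-z0-9]*\\):")

  (defun my/snippet-name (regexp string)
    "Return NAME if STRING starts with a snippet opener per REGEXP, else nil."
    (when (string-match regexp string)
      (match-string 1 string)))

  (my/snippet-name my/snippet-opener-strict "@@html:<br>@@")      ; → "html"
  (my/snippet-name my/snippet-opener-strict "@@:*intra*@@word")   ; → nil
  (my/snippet-name my/snippet-opener-relaxed "@@:*intra*@@word")  ; → ""
#+end_src

With the strict form, "@@:" never starts a snippet, which is consistent
with the statement that @@:...@@ currently falls through as plain text.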
* Re: Org-syntax: Intra-word markup
From: Max Nikulin @ 2021-12-06 10:59 UTC
To: emacs-orgmode

On 05/12/2021 04:48, Tom Gillespie wrote:
>> Since org is a valid export backend though, perhaps this behaviour should be
>> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
>> fairly nicely.
>
> This ends up being even more convenient than I initially realized.

It is a bright idea. The only drawback I see is that it is impossible to
put a new "@@:@@" fragment inside an export snippet, as in
"@@latex:some @@:special@@thing@@", or vice versa.

* Re: Org-syntax: Intra-word markup
From: Max Nikulin @ 2022-01-28 14:52 UTC
To: Tom Gillespie; +Cc: emacs-orgmode

On 05/12/2021 04:48, Tom Gillespie wrote:
>> Since org is a valid export backend though, perhaps this behaviour should be
>> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
>> fairly nicely.
>
> ...
>
> What this means is that @@:...@@ syntax is not actually used
> in Org at all at the moment and renders as plain text. I agree that
> we need to avoid @@org:..@@ because it has legitimate uses.
> Making a back-end of empty string valid for parse separately
> syntax thus makes @@ syntax more regular overall, and allows
> @@:...@@ to be processed separately because it currently
> never enters the export snippet processing.

It seems that @@:...@@ should behave significantly differently from a
regular export snippet, since org markup should be parsed inside.

It could be used for one more purpose. I miss a "fallback" option for
export snippets. E.g. if explicit raw markup is specified for HTML and
LaTeX, it would be nice to have something for other backends such as
ascii or odt. In a series of adjacent export snippets, @@:...@@ may be
taken when backends in earlier snippets are not matched:

@@html:HTML 1@@@@latex:LaTeX 1@@@@:ascii and odt 1@@@@html: HTML 2@@@@:LaTeX, ascii, and odt 2@@.

At first I complained that it would be impossible to put export snippets
in a "parse separately" construct with @@:...@@ syntax. Likely it is not
necessary. It is a bit verbose, but "parse separately" may be split:

@@:part 1@@@@html:html-only@@@@:@@@@:part 2@@

The empty @@:@@ is added to avoid considering @@:part 2@@ as a fallback
for "html-only".

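The fallback rule proposed above can be modeled on plain data. The
sketch below is an assumption about how the rule might work, not
existing Org behavior: a run of adjacent snippets is represented as a
list of (NAME . TEXT) pairs, an empty NAME stands for the proposed
@@:...@@ form, and every fallback closes the current group of
alternatives. The function name my/select-snippets is hypothetical.

#+begin_src elisp
  (defun my/select-snippets (run backend)
    "Return the list of TEXTs from RUN that would be kept for BACKEND.
  RUN is a list of (NAME . TEXT) pairs for adjacent snippets.  A snippet
  with an empty NAME is a fallback: it is kept only when no snippet since
  the previous fallback matched BACKEND."
    (let ((matched nil)
          (result nil))
      (dolist (snippet run (nreverse result))
        (let ((name (car snippet))
              (text (cdr snippet)))
          (cond
           ((string= name "")
            (unless matched (push text result))
            ;; A fallback ends the current group of alternatives.
            (setq matched nil))
           ((string= name backend)
            (push text result)
            (setq matched t)))))))

  (defvar my/run-1
    '(("html" . "HTML 1") ("latex" . "LaTeX 1") ("" . "ascii and odt 1")
      ("html" . " HTML 2") ("" . "LaTeX, ascii, and odt 2")))

  (my/select-snippets my/run-1 "html")   ; → ("HTML 1" " HTML 2")
  (my/select-snippets my/run-1 "latex")  ; → ("LaTeX 1" "LaTeX, ascii, and odt 2")
  (my/select-snippets my/run-1 "ascii")  ; → ("ascii and odt 1" "LaTeX, ascii, and odt 2")

  ;; The split "parse separately" example: the empty fallback absorbs the
  ;; role of alternative to "html-only", so "part 2" is always kept.
  (my/select-snippets
   '(("" . "part 1") ("html" . "html-only") ("" . "") ("" . "part 2"))
   "html")                               ; → ("part 1" "html-only" "part 2")
#+end_src

Under this reading the results agree with the labels used in the two
examples; a real implementation would additionally have to parse the
fallback text as Org markup.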
* Re: Org-syntax: Intra-word markup 2022-01-28 14:52 ` Max Nikulin @ 2022-01-29 3:13 ` Ihor Radchenko 2022-01-29 13:05 ` Juan Manuel Macías 0 siblings, 1 reply; 69+ messages in thread From: Ihor Radchenko @ 2022-01-29 3:13 UTC (permalink / raw) To: Max Nikulin; +Cc: Tom Gillespie, emacs-orgmode Max Nikulin <manikulin@gmail.com> writes: > It could be used for one more purpose. I miss "fallback" option for > export snippets. E.g. if explicit raw markup is specified for HTML and > LaTeX, it would be nice to have something for other backends such as > ascii or odt. In the series of adjacent export snippets @@:...@@ may be > taken when backends in earlier snippets are not matched: This reminds me about our #+begin_export export blocks and #+begin_* special blocks. We can think of @@backend:...@@ snippets as inline equivalent of export blocks. Special blocks do not have inline equivalent (except maybe links abused for export by some people). Keeping in mind the above analogy, note that export blocks do not have fallbacks, while special blocks do (for example, see https://github.com/alhassy/org-special-block-extras/). Maybe we should introduce an equivalent of special blocks, but for inline use? Or should we modify _both_ inline export snippets and export blocks to allow fallback mechanism? Best, Ihor ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2022-01-29 3:13 ` Ihor Radchenko @ 2022-01-29 13:05 ` Juan Manuel Macías 2022-02-02 15:28 ` Max Nikulin 0 siblings, 1 reply; 69+ messages in thread From: Juan Manuel Macías @ 2022-01-29 13:05 UTC (permalink / raw) To: Ihor Radchenko; +Cc: orgmode Ihor Radchenko writes: > Maybe we should introduce an equivalent of special blocks, but for > inline use? Or should we modify _both_ inline export snippets and export > blocks to allow fallback mechanism? I find the idea of inline special blocks very interesting, but I think there are a couple of drawbacks: since special blocks support ATTR_X, how would that be implemented in the inline version? The most obvious thing I can think of is to mimic inline code blocks: my_special_block[attributes list]{content} But it would produce a result many times too verbose. Another risk that this would entail, IMHO, is that of the "LaTeXification" of Org... In any case, for things like that, aren't links and macros enough? I'm one of those who 'abuse' links for many export scenarios (I even have written this package: https://gitlab.com/maciaschain/org-critical-edition), and I think links have enormous potential and versatility. John Kitchin's blog has really helped me open my mind and explore that very productive Org component. Macros are also a very powerful tool, except for the comma issue, which I think is still an unfinished business and a solution should be found one day. Still, the possibility of a special inline block is very interesting to me. Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup 2022-01-29 13:05 ` Juan Manuel Macías @ 2022-02-02 15:28 ` Max Nikulin 2022-02-02 20:01 ` Juan Manuel Macías 0 siblings, 1 reply; 69+ messages in thread From: Max Nikulin @ 2022-02-02 15:28 UTC (permalink / raw) To: emacs-orgmode > Ihor Radchenko writes: >> Keeping in mind the above analogy, note that export blocks do not have >> fallbacks, while special blocks do (for example, see >> https://github.com/alhassy/org-special-block-extras/) Ihor, I am sorry, but I missed your point. That project provides some set of defined link+block pairs and some macros to define new links/pairs. I do not see relation to export snippets or blocks that are used when their content is not intended to be reusable. >> Maybe we should introduce an equivalent of special blocks, but for >> inline use? Or should we modify _both_ inline export snippets and export >> blocks to allow fallback mechanism? I suppose, it should be consistent to consider adjacent export blocks as alternatives and to allow "fallback" or "default" block. Again, similar to @@:...@@ snippets, block content should be parsed as Org markup. On 29/01/2022 20:05, Juan Manuel Macías wrote: > I find the idea of inline special blocks very interesting, but I think > there are a couple of drawbacks: since special blocks support ATTR_X, > how would that be implemented in the inline version? The most obvious > thing I can think of is to mimic inline code blocks: > > my_special_block[attributes list]{content} ATTR_X attributes are supported for links as well, see info "(org) Links in HTML export" https://orgmode.org/manual/Links-in-HTML-export.html However it is rather verbose, may have problems with LaTeX, and I am unsure if they can be accessed from export link handlers Actually I do not like src_something[...]{...} syntax since there is no clear mark (such as "\") at the beginning that it is a special construct. > In any case, for things like that, aren't links and macros enough? 
Ad hoc code for particular backends (and discussed fallback for other backends) is a bit different thing. It may be used in macros, but macros can not replace it. Moreover @@:...@@ construct proposed by Tom would allow e.g. [[https://orgmode.org][@@:*inter*@@@@:/word/@@]] to be half-word bold and half-word italics without invisible zero width spaces and filters to remove them. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Org-syntax: Intra-word markup
From: Juan Manuel Macías @ 2022-02-02 20:01 UTC
To: Max Nikulin; +Cc: orgmode

Max Nikulin writes:

> ATTR_X attributes are supported for links as well, see
> info "(org) Links in HTML export"
> https://orgmode.org/manual/Links-in-HTML-export.html
> However it is rather verbose, may have problems with LaTeX, and I am
> unsure if they can be accessed from export link handlers

Yes, I know. I use a lot in my blogs constructions of this type:

#+ATTR_HTML: :target _blank
some link...

But, as far as I know, its use is line-oriented. I mean, you can't use
multiple ATTR_X constructs inside a paragraph and for different links
inside the paragraph.

As for links and their multiple possible or future uses (I say *uses*
and never *abuses*: it's a tool, it's there to be used, and it works
great), of course I see them more as a resource (and a quite powerful
and versatile one, by the way) than as a matter of syntax. But the thing
is that for me Org is, in addition to a syntax, above all a set of
coherently assembled resources to prepare my documents and take my
notes, organize my work and a lot of other things.

Best regards,

Juan Manuel

* Re: Org-syntax: Intra-word markup
From: Max Nikulin @ 2022-02-03 12:10 UTC
To: emacs-orgmode

On 03/02/2022 03:01, Juan Manuel Macías wrote:
> Max Nikulin writes:
>
>> ATTR_X attributes are supported for links as well, see
>> info "(org) Links in HTML export"
>> https://orgmode.org/manual/Links-in-HTML-export.html
>> However it is rather verbose, may have problems with LaTeX, and I am
>> unsure if they can be accessed from export link handlers
>
> Yes, I know. I use a lot in my blogs constructions of this type:
>
> #+ATTR_HTML: :target _blank
> some link...

I have just realized that the example in the manual does not work. I
will start a new thread. Attributes are assigned to the paragraph, not
only to the link:

#+ATTR_HTML: :title The Org mode homepage :style color:red;
[[https://orgmode.org]]

<p title="The Org mode homepage" style="color:red;">
<a href="https://orgmode.org" title="The Org mode homepage" style="color:red;">https://orgmode.org</a>
</p>

> But, as far as I know, its use is line-oriented. I mean, you can't use
> multiple ATTR_X constructs inside a paragraph and for different links
> inside the paragraph.

Thank you, I confused issues related to export when keywords and export
blocks are used. For some reason I believed that affiliated keywords
have a dedicated section in https://orgmode.org/worg/dev/org-syntax.html
because they can be applied to inline objects, but you are right, they
set properties for the next block-level element. Attributes from several
lines are combined, however.

The following snippet illustrates bugs in the LaTeX exporter that I
remember from an earlier discussion:

---- >8 ----
This is a single paragraph in LaTeX export, but 3 HTML paragraphs.
First link (with =rel= attribute) is to
#+attr_html: :rel nofollow :title Org Mode web site
[[https://orgmode.org/][Org Mode]].
Another one is to
#+attr_html: :rel noopener
#+attr_html: :title GNU web site
[[https://www.gnu.org/][GNU]].
Both links have =title= HTML attributes.

This is single paragraph in HTML
@@odt:@@
but 2 paragraphs in LaTeX.
---- 8< ----
This is a single paragraph in \LaTeX{} export, but 3 HTML paragraphs.
First link (with \texttt{rel} attribute) is to
\href{https://orgmode.org/}{Org Mode}.
Another one is to
\href{https://www.gnu.org/}{GNU}.
Both links have \texttt{title} HTML attributes.

This is single paragraph in HTML

but 2 paragraphs in \LaTeX{}.
---- >8 ----
<p>
This is a single paragraph in LaTeX export, but 3 HTML paragraphs.
First link (with <code>rel</code> attribute) is to
</p>
<p rel="nofollow" title="Org Mode web site">
<a href="https://orgmode.org/" rel="nofollow" title="Org Mode web site">Org Mode</a>.
Another one is to
</p>
<p title="GNU web site" rel="noopener">
<a href="https://www.gnu.org/" title="GNU web site" rel="noopener">GNU</a>.
Both links have <code>title</code> HTML attributes.
</p>

<p>
This is single paragraph in HTML

but 2 paragraphs in LaTeX.</p>

* Re: Org-syntax: Intra-word markup 2021-12-04 17:53 ` Tom Gillespie 2021-12-04 18:37 ` John Kitchin 2021-12-04 19:04 ` Org-syntax: Intra-word markup Timothy @ 2021-12-06 11:01 ` Denis Maier 2 siblings, 0 replies; 69+ messages in thread From: Denis Maier @ 2021-12-06 11:01 UTC (permalink / raw) To: Tom Gillespie, emacs-orgmode Cc: Juan Manuel Macías, Max Nikulin, Tim Cross Hi Tom Am 04.12.2021 um 18:53 schrieb Tom Gillespie: > Hi all, > After a bunch of rambling (see below if interested), I think I have > a solution that should work for everyone. The key realization is that > what we really want is the ability to have a "parse me separately" > type of syntax. This meets the intra-word syntax needs and might > meet some other needs as well. > > The solution is to make @@org:...@@ "parse me separately" > block! It nearly works that way already too! To minimize typing > we could have @@:...@@ the empty type default to org. > > This seems like a winner to me. The syntax for it already exists > and won't conflict. It requires relatively minimal additional typing > the implication is clear, and there are other places where such > behavior could be useful. > > This syntax seems like a winner to me > @@org:/hello/@@world > @@:/hello/@@world > > You can also do things like > #+begin_src org > I want a number in this number@@org:src_elisp{(+ 1 2)}@@word! > #+end_src > > Which would render to > #+begin_src org > I want a number in this number3word! > #+end_src > > Thoughts? > > Best! > Tom > Thanks for the suggestion. I think that sounds like a good idea. Of course not as terse as the asciidoc inspired suggestion, but entirely appropriate for a case like this one! I also like that there might be other cases where case might be handy. Best, Denis ^ permalink raw reply [flat|nested] 69+ messages in thread
* [PATCH] Intra-word markup: \relax
From: Max Nikulin @ 2022-01-28 12:12 UTC
To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1214 bytes --]

On 02/12/2021 17:50, Denis Maier wrote:
>
> Currently, org syntax doesn't officially seem to support intra-word
> emphasis. Am I missing something?
> If the assessment is correct: Is there a reason for this? And, shouldn't
> that be officially added?

I have an idea how to implement *intra*/word/ markup with minimal change
of Org syntax. At first I had a hope that it is enough to introduce a
\relax entity that expands to the empty string, but it does not work for
the second part of words: *intra*\relax{}/word/ is exported to
<b>intra</b>/word/.
So it is necessary to support consuming spaces after such an entity,
similar to TeX commands:
*intra*\relax /word/
In Org "a\_ b" already behaves in the same way.

I do not like zero-width spaces since they are invisible, so they are
not really "text" markup. Moreover, it is better to filter them out
during export.

Another failed idea was to use an export snippet or a macro for this
purpose:
#+macro sep $1
*intra*{{{sep()}}}/word/, *intra*@@html:@@/word/

An important point is that the suggested solution works for all export
backends. I do not consider explicit export snippets as a workaround
since it requires code for all backends in org files.

[-- Attachment #2: 0001-Intra-word-markup-relax.patch --]
[-- Type: text/x-patch, Size: 2278 bytes --]

From 95a0dcb1370577409388e137dae98ec4c1af5bbd Mon Sep 17 00:00:00 2001
From: Max Nikulin <manikulin@gmail.com>
Date: Fri, 28 Jan 2022 18:55:54 +0700
Subject: [PATCH] Intra-word markup: \relax

lisp/org-element.el (org-element-entity-parser): Parse \relax entity
with following spaces.
lisp/org-entities.el (org-entities): Add "\relax " entities with
various number of spaces expanding to nothing.

Allow "*intra*\relax /word/" markup change within continuous word.
It is not enough to just add "relax" entity since while it allows
"*intra*\relax{}word", characters after "{}" are not considered
as emphasis markers "intra\relax{}/word/".  The name is similar
to the TeX command.  Consuming spaces following a command
is usual behavior of TeX commands as well.
---
 lisp/org-element.el  | 2 +-
 lisp/org-entities.el | 7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index b82475a14..83001fd74 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -3159,7 +3159,7 @@ a plist with `:begin', `:end', `:latex', `:latex-math-p',
 
 Assume point is at the beginning of the entity."
   (catch 'no-object
-    (when (looking-at "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
+    (when (looking-at "\\\\\\(?:\\(?1:\\(?:_\\|relax\\) +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
       (save-excursion
         (let* ((value (or (org-entity-get (match-string 1))
                           (throw 'no-object nil)))
diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index 2bd4f2fe3..f6177c471 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -526,7 +526,12 @@ packages to be loaded, add these packages to `org-latex-packages-alist'."
                      spaces
                      spaces
                      (make-string n ?\x2002))
-               space-entities)))))
+               space-entities))))
+   ;; Add "\relax " space-eating entity family for "intra\relax *word*" markup.
+   (mapcar (lambda (n)
+             (list (concat "relax" (make-string n ? )) "" nil "" "" "" ""))
+           (number-sequence 0 20)))
+
   "Default entities used in Org mode to produce special characters.
 For details see `org-entities-user'.")
 
-- 
2.25.1

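What the modified regexp in the patch above captures can be tried on
plain strings. In the sketch below the regexp is copied from the "+"
line of the patch and only split across several strings for readability;
the names my/entity-re and my/entity-name-at-start are hypothetical, and
anchoring at the start of a string is an assumption that stands in for
`looking-at' in a buffer.

#+begin_src elisp
  (defconst my/entity-re
    (concat "\\\\\\(?:"
            ;; "\_" or "\relax" followed by one or more spaces.
            "\\(?1:\\(?:_\\|relax\\) +\\)"
            "\\|"
            ;; Any other entity name, then end of line, "{}" or a non-letter.
            "\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)"
            "\\(?2:$\\|{}\\|[^[:alpha:]]\\)"
            "\\)"))

  (defun my/entity-name-at-start (string)
    "Return the entity name matched at the beginning of STRING, or nil."
    (when (string-match (concat "\\`" my/entity-re) string)
      (match-string 1 string)))

  (my/entity-name-at-start "\\relax /word/")   ; → "relax "
  (my/entity-name-at-start "\\relax{}/word/")  ; → "relax"

  ;; Names of the first three members of the entity family that the
  ;; patch generates with `number-sequence'.
  (mapcar (lambda (n) (concat "relax" (make-string n ?\s)))
          (number-sequence 0 2))               ; → ("relax" "relax " "relax  ")
#+end_src

In the first call the trailing space is part of the captured name, so
it belongs to the entity rather than to the surrounding text. Whether
the following /word/ is then recognized as emphasis depends on the rest
of the parser, which this sketch does not model.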
* Re: [PATCH] Intra-word markup: \relax 2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin @ 2022-01-28 13:13 ` Juan Manuel Macías 2022-02-02 15:42 ` Max Nikulin 0 siblings, 1 reply; 69+ messages in thread From: Juan Manuel Macías @ 2022-01-28 13:13 UTC (permalink / raw) To: Max Nikulin; +Cc: orgmode Max Nikulin writes: > I have an idea how to implement *intra*/word/ markup with minimal > change of Org syntax. At first I had a hope that it is enough to > introduce \relax entity that expands to empty string, but it does not > work for second part of words: *intra*\relax{}/word/ is exported to > <b>intra</b>/word/. > So it is necessary to support consuming spaces after such entity > similar to TeX commands: > *intra*\relax /word/ > In Org "a\_ b" already behaves in the same way. > > I do not like zero-width spaces since they are invisible, so they are > not really "text" markup. Moreover, it is better to filter them out > during export. > > Another failed idea was to use export snippet or a macro for such purpose: > #+macro sep $1 > *intra*{{{sep()}}}/word/, *intra*@@html:@@/word/ > > Important point that suggested solution works for all export backends. > I do not consider explicit export snippets as a workaround since it > requires code for all backends in org files. Maxim, I find the idea of \relax entity interesting. The only (minor) drawback I find (in normal use, I mean) is the verbosity it adds. In my case, I have already given up on the problem of marks inside words :-(. My personal opinion: I think that, unless a completely 'revolutionary' solution emerges, it is better to leave the matter as it is, and consider this a feature of Org rather than a bug. I suspect that a single solution could not satisfy all tastes or all possible scenarios, so maybe it would be nice to put a list of solutions (including this one and also the zero space thing, and others that have arisen or may arise) somewhere (perhaps in the manual?). 
What doesn't quite convince me (and I agree with you on that) is recommending zero width space as a sort of 'official' escape character. For the reasons you have expressed, which I think are very fair. Best regards, Juan Manuel ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: [PATCH] Intra-word markup: \relax
From: Max Nikulin @ 2022-02-02 15:42 UTC
To: emacs-orgmode

On 28/01/2022 20:13, Juan Manuel Macías wrote:
> Max Nikulin writes:
>
>> I have an idea how to implement *intra*/word/ markup with minimal
>> change of Org syntax. At first I had a hope that it is enough to
>> introduce \relax entity that expands to empty string, but it does not
>> work for second part of words: *intra*\relax{}/word/ is exported to
>> <b>intra</b>/word/.
>> So it is necessary to support consuming spaces after such entity
>> similar to TeX commands:
>> *intra*\relax /word/
>> In Org "a\_ b" already behaves in the same way.
>
> Maxim, I find the idea of \relax entity interesting. The only (minor)
> drawback I find (in normal use, I mean) is the verbosity it adds.

"Relax" is just a name known to TeX users. Certainly another, shorter
word may be used instead. I am just too lazy to look through HTML named
entities and LaTeX commands to avoid conflicts and thus behavior
unexpected to some users.

> In my case, I have already given up on the problem of marks inside words
> :-(. My personal opinion: I think that, unless a completely
> 'revolutionary' solution emerges, it is better to leave the matter as it
> is, and consider this a feature of Org rather than a bug. I suspect that
> a single solution could not satisfy all tastes or all possible
> scenarios, so maybe it would be nice to put a list of solutions
> (including this one and also the zero space thing, and others that have
> arisen or may arise) somewhere (perhaps in the manual?).

A day before, I posted my current summary of why export snippets and
macros do not help with intra-word markup (earlier I expected that they
could); only custom links are a workaround (with some limitations, as
usual):

[RFC] Creole-style / Support for **emphasis**__within__**a word**
Tue, 25 Jan 2022 23:27:50 +0700.
https://list.orgmode.org/ssp8e7$ah2$1@ciao.gmane.io/

But at that moment I forgot about entities. Another topic served as a
reminder, and I spent some time experimenting with them.
