emacs-orgmode@gnu.org archives
* Org-syntax: Intra-word markup
@ 2021-12-02 10:50 Denis Maier
  2021-12-02 11:18 ` Ihor Radchenko
                   ` (2 more replies)
  0 siblings, 3 replies; 72+ messages in thread
From: Denis Maier @ 2021-12-02 10:50 UTC (permalink / raw)
  To: Org Mode List

Hi everyone,

while we're discussing Org syntax anyway, I thought it's time to
bring up another syntax question:
Currently, org syntax doesn't officially seem to support intra-word 
emphasis. Am I missing something?
If the assessment is correct: Is there a reason for this? And, shouldn't 
that be officially added?

Best,
Denis


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier
@ 2021-12-02 11:18 ` Ihor Radchenko
  2021-12-02 11:30   ` Juan Manuel Macías
  2021-12-02 11:58 ` Timothy
  2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin
  2 siblings, 1 reply; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 11:18 UTC (permalink / raw)
  To: Denis Maier; +Cc: Org Mode List

Denis Maier <denismaier@mailbox.org> writes:

> Currently, org syntax doesn't officially seem to support intra-word 
> emphasis. Am I missing something?

intra-*word* works just fine for me.

Best,
Ihor


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 11:18 ` Ihor Radchenko
@ 2021-12-02 11:30   ` Juan Manuel Macías
  2021-12-02 11:36     ` Denis Maier
                       ` (2 more replies)
  0 siblings, 3 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 11:30 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: orgmode, Denis Maier

Hi Denis and Ihor,

Ihor Radchenko writes:

> Denis Maier <denismaier@mailbox.org> writes:
>
>> Currently, org syntax doesn't officially seem to support intra-word 
>> emphasis. Am I missing something?
>
> intra-*word* works just fine for me.
>
> Best,
> Ihor

I think what Denis is referring to is a construction of the type
*intra*word, which, if I'm not mistaken, is not supported and can only
be achieved by inserting a zero width space.

Best regards,

Juan Manuel 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 11:30   ` Juan Manuel Macías
@ 2021-12-02 11:36     ` Denis Maier
  2021-12-02 12:01       ` Ihor Radchenko
  2021-12-02 11:42     ` Marco Wahl
  2021-12-02 12:00     ` Ihor Radchenko
  2 siblings, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 11:36 UTC (permalink / raw)
  To: Juan Manuel Macías, Ihor Radchenko; +Cc: orgmode

Yes, Juan Manuel. That's it.

See for reference: 
https://stackoverflow.com/questions/1218238/how-to-make-part-of-a-word-bold-in-org-mode

Best,
Denis

On 02.12.2021 at 12:30, Juan Manuel Macías wrote:
> Hi Denis and Ihor,
>
> Ihor Radchenko writes:
>
>> Denis Maier <denismaier@mailbox.org> writes:
>>
>>> Currently, org syntax doesn't officially seem to support intra-word
>>> emphasis. Am I missing something?
>> intra-*word* works just fine for me.
>>
>> Best,
>> Ihor
> I think what Denis is referring to is a construction of the type
> *intra*word, which, if I'm not mistaken, is not supported and can only
> be achieved by inserting a zero width space.
>
> Best regards,
>
> Juan Manuel



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 11:30   ` Juan Manuel Macías
  2021-12-02 11:36     ` Denis Maier
@ 2021-12-02 11:42     ` Marco Wahl
  2021-12-02 11:50       ` Denis Maier
  2021-12-02 12:02       ` Ihor Radchenko
  2021-12-02 12:00     ` Ihor Radchenko
  2 siblings, 2 replies; 72+ messages in thread
From: Marco Wahl @ 2021-12-02 11:42 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: orgmode, Ihor Radchenko, Denis Maier

Hi!

>>> Currently, org syntax doesn't officially seem to support intra-word 
>>> emphasis. Am I missing something?
>>
>> intra-*word* works just fine for me.
>>
>> Best,
>> Ihor
>
> I think what Denis is referring to is a construction of the type
> *intra*word, which, if I'm not mistaken, is not supported and can only
> be achieved by inserting a zero width space.

Is there a recommended way to insert a zero width space?

BTW occasionally I use

	(defun mw-insert-zero-width-whitespace ()
	  "Insert a space with zero width."
	  (interactive)
	  (insert ?\x200B))


Thanks and ciao,
-- 
Marco


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 11:42     ` Marco Wahl
@ 2021-12-02 11:50       ` Denis Maier
  2021-12-02 12:10         ` Ihor Radchenko
  2021-12-02 12:02       ` Ihor Radchenko
  1 sibling, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 11:50 UTC (permalink / raw)
  To: Marco Wahl, Juan Manuel Macías; +Cc: orgmode, Ihor Radchenko

On 02.12.2021 at 12:42, Marco Wahl wrote:
> Hi!
> 
>>>> Currently, org syntax doesn't officially seem to support intra-word
>>>> emphasis. Am I missing something?
>>>
>>> intra-*word* works just fine for me.
>>>
>>> Best,
>>> Ihor
>>
>> I think what Denis is referring to is a construction of the type
>> *intra*word, which, if I'm not mistaken, is not supported and can only
>> be achieved by inserting a zero width space.
> 
> Is there a recommended way to insert a zero width space?
> 
> BTW occasionally I use
> 
> 	(defun mw-insert-zero-width-whitespace ()
> 	  "Insert a space with zero width."
> 	  (interactive)
> 	  (insert ?\x200B))
> 
> 
> Thanks and ciao,

Just a further remark: while zero-width-spaces can be used as a 
workaround, they may create problems in some export formats. E.g., they 
will mess up hyphenation in latex. I think I read somewhere that those 
can be removed with hooks or filters, but I think that shouldn't be 
necessary.

Denis



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier
  2021-12-02 11:18 ` Ihor Radchenko
@ 2021-12-02 11:58 ` Timothy
  2021-12-02 12:26   ` Denis Maier
  2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin
  2 siblings, 1 reply; 72+ messages in thread
From: Timothy @ 2021-12-02 11:58 UTC (permalink / raw)
  To: Denis Maier; +Cc: emacs-orgmode

Hi Denis,

> Currently, org syntax doesn’t officially seem to support intra-word emphasis. Am
> I missing something?

I’d describe it as supported via zero-width spaces.

You may be interested in <https://blog.tecosaur.com/tmio/2021-05-31-async.html#easy-zero-width>.

> If the assessment is correct: Is there a reason for this? And, shouldn’t that
> be officially added?

Do you happen to have any ideas on how this could be achieved? I’d rather not
resort to having to do things like `\ast{}' and `\tilde{}' too much.

All the best,
Timothy

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 11:30   ` Juan Manuel Macías
  2021-12-02 11:36     ` Denis Maier
  2021-12-02 11:42     ` Marco Wahl
@ 2021-12-02 12:00     ` Ihor Radchenko
       [not found]       ` <87r1avtdjy.fsf@ucl.ac.uk>
  2021-12-02 12:28       ` Denis Maier
  2 siblings, 2 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:00 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: orgmode, Denis Maier

Juan Manuel Macías <maciaschain@posteo.net> writes:

>> intra-*word* works just fine for me.
>>
> I think what Denis is referring to is a construction of the type
> *intra*word, which, if I'm not mistaken, is not supported and can only
> be achieved by inserting a zero width space.

I see. We had a discussion about emphasis issues in
https://orgmode.org/list/8735nnq73n.fsf@localhost

The conclusion from there is that supporting such scenarios will
introduce various edge cases. We would need to make the emphasis parser
more and more complex inevitably introducing errors.

An alternative may be some kind of "forced" emphasis syntax where Org
does not have to guess about the emphasis using non-transparent rules.
But it's what zero width space is for and it is what we recommend in the
Org manual.

Best,
Ihor


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 11:36     ` Denis Maier
@ 2021-12-02 12:01       ` Ihor Radchenko
  0 siblings, 0 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:01 UTC (permalink / raw)
  To: Denis Maier; +Cc: Juan Manuel Macías, orgmode


Denis Maier <denismaier@mailbox.org> writes:
> Yes, Juan Manuel. That's it.
>
> See for reference: 
> https://stackoverflow.com/questions/1218238/how-to-make-part-of-a-word-bold-in-org-mode

Please, do not use that stackoverflow answer. It is not officially
supported, breaks exporting, and will not work anymore in future Org
versions. 

Best,
Ihor


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 11:42     ` Marco Wahl
  2021-12-02 11:50       ` Denis Maier
@ 2021-12-02 12:02       ` Ihor Radchenko
  1 sibling, 0 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:02 UTC (permalink / raw)
  To: Marco Wahl; +Cc: Juan Manuel Macías, orgmode, Denis Maier

Marco Wahl <marcowahlsoft@gmail.com> writes:

> Is there a recommended way to insert a zero width space?

C-x 8 <RET>


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 11:50       ` Denis Maier
@ 2021-12-02 12:10         ` Ihor Radchenko
  2021-12-02 12:40           ` Denis Maier
  2021-12-02 12:48           ` Max Nikulin
  0 siblings, 2 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:10 UTC (permalink / raw)
  To: Denis Maier; +Cc: Juan Manuel Macías, Marco Wahl, orgmode

Denis Maier <denismaier@mailbox.org> writes:

>
> Just a further remark: while zero-width-spaces can be used as a 
> workaround, they may create problems in some export formats. E.g., they 
> will mess up hyphenation in latex. I think I read somewhere that those 
> can be removed with hooks or filters, but I think that shouldn't be 
> necessary.

Can you create an example of such scenario and post it as a bug?
Probably, we just need to strip all zero-width spaces at the basic ox.el
level.

Best,
Ihor


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 11:58 ` Timothy
@ 2021-12-02 12:26   ` Denis Maier
  2021-12-02 13:07     ` Ihor Radchenko
  0 siblings, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 12:26 UTC (permalink / raw)
  To: Timothy; +Cc: emacs-orgmode

Hi Timothy,

On 02.12.2021 at 12:58, Timothy wrote:
> Hi Denis,
>
>> Currently, org syntax doesn’t officially seem to support intra-word emphasis. Am
>> I missing something?
> I’d describe it as supported via zero-width spaces.
>
> You may be interested in <https://blog.tecosaur.com/tmio/2021-05-31-async.html#easy-zero-width>.
Thanks, that's helpful.
>
>> If the assessment is correct: Is there a reason for this? And, shouldn’t that
>> be officially added?
> Do you happen to have any ideas on how this could be achieved? I’d rather not
> resort to having to do things like `\ast{}' and `\tilde{}' too much.

Well, not really. I just don't understand why /intra/word shouldn't mean 
\emph{intra}word. Pandoc's markdown supports *intra*word, asciidoc 
supports it via unconstrained formatting pairs: 
https://docs.asciidoctor.org/asciidoc/latest/text/#unconstrained; so 
__intra__word.
And, as org syntax is said to be the superior markup language, I thought 
that must be possible ;-)

I understand zero width spaces are the official workaround, but I don't 
really like having invisible characters in my documents. Automatically 
removing all of them on export might also introduce problems. Perhaps 
some have been added on purpose, and not just to help org?

As for suggestions: If just using /intra/word creates ambiguities, what 
about the asciidoc solution? So //intra//word?
In fact, I'd even use raw latex for these things. It's true, they are 
rare enough. So I wouldn't mind an occasional `\emph{}`.

Best,
Denis




^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
       [not found]       ` <87r1avtdjy.fsf@ucl.ac.uk>
@ 2021-12-02 12:27         ` Denis Maier
  2021-12-02 13:06           ` Eric S Fraga
  0 siblings, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 12:27 UTC (permalink / raw)
  To: Org Mode List



Hi Eric,

On 02.12.2021 at 13:08, Eric S Fraga wrote:
> My solution, in these cases, is to fall back to LaTeX using @@latex:...@@
> (and equivalent for HTML, if desired).  Not pretty but I need this so
> seldom that I am happy with the org emphasis support generally.
>
This works if your target is just latex, but not if you have multiple 
targets, right?

Denis


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 12:00     ` Ihor Radchenko
       [not found]       ` <87r1avtdjy.fsf@ucl.ac.uk>
@ 2021-12-02 12:28       ` Denis Maier
  2021-12-02 12:55         ` Ihor Radchenko
  1 sibling, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 12:28 UTC (permalink / raw)
  To: Ihor Radchenko, Juan Manuel Macías; +Cc: orgmode



On 02.12.2021 at 13:00, Ihor Radchenko wrote:
> Juan Manuel Macías <maciaschain@posteo.net> writes:
>
>>> intra-*word* works just fine for me.
>>>
>> I think what Denis is referring to is a construction of the type
>> *intra*word, which, if I'm not mistaken, is not supported and can only
>> be achieved by inserting a zero width space.
> I see. We had a discussion about emphasis issues in
> https://orgmode.org/list/8735nnq73n.fsf@localhost
>
> The conclusion from there is that supporting such scenarios will
> introduce various edge cases. We would need to make the emphasis parser
> more and more complex inevitably introducing errors.
Thanks, I'll try to read that thread in due time.
>
> An alternative may be some kind of "forced" emphasis syntax where Org
> does not have to guess about the emphasis using non-transparent rules.
> But it's what zero width space is for and it is what we recommend in the
> Org manual.
As for the forced syntax. What do you think about the asciidoc solution?

Denis



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 12:10         ` Ihor Radchenko
@ 2021-12-02 12:40           ` Denis Maier
  2021-12-02 12:54             ` Ihor Radchenko
  2021-12-02 12:48           ` Max Nikulin
  1 sibling, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 12:40 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Marco Wahl, Juan Manuel Macías, orgmode

On 02.12.2021 at 13:10, Ihor Radchenko wrote:
> Denis Maier <denismaier@mailbox.org> writes:
>
>> Just a further remark: while zero-width-spaces can be used as a
>> workaround, they may create problems in some export formats. E.g., they
>> will mess up hyphenation in latex. I think I read somewhere that those
>> can be removed with hooks or filters, but I think that shouldn't be
>> necessary.
> Can you create an example of such scenario and post it as a bug?
> Probably, we just need to strip all zero-width spaces at the basic ox.el
> level.
To be clear: That's not an org bug. It's just that latex won't be able 
to hyphenate such a word. If | is a zero width space, the word 
"hyphen|ation" is not the same as "hyphenation".
1. hyphenation
2. hyphen|ation


Best,
Denis

[-- Attachment #2.2: b7OGd2OT4Kkun0eA.png --]
[-- Type: image/png, Size: 4888 bytes --]

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 12:10         ` Ihor Radchenko
  2021-12-02 12:40           ` Denis Maier
@ 2021-12-02 12:48           ` Max Nikulin
  1 sibling, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2021-12-02 12:48 UTC (permalink / raw)
  To: emacs-orgmode

On 02/12/2021 19:10, Ihor Radchenko wrote:
> Denis Maier writes:
> 
>> Just a further remark: while zero-width-spaces can be used as a
>> workaround, they may create problems in some export formats. E.g., they
>> will mess up hyphenation in latex. I think I read somewhere that those
>> can be removed with hooks or filters, but I think that shouldn't be
>> necessary.
> 
> Probably, we just need to strip all zero-width spaces at the basic ox.el
> level.

I think, legitimate cases when zero-width spaces should be preserved in 
a document may exist, so unconditionally stripping them is not a perfect 
solution.

I am afraid, regexps detecting start and end of emphasis are similar to 
a short blanket. They will always fail for some cases, especially since 
verbatim, URLs and similar contexts (that significantly differ from 
prose in respect to punctuation) do not have higher priority for parser.

An extensive test set is required for tuning the heuristics. Failures 
should be reported in such a way that overall quality can be estimated 
before and after a change. Ideally, the format of the file with such 
tests should allow the *same* input data to be used by other tools 
such as org-ruby.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 12:40           ` Denis Maier
@ 2021-12-02 12:54             ` Ihor Radchenko
  2021-12-02 13:14               ` Juan Manuel Macías
  0 siblings, 1 reply; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:54 UTC (permalink / raw)
  To: Denis Maier; +Cc: Juan Manuel Macías, Marco Wahl, orgmode

Denis Maier <denismaier@mailbox.org> writes:

>> Can you create an example of such scenario and post it as a bug?
>> Probably, we just need to strip all zero-width spaces at the basic ox.el
>> level.
> To be clear: That's not an org bug. It's just that latex won't be able 
> to hyphenate such a word. If | is a zero width space, the word 
> "hyphen|ation" is not the same as "hyphenation".
> 1. hyphenation
> 2. hyphen|ation

You are right for your example, but if we force the user to put
*hyphen*|ation to create bold emphasis, it should not be any different
compared to @@latex:\textbf{hyphen}ation@@. Meanwhile the *hyphen*|ation
gets exported as \textbf{hyphen}|ation keeping the zero width space.

Best,
Ihor


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 12:28       ` Denis Maier
@ 2021-12-02 12:55         ` Ihor Radchenko
  0 siblings, 0 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 12:55 UTC (permalink / raw)
  To: Denis Maier; +Cc: Juan Manuel Macías, orgmode

Denis Maier <denismaier@mailbox.org> writes:

>> An alternative may be some kind of "forced" emphasis syntax where Org
>> does not have to guess about the emphasis using non-transparent rules.
>> But it's what zero width space is for and it is what we recommend in the
>> Org manual.
> As for the forced syntax. What do you think about the asciidoc solution?

Can you elaborate?


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 12:27         ` Denis Maier
@ 2021-12-02 13:06           ` Eric S Fraga
  0 siblings, 0 replies; 72+ messages in thread
From: Eric S Fraga @ 2021-12-02 13:06 UTC (permalink / raw)
  To: Denis Maier; +Cc: Org Mode List

On Thursday,  2 Dec 2021 at 13:27, Denis Maier wrote:
> This works if your target is just latex, but not if you have multiple
> targets, right?

Multiple targets are possible:

@@latex:\textbf{@@@@html:<strong>@@intra@@latex:}@@@@html:</strong>@@word.

Just very ugly! 🤣

Of course, if you do this more than once, a macro can help...

-- 
: Eric S Fraga, with org release_9.5.1-231-g6766c4 in Emacs 29.0.50
: Latest paper written in org: https://arxiv.org/abs/2106.05096
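
The multi-target snippet above follows a fixed pattern, so it can be generated rather than typed by hand. The following is only a sketch in Python of what such a helper (or an equivalent Org macro) would expand to; the function name `intra_word_bold` is hypothetical and not part of Org.

```python
def intra_word_bold(emphasized: str, rest: str) -> str:
    """Build export snippets that make only the first part of a word bold.

    Hypothetical helper: it reproduces the LaTeX + HTML snippet pattern
    shown in the message above and nothing more.
    """
    return (
        "@@latex:\\textbf{@@"      # open bold for the LaTeX backend
        "@@html:<strong>@@"        # open bold for the HTML backend
        + emphasized
        + "@@latex:}@@"            # close bold for LaTeX
        "@@html:</strong>@@"       # close bold for HTML
        + rest
    )


if __name__ == "__main__":
    print(intra_word_bold("intra", "word"))
```

Each further export target would need its own pair of snippets, which is the main drawback of this approach.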


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 12:26   ` Denis Maier
@ 2021-12-02 13:07     ` Ihor Radchenko
  2021-12-02 15:51       ` Max Nikulin
  2021-12-02 19:03       ` Nicolas Goaziou
  0 siblings, 2 replies; 72+ messages in thread
From: Ihor Radchenko @ 2021-12-02 13:07 UTC (permalink / raw)
  To: Denis Maier; +Cc: emacs-orgmode, Nicolas Goaziou, Timothy

Denis Maier <denismaier@mailbox.org> writes:

> As for suggestions: If just using /intra/word creates ambiguities, what 
> about the asciidoc solution? So //intra//word?

I do like this idea.

Though I would also like to hear Nicolas' opinion.

Best,
Ihor


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 12:54             ` Ihor Radchenko
@ 2021-12-02 13:14               ` Juan Manuel Macías
  2021-12-02 13:28                 ` Denis Maier
  0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 13:14 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: orgmode, denismaier

Ihor Radchenko writes:

> Denis Maier <denismaier@mailbox.org> writes:
>
>>> Can you create an example of such scenario and post it as a bug?
>>> Probably, we just need to strip all zero-width spaces at the basic ox.el
>>> level.
>> To be clear: That's not an org bug. It's just that latex won't be able 
>> to hyphenate such a word. If | is a zero width space, the word 
>> "hyphen|ation" is not the same as "hyphenation".
>> 1. hyphenation
>> 2. hyphen|ation
>
> You are right for your example, but if we force the user to put
> *hyphen*|ation to create bold emphasis, it should not be any different
> compared to @@latex:\textbf{hyphen}ation@@. Meanwhile the *hyphen*|ation
> gets exported as \textbf{hyphen}|ation keeping the zero width space.

I would say that they are very random cases, and therefore difficult to
reproduce. In the 'hyphenation' example, if we load the package
showhyphens, you see that:

/hyphen/​ation (with zero width sp)

and

\emph{hyphen}ation

are hyphenated in the same way, but differently from

hyphenation (without emphasis)

(compiled with LuaTeX).

Anyway, I have come across some curious cases. For example, a long time
ago I had defined a macro for text in other languages:

#+MACRO: lg (eval (if (org-export-derived-backend-p org-export-current-backend 'latex) (concat "@@latex:\\foreignlanguage{@@" $1 "@@latex:}{@@" "\u200B" $2 "\u200B" "@@latex:}@@") $2))

I needed to add before and after a zero width space, but doing so, the
shape of the text was altered. That can be reproduced with this example:

#+LaTeX_Header: \usepackage{showhyphens}
#+LaTeX_Header:\usepackage{lipsum,multicol}
#+LaTeX_Header:\usepackage[spanish]{babel}
#+LaTeX_Header: \def\example{\lipsum[1]}
#+LaTeX_Header: \def\zwsp{\char"200B{}}
#+OPTIONS: toc:nil

@@latex:\begin{multicols}{2}@@
@@latex:\foreignlanguage{italian}{\zwsp\example\zwsp}@@
@@latex:\foreignlanguage{italian}​{\example}@@
@@latex:\end{multicols}@@

Best regards,

Juan Manuel 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 13:14               ` Juan Manuel Macías
@ 2021-12-02 13:28                 ` Denis Maier
  0 siblings, 0 replies; 72+ messages in thread
From: Denis Maier @ 2021-12-02 13:28 UTC (permalink / raw)
  To: Juan Manuel Macías, Ihor Radchenko; +Cc: orgmode

On 02.12.2021 at 14:14, Juan Manuel Macías wrote:
> Ihor Radchenko writes:
>
>> Denis Maier <denismaier@mailbox.org> writes:
>>
>>>> Can you create an example of such scenario and post it as a bug?
>>>> Probably, we just need to strip all zero-width spaces at the basic ox.el
>>>> level.
>>> To be clear: That's not an org bug. It's just that latex won't be able
>>> to hyphenate such a word. If | is a zero width space, the word
>>> "hyphen|ation" is not the same as "hyphenation".
>>> 1. hyphenation
>>> 2. hyphen|ation
>> You are right for your example, but if we force the user to put
>> *hyphen*|ation to create bold emphasis, it should not be any different
>> compared to @@latex:\textbf{hyphen}ation@@. Meanwhile the *hyphen*|ation
>> gets exported as \textbf{hyphen}|ation keeping the zero width space.
> I would say that they are very random cases, and therefore difficult to
> reproduce. In the 'hyphenation' example, if we load the package
> showhyphens, you see that:
>
> /hyphen/​ation (with zero width sp)
>
> and
>
> \emph{hyphen}ation
>
> they are cut in the same way. But differently from
>
> hyphenation (without emphasis)
>
> (compiled with LuaTeX).
>
> Anyway, I have come across some curious cases. For example, a long time
> ago I had defined a macro for text in other languages:
>
> #+MACRO: lg (eval (if (org-export-derived-backend-p org-export-current-backend 'latex) (concat "@@latex:\\foreignlanguage{@@" $1 "@@latex:}{@@" "\u200B" $2 "\u200B" "@@latex:}@@") $2))
>
> I needed to add before and after a zero width space, but doing so, the
> shape of the text was altered. That can be reproduced with this example:
>
> #+LaTeX_Header: \usepackage{showhyphens}
> #+LaTeX_Header:\usepackage{lipsum,multicol}
> #+LaTeX_Header:\usepackage[spanish]{babel}
> #+LaTeX_Header: \def\example{\lipsum[1]}
> #+LaTeX_Header: \def\zwsp{\char"200B{}}
> #+OPTIONS: toc:nil
>
> @@latex:\begin{multicols}{2}@@
> @@latex:\foreignlanguage{italian}{\zwsp\example\zwsp}@@
> @@latex:\foreignlanguage{italian}​{\example}@@
> @@latex:\end{multicols}@@
>
> Best regards,
>
> Juan Manuel

Thanks Juan Manuel. I should have tried that first. Hyphenation is the 
same for both /hyphen/​ation (with zero width sp) and 
\emph{hyphen}ation. (Maybe I can nudge Hans Hagen to add some low level 
trickery in context that removes the groups before doing the 
hyphenation... but that's a different story.) Anyway, as Juan Manuel 
shows there can be cases where zero width spaces cause problems.

Denis


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
@ 2021-12-02 13:36 autofrettage
  2021-12-02 15:24 ` Robert Pluim
  0 siblings, 1 reply; 72+ messages in thread
From: autofrettage @ 2021-12-02 13:36 UTC (permalink / raw)
  To: emacs-orgmode@gnu.org

Someone brought up edge and corner cases, so I simply have to mention the German gender stars ("Gendersternchen").

In an effort to make German gender neutral, some individuals use '*' in the midst of some words, e.g. rower.
Ordinary German:
male rower = Ruderer
female rower = Ruderin

Gender neutral German with gender star:
any kind of rower = Ruder*in

Yours
Rasmus


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 13:36 Org-syntax: Intra-word markup autofrettage
@ 2021-12-02 15:24 ` Robert Pluim
  2021-12-02 17:11   ` autofrettage
  0 siblings, 1 reply; 72+ messages in thread
From: Robert Pluim @ 2021-12-02 15:24 UTC (permalink / raw)
  To: autofrettage; +Cc: emacs-orgmode@gnu.org

>>>>> On Thu, 02 Dec 2021 13:36:48 +0000, autofrettage <autofrettage@protonmail.ch> said:

    autofrettage> Someone brought up edge and corner cases, so I simply have to mention the German gender stars ("Gendersternchen").
    autofrettage> In an effort to make German gender neutral, some individuals use '*' in the midst of some words, e.g. rower.
    autofrettage> Ordinary German:
    autofrettage> male rower = Ruderer
    autofrettage> female rower = Ruderin

    autofrettage> Gender neutral German with gender star:
    autofrettage> any kind of rower = Ruder*in

But with the 'female' suffix? Thatʼs almost as bad as 'écriture
inclusive'. Surely 'Ruder**'? 😇

Robert
-- 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 13:07     ` Ihor Radchenko
@ 2021-12-02 15:51       ` Max Nikulin
  2021-12-02 18:11         ` Tom Gillespie
  2021-12-02 19:03       ` Nicolas Goaziou
  1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-02 15:51 UTC (permalink / raw)
  To: emacs-orgmode

On 02/12/2021 20:07, Ihor Radchenko wrote:
> 
>> As for suggestions: If just using /intra/word creates ambiguities, what
>> about the asciidoc solution? So //intra//word?
> 
> I do like this idea.

- Some //text <https://orgmode.org/> surprise//
- ++another ~i++~ problem++

First wins...



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 15:24 ` Robert Pluim
@ 2021-12-02 17:11   ` autofrettage
  0 siblings, 0 replies; 72+ messages in thread
From: autofrettage @ 2021-12-02 17:11 UTC (permalink / raw)
  To: Robert Pluim; +Cc: emacs-orgmode@gnu.org



On Thursday, December 2nd, 2021 at 4:24 PM, Robert Pluim <rpluim@gmail.com> wrote:

>>     autofrettage> any kind of rower = Ruder*in
>
> But with the 'female' suffix? Thatʼs almost as bad as 'écriture
> inclusive'. Surely 'Ruder**'? 😇

The German wikipedia page* about gender neutral language is well
over 30 k words long, and there are almost 250 bibliographic
references. It lists a number of alternatives, such as (based
on Lehrer and Lehrerin, the German words for teacher):

+ Lehrx
+ Lehry
+ Lehrerin
+ Lehrer/-in
+ Lehrer/in
+ LehrerIn
+ Lehrer(in)
+ Lehrer:in
+ Lehrer*in
+ Lehrer_in
+ Lehrer_In
+ Lehrer•in
+ Lehrkraft
+ Lehrperson
+ Lehrende
+ ...

So, by all means, join the party. They will consider all aspects
of your suggestion, and being dead serious about it.

Yours
Rasmus

* https://de.wikipedia.org/wiki/Geschlechtergerechte_Sprache

p.s. There are even browser plug-ins, removing all of this
political correctness, making texts _much_ easier to read.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 15:51       ` Max Nikulin
@ 2021-12-02 18:11         ` Tom Gillespie
  2021-12-02 19:09           ` Juan Manuel Macías
                             ` (3 more replies)
  0 siblings, 4 replies; 72+ messages in thread
From: Tom Gillespie @ 2021-12-02 18:11 UTC (permalink / raw)
  To: emacs-orgmode

I don't mean to be a wet blanket, but the edge cases for
the current markup syntax are already hard enough to
implement correctly, to the point where different parts of
Org mode are inconsistent. Intra-word markup isn't viable
because there simply isn't any sane way to parse something
like *hello world*/hrm/oh no*. The other issue is that this will
degrade parsing performance because almost every
character could precede the start of a markup section.

I recommend anyone suggesting solutions try to implement
something that can parse the markup unambiguously with
lots of nasty test cases. You will likely find that it is impossible
to consistently tokenize markup, and that you have to hand
write a whole bunch of heuristics, making Org syntax even
harder to implement correctly.

Any solution that suggests extending how =/*~+_  can be
used gets a hard no from me. I could see teaching other
exporters how to interpret \emph{hello}world, but trying
to have any sane behavior for something like
why *hello*world oh no a wild asterisk*
is not worth it.

Best,
Tom
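
The ambiguity described above can be made concrete with a small experiment. The sketch below is in Python and is not Org's parser: the `PRE` and `POST` sets are simplified assumptions standing in for Org's real border rules, and `candidate_spans` is a hypothetical name. It lists every pair of markers that could delimit emphasis, once with border constraints and once without.

```python
# Simplified stand-ins for the characters Org allows before an opening
# marker and after a closing marker (assumption, not the exact Org sets).
PRE = set(" \t-({'\"")
POST = set(" \t-.,;:!?')}[\"")


def candidate_spans(text, marker="*", constrained=True):
    """Return every marker-delimited body that could be read as emphasis."""
    positions = [i for i, ch in enumerate(text) if ch == marker]
    spans = []
    for a in positions:
        for b in positions:
            if b <= a + 1:
                continue  # the closer must come after a non-empty body
            body = text[a + 1:b]
            if body[0].isspace() or body[-1].isspace():
                continue  # the body may not start or end with whitespace
            if constrained:
                if a > 0 and text[a - 1] not in PRE:
                    continue  # opener needs start of text or a PRE char before it
                if b + 1 < len(text) and text[b + 1] not in POST:
                    continue  # closer needs end of text or a POST char after it
            spans.append(body)
    return spans


SAMPLE = "why *hello*world oh no a wild asterisk*"

if __name__ == "__main__":
    print(candidate_spans(SAMPLE, constrained=True))
    print(candidate_spans(SAMPLE, constrained=False))
```

With the border constraints there is a single candidate, the whole span from the first to the last asterisk. Without them there are three overlapping candidates, and a parser has to pick one by some extra heuristic.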


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 13:07     ` Ihor Radchenko
  2021-12-02 15:51       ` Max Nikulin
@ 2021-12-02 19:03       ` Nicolas Goaziou
  2021-12-02 19:34         ` Juan Manuel Macías
  2021-12-03 14:24         ` Max Nikulin
  1 sibling, 2 replies; 72+ messages in thread
From: Nicolas Goaziou @ 2021-12-02 19:03 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Timothy, emacs-orgmode, Denis Maier

Hello,

Ihor Radchenko <yantar92@gmail.com> writes:

> Denis Maier <denismaier@mailbox.org> writes:
>
>> As for suggestions: If just using /intra/word creates ambiguities, what 
>> about the asciidoc solution? So //intra//word?
>
> I do like this idea.
>
> Though I would also like to hear Nicolas' opinion.

I sympathize with the idea of intra-word emphasis, but the syntax above is
going to cause some ambiguous situations.

I do think the marker + zero-width space is one way to go. We could, as
an improvement, consider zero-width spaces around emphasis markers to be
part of the markup, and drop them along with the markers during export.

Another solution is to introduce a less-subtle, but less prone to
ambiguity, syntax, e.g.,

                  /{bold}/markup   or   /|bold|/markup

where /{ }/  or /|  |/ become "extended" markers.

I find the zero-width space solution much more elegant. It also doesn't
change current syntax, which is a big advantage.

Regards,
-- 
Nicolas Goaziou


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 18:11         ` Tom Gillespie
@ 2021-12-02 19:09           ` Juan Manuel Macías
  2021-12-04 13:07             ` Org-syntax: emphasis and not English punctuation Max Nikulin
  2021-12-02 20:47           ` Org-syntax: Intra-word markup Denis Maier
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 19:09 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: orgmode

Tom Gillespie writes:

> I don't mean to be a wet blanket, but the edge cases for
> the current markup syntax are already hard enough to
> implement correctly, to the point where different parts of
> Org mode are inconsistent. Intra-word markup isn't viable
> because there simply isn't any sane way to parse something
> like *hello world*/hrm/oh no*. The other issue is that this will
> degrade parsing performance because almost every
> character could precede the start of a markup section.
>
> I recommend anyone suggesting solutions try to implement
> something that can parse the markup unambiguously with
> lots of nasty test cases. You will likely find that it is impossible
> to consistently tokenize markup, and that you have to hand
> write a whole bunch of heuristics, making Org syntax even
> harder to implement correctly.
>
> Any solution that suggests extending how =/*~+_  can be
> used gets a hard no from me. I could see teaching other
> exporters how to interpret \emph{hello}world, but trying
> to have any sane behavior for something like
> why *hello*world oh no a wild asterisk*
> is not worth it.

I believe that emphasis marks are a part of Org that can be very
shocking to new users. I mean, there is a series of behaviors that seem
obvious and trivial in the emphasized text, but that in Org are not
possible out of the box, unless you configure
`org-emphasis-regexp-components'. Three quick examples. This in Org is
not possible out of the box:

#+begin_example
[/emphasis/]
¡/emphasis/!
¿/Emphasis/?
#+end_example
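
A rough Python sketch of why these fail: the character sets below are
assumed to mirror the default PRE and POST entries of
`org-emphasis-regexp-components`, and the single regular expression is
a simplification of what Org actually builds. The name `italic_spans`
is hypothetical.

```python
import re

# Assumed defaults: characters allowed before the opening marker and
# after the closing marker, in addition to whitespace.
PRE_CLASS = "[" + r"\s" + re.escape("-('\"{") + "]"
POST_CLASS = "[" + r"\s" + re.escape("-.,:!?;'\")}[") + "]"

ITALIC = re.compile(
    "(?:^|(?<=" + PRE_CLASS + "))"   # line start or an allowed PRE char
    + r"/(\S|\S.*?\S)/"              # body may not start or end with space
    + "(?=" + POST_CLASS + "|$)"     # allowed POST char or line end
)


def italic_spans(text):
    """Return the bodies of all /italic/ spans this sketch recognizes."""
    return [m.group(1) for m in ITALIC.finditer(text)]


for sample in ["an /emphasis/ here", "(/emphasis/)",
               "[/emphasis/]", "¡/emphasis/!", "¿/Emphasis/?"]:
    print(repr(sample), italic_spans(sample))
```

With these sets, the parenthesized form is recognized, but "[", "¡"
and "¿" are not allowed before the opening marker, so the three
examples above yield nothing.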

Nor is it possible ---out of the box--- to extend emphasis beyond a
certain number of lines. New users who come from other forms of markup
maybe expect the obvious to be something like:

some-text begin-emphasis whatever-is-in-between end-emphasis more-text

Over time one ends up seeing these things more as a feature than as a
bug :-) But those little inconsistencies make the Org syntax a bit ugly,
IMHO. I can't think of how to improve that, though.

Best regards,

Juan Manuel 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 19:03       ` Nicolas Goaziou
@ 2021-12-02 19:34         ` Juan Manuel Macías
  2021-12-02 23:05           ` Nicolas Goaziou
  2021-12-03 14:24         ` Max Nikulin
  1 sibling, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 19:34 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Denis Maier, emacs-orgmode, Ihor Radchenko, Timothy

Hi Nicolas and all,

Nicolas Goaziou writes:

> I find the zero-width space solution much more elegant. It also doesn't
> change current syntax, which is a big advantage.

I agree that zero width spaces work fine as a solution, but I think they
should not be understood as part of the syntax but as a one-off
(temporary?) remedy for certain scenarios. As mentioned before, in LaTeX
zero width spaces can produce unexpected effects and modify the final
form of the text (at least in luatex). I also don't know if it would be
useful to remove all zero width spaces in the export process, because in
some cases the user may want to keep them, as I think Maxim commented in
a previous message.

As for the solution of using complementary marks ("//...//", etc.), I
think it would undermine consistency, as those marks would only be to
fix exceptions.

It's a tricky subject...

Best regards,

Juan Manuel 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 18:11         ` Tom Gillespie
  2021-12-02 19:09           ` Juan Manuel Macías
@ 2021-12-02 20:47           ` Denis Maier
  2021-12-02 22:44             ` Samuel Wales
  2021-12-03 14:53           ` Max Nikulin
  2021-12-03 23:51           ` Tim Cross
  3 siblings, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-02 20:47 UTC (permalink / raw)
  To: Tom Gillespie, emacs-orgmode

On 02.12.2021 at 19:11, Tom Gillespie wrote:
> I don't mean to be a wet blanket, but the edge cases for
> the current markup syntax are already hard enough to
> implement correctly, to the point where different parts of
> Org mode are inconsistent. Intra-word markup isn't viable
> because there simply isn't any sane way to parse something
> like *hello world*/hrm/oh no*. The other issue is that this will
> degrade parsing performance because almost every
> character could precede the start of a markup section.
> 
> I recommend anyone suggesting solutions try to implement
> something that can parse the markup unambiguously with
> lots of nasty test cases. You will likely find that it is impossible
> to consistently tokenize markup, and that you have to hand
> write a whole bunch of heuristics, making Org syntax even
> harder to implement correctly.
> 
> Any solution that suggests extending how =/*~+_  can be
> used gets a hard no from me. I could see teaching other
> exporters how to interpret \emph{hello}world, but trying
> to have any sane behavior for something like
> why *hello*world oh no a wild asterisk*
> is not worth it.

As I've said before, I could well live with \emph{what}ever or something 
similar.

Denis


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 20:47           ` Org-syntax: Intra-word markup Denis Maier
@ 2021-12-02 22:44             ` Samuel Wales
  0 siblings, 0 replies; 72+ messages in thread
From: Samuel Wales @ 2021-12-02 22:44 UTC (permalink / raw)
  To: Denis Maier; +Cc: Tom Gillespie, emacs-orgmode

a silly question.  don't we already use something kinda similar to
\emph{what}ever for all backends?  could we do so?

On 12/2/21, Denis Maier <denismaier@mailbox.org> wrote:
> On 02.12.2021 at 19:11, Tom Gillespie wrote:
>> I don't mean to be a wet blanket, but the edge cases for
>> the current markup syntax are already hard enough to
>> implement correctly, to the point where different parts of
>> Org mode are inconsistent. Intra-word markup isn't viable
>> because there simply isn't any sane way to parse something
>> like *hello world*/hrm/oh no*. The other issue is that this will
>> degrade parsing performance because almost every
>> character could precede the start of a markup section.
>>
>> I recommend anyone suggesting solutions try to implement
>> something that can parse the markup unambiguously with
>> lots of nasty test cases. You will likely find that it is impossible
>> to consistently tokenize markup, and that you have to hand
>> write a whole bunch of heuristics, making Org syntax even
>> harder to implement correctly.
>>
>> Any solution that suggests extending how =/*~+_  can be
>> used gets a hard no from me. I could see teaching other
>> exporters how to interpret \emph{hello}world, but trying
>> to have any sane behavior for something like
>> why *hello*world oh no a wild asterisk*
>> is not worth it.
>
> As I've said before, I could well live with \emph{what}ever or something
> similar.
>
> Denis
>
>


-- 
The Kafka Pandemic

Please learn what misopathy is.
https://thekafkapandemic.blogspot.com/2013/10/why-some-diseases-are-wronged.html


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 19:34         ` Juan Manuel Macías
@ 2021-12-02 23:05           ` Nicolas Goaziou
  2021-12-02 23:24             ` Juan Manuel Macías
  0 siblings, 1 reply; 72+ messages in thread
From: Nicolas Goaziou @ 2021-12-02 23:05 UTC (permalink / raw)
  To: Juan Manuel Macías
  Cc: Timothy, emacs-orgmode, Ihor Radchenko, Denis Maier

Hello,

Juan Manuel Macías <maciaschain@posteo.net> writes:

> I agree that zero width spaces work fine as a solution, but I think they
> should not be understood as part of the syntax but as a punctual
> (temporal?) remedy to certain scenarios. As mentioned before, in LaTeX
> zero width spaces can produce unexpected effects and modify the final
> form of the text (at least in luatex). I also don't know if it would be
> useful to remove all zero width spaces in the export process, because in
> some cases the user may want to keep them, as I think Maxim commented in
> a previous message.

We may be misunderstanding each other. 

I'm suggesting to remove zero-width spaces contiguous to emphasis
markers only. Therefore the LaTeX process would not see them. Other zero
width spaces, e.g., inserted by user, are kept. AFAICT, the two last
points you mention are not relevant to my proposal.

Besides, they are already part of the syntax, in some way. So that ship has
sailed long ago.
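
As a rough illustration of the proposal (a Python sketch, not the
exporter's code), the filter below drops a zero-width space only when
it sits directly outside an emphasis marker. The function name
`strip_marker_zwsp` is hypothetical, and the sketch does not check
that the markers actually form a valid span.

```python
import re

ZWSP = "\u200b"
MARK = r"[*/_=~+]"

# A zero-width space right before an opening marker, or right after a
# closing marker. Zero-width spaces elsewhere are left alone.
ADJACENT_ZWSP = re.compile(
    rf"{ZWSP}(?={MARK}\S)|(?<=\S{MARK}){ZWSP}"
)


def strip_marker_zwsp(text):
    """Remove zero-width spaces contiguous to emphasis markers only."""
    return ADJACENT_ZWSP.sub("", text)


source = "un\u200b/believ/\u200bable a\u200bb"
print(repr(strip_marker_zwsp(source)))
```

The two zero-width spaces around /believ/ are removed, while the one
the user typed between "a" and "b" is kept.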

Regards,
-- 
Nicolas Goaziou


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 23:05           ` Nicolas Goaziou
@ 2021-12-02 23:24             ` Juan Manuel Macías
  0 siblings, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-02 23:24 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: orgmode

Nicolas Goaziou writes:

> I'm suggesting to remove zero-width spaces contiguous to emphasis
> markers only. Therefore the LaTeX process would not see them. Other zero
> width spaces, e.g., inserted by user, are kept. AFAICT, the two last
> points you mention are not relevant to my proposal.
>
> Besides, they are already part of the syntax, in some way. So that ship has
> sailed long ago.

I understand that it is too late to change certain things, but that is
not an impediment for me to continue to think that using the character
U+200B as a part (at least /de facto/) of the syntax is still shocking
and weird.

On the other hand, what was expected in Org would have been to have the
emphasis marks and at the same time have a universal escape character
for those emphasis marks. In the same way as I can write in markdown:
*foo* AND \*foo\*. In Org we have the emphasis marks but not the escape
character. That was probably the cause of many issues that are being
discussed here. But that means also entering the realm of assumptions.
Still, I wanted to leave an opinion on this question in particular.

Best regards,

Juan Manuel




^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 19:03       ` Nicolas Goaziou
  2021-12-02 19:34         ` Juan Manuel Macías
@ 2021-12-03 14:24         ` Max Nikulin
  2021-12-03 15:01           ` Juan Manuel Macías
  2021-12-04 15:57           ` Denis Maier
  1 sibling, 2 replies; 72+ messages in thread
From: Max Nikulin @ 2021-12-03 14:24 UTC (permalink / raw)
  To: emacs-orgmode

On 03/12/2021 02:03, Nicolas Goaziou wrote:
>> Denis Maier writes:
>>
>>> As for suggestions: If just using /intra/word creates ambiguities, what
>>> about the asciidoc solution? So //intra//word?
> 
> I sympathize with the idea of intra-word emphasis, but the syntax above is
> going to cause some ambiguous situations.

I suppose, some more general solution is required.

> I do think the marker + zero-width space is one way to go. We could, as
> an improvement, consider zero-width spaces around emphasis markers to be
> part of the markup, and replace them along during export.

Zero-width space characters adjacent to emphasis markers are a better idea than 
replacing any zero-width space. However, I agree with Juan Manuel that white 
space characters, especially completely invisible (I am not Eli who sees 
such special characters by moving cursor through them) should not be 
overloaded. From my point of view, it is acceptable to use zero width 
spaces as a workaround but they should not become official part of Org 
syntax.

> Another solution is to introduce a less-subtle, but less prone to
> ambiguity, syntax, e.g.,
> 
>                    /{bold}/markup   or   /|bold|/markup
> 
> where /{ }/  or /|  |/ become "extended" markers.

More explicit markup leaves less room for ambiguities, and I like the 
idea for this reason. On the other hand, it diverges from the principle of 
lightweight markup. Almost the only special character in TeX is "\"; 
HTML has three, "&<>", with simple escape rules. Org uses many 
special characters to avoid verbosity and requires some tricks to escape 
them. Markers like "\{" make Org more verbose but do not make it more 
strict; a lot of things still rely on heuristics.

I have an idea of what can be done when some special markup is required 
that does not fit into the current syntax. Unfortunately, some new constructs 
would have to be introduced anyway: inline objects and multiline elements that 
represent a simplified result of parsed Org structures:

     ((italic "intra") "word")

wrapped with some markup. It should satisfy any special needs (and should 
even allow creating invalid, impossible constructs). Maybe the idea of 
combining lightweight markup and low-level blocks is better suited to 
some other project with a more expressive internal representation. In Org 
it may become the most hated feature.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 18:11         ` Tom Gillespie
  2021-12-02 19:09           ` Juan Manuel Macías
  2021-12-02 20:47           ` Org-syntax: Intra-word markup Denis Maier
@ 2021-12-03 14:53           ` Max Nikulin
  2021-12-03 23:51           ` Tim Cross
  3 siblings, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2021-12-03 14:53 UTC (permalink / raw)
  To: emacs-orgmode

On 03/12/2021 01:11, Tom Gillespie wrote:
> 
> I recommend anyone suggesting solutions try to implement
> something that can parse the markup unambiguously with
> lots of nasty test cases. You will likely find that it is impossible
> to consistently tokenize markup, and that you have to hand
> write a whole bunch of heuristics, making Org syntax even
> harder to implement correctly.

Tom, I see and share your point; however, sometimes more specific and 
convincing arguments are necessary.

Why does unconstrained markup ("//") not cause problems in asciidoc? 
Maybe it does, but they are not immediately obvious. I don't know since I 
have never used asciidoc. Maybe its parser behaves in a different way than 
org-element. Maybe plain text links are not allowed at all. Almost any 
URL contains such pair of markers: https://orgmode.org/, so it should be 
addressed somehow.

Examples of corner cases that are used for tests should be more visible 
to users otherwise it is hard to use such samples in discussions. They 
should be annotated (arbitrary examples from recent discussions):

- input: [[https://first/-/url/][pre]] text [[https://second-url/?][post]]
   parsed: (
     (link :target "https://first/-/url/" :description "pre")
     " text "
     (link :target "https://second-url/?" :description "post"))
   comment: "Regexp-based syntax highlighting falsely finds italic text 
because slashes in URLs resemble the start and end of italics"

- input: A _b =c_ d= e_ f
   parsed: (
     "A "
     (underline "b =c")
     " d= e_ f")
   comment: "Users of markdown may falsely expect that c_ is protected 
by verbatim markers and underlined text is ended at e_"



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-03 14:24         ` Max Nikulin
@ 2021-12-03 15:01           ` Juan Manuel Macías
  2021-12-04 15:57           ` Denis Maier
  1 sibling, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-03 15:01 UTC (permalink / raw)
  To: Max Nikulin; +Cc: orgmode

Hi Maxim,

Max Nikulin writes:

> More explicit markup leaves less room for ambiguities, and I like the
> idea due to this reason. On the other hand it diverges from principle
> of lightweight markup. The almost only special character in TeX is
> "\", HTML has three ones "&<>" with simple escape rules. Org uses many 
> special characters to avoid verbosity and requires some tricks to
> escape them. Markers like "\{" make Org more verbose but do not make
> it more strict, a lot of things still rely on heuristics.

Excellent explanation. Thanks for the clarification. 

> I have an idea what can be done when some special markup is required
> that is not fit into current syntax. Unfortunately some new constructs 
> should be introduced anyway: inline objects and multiline elements
> that represent simplified result of parsed Org structures:
>
>     ((italic "intra") "word")
>
> wrapped with some markup. It should satisfy any special needs (and
> even should allow to create invalid impossible constructs). Maybe idea
> of combination of lightweight markup and low-level blocks better suits
> for some other project with more expressive internal representation.
> In Org it may become the most hated feature.

I really would like a solution in this direction. In LaTeX there is a
command called \protect (which has nothing to do with this topic and is
used for other things, but I like the 'protection' concept); we could
perhaps think of a type of mark to protect the 'usual' marks when syntax
consistency is compromised in some way by the context. Maybe something
like enclosing the normal marks between two double single quotes ''...''
---or a single set of single quotes before the leading marker--- as I
proposed in another thread:

#+begin_example
''*protected emphasis*''
#+end_example

Best regards,

Juan Manuel 



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-02 18:11         ` Tom Gillespie
                             ` (2 preceding siblings ...)
  2021-12-03 14:53           ` Max Nikulin
@ 2021-12-03 23:51           ` Tim Cross
  2021-12-04 15:01             ` Max Nikulin
  2021-12-05 23:37             ` Russell Adams
  3 siblings, 2 replies; 72+ messages in thread
From: Tim Cross @ 2021-12-03 23:51 UTC (permalink / raw)
  To: emacs-orgmode


Tom Gillespie <tgbugs@gmail.com> writes:

> I don't mean to be a wet blanket, but the edge cases for
> the current markup syntax are already hard enough to
> implement correctly, to the point where different parts of
> Org mode are inconsistent. Intra-word markup isn't viable
> because there simply isn't any sane way to parse something
> like *hello world*/hrm/oh no*. The other issue is that this will
> degrade parsing performance because almost every
> character could precede the start of a markup section.
>
> I recommend anyone suggesting solutions try to implement
> something that can parse the markup unambiguously with
> lots of nasty test cases. You will likely find that it is impossible
> to consistently tokenize markup, and that you have to hand
> write a whole bunch of heuristics, making Org syntax even
> harder to implement correctly.
>
> Any solution that suggests extending how =/*~+_  can be
> used gets a hard no from me. I could see teaching other
> exporters how to interpret \emph{hello}world, but trying
> to have any sane behavior for something like
> why *hello*world oh no a wild asterisk*
> is not worth it.
>

+infinity!

Please, please can we stop trying to satisfy every edge case or extend
the markup to satisfy every possible scenario.

Org's big strength is in its simplicity. This comes at a price -
limitations in what can be done. If those limitations are unacceptable,
then use a richer markup format like LaTeX, XML, HTML, etc.

The point about back end exporter support is very relevant. The 'richer'
the markup, the harder it is to get a consistent mapping for back end
exporters. Things quickly become more complex and difficult to maintain.

In 18 years, I've seen requests for inner word markup fewer than 4 times.
This is not a feature we should even be considering adding to the markup
syntax. 

Org provides a light weight markup, not a fully flexible rich markup
designed to meet any need. It makes the easy stuff simple. 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: emphasis and not English punctuation
  2021-12-02 19:09           ` Juan Manuel Macías
@ 2021-12-04 13:07             ` Max Nikulin
  2021-12-04 16:42               ` Juan Manuel Macías
  0 siblings, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-04 13:07 UTC (permalink / raw)
  To: emacs-orgmode

On 03/12/2021 02:09, Juan Manuel Macías wrote:
> 
> I believe, that emphasis marks are a part of Org that can be very
> shocking to new users. I mean, there is a series of behaviors that seem
> obvious and trivial in the emphasized text, but that in Org are not
> possible out of the box, unless you configure
> `org-emphasis-regexp-components'. Three quick examples. This in Org is
> not possible out of the box:
> 
> #+begin_example
> [/emphasis/]
> ¡/emphasis/!
> ¿/Emphasis/?
> #+end_example

Maybe this issue should be considered independently of intra-word emphasis.

The second and third examples look like they should be supported. Ihor 
mentioned treating punctuation in a more general way. It requires a rich 
test set to estimate changes in heuristics. I suspect some problems 
since start and end patterns are not symmetric and I have not found a 
way to specify in a regexp only the punctuation marks that normally appear in 
front of words. Square brackets likely should be excluded somehow as 
well since they are part of Org syntax. I am unsure if it is possible to 
use just regexp without additional checks of candidates.

Ihor Radchenko. [PATCH] Re: c47b535bb origin/main org-element: Remove 
dependency on ‘org-emphasis-regexp-components’
Sun, 21 Nov 2021 17:28:57 +0800.
https://list.orgmode.org/87v90lzwkm.fsf@localhost



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-03 23:51           ` Tim Cross
@ 2021-12-04 15:01             ` Max Nikulin
  2021-12-05 23:34               ` Russell Adams
  2021-12-05 23:37             ` Russell Adams
  1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-04 15:01 UTC (permalink / raw)
  To: emacs-orgmode

On 04/12/2021 06:51, Tim Cross wrote:
> 
> Please, please can we stop trying to satisfy every edge case or extend
> the markup to satisfy every possible scenario.
> 
> Org's big strength is in its simplicity. This comes at a price -
> limitations in what can be done. If those limitations are unacceptable,
> then use a richer markup format like Latex, XML, HTML etc.

It is ridiculous to throw away a nice tool and start to struggle with 
another bunch of problems when a small missing feature is really required.

> The point about back end exporter support is very relevant.

Notice that this particular feature does not require extending the 
underlying intermediate representation. There may be some subtle points, 
but generally export backends are ready for intra-word markup.

> In 18 years, I've seen requests for inner word markup less than 4 times.
> this is not a feature we should even be considering adding to the markup
> syntax.
> 
> Org provides a light weight markup, not a fully flexible rich markup
> designed to meet any need. It makes the easy stuff simple.

Different users wish to have different minor features. It would be great 
to have a way to include a fragment with more verbose markup that allows 
to express special needs unsupported by lightweight markup. I am 
discussing a more general solution, not a syntax extension specifically for 
intra-word markup.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-03 14:24         ` Max Nikulin
  2021-12-03 15:01           ` Juan Manuel Macías
@ 2021-12-04 15:57           ` Denis Maier
  2021-12-04 17:53             ` Tom Gillespie
  1 sibling, 1 reply; 72+ messages in thread
From: Denis Maier @ 2021-12-04 15:57 UTC (permalink / raw)
  To: Max Nikulin, emacs-orgmode

On 03.12.2021 at 15:24, Max Nikulin wrote:
> On 03/12/2021 02:03, Nicolas Goaziou wrote:
>>> Denis Maier writes:
>>>
>>>> As for suggestions: If just using /intra/word creates ambiguities, what
>>>> about the asciidoc solution? So //intra//word?
>>
>> I sympathize with the idea of intra-word emphasis, but the syntax above is
>> going to cause some ambiguous situations.
> 
> I suppose, some more general solution is required.
> 
>> I do think the marker + zero-width space is one way to go. We could, as
>> an improvement, consider zero-width spaces around emphasis markers to be
>> part of the markup, and replace them along during export.
> 
> Zero-space characters adjacent to emphasis markers is a better idea than 
> replacing any zero space. However I agree with Juan Manuel that white 
> space characters, especially completely invisible (I am not Eli who sees 
> such special characters by moving cursor through them) should not be 
> overloaded. From my point of view, it is acceptable to use zero width 
> spaces as a workaround but they should not become official part of Org 
> syntax.
> 
>> Another solution is to introduce a less-subtle, but less prone to
>> ambiguity, syntax, e.g.,
>>
>>                    /{bold}/markup   or   /|bold|/markup
>>
>> where /{ }/  or /|  |/ become "extended" markers.
> 
> More explicit markup leaves less room for ambiguities, and I like the 
> idea due to this reason. On the other hand it diverges from principle of 
> lightweight markup. The almost only special character in TeX is "\", 
> HTML has three ones "&<>" with simple escape rules. Org uses many 
> special characters to avoid verbosity and requires some tricks to escape 
> them. Markers like "\{" make Org more verbose but do not make it more 
> strict, a lot of things still rely on heuristics.
> 
> I have an idea what can be done when some special markup is required 
> that is not fit into current syntax. Unfortunately some new constructs 
> should be introduced anyway: inline objects and multiline elements that 
> represent simplified result of parsed Org structures:
> 
>      ((italic "intra") "word")
> 
> wrapped with some markup. It should satisfy any special needs (and even 
> should allow to create invalid impossible constructs). Maybe idea of 
> combination of lightweight markup and low-level blocks better suits for 
> some other project with more expressive internal representation. In Org 
> it may become the most hated feature.

I have to admit I like this idea. That brings a lot of flexibility to 
accommodate even the most obscure needs, yet it makes the discussion 
about escape characters or new symbols much less pressing. After all, 
most markup languages face the same problem, i.e., special characters 
are limited, and beyond the usual /*_ the meaning of characters becomes 
much less obvious.

This idea reminds me a bit of Scribble/Racket where every document is 
just inverted code, which makes it possible to insert arbitrary Racket 
code in your prose...

Denis

> 
> 
> 



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: emphasis and not English punctuation
  2021-12-04 13:07             ` Org-syntax: emphasis and not English punctuation Max Nikulin
@ 2021-12-04 16:42               ` Juan Manuel Macías
  0 siblings, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-04 16:42 UTC (permalink / raw)
  To: Max Nikulin; +Cc: orgmode

Max Nikulin writes:

> Maybe this issue should be considered independently of intra-word emphasis.

Yes I agree. Apologies for mixing up this topic in the discussion about
intra-word emphasis...

> Second and third examples looks like they should be supported. Ihor
> mentioned treating punctuation in a more general way. It requires rich 
> test set to estimate changes in heuristics. I suspect some problems
> since start and end patterns are not symmetric and I have not found a 
> way to specify in regexp only punctuation marks that normally appears
> in front of words. Square brackets likely should be excluded somehow
> as well since they are part of Org syntax. I am unsure if it is
> possible to use just regexp without additional checks of candidates.

Ihor's idea seems interesting to me, although I understand the possible
problems you mention. By the way, I'm afraid initial inverted
punctuation marks (¡¿) are only used in Castilian Spanish and other languages
of Spain, such as Galician or Asturian, due to the Castilian influence
(we go backwards from the rest of the world ;-):
https://en.wikipedia.org/wiki/Inverted_question_and_exclamation_marks

> Ihor Radchenko. [PATCH] Re: c47b535bb origin/main org-element: Remove
> dependency on ‘org-emphasis-regexp-components’
> Sun, 21 Nov 2021 17:28:57 +0800.
> https://list.orgmode.org/87v90lzwkm.fsf@localhost

I see. I believe it's a sensible decision to get rid of the dependency
on org-emphasis-regexp-components. I understand that now everything
related to the structure of emphases is the responsibility of org-element?

Best regards,

Juan Manuel 


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-04 15:57           ` Denis Maier
@ 2021-12-04 17:53             ` Tom Gillespie
  2021-12-04 18:37               ` John Kitchin
                                 ` (2 more replies)
  0 siblings, 3 replies; 72+ messages in thread
From: Tom Gillespie @ 2021-12-04 17:53 UTC (permalink / raw)
  To: emacs-orgmode
  Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, Denis Maier

Hi all,
    After a bunch of rambling (see below if interested), I think I have
a solution that should work for everyone. The key realization is that
what we really want is the ability to have a "parse me separately"
type of syntax. This meets the intra-word syntax needs and might
meet some other needs as well.

The solution is to make @@org:...@@ a "parse me separately"
block! It nearly works that way already too! To minimize typing
we could have the empty type, @@:...@@, default to org.

This seems like a winner to me. The syntax for it already exists
and won't conflict. It requires relatively minimal additional typing,
the implication is clear, and there are other places where such
behavior could be useful.

This syntax seems like a winner to me
@@org:/hello/@@world
@@:/hello/@@world

You can also do things like
#+begin_src org
I want a number in this number@@org:src_elisp{(+ 1 2)}@@word!
#+end_src

Which would render to
#+begin_src org
I want a number in this number3word!
#+end_src
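
A minimal Python sketch of the "parse me separately" idea, assuming
only the two spellings proposed above. The `split_fragments` helper is
hypothetical, not an existing Org function. It splits a line into
plain text and fragments that would be handed to the Org parser on
their own.

```python
import re

# @@org:...@@ or the proposed short form @@:...@@. Other export
# snippets such as @@html:...@@ are deliberately not matched.
FRAGMENT = re.compile(r"@@(?:org)?:(.*?)@@")


def split_fragments(line):
    """Split LINE into ("text", s) and ("org", s) parts, in order."""
    parts = []
    pos = 0
    for m in FRAGMENT.finditer(line):
        if m.start() > pos:
            parts.append(("text", line[pos:m.start()]))
        # The fragment body would be parsed as Org independently of
        # the characters that surround it.
        parts.append(("org", m.group(1)))
        pos = m.end()
    if pos < len(line):
        parts.append(("text", line[pos:]))
    return parts


print(split_fragments("@@org:/hello/@@world"))
print(split_fragments("number@@org:src_elisp{(+ 1 2)}@@word!"))
```

Because each "org" part is parsed in isolation, the usual pre/post
boundary rules would apply to the fragment alone, and "world" directly
after it would no longer block the emphasis.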

Thoughts?

Best!
Tom

--------------- rambling below -------------


> This idea reminds me a bit of Scribble/Racket where every document is
> just inverted code, which makes it possible to insert arbitrary Racket
> code in your prose...

I will say, despite some of my comments elsewhere, that I think
exploring certain features of Scribble syntax for use in Org mode
would simplify certain parts of the syntax immensely.

For example
various inline blocks are an absolute pain to parse because they
allow nested delimiters /if they are matched/. The implementation
of the /if they are matched/ clause is currently a nasty hack which
generates a regular expression that can only actually handle nesting
to depth 3. Actually implementing the recursive grammar adds a lot
of complexity to the syntax and is hard to get right.

It would be vastly simpler to use Scribble's |<{hello }} world}>|
style syntax and always terminate at the first matching delimiter.
I'm sure that this would break some Org files, but it would make
dealing with latex fragments and inline source blocks and inline
footnotes SO much simpler. Matching an arbitrary number of
angle brackets does add some complexity, but it is tiny compared
to the complexity of enforcing matched parens and their failure cases
especially because many of the places where nesting is required
probably only see use of the nesting feature in a tiny fraction of
all cases.
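
To make the "terminate at the first matching delimiter" rule concrete,
here is a minimal sketch. The function name is hypothetical and the
delimiter shape is taken from the example above, not from any existing
Org code:

#+begin_src emacs-lisp
;; Hypothetical helper, not part of Org.
(defun my/scribble-span (string)
  "Return the contents of a |<...{ ... }>...| span at the start of STRING.
Return nil when STRING does not start with such a span or the span
is unterminated."
  (when (string-match "\\`|\\(<*\\){" string)
    (let* ((depth (length (match-string 1 string)))
           (body-start (match-end 0))
           ;; The closer mirrors the opener: } then DEPTH times > then |.
           (closer (concat "}" (make-string depth ?>) "|"))
           ;; No nesting bookkeeping: the first closer wins.
           (body-end (string-match (regexp-quote closer) string body-start)))
      (when body-end
        (substring string body-start body-end)))))

(my/scribble-span "|<{hello }} world}>| rest") ;; => "hello }} world"
#+end_src

The unbalanced braces inside the body need no special handling, which
is the whole attraction compared to matching nested delimiters.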

One other reason why this is attractive is that all the instances
where nested delimiters can appear on a line are preceded by
some non-whitespace character. This means that using the
pipe syntax does not conflict with table syntax!

Now the question comes. If we could implement this for
delimiters, could we also implement something similar
for markup? The issue with the proposed markup outside
delimiter inside approach is that it will change existing
behavior for files that want the delimiters to be included
in the markup, i.e. /{oops}/ becoming /oops/ is bad. A
second issue is that putting the delimiter inside the markup
cannot work for verbatim and code ={oops}= is ={oops}= no
matter what. Therefore the solution is not uniform across all
types of markup. We need another solution that works for
all types of markup.

What if we put the "start arbitrary markup" char outside
the markup? Say something like |/ital/|icks? Or what if
we went whole hog and used |{/ital/}|ics and made the
|{...}| syntax trigger a generalized feature where the
contents of the |{...}| block are parsed by themselves
and can abut any other text? This would be generally
useful in a variety of situations beyond just intra-word
markup.

What are the issues with this approach? The first issue
is that there is a conflict with table syntax if we were to
use the pipe character because markup can appear at
the start of a line. The second issue is that it might be
confusing for users if |{}| also worked like {} when in the
context of latex elements or inline src blocks, or maybe
that is ok because |{}| never renders as text. Hrm. Ok.
Second issue resolved, but what to do about the first?

If we want generalized "parse this by itself" syntax so
that we can write hello|{/world/}|ok, then we need a
solution that can appear at the start of a line. So we
can't use pipe because that is always a table line even
if a zero width space is put before it ;). What other
options do we have? How about #+|{/hello/}|world for
the start of a line? As long as there is no trailing colon
it isn't a keyword, so it could work ... except that if
someone reflows the text and it is no longer at the
start of a line then the syntax breaks. That is to say
using #+| at the start of a line is not uniform, so we
can't take that approach.

What other chars do we have at our disposal? Hrm.
How about @@? Could we use that? What happens
if we use @@org:/hello/@@world? Or maybe if we
want to minimize the number of chars we could do
@@:/hello/@@world and have the empty prefix in
@@ blocks mean org?


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-04 17:53             ` Tom Gillespie
@ 2021-12-04 18:37               ` John Kitchin
  2021-12-04 21:16                 ` Juan Manuel Macías
  2021-12-06 10:57                 ` Raw Org AST snippets for "impossible" markup Max Nikulin
  2021-12-04 19:04               ` Org-syntax: Intra-word markup Timothy
  2021-12-06 11:01               ` Denis Maier
  2 siblings, 2 replies; 72+ messages in thread
From: John Kitchin @ 2021-12-04 18:37 UTC (permalink / raw)
  To: Tom Gillespie
  Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, emacs-orgmode,
	Denis Maier

Along these lines (and combining the s-exp suggestion from Max), you can
achieve something like this with links.

This is lightly tested, and I am not thrilled with the eval for exporting,
but I couldn't get a macro to work on the export function to avoid it, and
this is just a proof of concept idea. This might only be suitable for
individual solutions, since you have to define this markup yourself.

#+BEGIN_SRC emacs-lisp :results silent
(defun italic (s)
  (pcase backend ;; lexical
    ('latex (format "{\\textit{%s}}" s))
    ('html (format "<i>%s</i>" s))
    (_ s)))

(defun @@-export (path desc backend)
  (eval `(concat ,@(read path))))

(org-link-set-parameters
 "@@"
 :export #'@@-export)
#+END_SRC

In org, it would look like Here is a [[@@:((italic "part") "ial")]] markup.
And in exports this is what this implementation does.

#+BEGIN_SRC emacs-lisp
(org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]]
markup." 'latex t)
#+END_SRC

#+RESULTS:
: Here is a {\textit{part}}ial markup.


#+BEGIN_SRC emacs-lisp
(org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]]
markup." 'html t)
#+END_SRC

#+RESULTS:
: <p>
: Here is a <i>part</i>ial markup.</p>

#+BEGIN_SRC emacs-lisp
(org-export-string-as "Here is a [[@@:((italic \"part\") \"ial\")]]
markup." 'ascii t)
#+END_SRC

#+RESULTS:
: Here is a partial markup.

Of course, you are free to do what you want with the path, including parse
it yourself to generate the output, and since it is a link, you could do
all kinds of things to make it look the way you want with faces, overlays,
etc.



John

-----------------------------------
Professor John Kitchin (he/him/his)
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu




^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-04 17:53             ` Tom Gillespie
  2021-12-04 18:37               ` John Kitchin
@ 2021-12-04 19:04               ` Timothy
  2021-12-04 21:48                 ` Tom Gillespie
  2021-12-06 11:01               ` Denis Maier
  2 siblings, 1 reply; 72+ messages in thread
From: Timothy @ 2021-12-04 19:04 UTC (permalink / raw)
  To: Tom Gillespie
  Cc: Juan Manuel Macías, Max Nikulin, Tim Cross, emacs-orgmode,
	Denis Maier

Hi Tom,

> After a bunch of rambling (see below if interested), I think I have
> a solution that should work for everyone. The key realization is that
> what we really want is the ability to have a “parse me separately”
> type of syntax. This meets the intra-word syntax needs and might
> meet some other needs as well.
>
> The solution is to make @@org:…@@ “parse me separately”
> block! It nearly works that way already too! To minimize typing
> we could have @@:…@@ the empty type default to org.
>
> Thoughts?

This isn’t quite as succinct as the ascii-doc inspired suggestions, but it’s
barely an extension on the current syntax — I like it!

Since org is a valid export backend though, perhaps this behaviour should be
reserved for @@:…@@, i.e. no export backend, which I think semantically fits
fairly nicely.

All the best,
Timothy

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-04 18:37               ` John Kitchin
@ 2021-12-04 21:16                 ` Juan Manuel Macías
  2021-12-06 10:57                 ` Raw Org AST snippets for "impossible" markup Max Nikulin
  1 sibling, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-04 21:16 UTC (permalink / raw)
  To: John Kitchin; +Cc: orgmode

Hi John,

John Kitchin writes:

> Along these lines (and combining the s-exp suggestion from Max), you
> can achieve something like this with links. 

I like this idea of merging Maxim's proposal with the power of links.

In any case, this and other workarounds provided here make it clear that
in Org we do not lack good and useful resources. I usually use macros
(taking advantage of the fact that macros are expanded early in the
export process). For example (only in this case with the LaTeX backend):

#+MACRO: emph (eval (when (org-export-derived-backend-p org-export-current-backend 'latex) (concat "@@latex:\\emph{@@" $1 "@@latex:}@@")))

With the macro defined this way, I can also nest emphasis in both
ways, with Org markup and with the macro itself:

#+begin_src example
{{{emph(lorem *ipsum* /dolor/ {{{emph(sit)}}} amet)}}}
#+end_src

==> \emph{lorem \textbf{ipsum} \emph{dolor} \emph{sit} amet}

Best regards,

Juan Manuel


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-04 19:04               ` Org-syntax: Intra-word markup Timothy
@ 2021-12-04 21:48                 ` Tom Gillespie
  2021-12-06 10:59                   ` Max Nikulin
  2022-01-28 14:52                   ` Max Nikulin
  0 siblings, 2 replies; 72+ messages in thread
From: Tom Gillespie @ 2021-12-04 21:48 UTC (permalink / raw)
  To: Timothy; +Cc: emacs-orgmode

> Since org is a valid export backend though, perhaps this behaviour should be
> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
> fairly nicely.

This ends up being even more convenient than I initially realized.
The current spec for export snippets is ambiguous when it says
"NAME can contain any alpha-numeric character and hyphens"
but the implementation behavior requires that "any" means "at
least one" and is implemented using the + regex operator.

What this means is that @@:...@@ syntax is not actually used
in Org at all at the moment and renders as plain text. I agree that
we need to avoid @@org:..@@ because it has legitimate uses.
Making a back-end of empty string valid for parse separately
syntax thus makes @@ syntax more regular overall, and allows
@@:...@@ to be processed separately because it currently
never enters the export snippet processing.
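
A minimal sketch of that point, assuming a pattern modelled on the
behaviour described above (the exact regexp inside Org may differ, and
=my/export-snippet-name-p= is a made-up name). Because the name part
uses the + operator, an empty name never matches:

#+begin_src emacs-lisp
;; Assumed pattern: "@@", at least one alphanumeric or hyphen, then ":".
(defvar my/export-snippet-name-re "@@\\([-A-Za-z0-9]+\\):")

;; Hypothetical helper, not part of Org.
(defun my/export-snippet-name-p (string)
  "Return non-nil when STRING contains an export snippet opener."
  (and (string-match-p my/export-snippet-name-re string) t))

(list (my/export-snippet-name-p "@@latex:\\emph{x}@@")
      (my/export-snippet-name-p "@@:/hello/@@world"))
;; => (t nil)
#+end_src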

This is important because export snippets do not seem to be easily
accessible to earlier phases of the org-export machinery, i.e. there
isn't a nice centralized place to preprocess @@org:...@@ even
if we wanted to. On the other hand @@:...@@ isn't processed
at all. I could be missing something in the org export code though.

It will take a bit of work to get this behavior implemented I think,
but it doesn't seem to have any conflicts. Some users may have
set the empty backend to expand manually via
org-export-snippet-translation-alist, but as long as we give
org-export-snippet-translation-alist priority and warn people
that setting "" manually will disable the new functionality
then there shouldn't be any disruption. The behavior also sort
of matches what we would want the empty string to be in this
case, which is "all backends" and of course the only markup
that makes sense for "all backends" is org itself!

Best,
Tom


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-04 15:01             ` Max Nikulin
@ 2021-12-05 23:34               ` Russell Adams
  0 siblings, 0 replies; 72+ messages in thread
From: Russell Adams @ 2021-12-05 23:34 UTC (permalink / raw)
  To: emacs-orgmode

On Sat, Dec 04, 2021 at 10:01:15PM +0700, Max Nikulin wrote:
> On 04/12/2021 06:51, Tim Cross wrote:
> >
> > Please, please can we stop trying to satisfy every edge case or extend
> > the markup to satisfy every possible scenario.
>
> It is ridiculous to throw away a nice tool and start to struggle with
> another bunch of problems when a small missed feature is really required.

I think this is a problem of expectations. I don't expect Org to
export perfect documents in every language. I expect Org to make a
simple subset of features available consistently.

With HTML or Latex you can create those words, and you can insert that
code into your Org document. Why does the Org syntax need to be
further extended to support this?

Part of the reason Org is a nice tool is that it is simple, and we
should be cautious trying to make it any more complex.


------------------------------------------------------------------
Russell Adams                            RLAdams@AdamsInfoServ.com

PGP Key ID:     0x1160DCB3           http://www.adamsinfoserv.com/

Fingerprint:    1723 D8CA 4280 1EC9 557F  66E8 1154 E018 1160 DCB3


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-03 23:51           ` Tim Cross
  2021-12-04 15:01             ` Max Nikulin
@ 2021-12-05 23:37             ` Russell Adams
  2021-12-06  1:39               ` Samuel Wales
  1 sibling, 1 reply; 72+ messages in thread
From: Russell Adams @ 2021-12-05 23:37 UTC (permalink / raw)
  To: emacs-orgmode

On Sat, Dec 04, 2021 at 10:51:47AM +1100, Tim Cross wrote:
>
> Tom Gillespie <tgbugs@gmail.com> writes:
>
> > I don't mean to be a wet blanket...

I'd like to be a wet blanket.

> +infinity!
>
> Please, please can we stop trying to satisfy every edge case or extend
> the markup to satisfy every possible scenario.

+infinity^2

I've often thought Org needs to hit the brakes and stop adding
features, or cut out features that have a high support/maintenance
cost. We need to respect our maintainers' time.

------------------------------------------------------------------
Russell Adams                            RLAdams@AdamsInfoServ.com

PGP Key ID:     0x1160DCB3           http://www.adamsinfoserv.com/

Fingerprint:    1723 D8CA 4280 1EC9 557F  66E8 1154 E018 1160 DCB3


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-05 23:37             ` Russell Adams
@ 2021-12-06  1:39               ` Samuel Wales
  0 siblings, 0 replies; 72+ messages in thread
From: Samuel Wales @ 2021-12-06  1:39 UTC (permalink / raw)
  To: emacs-orgmode

i think i can't add much useful to these threads, i agree with the
simplicity, but, a nuance, want for org to have had a bit more
consistency growing up.  e.g. quoting/escaping, demarcation, and
applicability of features in different contexts.

sort of a "mentally factored user interface" where the user's
expectation is pretty straightforwardly met.  e.g. works here so
should also work there.  or, there is only one rule for doing this.
that kind of thing.  orthogonality also.  few exceptions.

it is understandable in context that inconsistencies exist, and that
might apply to various maintenance-over-heavy things users want.

if we are to remove features as suggested below, then i suggest, where
possible, consistency be a desideratum for final result.


On 12/5/21, Russell Adams <RLAdams@adamsinfoserv.com> wrote:
> On Sat, Dec 04, 2021 at 10:51:47AM +1100, Tim Cross wrote:
>>
>> Tom Gillespie <tgbugs@gmail.com> writes:
>>
>> > I don't mean to be a wet blanket...
>
> I'd like to be a wet blanket.
>
>> +infinity!
>>
>> Please, please can we stop trying to satisfy every edge case or extend
>> the markup to satisfy every possible scenario.
>
> +infinity^2
>
> I've often thought Org needs to hit the brakes and stop adding
> features, or cut out features that have a high support/maintenance
> cost. We need to respect our maintainers' time.
>
> ------------------------------------------------------------------
> Russell Adams                            RLAdams@AdamsInfoServ.com
>
> PGP Key ID:     0x1160DCB3           http://www.adamsinfoserv.com/
>
> Fingerprint:    1723 D8CA 4280 1EC9 557F  66E8 1154 E018 1160 DCB3
>
>


-- 
The Kafka Pandemic

A blog about science, health, human rights, and misopathy:
https://thekafkapandemic.blogspot.com


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Raw Org AST snippets for "impossible" markup
  2021-12-04 18:37               ` John Kitchin
  2021-12-04 21:16                 ` Juan Manuel Macías
@ 2021-12-06 10:57                 ` Max Nikulin
  2021-12-06 15:45                   ` Juan Manuel Macías
  1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-06 10:57 UTC (permalink / raw)
  To: emacs-orgmode

On 05/12/2021 01:37, John Kitchin wrote:
> Along these lines (and combining the s-exp suggestion from Max), you 
> can achieve something like this with links.
> 
> #+BEGIN_SRC emacs-lisp :results silent
> (defun italic (s)
>    (pcase backend ;; lexical
>      ('latex (format "{\\textit{%s}}" s))
>      ('html (format "<i>%s</i>" s))
>      (_ s)))
> 
> (defun @@-export (path desc backend)
>    (eval `(concat ,@(read path))))
> 
> (org-link-set-parameters
>   "@@"
>   :export #'@@-export)
> #+END_SRC

John, thank you for reminding me of Juan Manuel's idea that
everything missing in Org may be polyfilled (ab)using links.
It is enough for a proof of concept; special markers may be introduced
later. After some time spent exercising in monkey-typing,
I have got some code that illustrates my idea.

So the goal is to mitigate the demand to extend the current syntax.
While simple cases should be easy,
special cases should not be impossible.

- Raw AST snippets should be processed without ~eval~ to give
   other tools such as =pandoc= a chance to support the feature.
   If you desperately need ~eval~ then you can use source blocks.
- The idea is to use existing backends by passing structures
   similar to ones generated by ~org-element~ parser.
- I would prefer to avoid "@@" as a link prefix since such sequences
   are already a part of Org syntax. In the following example
   the export snippet is prematurely terminated by such a link:

   #+begin_src elisp :results pp
     (org-element-parse-secondary-string
      "@@latex:[[@@:(italics \"i\")]]@@"
      (org-element-restriction 'paragraph))
   #+end_src

   #+RESULTS:
   : ((export-snippet
   :   (:back-end "latex" :value "[[" :begin 1 :end 13 :post-blank 0 :parent #0))
   :  #(":(italics \"i\")]]@@" 0 18
   :    (:parent #0)))

Let's take some link prefix that makes it clear that the proposal
is a draft and that a sane variant will be chosen later, once agreement
on the details of such a feature is reached. Until then
it is named "orgia".

#+begin_src elisp :results silent
   (defun orgia-export (path desc backend)
     (if (not (eq ?\( (aref path 0)))
	path
       (let ((tree (read path))
	    (info (org-export-get-environment backend nil nil)))
	(org-no-properties
	 (org-export-data-with-backend tree backend info)))))

   (org-link-set-parameters
    "orgia"
    :export #'orgia-export)
#+end_src

Either [[orgia:("inter" (bold () "word"))]]
or <orgia:((italic () "inter") "word")>
links may be used. Certainly plain text may be outside:

#+begin_src elisp
   (org-export-string-as "A <orgia:(italic () \"inter\")>word" 'html t)
#+end_src

#+RESULTS:
: <p>
: A <i>inter</i>word</p>

- Error handling is required.
- Elements (blocks) should be considered as an error
   in object (inline) context.
- The passed tree should be preprocessed to glue back strings that
   were split to avoid their being interpreted as terminating an outer
   construct or the link itself
   (=]]= =][= should be ="]" "]"= ="]" "["= inside bracket links).
   This is especially important for property values.
- For convenience, a =parse= element may be added to parse a string
   according to Org markup.
- There should be a similar element (block-level markup structure).
- Symbols and structures used by ~org-element~ become a part of
   the public API, but they already are, since they are used
   by export backends.
- ~org-cite~ will likely be a problem.
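
As an illustration of the error-handling item, here is a minimal
sketch of a stricter reader for the link path. The name
=my/orgia-read-tree= is hypothetical; it relies only on
~read-from-string~ and ~condition-case~ and never evaluates anything:

#+begin_src emacs-lisp
;; Hypothetical helper, not part of the draft above.
(defun my/orgia-read-tree (path)
  "Read PATH as a single s-expression without evaluating it.
Return nil when PATH is malformed, is not a list, or has trailing
text after the form."
  (condition-case nil
      (let* ((result (read-from-string path))
             (tree (car result))
             (end (cdr result)))
        ;; Accept only a list followed by nothing but whitespace.
        (and (consp tree)
             (string-match-p "\\`[ \t\n]*\\'" (substring path end))
             tree))
    (error nil)))

(my/orgia-read-tree "(italic () \"inter\")") ;; => (italic nil "inter")
(my/orgia-read-tree "(italic () \"inter\"")  ;; => nil
#+end_src

An export function could fall back to returning the path unchanged, or
signal a user error, whenever this returns nil.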



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-04 21:48                 ` Tom Gillespie
@ 2021-12-06 10:59                   ` Max Nikulin
  2022-01-28 14:52                   ` Max Nikulin
  1 sibling, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2021-12-06 10:59 UTC (permalink / raw)
  To: emacs-orgmode

On 05/12/2021 04:48, Tom Gillespie wrote:
>> Since org is a valid export backend though, perhaps this behaviour should be
>> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
>> fairly nicely.
> 
> This ends up being even more convenient than I initially realized.

It is a bright idea. The only drawback I see is that it is impossible to 
put a new "@@:@@" fragment inside an export snippet, e.g. "@@latex:some 
@@:special@@thing@@", or vice versa.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Org-syntax: Intra-word markup
  2021-12-04 17:53             ` Tom Gillespie
  2021-12-04 18:37               ` John Kitchin
  2021-12-04 19:04               ` Org-syntax: Intra-word markup Timothy
@ 2021-12-06 11:01               ` Denis Maier
  2 siblings, 0 replies; 72+ messages in thread
From: Denis Maier @ 2021-12-06 11:01 UTC (permalink / raw)
  To: Tom Gillespie, emacs-orgmode
  Cc: Juan Manuel Macías, Max Nikulin, Tim Cross

Hi Tom

Am 04.12.2021 um 18:53 schrieb Tom Gillespie:
> Hi all,
>      After a bunch of rambling (see below if interested), I think I have
> a solution that should work for everyone. The key realization is that
> what we really want is the ability to have a "parse me separately"
> type of syntax. This meets the intra-word syntax needs and might
> meet some other needs as well.
> 
> The solution is to make @@org:...@@ "parse me separately"
> block! It nearly works that way already too! To minimize typing
> we could have @@:...@@ the empty type default to org.
> 
> This seems like a winner to me. The syntax for it already exists
> and won't conflict. It requires relatively minimal additional typing,
> the implication is clear, and there are other places where such
> behavior could be useful.
> 
> This syntax seems like a winner to me
> @@org:/hello/@@world
> @@:/hello/@@world
> 
> You can also do things like
> #+begin_src org
> I want a number in this number@@org:src_elisp{(+ 1 2)}@@word!
> #+end_src
> 
> Which would render to
> #+begin_src org
> I want a number in this number3word!
> #+end_src
> 
> Thoughts?
> 
> Best!
> Tom
> 

Thanks for the suggestion. I think that sounds like a good idea. Of 
course it is not as terse as the asciidoc-inspired suggestion, but it is 
entirely appropriate for a case like this one! I also like that there 
might be other cases where this might be handy.

Best,
Denis



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: Raw Org AST snippets for "impossible" markup
  2021-12-06 10:57                 ` Raw Org AST snippets for "impossible" markup Max Nikulin
@ 2021-12-06 15:45                   ` Juan Manuel Macías
  2021-12-06 16:56                     ` Juan Manuel Macías
  2021-12-08 13:09                     ` Max Nikulin
  0 siblings, 2 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-06 15:45 UTC (permalink / raw)
  To: Max Nikulin; +Cc: orgmode

Max Nikulin writes:

> John, thank you for reminding me of Juan Manuel's idea that
> everything missing in Org may be polyfilled (ab)using links.
> It is enough for proof of concept, special markers may be introduced
> later. After some time spent exercising in monkey-typing,
> I have got some code that illustrates my idea.
>
> So the goal is to mitigate demand to extend current syntax.
> While simple cases should be easy,
> special cases should not be impossible.
>
> - Raw AST snippets should be processed without ~eval~ to give
>    other tools such as =pandoc= a chance to support the feature.
>    If you desperately need ~eval~ then you can use source blocks.
> - The idea is to use existing backends by passing structures
>    similar to ones generated by ~org-element~ parser.
> - I would prefer to avoid "@@" as a link prefix since such sequences
>    are already a part of Org syntax. In the following example
>    the export snippet is prematurely terminated by such a link:
>
>    #+begin_src elisp :results pp
>      (org-element-parse-secondary-string
>       "@@latex:[[@@:(italics \"i\")]]@@"
>       (org-element-restriction 'paragraph))
>    #+end_src
>
>
>    #+RESULTS:
>    : ((export-snippet
>    :   (:back-end "latex" :value "[[" :begin 1 :end 13 :post-blank 0 :parent #0))
>    :  #(":(italics \"i\")]]@@" 0 18
>    :    (:parent #0)))
>
> Let's take some link prefix that makes it clear that the proposal
> is a draft and a sane variant will be chosen later when agreement
> concerning details of such feature is achieved. Till that moment
> it is named "orgia".
>
> #+begin_src elisp :results silent
>    (defun orgia-export (path desc backend)
>      (if (not (eq ?\( (aref path 0)))
> 	path
>        (let ((tree (read path))
> 	    (info (org-export-get-environment backend nil nil)))
> 	(org-no-properties
> 	 (org-export-data-with-backend tree backend info)))))
>
>    (org-link-set-parameters
>     "orgia"
>     :export #'orgia-export)
> #+end_src
>
>
> Either [[orgia:("inter" (bold () "word"))]]
> or <orgia:((italic () "inter") "word")>
> links may be used. Certainly plain text may be outside:
>
> #+begin_src elisp
>    (org-export-string-as "A <orgia:(italic () \"inter\")>word" 'html t)
> #+end_src
>
> #+RESULTS:
> : <p>
> : A <i>inter</i>word</p>
>
> - Error handling is required.
> - Elements (blocks) should be considered as an error
>    in object (inline) context.
> - The passed tree should be preprocessed to glue back strings that
>    were split to avoid their being interpreted as terminating an outer
>    construct or the link itself
>    (=]]= =][= should be ="]" "]"= ="]" "["= inside bracket links).
>    This is especially important for property values.
> - For convenience, a =parse= element may be added to parse a string
>    according to Org markup.
> - There should be a similar element (block-level markup structure).
> - Symbols and structures used by ~org-element~ become a part of
>    the public API, but they already are, since they are used
>    by export backends.
> - ~org-cite~ will likely be a problem.

Hi Maxim,

I understand that with this method emphases could be nested, which
also seems very productive. I like it.

I would suggest, however, not using the term 'italics', since it is a
'typographic' term, but rather a term that is agnostic of format and
typography, something like 'emphasis' or 'emph'. For example, in a
format-agnostic environment like Org, which is concerned only with
structure, an emphasis is always an emphasis. But in a typographic
environment that emphasis may or may not be in italics. That is why
in LaTeX you can write constructions like:

#+begin_src latex
\emph{The Making of \emph{Star Wars}}
#+end_src

In this context 'Star Wars' would appear in an upright font. Naturally,
these things are only possible in LaTeX, but it's nice to keep Org
typographically agnostic.

Anyway, I find all this very interesting as a proof of concept, although
in my workflow I prefer to use macros for these types of scenarios (yes,
a rare case where I don't use links! :-D):

#+begin_src emacs-lisp
  (defun my-macro-emph (arg)
    (cond ((org-export-derived-backend-p org-export-current-backend 'latex)
	   (concat "@@latex:\\emph{@@" arg "@@latex:}@@"))
	  ((org-export-derived-backend-p org-export-current-backend 'html)
	   (concat "@@html:<em>@@" arg "@@html:</em>@@"))
	  ((org-export-derived-backend-p org-export-current-backend 'odt)
	   (concat "@@odt:<text:span text:style-name=\"Emphasis\">@@" arg "@@odt:</text:span>@@"))))

  (setq org-export-global-macros
	'(("emph" . "(eval (my-macro-emph $1))")))
#+end_src

{{{emph(The Making of {{{emph(Star Wars)}}})}}}

Best regards,

Juan Manuel 



* Re: Raw Org AST snippets for "impossible" markup
  2021-12-06 15:45                   ` Juan Manuel Macías
@ 2021-12-06 16:56                     ` Juan Manuel Macías
  2021-12-08 13:09                     ` Max Nikulin
  1 sibling, 0 replies; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-06 16:56 UTC (permalink / raw)
  To: Max Nikulin; +Cc: orgmode

Juan Manuel Macías writes:

> I would suggest, however, not to use the term 'italics [...blah blah...]'

Sorry for the noise! I think I confused myself...

Naturally, 'italic' (or 'bold') is required: (italic () \"inter\")

Best regards,

Juan Manuel 



* Re: Raw Org AST snippets for "impossible" markup
  2021-12-06 15:45                   ` Juan Manuel Macías
  2021-12-06 16:56                     ` Juan Manuel Macías
@ 2021-12-08 13:09                     ` Max Nikulin
  2021-12-08 23:19                       ` Juan Manuel Macías
  1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-08 13:09 UTC (permalink / raw)
  To: emacs-orgmode

On 06/12/2021 22:45, Juan Manuel Macías wrote:
> 
> I understand that with this method emphasis could be nested, which
> also seems very productive. I like it.
> 
> I would suggest, however, not using the term 'italics', since it is a
> 'typographic' term, but rather a term that is agnostic of format and
> typography, something like 'emphasis' or 'emph'. For example, in a
> format-agnostic environment like Org, which is concerned only with
> structure, an emphasis is always an emphasis. But in a typographic
> environment that emphasis may or may not be in italics. That is why
> in LaTeX you can write constructions like:

As you have guessed, it is not my choice; it is the interface of ox.el and
org-element.el.

However, if you strongly want to use proper terminology in markup, you
may try to trade it for +your soul+ compatibility and portability
issues. The following almost works:

#+begin_src elisp :results silent
   (defun orgia-link (link-data desc info)
     (let* ((backend-struct (plist-get info :back-end))
	   (backend-name (org-export-backend-name backend-struct)))
       (or
        (org-export-custom-protocol-maybe link-data desc backend-name info)
        (let* ((parent (org-export-backend-parent backend-struct))
	      (transcoders-alist (org-export-get-all-transcoders parent))
	      (link-transcoder (alist-get 'link transcoders-alist)))
	 (if link-transcoder
	     (funcall link-transcoder link-data desc info)
	   desc)))))

   (defun evilatex-emph (_emph content info)
     ;; I have no idea yet why newline is appended.
     (format "\\textit{%s}%%" content))

   (org-export-define-derived-backend 'evilatex 'latex
     :translate-alist '((emph . evilatex-emph)
		       (link . orgia-link)))
#+end_src

#+begin_src elisp
   (let ((org-export-with-broken-links 'mark))
     (org-export-string-as
      "An [[orgia:(italic () \"ex\")]]ample of <orgia:(emph () 
\"inter\")>word and [[http://te.st][link]] [[unknown:prefix][desc]]!"
      'evilatex t))
#+end_src

#+RESULTS:
: An \emph{ex}ample of \textit{inter}%
: word and \href{http://te.st}{link} [BROKEN LINK: unknown:prefix]!

Actually, I believe that something like the orgia-link code should be added
by `org-export-define-derived-backend' if "link" is missing from
translate-alist. I suspect that `org-export-get-all-transcoders' may be
avoided.

>    (setq org-export-global-macros
> 	'(("emph" . "(eval (my-macro-emph $1))")))

Sorry, I have not yet prepared a better variant to solve the
comma-in-macro problem.





* Re: Raw Org AST snippets for "impossible" markup
  2021-12-08 13:09                     ` Max Nikulin
@ 2021-12-08 23:19                       ` Juan Manuel Macías
  2021-12-08 23:35                         ` John Kitchin
  0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-08 23:19 UTC (permalink / raw)
  To: Max Nikulin; +Cc: orgmode

Max Nikulin writes:

> As you have guessed, It is not my choice, it is interface of ox.el and 
> org-element.el.

Indeed. Sorry for my haste: it's the consequence of not reading the code
carefully :-)

Of course, your orgia-link procedure could be extended to more Org elements.
I can't think of what kind of scenario that might fit, but as a proof
of concept I find it really stimulating. E.g.:

#+begin_src elisp
  (org-export-string-as "<orgia:(verse-block () \"Lorem\\nipsum\\ndolor\")>" 'html t)
#+end_src

#+RESULTS:
: <p>
: <p class="verse">
: Lorem<br />
: ipsum<br />
: dolor</p>
: </p>

#+begin_src elisp
  (org-export-string-as "<orgia:(quote-block (:attr_latex
	 (\":environment foreigndisplayquote :options {greek}\"))
	 \"Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲν
	 Ἀρταξέρξης, νεώτερος δὲ Κῦρος·\")>" 'latex t)
#+end_src

#+RESULTS:
: \begin{foreigndisplayquote}{greek}
: Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲνἈρταξέρξης, νεώτερος δὲ Κῦρος·
: \end{foreigndisplayquote}


> However if you strongly want to use proper terminology in markup, you 
> may try to trade it for +your soul+ compatibility and portability 
> issues. The following almost works:

Interesting, thank you.

Yes, the newline added in `evilatex-emph' is strange... I have no
idea why that happens.

Best regards,

Juan Manuel 


* Re: Raw Org AST snippets for "impossible" markup
  2021-12-08 23:19                       ` Juan Manuel Macías
@ 2021-12-08 23:35                         ` John Kitchin
  2021-12-09  7:01                           ` Juan Manuel Macías
  0 siblings, 1 reply; 72+ messages in thread
From: John Kitchin @ 2021-12-08 23:35 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: Max Nikulin, orgmode


Have you seen
https://github.com/tj64/org-dp? It seems to do a lot with creating and
manipulating org elements. It might either be handy or lead to some
inspiration.

On Wed, Dec 8, 2021 at 6:20 PM Juan Manuel Macías <maciaschain@posteo.net>
wrote:

> Max Nikulin writes:
>
> > As you have guessed, It is not my choice, it is interface of ox.el and
> > org-element.el.
>
> Indeed. Sorry for my haste: it's the consequences of not read the code
> carefully :-)
>
> Of course, your orgia-link-procedure could be extended to more org
> elements.
> I can't think of what kind of scenario that might fit in, but as a proof
> of concept I find it really stimulating. E.g:
>
> #+begin_src elisp
>   (org-export-string-as "<orgia:(verse-block ()
> \"Lorem\\nipsum\\ndolor\")>" 'html t)
> #+end_src
>
> #+RESULTS:
> : <p>
> : <p class="verse">
> : Lorem<br />
> : ipsum<br />
> : dolor</p>
> : </p>
>
> #+begin_src elisp
>   (org-export-string-as "<orgia:(quote-block (:attr_latex
>          (\":environment foreigndisplayquote :options {greek}\"))
>          \"Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲν
>          Ἀρταξέρξης, νεώτερος δὲ Κῦρος·\")>" 'latex t)
> #+end_src
>
> #+RESULTS:
> : \begin{foreigndisplayquote}{greek}
> : Δαρείου καὶ Παρυσάτιδος γίγνονται παῖδες δύο, πρεσβύτερος μὲνἈρταξέρξης,
> νεώτερος δὲ Κῦρος·
> : \end{foreigndisplayquote}
>
>
> > However if you strongly want to use proper terminology in markup, you
> > may try to trade it for +your soul+ compatibility and portability
> > issues. The following almost works:
>
> Interesting, thank you.
>
> Yes, it is strange the new line added in `evilatex-emph' ... I have no
> idea why that happens.
>
> Best regards,
>
> Juan Manuel
>
-- 
John

-----------------------------------
Professor John Kitchin (he/him/his)
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu



* Re: Raw Org AST snippets for "impossible" markup
  2021-12-08 23:35                         ` John Kitchin
@ 2021-12-09  7:01                           ` Juan Manuel Macías
  2021-12-09 14:56                             ` Max Nikulin
  0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-09  7:01 UTC (permalink / raw)
  To: John Kitchin; +Cc: Maxim Nikulin, orgmode

John Kitchin writes:

> Have you seen 
> https://github.com/tj64/org-dp? It seems to do a lot with creating and
> manipulating org elements. It might either be handy or lead to some
> inspiration. 

Interesting package. Thanks for sharing.

It gave me an idea, also borrowing part of Maxim's code, but evaluating
in this case the path. To continue playing with links... The goal is
to obtain a link with this structure `[[quote-lang:lang][quote]]':

#+BEGIN_SRC emacs-lisp :results silent
  (org-link-set-parameters
   "quote-lang"
   :display 'full
   :export (lambda (path desc bck)
	     (let* ((bck org-export-current-backend)
		    (attr (list (format
				 ":environment foreigndisplayquote :options {%s}"
				 path)))
		    (info (org-export-get-environment
			   bck nil nil)))
	       (org-no-properties
		(org-export-data-with-backend
		 `(quote-block (:attr_latex ,attr)
			       ,desc)
		 bck info)))))
#+END_SRC

#+begin_src emacs-lisp 
  (setq backends '(latex html odt))
  (setq results nil)
  (mapc (lambda (backend)
	  (add-to-list 'results
		       (org-export-string-as 
			"[[quote-lang:spanish][Publicamos nuestro libros
   para librarnos de ellos, para no pasar el resto de nuestras vidas
   corrigiendo borradores.]]" backend t) t))
	backends)
  (mapconcat 'identity results "\n")
#+end_src

#+RESULTS:
#+begin_example
\begin{foreigndisplayquote}{spanish}
Publicamos nuestro libros
 para librarnos de ellos, para no pasar el resto de nuestras vidas
 corrigiendo borradores.
\end{foreigndisplayquote}

<p>
<blockquote>
Publicamos nuestro libros
 para librarnos de ellos, para no pasar el resto de nuestras vidas
 corrigiendo borradores.
</blockquote>
</p>


<text:p text:style-name="Text_20_body">Publicamos nuestro libros
 para librarnos de ellos, para no pasar el resto de nuestras vidas
 corrigiendo borradores.</text:p>
#+end_example




* Re: Raw Org AST snippets for "impossible" markup
  2021-12-09  7:01                           ` Juan Manuel Macías
@ 2021-12-09 14:56                             ` Max Nikulin
  2021-12-09 16:11                               ` Juan Manuel Macías
  0 siblings, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2021-12-09 14:56 UTC (permalink / raw)
  To: emacs-orgmode

On 09/12/2021 14:01, Juan Manuel Macías wrote:
> John Kitchin writes:
> 
>> Have you seen
>> https://github.com/tj64/org-dp? It seems to do a lot with creating and
>> manipulating org elements. It might either be handy or lead to some
>> inspiration.
> 
> Interesting package. Thanks for sharing.

Either I missed something or its purpose is completely different. It
maps Org markup to Org markup. I am experimenting with fragments that
should make it possible to get something that is really tricky or even
impossible with established syntax, so it has to run immediately before
exporters.

> It gave me an idea, also borrowing part of Maxim's code, but evaluating
> in this case the path. To continue playing with links... The goal is
> to obtain a link with this structure `[[quote-lang:lang][quote]]':
> 
> #+BEGIN_SRC emacs-lisp :results silent
>    (org-link-set-parameters
>     "quote-lang"
>     :display 'full
>     :export (lambda (path desc bck)
> 	     (let* ((bck org-export-current-backend)
> 		    (attr (list (format
> 				 ":environment foreigndisplayquote :options {%s}"
> 				 path)))
> 		    (info (org-export-get-environment
> 			   bck nil nil)))
> 	       (org-no-properties
> 		(org-export-data-with-backend
> 		 `(quote-block (:attr_latex ,attr)
> 			       ,desc)
> 		 bck info)))))
> #+END_SRC

Looking into your code I have realized that it should be implemented
using a filter, not through the :export property of links. Maybe without
a working proof of concept with link exporters, this session of
monkey-typing would not have been successful.

#+begin_src elisp :results silent
   (defun orgia-element-replace (current new destructive?)
     (if (eq current new)
	current
       (let* ((lst? (and (listp new) (not (symbolp (car new)))))
	     (new-lst (if lst?
			  (if destructive? (nconc new) (reverse new))
			(list new))))
	(dolist (element new-lst)
	  (org-element-insert-before element current)))
       (org-element-extract-element current)
       new))

   (defun orgia--transform-link (data)
     (if (not (string-equal "orgia" (org-element-property :type data)))
	data
       (let* ((path (org-element-property :path data)))
	(if (not (eq ?\( (aref path 0)))
	    (or path (org-element-contents data))
	  (read path)))))

   (defun orgia-parse-tree-filter (data _backend info)
     (org-element-map data 'link
       (lambda (data)
	(orgia-element-replace data (orgia--transform-link data) t))
       info nil nil t)
     data)
#+end_src

#+begin_src elisp :results silent
   (add-to-list 'org-export-filter-parse-tree-functions
                #'orgia-parse-tree-filter)

   (org-link-set-parameters "orgia")
#+end_src


#+begin_src elisp
   (org-export-string-as "An <orgia:(\"in\" (italic () \"ter\"))>word"
                         'html t)
#+end_src

#+RESULTS:
: <p>
: An in<i>ter</i>word</p>
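
The parsing step that the filter relies on can be tried in isolation. This is
a minimal sketch using only core Emacs Lisp; the helper name
`my/orgia-read-path' is hypothetical and is not used by the filter above. It
only shows what `read' makes of a link path: the text is read as data and
never evaluated.

#+begin_src elisp
  ;; Hypothetical helper, for illustration only.
  (defun my/orgia-read-path (path)
    "Return the Lisp tree encoded in PATH, or PATH itself if it is plain text."
    (if (and (> (length path) 0)
             (eq ?\( (aref path 0)))
        ;; `read-from-string' returns (OBJECT . END-POSITION).
        (car (read-from-string path))
      path))

  (my/orgia-read-path "(\"in\" (italic () \"ter\"))")
  ;; => ("in" (italic nil "ter"))
#+end_src

Note that "()" is read as nil, so both spellings describe the same empty
property list.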




* Re: Raw Org AST snippets for "impossible" markup
  2021-12-09 14:56                             ` Max Nikulin
@ 2021-12-09 16:11                               ` Juan Manuel Macías
  2021-12-09 22:27                                 ` Juan Manuel Macías
  0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-09 16:11 UTC (permalink / raw)
  To: Max Nikulin; +Cc: orgmode

Max Nikulin writes:

> Looking into your code I have realized that it should be implemented 
> using filter, not through :export property of links. Maybe without 
> working proof of concept with link exporters, this session of 
> monkey-typing would not be successful.

Jumping into the "real world", how about these two examples of nested emphasis?

#+begin_src org :results latex :results replace
[[orgia:(italic () "The English versions of the " (italic () "Iliad") " and the " (italic () "Odyssey"))]]
#+end_src

#+RESULTS:
#+begin_export latex
\emph{The English versions of the \emph{Iliad} and the \emph{Odyssey}}
#+end_export

This one is more complex:

#+begin_src org :results latex :results replace
[[orgia:(italic () "The English versions of the " (bold () (italic () "Iliad")) " and the " (bold () (italic () "Odyssey")))]]
#+end_src

#+RESULTS:
#+begin_export latex
\emph{The English versions of the \textbf{\emph{Iliad}} and the \textbf{\emph{Odyssey}}}
#+end_export
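
Typing such nested paths by hand is error-prone. A minimal sketch, assuming
the tree is first built as Lisp data and then printed with `prin1-to-string'
(the helper name `my/orgia-link-string' is hypothetical; strings containing
"]]" would still need the splitting mentioned in the list of caveats quoted
earlier in the thread):

#+begin_src elisp
  ;; Hypothetical helper, for illustration only.
  (defun my/orgia-link-string (tree)
    "Return a bracket link whose path is the printed form of TREE."
    (format "[[orgia:%s]]" (prin1-to-string tree)))

  (my/orgia-link-string '(italic nil "The " (bold nil "Iliad")))
  ;; => "[[orgia:(italic nil \"The \" (bold nil \"Iliad\"))]]"
#+end_src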



* Re: Raw Org AST snippets for "impossible" markup
  2021-12-09 16:11                               ` Juan Manuel Macías
@ 2021-12-09 22:27                                 ` Juan Manuel Macías
  2022-01-03 14:34                                   ` Max Nikulin
  0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2021-12-09 22:27 UTC (permalink / raw)
  To: Maxim Nikulin; +Cc: orgmode

Juan Manuel Macías writes:

> Jumping into the "real world", how about these two examples of nested emphasis?

By the way, what do you think about allowing the use of some kind of
aliases, so that the markup is less verbose? Maybe something like "(i::"
instead of "(italic () ..."? I came up with this hasty sketch on top of your
latest code, *just* to see how it looks (I don't know if I prefer it to
stay verbose):

#+begin_src emacs-lisp :results silent
 (setq orgia-alias-alist '(("i" "italic")
			   ("b" "bold")
			   ("u" "underline")
			   ("s" "strike-through")))

  (defun orgia-replace (before after)
    (interactive)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward before nil t)
	(replace-match after t nil))))

  (defun orgia--transform-path (path)
    (with-temp-buffer
      (insert path)
      (mapc (lambda (el)
	      (orgia-replace (concat "(" (car el) "::") (concat "(" (cadr el) " () ")))
	    orgia-alias-alist)
      (buffer-string)))

  (defun orgia--transform-link (data)
    (if (not (string-equal "orgia" (org-element-property :type data)))
	data
      (let* ((path (org-element-property :path data)))
	(if (not (eq ?\( (aref path 0)))
	    (or path (org-element-contents data))
          (read (orgia--transform-path path)))))) ;; <====
    ;;;;;;;;;;;;;;;;;;
 #+end_src

#+begin_src elisp
   (org-export-string-as "An <orgia:(\"in\" (s:: \"ter\"))>word"
'odt t)
#+end_src

#+RESULTS:
: 
: <text:p text:style-name="Text_20_body">An in<text:span text:style-name="Strikethrough">ter</text:span>word</text:p>


#+begin_src org :results latex :results replace
  [[orgia:(i:: "The English versions of the " (b:: (i:: "Iliad")) " and the " (b:: (i::
  "Odyssey")))]]
#+end_src

#+RESULTS:
#+begin_export latex
\emph{The English versions of the \textbf{\emph{Iliad}} and the \textbf{\emph{Odyssey}}}
#+end_export


------------------------------------------------------
Juan Manuel Macías 

https://juanmanuelmacias.com/




* Re: Raw Org AST snippets for "impossible" markup
  2021-12-09 22:27                                 ` Juan Manuel Macías
@ 2022-01-03 14:34                                   ` Max Nikulin
  0 siblings, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2022-01-03 14:34 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1899 bytes --]

On 10/12/2021 05:27, Juan Manuel Macías wrote:
> Juan Manuel Macías writes:
> 
>> Jumping into the "real world", how about these two examples of nested emphasis?
> 
> By the way, what do you think about allowing the use of some kind of
> aliases, so that the aspect is less verbose?

I have no particular opinion concerning aliases, but certainly they
should not work through string search and replace when a parsed tree is
available.

>    (defun orgia--transform-path (path)
>      (with-temp-buffer
>        (insert path)
>        (mapc (lambda (el)
> 	      (orgia-replace (concat "(" (car el) "::") (concat "(" (cadr el) " () ")))

By the way, is there any problem with `replace-regexp-in-string'?
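
For comparison, a minimal sketch of the same textual expansion done with
`replace-regexp-in-string' instead of a temporary buffer. The names
`my/orgia-alias-alist' and `my/orgia-expand-aliases' are hypothetical, and
the alias table just mirrors the shape of the one quoted above:

#+begin_src elisp
  ;; Hypothetical alias table, same shape as `orgia-alias-alist'.
  (defvar my/orgia-alias-alist '(("i" "italic")
                                 ("b" "bold")))

  (defun my/orgia-expand-aliases (path)
    "Expand \"(X::\" aliases in PATH using `my/orgia-alias-alist'."
    (dolist (el my/orgia-alias-alist path)
      (setq path (replace-regexp-in-string
                  ;; Quote the alias so that it is matched literally.
                  (regexp-quote (concat "(" (car el) "::"))
                  (concat "(" (cadr el) " () ")
                  ;; FIXEDCASE and LITERAL are both non-nil.
                  path t t))))

  (read (my/orgia-expand-aliases "(i:: \"a\" (b:: \"c\"))"))
  ;; => (italic nil "a" (bold nil "c"))
#+end_src

The sketch also shows why plain text replacement is fragile: an alias that
happens to occur inside a quoted string, as in "(\"see (i:: here\")", is
rewritten as well.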

See the attached file for definitions of some helper functions. Final setup:

#+begin_src elisp :results silent
   (setq orgia-demo-alias-alist
	'((b . bold)
	  (i . italic)
	  (s . strike-through)
	  (_ . underline)))

   (defun orgia-demo-alias-post-filter (node &optional _children)
     (when (listp node)
       (let ((sym (and (symbolp (car node))
		      (assq (car node) orgia-demo-alias-alist))))
	(when sym
	  (setcar node (cdr sym)))))
     node)

   (defun orgia-demo-alias (tree)
     (orgia-transform-tree-deep tree nil #'orgia-demo-alias-post-filter))
#+end_src

#+begin_src elisp :results silent
   (require 'ox)
   (add-to-list 'org-export-filter-parse-tree-functions
                #'orgia-parse-tree-filter)
   (org-link-set-parameters "orgia")

   (require 'ob-org)
   (add-to-list 'orgia-transform-functions #'orgia-demo-alias)
#+end_src

And your test sample, slightly modified:

#+begin_src org :results latex :results replace
   [[orgia:(i nil "The English versions of the " (b nil (i () "Iliad")) 
" and the " (b () (i ()
   "Odyssey")))]]
#+end_src

#+RESULTS:
#+begin_export latex
\emph{The English versions of the \textbf{\emph{Iliad}} and the 
\textbf{\emph{Odyssey}}}
#+end_export

[-- Attachment #2: orgia-draft.el --]
[-- Type: text/x-emacs-lisp, Size: 2080 bytes --]

(defvar orgia-transform-functions nil)

(defun orgia-default-pre-filter (node)
  "Return (NODE . CHILDREN)."
  (if (listp node)
      (cons node node)
    (cons node nil)))

(defun orgia-transform-tree-deep (tree &optional pre-filter post-filter)
  "Depth-first walk over TREE."
  ;; Queue items: ((node-cell . children) . next-list)
  (let* ((pre-filter (or pre-filter #'orgia-default-pre-filter))
	 (top (list tree))
	 (queue (list (cons (cons top top) top))))
    (while queue
      (let* ((item (pop queue))
	     (next-list (cdr item)))
	(if (not next-list)
	    ;; post; skip POST-FILTER for the list wrapping TREE
	    (when (and queue post-filter)
	      (let* ((node-cell-children (car item))
		     (children (cdr node-cell-children)))
		(setcar (car node-cell-children)
			(funcall post-filter
				 (caar node-cell-children)
				 children))))
	  ;; pre
	  (setcdr item (cdr next-list))
	  (push item queue)
	  (let* ((node-children
		  (funcall pre-filter (car next-list)))
		 (node (car node-children))
		 (children (cdr node-children)))
		(setcar next-list node)
		(push (cons (cons next-list children) children) queue)))))
    (car top)))

(defun orgia-element-replace (current new destructive?)
  (if (eq current new)
      current
    (let* ((lst? (and (listp new) (not (symbolp (car new)))))
	   (new-lst (if lst?
			(if destructive? (nconc new) (reverse new))
		      (list new))))
      (dolist (element new-lst)
	(org-element-insert-before element current)))
    (org-element-extract-element current)
    new))

(defun orgia--transform-link (data)
  (if (not (string-equal "orgia" (org-element-property :type data)))
      data
    (let* ((path (org-element-property :path data)))
      (if (not (eq ?\( (aref path 0)))
	  (or path (org-element-contents data))
	(let ((tree (read path)))
	  (dolist (f orgia-transform-functions tree)
	    (setq tree (funcall f tree))))))))

(defun orgia-parse-tree-filter (data _backend info)
  (org-element-map data 'link
    (lambda (data)
      (orgia-element-replace data (orgia--transform-link data) t))
    info nil nil t)
  data)


* [PATCH] Intra-word markup: \relax
  2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier
  2021-12-02 11:18 ` Ihor Radchenko
  2021-12-02 11:58 ` Timothy
@ 2022-01-28 12:12 ` Max Nikulin
  2022-01-28 13:13   ` Juan Manuel Macías
  2 siblings, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2022-01-28 12:12 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1214 bytes --]

On 02/12/2021 17:50, Denis Maier wrote:
> 
> Currently, org syntax doesn't officially seem to support intra-word 
> emphasis. Am I missing something?
> If the assessment is correct: Is there a reason for this? And, shouldn't 
> that be officially added?

I have an idea how to implement *intra*/word/ markup with a minimal change
to Org syntax. At first I hoped that it would be enough to introduce a
\relax entity that expands to an empty string, but it does not work for
the second part of a word: *intra*\relax{}/word/ is exported to
     <b>intra</b>/word/.
So it is necessary to support consuming spaces after such an entity, similar
to TeX commands:
     *intra*\relax /word/
In Org "a\_      b" already behaves in the same way.

I do not like zero-width spaces since they are invisible, so they are 
not really "text" markup. Moreover, it is better to filter them out 
during export.

Another failed idea was to use an export snippet or a macro for this purpose:
     #+macro: sep $1
     *intra*{{{sep()}}}/word/, *intra*@@html:@@/word/

An important point is that the suggested solution works for all export
backends. I do not consider explicit export snippets a workaround since they
require code for every backend in Org files.
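
To see what the modified entity regexp captures, here is a minimal sketch
that applies it to plain strings with `string-match' (the parser itself uses
`looking-at' in a buffer). The regexp is copied from the attached patch; the
names `my/entity-re-with-relax' and `my/entity-name-at-start' are
hypothetical:

#+begin_src elisp
  (defconst my/entity-re-with-relax
    "\\\\\\(?:\\(?1:\\(?:_\\|relax\\) +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)"
    "Entity regexp with the proposed \"relax\" alternative.")

  ;; Hypothetical helper, for illustration only.
  (defun my/entity-name-at-start (text)
    "Return the entity name matched at the start of TEXT, or nil."
    (when (eq 0 (string-match my/entity-re-with-relax text))
      (match-string 1 text)))

  (my/entity-name-at-start "\\relax /word/")   ; => "relax "
  (my/entity-name-at-start "\\relax{}/word/")  ; => "relax"
#+end_src

In the first case the trailing space is part of the captured name, which is
why the patch registers "relax" followed by a varying number of spaces in
the entity table.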

[-- Attachment #2: 0001-Intra-word-markup-relax.patch --]
[-- Type: text/x-patch, Size: 2278 bytes --]

From 95a0dcb1370577409388e137dae98ec4c1af5bbd Mon Sep 17 00:00:00 2001
From: Max Nikulin <manikulin@gmail.com>
Date: Fri, 28 Jan 2022 18:55:54 +0700
Subject: [PATCH] Intra-word markup: \relax

lisp/org-element.el (org-element-entity-parser): Parse \relax entity
with following spaces.

lisp/org-entities.el (org-entities): Add "\relax " entities with various
number of spaces expanding to nothing.

Allow "*intra*\relax /word/" markup change within a continuous word.  It
is not enough to just add "relax" entity since while it allows
"*intra*\relax{}word", characters after "{}" are not considered as
emphasis markers "intra\relax{}/word/".  The name is similar to the TeX
command. Consuming spaces following a command is usual behavior of TeX
commands as well.
---
 lisp/org-element.el  | 2 +-
 lisp/org-entities.el | 7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index b82475a14..83001fd74 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -3159,7 +3159,7 @@ a plist with `:begin', `:end', `:latex', `:latex-math-p',
 
 Assume point is at the beginning of the entity."
   (catch 'no-object
-    (when (looking-at "\\\\\\(?:\\(?1:_ +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
+    (when (looking-at "\\\\\\(?:\\(?1:\\(?:_\\|relax\\) +\\)\\|\\(?1:there4\\|sup[123]\\|frac[13][24]\\|[a-zA-Z]+\\)\\(?2:$\\|{}\\|[^[:alpha:]]\\)\\)")
       (save-excursion
 	(let* ((value (or (org-entity-get (match-string 1))
 			  (throw 'no-object nil)))
diff --git a/lisp/org-entities.el b/lisp/org-entities.el
index 2bd4f2fe3..f6177c471 100644
--- a/lisp/org-entities.el
+++ b/lisp/org-entities.el
@@ -526,7 +526,12 @@ packages to be loaded, add these packages to `org-latex-packages-alist'."
 		     spaces
 		     spaces
 		     (make-string n ?\x2002))
-	       space-entities)))))
+	       space-entities))))
+   ;; Add "\relax " space-eating entity family for "intra\relax *word*" markup.
+   (mapcar (lambda (n)
+             (list (concat "relax" (make-string n ? )) "" nil "" "" "" ""))
+             (number-sequence 0 20)))
+
   "Default entities used in Org mode to produce special characters.
 For details see `org-entities-user'.")
 
-- 
2.25.1



* Re: [PATCH] Intra-word markup: \relax
  2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin
@ 2022-01-28 13:13   ` Juan Manuel Macías
  2022-02-02 15:42     ` Max Nikulin
  0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2022-01-28 13:13 UTC (permalink / raw)
  To: Max Nikulin; +Cc: orgmode

Max Nikulin writes:

> I have an idea how to implement *intra*/word/ markup with minimal
> change of Org syntax. At first I had a hope that it is enough to
> introduce \relax entity that expands to empty string, but it does not
> work for second part of words: *intra*\relax{}/word/ is exported to
>     <b>intra</b>/word/.
> So it is necessary to support consuming spaces after such entity
> similar to TeX commands:
>     *intra*\relax /word/
> In Org "a\_      b" already behaves in the same way.
>
> I do not like zero-width spaces since they are invisible, so they are
> not really "text" markup. Moreover, it is better to filter them out 
> during export.
>
> Another failed idea was to use export snippet or a macro for such purpose:
>     #+macro sep $1
>     *intra*{{{sep()}}}/word/, *intra*@@html:@@/word/
>
> Important point that suggested solution works for all export backends.
> I do not consider explicit export snippets as a workaround since it 
> requires code for all backends in org files.

Maxim, I find the idea of a \relax entity interesting. The only (minor)
drawback I find (in normal use, I mean) is the verbosity it adds.

In my case, I have already given up on the problem of markup inside words
:-(. My personal opinion: I think that, unless a completely
'revolutionary' solution emerges, it is better to leave the matter as it
is, and consider this a feature of Org rather than a bug. I suspect that
a single solution could not satisfy all tastes or all possible
scenarios, so maybe it would be nice to put a list of solutions
(including this one and also the zero-width space thing, and others that
have arisen or may arise) somewhere (perhaps in the manual?). What doesn't
quite convince me (and I agree with you on that) is recommending the
zero-width space as a sort of 'official' escape character, for the reasons
you have expressed, which I think are very fair.

Best regards,

Juan Manuel 



* Re: Org-syntax: Intra-word markup
  2021-12-04 21:48                 ` Tom Gillespie
  2021-12-06 10:59                   ` Max Nikulin
@ 2022-01-28 14:52                   ` Max Nikulin
  2022-01-29  3:13                     ` Ihor Radchenko
  1 sibling, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2022-01-28 14:52 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: emacs-orgmode

On 05/12/2021 04:48, Tom Gillespie wrote:
>> Since org is a valid export backend though, perhaps this behaviour should be
>> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
>> fairly nicely.
> 
> ...
> 
> What this means is that @@:...@@ syntax is not actually used
> in Org at all at the moment and renders as plain text. I agree that
> we need to avoid @@org:..@@ because it has legitimate uses.
> Making a back-end of empty string valid for parse separately
> syntax thus makes @@ syntax more regular overall, and allows
> @@:...@@ to be processed separately because it currently
> never enters the export snippet processing.

It seems that @@:...@@ should behave significantly differently from a
regular export snippet since Org markup should be parsed inside.

It could be used for one more purpose. I miss a "fallback" option for
export snippets. E.g. if explicit raw markup is specified for HTML and
LaTeX, it would be nice to have something for other backends such as
ascii or odt. In a series of adjacent export snippets, @@:...@@ may be
taken when the backends in earlier snippets are not matched:

     @@html:HTML 1@@@@latex:LaTeX 1@@@@:ascii and odt 1@@@@html: HTML 
2@@@@:LaTeX, ascii, and odt 2@@.

At first I complained that it would be impossible to put export snippets 
in "parse separately" construct with @@:...@@ syntax. Likely it is not 
necessary. It is a bit verbose, but "parse separately" may be split:
    @@:part 1@@@@html:html-only@@@@:@@@@:part 2@@
Empty @@:@@ is added to avoid considering @@:part 2@@ as a fallback for 
"html-only".

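The selection rule can be sketched over a plain list of snippets. This is
only a model of the proposal, not existing Org behaviour: the function name
`my/snippet-fallback-select' is hypothetical, snippets are given as
(BACKEND . TEXT) pairs with an empty backend name standing for @@:...@@,
and backends are compared by exact name, ignoring derived backends.

#+begin_src elisp
  ;; Hypothetical model of the proposed fallback rule.
  (defun my/snippet-fallback-select (snippets backend)
    "Return the text selected from adjacent SNIPPETS for BACKEND.
  SNIPPETS is a list of (BACKEND-NAME . TEXT) pairs in document order.
  An empty BACKEND-NAME marks a fallback snippet."
    (let ((matched nil)
          (parts nil))
      (dolist (snippet snippets)
        (let ((name (car snippet))
              (text (cdr snippet)))
          (cond
           ((string= name "")
            ;; A fallback is taken only if nothing matched since the
            ;; previous fallback; it also starts a new run.
            (unless matched (push text parts))
            (setq matched nil))
           ((string= name backend)
            (push text parts)
            (setq matched t)))))
      (apply #'concat (nreverse parts))))

  (my/snippet-fallback-select
   '(("" . "part 1") ("html" . "html-only") ("" . "") ("" . "part 2"))
   "html")
  ;; => "part 1html-onlypart 2"
#+end_src

With the same list and "latex" as the backend, the result is
"part 1part 2": the empty fallback replaces "html-only", and "part 2" is
still taken because it starts a new run.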


* Re: Org-syntax: Intra-word markup
  2022-01-28 14:52                   ` Max Nikulin
@ 2022-01-29  3:13                     ` Ihor Radchenko
  2022-01-29 13:05                       ` Juan Manuel Macías
  0 siblings, 1 reply; 72+ messages in thread
From: Ihor Radchenko @ 2022-01-29  3:13 UTC (permalink / raw)
  To: Max Nikulin; +Cc: Tom Gillespie, emacs-orgmode

Max Nikulin <manikulin@gmail.com> writes:

> It could be used for one more purpose. I miss a "fallback" option for 
> export snippets. E.g. if explicit raw markup is specified for HTML and 
> LaTeX, it would be nice to have something for other backends such as 
> ascii or odt. In a series of adjacent export snippets, @@:...@@ may be 
> taken when the backends of the earlier snippets do not match:

This reminds me of our #+begin_export export blocks and #+begin_*
special blocks. We can think of @@backend:...@@ snippets as the inline
equivalent of export blocks. Special blocks do not have an inline
equivalent (except maybe links abused for export by some people).

Keeping in mind the above analogy, note that export blocks do not have
fallbacks, while special blocks do (for example, see
https://github.com/alhassy/org-special-block-extras/).

Maybe we should introduce an equivalent of special blocks, but for
inline use? Or should we modify _both_ inline export snippets and export
blocks to allow a fallback mechanism?

Best,
Ihor



* Re: Org-syntax: Intra-word markup
  2022-01-29  3:13                     ` Ihor Radchenko
@ 2022-01-29 13:05                       ` Juan Manuel Macías
  2022-02-02 15:28                         ` Max Nikulin
  0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2022-01-29 13:05 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: orgmode

Ihor Radchenko writes:

> Maybe we should introduce an equivalent of special blocks, but for
> inline use? Or should we modify _both_ inline export snippets and export
> blocks to allow a fallback mechanism?

I find the idea of inline special blocks very interesting, but I think
there are a couple of drawbacks: since special blocks support ATTR_X,
how would that be implemented in the inline version? The most obvious
thing I can think of is to mimic inline code blocks:

my_special_block[attributes list]{content}

But the result would often be too verbose. Another risk this would
entail, IMHO, is the "LaTeXification" of Org...

In any case, for things like that, aren't links and macros enough? I'm
one of those who 'abuse' links for many export scenarios (I have even
written this package:
https://gitlab.com/maciaschain/org-critical-edition), and I think links
have enormous potential and versatility. John Kitchin's blog has really
helped me open my mind and explore that very productive Org component.
Macros are also a very powerful tool, except for the comma issue, which
I think is still unfinished business and for which a solution should be
found one day. Still, the possibility of a special inline block is very
interesting to me.

Best regards,

Juan Manuel 



* Re: Org-syntax: Intra-word markup
  2022-01-29 13:05                       ` Juan Manuel Macías
@ 2022-02-02 15:28                         ` Max Nikulin
  2022-02-02 20:01                           ` Juan Manuel Macías
  0 siblings, 1 reply; 72+ messages in thread
From: Max Nikulin @ 2022-02-02 15:28 UTC (permalink / raw)
  To: emacs-orgmode

> Ihor Radchenko writes:
>> Keeping in mind the above analogy, note that export blocks do not have
>> fallbacks, while special blocks do (for example, see
>> https://github.com/alhassy/org-special-block-extras/)

Ihor, I am sorry, but I missed your point. That project provides a 
set of predefined link+block pairs and some macros to define new 
links/pairs. I do not see the relation to export snippets or export 
blocks, which are used when their content is not intended to be reusable.

>> Maybe we should introduce an equivalent of special blocks, but for
>> inline use? Or should we modify _both_ inline export snippets and export
>> blocks to allow a fallback mechanism?

I suppose it would be consistent to treat adjacent export blocks as 
alternatives and to allow a "fallback" or "default" block. Again, as 
with @@:...@@ snippets, the block content should be parsed as Org markup.

On 29/01/2022 20:05, Juan Manuel Macías wrote:
> I find the idea of inline special blocks very interesting, but I think
> there are a couple of drawbacks: since special blocks support ATTR_X,
> how would that be implemented in the inline version? The most obvious
> thing I can think of is to mimic inline code blocks:
> 
> my_special_block[attributes list]{content}

ATTR_X attributes are supported for links as well, see
info "(org) Links in HTML export"
https://orgmode.org/manual/Links-in-HTML-export.html
However, this is rather verbose, may have problems with LaTeX, and I am 
unsure whether the attributes can be accessed from export link handlers.

Actually I do not like the src_something[...]{...} syntax, since there is 
no clear marker (such as "\") at the beginning to signal that it is a 
special construct.

> In any case, for things like that, aren't links and macros enough?

Ad hoc code for particular backends (and the discussed fallback for 
other backends) is a somewhat different thing. It may be used in macros, 
but macros cannot replace it. Moreover, the @@:...@@ construct proposed 
by Tom would allow e.g.
    [[https://orgmode.org][@@:*inter*@@@@:/word/@@]]
to be half bold and half italic within a single word, without invisible 
zero width spaces and filters to remove them.




* Re: [PATCH] Intra-word markup: \relax
  2022-01-28 13:13   ` Juan Manuel Macías
@ 2022-02-02 15:42     ` Max Nikulin
  0 siblings, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2022-02-02 15:42 UTC (permalink / raw)
  To: emacs-orgmode

On 28/01/2022 20:13, Juan Manuel Macías wrote:
> Max Nikulin writes:
> 
>> I have an idea how to implement *intra*/word/ markup with minimal
>> change of Org syntax. At first I had a hope that it is enough to
>> introduce \relax entity that expands to empty string, but it does not
>> work for second part of words: *intra*\relax{}/word/ is exported to
>>      <b>intra</b>/word/.
>> So it is necessary to support consuming spaces after such entity
>> similar to TeX commands:
>>      *intra*\relax /word/
>> In Org "a\_      b" already behaves in the same way.
> 
> Maxim, I find the idea of \relax entity interesting. The only (minor)
> drawback I find (in normal use, I mean) is the verbosity it adds.

"Relax" is just a name known to TeX users. Certainly another, shorter 
word may be used instead. I am just too lazy to look through the HTML 
named entities and LaTeX commands to avoid conflicts, and thus behavior 
unexpected to some users.

> In my case, I have already given up on the problem of marks inside words
> :-(. My personal opinion: I think that, unless a completely
> 'revolutionary' solution emerges, it is better to leave the matter as it
> is, and consider this a feature of Org rather than a bug. I suspect that
> a single solution could not satisfy all tastes or all possible
> scenarios, so maybe it would be nice to put a list of solutions
> (including this one and also the zero space thing, and others that have
> arisen or may arise) somewhere (perhaps in the manual?).

The day before, I posted my current summary of why export snippets and 
macros do not help with intra-word markup (earlier I expected that they 
could); only custom links are a workaround (with some limitations, as usual):

[RFC] Creole-style / Support for **emphasis**__within__**a word**
Tue, 25 Jan 2022 23:27:50 +0700.
https://list.orgmode.org/ssp8e7$ah2$1@ciao.gmane.io/

But at that moment I had forgotten about entities. Another topic served 
as a reminder, and I spent some time experimenting with them.




* Re: Org-syntax: Intra-word markup
  2022-02-02 15:28                         ` Max Nikulin
@ 2022-02-02 20:01                           ` Juan Manuel Macías
  2022-02-03 12:10                             ` Max Nikulin
  0 siblings, 1 reply; 72+ messages in thread
From: Juan Manuel Macías @ 2022-02-02 20:01 UTC (permalink / raw)
  To: Max Nikulin; +Cc: orgmode

Max Nikulin writes:

> ATTR_X attributes are supported for links as well, see
> info "(org) Links in HTML export"
> https://orgmode.org/manual/Links-in-HTML-export.html
> However it is rather verbose, may have problems with LaTeX, and I am
> unsure if they can be accessed from export link handlers

Yes, I know. I often use constructions of this type in my blogs:

#+ATTR_HTML: :target _blank
some link...

But, as far as I know, its use is line-oriented. I mean, you can't use
multiple ATTR_X constructs inside a paragraph and for different links
inside the paragraph.

As for links and their multiple possible or future uses (I say *uses*
and never *abuses*: it's a tool, it's there to be used, and it works
great), of course I see them more as a resource (and quite a powerful
and versatile one, by the way) than as a matter of syntax. But the thing
is that for me Org is, in addition to a syntax, above all a set of
coherently assembled resources to prepare my documents, take my
notes, organize my work, and do a lot of other things.

Best regards,

Juan Manuel 




* Re: Org-syntax: Intra-word markup
  2022-02-02 20:01                           ` Juan Manuel Macías
@ 2022-02-03 12:10                             ` Max Nikulin
  0 siblings, 0 replies; 72+ messages in thread
From: Max Nikulin @ 2022-02-03 12:10 UTC (permalink / raw)
  To: emacs-orgmode

On 03/02/2022 03:01, Juan Manuel Macías wrote:
> Max Nikulin writes:
> 
>> ATTR_X attributes are supported for links as well, see
>> info "(org) Links in HTML export"
>> https://orgmode.org/manual/Links-in-HTML-export.html
>> However it is rather verbose, may have problems with LaTeX, and I am
>> unsure if they can be accessed from export link handlers
> 
> Yes, I know. I use a lot in my blogs constructions of this type:
> 
> #+ATTR_HTML: :target _blank
> some link...

I have just realized that the example in the manual does not work. I will 
start a new thread. The attributes are attached to the paragraph, not to 
the link itself:

#+ATTR_HTML: :title The Org mode homepage :style color:red;
[[https://orgmode.org]]

<p title="The Org mode homepage" style="color:red;">
<a href="https://orgmode.org" title="The Org mode homepage" 
style="color:red;">https://orgmode.org</a>
</p>

> But, as far as I know, its use is line-oriented. I mean, you can't use
> multiple ATTR_X constructs inside a paragraph and for different links
> inside the paragraph.

Thank you. I had confused issues related to export when keywords and 
export blocks are used. For some reason I believed that affiliated 
keywords have a dedicated section in 
https://orgmode.org/worg/dev/org-syntax.html because they can be applied 
to inline objects, but you are right: they set properties on the next 
block-level element.

Attributes from several lines are combined, however.

The following snippets illustrate bugs in the LaTeX exporter that I 
remember from an earlier discussion:

---- >8 ----

This is a single paragraph in LaTeX export, but 3 HTML paragraphs.
First link (with =rel= attribute) is to
#+attr_html: :rel nofollow :title Org Mode web site
[[https://orgmode.org/][Org Mode]].
Another one is to
#+attr_html: :rel noopener
#+attr_html: :title GNU web site
[[https://www.gnu.org/][GNU]]. Both links have =title= HTML attributes.

This is single paragraph in HTML
@@odt:@@
but 2 paragraphs in LaTeX.

---- 8< ----

This is a single paragraph in \LaTeX{} export, but 3 HTML paragraphs.
First link (with \texttt{rel} attribute) is to
\href{https://orgmode.org/}{Org Mode}.
Another one is to
\href{https://www.gnu.org/}{GNU}. Both links have \texttt{title} HTML 
attributes.

This is single paragraph in HTML

but 2 paragraphs in \LaTeX{}.

---- >8 ----

<p>
This is a single paragraph in LaTeX export, but 3 HTML paragraphs.
First link (with <code>rel</code> attribute) is to
</p>
<p rel="nofollow" title="Org Mode web site">
<a href="https://orgmode.org/" rel="nofollow" title="Org Mode web 
site">Org Mode</a>.
Another one is to
</p>
<p title="GNU web site" rel="noopener">
<a href="https://www.gnu.org/" title="GNU web site" 
rel="noopener">GNU</a>. Both links have <code>title</code> HTML attributes.
</p>

<p>
This is single paragraph in HTML

but 2 paragraphs in LaTeX.</p>




end of thread, other threads:[~2022-02-03 12:13 UTC | newest]

Thread overview: 72+ messages
2021-12-02 10:50 Org-syntax: Intra-word markup Denis Maier
2021-12-02 11:18 ` Ihor Radchenko
2021-12-02 11:30   ` Juan Manuel Macías
2021-12-02 11:36     ` Denis Maier
2021-12-02 12:01       ` Ihor Radchenko
2021-12-02 11:42     ` Marco Wahl
2021-12-02 11:50       ` Denis Maier
2021-12-02 12:10         ` Ihor Radchenko
2021-12-02 12:40           ` Denis Maier
2021-12-02 12:54             ` Ihor Radchenko
2021-12-02 13:14               ` Juan Manuel Macías
2021-12-02 13:28                 ` Denis Maier
2021-12-02 12:48           ` Max Nikulin
2021-12-02 12:02       ` Ihor Radchenko
2021-12-02 12:00     ` Ihor Radchenko
     [not found]       ` <87r1avtdjy.fsf@ucl.ac.uk>
2021-12-02 12:27         ` Denis Maier
2021-12-02 13:06           ` Eric S Fraga
2021-12-02 12:28       ` Denis Maier
2021-12-02 12:55         ` Ihor Radchenko
2021-12-02 11:58 ` Timothy
2021-12-02 12:26   ` Denis Maier
2021-12-02 13:07     ` Ihor Radchenko
2021-12-02 15:51       ` Max Nikulin
2021-12-02 18:11         ` Tom Gillespie
2021-12-02 19:09           ` Juan Manuel Macías
2021-12-04 13:07             ` Org-syntax: emphasis and not English punctuation Max Nikulin
2021-12-04 16:42               ` Juan Manuel Macías
2021-12-02 20:47           ` Org-syntax: Intra-word markup Denis Maier
2021-12-02 22:44             ` Samuel Wales
2021-12-03 14:53           ` Max Nikulin
2021-12-03 23:51           ` Tim Cross
2021-12-04 15:01             ` Max Nikulin
2021-12-05 23:34               ` Russell Adams
2021-12-05 23:37             ` Russell Adams
2021-12-06  1:39               ` Samuel Wales
2021-12-02 19:03       ` Nicolas Goaziou
2021-12-02 19:34         ` Juan Manuel Macías
2021-12-02 23:05           ` Nicolas Goaziou
2021-12-02 23:24             ` Juan Manuel Macías
2021-12-03 14:24         ` Max Nikulin
2021-12-03 15:01           ` Juan Manuel Macías
2021-12-04 15:57           ` Denis Maier
2021-12-04 17:53             ` Tom Gillespie
2021-12-04 18:37               ` John Kitchin
2021-12-04 21:16                 ` Juan Manuel Macías
2021-12-06 10:57                 ` Raw Org AST snippets for "impossible" markup Max Nikulin
2021-12-06 15:45                   ` Juan Manuel Macías
2021-12-06 16:56                     ` Juan Manuel Macías
2021-12-08 13:09                     ` Max Nikulin
2021-12-08 23:19                       ` Juan Manuel Macías
2021-12-08 23:35                         ` John Kitchin
2021-12-09  7:01                           ` Juan Manuel Macías
2021-12-09 14:56                             ` Max Nikulin
2021-12-09 16:11                               ` Juan Manuel Macías
2021-12-09 22:27                                 ` Juan Manuel Macías
2022-01-03 14:34                                   ` Max Nikulin
2021-12-04 19:04               ` Org-syntax: Intra-word markup Timothy
2021-12-04 21:48                 ` Tom Gillespie
2021-12-06 10:59                   ` Max Nikulin
2022-01-28 14:52                   ` Max Nikulin
2022-01-29  3:13                     ` Ihor Radchenko
2022-01-29 13:05                       ` Juan Manuel Macías
2022-02-02 15:28                         ` Max Nikulin
2022-02-02 20:01                           ` Juan Manuel Macías
2022-02-03 12:10                             ` Max Nikulin
2021-12-06 11:01               ` Denis Maier
2022-01-28 12:12 ` [PATCH] Intra-word markup: \relax Max Nikulin
2022-01-28 13:13   ` Juan Manuel Macías
2022-02-02 15:42     ` Max Nikulin
  -- strict thread matches above, loose matches on Subject: below --
2021-12-02 13:36 Org-syntax: Intra-word markup autofrettage
2021-12-02 15:24 ` Robert Pluim
2021-12-02 17:11   ` autofrettage

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
