emacs-orgmode@gnu.org archives
* On zero width spaces and Org syntax
@ 2021-12-03 12:48 Juan Manuel Macías
  2021-12-03 19:03 ` Greg Minshall
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Juan Manuel Macías @ 2021-12-03 12:48 UTC (permalink / raw)
  To: orgmode

Hi all,

It is usually recommended, as you know, to insert a zero width space
character (Unicode U+200B) as a sort of delimiter mark to solve the
scenarios of emphasis within a word (for example, =/meta/literature=)
and other contexts where emphasis marks are not recognized (for example
=[/literature/]=). I believe that as an ad hoc workaround it is not bad;
however, I find it problematic that this character is part, more or less
de facto, of the Org syntax. For two main reasons:

1. It is an invisible character, and therefore it is difficult to
control and manage. I think it is not good practice to introduce this
type of character implicitly in a plain text document.

2. It is more natural for this type of space character to be part of the
'output' and not of the 'input'. In the input it is better to introduce
them not implicitly but through their representation. For example, in
LaTeX (with LuaTeX) using the command '\char"200B{}' (or '^^^^200b'),
'&#x200B;' in HTML, etc.
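
To illustrate point 1, here is a minimal Python sketch of how such characters could be made visible or located in a plain text file. The function names and the set of characters are illustrative assumptions, not part of Org or Emacs.

```python
import unicodedata

# Characters discussed in this thread; an illustrative choice,
# not an exhaustive list of invisible Unicode characters.
INVISIBLE = {"\u200b", "\u2060", "\ufeff"}


def reveal_invisible(text: str) -> str:
    """Replace each listed invisible character with a visible <U+XXXX> tag."""
    return "".join(
        f"<U+{ord(ch):04X}>" if ch in INVISIBLE else ch for ch in text
    )


def find_invisible(text: str):
    """Return (line, column, Unicode name) for each listed character found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in INVISIBLE:
                hits.append((lineno, col, unicodedata.name(ch)))
    return hits


sample = "/meta/\u200bliterature"
print(reveal_invisible(sample))
print(find_invisible(sample))
```

Run on the sample, the first call tags the hidden character in place and the second reports where it sits, which is roughly what one has to do by hand today to audit a file.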

In any case, as an implicit character, I do not see it appropriate for
the syntax of a markup language. The marks should be simply ASCII
characters, IMHO. So what if Org had a specific delimiter mark for the
scenarios described above? For example, something like this:

#+begin_example

/meta/''literature

*meta*''literature

[''*literature*'']

#+end_example

WDYT?

Best regards,

Juan Manuel 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: On zero width spaces and Org syntax
  2021-12-03 12:48 On zero width spaces and Org syntax Juan Manuel Macías
@ 2021-12-03 19:03 ` Greg Minshall
  2021-12-03 20:30   ` Juan Manuel Macías
  2021-12-03 21:48 ` Tim Cross
  2021-12-04  6:43 ` Marcin Borkowski
  2 siblings, 1 reply; 15+ messages in thread
From: Greg Minshall @ 2021-12-03 19:03 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: orgmode

Juan Manuel,

> however, I find it problematic that this character is part, more or
> less de facto, of the Org syntax. For two main reasons:

in fact, i am always queasy when i enter ZWNBSP in a .org (or any other)
file.  some sort of "visible" sequence would be great.  backwards
compatibility might be a problem.

your last example

: [''*literature*'']

seems a bit of sleight-of-hand, though.  iiuc, text inside square
brackets isn't highlighted currently, and ZWNBSP doesn't (afaict) turn
on highlighting.  (maybe there's been recent discussion, modifications
of this?)

i.e., if the goal is to *expand* the realm of highlighting, might that
not be a separate issue?

cheers, Greg



* Re: On zero width spaces and Org syntax
  2021-12-03 19:03 ` Greg Minshall
@ 2021-12-03 20:30   ` Juan Manuel Macías
  0 siblings, 0 replies; 15+ messages in thread
From: Juan Manuel Macías @ 2021-12-03 20:30 UTC (permalink / raw)
  To: Greg Minshall; +Cc: orgmode

Hi Greg, thank you for your comment,

Greg Minshall writes:

> in fact, i am always queasy when i enter ZWNBSP in a .org (or any other)
> file.  some sort of "visible" sequence would be great.  backwards
> compatibility might be a problem.

Yes, I agree. I think that in this case a new mark would not compromise
backward compatibility, as this presumed new mark would serve the same
function as the zero width space: i.e. delimit to preserve emphasis. Of
course one could go on using a zero-width space, though I keep thinking
that this is rather an ad hoc workaround and should not form part of the
syntax.

> your last example
>
> : [''*literature*'']
>
> seems a bit of sleight-of-hand, though.  iiuc, text inside square
> brackets isn't highlighted currently, and ZWNBSP doesn't (afaict) turn
> on highlighting.  (maybe there's been recent discussion, modifications
> of this?)

The idea would be to use a kind of 'protection mark', to allow something
in a context where it is not allowed: a passport ;-). As the emphasis
marks are recognized before and after a single quote, I thought that
maybe a sequence of two single quotes could function here as a
protection mark (screenshot: https://i.imgur.com/cPIH9qa.png). For
example:

#+begin_example
| Some examples where emphasis marks are not allowed | Protected emphasis marks |
|----------------------------------------------------+--------------------------|
| /meta/literature                                   | /meta/''literature       |
| [/literature/]                                     | [''/literature/'']       |
| <*literature*>                                     | <''*literature*''>       |
| meta/*literature*                                  | meta/''*literature*      |
#+end_example

With the protection marks we get (in LaTeX for example):

\emph{meta}literature
[\emph{literature}]
<\textbf{literature}>
meta/\textbf{literature}
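
As a rough illustration of the idea that the new mark would play exactly the role of the zero width space, the following Python sketch rewrites the proposed '' mark into U+200B wherever it touches an emphasis marker. This is only a hypothetical preprocessing step: the function name, the regular expression, and the choice of marker characters are assumptions, it is not how Org parses emphasis, and it does not check whether the zero width space actually enables emphasis in every context shown in the table.

```python
import re

ZWSP = "\u200b"

# Two single quotes directly after or directly before an emphasis marker.
# The marker set (* / _ = ~ +) is an assumption made for this sketch.
PROTECTION_MARK = re.compile(r"(?<=[*/_=~+])''|''(?=[*/_=~+])")


def protect_to_zwsp(text: str) -> str:
    """Rewrite the hypothetical '' protection mark as a zero width space."""
    return PROTECTION_MARK.sub(ZWSP, text)


for line in (
    "/meta/''literature",
    "[''/literature/'']",
    "<''*literature*''>",
    "meta/''*literature*",
):
    # Show the inserted character visibly so the result can be inspected.
    print(protect_to_zwsp(line).replace(ZWSP, "<U+200B>"))
```

Two single quotes that are not next to an emphasis marker are left untouched, so ordinary text containing '' is not affected by this sketch.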

Best regards,

Juan Manuel 



* Re: On zero width spaces and Org syntax
  2021-12-03 12:48 On zero width spaces and Org syntax Juan Manuel Macías
  2021-12-03 19:03 ` Greg Minshall
@ 2021-12-03 21:48 ` Tim Cross
  2021-12-04  1:26   ` Juan Manuel Macías
                     ` (2 more replies)
  2021-12-04  6:43 ` Marcin Borkowski
  2 siblings, 3 replies; 15+ messages in thread
From: Tim Cross @ 2021-12-03 21:48 UTC (permalink / raw)
  To: emacs-orgmode


I think I am in agreement regarding most of your points about the use of
the zero-width character. I see it as a type of escape hatch which
provides a solution in some less frequent situations. It is a somewhat
clever kludge to enable markup in some situations not supported by the
basic markup syntax. I'm happy with its status as a kludge and would not
want to see it become an official part of the syntax. Where we may
differ is in whether we actually want to add inner word markup support
at all. 

I'm somewhat surprised and more than a little concerned at how much
interest and focus on modifying the markup syntax of org the question of
inner word markup has generated. This seems to be a symptom of a more
general trend towards adding and extending org mode to meet the needs of
everyone and I'm concerned this is overlooking the key strength of org
mode - simplicity.

Consider how many times we have had requests for inner word markup in
the last 18 years. I've seen such requests only a very few times.
Certainly not frequently enough to consider modification of the markup
syntax to accommodate such a requirement.

A key philosophy of org mode is simplicity - it makes the easy stuff
simple and the hard stuff possible. The thing about simple solutions is
that they will inevitably have limitations. If you don't want those
limitations, then you use a more complex, feature-rich markup, such as
LaTeX, HTML, XML, etc. Ideally, your system will provide some escape
hatches to allow you to do things not supported by the base markup
syntax. Those escape hatches will usually be less convenient and often
look quite ugly, but that is fine because they are an escape hatch
which is used infrequently. Better still is if the system provides some
way to make a specific escape hatch easier to use in a document (such as
via a macro). The basic org markup syntax has worked remarkably well for
18 years. Nearly all the proposed additions or alterations to support
inner word markup will complicate the syntax or introduce potential new
ambiguities and/or complexity in processing to support a feature which
has been rarely asked for and which has other, less convenient and often
ugly, solutions which work.

One of org's strengths has been the ability to export documents to
multiple formats. One way this has been made possible is by keeping the
markup syntax simple - a basic markup which is well supported by all
export back ends. Once you start adding more complex markup support, you
see a blow out of complexity in the export back ends. Worse yet, you get
results which are surprising to the end user or which simply don't work
correctly with some formats. To avoid this, it is critical to keep the
markup syntax as simple and straight-forward as possible, even if that
means some limitations on what can be done with the markup. 

My vote is to simply maintain the status quo. Don't modify the syntax,
don't make the zero width space character somewhat special or processed in any
special way during export. In short, accept that inner word markup has
only limited support and if that is a requirement which is critical to
your use case, accept that org mode may not be the right solution for
your requirements. 



* Re: On zero width spaces and Org syntax
  2021-12-03 21:48 ` Tim Cross
@ 2021-12-04  1:26   ` Juan Manuel Macías
  2021-12-04  4:04     ` Tom Gillespie
  2021-12-04 15:26   ` Max Nikulin
  2021-12-06 11:40   ` Eric S Fraga
  2 siblings, 1 reply; 15+ messages in thread
From: Juan Manuel Macías @ 2021-12-04  1:26 UTC (permalink / raw)
  To: Tim Cross; +Cc: orgmode

Thank you very much for the detailed and precise exposition of your
point of view. I appreciate it.

First of all, a point that I consider important and essential in this
and other debates that are generated here, is that there is no single
conception of Org that should prevail as (say) "the canon". Org is so
polyhedral and so multifaceted that there are as many conceptions of Org
as there are users of Org. Well, what I have said is in itself one more
conception of Org. But I assume that other users may think that Org is
not all the things that I say it is. At the end of the day, only one
thing matters, over and above theories and doctrines: if Org is useful
to you and helps you do your work, great. A few months ago
(and I think I already shared it here) I finished the typesetting and
layout of a dictionary of almost 1000 pages, and I did it using a
workflow that I have developed which is a merge between Org/Org-Publish
and LuaTeX. And now, using the same method, I am working on an
ancient-Greek/Spanish bilingual critical edition. So I don't think I
can be suspected of believing that Org doesn't cover the needs of my
workflow.

As for the matter of emphasis marks between words: I believe that this
is not the underlying problem, but rather the (small) inconsistency of
the markup in certain contexts. Think, for example, of a text where you
have to put many words in italics, enclosed between brackets. I don't
care if that type of text is 'typical' or 'non-typical', 'majority' or
'non-majority'. It is simply a kind of scenario that is absolutely
legitimate and feasible, and right now I could cite more than one type
of text in that direction.

Since I have been using Org I have been running into these little
inconsistencies. None is insurmountable, of course, nor have I had to
abandon Org over such minor issues. Fortunately, Org is more than just a
markup language, and it offers lots of alternative resources and
extensibility. Org is GNU Emacs. Org is not Markdown.

My proposal here also does not arise from an irrepressible desire to add
more complexity to the syntax. If it's recommended that the user, in
certain contexts, enter implicitly a zero-width space (which, I insist,
is a practice that should be avoided as much as possible in a plain text
document), why not at least offer a graphical alternative, a *real* mark
whose role is *exactly* the same as that of the zero-width space? Is that
adding more complexity??? Honestly I think that's exactly the opposite.

In any case, I have suggested that new mark as a possibility, in case it
is interesting to implement it, since a thread has emerged these days
about the topic of intra-word syntax. Discussions and threads that
arise about these questions and any others are perfectly legitimate,
natural and welcome. Please: there are no issues more 'important' than
others; no two users are the same in Org. What you do not find useful,
another user may perhaps find indispensable. And vice versa. And I
think no one is in a position to state what the average Org user does
or does not want, given that we do not know even 1% of Org users.

Best regards,

Juan Manuel 



* Re: On zero width spaces and Org syntax
  2021-12-04  1:26   ` Juan Manuel Macías
@ 2021-12-04  4:04     ` Tom Gillespie
  2021-12-04  5:29       ` Juan Manuel Macías
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Gillespie @ 2021-12-04  4:04 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: Tim Cross, orgmode

An important note: for intra-word markup you probably want to
use word joiner U+2060 and not zero width space, because a
zero width space allows layout to break the word, whereas a
word joiner does not. We may need to check to make sure that
U+2060 counts as whitespace for the purposes of markup.
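
As a quick way to compare the two characters, here is a small Python sketch that prints their Unicode name, general category, and whether Python's own `str.isspace` treats them as whitespace. The helper name `describe` is an assumption for illustration. Note that this reflects the Unicode character database as seen by Python; it says nothing about how Emacs syntax tables or Org's emphasis regexps classify these characters, which would still need to be checked separately.

```python
import unicodedata


def describe(ch: str):
    """Return (code point, Unicode name, general category, str.isspace result)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        ch.isspace(),
    )


# Zero width space, word joiner, no-break space, and an ordinary space.
for ch in ("\u200b", "\u2060", "\u00a0", " "):
    print(*describe(ch))
```

In the Unicode data both U+200B and U+2060 belong to the format category (Cf) rather than to a space category, so neither is whitespace in the strict Unicode sense; any tool that treats one of them as a delimiter does so by its own rules.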

> 2. It is more natural that this type of space characters are part of the
> 'output' and not of the 'input'.

That is not relevant in this case. However, Org export should not be
emitting byte-literal zero width spaces either; that causes a NASTY
surprise for the user. All that Org does in this case is pass something
along for the user. The kludge is a kludge because it just happens to
be compatible with Org syntax, that is all. I agree that significant
whitespace is decidedly undesirable, unfortunately Org already
has some, though it is nowhere near as bad as markdown with
the trailing whitespace. There also happen to be ways to mitigate
issues with non-printing chars via font-locking etc. to make them
print/visible when authoring. This is another good reason to use
macros as well --- they can be documented.

> As for the matter of emphasis marks between words. I believe that this
> is not the underlying problem, but rather the (little) inconsistency of
> the markup in certain contexts. Think, for example, of a text where you
> have to put many words in italics, enclosed between brackets. I don't
> care if that type of text is 'typical' or 'non-typical', 'majority' or
> 'non-majority'. It is simply a kind of scenario absolutely legitimate
> and feasible, and right now I could quote you more than one type of text
> in that direction.

The problem here is that there is an unbalanced design tradeoff.
Supporting intra-word markup using Org's simple markup syntax
actually introduces more inconsistencies elsewhere (see my
note at the end about where the burden of proof lies with
regard to statements like this).

Further, we also have to consider the impact of such a change
across the whole population of Emacs users and use cases.
Adding complexity to support a very narrow use case, and one
that will produce inconsistencies elsewhere means that the
whole community is forced to bear the burden of that complexity.

This is the principle that I think Tim touches on in terms of keeping
simple things simple. Complexity in pursuit of niche use cases is
never worth the cost when it has to be borne by 99% of users that
will never need such things.

Further, Org provides not only a single solution to these cases, but
multiple solutions. Worst case it is also possible to fail over to
text macros, which are an absurdly powerful escape hatch for users
that have advanced (read niche) needs.

> My proposal here also does not arise from an irrepressible desire to add
> more complexity to the syntax. If it's recommended that the user, in
> certain contexts, enter implicitly a zero-width space (which, I insist,
> is a practice that should be avoided as much as possible in a plain text
> document), why not at least offer a graphical alternative, a *real* mark
> whose role is *exactly* the same as that of the zero-width space? Is that
> adding more complexity??? Honestly I think that's exactly the opposite.

This has the same problems as other proposals about this, whether
they are escape chars, or other syntactic additions. It complicates
the syntax for the community as a whole. It may simplify it for your
particular use case, but not when averaged out with everyone else.

I think one approach is to encourage the use of \emph{a}b and friends.
They are printable and hide nothing. I would also suggest that we work
to update other export backends to support \emph where possible.

> In any case, I have suggested that new mark as a possibility, in case it
> is interesting to implement it, since a thread has emerged these days
> about the topic of intra-word syntax. Discussions and threads that
> arise about these questions and any others are perfectly legitimate,
> natural and welcome. Please: there are no issues more 'important' than
> others; no two users are the same in Org. What you do not find useful,
> another user may perhaps find indispensable. And vice versa. And I
> think no one is in a position to state what the average Org user does
> or does not want, given that we do not know even 1% of Org users.

I think we have a fairly good idea in this particular case. If someone
wanted to do a more thorough study of existing org files in the wild
to see whether they are using a workaround it would certainly be
interesting, if unlikely to reject the null hypothesis. Take a survey
of all the html in the world and see how many documents make
use of intra-word markup that use any markup at all. I'm guessing
it is a vanishingly small percentage.

If we could figure out how to implement intra-word markup in a way
that didn't induce complexity it would be done, and probably
would already have been done, and I suspect people might use it.

There are very few syntax changes that reduce the complexity for
Org (though there are some). The rest have major costs, both in
implementation time, and in disruption of workflows, and hunting
down of edge cases, and total complexity.

The burden of proof for syntax changes lies squarely with the
individual(s) suggesting the change to show that it can be
done without disrupting the existing implementation and without
inducing complexity and changing the interpretation of existing
documents. I say this as someone who has at least one major
syntax change suggestion in the pipeline.

Requesting a syntax change is among the most deeply
invasive and complex things that can be done. I know that
syntax is also the most obvious to users, it is their interface
to the format after all! However, each individual shares that
interface with thousands of other people. The maintainers have
to speak for those thousands who never read, much less respond
on this mailing list, and that almost always means that the
response will be one that is decidedly conservative.

I don't mean to be dismissive of the suggestion, but a lot of
time is spent on this list walking back ideas that have not
had sufficient time put into understanding what the
unintended consequences would be, so I wouldn't say
that it is irresponsible, I would say instead that it lacks
sufficient rigor and depth to be seriously considered. If you
can add those to this proposal (e.g. in the form of a patch)
then I suspect it would get a much warmer reception.

Best,
Tom



* Re: On zero width spaces and Org syntax
  2021-12-04  4:04     ` Tom Gillespie
@ 2021-12-04  5:29       ` Juan Manuel Macías
  0 siblings, 0 replies; 15+ messages in thread
From: Juan Manuel Macías @ 2021-12-04  5:29 UTC (permalink / raw)
  To: Tom Gillespie; +Cc: orgmode

Tom Gillespie writes:

> I don't mean to be dismissive of the suggestion, but a lot of
> time is spent on this list walking back ideas that have not
> had sufficient time put into understanding what the
> unintended consequences would be, so I wouldn't say
> that it is irresponsible, I would say instead that it lacks
> sufficient rigor and depth to be seriously considered. If you
> can add those to this proposal (e.g. in the form of a patch)
> then I suspect it would get a much warmer reception.

I am afraid that I am explaining myself badly, and it is not my
intention for this matter to drag on indefinitely.

I have no intention of proposing any patch on this. I'm not strongly
requesting this feature be included, and I am not interested in starting
a crusade to defend this (and as for lack of rigor and depth, well, it's your
subjective opinion). It's simpler than that. Since a thread on these
questions came up recently, it occurred to me to suggest this idea as a
*possibility*, in case anyone could find it interesting and would like
to explore it. Nothing more. In fact, I don't think I would use
this hypothetical feature much, if it were implemented, because for these
scenarios I prefer to use Org macros or other resources that I have
implemented for my workflow. But maybe users would prefer this to inserting
a zero-width space character (which is a tricky and quite ugly
workaround and should not be recommended). Or maybe not. I really don't
know. I don't know all Org users in the world, do you know them?

Anyway, I want to point out one thing, again. The scenarios and contexts
that are being described here are far from a "very narrow use case". And I
don't think it's very appropriate to hide the lack of something with the
excuse that no one is going to need it. Intra-word emphasis is used (for
example) a lot in linguistics books and texts, grammars, etc. That you
*ignore* this fact does not mean that it does not exist.

regards,

jm









* Re: On zero width spaces and Org syntax
  2021-12-03 12:48 On zero width spaces and Org syntax Juan Manuel Macías
  2021-12-03 19:03 ` Greg Minshall
  2021-12-03 21:48 ` Tim Cross
@ 2021-12-04  6:43 ` Marcin Borkowski
  2021-12-04  7:22   ` Ihor Radchenko
  2021-12-06 16:01   ` Robert Pluim
  2 siblings, 2 replies; 15+ messages in thread
From: Marcin Borkowski @ 2021-12-04  6:43 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: orgmode


Hi all,

I've skimmed through this discussion.  FWIW, I also use zero-width
spaces in my Org files for this precise reason.  However, I agree that
extending syntax is dangerous.

How about a solution (or maybe it's only a "solution"...) where:

1. We take care to modify the "official" exporters to throw out the ZWSs.
Or even better, convert them to something reasonable, e.g. with LaTeX
they can be discarded or converted to some command – possibly even one
defined in the preamble – so that nothing is lost.  I'd even say that an
option deciding what to do with those could be nice.

2. We modify Emacs itself to somehow highlight the ZWS.  There is (kind
of) a precedent – a non-breaking space is already fontified with
=nobreak-space= face.  At the very least, make whitespace-mode somehow
show ZWSs (which it doesn't now, and I'd probably say it's a bug).
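
Point 1 could look roughly like the following Python sketch, which post-processes already exported text and either drops each zero width space or replaces it with a backend-specific representation. The function name, the backend keys, and the idea of running this on the final output are assumptions for illustration; the LaTeX and HTML replacements are simply the representations mentioned at the start of this thread, and in Org itself this would presumably be done in the exporters or in an export filter.

```python
ZWSP = "\u200b"

# Replacement per backend. The LaTeX and HTML forms are the representations
# mentioned earlier in the thread; "drop" simply deletes the character.
REPLACEMENTS = {
    "latex": '\\char"200B{}',
    "html": "&#x200B;",
    "drop": "",
}


def filter_zwsp(exported: str, backend: str) -> str:
    """Replace every zero width space in exported text for the given backend."""
    try:
        replacement = REPLACEMENTS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return exported.replace(ZWSP, replacement)


print(filter_zwsp("\\emph{meta}\u200bliterature", "latex"))
print(filter_zwsp("<i>meta</i>\u200bliterature", "html"))
print(filter_zwsp("\\emph{meta}\u200bliterature", "drop"))
```

The table of replacements stands in for the option suggested above: a user could choose per backend whether the character is discarded or turned into an explicit command.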

I know that my point 2. is a bit controversial, since it could lead to
alignment issues where a ZWS is displayed as something with a positive
width. OTOH, even now changing the face of a ZWS leads to a narrow
(1-pixel wide) line of a different color.  Is there a way to make it
a bit stronger?

Just some random ideas,

-- 
Marcin Borkowski
http://mbork.pl



* Re: On zero width spaces and Org syntax
  2021-12-04  6:43 ` Marcin Borkowski
@ 2021-12-04  7:22   ` Ihor Radchenko
  2021-12-04 17:37     ` Marcin Borkowski
  2021-12-06 16:01   ` Robert Pluim
  1 sibling, 1 reply; 15+ messages in thread
From: Ihor Radchenko @ 2021-12-04  7:22 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: Juan Manuel Macías, orgmode

[-- Attachment #1: Type: text/plain, Size: 898 bytes --]

Marcin Borkowski <mbork@mbork.pl> writes:
> 2. We modify Emacs itself to somehow highlight the ZWS.  There is (kind
> of) a precedent – a no-breaking space is already fontified with
> =nobreak-space= face.  At the very least, make whitespace-mode somehow
> show ZWSs (which it doesn't now, and I'd probably say it's a bug).
>
> I know that my point 2. is a bit controversial, since it could lead to
> alignment issues where a ZWS is displayed as something with a positive
> width. OTOH, even now changing the face of a ZWS leads to a narrow
> (1-pixel wide) line of a different color.  Is there a way to make it
> a bit stronger?

We can try to create an accent. Try the following:
1. Open new empty org buffer
2. Disable font-lock-mode
3. M-: (insert (compose-string "a​" nil nil (list ?a '(bl . tl) ?␣)))

The result will look like the attached image.

Best,
Ihor


[-- Attachment #2: example.png --]
[-- Type: image/png, Size: 2020 bytes --]


* Re: On zero width spaces and Org syntax
  2021-12-03 21:48 ` Tim Cross
  2021-12-04  1:26   ` Juan Manuel Macías
@ 2021-12-04 15:26   ` Max Nikulin
  2021-12-04 20:29     ` Tim Cross
  2021-12-06 11:40   ` Eric S Fraga
  2 siblings, 1 reply; 15+ messages in thread
From: Max Nikulin @ 2021-12-04 15:26 UTC (permalink / raw)
  To: emacs-orgmode

On 04/12/2021 04:48, Tim Cross wrote:
> 
> My vote is to simply maintain the status quo. Don't modify the syntax,
> don't make the zero space character somewhat special or processed in any
> special way during export. In short, accept that inner word markup has
> only limited support and if that is a requirement which is critical to
> your use case, accept that org mode may not be the right solution for
> your requirements.

Tim, you are skeptical about the use of Org markup outside of Emacs,
though some subscribers of this list support the idea, hoping for
collaboration with colleagues and for other reasons. Keeping the status
quo on questions like this increases the risk that other tools will
adopt different workarounds and that incompatible dialects will appear.

From the point of view of popularizing Org, it is better to make some
decision: either the zero-width space should become part of the syntax,
or some other printable marker should be chosen to suppress the effect
of Org markup or, vice versa, to activate some construct.





* Re: On zero width spaces and Org syntax
  2021-12-04  7:22   ` Ihor Radchenko
@ 2021-12-04 17:37     ` Marcin Borkowski
  0 siblings, 0 replies; 15+ messages in thread
From: Marcin Borkowski @ 2021-12-04 17:37 UTC (permalink / raw)
  To: Ihor Radchenko; +Cc: Juan Manuel Macías, orgmode


On 2021-12-04, at 08:22, Ihor Radchenko <yantar92@gmail.com> wrote:

> Marcin Borkowski <mbork@mbork.pl> writes:
>> 2. We modify Emacs itself to somehow highlight the ZWS.  There is (kind
>> of) a precedent – a no-breaking space is already fontified with
>> =nobreak-space= face.  At the very least, make whitespace-mode somehow
>> show ZWSs (which it doesn't now, and I'd probably say it's a bug).
>>
>> I know that my point 2. is a bit controversial, since it could lead to
>> alignment issues where a ZWS is displayed as something with a positive
>> width. OTOH, even now changing the face of a ZWS leads to a narrow
>> (1-pixel wide) line of a different color.  Is there a way to make it
>> a bit stronger?
>
> We can try to create an accent. Try the following:
> 1. Open new empty org buffer
> 2. Disable font-lock-mode
> 3. M-: (insert (compose-string "a​" nil nil (list ?a '(bl . tl) ?␣)))
>
> The result will look like the attached image.

I'm not sure if I like that idea - looks great, but I'd be a bit afraid
of unintended consequences.

Either way, personally I can live with ZWSs in my Org files, so whatever
is decided, it's fine with me.

Best,

-- 
Marcin Borkowski
http://mbork.pl



* Re: On zero width spaces and Org syntax
  2021-12-04 15:26   ` Max Nikulin
@ 2021-12-04 20:29     ` Tim Cross
  0 siblings, 0 replies; 15+ messages in thread
From: Tim Cross @ 2021-12-04 20:29 UTC (permalink / raw)
  To: emacs-orgmode


Max Nikulin <manikulin@gmail.com> writes:

> On 04/12/2021 04:48, Tim Cross wrote:
>> My vote is to simply maintain the status quo. Don't modify the syntax,
>> don't make the zero space character somewhat special or processed in any
>> special way during export. In short, accept that inner word markup has
>> only limited support and if that is a requirement which is critical to
>> your use case, accept that org mode may not be the right solution for
>> your requirements.
>
> Tim, you are skeptical about the use of Org markup outside of Emacs,
> though some subscribers of this list support the idea, hoping for
> collaboration with colleagues and for other reasons. Keeping the status
> quo on questions like this increases the risk that other tools will
> adopt different workarounds and that incompatible dialects will appear.

This is a misrepresentation of my position. I've never stated I'm
sceptical of org markup outside of Emacs. I'm sceptical of org mode
outside of Emacs, but have never expressed an opinion on org markup
outside of Emacs.

However, I will now....

Org markup outside Emacs is very much a secondary concern that would be
a nice to have for some workflows, but should be achieved with zero
impact on Emacs users. Org mode and the markup it uses is primarily an
Emacs mode. In fact, making it easier for non-Emacs users to use org
mode is almost certainly working against the FSF philosophy. I'm pretty
certain RMS would be very unhappy about any efforts to allow users to
use org mode in products like MS Visual Studio Code. While it is fine
for 3rd party systems to try and mimic org mode, it is totally contrary
to GNU philosophy for a GNU project to actively support or enable such
functionality in non-free solutions. Any decisions to make changes to
org mode must be primarily for the benefit of Emacs users. When such
decisions also have benefit for non-Emacs users, that is great, but it
should not be a driving factor in making decisions regarding change or
extensions to org mode.

>
> From the point of view of popularizing Org, it is better to make some
> decision: either the zero-width space should become part of the syntax,
> or some other printable marker should be chosen to suppress the effect
> of Org markup or, vice versa, to activate some construct.

Chasing popularity is always a mistake and should never be used as an
argument for change. We are also talking about something where there is
little evidence of demand. We have a single post from someone asking how
to support inner word emphasis, and suddenly there are threads about
modifying syntax, modifying back ends, and a dozen proposals on how to
support this 'feature'.

A question I would ask is that if extending and adding broader support
for emphasis is so straight-forward, why do we already have so many
issues reported about incorrect application of markup? We have not been
successful in eliminating existing ambiguities with the markup and yet
some would have us charge off and add even more complexity.

Rather than extending markup syntax, let's focus on fixing the real issues we
already have. There have been far more posts to this list about that
than about inner word emphasis. For example, the many posts about markup
and links. 

With respect to the status of zero width space, I'm not convinced we
need to do anything. Would it be classified as a kludge? Probably. Does
it provide an escape hatch for some situations? Yes. Does that mean it
needs to be formally recognised and added to the syntax? No. Does the
existence of this kludge make implementation of org mode markup for
other tools more difficult or less clear? Probably. Should that be a
primary concern for Emacs org-mode? No. Should it be something we
consider when making decisions? Sure, but only as a secondary
consideration.

What the need for the zero width space kludge really means is that in
some situations, we have some ambiguity in the existing syntax. Can we
fix those ambiguities? I don't know - so far, I've not seen a proposal
which doesn't introduce as many problems as it solves (though Tom's @@
proposal looks interesting, but lots more analysis is required).

The zero width kludge is certainly a symptom of limitations in the
existing syntax definition. However, I don't think it is the cure, and I
don't agree it needs to be formally recognised as part of the syntax.
If we can find the correct cure, the zero width
kludge will not be necessary (or will only be necessary in extreme and
rare edge cases). 



* Re: On zero width spaces and Org syntax
  2021-12-03 21:48 ` Tim Cross
  2021-12-04  1:26   ` Juan Manuel Macías
  2021-12-04 15:26   ` Max Nikulin
@ 2021-12-06 11:40   ` Eric S Fraga
  2 siblings, 0 replies; 15+ messages in thread
From: Eric S Fraga @ 2021-12-06 11:40 UTC (permalink / raw)
  To: Tim Cross; +Cc: emacs-orgmode

On Saturday,  4 Dec 2021 at 08:48, Tim Cross wrote:
> My vote is to simply maintain the status quo. 

A very strong +1 on this.  Org has enough /escape mechanisms/, as you
call them, to cater for special cases, and these include @@...@@, babel,
and filters, amongst others.  The simplicity of org is a major
advantage.

-- 
: Eric S Fraga, with org release_9.5.1-243-gad53c5 in Emacs 29.0.50
: Latest paper written in org: https://arxiv.org/abs/2106.05096



* Re: On zero width spaces and Org syntax
  2021-12-04  6:43 ` Marcin Borkowski
  2021-12-04  7:22   ` Ihor Radchenko
@ 2021-12-06 16:01   ` Robert Pluim
  2021-12-06 16:42     ` Greg Minshall
  1 sibling, 1 reply; 15+ messages in thread
From: Robert Pluim @ 2021-12-06 16:01 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: Juan Manuel Macías, orgmode

>>>>> On Sat, 04 Dec 2021 07:43:35 +0100, Marcin Borkowski <mbork@mbork.pl> said:
    Marcin> 2. We modify Emacs itself to somehow highlight the ZWS.  There is (kind
    Marcin> of) a precedent – a no-breaking space is already fontified with
    Marcin> =nobreak-space= face.  At the very least, make whitespace-mode somehow
    Marcin> show ZWSs (which it doesn't now, and I'd probably say it's a bug).

Thereʼs no need to modify Emacs: see
`glyphless-char-display-control'. ZWS falls under 'format-control'.
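
A minimal sketch of how that option might be set from an init file,
assuming Emacs 25 or later (for `setf' on `alist-get'). The choice of
`hex-code' is only an illustration; any of the other documented methods
(`zero-width', `thin-space', `empty-box', `acronym') could be used
instead. Note that the change applies to every character in the
`format-control' group, not only to the ZWS.

#+begin_src emacs-lisp
;; Sketch only: show the `format-control' group, which includes
;; U+200B ZERO WIDTH SPACE, as a boxed hex code.
;; Work on a copy so the other groups keep their current methods.
(let ((control (copy-alist glyphless-char-display-control)))
  (setf (alist-get 'format-control control) 'hex-code)
  ;; `customize-set-variable' runs the option's :set function, which
  ;; refreshes the `glyphless-char-display' char-table.
  (customize-set-variable 'glyphless-char-display-control control))
#+end_src

The same value can also be chosen interactively with
M-x customize-variable RET glyphless-char-display-control RET.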

Robert
-- 



* Re: On zero width spaces and Org syntax
  2021-12-06 16:01   ` Robert Pluim
@ 2021-12-06 16:42     ` Greg Minshall
  0 siblings, 0 replies; 15+ messages in thread
From: Greg Minshall @ 2021-12-06 16:42 UTC (permalink / raw)
  To: Robert Pluim; +Cc: Juan Manuel Macías, orgmode

Robert,

> Thereʼs no need to modify Emacs: see
> `glyphless-char-display-control'. ZWS falls under 'format-control'.

very nice.  thanks!

cheers, Greg



Thread overview: 15+ messages
2021-12-03 12:48 On zero width spaces and Org syntax Juan Manuel Macías
2021-12-03 19:03 ` Greg Minshall
2021-12-03 20:30   ` Juan Manuel Macías
2021-12-03 21:48 ` Tim Cross
2021-12-04  1:26   ` Juan Manuel Macías
2021-12-04  4:04     ` Tom Gillespie
2021-12-04  5:29       ` Juan Manuel Macías
2021-12-04 15:26   ` Max Nikulin
2021-12-04 20:29     ` Tim Cross
2021-12-06 11:40   ` Eric S Fraga
2021-12-04  6:43 ` Marcin Borkowski
2021-12-04  7:22   ` Ihor Radchenko
2021-12-04 17:37     ` Marcin Borkowski
2021-12-06 16:01   ` Robert Pluim
2021-12-06 16:42     ` Greg Minshall
