emacs-orgmode@gnu.org archives
From: Tom Gillespie <tgbugs@gmail.com>
To: "Juan Manuel Macías" <maciaschain@posteo.net>
Cc: Tim Cross <theophilusx@gmail.com>, orgmode <emacs-orgmode@gnu.org>
Subject: Re: On zero width spaces and Org syntax
Date: Fri, 3 Dec 2021 20:04:28 -0800
Message-ID: <CA+G3_PM4cxHa8bU+3QG541UiOauLNAQFZQu-+UKczx3itOeTHg@mail.gmail.com>
In-Reply-To: <87a6hh40uk.fsf@posteo.net>

An important note: for intra-word markup you probably want to
use word joiner U+2060 and not zero width space, because a
zero width space allows layout to break the word, whereas a
word joiner does not. We may need to check to make sure that
U+2060 counts as whitespace for the purposes of markup.
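
As a rough illustration of why that check matters, here is a small
Python sketch. It uses Python's standard unicodedata module, not
Org's own parser, and shows that Unicode itself classifies neither
character as whitespace. Whether Org's markup rules accept them as
delimiters is a separate question that this does not answer. The
helper name `describe` is made up for the example.

```python
import unicodedata

# The two candidate characters discussed above.
CANDIDATES = {
    "zero width space": "\u200b",
    "word joiner": "\u2060",
}


def describe(ch):
    """Return (code point, Unicode name, general category, str.isspace())."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        ch.isspace(),
    )


for label, ch in CANDIDATES.items():
    print(label, describe(ch))
```

Both characters are in general category Cf (format), so any tool
that decides on "whitespace" by Unicode properties alone will treat
them the same way, and neither will count as a space.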

> 2. It is more natural that this type of space characters are part of the
> 'output' and not of the 'input'.

That is not relevant in this case. However, Org export should not be
emitting byte-literal zero width spaces either; that causes a NASTY
surprise for the user. All that Org does in this pass is pass something
along for the user. The kludge is a kludge because it just happens to
be compatible with Org syntax, that is all. I agree that significant
whitespace is decidedly undesirable; unfortunately, Org already
has some, though it is nowhere near as bad as Markdown with
the trailing whitespace. There also happen to be ways to mitigate
issues with non-printing chars via font-locking etc. to make them
visible when authoring. This is another good reason to use
macros as well --- they can be documented.
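
The idea of making non-printing characters visible can be sketched
outside Emacs too. The following Python snippet is only an analogy
to font-locking, not an Org or Emacs feature: it replaces the two
characters under discussion with visible placeholders so they show
up in a review or a diff. The names `PLACEHOLDERS` and `reveal` and
the placeholder strings are invented for the example.

```python
# Hypothetical mapping from invisible characters to visible placeholders.
PLACEHOLDERS = {
    "\u200b": "<ZWSP>",  # zero width space
    "\u2060": "<WJ>",    # word joiner
}


def reveal(text):
    """Return text with each listed invisible character replaced by its placeholder."""
    return "".join(PLACEHOLDERS.get(ch, ch) for ch in text)


print(reveal("*intra*\u200bword and /intra/\u2060word"))
```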

> As for the matter of emphasis marks between words. I believe that this
> is not the underlying problem, but rather the (little) inconsistency of
> the markup on certain contexts. Think, for example, of a text where you
> have to put many words in italics, enclosed between brackets. I don't
> care if that type of text is 'typical' or 'non-typical', 'majority' or
> 'non-majority'. It is simply a kind of scenario absolutely legitimate
> and feasible, and right now I could quote you more than a type of text
> in that direction.

The problem here is that there is an unbalanced design tradeoff.
Supporting intra-word markup using Org's simple markup syntax
actually introduces more inconsistencies elsewhere (see my
note at the end about where the burden of proof lies with
regard to statements like this).

Further, we also have to consider the impact of such a change
across the whole population of Emacs users and use cases.
Adding complexity to support a very narrow use case, and one
that will produce inconsistencies elsewhere, means that the
whole community is forced to bear the burden of that complexity.

This is the principle that I think Tim touches on in terms of keeping
simple things simple. Complexity in pursuit of niche use cases is
never worth the cost when it has to be borne by 99% of users that
will never need such things.

Further, Org provides not just a single solution to these cases, but
multiple solutions. In the worst case it is also possible to fall back
to text macros, which are an absurdly powerful escape hatch for users
that have advanced (read: niche) needs.

> My proposal here also does not arise from an irrepressible desire to add
> more complexity to the syntax. If it's recommended that the user, in
> certain contexts, enter implicitly a zero-width space (which, I insist,
> is a practice that should be avoided as much as possible in a plain text
> document), why not at least offer a graphical alternative, a *real* mark
> whose role is *exactly* the same as that of the zero-width space? Is that
> adding more complexity??? Honestly I think that's exactly the opposite.

This has the same problems as other proposals about this, whether
they are escape chars, or other syntactic additions. It complicates
the syntax for the community as a whole. It may simplify it for your
particular use case, but not when averaged out with everyone else.

I think one approach is to encourage the use of \emph{a}b and friends.
They are printable and hide nothing. I would also suggest that we work
to update other export backends to support \emph where possible.

> In any case, I have suggested that new mark as a possibility, in case it
> is interesting to implement it, since a thread has emerged these days
> about the topic of the intra-words syntax. Discussions and threads
> arising about these questions, and any others, are perfectly legitimate
> and natural and welcome. Please: there are no issues more 'important'
> than others; no two users are the same in Org. What you do not find
> useful, another user may perhaps find indispensable. And vice versa.
> And I think no one is in a position to state what the average Org user
> does or does not want, given that we do not know even 1% of Org users.

I think we have a fairly good idea in this particular case. If someone
wanted to do a more thorough study of existing org files in the wild
to see whether they are using a workaround it would certainly be
interesting, if unlikely to reject the null hypothesis. Take a survey
of all the HTML in the world and see how many of the documents
that use any markup at all make use of intra-word markup. I'm
guessing it is a vanishingly small percentage.
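
A first pass at such a study could be as simple as the Python
sketch below, which counts places where a zero width space or word
joiner sits directly next to an emphasis marker. The pattern is a
deliberately crude approximation and not Org's actual emphasis
regexp, so it will both miss real cases and report false ones. The
names `WORKAROUND` and `count_workarounds` are made up for the
example.

```python
import re

# An invisible character (U+200B or U+2060) immediately after or before
# one of Org's simple emphasis markers. Crude on purpose: no attempt is
# made to check that the marker really opens or closes emphasis.
WORKAROUND = re.compile(
    r"[*/_=~+][\u200b\u2060]|[\u200b\u2060][*/_=~+]"
)


def count_workarounds(text):
    """Count marker/invisible-character pairs found in text."""
    return len(WORKAROUND.findall(text))


sample = "*bold*\u200bword and /it/\u2060x, plus plain *bold* text"
print(count_workarounds(sample))
```

Run over a corpus of Org files and compared against the total
number of emphasis markers, a count like this would give a first
estimate of how often the workaround is used.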

If we could figure out how to implement intra-word markup in a way
that didn't induce complexity it would be done, and probably
would already have been done, and I suspect people might use it.

There are very few syntax changes that reduce the complexity for
Org (though there are some). The rest have major costs: in
implementation time, in disruption of workflows, in hunting
down edge cases, and in total complexity.

The burden of proof for syntax changes lies squarely with the
individual(s) suggesting the change to show that it can be
done without disrupting the existing implementation and without
inducing complexity and changing the interpretation of existing
documents. I say this as someone who has at least one major
syntax change suggestion in the pipeline.

Requesting a syntax change is among the most deeply
invasive and complex things that can be done. I know that
syntax is also the part most obvious to users; it is their interface
to the format, after all! However, each individual shares that
interface with thousands of other people. The maintainers have
to speak for those thousands who never read, much less respond
on this mailing list, and that almost always means that the
response will be one that is decidedly conservative.

I don't mean to be dismissive of the suggestion, but a lot of
time is spent on this list walking back ideas that have not
had sufficient time put into understanding what the
unintended consequences would be. I wouldn't say
that it is irresponsible; I would say instead that it lacks
sufficient rigor and depth to be seriously considered. If you
can add those to this proposal (e.g. in the form of a patch)
then I suspect it would get a much warmer reception.
