From: Tom Gillespie
Date: Fri, 3 Dec 2021 20:04:28 -0800
Subject: Re: On zero width spaces and Org syntax
To: Juan Manuel Macías
Cc: Tim Cross, orgmode

An important note: for intra-word markup you probably want to use the
word joiner (U+2060) and not the zero width space (U+200B), because a
zero width space allows layout to break the word, whereas a word joiner
does not. We may need to check that U+2060 counts as whitespace for the
purposes of markup.

> 2. It is more natural that this type of space characters are part of the
> 'output' and not of the 'input'.

That is not relevant in this case. However, Org export should not be
emitting byte-literal zero width spaces either; that causes a NASTY
surprise for the user. All that Org does in this pass is pass something
along for the user. The kludge is a kludge because it just happens to
be compatible with Org syntax, that is all.

I agree that significant whitespace is decidedly undesirable.
Unfortunately Org already has some, though it is nowhere near as bad as
Markdown with its trailing whitespace. There are also ways to mitigate
issues with non-printing characters, e.g. via font-locking, to make
them visible when authoring. This is another good reason to use macros
as well: they can be documented.

> As for the matter of emphasis marks between words. I believe that this
> is not the underlying problem, but rather the (little) inconsistency of
> the markup on certain contexts. Think, for example, of a text where you
> have to put many words in italics, enclosed between brackets. I don't
> care if that type of text is 'typical' or 'non-typical', 'majority' or
> 'non-majority'.
> It is simply a kind of scenario absolutely legitimate
> and feasible, and right now I could quote you more than a type of text
> in that direction.

The problem here is that there is an unbalanced design tradeoff.
Supporting intra-word markup using Org's simple markup syntax actually
introduces more inconsistencies elsewhere (see my note at the end about
where the burden of proof lies with regard to statements like this).
Further, we also have to consider the impact of such a change across
the whole population of Emacs users and use cases. Adding complexity to
support a very narrow use case, and one that will produce
inconsistencies elsewhere, means that the whole community is forced to
bear the burden of that complexity.

This is the principle that I think Tim touches on in terms of keeping
simple things simple. Complexity in pursuit of niche use cases is never
worth the cost when it has to be borne by the 99% of users who will
never need such things. Further, Org provides not just one solution for
these cases, but several. In the worst case it is also possible to fall
back to text macros, which are an absurdly powerful escape hatch for
users that have advanced (read: niche) needs.

> My proposal here also does not arise from an irrepressible desire to add
> more complexity to the syntax. If it's recommended that the user, in
> certain contexts, enter implicitly a zero-width space (which, I insist,
> is a practice that should be avoided as much as possible in a plain text
> document), why not at least offer a graphical alternative, a *real* mark
> whose role is *exactly* the same as that of the zero-with space? Is that
> adding more complexity??? Honestly I think that's exactly the opposite.

This has the same problems as the other proposals on this topic,
whether they are escape characters or other syntactic additions. It
complicates the syntax for the community as a whole. It may simplify it
for your particular use case, but not when averaged out with everyone
else.

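As a rough illustration of the U+200B versus U+2060 note at the top of
this message, here is a minimal Python sketch (not Org or Emacs code)
that prints how generic Unicode tooling classifies the two characters
and makes them visible in a string. The helper names describe and
show_invisibles are made up for this example, and Python's notion of
whitespace is not the one Org's markup rules use, so this only
motivates the check and does not replace it.

```python
import unicodedata

ZWSP = "\u200b"         # ZERO WIDTH SPACE: permits a line break
WORD_JOINER = "\u2060"  # WORD JOINER: forbids a line break


def describe(ch: str) -> dict:
    """Return a few Unicode properties of a single character."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "is_space": ch.isspace(),  # Python's definition, not Org's
    }


def show_invisibles(text: str) -> str:
    """Replace both invisible characters with visible placeholders."""
    return text.replace(ZWSP, "<U+200B>").replace(WORD_JOINER, "<U+2060>")


if __name__ == "__main__":
    for ch in (ZWSP, WORD_JOINER):
        print(describe(ch))
    # Intra-word italics written with the zero width space workaround.
    print(show_invisibles(f"/a/{ZWSP}b"))
```

Both characters are in the Unicode "Cf" (format) category and neither
is whitespace by Python's definition, which is one reason the question
of what Org itself accepts around emphasis markers has to be checked
in Org.
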
I think one approach is to encourage the use of \emph{a}b and friends.
They are printable and hide nothing. I would also suggest that we work
to update other export backends to support \emph where possible.

> In any case, I have suggested that new mark as a possibility, in case it
> is interesting to implement it, since a thread has emerged these days
> about the topic of the intra-words syntax. Discussions and threads
> arised about these questions and any other are perfectly legitimate and
> natural and welcome. Please: there are no issues more 'important' than
> others; no two users are the same in Org. What you do not find useful,
> another user may perhaps finds it indispensable. And vice versa. And I
> think no one is in willingness to state what the average Org user does
> or does not want, given that we do not know even 1% of Org users.

I think we have a fairly good idea in this particular case. If someone
wanted to do a more thorough study of existing Org files in the wild to
see whether they are using a workaround, it would certainly be
interesting, if unlikely to reject the null hypothesis. Take a survey
of all the HTML in the world and see how many of the documents that use
any markup at all make use of intra-word markup. I'm guessing it is a
vanishingly small percentage.

If we could figure out how to implement intra-word markup in a way that
didn't induce complexity it would be done, and probably would already
have been done, and I suspect people might use it. There are very few
syntax changes that reduce the complexity of Org (though there are
some). The rest have major costs: implementation time, disruption of
workflows, hunting down of edge cases, and total complexity.

The burden of proof for syntax changes lies squarely with the
individual(s) suggesting the change, who must show that it can be done
without disrupting the existing implementation, without inducing
complexity, and without changing the interpretation of existing
documents.

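The study of existing Org files mentioned above could start from
something as small as the following Python sketch. It counts zero
width spaces that sit directly next to one of Org's simple emphasis
markers, which is only a crude proxy for the workaround. The function
names, the regular expression, and the choice to scan *.org files
under a directory are all assumptions made for this example; it is not
an established tool or a statement of how Org parses emphasis.

```python
import re
from pathlib import Path

ZWSP = "\u200b"
# The characters Org uses as simple emphasis markers.
MARKERS = "*/_=~+"
# A zero width space directly before or directly after a marker.
WORKAROUND = re.compile(
    f"{ZWSP}[{re.escape(MARKERS)}]|[{re.escape(MARKERS)}]{ZWSP}"
)


def count_workarounds(text: str) -> int:
    """Count non-overlapping marker/zero-width-space adjacencies."""
    return len(WORKAROUND.findall(text))


def survey(root: str) -> dict:
    """Map each *.org file under root to its count, skipping zeros."""
    counts = {}
    for path in Path(root).rglob("*.org"):
        text = path.read_text(encoding="utf-8", errors="replace")
        n = count_workarounds(text)
        if n:
            counts[str(path)] = n
    return counts


if __name__ == "__main__":
    # Scan the current directory tree and print per-file counts.
    for name, n in sorted(survey(".").items()):
        print(f"{n:6d}  {name}")
```

A count like this overestimates and underestimates at the same time
(a zero width space next to a literal asterisk is not necessarily
markup), so it would at best give a first impression of how common the
workaround is.
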
I say this as someone who has at least one major syntax change
suggestion in the pipeline.

Requesting a syntax change is among the most deeply invasive and
complex things that can be done. I know that syntax is also the thing
most obvious to users; it is their interface to the format, after all!
However, each individual shares that interface with thousands of other
people. The maintainers have to speak for those thousands who never
read, much less respond on, this mailing list, and that almost always
means that the response will be decidedly conservative.

I don't mean to be dismissive of the suggestion, but a lot of time is
spent on this list walking back ideas that have not had sufficient time
put into understanding what the unintended consequences would be. So I
wouldn't say that it is irresponsible; I would say instead that it
lacks sufficient rigor and depth to be seriously considered. If you can
add those to this proposal (e.g. in the form of a patch) then I suspect
it would get a much warmer reception.

Best,
Tom