From: Tim Cross
To: emacs-orgmode@gnu.org
Subject: Re: On zero width spaces and Org syntax
Date: Sat, 04 Dec 2021 08:48:39 +1100
In-reply-to: <87ilw5yhv3.fsf@posteo.net>
References: <87ilw5yhv3.fsf@posteo.net>
Message-ID: <87v905gtis.fsf@gmail.com>
User-agent: mu4e 1.7.5; emacs 28.0.60
List-Id: "General discussions about Org-mode."

Juan Manuel Macías writes:

> Hi all,
>
> It is usually recommended, as you know, to insert a zero width space
> character (Unicode U+200B) as a sort of delimiter mark to solve the
> scenarios of emphasis within a word (for example, =/meta/literature=)
> and others contexts where emphasis marks are not recognized (for example
> =[/literature/]=). I believe that as a puntual workaround it is not bad;
> however, I find it problematic that this character is part, more or less
> de facto, of the Org syntax. For two main reasons:
>
> 1. It is an invisible character, and therefore it is difficult to
> control and manage. I think it is not good practice to introduce this
> type of characters implicitly in a plain text document.
>
> 2. It is more natural that this type of space characters are part of the
> 'output' and not of the 'input'. In the input it is better to introduce
> them not implicitly but through their representation. For example, in
> LaTeX (with LuaTeX) using the command '\char"200B{}' (or '^^^^200b'),
> '&ZeroWidthSpace;' in HTML, etc.
>
> In any case, as an implicit character, I do not see it appropriate for
> the syntax of a markup language. The marks should be simply ascii
> characters, IMHO. So what if Org had a specific delimiter mark for the
> scenarios described above? For example, something like that:
>
> #+begin_example
>
> /meta/''literature
>
> *meta*''literature
>
> [''*literature*'']
>
> #+end_example
>
> WDYT?
>
> Best regards,
>
> Juan Manuel

I think I am in agreement regarding most of your points about the use of
the zero-width character. I see it as a type of escape hatch which
provides a solution in some less frequent situations.
It is a somewhat clever kludge to enable markup in some situations not
supported by the basic markup syntax. I'm happy with its status as a
kludge and would not want to see it become an official part of the
syntax.

Where we may differ is in whether we actually want to add inner word
markup support at all.

I'm somewhat surprised and more than a little concerned at how much
interest in modifying the markup syntax of org the question of inner
word markup has generated. This seems to be a symptom of a more general
trend towards adding to and extending org mode to meet the needs of
everyone, and I'm concerned this is overlooking the key strength of org
mode - simplicity.

Consider how many times we have had requests for inner word markup in
the last 18 years. I've seen such requests only a very few times.
Certainly not frequently enough to consider modification of the markup
syntax to accommodate such a requirement.

A key philosophy of org mode is simplicity - it makes the easy stuff
simple and the hard stuff possible. The thing about simple solutions is
that they will inevitably have limitations. If you don't want those
limitations, then you use a more complex, feature-rich markup, such as
LaTeX, HTML, XML etc. Ideally, your system will provide some escape
hatches to allow you to do things not supported by the base markup
syntax. Those escape hatches will usually be less convenient and often
look quite ugly, but that is fine because they are an escape hatch which
is used infrequently. Better still is if the system provides some way to
make a specific escape hatch easier to use in a document (such as via a
macro).

The basic org markup syntax has worked remarkably well for 18 years.
Nearly all the proposed additions or alterations to support inner word
markup will complicate the syntax or introduce potential new ambiguities
and/or complexity in processing to support a feature which has been
rarely asked for and which has other, less convenient and often ugly,
solutions which work.

One of org's strengths has been the ability to export documents to
multiple formats. One way this has been made possible is by keeping the
markup syntax simple - a basic markup which is well supported by all
export back ends. Once you start adding more complex markup support, you
see a blow-out of complexity in the export back ends. Worse yet, you get
results which are surprising to the end user or which simply don't work
correctly with some formats. To avoid this, it is critical to keep the
markup syntax as simple and straightforward as possible, even if that
means some limitations on what can be done with the markup.

My vote is to simply maintain the status quo. Don't modify the syntax,
and don't make the zero-width space character special or process it in
any special way during export. In short, accept that inner word markup
has only limited support and, if that is a requirement which is critical
to your use case, accept that org mode may not be the right solution for
your requirements.
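
On the quoted concern that an invisible character is hard to control and
manage: the snippet below is a minimal sketch, not part of Org or of any
existing tool, that reports where U+200B occurs in a piece of text so the
kludge can at least be audited. The function name find_zero_width_spaces
and the sample string are hypothetical and chosen only for illustration.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE (Unicode U+200B)


def find_zero_width_spaces(text):
    """Return 1-based (line, column) pairs for every U+200B in text."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == ZWSP:
                hits.append((line_no, col_no))
    return hits


# Hypothetical sample: the two workaround cases from the quoted message,
# with the zero width spaces written as explicit escapes.
sample = "/meta/\u200bliterature\n[\u200b/literature/\u200b]"

for line_no, col_no in find_zero_width_spaces(sample):
    print(f"U+200B at line {line_no}, column {col_no}")
```

Columns here count Unicode code points, which may differ from what an
editor displays when the line contains combining or wide characters.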