From: Marcin Borkowski
To: Juan Manuel Macías
Cc: orgmode
Subject: Re: On zero width spaces and Org syntax
Date: Sat, 04 Dec 2021 07:43:35 +0100
Message-ID: <87fsr8na3s.fsf@mbork.pl>
In-reply-to: <87ilw5yhv3.fsf@posteo.net>

On 2021-12-03, at 13:48, Juan Manuel Macías wrote:

> Hi all,
>
> It is usually recommended, as you know, to insert a zero width space
> character (Unicode U+200B) as a sort of delimiter mark to solve the
> scenarios of emphasis within a word (for example, =/meta/literature=)
> and others contexts where emphasis marks are not recognized (for example
> =[/literature/]=). I believe that as a puntual workaround it is not bad;
> however, I find it problematic that this character is part, more or less
> de facto, of the Org syntax. For two main reasons:
>
> 1. It is an invisible character, and therefore it is difficult to
> control and manage. I think it is not good practice to introduce this
> type of characters implicitly in a plain text document.
>
> 2. It is more natural that this type of space characters are part of the
> 'output' and not of the 'input'. In the input it is better to introduce
> them not implicitly but through their representation.
> For example, in LaTeX (with LuaTeX) using the command '\char"200B{}'
> (or '^^^^200b'), '&#x200B;' in HTML, etc.
>
> In any case, as an implicit character, I do not see it appropriate for
> the syntax of a markup language. The marks should be simply ascii
> characters, IMHO. So what if Org had a specific delimiter mark for the
> scenarios described above? For example, something like that:

Hi all,

I've skimmed through this discussion. FWIW, I also use zero-width
spaces in my Org files for this precise reason. However, I agree that
extending the syntax is dangerous.

How about a solution (or maybe it's only a "solution"...) where:

1. We take care to modify the "official" exporters to throw out the
ZWSs. Or even better, convert them to something reasonable, e.g. with
LaTeX they can be discarded or converted to some command – possibly even
one defined in the preamble – so that nothing is lost. I'd even say that
an option deciding what to do with those could be nice.

2. We modify Emacs itself to somehow highlight the ZWS. There is (kind
of) a precedent – a no-break space is already fontified with the
=nobreak-space= face. At the very least, make whitespace-mode somehow
show ZWSs (which it doesn't now, and I'd probably say it's a bug).

I know that my point 2 is a bit controversial, since it could lead to
alignment issues where a ZWS is displayed as something with a positive
width. OTOH, even now changing the face of a ZWS leads to a narrow
(1-pixel wide) line of a different color. Is there a way to make it
a bit stronger?

Just some random ideas,

-- 
Marcin Borkowski
http://mbork.pl
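
A minimal sketch of the per-backend conversion suggested in point 1 above. It is written in Python only to keep it self-contained and is not Org's exporter code. The function name `convert_zws`, the `REPLACEMENTS` table, and the LaTeX macro `\zws{}` are all hypothetical; the macro stands for "some command defined in the preamble" (for instance `\newcommand{\zws}{\hspace{0pt}}`).

```python
ZWS = "\u200b"  # ZERO WIDTH SPACE (U+200B)

# Hypothetical per-backend replacements; these are assumptions for
# illustration, not what any Org exporter currently does.
REPLACEMENTS = {
    "latex": r"\zws{}",   # assumed macro, to be defined in the preamble
    "html": "&#x200B;",   # numeric character reference for U+200B
    "ascii": "",          # plain text: simply discard the character
}


def convert_zws(text: str, backend: str) -> str:
    """Replace every ZWS in TEXT according to BACKEND.

    Backends missing from REPLACEMENTS leave the text untouched,
    so nothing is lost by default.
    """
    if backend not in REPLACEMENTS:
        return text
    return text.replace(ZWS, REPLACEMENTS[backend])


if __name__ == "__main__":
    sample = "meta" + ZWS + "literature"
    for backend in ("latex", "html", "ascii", "odt"):
        # repr() makes the otherwise invisible character visible.
        print(backend, repr(convert_zws(sample, backend)))
```

Keeping the mapping in a table is one way to get the "option deciding what to do with those": a user could override a single entry without touching the conversion itself.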