From mboxrd@z Thu Jan 1 00:00:00 1970
To: emacs-orgmode@gnu.org
From: Max Nikulin
Subject: Re: [PATCH] Add new entity \-- serving as markup separator/escape symbol
Date: Fri, 29 Jul 2022 09:50:58 +0700
References: <87r128d5pp.fsf@localhost>
 <80f0990042a564556cc6b047a94f7e9dddf5a280.camel@outlook.com>
 <87v8rkav2x.fsf@localhost> <87mtct9y1f.fsf@localhost>
 <87mtcsn173.fsf@localhost>
In-Reply-To: <87mtcsn173.fsf@localhost>
List-Id: "General discussions about Org-mode."

On 29/07/2022 08:43, Ihor Radchenko wrote:
> Max Nikulin writes:
>
>> The good point in your patch is that \- is still work as shy hyphen
>> (that, by the way, may be used in some cases instead of zero width
>> space: *intra*\-word). On the other hand I have managed to find a case
>> when your approach is not ideal:
>>
>> *\--scratch\--*
>>

>> ­-scratch

>
> Well. I think that it is impossible to use the same escape construct to
> both force emphasis and escape it.

Let's articulate the problem as follows: some characters ("*", "/",
etc.), besides being used literally, are overloaded with 2 additional
roles: starting an emphasis group and terminating an emphasis group.
So, in addition to the lightweight markup heuristics, it is necessary to
provide a way to disambiguate which of the 3 roles is associated with a
particular character. "Activate" and "deactivate" characters or entities
for emphasis markers are alternative, and perhaps less clear, terms used
before.

The advantage of zero width space is that "[:space:]" is part of the
PREMATCH and POSTMATCH (outer) regexps in
`org-emphasis-regexp-components' and "[:space:]" is forbidden at the
inner borders of an emphasized span of text. The latter restriction
mostly makes sense; however, I am unsure whether a bold space has the
same width as a regular one, and a space in a fixed width font is
certainly distinct.

The problem with the "\--" entity is that it is not handled properly at
the start of an emphasis region. It neither disables emphasis nor is
parsed as a complete entity; instead it becomes a combination of the
"\-" shy hyphen and a literal "-". I am unsure if it can be solved
consistently. Possible ways:

- In addition to the space-like (with respect to the current regexp)
  entity, add another one that acts as a part of a word but, like "\--",
  is stripped from the output. Likely it should be accompanied by more
  changes in the parser and regexps.
- Provide some new explicit syntax for a literal character, the start of
  an emphasis group, and the end of an emphasis group.

Concerning the zero width space workaround, I may be wrong, but Nicolas
might consider using U+200B zero width space as the escape character for
itself: a single one is filtered out during export, a double zero width
space becomes a single character. (I do not like this kind of "white
space programming language".)

Another question is whether U+2060 word joiner (or some other character)
should be added, either as an alternative to zero width space or to
allow "= verbatim =", i.e. fixed width text surrounded by fixed width
spaces.
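
To make the "zero width space escapes itself" idea above concrete, here
is a minimal sketch in Python. It is not Org code and not an existing
export filter: the function name `unescape_zwsp` is hypothetical, and
the rule is only my reading of the proposal (a lone U+200B is dropped,
a doubled U+200B yields one literal character).

```python
import re

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def unescape_zwsp(text: str) -> str:
    """Hypothetical export step: treat U+200B as its own escape character.

    A lone zero width space is removed from the output; a pair of them
    collapses into one literal zero width space.
    """
    # Match one ZWSP followed by an optional second one and keep only
    # the second (the captured group is empty when the ZWSP was alone).
    return re.sub(ZWSP + "(" + ZWSP + "?)", r"\1", text)

# A separator used only to delimit emphasis disappears on export.
exported = unescape_zwsp("*intra*" + ZWSP + "word")
# A doubled separator survives as a single literal character.
literal = unescape_zwsp("a" + ZWSP + ZWSP + "b")
```

Scanning left to right means an odd run of zero width spaces keeps one
character per pair and drops the last one, which is the usual behaviour
of a self-escaping character.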