From: Tom Gillespie
Date: Mon, 17 Jul 2023 22:40:33 -0700
Subject: Re: Org markup and non-ASCII punctuation (was: org parser and priorities of inline elements)
To: Ihor Radchenko
Cc: Max Nikulin, emacs-orgmode@gnu.org, Timothy, Bastien
In-Reply-To: <87ttu13j08.fsf@localhost>

> We might probably generalize to
> PRE = Zs Zl Pc Pd Ps Pi ' "
> POST = Zs Zl Pc Pd Pe Pf . ; : ! ? ' " \ [

If this works I think it is reasonable. We might want to specify what
to do in cases where an Org implementation does not fully support
Unicode. We might also want to review related issues in the syntax
with respect to ASCII vs Unicode, because iirc there is some ambiguity
in the current syntax doc. For example, I'm pretty sure that I'm mixing
and matching Unicode and ASCII whitespace in the tokenizer I have in
Racket.

> Though we need to take care excluding zero-width spaces.

Ya, I removed a comment to this effect in the paragraph about the
usual alternate solution.

> Emacs does not support them though (yet?).

Racket has full support for the latest Unicode standards iirc, so I
will see if I can leverage that support for testing in laundry.

> At the end, it is the current ASCII limitation plus partially arbitrary
> choice of boundaries that keep some users confused (we are getting bug
> reports about confusing markup from time to time).

Ya, it would be good to try to generalize the affordance if possible,
since people writing text in non-ASCII languages have certain valid
expectations. Hopefully, the Unicode Consortium has managed to cover
the categories we need.
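
For concreteness, here is a minimal sketch of the generalized PRE/POST
check quoted earlier in this message, written in Python rather than
Racket or Elisp. The category sets and the extra characters are copied
from the quoted proposal as-is. The names `is_pre` and `is_post` are
made up for illustration and are not part of Org or laundry. The check
relies on the standard `unicodedata.category` lookup. One thing worth
noting: ZERO WIDTH SPACE (U+200B) has general category Cf, so these
sets alone do not match it, and how it should be treated is left open
here.

```python
import unicodedata

# Assumption: these sets mirror the proposal quoted in this thread.
# They are NOT the current Org syntax, only the suggested generalization.
PRE_CATEGORIES = {"Zs", "Zl", "Pc", "Pd", "Ps", "Pi"}
POST_CATEGORIES = {"Zs", "Zl", "Pc", "Pd", "Pe", "Pf"}

# Literal characters listed next to the categories in the proposal.
PRE_EXTRA = set("'\"")
POST_EXTRA = set(".;:!?'\"\\[")


def _matches(char, categories, extra):
    """Return True if char is a listed literal or has a listed category."""
    return char in extra or unicodedata.category(char) in categories


def is_pre(char):
    """Hypothetical helper: may char precede an opening emphasis marker?"""
    return _matches(char, PRE_CATEGORIES, PRE_EXTRA)


def is_post(char):
    """Hypothetical helper: may char follow a closing emphasis marker?"""
    return _matches(char, POST_CATEGORIES, POST_EXTRA)


if __name__ == "__main__":
    # Print the verdicts for a few ASCII and non-ASCII boundary candidates.
    for ch in [" ", "(", ")", "\u00ab", "\u00bb", "\u3000", "\u200b", "a"]:
        print(f"U+{ord(ch):04X}", unicodedata.category(ch),
              "pre" if is_pre(ch) else "-",
              "post" if is_post(ch) else "-")
```

With these sets, an opening guillemet (U+00AB, category Pi) is accepted
before a marker but not after one, and a closing guillemet (U+00BB,
category Pf) is the reverse, which is the asymmetry the ASCII-only rule
already has for parentheses.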