From: Ihor Radchenko
To: Max Nikulin
Cc: emacs-orgmode@gnu.org
Subject: Re: [BUG] Trailing dash is not included in link [9.7.3 (9.7.3-2f1844 @ /home/mwillcock/.emacs.d/elpa/org-9.7.3/)]
References: <87sexh9ddv.fsf@ice9.digital> <87le37k4c8.fsf@localhost>
Date: Sun, 16 Jun 2024 15:59:34 +0000
Message-ID: <875xu86fq1.fsf@localhost>

Max Nikulin writes:

>> +*** Trailing =-= is now allowed in plain links
>
> After a look into
>
> 7dcb1afb6 2021-03-24 21:27:24 +0800 Ihor Radchenko: Improve
> org-link-plain-re
>
> I suspect, it worked prior to v9.5. Without a unit test it may be
> accidentally broken again.

No, it did not work.
If you can, please do not make such assertions without testing.

>> +: https://domain/test-
>
> example.org, example.net, example.com are domains reserved for usage in
> examples:

And so?

>> (or (regexp "[^[:punct:] \t\n]")
>
> I have realized that some Org regexps use [:punct:] *regexp class* and
> others *syntax class*, see latex math regexp. I am in doubts if the
> discrepancy is intentional.

It is not intentional, but using syntax classes can sometimes be fragile.

> I have noticed that the following change
>
> 09ced6d2c 2024-02-03 15:15:46 +0100 Ihor Radchenko: org-link-plain-re:
> Improve regexp heuristics
>
> that causes
>
> (link http://example.org/a<b)
>
> input is exported as
>
> (link <a href="http://example.org/a%3Cb)">http://example.org/a%3Cb)</a>
>
> I expect that ")" should not be parsed as a part of the link. Balanced
> brackets are tricky with regexps (and it is not possible to match
> arbitrary nested ones).

It is heuristics. We cannot be 100% right. So, it is what it is.

> Perhaps "[^[:punct:] \t\n]" is too strict in respect to spaces. It does
> not allow the recommended workaround with zero width space:

You don't need zero width space for links. Just use .

> As to the original bug report, while reading it, I noticed that
> thunderbird includes dash into the recognized link for
>
> "https://domain/test-"
>
> I decided to look into its implementation and to my surprise I found:
> ``punctation chars and "-" at the end are stipped off.'' I realized that
> double quotes along with angle brackets are treated as a recommended way
> to mark URLs in plain text. Thunderbird does not consider dash as a part
> of links for e.g. http://example.org/t- It might be an attempt to
> reserve possibility to assemble URLs wrapped into several lines with
> added hyphenation marks, but it has not been implemented (RFC2396
> appendix E warns about accidentally added hyphens).
>
> https://www.bucksch.org/1/projects/mozilla/16507/
> https://searchfox.org/mozilla-central/source/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp#line-243
> mozTXTToHTMLConv::FindURLEnd
>
> Implementation is tricky, I have not noticed anything that may be reused
> to improve heuristics for Org. Nowadays it is likely better to inspect
> autolinking code for GitHub/GitLab or widely used python packages.

If you have concrete proposals, please share them.

> I would consider [:space:] or \s-.

Do you mean "[^[:punct:][:space:]\t\n]"?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at ,
or support my work at