From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <emacs-orgmode-bounces+larch=yhetil.org@gnu.org>
Received: from mp1.migadu.com ([2001:41d0:303:e16b::])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	by ms13.migadu.com with LMTPS
	id AG7/CljUdmYecAEA62LTzQ:P1
	(envelope-from <emacs-orgmode-bounces+larch=yhetil.org@gnu.org>)
	for <larch@yhetil.org>; Sat, 22 Jun 2024 13:40:40 +0000
Received: from aspmx1.migadu.com ([2001:41d0:303:e16b::])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	by mp1.migadu.com with LMTPS
	id AG7/CljUdmYecAEA62LTzQ
	(envelope-from <emacs-orgmode-bounces+larch=yhetil.org@gnu.org>)
	for <larch@yhetil.org>; Sat, 22 Jun 2024 15:40:40 +0200
X-Envelope-To: larch@yhetil.org
Authentication-Results: aspmx1.migadu.com;
	dkim=pass header.d=posteo.net header.s=2017 header.b=YYkVWfIh;
	spf=pass (aspmx1.migadu.com: domain of "emacs-orgmode-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="emacs-orgmode-bounces+larch=yhetil.org@gnu.org";
	dmarc=pass (policy=none) header.from=posteo.net
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=yhetil.org;
	s=key1; t=1719063640;
	h=from:from:sender:sender:reply-to:subject:subject:date:date:
	 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
	 content-type:content-type:in-reply-to:in-reply-to:
	 references:references:list-id:list-help:list-unsubscribe:
	 list-subscribe:list-post:dkim-signature;
	bh=1fwfy+Nk6GPys+HVqq+a+4AaokF4Ird0b/QBFy4LT1I=;
	b=LzSA/V4K3Px+SQZxyYEPJ/Nuw3kVSeeIC5k7SXcBmpptM+9cEetr8v0Be3GnAlMA6CqUZv
	PtnNFAZsjxq0HL5+MZEQnFD5RoiP4q0AXSTnqnYx1LiNBWiZQVZlbK77CdzqP5o2RHWgM8
	OUOAamYL4BSiGvjYmXMI0e4wxKAExpkPcHN/M9xy2RVMrgTXO9HQwUwTPp8mO7reA+1tXs
	5oMNmdxp737RCb5fukHzTPWpvDH/TC0akpHBPv1z0GiLP0UisEYZtdlj9uzj2USNT40iy/
	8qQdnryt6q4117k76pfs8MZFDkdzawvh7kiHD7lj0OeoNajOuMonW+3Zyj7kTA==
ARC-Authentication-Results: i=1;
	aspmx1.migadu.com;
	dkim=pass header.d=posteo.net header.s=2017 header.b=YYkVWfIh;
	spf=pass (aspmx1.migadu.com: domain of "emacs-orgmode-bounces+larch=yhetil.org@gnu.org" designates 209.51.188.17 as permitted sender) smtp.mailfrom="emacs-orgmode-bounces+larch=yhetil.org@gnu.org";
	dmarc=pass (policy=none) header.from=posteo.net
ARC-Seal: i=1; s=key1; d=yhetil.org; t=1719063640; a=rsa-sha256; cv=none;
	b=u5Ykpc+O6xgvtbDemMzQtkBLt9DGnhikD5LB+m8qLv3Huq5EDrJSvquSRlHWsuKfWL+iah
	Wr1/YLBC2SclAj0PgSiKASISvKSRFaz89+5cG7LNrpSIEtQ1ycmucIe1m7Lwq8eW8FAWJ5
	MMNTB5uCWgjHfG0NNuJ0w0trubzXjIXcPuhVdBN/DwY6N3qKSiPAFsfCXdSiEEbnCxBFQg
	tren83VQ6zxoMmrFCI+tg3oha1xEd6z0qHthAlMINp7kD28n+IKvUo7IizAygjPWFI2PCy
	yFskyf8ClSu8IOuVph4ksV8Jn3f8UeEWOVuVpUQNRTpxvTpYVkyRSYsuD4qKCA==
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by aspmx1.migadu.com (Postfix) with ESMTPS id DE44D157E0
	for <larch@yhetil.org>; Sat, 22 Jun 2024 15:40:39 +0200 (CEST)
Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <emacs-orgmode-bounces@gnu.org>)
	id 1sL0xx-0006dP-2u; Sat, 22 Jun 2024 09:39:41 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <yantar92@posteo.net>)
 id 1sL0xv-0006cm-N8
 for emacs-orgmode@gnu.org; Sat, 22 Jun 2024 09:39:39 -0400
Received: from mout01.posteo.de ([185.67.36.65])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <yantar92@posteo.net>)
 id 1sL0xr-0006Zm-Jd
 for emacs-orgmode@gnu.org; Sat, 22 Jun 2024 09:39:39 -0400
Received: from submission (posteo.de [185.67.36.169]) 
 by mout01.posteo.de (Postfix) with ESMTPS id D65D4240028
 for <emacs-orgmode@gnu.org>; Sat, 22 Jun 2024 15:39:32 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=posteo.net; s=2017;
 t=1719063572; bh=PF/6QsjZfzD60wwPhlUiVRxKyOkAajoT1yLukpeyGns=;
 h=From:To:Cc:Subject:Date:Message-ID:MIME-Version:Content-Type:
 From;
 b=YYkVWfIhdFu2Ou2agddH5tufcRZ9QvqoGsaW6xEa27tYXm4kCmkTx1cqs24ILKE/d
 LL3P3Lfc1lZAkbLyUVuWleObraxk6RyS2IUnLbgs53kMHTKiCeWC+ImnSPPQocClHO
 uvmHRrqFu85ldmuw0Urk78iCSmxu1CYTrlpUqAu4t5JbWy6ItW/RUdfv2fQQsEpn6L
 g6sDcMrqEuYOKs+3GeHUNITQf5hoC+2swhDu2X7qSv/KoeP2ZzEFcss/trzN+SAyZK
 dmzMRvSjNJfpsOQNazAYCZ0euSS9bZyYJCJTjblzdRk6kW6xGhKJ2kwC7slf4eVnEE
 WG1V9I83FNFOw==
Received: from customer (localhost [127.0.0.1])
 by submission (posteo.de) with ESMTPSA id 4W5wLq4qgPz6txm;
 Sat, 22 Jun 2024 15:39:31 +0200 (CEST)
From: Ihor Radchenko <yantar92@posteo.net>
To: Max Nikulin <manikulin@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: [BUG] Trailing dash is not included in link [9.7.3
 (9.7.3-2f1844 @ /home/mwillcock/.emacs.d/elpa/org-9.7.3/)]
In-Reply-To: <v516i1$kuv$1@ciao.gmane.io>
References: <87sexh9ddv.fsf@ice9.digital> <87le37k4c8.fsf@localhost>
 <v4n17e$gnr$1@ciao.gmane.io> <875xu86fq1.fsf@localhost>
 <v516i1$kuv$1@ciao.gmane.io>
Date: Sat, 22 Jun 2024 13:41:15 +0000
Message-ID: <874j9lhz7o.fsf@localhost>
MIME-Version: 1.0
Content-Type: text/plain
Received-SPF: pass client-ip=185.67.36.65; envelope-from=yantar92@posteo.net;
 helo=mout01.posteo.de
X-Spam_score_int: -43
X-Spam_score: -4.4
X-Spam_bar: ----
X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001,
 SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: emacs-orgmode@gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "General discussions about Org-mode." <emacs-orgmode.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-orgmode>,
 <mailto:emacs-orgmode-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/emacs-orgmode>
List-Post: <mailto:emacs-orgmode@gnu.org>
List-Help: <mailto:emacs-orgmode-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-orgmode>,
 <mailto:emacs-orgmode-request@gnu.org?subject=subscribe>
Errors-To: emacs-orgmode-bounces+larch=yhetil.org@gnu.org
Sender: emacs-orgmode-bounces+larch=yhetil.org@gnu.org
X-Migadu-Country: US
X-Migadu-Flow: FLOW_IN
X-Migadu-Queue-Id: DE44D157E0
X-Migadu-Scanner: mx13.migadu.com
X-Migadu-Spam-Score: -9.58
X-Spam-Score: -9.58
X-TUID: seXKaWbiwUrE

Max Nikulin <manikulin@gmail.com> writes:

>> If you can, please do not make such assertions without testing.
>
> I am sorry, I had no intention to offend you. I missed that the removed 
> line with explicit list of punctuation characters was commented out. I 
> have tried the regexp used before (a part of v6.34)

>      facedba05 2009-12-09 15:13:50 +0100 Carsten Dominik: Use John 
> Gruber's regular expression for URL's
>
> and it seems trailing dash was allowed.

Hmm. That's a really long time ago, earlier than the Org versions built
into the Emacs versions that are available in various distros. My
reading of "prior to v9.5" was more like "not too far before v9.5" (and
I tested everything down to the Org mode included in Emacs 26).

>>>> +: https://domain/test-
>>>
>>> example.org, example.net, example.com are domains reserved for usage in
>>> examples:
>>> <https://www.iana.org/assignments/special-use-domain-names/special-use-domain-names.xhtml>
>> 
>> And so?
>
> http://example.org/dash- may be a bit better for docs. (For IPv6 
> addresses the difference should be more noticeable, but I do not 
> remember what range is reserved for usage in examples there.)

I see. I would not mind installing a patch, if you submit it.

>>> I have realized that some Org regexps use [:punct:] *regexp class* and
>>> others *syntax class*, see latex math regexp. I am in doubts if the
>>> discrepancy is intentional.
>> 
>> It is not intentional, but using syntax classes can sometimes be
>> fragile.
>
> Do you mean that result depends on current buffer? I do not have strong 
> opinion what variant should be used.

Not the current buffer. The current syntax table, inherited from
outline-mode. That syntax table is customized by some users, which leads
to the Org parser behaving unexpectedly in some scenarios.

Also, there is the 'syntax-table text property, and I have managed to
break the Org parser in the past by trying to apply the 'syntax-table
property to code blocks in Org mode (I was trying to solve the
`forward-sexp' bug people frequently report).

So, we should generally avoid using syntax tables, so that Org syntax
becomes independent of user customizations in that area. Or, at least,
we should not introduce more syntax class uses when possible.
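
To illustrate the fragility, here is a minimal sketch (not code from
Org itself). The helper name `my/punct-matches' and the customization
of "." are made up for the example. It compares the syntax class `\s.'
with the regexp class `[[:punct:]]' on an ASCII character under the
standard syntax table and under a modified one.

```elisp
;; Sketch only: `my/punct-matches' is a hypothetical helper, not part of Org.
(defun my/punct-matches (table)
  "Return (SYNTAX-CLASS-MATCH REGEXP-CLASS-MATCH) for \".\" under TABLE."
  (with-temp-buffer
    ;; `string-match-p' consults the syntax table of the current buffer.
    (with-syntax-table table
      (list (and (string-match-p "\\s." ".") t)
            (and (string-match-p "[[:punct:]]" ".") t)))))

(let ((custom (make-syntax-table)))   ; inherits from the standard table
  ;; Assumed user customization: treat "." as a word constituent.
  (modify-syntax-entry ?. "w" custom)
  (list (my/punct-matches (standard-syntax-table))
        (my/punct-matches custom)))
```

On a stock Emacs I would expect the syntax class to stop matching once
"." is given word syntax, while `[[:punct:]]' keeps matching, since for
ASCII characters that regexp class does not consult the syntax table.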

> ... What I do not like is that in the 
> case of $n$-th the character after second "$" is tested against syntax 
> class, while regexp class is used for links. This subtle difference is 
> almost certainly ignored in alternative implementations of the parser. 
> However I am not sure what characters besides dash and apostrophe are 
> affected and whether it depends on locale.

These kinds of inconsistencies should be solved eventually. We should not
depend on locale, but use UTF syntax classes, and document it in the
org-syntax document.

>>> 09ced6d2c 2024-02-03 15:15:46 +0100 Ihor Radchenko: org-link-plain-re:
>>> Improve regexp heuristics
> [...]
>>>       (link http://example.org/a<b)
> [...]
>> It is heuristics. We cannot be 100% right. So, it is what it is.
>
> From my point of view it is at least close to a regression. I do not 
> have any argument against http://example.org/a<b>, but the regexp should 
> not match whole "http://example.org/a<b)"

No bug reports, so your point is rather theoretical.

I do not mind improving the regexp, of course, but I am afraid that we
will need PEG or `org-element--parse-paired-brackets' to match paired
brackets accurately. And that kind of change will be breaking - we will
need to trash the regexp variable.
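
To make the problem concrete, here is a rough sketch of what matching
paired brackets involves. It is neither PEG nor
`org-element--parse-paired-brackets'. The function name
`my/balanced-link-end' and the choice of bracket pairs are assumptions
for illustration only. It keeps a stack of expected closing brackets,
which a plain regexp cannot do.

```elisp
;; Sketch only: a hypothetical scanner, not how Org currently matches links.
(defun my/balanced-link-end (string)
  "Return the length of the longest prefix of STRING with balanced brackets.
Scanning stops at whitespace or at a closing bracket that does not
match the most recently opened one."
  (let ((pairs '((?\( . ?\)) (?< . ?>)))
        (stack nil)          ; closing brackets we still expect
        (last-balanced 0)    ; end of the last fully balanced prefix
        (i 0)
        (len (length string))
        (done nil))
    (while (and (not done) (< i len))
      (let ((c (aref string i)))
        (cond
         ((memq c '(?\s ?\t ?\n)) (setq done t))
         ((assq c pairs) (push (cdr (assq c pairs)) stack))
         ((rassq c pairs)
          (if (eq c (car stack))
              (pop stack)
            (setq done t)))))
      (unless done
        (setq i (1+ i))
        (when (null stack) (setq last-balanced i))))
    last-balanced))

;; Usage: keep only the balanced part of a candidate link.
(let ((s "http://example.org/a<b)"))
  (substring s 0 (my/balanced-link-end s)))
```

With this sketch, "http://example.org/a<b>" is kept whole, while
"http://example.org/a<b)" is cut back to "http://example.org/a".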

>>> I would consider [:space:] or \s-.
>> 
>> Do you mean "[^[:punct:][:space:]\t\n]"?
>
> I believe it might be an improvement ([:space:] includes \t).

https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=6cada29c0
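
For reference, a small sketch of how the character class quoted above
behaves on single characters. This is not the full link regexp, and
the helper name `my/link-end-char-p' is made up. The standard syntax
table is forced, because in Emacs `[:space:]' is itself defined via
whitespace syntax.

```elisp
;; Sketch only: tests the class "[^[:punct:][:space:]\t\n]" in isolation.
(defun my/link-end-char-p (char)
  "Return t if CHAR matches \"[^[:punct:][:space:]\\t\\n]\"."
  (with-temp-buffer
    (with-syntax-table (standard-syntax-table)
      (and (string-match-p "\\`[^[:punct:][:space:]\t\n]\\'" (string char))
           t))))

;; Letters pass; dash, slash, tab and space are rejected by this class.
(mapcar #'my/link-end-char-p '(?a ?- ?/ ?\t ?\s))
```

A trailing dash is rejected by this class alone, which is the
behaviour the original report is about. A tab is already covered by
`[:space:]' under the standard syntax table, as noted in the quoted
reply.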

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>