From: Nicolas Goaziou
To: Timothy
Cc: Tim Cross , emacs-orgmode@gnu.org, Samuel Loury
Subject: Re: stability of toc links
Date: Sun, 02 May 2021 14:10:12 +0200
Message-ID: <875z019w0r.fsf@nicolasgoaziou.fr>
In-Reply-To: <87a6pebkk8.fsf@gmail.com> (Timothy's message of "Sat, 01 May 2021 22:22:31 +0800")
References: <877dkzg9y2.fsf@nicolasgoaziou.fr> <87wnsx9rcj.fsf@nicolasgoaziou.fr>
 <87y2dc82ct.fsf@nicolasgoaziou.fr> <87sg3j4vbl.fsf@gmail.com>
 <33fd87ff1332b56114909973804df669@isnotmyreal.name> <87h7joibxc.fsf@gmail.com>
 <87pmycay41.fsf@gmail.com> <87h7jouiuk.fsf@nicolasgoaziou.fr>
 <875z03igpf.fsf@gmail.com> <87tunmskap.fsf@nicolasgoaziou.fr>
 <87fsz6boxa.fsf@gmail.com> <87fsz6sik0.fsf@nicolasgoaziou.fr>
 <87czuabm6w.fsf@gmail.com> <871raqsfzp.fsf@nicolasgoaziou.fr>
 <87a6pebkk8.fsf@gmail.com>
Mail-Followup-To: Timothy , Tim Cross , emacs-orgmode@gnu.org, Samuel Loury
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
List-Id: "General discussions about Org-mode."

Hello,

Timothy writes:

> Nicolas Goaziou writes:
>
>> I pointed out some concerns I have about the robustness of this system
>> already. I don't think you answered to any of them. I fear we may be
>> communicating past each other in this thread.
>
> Sorry about that. I'll try to address the bits I've missed in these last
> few emails.

Please note that those short answers did not help me much. So I did my
homework and looked at your code. I didn't test it thoroughly, so I may
be missing something.

>> references consist of alphanumeric characters only, so they are /de
>> facto/ compatible with any target format;
>
> This is uses characters from [a-z0-9-]

Indeed. I didn't know about punycode. It has very interesting
properties.

Now, here's the elephant in the room: "puny.el" was included in Emacs
26.1. Org cannot make use of it yet.

Also, the bootstring algorithm, and yours, are very much
English-centered, as `org-reference-contraction-stripped-words' attests.
I insisted on non-Latin languages for a reason:

  (org-reference-contraction "こんにちは") => "28j2a3ar1p-"

or, for a not so long title

  (org-reference-contraction "こんにちはｺﾝﾆﾁﾊ") => "v8ttbvbva7si998jvba0bzb0m-"

which is arguably worse than "org1234567".

>> references are guaranteed to be unique in the document;
>
> The suffixed number I mentioned ensures this.

Unfortunately, because of these suffixes, you cannot guarantee stable
links during export, much like random references. For example, if you
first export

  * Foo bar

and if you later modify your document like this

  * Foo baz
  * Foo bar

your link will now point to the "baz" contents instead of "bar".

As a side note, this is the reason why I introduced randomness in
references in the first place.
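
To make the failure mode concrete, here is a minimal sketch in Python. It
is not the proposed `org-reference-contraction' code: `contract',
`assign_references' and the 6-character limit are hypothetical stand-ins
that only assume a truncated, title-based reference with a numeric suffix
added to duplicates in document order.

```python
def contract(title, max_len=6):
    """Hypothetical title contraction: lowercase, hyphenate, truncate."""
    return "-".join(title.lower().split())[:max_len]


def assign_references(titles):
    """Assign one reference per headline, in document order.

    The first headline with a given contraction keeps the bare
    contraction; later ones get a numeric suffix (assumed scheme).
    """
    seen = {}
    refs = []
    for title in titles:
        base = contract(title)
        count = seen.get(base, 0)
        seen[base] = count + 1
        refs.append(base if count == 0 else f"{base}-{count}")
    return refs


first_export = ["Foo bar"]
second_export = ["Foo baz", "Foo bar"]

# Map each generated reference to the headline it designates.
refs_before = dict(zip(assign_references(first_export), first_export))
refs_after = dict(zip(assign_references(second_export), second_export))

# The same reference string designates different headlines across exports.
print(refs_before["foo-ba"])
print(refs_after["foo-ba"])
```

Under these assumptions, a link saved after the first export silently
retargets the newly inserted headline after the second one, because the
suffix depends on the order of headlines and not on their identity.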

We cannot reference the first headline as "headline-1", the second one
as "headline-2", i.e., in a monotonic way, because we cannot assume
their order is fixed.

More importantly, the above is not limited to headlines with the exact
same title. Since your algorithm truncates output, this will happen in
various, less obvious, situations.

>> cross-references between documents are stable.
>
> I'm not quite sure what to make of this.

Since you don't implement something new but re-use the existing caching
mechanism, I don't think this is an issue.

>> Also, header content is not stable enough: when you're linking to the
>> custom ID, you may be able to change the title and yet preserve the
>> link.
>
> Custom IDs still work, so I don't quite see the point here.

How can you be sure?

The point is that in some export back-ends, e.g., ASCII, you will only
provide a single reference for a headline, i.e., not one for the title
and another one for the custom ID. If your reference is based solely on
the title, the reference will break whenever you modify the title
without touching the custom ID. I gave an example in an earlier post
already.

This is a regression wrt the current system.

In a nutshell:

- there are very interesting points in your proposal;
- it is not applicable at the moment;
- it greatly improves references for the English language, it is
  slightly better for Latin languages, and worse for non-Latin ones;
- it does not guarantee link stability during export;
- it introduces a regression wrt custom ID.

Notwithstanding the problem of "puny.el", the regression makes it not
suitable as a drop-in replacement for random `org-export-get-reference'
yet. With more work, it can become an interesting evolution of
`org-export-get-reference', however. Since this regression does not
affect HTML export back-ends, it could be used there meanwhile.

Link stability is still an issue, even if the proposal gives a false
sense of security in that area.
I don't think we can solve it without creating a cache for export, where
you store all previous references for a given file. Even this is not
sufficient, because you can export buffers not attached to files.

Regards,
-- 
Nicolas Goaziou