To: emacs-orgmode@gnu.org
From: Maxim Nikulin
Subject: Re: stability of toc links
Date: Fri, 23 Apr 2021 22:15:06 +0700
References: <877dkzg9y2.fsf@nicolasgoaziou.fr> <87wnsx9rcj.fsf@nicolasgoaziou.fr> <87y2dc82ct.fsf@nicolasgoaziou.fr> <87sg3j4vbl.fsf@gmail.com> <87pmyn8v2c.fsf@nicolasgoaziou.fr>
In-Reply-To: <87pmyn8v2c.fsf@nicolasgoaziou.fr>

On 21/04/2021 23:24, Nicolas Goaziou wrote:
>
> In particular, I'm not sure to understand how one system can generate an
> ID based on the heading content and still limit itself to alphanumeric
> characters. For example, what ID are generated with the following
> document?

My impression is that such conversion is rather widespread in various
web CMSes and documentation generators. I strongly prefer human-readable
anchors (where I can guess the link content and realize whether I have
read it earlier) such as

https://werkzeug.palletsprojects.com/en/1.0.x/tutorial/#step-0-a-basic-wsgi-introduction

to codes like

https://orgmode.org/worg/org-hacks.html#org98f055b

I know Cyrillic is a trivial case in comparison to your example below;
however, it is the case where I can confirm that the result of
transliteration to ASCII is usually readable enough. It is usually
applied to an article title to generate a path component of the URL.
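To make the idea concrete, here is a minimal sketch of such a conversion. It is only an illustration, not what any particular CMS or Org exporter does. The names to_ascii and heading_anchor are hypothetical. It assumes the third-party unidecode package for transliteration and, when that package is missing, falls back to a much cruder standard-library approach that merely strips accents and drops other non-ASCII characters.

```python
import re
import unicodedata

try:
    # Optional third-party transliteration package (an assumption of this sketch).
    from unidecode import unidecode
except ImportError:
    unidecode = None


def to_ascii(text):
    """Transliterate text to ASCII, as well as the available tools allow."""
    if unidecode is not None:
        return unidecode(text)
    # Crude fallback: decompose accented letters and drop anything non-ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def heading_anchor(title):
    """Build a lowercase anchor limited to [a-z0-9-] from a heading title."""
    ascii_title = to_ascii(title).lower()
    # Collapse every run of other characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_title).strip("-")


print(heading_anchor("Step 0: A basic WSGI introduction"))
# prints: step-0-a-basic-wsgi-introduction
```

The result may be an empty string when nothing survives transliteration, so a caller would still need some other identifier for such headings.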
> --8<---------------cut here---------------start------------->8---
> * こんにちは
> * コンニチハ
> --8<---------------cut here---------------end--------------->8---

Sorry, I cannot estimate whether the following conversion is accurate
enough:

    python3 -c 'import unidecode; print(unidecode.unidecode("こんにちは"))'
    konnichiha
    python3 -c 'import unidecode; print(unidecode.unidecode("コンニチハ"))'
    konnitiha

Hex anchors could be a fallback if a smarter method cannot generate
something reasonable. Finally, exporters can generate compiler-like
warnings if some problem with anchor stability or ambiguity is detected.
A helper function may be suitable to fix the ID before a heading is
edited.

Actually, it was not obvious to me that IDs like org98f055b may be
stable. It is a hidden feature. I do not know if Samuel can adjust his
workflow to copy from "published" (to a local directory) files instead
of copying from the export buffer. I guess that, as a starting point, it
is necessary to pre-populate the cache with IDs from existing HTML
documents somehow.

Anyway, thank you for clarifying the role of publishing.
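A rough sketch of the last two ideas, collecting IDs from already published HTML and reporting problems in a compiler-like style, might look as follows. It is not based on how Org's publishing cache actually works. The names IdCollector, published_ids and anchor_warnings, the warning wording, and the default file name are all hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the values of all id attributes in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def published_ids(html_text):
    """Return the ids found in previously published HTML, in document order."""
    collector = IdCollector()
    collector.feed(html_text)
    collector.close()
    return collector.ids


def anchor_warnings(old_ids, new_ids, source="document.org"):
    """Return warning lines for ambiguous or vanished anchors."""
    warnings = []
    # Ambiguity: the same anchor generated for more than one heading.
    for anchor, count in sorted(Counter(new_ids).items()):
        if count > 1:
            warnings.append(
                f"{source}: warning: anchor '{anchor}' is used {count} times")
    # Stability: an anchor that was published earlier but is gone now.
    for anchor in sorted(set(old_ids) - set(new_ids)):
        warnings.append(
            f"{source}: warning: published anchor '{anchor}' no longer exists")
    return warnings
```

For example, feeding the HTML of an earlier export to published_ids and passing the result as old_ids would flag every link target that a new export silently dropped.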