emacs-orgmode@gnu.org archives
From: Ihor Radchenko <yantar92@gmail.com>
To: Timothy <tecosaur@gmail.com>
Cc: Nicolas Goaziou <mail@nicolasgoaziou.fr>,
	 Tim Cross <theophilusx@gmail.com>,
	 emacs-orgmode@gnu.org,  Samuel Loury <konubinix@gmail.com>
Subject: [FR] [Revived] Human readable / customizable link anchors during export (was: stability of toc links)
Date: Tue, 11 Oct 2022 19:44:32 +0800
Message-ID: <87fsfutwin.fsf@localhost>
In-Reply-To: <8735v4hoxl.fsf@gmail.com>

Timothy <tecosaur@gmail.com> writes:

>> Link stability is still an issue, even if the proposal gives a false
>> sense of security in that area. I don't think we can solve it without
>> creating a cache for export, where you store all previous references for
>> a given file. Even this is not sufficient, because you can export
>> buffers not attached to files.
>
> To me this is a case of "don't let the perfect be the enemy of the
> good", though I do see that a false sense of security may be
> problematic, I consider the benefits to outweigh this.

I would like to revive this thread, as there has been an important
development since this discussion: the org-persist library. It can
handle caching without the need for a dedicated cache implementation
for every use case.

To summarize the previous discussion:

- Org export currently generates ugly link anchors, which degrade the
  quality of the export output. In particular, HTML export can generate
  link anchors like
  https://orgmode.org/worg/org-contrib/babel/languages/ob-doc-ditaa.html#org96b5528
  Note the randomly generated #org96b5528 anchor.

- The anchors are not just unreadable, but also change on every single
  export. (The exception is ox-publish, which maintains an anchor
  cache, but not everyone uses or needs publishing.)

- The random anchors are there for a reason: it is difficult to
  derive anchors from the heading title/contents while avoiding
  duplicates and still keeping track of the same heading as it is
  modified.

  If we have

  * duplicated heading
  Text
  * duplicated heading
  Text

  we cannot usefully derive distinct anchors from the identical
  headings.

  Even worse,

  * unique heading
  * duplicated heading

  can be changed to

  * duplicated heading
  * duplicated heading

  and we cannot tell which old anchor belonged to which heading.

- Further, there appears to be no suitable _universal_ algorithm to
  generate human-readable anchors. Timothy proposed one, but it does not
  work well for non-Latin text.
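
To make the two failure modes above concrete, here is a minimal Python
sketch. It is not Org code and it is not the algorithm Timothy
proposed; `naive_slug` is a hypothetical ASCII-only slugger invented
purely for illustration. It shows identical headings collapsing to the
same anchor and a non-Latin heading producing an empty one.

```python
import re


def naive_slug(title: str) -> str:
    """Hypothetical slugger: lowercase, keep only runs of [a-z0-9], join with '-'."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


headings = ["duplicated heading", "duplicated heading", "Заголовок"]
slugs = [naive_slug(h) for h in headings]

# The first two slugs are identical (a collision); the third is empty
# because no character of the Cyrillic title survives the ASCII filter.
print(slugs)
```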

----------------------------

Proposal:

1. To avoid collisions, we can add randomness to the anchors:

   * This is headline

   will become #this-is-headline-<4 random letters>

2. The generated anchors will be cached according to headline text +
   headline contents + headline number in the document via org-persist.
   During subsequent exports, if two out of the three keys match, we
   reuse the cached anchor.

3. Instead of trying to find a silver bullet for a human-readable
   anchor generator, we allow users to customize it. The default will
   be the constant "org", yielding "org-Ajjq"-type anchors, just like
   we have now. But we can also provide other generators, like the one
   Timothy proposed, or better versions contributed in the future if
   there is demand.
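
The three points above could fit together roughly as in the following
Python sketch. It is an illustration under stated assumptions, not an
implementation for Org: a plain dict stands in for the org-persist
cache, and `HeadlineKey`, `score`, `new_anchor`, `assign_anchors` and
`constant_org` are hypothetical names. One detail goes beyond the
proposal as written: a cached anchor already claimed during the
current export is never handed out again, since otherwise two
identical headings would match each other's cache entries.

```python
import random
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadlineKey:
    text: str      # headline title
    contents: str  # headline body (a hash of it would do in practice)
    number: int    # position of the headline in the document


def constant_org(_title):
    """Default generator from point 3: always the prefix "org"."""
    return "org"


def score(a, b):
    """Count how many of the three keys agree (point 2)."""
    return (a.text == b.text) + (a.contents == b.contents) + (a.number == b.number)


def new_anchor(prefix, taken, rng):
    """Point 1: prefix plus four random letters, retried until unused."""
    while True:
        suffix = "".join(rng.choice(string.ascii_letters) for _ in range(4))
        anchor = f"{prefix}-{suffix}"
        if anchor not in taken:
            return anchor


def assign_anchors(headlines, cache, generator=constant_org, rng=random):
    """Return (anchors, new_cache) for one export of the given headlines."""
    claimed, new_cache, anchors = set(), {}, []
    for key in headlines:
        # Consider only cached anchors not yet claimed in this export.
        candidates = [(score(key, old), anchor)
                      for old, anchor in cache.items() if anchor not in claimed]
        best_score, best_anchor = max(candidates, key=lambda c: c[0], default=(0, ""))
        if best_score >= 2:
            anchor = best_anchor  # two of three keys match: reuse
        else:
            anchor = new_anchor(generator(key.text), claimed | set(cache.values()), rng)
        claimed.add(anchor)
        new_cache[key] = anchor
        anchors.append(anchor)
    return anchors, new_cache


# First export, then a second export after "unique heading" is renamed.
rng = random.Random(0)
before = [HeadlineKey("unique heading", "a", 1), HeadlineKey("duplicated heading", "b", 2)]
first, cache = assign_anchors(before, {}, rng=rng)
after = [HeadlineKey("duplicated heading", "a", 1), HeadlineKey("duplicated heading", "b", 2)]
second, cache = assign_anchors(after, cache, rng=rng)
print(first, second)
```

In this run the renamed heading still agrees with its old entry on
contents and number, so it keeps its anchor. A different generator is
passed as `generator=`, for example a slugger that returns
"this-is-headline" for "This is headline".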

WDYT?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

