emacs-orgmode@gnu.org archives
* HTML export uses anchor ids which change on every export
@ 2021-05-29 18:18 sbaugh
  2021-05-29 19:50 ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: sbaugh @ 2021-05-29 18:18 UTC
  To: emacs-orgmode


HTML export wraps headlines in anchor tags with IDs, so that they can be
linked by suffixing #[anchor-tag-ID] to the URL.

HTML export used to use anchor IDs like "sec-2" for the second headline,
but at some point it switched to generated IDs like "org7ffb324", which
change on every re-export.

This means anchor-links on external sites (that is, links which link to
a specific section of an org file) break every time an org file is
re-exported to HTML. The old style of anchor IDs would break URLs when
sections moved around, but at least it wouldn't break on every
re-export!

This makes org much less useful for typical web publishing use cases.

This can be worked around by setting CUSTOM_ID for every headline, which
will override the anchor id used, but I think it was much better when it
just worked by default...
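
For reference, the workaround means giving each headline a property
drawer containing a CUSTOM_ID, and the tedious part is noticing which
headlines still lack one. Below is a minimal sketch of such a check. It
is written in Python rather than Emacs Lisp, the parser is deliberately
naive (any :CUSTOM_ID: line found before the next headline counts), and
the name `headlines_missing_custom_id` and the sample document are made
up for illustration.

```python
import re

# A headline is one or more stars, whitespace, then a title.
HEADLINE = re.compile(r"^(\*+)\s+(.*\S)\s*$")
# A CUSTOM_ID property line as it appears inside a property drawer.
CUSTOM_ID = re.compile(r"^\s*:CUSTOM_ID:\s+(\S+)\s*$")


def headlines_missing_custom_id(org_text):
    """Return titles of headlines with no :CUSTOM_ID: line in their section."""
    missing = []
    current = None   # title of the headline currently being scanned
    has_id = False
    for line in org_text.splitlines():
        m = HEADLINE.match(line)
        if m:
            # Close out the previous headline before starting a new one.
            if current is not None and not has_id:
                missing.append(current)
            current, has_id = m.group(2), False
        elif current is not None and CUSTOM_ID.match(line):
            has_id = True
    if current is not None and not has_id:
        missing.append(current)
    return missing


SAMPLE = """\
* Installation
:PROPERTIES:
:CUSTOM_ID: installation
:END:
Some text.
* Usage
More text.
** Options
:PROPERTIES:
:CUSTOM_ID: usage-options
:END:
"""

print(headlines_missing_custom_id(SAMPLE))  # ['Usage']
```

In the sample, "Installation" would be reachable as #installation after
HTML export, while "Usage" would still get a generated ID.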

It looks like this was changed in commit
459033265295723cbfb0fccb3577acbfdc9d0285
"Export back-ends: Use `org-export-get-reference'"

Perhaps this functionality (of generating anchor IDs based on the
section number) could be added back in?




* Re: HTML export uses anchor ids which change on every export
  2021-05-29 18:18 HTML export uses anchor ids which change on every export sbaugh
@ 2021-05-29 19:50 ` Nicolas Goaziou
  2021-05-29 19:54   ` Timothy
  2021-06-08 23:31   ` Spencer Baugh
  0 siblings, 2 replies; 9+ messages in thread
From: Nicolas Goaziou @ 2021-05-29 19:50 UTC
  To: sbaugh; +Cc: emacs-orgmode

Hello,

sbaugh@catern.com writes:

> HTML export wraps headlines in anchor tags with IDs, so that they can be
> linked by suffixing #[anchor-tag-ID] to the URL.
>
> HTML export used to use anchor IDs like "sec-2" for the second headline,
> but at some point it switched to generated IDs like "org7ffb324", which
> change on every re-export.
>
> This means anchor-links on external sites (that is, links which link to
> a specific section of an org file) break every time an org file is
> re-exported to HTML. The old style of anchor IDs would break URLs when
> sections moved around, but at least it wouldn't break on every
> re-export!
>
> This makes org much less useful for typical web publishing use cases.
>
> This can be worked around by setting CUSTOM_ID for every headline, which
> will override the anchor id used, but I think it was much better when it
> just worked by default...
>
> It looks like this was changed in commit
> 459033265295723cbfb0fccb3577acbfdc9d0285
> "Export back-ends: Use `org-export-get-reference'"
>
> Perhaps this functionality (of generating anchor IDs based on the
> section number) could be added back in?

No, for public links, CUSTOM_ID is the only sane way to handle this.
Even "sec-2" could betray you if you slightly modify the document.
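
To make the "sec-2" failure mode concrete, here is a toy model in
Python. It is not Org's actual numbering code, and `positional_ids` is
a hypothetical helper that assigns IDs purely by position among
top-level headlines. Inserting one section near the top silently
re-points every later anchor.

```python
def positional_ids(titles):
    """Map 'sec-N' anchors to titles by position (toy model, top level only)."""
    return {f"sec-{n}": title for n, title in enumerate(titles, start=1)}


before = positional_ids(["Introduction", "Installation", "Usage"])
after = positional_ids(["Introduction", "Requirements", "Installation", "Usage"])

# An external link to #sec-2 was meant for "Installation"...
print(before["sec-2"])  # Installation
# ...and after the edit it still resolves, but to a different section.
print(after["sec-2"])   # Requirements
```

The link does not break visibly; it lands on the wrong section, which
is arguably harder to notice than a dead anchor.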

Regards,

-- 
Nicolas Goaziou



* Re: HTML export uses anchor ids which change on every export
  2021-05-29 19:50 ` Nicolas Goaziou
@ 2021-05-29 19:54   ` Timothy
  2021-05-29 23:10     ` Tim Cross
  2021-06-08 23:31   ` Spencer Baugh
  1 sibling, 1 reply; 9+ messages in thread
From: Timothy @ 2021-05-29 19:54 UTC
  To: Nicolas Goaziou; +Cc: sbaugh, emacs-orgmode


Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

> No, for public links, CUSTOM_ID is the only sane way to handle this.
> Even "sec-2" could betray you if you slightly modify the document.

Hi Nicolas,

On this, would you have any interest in going back to that thread
about IDs generated based on the headings? IIRC it petered out more than
it reached a conclusion.

--
Timothy.



* Re: HTML export uses anchor ids which change on every export
  2021-05-29 19:54   ` Timothy
@ 2021-05-29 23:10     ` Tim Cross
  2021-05-30  5:16       ` Timothy
  0 siblings, 1 reply; 9+ messages in thread
From: Tim Cross @ 2021-05-29 23:10 UTC
  To: emacs-orgmode


Timothy <tecosaur@gmail.com> writes:

> Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
>
>> No, for public links, CUSTOM_ID is the only sane way to handle this.
>> Even "sec-2" could betray you if you slightly modify the document.
>
> Hi Nicolas,
>
> On this, would you have any interest in going back to that thread
> about IDs generated based on the headings? IIRC it petered out more than
> it reached a conclusion.

I thought the conclusion was that if you wanted link stability, use
publish rather than export?

-- 
Tim Cross



* Re: HTML export uses anchor ids which change on every export
  2021-05-29 23:10     ` Tim Cross
@ 2021-05-30  5:16       ` Timothy
  2021-05-30  6:56         ` Tim Cross
  0 siblings, 1 reply; 9+ messages in thread
From: Timothy @ 2021-05-30  5:16 UTC
  To: Tim Cross; +Cc: emacs-orgmode


Tim Cross <theophilusx@gmail.com> writes:

> Timothy <tecosaur@gmail.com> writes:
>
>> On this, would you have any interest in going back to that thread
>> about IDs generated based on the headings? IIRC it petered out more than
>> it reached a conclusion.
>
> I thought the conclusion was that if you wanted link stability, use
> publish rather than export?

No conclusion on the viability of my approach being modified a bit then
integrated into Org.

--
Timothy



* Re: HTML export uses anchor ids which change on every export
  2021-05-30  5:16       ` Timothy
@ 2021-05-30  6:56         ` Tim Cross
  2021-05-30 12:11           ` Nicolas Goaziou
  0 siblings, 1 reply; 9+ messages in thread
From: Tim Cross @ 2021-05-30  6:56 UTC
  To: Timothy; +Cc: emacs-orgmode


Timothy <tecosaur@gmail.com> writes:

> Tim Cross <theophilusx@gmail.com> writes:
>
>> Timothy <tecosaur@gmail.com> writes:
>>
>>> On this, would you have any interest in going back to that thread
>>> about IDs generated based on the headings? IIRC it petered out more than
>>> it reached a conclusion.
>>
>> I thought the conclusion was that if you wanted link stability, use
>> publish rather than export?
>
> No conclusion on the viability of my approach being modified a bit then
> integrated into Org.

Perhaps I misunderstood. My reading was that none of the proposed
approaches were complete enough (in the sense they either introduced
other issues or, while addressing some corner cases, made it much harder
to address others, broke or failed to cater for other workflows).

I was left with the general impression that solving this issue required
a significant amount of re-development and a far more sophisticated
approach for tracking, caching/memoizing IDs and attempting to address
the issues just by patching the existing code was only going to make
small improvements while complicating the existing code and making it
harder to maintain. In short, a significant re-design and
re-implementation effort rather than application of patches on the
existing code base is required and until someone can do this work, the
best approach was to use publish instead of export if link stability was
required.


-- 
Tim Cross



* Re: HTML export uses anchor ids which change on every export
  2021-05-30  6:56         ` Tim Cross
@ 2021-05-30 12:11           ` Nicolas Goaziou
  0 siblings, 0 replies; 9+ messages in thread
From: Nicolas Goaziou @ 2021-05-30 12:11 UTC
  To: Tim Cross; +Cc: emacs-orgmode, Timothy

Hello,

Tim Cross <theophilusx@gmail.com> writes:

> Perhaps I misunderstood. My reading was that none of the proposed
> approaches were complete enough (in the sense they either introduced
> other issues or, while addressing some corner cases, made it much harder
> to address others, broke or failed to cater for other workflows).
>
> I was left with the general impression that solving this issue required
> a significant amount of re-development and a far more sophisticated
> approach for tracking, caching/memoizing IDs and attempting to address
> the issues just by patching the existing code was only going to make
> small improvements while complicating the existing code and making it
> harder to maintain. In short, a significant re-design and
> re-implementation effort rather than application of patches on the
> existing code base is required and until someone can do this work, the
> best approach was to use publish instead of export if link stability was
> required.

I agree on some points, but my analysis is slightly different. In
particular, it seems to me the whole topic is conflating problems. And
the mistake is to look for a single solution that solves them all.

First, _external_ link stability is a solved problem. Users need to use
CUSTOM_ID, no matter what they think about it. I do believe there is no
other automatic way to solve this. Only approximations of a solution,
which will bite you in one way or the other, as you noted.

Secondly, _internal_ link stability is not that important. By
definition, if you're not going to see them, you don't care about what
they look like, as long as they correctly link the expected parts of the
document. Current implementation of internal references guarantees all
internal links do work, with export or publish, but does not go further.
I don't think we need another solution for internal links since they do
the job.

This is not to say there is no problem to solve, of course. Currently,
internal links sometimes leak outside, which understandably bothers
users. Even though there is no ultimate solution for this besides
manually writing every link going to the outside, it may be possible to
mitigate the issue, if users accept getting bitten from time to time.

With that in mind, I think Timothy's solution goes in the right
direction, but, IMO, attempts to solve the problem at the wrong level,
i.e., by trying to unify all links (internal and external) into a single
banner. I'm not convinced this is doable, because expectations are so
different.

However, this kind of solution could be implemented in Org and used by
export back-ends generating external links. For example, Org Export
could provide, e.g., `org-export-punycode', and Org Export HTML could
use it instead of the internal `org-export-get-reference'. As I wrote already,
Org Export Texinfo does something similar for the nodes it generates.
Since those are meant to be external references, the back-end tries hard
to generate something meaningful (in `org-texinfo--get-node') and, as
a last resort, falls back to `org-export-get-reference'. Even though
I mentioned it in the other thread, it didn't attract much interest.

This is, I think, a practical way to address the actual problem, i.e.,
how to automatically generate pseudo-stable external links (note: I'm
writing this without contempt).

Regards,
-- 
Nicolas Goaziou



* Re: HTML export uses anchor ids which change on every export
  2021-05-29 19:50 ` Nicolas Goaziou
  2021-05-29 19:54   ` Timothy
@ 2021-06-08 23:31   ` Spencer Baugh
  2021-06-09 12:19     ` Nicolas Goaziou
  1 sibling, 1 reply; 9+ messages in thread
From: Spencer Baugh @ 2021-06-08 23:31 UTC
  To: Nicolas Goaziou; +Cc: emacs-orgmode

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> No, for public links, CUSTOM_ID is the only sane way to handle this.
> Even "sec-2" could betray you if you slightly modify the document.

I understand and agree. However, "sec-2" is strictly better than the
current situation in terms of link stability: There are many document
modifications that don't change "sec-2", and there are no document
modifications that don't change the current id format.

If some user likes link stability a little bit, but not enough to add
CUSTOM_ID to every single heading, then providing some option to
generate ids like "sec-2", which are stable in some situations for very
little cost, is good for that user.



* Re: HTML export uses anchor ids which change on every export
  2021-06-08 23:31   ` Spencer Baugh
@ 2021-06-09 12:19     ` Nicolas Goaziou
  0 siblings, 0 replies; 9+ messages in thread
From: Nicolas Goaziou @ 2021-06-09 12:19 UTC
  To: Spencer Baugh; +Cc: emacs-orgmode

Hello,

Spencer Baugh <sbaugh@catern.com> writes:

> Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
>> No, for public links, CUSTOM_ID is the only sane way to handle this.
>> Even "sec-2" could betray you if you slightly modify the document.
>
> I understand and agree. However, "sec-2" is strictly better than the
> current situation in terms of link stability: There are many document
> modifications that don't change "sec-2", and there are no document
> modifications that don't change the current id format.
>
> If some user likes link stability a little bit, but not enough to add
> CUSTOM_ID to every single heading, then providing some option to
> generate ids like "sec-2", which are stable in some situations for very
> little cost, is good for that user.

I disagree. "sec-2" is not "strictly better". Actually, long ago, Org
used "sec-2", or "outline-2", but we got bug reports about that (in
particular, it broke publishing) too. A weaker poison is no healthier.

Please note that, if you're exporting the same document again and again,
you ought to publish it, in which case referenced links are stable.

Also, not too long ago, Timothy had a different suggestion for the
internal link stability problem. One idea to move forward could be to
provide a defcustom to let users use whatever function they want to
generate internal links. I think, however, it might be tricky to have
that function handle duplicates properly.
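
The duplicate problem is easy to see in a small model. The sketch below
is Python, not the suggested defcustom, and `slugify` and `assign_ids`
are hypothetical names. It derives an ID from each headline title and
appends a counter when the resulting ID is already taken.

```python
import re


def slugify(title):
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def assign_ids(titles):
    """Return one unique ID per title, suffixing clashes with -1, -2, ..."""
    used = set()
    ids = []
    for title in titles:
        base = slugify(title) or "section"   # fallback for empty slugs
        candidate, n = base, 0
        while candidate in used:
            n += 1
            candidate = f"{base}-{n}"
        used.add(candidate)
        ids.append(candidate)
    return ids


print(assign_ids(["Setup", "Examples", "Setup", "FAQ & Tips"]))
# ['setup', 'examples', 'setup-1', 'faq-tips']
```

The suffix depends on document order, so swapping two headlines that
share a title also swaps their anchors. Uniqueness is handled here, but
stability is not.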

Regards,
-- 
Nicolas Goaziou



Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
