From: Sebastian Rose <sebastian_rose@gmx.de>
To: Carsten Dominik <carsten.dominik@gmail.com>
Cc: org-mode mailing list <emacs-orgmode@gnu.org>,
	Bernt Hansen <bernt@norang.ca>
Subject: Re: Re: Custom entry IDs in HTML export
Date: Thu, 16 Apr 2009 15:14:39 +0200
Message-ID: <87vdp4u4e8.fsf@kassiopeya.MSHEIMNETZ>
In-Reply-To: <43758593-D9D0-43BC-B4D9-14E036C66271@gmail.com> (Carsten Dominik's message of "Thu, 16 Apr 2009 13:28:52 +0200")

Hm - counter arguments?

The only counter argument is that hand-made IDs for links are prone to
error. But that risk should be up to the user.

I have actually changed my mind a little on this point.

If the user clicks a section link in the TOC to jump to a section, he
can bookmark the page with exactly that jump target. If the jump target
(the ID) is human readable, the bookmark is more descriptive.

Just one wish:

  The containers should reflect that change (HRID = human readable id):

  <div   id="outline-container-HRID">
    <h4  id="HRID">                   headline    </h4>
    <div id="outline-text-HRID">
       section content...

That way the script would keep working with older pages.
Automatic IDs and human readable ones could be mixed.
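
To make the wish concrete, here is a minimal sketch (not the actual
org-info.js code) of how the script could derive both container IDs from
a headline ID. The names `containerIdsFor` and `AUTO_ID` are made up for
illustration, and I am assuming that automatic IDs look like `sec-2` or
`sec-2.1`, with containers that carry only the number:

```javascript
// Assumption: automatic headline IDs look like "sec-2" or "sec-2.1", and
// their containers carry only the number ("outline-container-2").
const AUTO_ID = /^sec-(\d+(?:[._-]\d+)*)$/;

// Hypothetical helper: map a headline ID to the IDs of its two containers.
function containerIdsFor(headlineId) {
  const auto = AUTO_ID.exec(headlineId);
  // Automatic ID: reuse the number. Human readable ID: reuse it unchanged.
  const suffix = auto ? auto[1] : headlineId;
  return {
    isAutomatic: auto !== null,
    container: "outline-container-" + suffix,
    text: "outline-text-" + suffix,
  };
}
```

With this, `containerIdsFor("sec-2").container` is `outline-container-2`,
while `containerIdsFor("human-readable").container` is
`outline-container-human-readable`. A hand-made ID that happens to look
like `sec-2` would be taken for an automatic one, which is part of the
risk that is left to the user.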

The '<a id="">' anchors are scanned anyway, as are all jump targets in
the page.

Maybe this is even the right moment to rework the parser of org-info.js
so that it becomes entirely independent of the TOC. The script could
search for headings instead. That's more work, but the script would then
work for all HTML pages with a structure similar to that of the Org
export:

 <div id=""><hx id=""></hx><div>content</div></div> 

but I could postpone this if you fulfill my wish above.
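
A rough sketch of such a TOC-independent search, assuming only the
nesting shown above. `findSections` is a hypothetical name, not an
existing org-info.js function. It only reads `tagName`, `id` and
`children`, so it should work on DOM elements as well as on plain
objects of the same shape:

```javascript
// Walk the tree and collect every <div id><hN id>...</hN>...</div>,
// without consulting the table of contents.
function findSections(root) {
  const sections = [];
  (function walk(node) {
    const kids = Array.from(node.children || []);
    const first = kids[0];
    const second = kids[1];
    const isContainer =
      /^div$/i.test(node.tagName) && Boolean(node.id) &&
      first !== undefined && /^h[1-6]$/i.test(first.tagName) &&
      Boolean(first.id);
    if (isContainer) {
      sections.push({
        containerId: node.id,
        headingId: first.id,
        level: Number(first.tagName.slice(1)),
        // True if the heading is followed by a <div> holding the content.
        hasBody: second !== undefined && /^div$/i.test(second.tagName),
      });
    }
    kids.forEach(walk); // subsections are nested inside their parent
  })(root);
  return sections;
}
```

In a browser this would be called as `findSections(document.body)`.
Containers without a heading as their first child (the TOC itself, for
example) are simply skipped.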

Best wishes


Carsten Dominik <carsten.dominik@gmail.com> writes:
> On Apr 16, 2009, at 10:50 AM, Sebastian Rose wrote:
>> Carsten Dominik <carsten.dominik@gmail.com> writes:
>>> Hi Sebastian,
>>> I kind of like the idea to have a property that can be
>>> used to set an ID, as an alternative to the <<target>>
>>> notation.  Actually, using a property seems a lot cleaner,
>>> thanks for coming up with this idea, Daniel.
>>> I can also follow the reasoning that it is useful to have
>>> the table of contents link to the human-readable id, because
>>> it provides a general, simple workflow to retrieve a link that
>>> will persist through changes of the document.  This workflow
>>> was described also by Bernt earlier in this thread.
>>> Finally, I also agree that the main id in the <h3> tag
>>> should be the automatically generated one because this is
>>> best for automatic processing and because of all the arguments
>>> you have presented.
>>> Would it cause problems for org-info.js if the toc points to
>>> a user specified anchor in the headline, instead of the main
>>> ID that is inside the <h3> tag?  This would really be the only
>>> required change.
>> I'll have to test this before I can give a final answer to this
>> question.
>> But regardless of the results, I will adjust the script to reflect that
>> change. The script should not rule the HTML export and it will be an
>> easy thing to do.
> But I do want to hear any counter arguments you might have....
> - Carsten
>>  Sebastian
>>> - Carsten
>>> On Mar 30, 2009, at 1:49 PM, Daniel Clemente wrote:
>>>> On Fri, Mar 27 2009, Sebastian Rose wrote:
>>>>> What we have now, just as Carsten said:
>>>>> # <<human-readable>>
>>>>> * Section B
>>>>> Creates this headline in HTML:
>>>>> <h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
>>>>> </h2>
>>>>> This is enough for all the use cases I can think of.
>>>> Yes, this is enough except for two things:
>>>> 1. The TOC still links to #sec-2 and the user can't change that
>>>> 2. Your syntax doesn't fold very well in the outliner. I mean: if you use
>>>>> # <<human-readable>>
>>>>> * Section B
>>>> then the comment appears at the end of the previous section, and you can
>>>> miss it when you are viewing the heading „Section B“. I would swap both
>>>> lines (solution 1):
>>>>> * Section B
>>>>> # <<human-readable>>
>>>> But since there are already LOGBOOK drawers under the heading, it would be a
>>>> lot clearer to use a property, like EXPORT_ID (solution 2):
>>>>> * Section B
>>>>> :PROPERTIES:
>>>>> :EXPORT_ID: human-readable
>>>>> :END:
>>>> In this way, the TOC can reliably find the EXPORT_ID, and then generate:
>>>>> <h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
>>>>> </h2>
>>>> (You could also leave *just* the human-readable id, but having two is not
>>>> bad.)
>>>> I would prefer solution 1, but I don't, because I'm not sure that the TOC
>>>> can find the ID if it is written as a comment anywhere under the heading
>>>> (and together with other things).
>>>> Solution 2 thus involves a new property to specify the human-readable
>>>> entry ID, which will be used to link to the entry. The automatic ID
>>>> (#sec-2) will still work for all entries.
>>>>> * Distinguishing automatic and human readable IDs
>>>>> One thing I like is, that we now _can_ distinguish the
>>>>> `human-readable-target' (human readable) from the `sec-2' (not human
>>>>> readable and not context related) using a regular expression.
>>>>> In org-info.js, I can now prefer the human readable ID in <a> over an
>>>>> automatically created one, and thus use that to create the links for `l'
>>>>> and `L'. The same holds true for other programming languages and
>>>>> parsers.
>>>>> If we open the <h3>'s ID for user defined values (bad), we cannot
>>>>> distinguish those IDs using a regular expression and there is no way
>>>>> to detect the human readable one. There will be no way to _know_ that
>>>>> the <a>'s ID is the preferred one used for human readable links.
>>>> Solution 2 doesn't break the parsing techniques you use; in fact it can also
>>>> make clearer which ID is the human readable one and which one is not.
>>>> This is not extremely important; just useful:
>>>> - for pages with many incoming links from external sites
>>>> - to ensure link integrity (now you can't be sure that links will still
>>>> work in 1 year ... or in some weeks)
>>>> - to avoid HTML visitors getting directed to a wrong section and not
>>>> finding what they searched for
>>>> Greetings,
>>>> Daniel
