From: Sebastian Rose <sebastian_rose@gmx.de>
To: Carsten Dominik <carsten.dominik@gmail.com>
Cc: org-mode mailing list <emacs-orgmode@gnu.org>,
Bernt Hansen <bernt@norang.ca>
Subject: Re: Re: Custom entry IDs in HTML export
Date: Fri, 17 Apr 2009 00:37:49 +0200
Message-ID: <87prfcgr7m.fsf@kassiopeya.MSHEIMNETZ>
In-Reply-To: <70EC5312-4BB8-4A7F-A2AD-7B96CBF7C068@gmail.com> (Carsten Dominik's message of "Thu, 16 Apr 2009 23:26:48 +0200")
Carsten Dominik <carsten.dominik@gmail.com> writes:
> On Apr 16, 2009, at 10:50 PM, Sebastian Rose wrote:
>
>> Carsten Dominik <carsten.dominik@gmail.com> writes:
>>> Hi Sebastian,
>>>
>>> On Apr 16, 2009, at 3:14 PM, Sebastian Rose wrote:
>>>
>>>> Hm - counter arguments?
>>>>
>>>> The only counter argument is, that hand made IDs for links are prone to
>>>> error. But that risk should be up to the user.
>>>
>>> Yes, and during the export, I can actually check and throw a warning or an
>>> error if the same custom ID shows up twice.
>>>
>>>>
>>>> I actually changed my mind a little in this concern.
>>>>
>>>> If the user clicks a section link in the toc to jump to a section, he
>>>> can bookmark the page with exactly that jump target. If the jump target
>>>> (the ID) is human readable, the bookmark is more verbose.
>>>
>>> Yes, this is really the best application. Also, when hovering over internal
>>> links, it is helpful if the link displays the human-readable form.
>>>
>>>> Just one wish:
>>>>
>>>> The containers should reflect that change (HRID = human readable id):
>>>>
>>>> <div id="outline-container-HRID">
>>>> <h4 id="HRID"> headline </h4>
>>>> <div id="outline-text-HRID">
>>>> sections content...
>>>> </div>
>>>> </div>
>>>
>>>
>>> Sure, we can do this. I would then add sec-xxx as one
>>> of the alternative anchors as well.
>>>
>>> However: If I make the structure as you indicate above,
>>> do I understand correctly that the structure of a section without a
>>> human-readable id should be changed to this:
>>>
>>> <div id="outline-container-sec-1.1">
>>> <h4 id="sec-1.1"> headline </h4>
>>> <div id="outline-text-sec-1.1">
>>> sections content...
>>> </div>
>>> </div>
>>>
>>>
>>> Note the "sec-" which is added to the stuff that currently
>>> defines the structure.
>>
>>
>>
>> I considered the `sec-' part of the automatic IDs.
>>
>> In either case I'd have to adjust org-info.js. So why not go for the
>> human readable IDs without `sec-'?
>>
>>
>> Right now we have:
>>
>> <div id="outline-container-2" class="outline-2">
>> <h2 id="sec-2"><span class="section-number-2">2</span> Things I want to find
>> out </h2>
>> <div class="outline-text-2" id="text-2">
>>
>> The `sec-' part is in the headlines ID only.
>
>
> Why? Because this introduced a parsing inconsistency for you between automatic
> and custom IDs. Because for the automatic ones, you need to strip "sec-" to
> retrieve the correct suffix for the container etc names. With the custom IDs,
> no such stripping should be done. Does this not make things harder?
>
> - Carsten

That's the way it is _now_. The structure above is taken from one of my
exported org-files. But it's not that hard to strip `sec-' :)

Right now the scanning treats `sec-' as a prefix, just like
`outline-container-' and `outline-text-'.
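
To illustrate the stripping, here is a minimal sketch. It is not code from
org-info.js: `currentIds' is a hypothetical helper, and the `text-' prefix
for the text container is an assumption read off the exported HTML quoted
above (id="text-2").

```javascript
// Hypothetical helper, not taken from org-info.js.
// Assumes the current export layout quoted above:
//   heading id "sec-N", container id "outline-container-N", text id "text-N".
function currentIds(headingId) {
  // Strip the `sec-' prefix to recover the bare section suffix.
  const suffix = headingId.replace(/^sec-/, "");
  return {
    container: "outline-container-" + suffix,
    text: "text-" + suffix,
  };
}
```

With this layout, a heading ID that is not an automatic `sec-' ID would be
passed through unchanged, which is the inconsistency discussed above.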

But in the future:

If we now plan to use human-readable IDs in the TOC, those IDs would be
the IDs of the section headings. That's why those IDs should have no
`sec-' prefix.

Otherwise, bookmark URLs would not be what we want them to be:

  http://orgmode.org/org-faq.php#sec-isearch-in-links

instead of

  http://orgmode.org/org-faq.php#isearch-in-links

Automatic IDs, on the other hand, must have a prefix, since an ID may
_not_ start with a number.

So wouldn't it make sense to change the IDs of the containers this way:

Case _automatic_:

  <div id="outline-container-sec-1.1" ... >
    <h3 id="sec-1.1"> .... </h3>
    <div id="outline-text-sec-1.1" ... >
      ....
    </div>
  </div>

Case _human-readable_:

  <div id="outline-container-isearch-in-links" ... >
    <h3 id="isearch-in-links"> .... </h3>
    <div id="outline-text-isearch-in-links" ... >
      ....
    </div>
  </div>

??
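
A minimal sketch of what this scheme would mean for a script such as
org-info.js: the container IDs follow from the heading ID by plain
concatenation, with no stripping in either case. The names `proposedIds'
and `isAutomaticId' are hypothetical, and the regular expression assumes
that automatic IDs always have the form `sec-' plus a dotted section number.

```javascript
// Hypothetical helpers, not taken from org-info.js.

// Derive both container IDs from the heading ID. The same rule applies to
// automatic IDs ("sec-1.1") and human-readable ones ("isearch-in-links").
function proposedIds(headingId) {
  return {
    container: "outline-container-" + headingId,
    text: "outline-text-" + headingId,
  };
}

// Assumption: automatic IDs look like "sec-" followed by a dotted number.
function isAutomaticId(id) {
  return /^sec-\d+(\.\d+)*$/.test(id);
}
```

Note that a user who picks a custom ID such as `sec-2' would be
misclassified by this test, so the export would have to reject or warn
about such IDs.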
Sebastian
>>
>>
>>
>> Sebastian
>>
>>
>>
>>
>>>> That way the script would keep working with older pages.
>>>> Automatic IDs and human readable ones could be mixed.
>>>>
>>>>
>>>> The '<a id="">' anchors are scanned anyway, as are all jump targets in
>>>> the page.
>>>
>>> Yes, you implemented that some time ago, I remember.
>>>
>>>>
>>>> Maybe this is even the point to re-work the parser of org-info.js to
>>>> become independent of the TOC at all. The script could search for
>>>> headings instead. That's more work, but the script would then work for
>>>> all HTML pages with a structure similar to the org-export's one:
>>>
>>> So this would mean, we could read web pages with your JavaScript
>>> support even if those webpages were not created with Org?
>>> Pretty cool.
>>>
>>>> <div id=""><hx id=""></hx><div>content</div></div>
>>>>
>>>> but I could postpone this, if you fulfill my wish above.
>>>
>>>
>>> Best wishes
>>>
>>> - Carsten
>>>
>>>>
>>>>
>>>> Best wishes
>>>>
>>>> Sebastian
>>>>
>>>>
>>>>
>>>>
>>>> Carsten Dominik <carsten.dominik@gmail.com> writes:
>>>>> On Apr 16, 2009, at 10:50 AM, Sebastian Rose wrote:
>>>>>
>>>>>> Carsten Dominik <carsten.dominik@gmail.com> writes:
>>>>>>> Hi Sebastian,
>>>>>>>
>>>>>>> I kind of like the idea to have a property that can be
>>>>>>> used to set an ID, as an alternative to the <<target>>
>>>>>>> notation. Actually, using a property seems a lot cleaner,
>>>>>>> thanks for coming up with this idea, Daniel.
>>>>>>>
>>>>>>> I can also follow the reasoning that it is useful to have
>>>>>>> the table of contents link to the human-readable id, because
>>>>>>> it provides a general, simple workflow to retrieve a link that
>>>>>>> will persist through changes of the document. This workflow
>>>>>>> was described also by Bernt earlier in this thread.
>>>>>>>
>>>>>>> Finally, I also agree that the main id in the <h3> tag
>>>>>>> should be the automatically generated one because this is
>>>>>>> best for automatic processing and because of all the arguments
>>>>>>> you have presented.
>>>>>>>
>>>>>>> Would it cause problems for org-info.js if the toc points to
>>>>>>> a user specified anchor in the headline, instead of the main
>>>>>>> ID that is inside the <h3> tag? This would really be the only
>>>>>>> required change.
>>>>>>
>>>>>>
>>>>>> I'll have to test this before I can give a final answer to this
>>>>>> question.
>>>>>>
>>>>>> But regardless of the results, I will adjust the script to reflect that
>>>>>> change. The script should not rule the HTML export and it will be an
>>>>>> easy thing to do.
>>>>>
>>>>> But I do want to hear any counter arguments you might have....
>>>>>
>>>>> - Carsten
>>>>>
>>>>>>
>>>>>> Sebastian
>>>>>>
>>>>>>
>>>>>>
>>>>>>> - Carsten
>>>>>>>
>>>>>>>
>>>>>>> On Mar 30, 2009, at 1:49 PM, Daniel Clemente wrote:
>>>>>>>
>>>>>>>> On Fri, Mar 27 2009, Sebastian Rose wrote:
>>>>>>>>>
>>>>>>>>> What we have now, just as Carsten said:
>>>>>>>>>
>>>>>>>>> # <<human-readable>>
>>>>>>>>> * Section B
>>>>>>>>>
>>>>>>>>> Creates this headline in HTML:
>>>>>>>>>
>>>>>>>>> <h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
>>>>>>>>> </h2>
>>>>>>>>>
>>>>>>>>> This is enough for all the use cases I can think of.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, this is enough except for two things:
>>>>>>>> 1. The TOC still links to #sec-2 and the user can't change that
>>>>>>>> 2. Your syntax doesn't fold very well in the outliner. I mean: if you use
>>>>>>>>
>>>>>>>>> # <<human-readable>>
>>>>>>>>> * Section B
>>>>>>>>
>>>>>>>> then the comment appears at the end of the previous section, and you can
>>>>>>>> miss it when you are viewing the heading „Section B“. I would swap both
>>>>>>>> lines (solution 1):
>>>>>>>>
>>>>>>>>> * Section B
>>>>>>>>> # <<human-readable>>
>>>>>>>>
>>>>>>>> But since there are already LOGBOOK drawers under the heading, it would
>>>>>>>> be a lot clearer to use a property, like EXPORT_ID (solution 2):
>>>>>>>>
>>>>>>>>> * Section B
>>>>>>>>> :PROPERTIES:
>>>>>>>>> :EXPORT_ID: human-readable
>>>>>>>>> :END:
>>>>>>>>
>>>>>>>>
>>>>>>>> In this way, the TOC can reliably find the EXPORT_ID, and then generate:
>>>>>>>>> <h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
>>>>>>>>> </h2>
>>>>>>>>
>>>>>>>> (You could also leave *just* the human-readable id, but having two is
>>>>>>>> not bad.)
>>>>>>>>
>>>>>>>>
>>>>>>>> I would prefer solution 1, but I don't because I'm not sure that the TOC
>>>>>>>> can find the ID if it is written as a comment anywhere under the heading
>>>>>>>> (and together with other things).
>>>>>>>>
>>>>>>>> Solution 2 thus involves a new property to specify the human-readable
>>>>>>>> entry ID, which will be used to link to the entry. The automatic ID
>>>>>>>> (#sec-2) will still work for all entries.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> * Distinguishing automatic and human readable IDs
>>>>>>>>>
>>>>>>>>> One thing I like is, that we now _can_ distinguish the
>>>>>>>>> `human-readable-target' (human readable) from the `sec-2' (not human
>>>>>>>>> readable and not context related) using a regular expression.
>>>>>>>>>
>>>>>>>>> In org-info.js, I can now prefer the human readable ID in <a> over an
>>>>>>>>> automatically created one, and thus use that to create the links for `l'
>>>>>>>>> and `L'. The same holds true for other programming languages and
>>>>>>>>> parsers.
>>>>>>>>>
>>>>>>>>> If we open the <h3>'s ID for user defined values (bad), we can not
>>>>>>>>> distinguish those IDs using a regular expression and there is no way
>>>>>>>>> to detect the human readable one. There will be no way to _know_ that
>>>>>>>>> the <a>'s ID is the preferred one used for human readable links.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Solution 2 doesn't break the parsing techniques you use; in fact it can
>>>>>>>> also make clearer which ID is the human readable one and which one not.
>>>>>>>>
>>>>>>>>
>>>>>>>> This is not extremely important; just useful:
>>>>>>>> - for pages with many incoming links from external sites
>>>>>>>> - to ensure link integrity (now you can't assure that links will still
>>>>>>>>   work in 1 year ... or in some weeks)
>>>>>>>> - to avoid that HTML visitors get directed to a wrong section and can't
>>>>>>>>   find what they searched
>>>>>>>>
>>>>>>>>
>>>>>>>> Greetings,
>>>>>>>> Daniel
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sebastian Rose, EMMA STIL - mediendesign, Niemeyerstr.6, 30449 Hannover
>>>>>> Tel.: +49 (0)511 - 36 58 472
>>>>>> Fax: +49 (0)1805 - 233633 - 11044
>>>>>> mobil: +49 (0)173 - 83 93 417
>>>>>> Email: s.rose@emma-stil.de, sebastian_rose@gmx.de
>>>>>> Http: www.emma-stil.de
>>>>>
>>>>
>>>
>>
>
--
Sebastian Rose, EMMA STIL - mediendesign, Niemeyerstr.6, 30449 Hannover
Tel.: +49 (0)511 - 36 58 472
Fax: +49 (0)1805 - 233633 - 11044
mobil: +49 (0)173 - 83 93 417
Email: s.rose@emma-stil.de, sebastian_rose@gmx.de
Http: www.emma-stil.de