emacs-orgmode@gnu.org archives
From: Carsten Dominik <carsten.dominik@gmail.com>
To: Sebastian Rose <sebastian_rose@gmx.de>
Cc: org-mode mailing list <emacs-orgmode@gnu.org>,
	Bernt Hansen <bernt@norang.ca>
Subject: Re: Re: Custom entry IDs in HTML export
Date: Fri, 17 Apr 2009 06:11:49 +0200
Message-ID: <F11A94A2-0EA4-4801-B9E8-BB49D00255F1@gmail.com>
In-Reply-To: <87prfcgr7m.fsf@kassiopeya.MSHEIMNETZ>


On Apr 17, 2009, at 12:37 AM, Sebastian Rose wrote:

> Carsten Dominik <carsten.dominik@gmail.com> writes:
>> On Apr 16, 2009, at 10:50 PM, Sebastian Rose wrote:
>>
>>> Carsten Dominik <carsten.dominik@gmail.com> writes:
>>>> Hi Sebastian,
>>>>
>>>> On Apr 16, 2009, at 3:14 PM, Sebastian Rose wrote:
>>>>
>>>>> Hm - counter arguments?
>>>>>
>>>>> The only counter argument is that hand-made IDs for links are
>>>>> prone to error. But that risk should be up to the user.
>>>>
>>>> Yes, and during the export I can actually check and throw a
>>>> warning or an error if the same custom ID shows up twice.
>>>>
>>>>>
>>>>> I actually changed my mind a little on this point.
>>>>>
>>>>> If the user clicks a section link in the TOC to jump to a section,
>>>>> they can bookmark the page with exactly that jump target. If the
>>>>> jump target (the ID) is human-readable, the bookmark is more
>>>>> descriptive.
>>>>
>>>> Yes, this is really the best application.  Also, when hovering over
>>>> internal links, it is helpful if the link displays the
>>>> human-readable form.
>>>>
>>>>> Just one wish:
>>>>>
>>>>> The containers should reflect that change (HRID = human-readable ID):
>>>>>
>>>>> <div   id="outline-container-HRID">
>>>>> <h4  id="HRID">                   headline    </h4>
>>>>> <div id="outline-text-HRID">
>>>>>  sections content...
>>>>> </div>
>>>>> </div>
>>>>
>>>>
>>>> Sure, we can do this.  I would then add sec-xxx as one
>>>> of the alternative anchors as well.
>>>>
>>>> However:  If I make the structure as you indicate above,
>>>> do I understand correctly that the structure of a section without a
>>>> human-readable id should be changed to this:
>>>>
>>>> <div   id="outline-container-sec-1.1">
>>>> <h4  id="sec-1.1">                   headline    </h4>
>>>> <div id="outline-text-sec-1.1">
>>>>  sections content...
>>>> </div>
>>>> </div>
>>>>
>>>>
>>>> Note the "sec-" which is added to the stuff that currently
>>>> defines the structure.
>>>
>>>
>>>
>>> I considered the `sec-' part of the automatic IDs.
>>>
>>> In either case I'd have to adjust org-info.js. So why not go for the
>>> human readable IDs without `sec-'?
>>>
>>>
>>> Right now we have:
>>>
>>> <div id="outline-container-2" class="outline-2">
>>> <h2 id="sec-2"><span class="section-number-2">2</span> Things I want to find out </h2>
>>> <div class="outline-text-2" id="text-2">
>>>
>>> The `sec-' part is in the headline's ID only.
>>
>>
>> Why?  Because this introduces a parsing inconsistency for you between
>> automatic and custom IDs.  For the automatic ones, you need to strip
>> "sec-" to retrieve the correct suffix for the container etc. names.
>> With the custom IDs, no such stripping should be done.  Does this not
>> make things harder?
>>
>> - Carsten
>
>
> That's the way it is _now_. The structure above is taken from one of my
> exported org-files. But it's not that hard to strip `sec-' :)
>
> Now the scanning considers `sec-' a prefix - just like
> `outline-container-' and `outline-text-'.
>
>
> But in the future:
>
>
> If we now plan to use human-readable IDs in the TOC, those IDs would be
> the IDs of the section headings. That's why those IDs should have no
> `sec-' prefix.
>
> Otherwise, bookmark URLs would not be what we want them to be:
>
>   http://orgmode.org/org-faq.php#sec-isearch-in-links
>
> instead of
>
>   http://orgmode.org/org-faq.php#isearch-in-links
>
>
>
> Automatic IDs, on the other hand, must have a prefix, since an ID may
> _not_ start with a number.
>
>
> So wouldn't it make sense to change the IDs of the containers this way:
>
>  Case _automatic_:
>
>       <div id="outline-container-sec-1.1" ... >
>         <h3 id="sec-1.1"> .... </h3>
>         <div id="outline-text-sec-1.1" ... >
>         ....
>         </div>
>       </div>
>
>  Case _human-readable_:
>
>       <div id="outline-container-isearch-in-links" ... >
>         <h3 id="isearch-in-links"> .... </h3>
>         <div id="outline-text-isearch-in-links" ... >
>         ....
>         </div>
>       </div>

Yes, it does make sense.  It only introduces one tiny restriction:  A
human-readable ID may not be something like sec-555, but that is
reasonable; we can document and enforce this.
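
To illustrate the restriction, here is a minimal Python sketch (not the
exporter's actual code, which is Emacs Lisp). It assumes that automatic IDs
look like "sec-" followed by a dotted section number; the regular expression
and the function name check_custom_ids are hypothetical. It also covers the
duplicate check and the "may not start with a number" rule mentioned in the
quoted discussion.

```python
import re

# Assumption: automatic IDs are "sec-" plus a dotted section number,
# e.g. "sec-2" or "sec-1.1". Custom IDs must not collide with this shape.
AUTOMATIC_ID = re.compile(r"^sec-\d+(\.\d+)*$")


def check_custom_ids(custom_ids):
    """Return a list of human-readable problems found in custom_ids."""
    problems = []
    seen = set()
    for cid in custom_ids:
        if AUTOMATIC_ID.match(cid):
            problems.append(f"{cid}: looks like an automatic ID")
        if cid[:1].isdigit():
            problems.append(f"{cid}: starts with a number")
        if cid in seen:
            problems.append(f"{cid}: used more than once")
        seen.add(cid)
    return problems
```

An exporter could turn a non-empty result into a warning or an error, as
suggested earlier in the thread.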

OK. This is what I have done now.  You need to use the property  
CUSTOM_ID.
Please do some testing, and then I will document this change.
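
For anyone testing, the naming scheme agreed on above can be summarized by
the following Python sketch. It is only an illustration of the ID layout in
the quoted HTML, not the exporter itself; the function name section_ids and
the dictionary keys are made up for this example.

```python
def section_ids(section_number, custom_id=None):
    """Return the three element IDs for one exported section.

    section_number: dotted number such as "1.1", used for automatic IDs.
    custom_id: value of the CUSTOM_ID property, if the entry has one.
    """
    # Automatic IDs get the "sec-" prefix; custom IDs are used as they are.
    base = custom_id if custom_id else f"sec-{section_number}"
    return {
        "container": f"outline-container-{base}",
        "headline": base,
        "text": f"outline-text-{base}",
    }
```

With this layout a script can go from any container ID back to the headline
ID by stripping only the "outline-container-" or "outline-text-" prefix, in
both the automatic and the human-readable case.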

Daniel, could you help with testing, please?

- Carsten

>
> ??
>
>
>  Sebastian
>
>
>
>>>
>>>
>>>
>>>  Sebastian
>>>
>>>
>>>
>>>
>>>>> That way the script would keep working with older pages.
>>>>> Automatic IDs and human readable ones could be mixed.
>>>>>
>>>>>
>>>>> The '<a id="">' anchors are scanned anyway, as are all jump targets
>>>>> in the page.
>>>>
>>>> Yes, you implemented that some time ago, I remember.
>>>>
>>>>>
>>>>> Maybe this is even the point to rework the parser of org-info.js to
>>>>> become independent of the TOC altogether. The script could search for
>>>>> headings instead. That's more work, but the script would then work
>>>>> for all HTML pages with a structure similar to the org-export's one:
>>>>
>>>> So this would mean we could read web pages with your JavaScript
>>>> support even if those web pages were not created with Org?
>>>> Pretty cool.
>>>>
>>>>> <div id=""><hx id=""></hx><div>content</div></div>
>>>>>
>>>>> but I could postpone this, if you fulfill my wish above.
>>>>
>>>>
>>>> Best wishes
>>>>
>>>> - Carsten
>>>>
>>>>>
>>>>>
>>>>> Best wishes
>>>>>
>>>>> Sebastian
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Carsten Dominik <carsten.dominik@gmail.com> writes:
>>>>>> On Apr 16, 2009, at 10:50 AM, Sebastian Rose wrote:
>>>>>>
>>>>>>> Carsten Dominik <carsten.dominik@gmail.com> writes:
>>>>>>>> Hi Sebastian,
>>>>>>>>
>>>>>>>> I kind of like the idea to have a property that can be
>>>>>>>> used to set an ID, as an alternative to the <<target>>
>>>>>>>> notation.  Actually, using a property seems a lot cleaner,
>>>>>>>> thanks for coming up with this idea, Daniel.
>>>>>>>>
>>>>>>>> I can also follow the reasoning that it is useful to have
>>>>>>>> the table of contents link to the human-readable id, because
>>>>>>>> it provides a general, simple workflow to retrieve a link that
>>>>>>>> will persist through changes of the document.  This workflow
>>>>>>>> was described also by Bernt earlier in this thread.
>>>>>>>>
>>>>>>>> Finally, I also agree that the main id in the <h3> tag
>>>>>>>> should be the automatically generated one because this is
>>>>>>>> best for automatic processing and because of all the arguments
>>>>>>>> you have presented.
>>>>>>>>
>>>>>>>> Would it cause problems for org-info.js if the toc points to
>>>>>>>> a user specified anchor in the headline, instead of the main
>>>>>>>> ID that is inside the <h3> tag?  This would really be the only
>>>>>>>> required change.
>>>>>>>
>>>>>>>
>>>>>>> I'll have to test this before I can give a final answer to this
>>>>>>> question.
>>>>>>>
>>>>>>> But regardless of the results, I will adjust the script to reflect
>>>>>>> that change. The script should not rule the HTML export and it will
>>>>>>> be an easy thing to do.
>>>>>>
>>>>>> But I do want to hear any counter arguments you might have....
>>>>>>
>>>>>> - Carsten
>>>>>>
>>>>>>>
>>>>>>> Sebastian
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> - Carsten
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mar 30, 2009, at 1:49 PM, Daniel Clemente wrote:
>>>>>>>>
>>>>>>>>> On Fri, Mar 27 2009, Sebastian Rose wrote:
>>>>>>>>>>
>>>>>>>>>> What we have now, just as Carsten said:
>>>>>>>>>>
>>>>>>>>>> # <<human-readable>>
>>>>>>>>>> * Section B
>>>>>>>>>>
>>>>>>>>>> Creates this headline in HTML:
>>>>>>>>>>
>>>>>>>>>> <h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
>>>>>>>>>> </h2>
>>>>>>>>>>
>>>>>>>>>> This is enough for all the use cases I can think of.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, this is enough except for two things:
>>>>>>>>> 1. The TOC still links to #sec-2 and the user can't change that
>>>>>>>>> 2. Your syntax doesn't fold very well in the outliner. I mean: if
>>>>>>>>> you use
>>>>>>>>>
>>>>>>>>>> # <<human-readable>>
>>>>>>>>>> * Section B
>>>>>>>>>
>>>>>>>>> then the comment appears at the end of the previous section, and
>>>>>>>>> you can miss it when you are viewing the heading „Section B“. I
>>>>>>>>> would swap both lines (solution 1):
>>>>>>>>>
>>>>>>>>>> * Section B
>>>>>>>>>> # <<human-readable>>
>>>>>>>>>
>>>>>>>>> But since there are already LOGBOOK drawers under the heading, it
>>>>>>>>> would be a lot clearer to use a property, like EXPORT_ID
>>>>>>>>> (solution 2):
>>>>>>>>>
>>>>>>>>>> * Section B
>>>>>>>>>> :PROPERTIES:
>>>>>>>>>> :EXPORT_ID: human-readable
>>>>>>>>>> :END:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In this way, the TOC can reliably find the EXPORT_ID, and then
>>>>>>>>> generate:
>>>>>>>>>> <h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
>>>>>>>>>> </h2>
>>>>>>>>>
>>>>>>>>> (You could also leave *just* the human-readable ID, but having two
>>>>>>>>> is not bad.)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I would prefer solution 1, but I don't because I'm not sure that
>>>>>>>>> the TOC can find the ID if it is written as a comment anywhere
>>>>>>>>> under the heading (and together with other things).
>>>>>>>>>
>>>>>>>>> Solution 2 thus involves a new property to specify the
>>>>>>>>> human-readable entry ID, which will be used to link to the entry.
>>>>>>>>> The automatic ID (#sec-2) will still work for all entries.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> * Distinguishing automatic and human readable IDs
>>>>>>>>>>
>>>>>>>>>> One thing I like is that we now _can_ distinguish the
>>>>>>>>>> `human-readable-target' (human-readable) from the `sec-2' (not
>>>>>>>>>> human-readable and not context-related) using a regular expression.
>>>>>>>>>>
>>>>>>>>>> In org-info.js, I can now prefer the human-readable ID in <a> over
>>>>>>>>>> an automatically created one, and thus use that to create the
>>>>>>>>>> links for `l' and `L'. The same holds true for other programming
>>>>>>>>>> languages and parsers.
>>>>>>>>>>
>>>>>>>>>> If we open the <h3>'s ID for user-defined values (bad), we cannot
>>>>>>>>>> distinguish those IDs using a regular expression and there is no
>>>>>>>>>> way to detect the human-readable one. There will be no way to
>>>>>>>>>> _know_ that the <a>'s ID is the preferred one used for
>>>>>>>>>> human-readable links.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Solution 2 doesn't break the parsing techniques you use; in fact it
>>>>>>>>> can also make clearer which ID is the human-readable one and which
>>>>>>>>> one is not.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is not extremely important; just useful:
>>>>>>>>> - for pages with many incoming links from external sites
>>>>>>>>> - to ensure link integrity (now you can't be sure that links will
>>>>>>>>>   still work in 1 year ... or in some weeks)
>>>>>>>>> - to avoid HTML visitors being directed to a wrong section where
>>>>>>>>>   they can't find what they searched for
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Greetings,
>>>>>>>>> Daniel
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Sebastian Rose, EMMA STIL - mediendesign, Niemeyerstr.6, 30449  
>>>>>>> Hannover
>>>>>>> Tel.:  +49 (0)511 - 36 58 472
>>>>>>> Fax:   +49 (0)1805 - 233633 - 11044
>>>>>>> mobil: +49 (0)173 - 83 93 417
>>>>>>> Email: s.rose@emma-stil.de, sebastian_rose@gmx.de
>>>>>>> Http:  www.emma-stil.de
>>>>>>
>>>>>
>>>>
>>>
>>
>

