From: Carsten Dominik
Subject: Re: Re: Custom entry IDs in HTML export
Date: Thu, 16 Apr 2009 23:26:48 +0200
Message-ID: <70EC5312-4BB8-4A7F-A2AD-7B96CBF7C068@gmail.com>
In-Reply-To: <87hc0ob9wc.fsf@kassiopeya.MSHEIMNETZ>
List-Id: "General discussions about Org-mode."
To: Sebastian Rose
Cc: org-mode mailing list, Bernt Hansen

On Apr 16, 2009, at 10:50 PM, Sebastian Rose wrote:

> Carsten Dominik writes:
>> Hi Sebastian,
>>
>> On Apr 16, 2009, at 3:14 PM, Sebastian Rose wrote:
>>
>>> Hm - counter arguments?
>>>
>>> The only counter argument is, that hand made IDs for links are prone to
>>> error. But that risk should be up to the user.
>>
>> Yes. and during the export, I can actually check and throw a warning or an
>> error if the same custom ID shows up twice.
>>
>>>
>>> I actually changed my mind a little in this concern.
>>>
>>> If the user clicks a section link in the toc to jump to a section, he
>>> can bookmark the page with exactly that jump target. If the jump target
>>> (the ID) is human readable, the bookmark is more verbose.
>>
>> Yes, this is really the best application. Also, when hovering over internal
>> links, it is helpful if the link displays the human-readable form.
>>
>>> Just one wish:
>>>
>>> The containers should reflect that change (HRID = human readable id):
>>>
>>>

headline

>>>
>>> sections content... >>>
>>>
>>
>>
>> Sure, we can do this. I would then add sec-xxx as one
>> of the alternative anchors as well.
>>
>> However: If I make the structure as you indicate above,
>> do I understand correctly that the structure of a section without a
>> human-readable id should be changed to this:
>>
>>
>>

headline

>>
>> sections content... >>
>>
>>
>>
>> Note the "sec-" which is added to the stuff that currently
>> defines the structure.
>
>
>
> I considered the `sec-' part of the automatic IDs.
>
> In either case I'd have to adjust org-info.js. So why not go for the
> human readable IDs without `sec-'?
>
>
> Right now we have:
>
>
>

2 Things I want to find out

>
>
> The `sec-' part is in the headlines ID only. Why?

Because this introduced a parsing inconsistency for you between
automatic and custom IDs. Because for the automatic ones, you need to
strip "sec-" to retrieve the correct suffix for the container etc
names. With the custom IDs, no such stripping should be done. Does
this not make things harder?

- Carsten

>
>
>
> Sebastian
>
>
>
>
>>> That way the script would keep working with older pages.
>>> Automatic IDs and human readable ones could be mixed.
>>>
>>>
>>> The '' anchors are scanned anyway, as are all jump targets in
>>> the page.
>>
>> Yes, you implemented that some time ago, I remember.
>>
>>>
>>> Maybe this is even the point to re-work the parser of org-info.js to
>>> become independent of the TOC at all. The script could search for
>>> headings instead. That's more work, but the script would then work for
>>> all HTML pages with a structure similar to the org-export's one:
>>
>> So this would mean, we could read web pages with your java
>> support even if those webpages were not created with Org?
>> Pretty cool.
>>
>>>
content
>>>
>>> but I could postpone this, if you fulfill my wish above.
>>
>>
>> Best wishes
>>
>> - Carsten
>>
>>>
>>>
>>> Best wishes
>>>
>>> Sebastian
>>>
>>>
>>>
>>>
>>> Carsten Dominik writes:
>>>> On Apr 16, 2009, at 10:50 AM, Sebastian Rose wrote:
>>>>
>>>>> Carsten Dominik writes:
>>>>>> Hi Sebastian,
>>>>>>
>>>>>> I kind of like the idea to have a property that can be
>>>>>> used to set an ID, as an alternative to the <>
>>>>>> notation. Actually, using a property seems a lot cleaner,
>>>>>> thanks for coming up with this idea, Daniel.
>>>>>>
>>>>>> I can also follow the reasoning that it is useful to have
>>>>>> the table of contents link to the human-readable id, because
>>>>>> it provides a general, simple workflow to retrieve a link that
>>>>>> will persist through changes of the document. This workflow
>>>>>> was described also by Bernt earlier in this thread.
>>>>>>
>>>>>> Finally, I also agree that the main id in the tag
>>>>>> should be the automatically generated one because this is
>>>>>> best for automatic processing and because of all the arguments
>>>>>> you have presented.
>>>>>>
>>>>>> Would it cause problems for org-info.js if the toc points to
>>>>>> a user specified anchor in the headline, instead of the main
>>>>>> ID that is inside the tag? This would really be the only
>>>>>> required change.
>>>>>
>>>>>
>>>>> I'll have to test this before I can give a final answer to this
>>>>> question.
>>>>>
>>>>> But regardless of the results, I will adjust the script to reflect that
>>>>> change. The script should not rule the HTML export and it will be an
>>>>> easy thing to do.
>>>>
>>>> But I do want to hear any counter arguments you might have....
>>>>
>>>> - Carsten
>>>>
>>>>>
>>>>> Sebastian
>>>>>
>>>>>
>>>>>
>>>>>> - Carsten
>>>>>>
>>>>>>
>>>>>> On Mar 30, 2009, at 1:49 PM, Daniel Clemente wrote:
>>>>>>
>>>>>>> On Fri, Mar 27 2009, Sebastian Rose wrote:
>>>>>>>>
>>>>>>>> What we have now, just as Carsten said:
>>>>>>>>
>>>>>>>> # <>
>>>>>>>> * Section B
>>>>>>>>
>>>>>>>> Creates this headline in HTML:
>>>>>>>>
>>>>>>>>

>>>>>>> a>2 Section B >>>>>>>>

>>>>>>>>
>>>>>>>> This is enough for all the use cases I can think of.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, this is enough except for two things:
>>>>>>> 1. The TOC still links to #sec-2 and the user can't change that
>>>>>>> 2. Your syntax doesn't fold very well in the outliner. I mean: if you use
>>>>>>>
>>>>>>>> # <>
>>>>>>>> * Section B
>>>>>>>
>>>>>>> then the comment appears at the end of the previous section, and you can miss
>>>>>>> it when you are viewing the heading „Section B“. I would swap both lines
>>>>>>> (solution 1):
>>>>>>>
>>>>>>>> * Section B
>>>>>>>> # <>
>>>>>>>
>>>>>>> But since there are already LOGBOOK drawers under the heading, it would be a
>>>>>>> lot clearer to use a property, like EXPORT_ID (solution 2):
>>>>>>>
>>>>>>>> * Section B
>>>>>>>> :PROPERTIES:
>>>>>>>> :EXPORT_ID: human-readable
>>>>>>>> :END:
>>>>>>>
>>>>>>>
>>>>>>> In this way, the TOC can reliably find the EXPORT_ID, and then generate:
>>>>>>>>

>>>>>>> a>2 Section B >>>>>>>>

>>>>>>>
>>>>>>> (You could also leave *just* the human-readable id, but having two is not
>>>>>>> bad.)
>>>>>>>
>>>>>>>
>>>>>>> I would prefer solution 1, but I don't because I'm not sure that the TOC can
>>>>>>> find the ID if it is written as a comment anywhere under the heading (and
>>>>>>> together with other things).
>>>>>>>
>>>>>>> Solution 2 involves thus: a new property to specify the human-readable
>>>>>>> entry ID, which will be used to link to the entry. The automatic ID
>>>>>>> (#sec-2) will still work for all entries.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> * Distinguishing automatic and human readable IDs
>>>>>>>>
>>>>>>>> One thing I like is, that we now _can_ distinguish the
>>>>>>>> `human-readable-target' (human readable) from the `sec-2' (not human
>>>>>>>> readable and not context related) using a regular expression.
>>>>>>>>
>>>>>>>> In org-info.js, I can now prefer the human readable ID over an
>>>>>>>> automatically created one, and thus use that to create the links for `l'
>>>>>>>> and `L'. The same holds true for other programming languages and
>>>>>>>> parsers.
>>>>>>>>
>>>>>>>> If we open the 's ID for user defined values (bad), we can not
>>>>>>>> distinguish those ID's using a regular expression and there is no way
>>>>>>>> to detect the human readable one. There will be no way to _know_ that
>>>>>>>> the 's ID is the preferred one used for human readable links.
>>>>>>>>
>>>>>>>
>>>>>>> Solution 2 doesn't break the parsing techniques you use; in fact it can also
>>>>>>> make clearer which ID is the human readable one and which one not.
>>>>>>>
>>>>>>>
>>>>>>> This is not extremely important; just useful:
>>>>>>> - for pages with many incoming links from external sites
>>>>>>> - to ensure link integrity (now you can't assure that links will still work in
>>>>>>> 1 year ... or in some weeks)
>>>>>>> - to avoid that HTML visitors get directed to a wrong section and can't find
>>>>>>> what they searched
>>>>>>>
>>>>>>>
>>>>>>> Greetings,
>>>>>>> Daniel
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Emacs-orgmode mailing list
>>>>>>> Remember: use `Reply All' to send replies to the list.
>>>>>>> Emacs-orgmode@gnu.org
>>>>>>> http://lists.gnu.org/mailman/listinfo/emacs-orgmode
>>>>>>
>>>>>
>>>>> --
>>>>> Sebastian Rose, EMMA STIL - mediendesign, Niemeyerstr.6, 30449 Hannover
>>>>> Tel.: +49 (0)511 - 36 58 472
>>>>> Fax: +49 (0)1805 - 233633 - 11044
>>>>> mobil: +49 (0)173 - 83 93 417
>>>>> Email: s.rose@emma-stil.de, sebastian_rose@gmx.de
>>>>> Http: www.emma-stil.de
>>>>
>>>
>>
>
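
The parsing inconsistency raised in the reply at the top of this message (automatic IDs need "sec-" stripped to get the suffix shared with the container names, custom IDs do not) could be sketched roughly as below. This is only an illustration, not code from org-info.js or from the exporter: the function names are hypothetical, the "outline-container-" and "text-" prefixes are assumptions about the exported markup, and the pattern for automatic IDs (digits separated by "." or "_") is likewise assumed. The last helper illustrates the duplicate check on custom IDs mentioned earlier in the thread.

```javascript
// Assumed shape of automatic headline IDs: "sec-2", "sec-1.2", "sec-1_2".
const AUTO_ID = /^sec-(\d+(?:[._]\d+)*)$/;

// Return the suffix shared by a headline and its containers.
// Automatic IDs lose their "sec-" prefix; custom IDs are used unchanged.
function suffixForHeadlineId(headlineId) {
  const match = AUTO_ID.exec(headlineId);
  return match ? match[1] : headlineId;
}

// Build the (assumed) IDs of the elements that belong to a headline.
function relatedIds(headlineId) {
  const suffix = suffixForHeadlineId(headlineId);
  return {
    container: "outline-container-" + suffix,
    text: "text-" + suffix,
  };
}

// Report every ID that occurs more than once, in order of first repeat.
function findDuplicateIds(ids) {
  const seen = new Set();
  const duplicates = new Set();
  for (const id of ids) {
    if (seen.has(id)) {
      duplicates.add(id);
    }
    seen.add(id);
  }
  return Array.from(duplicates);
}
```

With these assumptions, relatedIds("sec-2") and relatedIds("human-readable") follow two different code paths, which is exactly the special case the reply asks about. Prefixing every headline ID the same way would let the branch in suffixForHeadlineId disappear.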