From: Nicolas Goaziou
Subject: Re: Bug: HTML export ignoring CUSTOM_ID properties
Date: Sun, 19 Apr 2015 11:08:11 +0200
Message-ID: <87a8y4mqo4.fsf@nicolasgoaziou.fr>
References: <87egnhm3ue.fsf@nicolasgoaziou.fr> <87pp7021h7.fsf@jack.tftorrey.com>
In-Reply-To: <87pp7021h7.fsf@jack.tftorrey.com> (T. F. Torrey's message of "Sat, 18 Apr 2015 21:20:20 -0700")
List-Id: "General discussions about Org-mode."
To: "T.F. Torrey"
Cc: emacs-orgmode@gnu.org, rasmus@gmx.us

Hello,

tftorrey@tftorrey.com (T.F. Torrey) writes:

> Yes, changes on master can and do occasionally break Org, but they are
> *supposed* to work. You wouldn't leave the spreadsheet functionality in
> an unusable state and just tell people to use 8.2.

CUSTOM_ID was *supposed* to work after this change. Actually, it worked,
except in two use cases (custom CSS and linking from outside Org) in one
specific export back-end.

> But yes, it should be a simple matter to revert the commit that caused
> the problem for me until the problem can be addressed. That was the
> second thing I looked at.
> However, the place where this change happened
> is not obvious in the git logs. I still don't know where it came
> from.

AFAICT, a single commit affected "ox-html" significantly lately:
459033265295723cbfb0fccb3577acbfdc9d0285. Anyway, you may also bisect to
find the problematic commit; doing so might also improve the bug report.

> I did see that most (maybe all) of your changes are accompanied by
> tests. I'm not very familiar with the testing. Are the tests
> restricted to merely checking if the code explodes?

You cannot test everything. Moreover, we do not write tests specific to
export back-ends (i.e., "ox.el" is extensively tested but not
"ox-latex.el" or, in this case, "ox-html.el").

> They aren't lying because they don't claim to allow only valid ID's.
> They produce valid ID's on their own, but when a user calls for a
> specific ID (the {#clinton} construct in Markdown comes to mind), they
> just do what the user tells them to do. Which is a good thing.

No, it isn't a good thing in all cases.

In some circumstances, the user creates an ID possibly without knowing
about it, e.g. with targets and radio targets. How Org handles these
objects is an implementation detail, and shouldn't be thrown in the
face of users. However, it happened recently to some user (see ). The
culprit was `org-export-solidify-link-text', a very wrong function.

I didn't explain the problem because of its internal nature, but it
seems I should have.

For the record, `org-export-solidify-link-text' was just turning any
character that is neither alphanumeric nor among "_.-:" into a hyphen.
This function was also applied to many things, including CUSTOM_ID.
IOW, "clinton" became "clinton", but also both "clintén" and "clintàn"
became "clint-n". So, basically, CUSTOM_ID was already broken for anyone
using non-ASCII characters.
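To make the collision concrete, here is a minimal sketch of the
behaviour described above. It is only an illustration under stated
assumptions: it is written in Python rather than Emacs Lisp, it follows
the description in this message (one hyphen per offending character)
rather than the actual Org source, and the names `solidify` and
`collisions` are made up for the example.

```python
import re

# Hypothetical stand-in for the behaviour described in the message;
# this is NOT the Emacs Lisp source of `org-export-solidify-link-text'.
# Assumption: "alphanumeric" means ASCII letters and digits only.
_UNSAFE = re.compile(r"[^A-Za-z0-9_.:-]")


def solidify(text: str) -> str:
    """Replace each character outside ASCII alphanumerics and "_.-:" with "-"."""
    return _UNSAFE.sub("-", text)


def collisions(ids):
    """Group inputs by the ID they map to and keep only the clashing groups."""
    groups = {}
    for raw in ids:
        groups.setdefault(solidify(raw), []).append(raw)
    return {key: raws for key, raws in groups.items() if len(raws) > 1}


if __name__ == "__main__":
    samples = ["clinton", "clintén", "clintàn", "éé", "àà"]
    for raw in samples:
        print(f"{raw!r} -> {solidify(raw)!r}")
    # Any non-empty result means two distinct inputs share one ID.
    print(collisions(samples))
```

Since the mapping loses information, nothing downstream can tell the
clashing inputs apart once they have been turned into an ID.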
Of course, if it had been only about CUSTOM_ID, the solution would have
been to simply remove the call to `org-export-solidify-link-text' and
let the user handle it. But there are also radio targets, and to a
lesser extent, targets, which are expected to be human-readable. E.g.,
if, for some reason, I need to write <<<éé>>> and <<<àà>>> in some
document, I certainly don't want them to both refer to the "--" ID. Note
they could also be <<>> or <<>>, and could be exported through HTML,
LaTeX, etc., all with different expectations for their IDs.

Since there is no reason to impose restrictions about them on the user,
and since they are Org-specific features, using an internal reference is
fine in this case. However, `org-export-solidify-link-text' is not the
answer, as it is not injective: distinct inputs can produce the same ID.

Therefore, I implemented `org-export-get-reference', which relies on a
very basic and portable set of characters (alphanumeric ones) while
still ensuring stability of references. Of course, predictability is not
achieved, but it wasn't before either, except in the simplest cases.
Furthermore, it isn't a problem in practice since users are not expected
to (and shouldn't) rely on these references externally. Here, CUSTOM_ID
in the context of HTML export is the exception, not the rule.

I hope this clarifies the purpose of the change.

> In my view, the purpose of tools such as Org that convert documents to
> HTML is to do what the user tells them to do, even if that means
> creating invalid HTML. On many occasions in the past, and probably some
> in the present and the future, I have used conversion tools to produce
> technically invalid HTML as an intermediate format to be further
> processed by XSLT to a final product. A tool that refused to produce
> invalid HTML would be no help at all. In fact, I'm not aware of any
> tool that disallows that except maybe for some beginner level things.
>
> On the contrary, the slant of Org's development lately seems to be first
> to make sure users don't make any mistakes, and then to follow their
> instructions.

Again, this change isn't about protecting users, but about fixing an
incorrect function that is, alas, used widely across the code base and
affects many users.

>> I overlooked the problem in HTML and made a mistake. It happens, more
>> often than I would like. However, you are not required to be obnoxious
>> about it. It helps no one.
>
> Your mistakes are very rare, and your work is sincerely appreciated. I
> think your comment about my response is out of context, and I'm not sure
> your final statement is true. My polite comments were summarily
> dismissed, but now anyone who depended on CUSTOM_ID has been helped.

Communication, especially in writing, is sometimes misleading.
Nevertheless, I found the tone of your answer unnecessarily harsh.

In any case, your last message didn't trigger the fix. The first one
would have been sufficient, but I needed to get back online first.

> Thank you for your prompt action, but can I ask what you mean by
> "fixed"? Have you decided to revert CUSTOM_ID to its previous
> functionality? Are you still planning on changing its functionality
> and/or meaning?

I didn't plan to change CUSTOM_ID functionality in the first place. I
fixed a deeply internal function. This fix implied invasive changes, and
CUSTOM_ID was merely affected as a side effect.

> Are you still planning on throwing warnings or errors in the event of
> duplicate or invalid CUSTOM_ID's?

I have implemented something related recently, but I'll comment on
this in another thread.

Regards,

-- 
Nicolas Goaziou