From mboxrd@z Thu Jan 1 00:00:00 1970
From: tftorrey@tftorrey.com (T.F. Torrey)
Subject: Re: Bug: New HTML exporter incorrect attributes
Date: Mon, 25 Feb 2013 03:55:38 -0700
Message-ID: <871uc4acid.fsf@lapcat.tftorrey.com>
References: <87fw0nibde.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain
Received: from eggs.gnu.org ([208.118.235.92]:51793)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1U9w0Z-0006JO-9B for emacs-orgmode@gnu.org;
 Mon, 25 Feb 2013 06:14:15 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1U9w0U-0004S7-Im for emacs-orgmode@gnu.org;
 Mon, 25 Feb 2013 06:14:10 -0500
Received: from mail-da0-f42.google.com ([209.85.210.42]:45279)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1U9w0U-0004R2-8N for emacs-orgmode@gnu.org;
 Mon, 25 Feb 2013 06:14:06 -0500
Received: by mail-da0-f42.google.com with SMTP id z17so1425209dal.15 for ;
 Mon, 25 Feb 2013 03:14:04 -0800 (PST)
In-Reply-To: <87fw0nibde.fsf@gmail.com> (message from Nicolas Goaziou on
 Sat, 23 Feb 2013 11:16:13 +0100)
List-Id: "General discussions about Org-mode."
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: Nicolas Goaziou
Cc: emacs-orgmode@gnu.org

Hello Nicolas,

Nicolas Goaziou writes:

> tftorrey@tftorrey.com (T.F. Torrey) writes:
>
>>> You don't assign attributes to an image in a paragraph, you assign
>>> attributes to the paragraph itself.
>>
>> It would be nice if there actually was a way to assign an attribute to a
>> paragraph, so that the ATTR_HTML: class="XXX" syntax would export as <p
>> class="XXX">, but that is a different issue.
>
> It would be ATTR_HTML: :class "XXX". I try to unify syntax for
> attributes with syntax for Babel and AFAICT, `html' is the last back-end
> to have key="value" syntax.

I see that this does not presently work, and the author listed on
ox-html.el is not currently active on this list. I hope you are not the
only one working on this. It would be our great misfortune for you to
become burned out.

>>> For the time being, Org syntax
>>> doesn't allow to specify attributes per link object.
>>
>> I think what you are saying is that the current intended behavior is for
>> whatever is specified by ATTR_HTML to apply to every image or link in
>> the paragraph.
>
> No. I am saying that ATTR_HTML behaviour in _undefined_ when a paragraph
> contains more than one link, as it has always been.
>
> If you carefully look at Org manual (in application with previous
> exporter framework), in "Images in HTML export", you will notice that
> HTML attributes only apply to a single link pointing to an image, not to
> a paragraph containing many links.

I see no such limitation in the Org manual (12.5.6). It says this:

  If you need to add attributes to an inlined image, use a `#+ATTR_HTML'.

Though the example that follows doesn't show a paragraph, calling them
"inline" indicates they will be within a paragraph. Org manual section
12.5.4 also shows ATTR_HTML applying to a hyperlink by itself, but
hyperlinks would rarely be used that way in real life, and in fact the
old exporter always applied ATTR_HTML attributes to the next item in a
paragraph.

I have always understood the manual to mean that an ATTR_HTML would
apply to *the next thing* in the document that it could, and that was
what happened in practice. That was a useful thing for them to do.

>>> As a consequence, attributes will be assigned to every link within the
>>> paragraph.
>>
>> Is this behavior helpful to anyone in any practical circumstances?
>
> I never said it was.
> It's not even a feature. I'm just explaining what
> is happening.

If it isn't intended behavior, and it isn't helpful, then we should make
it stop doing that.

>> Moreover, this means that, not only does the new exporter fail where the
>> old one succeeded,
>
> I worked hard to make the new export framework compatible with defined
> behaviour of previous exporter, not with handy undocumented side-effects
> it may have.
>
>> It seems to me that, whether the user is happy with the output or not,
>> the HTML exporter ought to produce valid HTML.
>
> I agree. But, in this case, you're using undefined Org syntax (which,
> admittedly, used to "work" for you).

The HTML exporter should produce valid HTML regardless of the input.

> If there's a simple patch that mimics this for html back-end, I don't
> mind applying it. But it still won't make up for a real solution.
>
> Unless, that is, it is decided that this behaviour is an official
> feature supported by Org, in which case, it should be added to the
> manual.

The Org manual describes ATTR_HTML as a feature that applies to the
following image or link. It makes no mention of restrictions to
following content in the paragraph, and neither does it say it will
apply to all following images or links.

The manual could be amended to say that ATTR_HTML applies to just the
next image or link. To fit the current situation, it might say, "In
cases where ATTR_HTML is applied to an image in a paragraph, following
links will not be made invalid." But why would anyone be expecting
invalid HTML in the first place?

Incidentally, I always thought that simply using another ATTR_HTML would
handle multiple images or links in the old exporter. In other words,
this:

#+ATTR_HTML: width="10" alt=" [Cool thing] "
[[file:cool_thing.jpg]] This is a paragraph about cool things.
#+ATTR_HTML: class="bar"
Cool thing found here [[http://example.com/][example.com]].

Would become this:

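<p>
<img src="cool_thing.jpg" width="10" alt=" [Cool thing] " />
This is a paragraph about cool things.
Cool thing found here <a href="http://example.com/" class="bar">example.com</a>.
</p>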

I don't remember using that in the old exporter, but I thought it would
work. It almost works in the new exporter, but it begins a new
paragraph before the second #+ATTR_HTML. I'm not sure this is the
intended behavior, though, because it isn't formatted like other new
paragraphs.

>> A more general workaround that would help everyone affected would be to
>> temporarily modify ox-html.el so that attributes from ATTR_HTML only
>> apply to the *first* item in the paragraph. This would have the
>> advantage of mimicking the behavior of the old exporter (thus not
>> breaking existing content) and of keeping images for other export
>> formats. Of course, anyone relying on the ATTR_HTML to set attributes
>> for every image and/or link in a paragraph would have to adopt a
>> different workaround, but ... does anyone really do this?
>
> It would solve your problem. But what if someone starts a paragraph with
> a regular link and thereafter, add an image? ATTR_HTML attributes would
> never reach it.

That's true, but I think you are making the perfect the enemy of the
good.

> Again, there's no proper solution besides modifying link syntax.

I agree that modifying the link syntax to support inline attributes is
the best solution. (IMHO, the syntax AsciiDoc uses is good, and would
be a good fit here.)

Alternatively, having ATTR_HTML (or something more general) apply to the
next thing, and having that work within paragraphs, is another
possibility. However, this may not fit within the limitations of the
new parser. Plus it's kind of ugly.

Until there is a "proper" solution, however, could we please modify the
exporter to apply ATTR_HTML to only the next image or link? I am very
sure that was the spirit of the old exporter, and it would be nice if I
could maintain my documents in Org without resorting to (even more)
hacks.

Best,
Terry

-- 
T.F. Torrey