From mboxrd@z Thu Jan 1 00:00:00 1970
From: tftorrey@tftorrey.com (T.F. Torrey)
Subject: Re: Bug: New HTML exporter incorrect attributes
Date: Mon, 25 Feb 2013 13:51:51 -0700
Message-ID: <87y5ec86c8.fsf@lapcat.tftorrey.com>
References: <87vc9gfqr3.fsf@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain
In-Reply-To: <87vc9gfqr3.fsf@gmail.com> (message from Nicolas Goaziou on Mon, 25 Feb 2013 14:49:04 +0100)
List-Id: "General discussions about Org-mode."
To: Nicolas Goaziou
Cc: emacs-orgmode@gnu.org

Hello Nicolas,

Thanks for your prompt reply, though I think our discussion is a little
off-track here, as noted below.

Nicolas Goaziou writes:

> tftorrey@tftorrey.com (T.F. Torrey) writes:
>
>>> It would be ATTR_HTML: :class "XXX". I try to unify syntax for
>>> attributes with syntax for Babel and AFAICT, `html' is the last back-end
>>> to have key="value" syntax.

This was your response to my comment that it would be handy to apply
attributes to paragraphs, not the links or images within them.

>> I see that this does not presently work, and the author listed on
>> ox-html.el is not currently active on this list. I hope you are not the
>> only one working on this. It would be our great misfortune for you to
>> become burned out.
>
> It's not much work once we agree about the real syntax. For example, for
> links, there are two ways to replace:

I think you are now talking about a syntax for adding attributes to
links and images within paragraphs. In the e-mail to which you are
replying, I thought we were agreeing that modifying the inline link and
image syntax was the best solution to these, and had thoughts on what
that might look like in another thread.

What you describe here still has the limitation that either all links
and images in a paragraph must have the same attributes, or some must
be invalid, or only the first is reachable. Apart from that, I have
other more general concerns with this approach noted below.

> #+ATTR_HTML: width="400px"
>
> The easiest one, is simply to ask for `:options' before:
>
> #+ATTR_HTML: :options "width=\"400px\""
>
> This is heavier but will be consistent with other back-ends.

Yes, this is heavy. Escaping the quotes is unwieldy, and raises doubts
about what else might need to be escaped. Also, given that the whole
and only point of the ATTR_HTML keyword is for describing options,
adding ":options" is redundant.

From a design standpoint, it might be elegant that it matches other
things, but here it seems very awkward, and I don't understand who it
would benefit.

This seems another step away from "plain text" toward "another
programming language". The first syntax looks like plain text. The
second looks like programming. For babel, and actual programming
languages, I'm sure this makes good sense. For applying a width to an
image, it's overkill.

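
To make the escaping worry concrete, here is a rough sketch in Python
(not elisp, and not how ox-html actually reads these lines). The helper
`read_attr_line` is made up for illustration, and it assumes shell-style
quoting as implemented by Python's `shlex`, which may well differ from
what the exporter does.

```python
import shlex

PREFIX = "#+ATTR_HTML:"


def read_attr_line(line):
    """Read a hypothetical ':key "value"' attribute line into a dict.

    Assumption: values follow shell-style quoting (shlex), so inside
    double quotes a backslash-quote pair means a literal quote.
    """
    if not line.startswith(PREFIX):
        raise ValueError("not an ATTR_HTML line")
    attrs = {}
    key = None
    for token in shlex.split(line[len(PREFIX):]):
        if token.startswith(":"):
            key = token[1:]          # ':options' -> 'options'
            attrs[key] = ""
        elif key is None:
            raise ValueError("value without a preceding :keyword")
        else:
            # Several bare words after one keyword are joined by spaces.
            attrs[key] = (attrs[key] + " " + token).strip()
    return attrs


# Escaped as in the proposal: the inner quotes survive.
escaped = read_attr_line(r'#+ATTR_HTML: :options "width=\"400px\""')
# escaped == {'options': 'width="400px"'}

# Backslashes forgotten: no error, the inner quotes are silently dropped.
unescaped = read_attr_line('#+ATTR_HTML: :options "width="400px""')
# unescaped == {'options': 'width=400px'}

# The per-property form needs no escaping at all.
plain = read_attr_line('#+ATTR_HTML: :width "400px"')
# plain == {'width': '400px'}
```

Under these assumed rules, forgetting the backslashes does not raise an
error; the inner quotes just disappear, which is the kind of surprise
that makes the heavier syntax worrying.
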
For instance, compare this "plain text":

#+ATTR_HTML: :options "width=\"400px\" title=\"My image\""
[[file:image.jpg]]

to the HTML alternative:

Shouldn't the plain text be more straightforward to enter and easier to
understand than what it is replacing?

Here is the same thing in AsciiDoc:

image:image.jpg["My image",width=400]

Granted, I've already noted that I think we are moving toward inline
attribute specifications, so this syntax for images and links is
probably moot anyway, but I think the larger point still stands.

I hate to say that I have other concerns as well, but I do, below.

> Otherwise, there is also:
>
> #+ATTR_HTML: :width "400px"
>
> But this requires to have a list of all properties supported. If we take
> that route, here is a suggested list of such properties for <a> tag:
>
> - rel
> - target
> - type
> - accesskey
> - class
> - style
> - title
>
> and for <img>
>
> - alt
> - height
> - width
>
> What do you think about it?

I think it is rather heavy-handed. I don't understand why this
"requires" a list of properties supported. The old exporter would
simply plug whatever you told it into the tag, trusting that you knew
what you were doing. I'm sure this simplified the code, and it gave
great power to the user. Why should the user need "permission" from the
developers to apply any arbitrary attributes to their elements?

Imposing these restrictions on users seems to make more work for the
users, and more work for the developers, to the benefit of no one.

Also, I don't know why attributes should be defined for each backend
rather than once for everywhere. The attributes would be designated for
an object, and each backend would decide which to use or ignore.

For instance, though I know the LaTeX syntax is not correct, this seems
like massive overkill for making a link red:

#+ATTR_LATEX: :options "text-color: red"
#+ATTR_HTML: :options "class=\"red\""
Here is a [[file:doc.html][red link]].

FWIW, the same thing in AsciiDoc would be this:

Here is a [red]#link:doc.html#[red link].

And it would work correctly for every backend, current or future. In
AsciiDoc, the attributes belong to the item, and every backend is free
to use or ignore them. Plain text sure looks appealing.

Again, this is applying the old ATTR_ syntax instead of the suggested
inline attribute designations, but if the new link syntax matches the
spirit of the existing structure, something like this is in the works:

Here is a [[file:doc.html][red link][@@html: "class=\"red\""@@ @@latex: "text-color: red"@@]].

IMHO, the AsciiDoc approach is much better.

>> The HTML exporter should produce valid HTML regardless of the input.
>
> We cannot remove the ability to shoot oneself in the foot. The HTML
> back-end cannot be responsible for undefined syntax. Think about:
>
> @@html:@@

I'm pretty sure you understand what I meant. The user should be free to
ruin things however he likes, but whatever the exporter produces itself
should be valid.

[... 48 lines omitted ...]

>> Until there is a "proper" solution, however, could we please modify the
>> exporter to apply ATTR_HTML to only the next image or link? I am very
>> sure that was the spirit of the old exporter, and it would be nice if I
>> could maintain my documents in Org without resorting to (even more)
>> hacks.
>
> Done.

Thank you very much.

> Regards,

Some closing thoughts:

I am concerned that the new parser is unnecessarily heavy-handed in
general. For instance, why should document #+OPTIONS be restricted to a
pre-defined set at all? It sure would be easier to hack if any
arbitrary property could be set there and accessed with
(org-get-property "whatever"). The same goes for node-level
EXPORT_OPTIONS. I know that these have always been hard-coded in
advance, but I was hoping the new parser would be a step toward
freedom, not the other way.
Surely the code would be simpler accepting any OPTION rather than
checking each value to see if it is approved and rejecting those that
are not.

I have been a dedicated (but mostly silent) Org user for years. I have
literally all of my life and livelihood documents in Org format. But
right now I have seven or more tabs in Firefox open to AsciiDoc
documentation. The syntax looks lightweight and easy and fun, like Org
used to.

I've been wondering for the past two days what it would take to use Org
syntax for headlines, structure, Org tables, dynamic blocks, and Babel
code, but to use AsciiDoc standards for the markup of the actual text.

One big benefit of Org is that it is entirely elisp. I'm not a very
good elisp programmer, and I don't know enough LaTeX, but I think even
I could make a parser that would turn AsciiDoc text into HTML, plain
text, and even DocBook.

I wonder if there would be support for that. Our current system has
support for basic inline styling (italic, bold, etc.), primitive links
and images, and none at all for arbitrary spans. Adopting the AsciiDoc
syntax for inline items would get us powerful styling for links,
images, and arbitrary spans.

Perhaps it could be developed as an add-on, and an OPTION could specify
that the text markup was AsciiDoc. Exporters could also push the
document to the AsciiDoc tool chain, which would give us another path
to DocBook, HTML, PDF, and the rest.

I'm afraid, though, that the new parser will enforce its own standards,
making this kind of interchangeability impossible.

Now I've gotten pretty far off-topic. Enough for now.

Best regards,
Terry

--
T.F. Torrey