Bug: New HTML exporter incorrect attributes

emacs-orgmode@gnu.org archives

* Bug: New HTML exporter incorrect attributes
@ 2013-02-22 22:23 T.F. Torrey
  2013-02-22 23:18 ` Nicolas Goaziou
  0 siblings, 1 reply; 11+ messages in thread
From: T.F. Torrey @ 2013-02-22 22:23 UTC (permalink / raw)
  To: emacs-orgmode

Hello,

Where attributes have been assigned to an image in a paragraph, the new
exporter applies those attributes to both the image and a following
link.

For example, this:

#+BEGIN_SRC org
#+ATTR_HTML: width="10" alt=" [Cool thing] "
[[file:cool_thing.jpg]]
Cool thing found here [[http://example.com/][example.com]].
#+END_SRC

is exported to this:

#+BEGIN_HTML
<p>
<img src="cool_thing.jpg" width="10" alt=" [Cool thing] "/>
Cool thing found here <a href="http://example.com/" width="10"
 alt=" [Cool thing] ">example.com</a>.
</p>
#+END_HTML

Emacs  : GNU Emacs 24.3.50.1 (i686-pc-linux-gnu, GTK+ Version 3.6.0)
 of 2012-12-24 on menkib, modified by Debian

Package: Org-mode version 7.9.3e (7.9.3e-1173-g14df16 @
/home/tftorrey/.emacs.d/elisp/org/lisp/)

Best regards,
Terry
-- 
T.F. Torrey


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-22 22:23 Bug: New HTML exporter incorrect attributes T.F. Torrey
@ 2013-02-22 23:18 ` Nicolas Goaziou
  2013-02-23  9:43   ` T.F. Torrey
  0 siblings, 1 reply; 11+ messages in thread
From: Nicolas Goaziou @ 2013-02-22 23:18 UTC (permalink / raw)
  To: T.F. Torrey; +Cc: emacs-orgmode

Hello,

tftorrey@tftorrey.com (T.F. Torrey) writes:

> Where attributes have been assigned to an image in a paragraph, the new
> exporter applies those attributes to both the image and a following
> link.

You don't assign attributes to an image in a paragraph, you assign
attributes to the paragraph itself. For the time being, Org syntax
doesn't allow to specify attributes per link object.

As a consequence, attributes will be assigned to every link within the
paragraph. A hack could be implemented in ox-html.el so only image links
get these attributes, but it would be the same with multiple images
within the same paragraph.

A proper solution to the problem would be to slightly change link
syntax. Until then, you'll have to use workarounds (like, for example,
writing the other link in raw HTML syntax within an export snippet).

Regards,

-- 
Nicolas Goaziou


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-22 23:18 ` Nicolas Goaziou
@ 2013-02-23  9:43   ` T.F. Torrey
  2013-02-23 10:16     ` Nicolas Goaziou
  0 siblings, 1 reply; 11+ messages in thread
From: T.F. Torrey @ 2013-02-23  9:43 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

Hello,

First, as always, thanks for the prompt reply.

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> Hello,
>
> tftorrey@tftorrey.com (T.F. Torrey) writes:
>
>> Where attributes have been assigned to an image in a paragraph, the new
>> exporter applies those attributes to both the image and a following
>> link.
>
> You don't assign attributes to an image in a paragraph, you assign
> attributes to the paragraph itself.

It would be nice if there actually was a way to assign an attribute to a
paragraph, so that the ATTR_HTML: class="XXX" syntax would export as <p
class="XXX">, but that is a different issue.

> For the time being, Org syntax
> doesn't allow to specify attributes per link object.

I think what you are saying is that the current intended behavior is for
whatever is specified by ATTR_HTML to apply to every image or link in
the paragraph.

> As a consequence, attributes will be assigned to every link within the
> paragraph.

Is this behavior helpful to anyone in any practical circumstances?

Moreover, this means that, not only does the new exporter fail where the
old one succeeded, the new one produces invalid HTML (anchors with
invalid attributes) in the use case I described (ATTR_HTML to apply to
an image beginning a paragraph which later has a link in it, which
happens several times in almost all my documents).

It seems to me that, whether the user is happy with the output or not,
the HTML exporter ought to produce valid HTML.

> A hack could be implemented in ox-html.el so only image links
> get these attributes, but it would be the same with multiple images
> within the same paragraph.

Again, I can't think of a practical situation where this would be
helpful.  If all the images and/or links had the same styling, simple
CSS would suffice, and there would be no need for the ATTR_HTML.

In my case, however, this would actually work.

I know that it is possible to style links using ATTR_HTML, but does
anyone actually do that in practice?  I don't think I ever have.  If no
one uses it, would it be missed?

> A proper solution to the problem would be to slightly change link
> syntax.

The link syntax change will be a welcome addition, though I understand
that it is not a high priority.

> Until then, you'll have to use workarounds (like, for example, writing
> the other link in raw HTML syntax within an export snippet).

Yes, a personal workaround would be to use the raw HTML syntax to mark
the image in my example.  This has the strong disadvantage, however, of
meaning the image doesn't appear at all when the document is exported to
other formats, and of requiring changes to all affected documents when
the syntax changes again.

A more general workaround that would help everyone affected would be to
temporarily modify ox-html.el so that attributes from ATTR_HTML only
apply to the *first* item in the paragraph.  This would have the
advantage of mimicking the behavior of the old exporter (thus not
breaking existing content) and of keeping images for other export
formats.  Of course, anyone relying on the ATTR_HTML to set attributes
for every image and/or link in a paragraph would have to adopt a
different workaround, but ... does anyone really do this?

In my case, rather than changing all my documents to use raw HTML for
the images, I will write a filter function that walks through the final
HTML and removes invalid and superfluous attributes from the anchor
tags.  This strikes me as a rather ugly hack, though.
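
A minimal sketch of that kind of post-processing filter follows, written
in Python purely for illustration rather than as an Org export filter in
Emacs Lisp.  The function name strip_anchor_attrs and the set of allowed
attributes are assumptions, and the regular expressions only handle
double-quoted attribute values like those in the exported example above.

```python
import re

# Attributes assumed acceptable on <a>; illustrative, not an exhaustive list.
ALLOWED_ANCHOR_ATTRS = {
    "href", "class", "id", "title", "rel",
    "target", "type", "style", "accesskey",
}

# One name="value" pair inside a tag (double-quoted values only).
ATTR_RE = re.compile(r'\s+([A-Za-z_:][-\w:.]*)\s*=\s*"([^"]*)"')

# An opening <a ...> tag; group 1 captures the raw attribute text.
ANCHOR_RE = re.compile(r'<a\b((?:\s+[A-Za-z_:][-\w:.]*\s*=\s*"[^"]*")*)\s*>')


def strip_anchor_attrs(html):
    """Drop attributes that are not allowed on <a> tags; leave other tags alone."""
    def clean(match):
        kept = [
            f' {name}="{value}"'
            for name, value in ATTR_RE.findall(match.group(1))
            if name.lower() in ALLOWED_ANCHOR_ATTRS
        ]
        return "<a" + "".join(kept) + ">"

    return ANCHOR_RE.sub(clean, html)


sample = '<a href="http://example.com/" width="10" alt=" [Cool thing] ">example.com</a>'
print(strip_anchor_attrs(sample))
# prints: <a href="http://example.com/">example.com</a>
```

The <img> tag is untouched because the pattern only matches opening <a>
tags, so the image keeps its width and alt attributes.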

It seems unlikely to me that this issue only comes up with the HTML
exporter.  Surely some documents with primary output formats of LaTeX or
OpenDocument have similar requirements.  I wonder how those export
backends handle situations like this.

Thanks again for your help and hard work.

Best regards,
Terry
-- 
T.F. Torrey


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-23  9:43   ` T.F. Torrey
@ 2013-02-23 10:16     ` Nicolas Goaziou
  2013-02-24 20:27       ` Samuel Wales
  2013-02-25 10:55       ` T.F. Torrey
  0 siblings, 2 replies; 11+ messages in thread
From: Nicolas Goaziou @ 2013-02-23 10:16 UTC (permalink / raw)
  To: T.F. Torrey; +Cc: emacs-orgmode

tftorrey@tftorrey.com (T.F. Torrey) writes:

>> You don't assign attributes to an image in a paragraph, you assign
>> attributes to the paragraph itself.
>
> It would be nice if there actually was a way to assign an attribute to a
> paragraph, so that the ATTR_HTML: class="XXX" syntax would export as <p
> class="XXX">, but that is a different issue.

It would be ATTR_HTML: :class "XXX". I try to unify syntax for
attributes with syntax for Babel and AFAICT, `html' is the last back-end
to have key="value" syntax.

>> For the time being, Org syntax
>> doesn't allow to specify attributes per link object.
>
> I think what you are saying is that the current intended behavior is for
> whatever is specified by ATTR_HTML to apply to every image or link in
> the paragraph.

No. I am saying that ATTR_HTML behaviour is _undefined_ when a paragraph
contains more than one link, as it has always been.

If you carefully look at Org manual (in application with previous
exporter framework), in "Images in HTML export", you will notice that
HTML attributes only apply to a single link pointing to an image, not to
a paragraph containing many links.

>> As a consequence, attributes will be assigned to every link within the
>> paragraph.
>
> Is this behavior helpful to anyone in any practical circumstances?

I never said it was. It's not even a feature. I'm just explaining what
is happening.

> Moreover, this means that, not only does the new exporter fail where the
> old one succeeded,

I worked hard to make the new export framework compatible with defined
behaviour of previous exporter, not with handy undocumented side-effects
it may have.

> It seems to me that, whether the user is happy with the output or not,
> the HTML exporter ought to produce valid HTML.

I agree. But, in this case, you're using undefined Org syntax (which,
admittedly, used to "work" for you).

If there's a simple patch that mimics this for html back-end, I don't
mind applying it. But it still won't make up for a real solution.

Unless, that is, it is decided that this behaviour is an official
feature supported by Org, in which case, it should be added to the
manual.

> A more general workaround that would help everyone affected would be to
> temporarily modify ox-html.el so that attributes from ATTR_HTML only
> apply to the *first* item in the paragraph.  This would have the
> advantage of mimicking the behavior of the old exporter (thus not
> breaking existing content) and of keeping images for other export
> formats.  Of course, anyone relying on the ATTR_HTML to set attributes
> for every image and/or link in a paragraph would have to adopt a
> different workaround, but ... does anyone really do this?

It would solve your problem. But what if someone starts a paragraph with
a regular link and thereafter adds an image? ATTR_HTML attributes would
never reach it.

Again, there's no proper solution besides modifying link syntax.

Regards,

-- 
Nicolas Goaziou


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-23 10:16     ` Nicolas Goaziou
@ 2013-02-24 20:27       ` Samuel Wales
  2013-02-25  8:23         ` Nicolas Goaziou
  2013-02-25 10:55       ` T.F. Torrey
  1 sibling, 1 reply; 11+ messages in thread
From: Samuel Wales @ 2013-02-24 20:27 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

For flexibility and future proofing, it might be worth considering
universal syntax (e.g. $[link "..." :attr1 ... :attr2 ... ...]) for
fancy links instead of changing link syntax.

I've called it extensible syntax too.

Samuel

-- 
The Kafka Pandemic: http://thekafkapandemic.blogspot.com

The disease DOES progress.  MANY people have died from it.  ANYBODY
can get it.  There is no hope without action.


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-24 20:27       ` Samuel Wales
@ 2013-02-25  8:23         ` Nicolas Goaziou
  0 siblings, 0 replies; 11+ messages in thread
From: Nicolas Goaziou @ 2013-02-25  8:23 UTC (permalink / raw)
  To: Samuel Wales; +Cc: emacs-orgmode

Samuel Wales <samologist@gmail.com> writes:

> For flexibility and future proofing, it might be worth considering
> universal syntax (e.g. $[link "..." :attr1 ... :attr2 ... ...]) for
> fancy links instead of changing link syntax.
>
> I've called it extensible syntax too.

There are already four completely different ways to write a link.
I don't think adding a fifth would do any good.

I suggest to either slightly change regular link syntax (e.g.
[[link][desc][options]]) or replace it completely (the one you suggest
is not bad).

Regards,

-- 
Nicolas Goaziou


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-23 10:16     ` Nicolas Goaziou
  2013-02-24 20:27       ` Samuel Wales
@ 2013-02-25 10:55       ` T.F. Torrey
  2013-02-25 13:49         ` Nicolas Goaziou
  1 sibling, 1 reply; 11+ messages in thread
From: T.F. Torrey @ 2013-02-25 10:55 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

Hello Nicolas,

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> tftorrey@tftorrey.com (T.F. Torrey) writes:
>
>>> You don't assign attributes to an image in a paragraph, you assign
>>> attributes to the paragraph itself.
>>
>> It would be nice if there actually was a way to assign an attribute to a
>> paragraph, so that the ATTR_HTML: class="XXX" syntax would export as <p
>> class="XXX">, but that is a different issue.
>
> It would be ATTR_HTML: :class "XXX". I try to unify syntax for
> attributes with syntax for Babel and AFAICT, `html' is the last back-end
> to have key="value" syntax.

I see that this does not presently work, and the author listed on
ox-html.el is not currently active on this list.  I hope you are not the
only one working on this.  It would be our great misfortune for you to
become burned out.

>>> For the time being, Org syntax
>>> doesn't allow to specify attributes per link object.
>>
>> I think what you are saying is that the current intended behavior is for
>> whatever is specified by ATTR_HTML to apply to every image or link in
>> the paragraph.
>
> No. I am saying that ATTR_HTML behaviour is _undefined_ when a paragraph
> contains more than one link, as it has always been.
>
> If you carefully look at Org manual (in application with previous
> exporter framework), in "Images in HTML export", you will notice that
> HTML attributes only apply to a single link pointing to an image, not to
> a paragraph containing many links.

I see no such limitation in the Org manual (12.5.6).  It says this:

    If you need to add attributes to an inlined image, use a
    `#+ATTR_HTML'.

Though the example that follows doesn't show a paragraph, calling them
"inline" indicates they will be within a paragraph.  Org manual section
12.5.4 also shows ATTR_HTML applying to a hyperlink by itself, but
hyperlinks would rarely be used that way in real life, and in fact the
old exporter always applied ATTR_HTML attributes to the next item in a
paragraph.

I have always understood the manual to mean that an ATTR_HTML would
apply to *the next thing* in the document that it could, and that was
what happened in practice.  That was a useful thing for them to do.

>>> As a consequence, attributes will be assigned to every link within the
>>> paragraph.
>>
>> Is this behavior helpful to anyone in any practical circumstances?
>
> I never said it was. It's not even a feature. I'm just explaining what
> is happening.

If it isn't intended behavior, and it isn't helpful, then we should make
it stop doing that.

>> Moreover, this means that, not only does the new exporter fail where the
>> old one succeeded,
>
> I worked hard to make the new export framework compatible with defined
> behaviour of previous exporter, not with handy undocumented side-effects
> it may have.
>
>> It seems to me that, whether the user is happy with the output or not,
>> the HTML exporter ought to produce valid HTML.
>
> I agree. But, in this case, you're using undefined Org syntax (which,
> admittedly, used to "work" for you).

The HTML exporter should produce valid HTML regardless of the input.

> If there's a simple patch that mimics this for html back-end, I don't
> mind applying it. But it still won't make up for a real solution.
>
> Unless, that is, it is decided that this behaviour is an official
> feature supported by Org, in which case, it should be added to the
> manual.

The Org manual describes ATTR_HTML as a feature that applies to the
following image or link.  It makes no mention of restrictions to
following content in the paragraph, and neither does it say it will
apply to all following images or links.  The manual could be amended to
say that ATTR_HTML applies to just the next image or link.  To fit the
current situation, it might say, "In cases where ATTR_HTML is applied to
an image in a paragraph, following links will not be made invalid."  But
why would anyone be expecting invalid HTML in the first place?

Incidentally, I always thought that simply using another ATTR_HTML would
handle multiple images or links in the old exporter.  In other words,
this:

#+ATTR_HTML: width="10" alt=" [Cool thing] "
[[file:cool_thing.jpg]]
This is a paragraph about cool things.
#+ATTR_HTML: class="bar"
Cool thing found here [[http://example.com/][example.com]].

Would become this:

<p>
<img src="cool_thing.jpg" width="10" alt=" [Cool thing] "/>This is a 
paragraph about cool things. Cool thing found here <a
href="http://example.com/" class="bar">example.com</a>.
</p>

I don't remember using that in the old exporter, but I thought it would
work.

It almost works in the new exporter, but it begins a new paragraph
before the second #+ATTR_HTML.  I'm not sure this is the intended
behavior, though, because it isn't formatted like other new paragraphs.

>> A more general workaround that would help everyone affected would be to
>> temporarily modify ox-html.el so that attributes from ATTR_HTML only
>> apply to the *first* item in the paragraph.  This would have the
>> advantage of mimicking the behavior of the old exporter (thus not
>> breaking existing content) and of keeping images for other export
>> formats.  Of course, anyone relying on the ATTR_HTML to set attributes
>> for every image and/or link in a paragraph would have to adopt a
>> different workaround, but ... does anyone really do this?
>
> It would solve your problem. But what if someone starts a paragraph with
> a regular link and thereafter adds an image? ATTR_HTML attributes would
> never reach it.

That's true, but I think you are making the perfect the enemy of the
good.

> Again, there's no proper solution besides modifying link syntax.

I agree that modifying the link syntax to support inline attributes is
the best solution.  (IMHO, the syntax AsciiDoc uses is good, and would
be a good fit here.)

Alternatively, having ATTR_HTML (or something more general) apply to the
next thing, and having that work within paragraphs, is another
possibility.  However, this may not fit within the limitations of the
new parser.  Plus it's kind of ugly.

Until there is a "proper" solution, however, could we please modify the
exporter to apply ATTR_HTML to only the next image or link?  I am very
sure that was the spirit of the old exporter, and it would be nice if I
could maintain my documents in Org without resorting to (even more)
hacks.

Best,
Terry
-- 
T.F. Torrey


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-25 10:55       ` T.F. Torrey
@ 2013-02-25 13:49         ` Nicolas Goaziou
  2013-02-25 13:57           ` Vincent Beffara
  2013-02-25 20:51           ` T.F. Torrey
  0 siblings, 2 replies; 11+ messages in thread
From: Nicolas Goaziou @ 2013-02-25 13:49 UTC (permalink / raw)
  To: T.F. Torrey; +Cc: emacs-orgmode

tftorrey@tftorrey.com (T.F. Torrey) writes:

>> It would be ATTR_HTML: :class "XXX". I try to unify syntax for
>> attributes with syntax for Babel and AFAICT, `html' is the last back-end
>> to have key="value" syntax.
>
> I see that this does not presently work, and the author listed on
> ox-html.el is not currently active on this list.  I hope you are not the
> only one working on this.  It would be our great misfortune for you to
> become burned out.

It's not much work once we agree about the real syntax. For example, for
links, there are two ways to replace:

  #+ATTR_HTML: width="400px"

The easiest one, is simply to ask for `:options' before:

  #+ATTR_HTML: :options "width=\"400px\""

This is heavier but will be consistent with other back-ends.  Otherwise,
there is also:

  #+ATTR_HTML: :width "400px"

But this requires to have a list of all properties supported. If we take
that route, here is a suggested list of such properties for <a> tag:

  - rel
  - target
  - type
  - accesskey
  - class
  - style
  - title

and for <img>

  - alt
  - height
  - width

What do you think about it?
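
To make the second option concrete, here is a rough sketch of reading
`:key "value"' pairs and keeping only the properties listed above for a
given tag.  It is written in Python for illustration only and does not
reflect how ox-html.el is implemented; the names SUPPORTED,
parse_attributes and supported_for are hypothetical.

```python
import shlex

# Per-tag property lists as suggested above (an assumption, not a final list).
SUPPORTED = {
    "a": {"rel", "target", "type", "accesskey", "class", "style", "title"},
    "img": {"alt", "height", "width"},
}


def parse_attributes(text):
    """Turn ':width "400px" :alt "Cool thing"' into a dict of key -> value."""
    attrs = {}
    key = None
    for token in shlex.split(text):
        if token.startswith(":"):
            # A new key starts; its value is filled in by the following tokens.
            key = token[1:]
            attrs[key] = ""
        elif key is not None:
            attrs[key] = (attrs[key] + " " + token).strip()
    return attrs


def supported_for(tag, attrs):
    """Keep only the attributes listed as supported for this tag."""
    allowed = SUPPORTED.get(tag, set())
    return {name: value for name, value in attrs.items() if name in allowed}


attrs = parse_attributes(':width "400px" :alt "Cool thing" :class "bar"')
print(supported_for("img", attrs))  # {'width': '400px', 'alt': 'Cool thing'}
print(supported_for("a", attrs))    # {'class': 'bar'}
```

With a table like this, one attribute line can feed both an image and a
link without giving the link a width or alt attribute.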

> The HTML exporter should produce valid HTML regardless of the input.

We cannot remove the ability to shoot oneself in the foot. The HTML
back-end cannot be responsible for undefined syntax. Think about:

  @@html:<foo>@@

> The Org manual describes ATTR_HTML as a feature that applies to the
> following image or link.  It makes no mention of restrictions to
> following content in the paragraph, and neither does it say it will
> apply to all following images or links.  The manual could be amended to
> say that ATTR_HTML applies to just the next image or link.  To fit the
> current situation, it might say, "In cases where ATTR_HTML is applied to
> an image in a paragraph, following links will not be made invalid."  But
> why would anyone be expecting invalid HTML in the first place?
>
> Incidentally, I always thought that simply using another ATTR_HTML would
> handle multiple images or links in the old exporter.  In other words,
> this:
>
> #+ATTR_HTML: width="10" alt=" [Cool thing] "
> [[file:cool_thing.jpg]]
> This is a paragraph about cool things.
> #+ATTR_HTML: class="bar"
> Cool thing found here [[http://example.com/][example.com]].
>
> Would become this:
>
> <p>
> <img src="cool_thing.jpg" width="10" alt=" [Cool thing] "/>This is a 
> paragraph about cool things. Cool thing found here <a
> href="http://example.com/" class="bar">example.com</a>.
> </p>
>
> I don't remember using that in the old exporter, but I thought it would
> work.
>
> It almost works in the new exporter, but it begins a new paragraph
> before the second #+ATTR_HTML.  I'm not sure this is the intended
> behavior, though, because it isn't formatted like other new
> paragraphs.

This is the intended behaviour. Affiliated keywords can only exist at
the beginning of the element they refer to. So, in the previous example,
you start two paragraphs.

> Alternatively, having ATTR_HTML (or something more general) apply to the
> next thing, and having that work within paragraphs, is another
> possibility.  However, this may not fit within the limitations of the
> new parser.  Plus it's kind of ugly.

The parser won't support it. It goes against the definition of an
affiliated keyword. Moreover, it's merely a hack (what about links in
tables?). And it's ugly, indeed.

> Until there is a "proper" solution, however, could we please modify the
> exporter to apply ATTR_HTML to only the next image or link?  I am very
> sure that was the spirit of the old exporter, and it would be nice if I
> could maintain my documents in Org without resorting to (even more)
> hacks.

Done.


Regards,

-- 
Nicolas Goaziou


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-25 13:49         ` Nicolas Goaziou
@ 2013-02-25 13:57           ` Vincent Beffara
  2013-02-25 14:03             ` Nicolas Goaziou
  2013-02-25 20:51           ` T.F. Torrey
  1 sibling, 1 reply; 11+ messages in thread
From: Vincent Beffara @ 2013-02-25 13:57 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

> #+ATTR_HTML: :options "width=\"400px\""
> 
> This is heavier but will be consistent with other back-ends. Otherwise,
> there is also:
> 
> #+ATTR_HTML: :width "400px"
> 
> But this requires to have a list of all properties supported.

How about both? I.e. a short-list of common options (class, title, id
for links typically) plus a generic "options" as a backup to put
whatever is not in the short-list?

/v, big fan of the ugliest solutions imaginable
> If we take
> that route, here is a suggested list of such properties for <a> tag:
> 
> - rel
> - target
> - type
> - accesskey
> - class
> - style
> - title
> 
> and for <img>
> 
> - alt
> - height
> - width
> 
> What do you think about it?
> 
> > The HTML exporter should produce valid HTML regardless of the input.
> 
> We cannot remove the ability to shoot oneself in the foot. The HTML
> back-end cannot be responsible for undefined syntax. Think about:
> 
> @@html:<foo>@@
> 
> > The Org manual describes ATTR_HTML as a feature that applies to the
> > following image or link. It makes no mention of restrictions to
> > following content in the paragraph, and neither does it say it will
> > apply to all following images or links. The manual could be amended to
> > say that ATTR_HTML applies to just the next image or link. To fit the
> > current situation, it might say, "In cases where ATTR_HTML is applied to
> > an image in a paragraph, following links will not be made invalid." But
> > why would anyone be expecting invalid HTML in the first place?
> > 
> > Incidentally, I always thought that simply using another ATTR_HTML would
> > handle multiple images or links in the old exporter. In other words,
> > this:
> > 
> > #+ATTR_HTML: width="10" alt=" [Cool thing] "
> > [[file:cool_thing.jpg]]
> > This is a paragraph about cool things.
> > #+ATTR_HTML: class="bar"
> > Cool thing found here [[http://example.com/][example.com]].
> > 
> > Would become this:
> > 
> > <p>
> > <img src="cool_thing.jpg" width="10" alt=" [Cool thing] "/>This is a 
> > paragraph about cool things. Cool thing found here <a
> > href="http://example.com/" class="bar">example.com</a>.
> > </p>
> > 
> > I don't remember using that in the old exporter, but I thought it would
> > work.
> > 
> > It almost works in the new exporter, but it begins a new paragraph
> > before the second #+ATTR_HTML. I'm not sure this is the intended
> > behavior, though, because it isn't formatted like other new
> > paragraphs.
> 
> 
> 
> This is the intended behaviour. Affiliated keywords can only exist at
> the beginning of the element they refer to. So, in the previous example,
> you start two paragraphs.
> 
> > Alternatively, having ATTR_HTML (or something more general) apply to the
> > next thing, and having that work within paragraphs, is another
> > possibility. However, this may not fit within the limitations of the
> > new parser. Plus it's kind of ugly.
> 
> 
> 
> The parser won't support it. It goes against the definition of an
> affiliated keyword. Moreover, it's merely a hack (what about links in
> tables?). And it's ugly, indeed.
> 
> > Until there is a "proper" solution, however, could we please modify the
> > exporter to apply ATTR_HTML to only the next image or link? I am very
> > sure that was the spirit of the old exporter, and it would be nice if I
> > could maintain my documents in Org without resorting to (even more)
> > hacks.
> 
> 
> 
> Done.
> 
> 
> Regards,
> 
> -- 
> Nicolas Goaziou


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-25 13:57           ` Vincent Beffara
@ 2013-02-25 14:03             ` Nicolas Goaziou
  0 siblings, 0 replies; 11+ messages in thread
From: Nicolas Goaziou @ 2013-02-25 14:03 UTC (permalink / raw)
  To: Vincent Beffara; +Cc: emacs-orgmode

Hello,

Vincent Beffara <vbeffara@ens-lyon.fr> writes:

>> #+ATTR_HTML: :options "width=\"400px\""
>> 
>> This is heavier but will be consistent with other back-ends. Otherwise,
>> there is also:
>> 
>> #+ATTR_HTML: :width "400px"
>> 
>> But this requires to have a list of all properties supported.
>
> How about both? I.e. a short-list of common options (class, title, id for links typically) plus a generic "options" as a back up to put whatever is not in the short-list ?

A generic :options keyword is a good idea, indeed.

But we still have to agree on the common options part. For example,
I think :id is dangerous, because Org already provides its own way to
generate these properties (e.g. through #+NAME: keywords).

If we make a list of such options, per tag type, I can try to implement
it. Anyone wants to start?
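
As a starting point for discussion, the combination could look roughly
like the sketch below (Python, for illustration only; it is not ox-html.el
code).  The short-list in COMMON is a placeholder that deliberately leaves
out id for the reason given above, and render_attributes is a hypothetical
name.

```python
from html import escape

# Placeholder short-list of common options; "id" is left out on purpose.
COMMON = {"class", "title", "style"}


def render_attributes(attrs):
    """Render short-listed keys as name="value", then append raw :options text."""
    parts = [
        f'{name}="{escape(value, quote=True)}"'
        for name, value in attrs.items()
        if name in COMMON
    ]
    raw = attrs.get("options", "")
    if raw:
        # The generic fallback is passed through verbatim, unchecked.
        parts.append(raw)
    return " ".join(parts)


print(render_attributes({"class": "red", "options": 'width="400px"'}))
# prints: class="red" width="400px"
```

Anything outside the short-list and not given through options is silently
dropped in this sketch; whether that or a warning is preferable is an open
question.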

Regards,

-- 
Nicolas Goaziou


* Re: Bug: New HTML exporter incorrect attributes
  2013-02-25 13:49         ` Nicolas Goaziou
  2013-02-25 13:57           ` Vincent Beffara
@ 2013-02-25 20:51           ` T.F. Torrey
  1 sibling, 0 replies; 11+ messages in thread
From: T.F. Torrey @ 2013-02-25 20:51 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

Hello Nicolas,

Thanks for your prompt reply, though I think our discussion is a little
off-track here, as noted below.

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> tftorrey@tftorrey.com (T.F. Torrey) writes:
>
>>> It would be ATTR_HTML: :class "XXX". I try to unify syntax for
>>> attributes with syntax for Babel and AFAICT, `html' is the last back-end
>>> to have key="value" syntax.

This was your response to my comment that it would be handy to apply
attributes to paragraphs, not the links or images within them.

>> I see that this does not presently work, and the author listed on
>> ox-html.el is not currently active on this list.  I hope you are not the
>> only one working on this.  It would be our great misfortune for you to
>> become burned out.
>
> It's not much work once we agree about the real syntax. For example, for
> links, there are two ways to replace:

I think you are now talking about a syntax for adding attributes to
links and images within paragraphs.  In the e-mail to which you are
replying, I thought we were agreeing that modifying the inline link and
image syntax was the best solution to these, and had thoughts on what
that might look like in another thread.  What you describe here still
has the limitation that either all links and images in a paragraph must
have the same attributes, or some must be invalid, or only the first is
reachable.

Apart from that, I have other more general concerns with this approach
noted below.

>   #+ATTR_HTML: width="400px"
>
> The easiest one, is simply to ask for `:options' before:
>
>   #+ATTR_HTML: :options "width=\"400px\""
>
> This is heavier but will be consistent with other back-ends.

Yes, this is heavy.  Escaping the quotes is unwieldy, and raises doubts
about what else might need to be escaped.  Also, given that the whole
and only point of the ATTR_HTML keyword is for describing options,
adding ":options" is redundant.  From a design standpoint, it might be
elegant that it matches other things, but here it seems very awkward,
and I don't understand who it would benefit.

This seems another step away from "plain text" toward "another
programming language".  The first syntax looks like plain text.  The
second looks like programming.  For babel, and actual programming
languages, I'm sure this makes good sense.  For applying a width to an
image, it's overkill.

For instance, compare this "plain text":

#+ATTR_HTML: :options "width=\"400px\" title=\"My image\""
[[file:image.jpg]]

to the HTML alternative:

<img width="400px" title="My image" src="image.jpg"/>

Shouldn't the plain text be more straightforward to enter and easier to
understand than what it is replacing?

Here is the same thing in AsciiDoc:

image:image.jpg["My image",width=400]

Granted, I've already noted that I think we are moving toward inline
attribute specifications, so this syntax for images and links is
probably moot anyway, but I think the larger point still stands.

I hate to say that I have other concerns as well, but I do, below.

> Otherwise, there is also:
>
>   #+ATTR_HTML: :width "400px"
>
> But this requires to have a list of all properties supported. If we take
> that route, here is a suggested list of such properties for <a> tag:
>
>   - rel
>   - target
>   - type
>   - accesskey
>   - class
>   - style
>   - title
>
> and for <img>
>
>   - alt
>   - height
>   - width
>
> What do you think about it?

I think it is rather heavy-handed.  I don't understand why this
"requires" a list of properties supported.  The old exporter would
simply plug whatever you told it into the tag, trusting that you knew
what you were doing.  I'm sure this simplified the code, and it gave
great power to the user.  Why should the user need "permission" from the
developers to apply any arbitrary attributes to their elements?
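
A rough sketch of the two approaches, in Python rather than the
exporter's actual elisp.  The function names are made up for
illustration, and the allow-list is just the <img> list quoted above;
the pass-through version does no escaping or validation at all.

```python
# Hypothetical sketch; not the Org exporter's real code.

def img_tag_passthrough(src, attrs):
    """Emit every attribute the user supplied, trusting the user."""
    parts = [f'src="{src}"'] + [f'{name}="{value}"' for name, value in attrs.items()]
    return "<img " + " ".join(parts) + "/>"

# The <img> properties proposed in the quoted message.
ALLOWED_IMG_ATTRS = {"alt", "height", "width"}

def img_tag_allowlist(src, attrs):
    """Emit only attributes on the predefined list; silently drop the rest."""
    kept = {name: value for name, value in attrs.items() if name in ALLOWED_IMG_ATTRS}
    return img_tag_passthrough(src, kept)

attrs = {"width": "400px", "title": "My image"}
print(img_tag_passthrough("image.jpg", attrs))
print(img_tag_allowlist("image.jpg", attrs))
```

With the allow-list, the title attribute from the earlier example is
dropped, because it is not among the three proposed <img> properties.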

Imposing these restrictions on users seems to make more work for the
users, and more work for the developers, to the benefit of no one.

Also, I don't know why attributes should be defined for each backend
rather than once for everywhere.  The attributes would be designated for
an object, and each backend would decide which to use or ignore.

For instance, though I know the LaTeX syntax is not correct, this seems
like massive overkill for making a link red:

#+ATTR_LATEX: :options "text-color: red"
#+ATTR_HTML: :options "class=\"red\""
Here is a [[file:doc.html][red link]].

FWIW, the same thing in AsciiDoc would be this:

Here is a [red]#link:doc.html[red link]#.

And it would work correctly for every backend, current or future.

In AsciiDoc, the attributes belong to the item, and every backend is
free to use or ignore them.  Plain text sure looks appealing.
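
As a minimal sketch of that idea (Python, purely illustrative): one set
of attributes travels with the link, and each backend picks out what it
understands.  The "role" key, the function names, and the mapping of a
role to an HTML class or a LaTeX \textcolor are all assumptions made for
the example, not anything AsciiDoc or Org actually defines this way.

```python
# Hypothetical sketch: attributes belong to the item, backends choose.

def render_html(text, target, attrs):
    # HTML backend: turns a "role" into a class, ignores everything else.
    cls = f' class="{attrs["role"]}"' if "role" in attrs else ""
    return f'<a href="{target}"{cls}>{text}</a>'

def render_latex(text, target, attrs):
    # LaTeX backend: turns a "role" into a colour (assumes hyperref and xcolor).
    link = rf"\href{{{target}}}{{{text}}}"
    if "role" in attrs:
        link = rf"\textcolor{{{attrs['role']}}}{{{link}}}"
    return link

def render_plain(text, target, attrs):
    # Plain-text backend: has no use for the attributes, so it ignores them.
    return f"{text} <{target}>"

attrs = {"role": "red", "accesskey": "d"}
for render in (render_html, render_latex, render_plain):
    print(render("red link", "doc.html", attrs))
```

The source carries one attribute list; unknown keys such as accesskey
are simply skipped by backends that have no use for them.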

Again, this is applying the old ATTR_ syntax instead of the suggested
inline attribute designations, but if the new link syntax matches the
spirit of the existing structure, something like this is in the works:

Here is a [[file:doc.html][red link][@@html: "class=\"red\""@@ @@latex:
"text-color: red"@@]].

IMHO, the AsciiDoc approach is much better.

>> The HTML exporter should produce valid HTML regardless of the input.
>
> We cannot remove the ability to shoot oneself in the foot. The HTML
> back-end cannot be responsible for undefined syntax. Think about:
>
>   @@html:<foo>@@

I'm pretty sure you understand what I meant.  The user should be free to
ruin things however he likes, but whatever the exporter produces itself
should be valid.
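
For attribute values, that could be as small as escaping whatever the
user supplied before the exporter writes it into a tag.  A sketch in
Python (the attr_string helper is hypothetical; html.escape is the
standard library function):

```python
from html import escape

def attr_string(attrs):
    """Render attributes so that quotes, <, > and & in values cannot break the tag."""
    return " ".join(
        f'{name}="{escape(str(value), quote=True)}"' for name, value in attrs.items()
    )

print(attr_string({"alt": ' [Cool "thing"] ', "width": 10}))
```

Raw HTML passed through @@html:...@@ would remain the user's own
responsibility under this scheme.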

[... 48 lines omitted ...]

>> Until there is a "proper" solution, however, could we please modify the
>> exporter to apply ATTR_HTML to only the next image or link?  I am very
>> sure that was the spirit of the old exporter, and it would be nice if I
>> could maintain my documents in Org without resorting to (even more)
>> hacks.
>
> Done.

Thank you very much.
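
For the record, the behaviour asked for can be sketched as below.  This
is Python with made-up names, not the actual change to the exporter, and
it treats an inline image as just another link for simplicity.

```python
# Hypothetical sketch of "apply ATTR_HTML to only the next image or link".

def attach_to_first_link(objects, attrs):
    """Pair each (kind, value) object with attributes; only the first link gets any."""
    result = []
    used = False
    for kind, value in objects:
        if kind == "link" and not used:
            result.append((kind, value, attrs))
            used = True
        else:
            result.append((kind, value, {}))
    return result

paragraph = [
    ("link", "file:cool_thing.jpg"),
    ("text", "Cool thing found here "),
    ("link", "http://example.com/"),
]
for item in attach_to_first_link(paragraph, {"width": "10", "alt": " [Cool thing] "}):
    print(item)
```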

> Regards,

Some closing thoughts:

I am concerned that the new parser is unnecessarily heavy-handed in
general.  For instance, why should document #+OPTIONS be restricted to a
pre-defined set at all?  It sure would be easier to hack if any
arbitrary property could be set there and accessed with
(org-get-property "whatever").  The same goes for node-level
EXPORT_OPTIONS.  I know that these have always been hard-coded in
advance, but I was hoping the new parser would be a step toward freedom,
not the other way.  Surely the code would be simpler accepting any
OPTION rather than checking each value to see if it is approved and
rejecting those that are not.

I have been a dedicated (but mostly silent) Org user for years.  I have
literally all of my life and livelihood documents in Org format.  But
right now I have seven or more tabs in Firefox open to AsciiDoc
documentation.  The syntax looks lightweight and easy and fun, like Org
used to.

I've been wondering for the past two days what it would take to use Org
syntax for headlines, structure, Org tables, dynamic blocks, and Babel
code, but to use AsciiDoc standards for the markup of the actual text.
One big benefit of Org is that it is entirely elisp.  I'm not a very
good elisp programmer, and I don't know enough LaTeX, but I think even I
could make a parser that would turn AsciiDoc text into HTML, plain text,
and even DocBook.

I wonder if there would be support for that.  Our current system has
support for basic inline styling (italic, bold, etc.), primitive links
and images, and none at all for arbitrary spans.  Adopting the AsciiDoc
syntax for inline items would get us powerful styling for links, images,
and arbitrary spans.  Perhaps it could be developed as an add-on, and an
OPTION could specify that the text markup was AsciiDoc.  Exporters could
also push the document to the AsciiDoc tool chain, which would give us
another path to DocBook, HTML, PDF, and the rest.

I'm afraid, though, that the new parser will enforce its own standards,
making this kind of interchangeability impossible.

Now I've gotten pretty far off-topic.  Enough for now.

Best regards,
Terry
-- 
T.F. Torrey

end of thread, other threads:[~2013-02-25 20:52 UTC | newest]

Thread overview: 11+ messages
2013-02-22 22:23 Bug: New HTML exporter incorrect attributes T.F. Torrey
2013-02-22 23:18 ` Nicolas Goaziou
2013-02-23  9:43   ` T.F. Torrey
2013-02-23 10:16     ` Nicolas Goaziou
2013-02-24 20:27       ` Samuel Wales
2013-02-25  8:23         ` Nicolas Goaziou
2013-02-25 10:55       ` T.F. Torrey
2013-02-25 13:49         ` Nicolas Goaziou
2013-02-25 13:57           ` Vincent Beffara
2013-02-25 14:03             ` Nicolas Goaziou
2013-02-25 20:51           ` T.F. Torrey
