From: tftorrey@tftorrey.com (T.F. Torrey)
To: Nicolas Goaziou <n.goaziou@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: Bug: New HTML exporter incorrect attributes
Date: Mon, 25 Feb 2013 13:51:51 -0700
Message-ID: <87y5ec86c8.fsf@lapcat.tftorrey.com>
In-Reply-To: <87vc9gfqr3.fsf@gmail.com> (message from Nicolas Goaziou on Mon, 25 Feb 2013 14:49:04 +0100)

Hello Nicolas,

Thanks for your prompt reply, though I think our discussion is a little
off-track here, as noted below.

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> tftorrey@tftorrey.com (T.F. Torrey) writes:
>
>>> It would be ATTR_HTML: :class "XXX". I try to unify syntax for
>>> attributes with syntax for Babel and AFAICT, `html' is the last back-end
>>> to have key="value" syntax.

This was your response to my comment that it would be handy to apply
attributes to paragraphs, not the links or images within them.

>> I see that this does not presently work, and the author listed on
>> ox-html.el is not currently active on this list.  I hope you are not the
>> only one working on this.  It would be our great misfortune for you to
>> become burned out.
>
> It's not much work once we agree about the real syntax. For example, for
> links, there are two ways to replace:

I think you are now talking about a syntax for adding attributes to
links and images within paragraphs.  In the e-mail to which you are
replying, I thought we were agreeing that modifying the inline link and
image syntax was the best solution to these, and had thoughts on what
that might look like in another thread.  What you describe here still
has the limitation that either all links and images in a paragraph must
have the same attributes, or some must be invalid, or only the first is
reachable.

Apart from that, I have other more general concerns with this approach
noted below.

>   #+ATTR_HTML: width="400px"
>
> The easiest one, is simply to ask for `:options' before:
>
>   #+ATTR_HTML: :options "width=\"400px\""
>
> This is heavier but will be consistent with other back-ends.

Yes, this is heavy.  Escaping the quotes is unwieldy, and raises doubts
about what else might need to be escaped.  Also, given that the whole
and only point of the ATTR_HTML keyword is for describing options,
adding ":options" is redundant.  From a design standpoint, it might be
elegant that it matches other things, but here it seems very awkward,
and I don't understand who it would benefit.

This seems another step away from "plain text" toward "another
programming language".  The first syntax looks like plain text.  The
second looks like programming.  For babel, and actual programming
languages, I'm sure this makes good sense.  For applying a width to an
image, it's overkill.

For instance, compare this "plain text":

#+ATTR_HTML: :options "width=\"400px\" title=\"My image\""
[[file:image.jpg]]

to the HTML alternative:

<img width="400px" title="My image" src="image.jpg"/>

Shouldn't the plain text be more straightforward to enter and easier to
understand than what it is replacing?

Here is the same thing in AsciiDoc:

image:image.jpg["My image",width=400]

Granted, I've already noted that I think we are moving toward inline
attribute specifications, so this syntax for images and links is
probably moot anyway, but I think the larger point still stands.

I hate to say that I have other concerns as well, but I do, below.

> Otherwise, there is also:
>
>   #+ATTR_HTML: :width "400px"
>
> But this requires to have a list of all properties supported. If we take
> that route, here is a suggested list of such properties for <a> tag:
>
>   - rel
>   - target
>   - type
>   - accesskey
>   - class
>   - style
>   - title
>
> and for <img>
>
>   - alt
>   - height
>   - width
>
> What do you think about it?

I think it is rather heavy-handed.  I don't understand why this
"requires" a list of properties supported.  The old exporter would
simply plug whatever you told it into the tag, trusting that you knew
what you were doing.  I'm sure this simplified the code, and it gave
great power to the user.  Why should the user need "permission" from the
developers to apply any arbitrary attributes to their elements?
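
To make the pass-through behaviour described above concrete, here is a
minimal sketch in Python.  It is not the code of either Org exporter;
the function name `render_tag` is hypothetical.  Every attribute it is
given is written into the tag, with the value escaped so the output
stays well-formed, and no key is checked against a list of supported
properties.

```python
from html import escape


def render_tag(name, attrs, self_closing=True):
    """Render an HTML tag, passing every attribute through unchanged.

    `render_tag` is a hypothetical helper used only for illustration.
    """
    parts = [name]
    for key, value in attrs.items():
        # Escape the value so quotes cannot break the markup, but do not
        # compare the key against any list of "supported" attributes.
        parts.append('{}="{}"'.format(key, escape(str(value), quote=True)))
    return "<" + " ".join(parts) + ("/>" if self_closing else ">")


print(render_tag("img", {"width": "400px", "title": "My image", "src": "image.jpg"}))
# prints: <img width="400px" title="My image" src="image.jpg"/>
```

Escaping only the values is one possible way for the exporter's own
output to remain valid while the choice of attributes stays entirely
with the user.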

Imposing these restrictions on users seems to make more work for the
users, and more work for the developers, to the benefit of no one.

Also, I don't know why attributes should be defined for each backend
rather than once for everywhere.  The attributes would be designated for
an object, and each backend would decide which to use or ignore.

For instance, though I know the LaTeX syntax is not correct, this seems
like massive overkill for making a link red:

#+ATTR_LATEX: :options "text-color: red"
#+ATTR_HTML: :options "class=\"red\""
Here is a [[file:doc.html][red link]].

FWIW, the same thing in AsciiDoc would be this:

Here is a [red]#link:doc.html#[red link].

And it would work correctly for every backend, current or future.

In AsciiDoc, the attributes belong to the item, and every backend is
free to use or ignore them.  Plain text sure looks appealing.
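
The idea of attributes that belong to the item can be sketched in
Python as follows.  This is an illustration under stated assumptions,
not how Org or AsciiDoc is implemented: the `link` dictionary, the
"role" key, and the three `export_*` functions are all hypothetical,
and the LaTeX output assumes the `hyperref` and `xcolor` packages and
that the role happens to be a valid color name.

```python
# A hypothetical link object: its attributes are stored once, on the
# item itself, independent of any back-end.
link = {"target": "doc.html", "text": "red link", "attrs": {"role": "red"}}


def export_html(link):
    # The HTML back-end chooses to map the generic role onto a class.
    role = link["attrs"].get("role")
    cls = ' class="{}"'.format(role) if role else ""
    return '<a href="{}"{}>{}</a>'.format(link["target"], cls, link["text"])


def export_latex(link):
    # The LaTeX back-end chooses to map the role onto a text color
    # (assumes hyperref for \href and xcolor for \textcolor).
    role = link["attrs"].get("role")
    body = link["text"]
    if role:
        body = r"\textcolor{%s}{%s}" % (role, body)
    return r"\href{%s}{%s}" % (link["target"], body)


def export_text(link):
    # The plain-text back-end has no use for the role and ignores it.
    return "{} <{}>".format(link["text"], link["target"])


print(export_html(link))   # <a href="doc.html" class="red">red link</a>
print(export_latex(link))  # \href{doc.html}{\textcolor{red}{red link}}
print(export_text(link))   # red link <doc.html>
```

A back-end added later would read the same `attrs` and decide for
itself, so the source document would not need a new ATTR_ line for it.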

Again, this is applying the old ATTR_ syntax instead of the suggested
inline attribute designations, but if the new link syntax matches the
spirit of the existing structure, something like this is in the works:

Here is a [[file:doc.html][red link][@@html: "class=\"red\""@@ @@latex:
"text-color: red"@@]].

IMHO, the AsciiDoc approach is much better.

>> The HTML exporter should produce valid HTML regardless of the input.
>
> We cannot remove the ability to shoot oneself in the foot. The HTML
> back-end cannot be responsible for undefined syntax. Think about:
>
>   @@html:<foo>@@

I'm pretty sure you understand what I meant.  The user should be free to
ruin things however he likes, but whatever the exporter produces itself
should be valid.

[... 48 lines omitted ...]

>> Until there is a "proper" solution, however, could we please modify the
>> exporter to apply ATTR_HTML to only the next image or link?  I am very
>> sure that was the spirit of the old exporter, and it would be nice if I
>> could maintain my documents in Org without resorting to (even more)
>> hacks.
>
> Done.

Thank you very much.

> Regards,

Some closing thoughts:

I am concerned that the new parser is unnecessarily heavy-handed in
general.  For instance, why should document #+OPTIONS be restricted to a
pre-defined set at all?  It sure would be easier to hack if any
arbitrary property could be set there and accessed with
(org-get-property "whatever").  The same goes for node-level
EXPORT_OPTIONS.  I know that these have always been hard-coded in
advance, but I was hoping the new parser would be a step toward freedom,
not the other way.  Surely the code would be simpler accepting any
OPTION rather than checking each value to see if it is approved and
rejecting those that are not.
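
As a rough illustration of accepting any option rather than an
approved set, here is a small Python sketch.  It is not Org's parser;
`parse_options` is a hypothetical name, and the sketch assumes the
simple whitespace-separated `key:value` form only, so values that
contain spaces are out of scope.

```python
def parse_options(line):
    """Parse '#+OPTIONS: key:value ...' into a dict, accepting any key.

    `parse_options` is a hypothetical helper used only for illustration.
    """
    prefix = "#+OPTIONS:"
    if not line.upper().startswith(prefix):
        raise ValueError("not an OPTIONS line")
    options = {}
    for token in line[len(prefix):].split():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = token.partition(":")
        if sep and key:
            # No whitelist: every key is stored as given.
            options[key] = value
    return options


opts = parse_options("#+OPTIONS: toc:nil H:3 whatever:42")
print(opts.get("whatever"))  # prints: 42
```

Code that cares about a given option looks it up; code that does not
never sees it, and nothing has to be rejected in advance.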

I have been a dedicated (but mostly silent) Org user for years.  I have
literally all of my life and livelihood documents in Org format.  But
right now I have seven or more tabs in Firefox open to AsciiDoc
documentation.  The syntax looks lightweight and easy and fun, like Org
used to.

I've been wondering for the past two days what it would take to use Org
syntax for headlines, structure, Org tables, dynamic blocks, and Babel
code, but to use AsciiDoc standards for the markup of the actual text.
One big benefit of Org is that it is entirely elisp.  I'm not a very
good elisp programmer, and I don't know enough LaTeX, but I think even I
could make a parser that would turn AsciiDoc text into HTML, plain text,
and even DocBook.

I wonder if there would be support for that.  Our current system has
support for basic inline styling (italic, bold, etc.), primitive links
and images, and none at all for arbitrary spans.  Adopting the AsciiDoc
syntax for inline items would get us powerful styling for links, images,
and arbitrary spans.  Perhaps it could be developed as an add-on, and an
OPTION could specify that the text markup was AsciiDoc.  Exporters could
also push the document to the AsciiDoc tool chain, which would give us
another path to DocBook, HTML, PDF, and the rest.

I'm afraid, though, that the new parser will enforce its own standards,
making this kind of interchangeability impossible.

Now I've gotten pretty far off-topic.  Enough for now.

Best regards,
Terry
-- 
T.F. Torrey
