emacs-orgmode@gnu.org archives
* [Bug] Problem with html export of description list items
@ 2011-04-06  0:07 Ethan Ligon
  2011-04-06  0:38 ` [PATCH] " Ethan Ligon
  0 siblings, 1 reply; 7+ messages in thread
From: Ethan Ligon @ 2011-04-06  0:07 UTC
  To: emacs-orgmode

I've just stumbled across what I regard as a bug in the html export of
description list items.

The problem has to do with whether the specification of a description
list includes a trailing space or not; i.e., whether "- Item ::" is
treated the same way as "- Item :: ".  LaTeX export treats these as
identical.  Html export gets confused about what the description list
item is, and winds up generating a "???" for the description.

Here's an example.

#+begin_src org
* Illustration of bug in html export
  - Has a space after the colons :: so will work in latex and html
  - No space after the colons ::so won't work in html
  - Has a terminating space :: 
    - So it works in both html and latex export!
    - Even though it's difficult to distinguish from the next example.
  - Lacks a terminating space ::
    - *Doesn't* work in html export, does in latex.
#+end_src

The relevant bit of the html export looks like this:

#+begin_src html
  <div id="outline-container-1" class="outline-2">
  <h2 id="sec-1"><span class="section-number-2"></span> Illustration of bug in html export </h2>
  <div class="outline-text-2" id="text-1">
  
  <dl>
  <dt>This has a space after the colons</dt><dd>so will work
  </dd>
  <dt>???</dt><dd>This doesn't have a space after the colons ::so won't work
  </dd>
  <dt>Has a terminating space</dt><dd>
  <ul>
  <li>So it works in both html and latex export!
  </li>
  <li>Even though it's difficult to distinguish from the next example.
  </li>
  </ul>
  
  </dd>
  <dt>???</dt><dd>Lacks a terminating space ::
  <ul>
  <li><b>Doesn't</b> work in html export, does in latex.
  </li>
  </ul>
  
  </dd>
  </dl>
  
  </div>
  </div>
#+end_src

The relevant bit of the latex export looks like this:

#+begin_src latex
\vspace*{1cm}
\section{Illustration of bug in html export}
\label{sec-1}

\begin{description}
\item[This has a space after the colons] so will work
\item[This doesn't have a space after the colons] so won't work
\item[Has a terminating space] 
\begin{itemize}
\item So it works in both html and latex export!
\item Even though it's difficult to distinguish from the next example.
\end{itemize}
\item[Lacks a terminating space] 
\begin{itemize}
\item \textbf{Doesn't} work in html export, does in latex.
\end{itemize}
\end{description}
#+end_src

Thanks for any help!

-Ethan Ligon


* [PATCH] Problem with html export of description list items
  2011-04-06  0:07 [Bug] Problem with html export of description list items Ethan Ligon
@ 2011-04-06  0:38 ` Ethan Ligon
  2011-04-07 12:51   ` Nicolas Goaziou
  0 siblings, 1 reply; 7+ messages in thread
From: Ethan Ligon @ 2011-04-06  0:38 UTC
  To: emacs-orgmode

Ethan Ligon <ligon <at> are.berkeley.edu> writes:
> 
> I've just stumbled across what I regard as a bug in the html export of
> description list items.
> 
> The problem has to do with whether the specification of a description
> list includes a trailing space or not; i.e., whether "- Item ::" is
> treated the same way as "- Item :: ".  LaTeX export treats these as
> identical.  Html export gets confused about what the description list
> item is, and winds up generating a "???" for the description.
> 

Having done the work to describe the problem, it wasn't hard to find a
solution.  In this case that's a one character change to a regexp in
org-html.el. 

Here's the patch:

diff --git a/lisp/org-html.el b/lisp/org-html.el
index d19d88b..005a0f7 100644
--- a/lisp/org-html.el
+++ b/lisp/org-html.el
@@ -2501,7 +2501,7 @@ the alist of previous items."
        (concat "[ \t]*\\(\\S-+[ \t]*\\)"
               "\\(?:\\[@\\(?:start:\\)?\\([0-9]+\\|[A-Za-z]\\)\\]\\)?"
               "\\(?:\\(\\[[ X-]\\]\\)[ \t]+\\)?"
-              "\\(?:\\(.*\\)[ \t]+::[ \t]+\\)?"
+              "\\(?:\\(.*\\)[ \t]+::[ \t]*\\)?"
               "\\(.*\\)") line)
       (let* ((checkbox (match-string 3 line))
             (desc-tag (or (match-string 4 line) "???"))


* Re: [PATCH] Problem with html export of description list items
  2011-04-06  0:38 ` [PATCH] " Ethan Ligon
@ 2011-04-07 12:51   ` Nicolas Goaziou
  2011-04-07 21:52     ` Ethan Ligon
  0 siblings, 1 reply; 7+ messages in thread
From: Nicolas Goaziou @ 2011-04-07 12:51 UTC
  To: Ethan Ligon; +Cc: emacs-orgmode

Hello,

Ethan Ligon <ligon@are.berkeley.edu> writes:

> Ethan Ligon <ligon <at> are.berkeley.edu> writes:
>> 
>> I've just stumbled across what I regard as a bug in the html export of
>> description list items.
>> 
>> The problem has to do with whether the specification of a description
>> list includes a trailing space or not; i.e., whether "- Item ::" is
>> treated the same way as "- Item :: ".  LaTeX export treats these as
>> identical.  Html export gets confused about what the description list
>> item is, and winds up generating a "???" for the description.

LaTeX exporter doesn't treat these as identical. What happens is that
in your example, the first item is correct and the list is thus set as
a description list. As such, LaTeX exporter tries hard to fill
description terms for every item in the list. If you exchange the first
and second items in your example, the list will be exported as
a standard itemize list in LaTeX.
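
The behaviour described above can be modelled with a toy sketch in
Python.  This is not the LaTeX exporter's code: `list_kind` and `TERM`
are hypothetical names, and the pattern (whitespace on both sides of
"::") is only an assumption about what counts as a valid first item.

```python
import re

# Assumed notion of a valid description item: "term :: text",
# with whitespace on both sides of the double colon.
TERM = re.compile(r"(.*)[ \t]+::[ \t]+")

def list_kind(items):
    """Toy rule: the first item alone decides the kind of the whole list."""
    return "description" if TERM.match(items[0]) else "itemize"

items = [
    "Has a space after the colons :: so will work in latex and html",
    "No space after the colons ::so won't work in html",
]
print(list_kind(items))                  # description
print(list_kind(list(reversed(items))))  # itemize
```

Swapping the two items changes the result because only the first item
is inspected in this model, which is the effect described above.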

For the HTML (and DocBook) exporters, this is a little different: term
recognition is hard-coded there. I will modify that.

> Having done the work to describe the problem, it wasn't hard to find a
> solution.  In this case that's a one character change to a regexp in
> org-html.el. 
>
> Here's the patch:
>
> diff --git a/lisp/org-html.el b/lisp/org-html.el
> index d19d88b..005a0f7 100644
> --- a/lisp/org-html.el
> +++ b/lisp/org-html.el
> @@ -2501,7 +2501,7 @@ the alist of previous items."
>         (concat "[ \t]*\\(\\S-+[ \t]*\\)"
>                "\\(?:\\[@\\(?:start:\\)?\\([0-9]+\\|[A-Za-z]\\)\\]\\)?"
>                "\\(?:\\(\\[[ X-]\\]\\)[ \t]+\\)?"
> -              "\\(?:\\(.*\\)[ \t]+::[ \t]+\\)?"
> +              "\\(?:\\(.*\\)[ \t]+::[ \t]*\\)?"
>                "\\(.*\\)") line)
>        (let* ((checkbox (match-string 3 line))
>              (desc-tag (or (match-string 4 line) "???"))

Your patch allows items like:

- term ::description

which are not valid for a description list.
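
A rough way to see this is to try only the tag part of the regexp,
before and after the patch, on a few item bodies.  The sketch below is
a Python translation of that fragment (assumed equivalent for these
inputs); `tag_of` and the constant names are made up for illustration.

```python
import re

# Tag part of the regexp, translated from Emacs to Python syntax.
ORIGINAL = re.compile(r"(.*)[ \t]+::[ \t]+")     # whitespace required after "::"
FIRST_PATCH = re.compile(r"(.*)[ \t]+::[ \t]*")  # whitespace optional after "::"

def tag_of(regexp, item_text):
    """Return the description term, or None if no tag is recognised."""
    m = regexp.match(item_text)
    return m.group(1) if m else None

for text in ["term :: description", "term ::", "term ::description"]:
    print(repr(text), tag_of(ORIGINAL, text), tag_of(FIRST_PATCH, text))
# 'term :: description' term term
# 'term ::' None term
# 'term ::description' None term
```

The patched pattern fixes the bare "term ::" case, but it also starts
accepting "term ::description".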

Regards,

-- 
Nicolas Goaziou


* Re: [PATCH] Problem with html export of description list items
  2011-04-07 12:51   ` Nicolas Goaziou
@ 2011-04-07 21:52     ` Ethan Ligon
  2011-04-08 11:53       ` Nicolas Goaziou
  0 siblings, 1 reply; 7+ messages in thread
From: Ethan Ligon @ 2011-04-07 21:52 UTC
  To: emacs-orgmode

Nic-

Nicolas Goaziou <n.goaziou <at> gmail.com> writes:
> > Ethan Ligon <ligon <at> are.berkeley.edu> writes:
> >> 
> >> I've just stumbled across what I regard as a bug in the html export of
> >> description list items.
> >> 
> >> The problem has to do with whether the specification of a description
> >> list includes a trailing space or not; i.e., whether "- Item ::" is
> >> treated the same way as "- Item :: ".  LaTeX export treats these as
> >> identical.  Html export gets confused about what the description list
> >> item is, and winds up generating a "???" for the description.
> 
> LaTeX exporter doesn't treat these as identical. What happens is that
> in your example, the first item is correct and the list is thus set as
> a description list. As such, LaTeX exporter tries hard to fill
> description terms for every item in the list. If you exchange the first
> and second items in your example, the list will be exported as
> a standard itemize list in LaTeX.

<snip>

> Your patch allows items like:
> 
> - term ::description
> 
> which are not valid for a description list.

Thanks for correcting my misunderstanding of the latex-export
behavior.  But I still think this behavior is undesirable.

The org manual says that a description item takes the form '- term
:: ', and thus seems to require a space after the double colon.  I
suppose it's this that you're relying on in claiming that 
"- term ::description" is invalid.

I agree that "- term ::description" is ugly, but the use-case that is
giving me problems is something like
  
  - term ::
    1. A list
    2. Providing
    3. The description

The html export code currently allows "- term ::[ \t]+", so the above
breaks unless there's a space or a tab following the "::".  My issue
would be addressed if we could just slightly expand the set of
allowable white space following, so that we'd have "::[ \t\n]+".

Does that seem reasonable?

Thanks,
-Ethan


* Re: [PATCH] Problem with html export of description list items
  2011-04-07 21:52     ` Ethan Ligon
@ 2011-04-08 11:53       ` Nicolas Goaziou
  2011-04-08 17:46         ` [PATCH] Fix for html & docbook " Ethan Ligon
  2011-04-09  3:19         ` [PATCH] Problem with html " Ethan Ligon
  0 siblings, 2 replies; 7+ messages in thread
From: Nicolas Goaziou @ 2011-04-08 11:53 UTC
  To: Ethan Ligon; +Cc: emacs-orgmode

Hello,

Ethan Ligon <ligon@are.berkeley.edu> writes:

> Nicolas Goaziou <n.goaziou <at> gmail.com> writes:
>> > Ethan Ligon <ligon <at> are.berkeley.edu> writes:

>> Your patch allows items like:
>> 
>> - term ::description
>> 
>> which are not valid for a description list.
>
> Thanks for correcting my misunderstanding of the latex-export
> behavior.  But I still think this behavior is undesirable.

I don't disagree: I was just pointing out the internals of the LaTeX
exporter in that case.

> The org manual says that a description item takes the form '- term
> :: ', and thus seems to require a space after the double colon.  I
> suppose it's this that you're relying on in claiming that 
> "- term ::description" is invalid.

Indeed. On the other hand, font-lock will still fontify it as
a description item, which is paradoxical.

> I agree that "- term ::description" is ugly, but the use-case that is
> giving me problems is something like
>   
>   - term ::
>     1. A list
>     2. Providing
>     3. The description
>
> The html export code currently allows "- term ::[ \t]+", so the above
> breaks unless there's a space or a tab following the "::".  My issue
> would be addressed if we could just slightly expand the set of
> allowable white space following, so that we'd have "::[ \t\n]+".
>
> Does that seem reasonable?

It is. Or something like the slightly different "::\\(?:[ \t]+\\|$\\)".
And if the same modification is made to org-docbook.el (code is very
similar), that will make a good patch for the problem at hand.
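
The two candidates can be compared with a small Python sketch.  It
assumes the exporter matches one line at a time, and the thread does
not say whether that line still carries its trailing newline, so both
cases are tried.  The end-of-line variant is written with a
non-capturing group, which is assumed to be the intent; the names are
hypothetical.

```python
import re

# Tag part only, translated from Emacs to Python regexp syntax.
NEWLINE_CLASS = re.compile(r"(.*)[ \t]+::[ \t\n]+")
END_OF_LINE = re.compile(r"(.*)[ \t]+::(?:[ \t]+|$)", re.MULTILINE)

def tag_of(regexp, item_text):
    """Return the description term, or None if no tag is recognised."""
    m = regexp.match(item_text)
    return m.group(1) if m else None

for text in ["term ::", "term ::\n", "term :: text", "term ::text"]:
    print(repr(text), tag_of(NEWLINE_CLASS, text), tag_of(END_OF_LINE, text))
# 'term ::' None term
# 'term ::\n' term term
# 'term :: text' term term
# 'term ::text' None None
```

Under these assumptions the end-of-line variant handles a bare
"term ::" whether or not the newline is present, and both variants
still reject "term ::text".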

Regards,

-- 
Nicolas Goaziou


* Re: [PATCH] Fix for html & docbook export of description list items
  2011-04-08 11:53       ` Nicolas Goaziou
@ 2011-04-08 17:46         ` Ethan Ligon
  2011-04-09  3:19         ` [PATCH] Problem with html " Ethan Ligon
  1 sibling, 0 replies; 7+ messages in thread
From: Ethan Ligon @ 2011-04-08 17:46 UTC
  To: emacs-orgmode

After some very helpful corrections and suggestions from Nic, I'd like
to propose the following patch, which addresses a problem in the html
and docbook export of description items.  

The problem is illustrated by the following example:

#+begin_src org
* Illustration of bug in html export
  - This has a space after the colons :: so will work in latex and html
  - This doesn't have a space after the colons ::so is an invalid
    description item according to the org manual.  Won't work in html
    or docbook.  Will nevertheless work in latex, provided /first/
    description item is valid.
  - Has a terminating space :: 
    - So it works in both html and latex export!
    - Even though it's difficult to distinguish from the next example.
  - Lacks a terminating space ::
    - At present, *doesn't* work in html or docbook export, does in
      latex.  This is the case that the following patch fixes.
#+end_src


diff --git a/lisp/org-docbook.el b/lisp/org-docbook.el
index dbb608d..124e1dc 100644
--- a/lisp/org-docbook.el
+++ b/lisp/org-docbook.el
@@ -1382,7 +1382,7 @@ the alist of previous items."
       (string-match (concat "[ \t]*\\(\\S-+[ \t]*\\)"
                            "\\(?:\\[@\\(?:start:\\)?\\([0-9]+\\|[a-zA-Z]\\)\\]\\)?"
                            "\\(?:\\(\\[[ X-]\\]\\)[ \t]+\\)?"
-                           "\\(?:\\(.*\\)[ \t]+::[ \t]+\\)?"
+                           "\\(?:\\(.*\\)[ \t]+::\\(?:[ \t]+\\|$\\)\\)?"
                            "\\(.*\\)")
                    line)
       (let* ((checkbox (match-string 3 line))
diff --git a/lisp/org-html.el b/lisp/org-html.el
index d19d88b..4ae6d99 100644
--- a/lisp/org-html.el
+++ b/lisp/org-html.el
@@ -2501,7 +2501,7 @@ the alist of previous items."
        (concat "[ \t]*\\(\\S-+[ \t]*\\)"
               "\\(?:\\[@\\(?:start:\\)?\\([0-9]+\\|[A-Za-z]\\)\\]\\)?"
               "\\(?:\\(\\[[ X-]\\]\\)[ \t]+\\)?"
-              "\\(?:\\(.*\\)[ \t]+::[ \t]+\\)?"
+              "\\(?:\\(.*\\)[ \t]+::\\(?:[ \t]+\\|$\\)\\)?"
               "\\(.*\\)") line)
       (let* ((checkbox (match-string 3 line))
             (desc-tag (or (match-string 4 line) "???"))
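
As a quick check of the patched regexp, here is a Python translation
applied to the first line of each item in the example above.  It is a
sketch, not the exporter: `ITEM` and `split_item` are hypothetical
names, the translation is assumed equivalent for these inputs, and
`re.MULTILINE` is used so that `$` also matches before a newline, as in
Emacs.

```python
import re

# Whole-line regexp from the patch, translated to Python syntax.
ITEM = re.compile(
    r"[ \t]*(\S+[ \t]*)"                       # 1: bullet
    r"(?:\[@(?:start:)?([0-9]+|[A-Za-z])\])?"  # 2: counter cookie
    r"(?:(\[[ X-]\])[ \t]+)?"                  # 3: checkbox
    r"(?:(.*)[ \t]+::(?:[ \t]+|$))?"           # 4: description term
    r"(.*)",                                   # 5: rest of the line
    re.MULTILINE,
)

def split_item(line):
    """Return (tag, rest); the tag falls back to "???" as desc-tag does."""
    m = ITEM.match(line)
    tag = m.group(4)
    return ("???" if tag is None else tag, m.group(5))

examples = [
    "  - This has a space after the colons :: so will work in latex and html",
    "  - This doesn't have a space after the colons ::so is an invalid",
    "  - Has a terminating space :: ",
    "  - Lacks a terminating space ::",
]
for line in examples:
    print(split_item(line))
```

Only the second item, which has no whitespace after "::", still falls
back to "???"; the item that lacks a terminating space now yields its
term.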


* Re: [PATCH] Problem with html export of description list items
  2011-04-08 11:53       ` Nicolas Goaziou
  2011-04-08 17:46         ` [PATCH] Fix for html & docbook " Ethan Ligon
@ 2011-04-09  3:19         ` Ethan Ligon
  1 sibling, 0 replies; 7+ messages in thread
From: Ethan Ligon @ 2011-04-09  3:19 UTC
  To: Ethan Ligon, emacs-orgmode

Nic-

Sent this to the list earlier; should have cc'd you.

Thanks again for your help!
-Ethan

-- 
Ethan Ligon, Associate Professor
Agricultural & Resource Economics
University of California, Berkeley


