emacs-orgmode@gnu.org archives
* ox-html: exporting LaTeX-environments
@ 2022-04-11 19:38 Vitus Schäfftlein
  2022-04-12  5:15 ` Thibault Marin
  0 siblings, 1 reply; 4+ messages in thread
From: Vitus Schäfftlein @ 2022-04-11 19:38 UTC (permalink / raw)
  To: emacs-orgmode@gnu.org; +Cc: thibault.marin@gmx.com

Dear org-mode mailing list,

first of all I want to express my appreciation for your work and the efforts you put into getting
org-mode together! You guys are awesome.

I’m writing to you because I am setting up my blog, for which I need
fully-fledged LaTeX support, so I am inserting my LaTeX code as svg images. Getting this to work
has raised several problems, though, and I am doing my best to report the issues I found.
All of them concern ox-html. Most of what I write can be found in this github discussion:
https://github.com/kaushalmodi/ox-hugo/discussions/618.

1. The current code in ox-html does not support equation numbers in parentheses. If you add

span.equation-label:before {
    content: '(';
}

span.equation-label:after {
    content: ')';
}

to your css file, strings of the form ( no ) are produced instead of strings of the form (no); for
example, you get ( 1 ) instead of (1).

2. Every environment (except inline math) gets an equation number. But some environments
 should not have (html) equation numbers, like tcolorbox.
3. Every LaTeX environment name foo is changed to foo* (unless it already ends with an
 asterisk). For example, \begin{tabular} is changed to \begin{tabular*}; same for
 \end{tabular}. But tabular* differs from tabular in needing an extra width argument, so
 the export won’t work properly.

I have put quite some elbow-grease into possible solutions, which I would like to share with you.
I am an emacs lisp beginner with background only in philosophical logic, so bear with me.

Changing (1) is simple. org-html--wrap-latex-environment produces the html span class
equation-label, where the equation number is then inserted. Specifically, it adds this string:

"\n<span class=\"equation-label\">\n%s\n</span>"

Now the newlines \n before and after %s end up as whitespace in the rendered page. Just replacing
\n%s\n by %s (that is, leaving the newlines out) solves the problem. HTML treats a newline as
ordinary whitespace, so nothing else depends on them.
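
To see the difference concretely, here is a small sketch that fills in both
templates. The two constant names are made up for illustration; only the first
string is taken from ox-html, the second is the proposed replacement.

```elisp
;; Template as quoted above: the newlines around %s become whitespace
;; between the CSS-generated parentheses and the number.
(defconst my/equation-label-current
  "\n<span class=\"equation-label\">\n%s\n</span>")

;; Proposed template (an assumption, not what ox-html ships): no newlines,
;; so the number sits directly inside the span.
(defconst my/equation-label-proposed
  "<span class=\"equation-label\">%s</span>")

(format my/equation-label-current 1)
(format my/equation-label-proposed 1)
```

With the CSS rules from point 1, the second form leaves nothing between the
parentheses and the number.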

Changing (2) seems to be doable, too, and I think I know how to do it in theory:

1 Create a new variable ox-html-latex-environments-no-number of the form ("foo"
 "bar" "baz" ...), which contains all environments that should not receive equation
 numbers.
2 Change org-html--latex-environment-numbered-p. It is currently defined like this:

(defun org-html--latex-environment-numbered-p (element)
  "Non-nil when ELEMENT contains a numbered LaTeX math environment.
Starred and \"displaymath\" environments are not numbered."
  (not (string-match-p "\\`[ \t]*\\\\begin{\\(.*\\*\\|displaymath\\)}"
               (org-element-property :value element))))

Now we need to adjust the regular expression in such a way that if element has \begin{foo}
or \begin{bar} etc. (that is, the environment name is a member of
ox-html-latex-environments-no-number), it also returns nil. I think this is doable. I thought
about adding something like

  (not
   (or
    ;; starred or display math
    (string-match-p "\\`[ \t]*\\\\begin{\\(.*\\*\\|displaymath\\)}"
                    (org-element-property :value element))
    ;; environment listed in ox-html-latex-environments-no-number
    (string-match-p (format "\\\\begin{%s}" [any element of ox-html-latex-environments-no-number])
                    (org-element-property :value element))))

I don’t know how to express in elisp what is in brackets, though. Does this make sense to you? I
am a beginner with elisp, so I can only state the ideas I have but not implement them (yet).
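
One way to express the bracketed part is to let regexp-opt turn the list into a
single alternation. The following is only a sketch under assumptions: the
variable and function names are hypothetical, and the function takes the LaTeX
source string directly (what org-element-property :value would return) so that
it can be tried without loading Org.

```elisp
(defvar my/org-html-latex-environments-no-number
  '("tcolorbox" "tabular" "figure")
  "Hypothetical list of environment names that should not be numbered.")

(defun my/org-html-latex-value-numbered-p (value)
  "Non-nil when the LaTeX source string VALUE starts a numbered environment.
Starred environments, \"displaymath\" and the environments listed in
`my/org-html-latex-environments-no-number' are not numbered."
  (not
   (or
    ;; starred or display math, same regexp as in ox-html
    (string-match-p "\\`[ \t]*\\\\begin{\\(.*\\*\\|displaymath\\)}" value)
    ;; any environment from the user list; the `and' guards an empty list
    (and my/org-html-latex-environments-no-number
         (string-match-p
          (concat "\\`[ \t]*\\\\begin{"
                  (regexp-opt my/org-html-latex-environments-no-number t)
                  "}")
          value)))))
```

Note that the backslash has to be written as "\\\\begin" inside the string;
"\\begin" would be read by the regexp engine as a word boundary followed by
"egin".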

As to (3): Whether the asterisk is added is controlled by this part of
org-html-latex-environment:

(let ((formula-link
            (org-html-format-latex
             (org-html--unlabel-latex-environment latex-frag)
             processing-type info)))

As you can see, at the moment, org-html--unlabel-latex-environment is applied to every
latex-frag. So we would again need a variable
org-html--latex-environments-leave-unlabelled of the same form as above whose
members are all latex environments to which org-html--unlabel-latex-environment should
not be applied. Then, we could implement a condition like

 (let ((formula-link
        (org-html-format-latex
         ;; if latex-frag starts an environment listed in
         ;; org-html--latex-environments-leave-unlabelled
         (if (string-match-p (format "\\\\begin{%s}"
                                     [any element of org-html--latex-environments-leave-unlabelled])
                             latex-frag)
             ;; then pass latex-frag on unchanged
             latex-frag
           ;; else apply org-html--unlabel-latex-environment to latex-frag
           (org-html--unlabel-latex-environment latex-frag))
         processing-type info)))
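
The condition could be factored into a small helper, sketched here under
assumptions: the variable and the helper name are hypothetical, and the only
real ox-html function used is org-html--unlabel-latex-environment, which the
snippet above already calls.

```elisp
(require 'ox-html)

(defvar my/org-html-latex-environments-leave-unlabelled
  '("tabular" "tcolorbox")
  "Hypothetical list of environment names that must not get an asterisk.")

(defun my/org-html-maybe-unlabel (latex-frag)
  "Return LATEX-FRAG unchanged if its environment is listed, else starred."
  (if (and my/org-html-latex-environments-leave-unlabelled
           (string-match-p
            (concat "\\`[ \t]*\\\\begin{"
                    (regexp-opt my/org-html-latex-environments-leave-unlabelled t)
                    "}")
            latex-frag))
      ;; listed environment: pass the fragment through untouched
      latex-frag
    ;; anything else: let ox-html add the asterisk as it does today
    (org-html--unlabel-latex-environment latex-frag)))
```

Inside org-html-latex-environment, the call (my/org-html-maybe-unlabel latex-frag)
would then take the place of (org-html--unlabel-latex-environment latex-frag).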

It would be great if you could have a look at my solutions. The code surely is awful, but the ideas
behind it might be of value to you. If you could tell me how to solve these problems (or add a
commit which addresses them), this would be awesome! I am planning to do an in-depth guide
on how to write full-fledged LaTeX in html using svg images created with ox-html, and this is
the last step I need for everything to work smoothly.

Warm Regards,

Vitus




* Re: ox-html: exporting LaTeX-environments
  2022-04-11 19:38 ox-html: exporting LaTeX-environments Vitus Schäfftlein
@ 2022-04-12  5:15 ` Thibault Marin
  2022-04-15 10:23   ` AW: " Vitus Schäfftlein
  0 siblings, 1 reply; 4+ messages in thread
From: Thibault Marin @ 2022-04-12  5:15 UTC (permalink / raw)
  To: Vitus Schäfftlein; +Cc: emacs-orgmode@gnu.org


Hi Vitus, list.

My memory is quite fuzzy on this and I won't have a chance to take a
deep look until later, but I will try to share the information I have.

On Mon, 11 Apr 2022 19:38:13 +0000 (9 hours, 37 minutes, 37 seconds ago), Vitus Schäfftlein <vitusschaefftlein@live.de> wrote:

  Dear org-mode mailing list,

  [...]

  3. Any LaTeX environment name foo is changed to foo* (except it already ends with an
   asterisk). For example, \begin{tabular} is changed to \begin{tabular*}; same for
   \end{tabular}. But tabular* differs from tabular in needing an extra width-argument, so
   the export won’t work properly.

I had submitted a patch trying to address this
(https://list.orgmode.org/87h7ok3qi2.fsf@dell-desktop.WORKGROUP/, I have
attached a new version rebased on main to this message).  It never made
it in and I failed to follow up.  This patch (or something similar)
could help with this issue.  It basically only adds the star for math
environments (using org-html--math-environment-p).

  [...]

  Now the newline commands \n before and after %s are exported as whitespace. Just replacing
  \n%s\n by %s (that is, leaving the newlines out) solves the problem. HTML ignores newlines
  anyway.

This seems to work better indeed; the \n's were just cosmetic.

  [...]

  1 Create a new variable ox-html-latex-environments-no-number of the form ("foo"
   "bar" "baz" ...), which contains all environments that should not receive equation
   numbers.

I don't know whether org-html--math-environment-p (as used in the
attached patch) is sufficient to determine whether we need to add a star
to the environment or if we need another variable (in my use cases,
testing for a math environment is sufficient but it may not be the case
in general).

  [...]

  I don’t know how to express in elisp what is in brackets, though. Does this make sense to you? I
  am a beginner with elisp, so I can only state the ideas I have but not implement them (yet).

This can be made to work if there is a consensus that we want to add an
ox-html-latex-environments-no-number variable (I can try to help with
that if needed, even though my elisp isn't great).

  [...]

  (let ((formula-link
              (org-html-format-latex
               (org-html--unlabel-latex-environment latex-frag)
               processing-type info)))

The patch should address that.  I would be curious to see if you
encounter additional problematic cases after applying it.

Thanks for resurrecting this and for your help detecting and fixing the issues.

Best,

thibault


[-- Attachment #2: 0001-lisp-ox-html.el-Fix-automatic-numbering-of-non-math-.patch --]
[-- Type: text/x-diff, Size: 1294 bytes --]

From 61a27c4816a0dae1072f851cc67ea48cec2d362c Mon Sep 17 00:00:00 2001
From: thibault <thibault.marin@gmx.com>
Date: Tue, 12 Apr 2022 00:45:18 -0400
Subject: [PATCH] lisp/ox-html.el: Fix automatic numbering of non-math
 environment

* ox-html.el (org-html-latex-environment): Prevent addition of * to
non-math environments.  Added * is used for math environments to
replace latex equation numbering by org labels for html linking.
---
 lisp/ox-html.el | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lisp/ox-html.el b/lisp/ox-html.el
index 81ef002a0..0968e2199 100644
--- a/lisp/ox-html.el
+++ b/lisp/ox-html.el
@@ -2945,7 +2945,9 @@ CONTENTS is nil.  INFO is a plist holding contextual information."
      ((assq processing-type org-preview-latex-process-alist)
       (let ((formula-link
              (org-html-format-latex
-              (org-html--unlabel-latex-environment latex-frag)
+	      (if (eq nil (org-html--math-environment-p latex-environment))
+		  latex-frag
+		(org-html--unlabel-latex-environment latex-frag))
               processing-type info)))
         (when (and formula-link (string-match "file:\\([^]]*\\)" formula-link))
           (let ((source (org-export-file-uri (match-string 1 formula-link))))
--
2.33.0



* AW: ox-html: exporting LaTeX-environments
  2022-04-12  5:15 ` Thibault Marin
@ 2022-04-15 10:23   ` Vitus Schäfftlein
  2022-04-20 18:46     ` WG: " Vitus Schäfftlein
  0 siblings, 1 reply; 4+ messages in thread
From: Vitus Schäfftlein @ 2022-04-15 10:23 UTC (permalink / raw)
  To: thibault.marin@gmx.com; +Cc: emacs-orgmode@gnu.org

Dear Thibault, dear list,

thanks for your fast answer, Thibault!  I am happy to have convinced you to drop the extra \n's.  The patch you attached works fine for the asterisk issue: only those environments that match org-latex-math-environments-re get an extra asterisk. Nonetheless, I think we should make that variable more user-friendly by accepting a list of environment names like ("foo" "bar" "baz") instead of one huge regexp.

Unfortunately, the patch does not have any impact on the numbering problem. I guess we actually need org-html--latex-environments-leave-unlabelled to get this running. I just want to give two examples of why it is not useful to label every environment.

1. Imagine you do not need links to your equations and use the \tag command in your equation environment to get the right equation numbers within the svg. This looks nicer anyway because the font is the same and the formatting is perfect. If we did not have org-html--latex-environments-leave-unlabelled, there would be both the equation number inside the svg and, below it, a second number in the HTML text on a new line. The same goes for the figure environment, which provides its own caption. In general, everything that has its own counter within the svg should not receive extra equation numbers.
2. There are some environments you do not want labelled. Imagine you compile a table summarizing the most important formulas of your post. This is not an equation and it should not be numbered, but it is. A number then appears below the table, which looks awful.

Since I am setting up a blog just now, I can only stress that this variable is needed. I wish I could help you with the code more than I have done with my suggestions. But maybe someone else on the list has an idea?
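
A list-based variant could be derived mechanically, as in the following sketch. It is written under assumptions: both names are hypothetical, the list contents are only examples, and the resulting regexp is not claimed to be identical to org-latex-math-environments-re.

```elisp
(defvar my/latex-math-environments
  '("equation" "align" "gather" "multline")
  "Hypothetical user-facing list of math environment names.")

(defun my/latex-math-environments-re ()
  "Build a regexp matching the start of a listed environment.
An optional trailing asterisk in the environment name is accepted."
  (format "\\`[ \t]*\\\\begin{%s\\*?}"
          (regexp-opt my/latex-math-environments t)))

;; Usage: non-nil for a listed environment, nil otherwise.
(string-match-p (my/latex-math-environments-re) "\\begin{align*}\nx &= 1")
(string-match-p (my/latex-math-environments-re) "\\begin{tabular}{ll}")
```

Users would then only edit the list, and the regexp would be rebuilt from it whenever it is needed.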

Best Regards,
Vitus




* WG: ox-html: exporting LaTeX-environments
  2022-04-15 10:23   ` AW: " Vitus Schäfftlein
@ 2022-04-20 18:46     ` Vitus Schäfftlein
  0 siblings, 0 replies; 4+ messages in thread
From: Vitus Schäfftlein @ 2022-04-20 18:46 UTC (permalink / raw)
  To: emacs-orgmode@gnu.org

[Bump]

Dear Thibault, dear list,

thanks for your fast answer, Thibault!  I am happy to have convinced you to drop the extra \n's.  The patch you attached works fine for the asterisk issue: only those environments that match org-latex-math-environments-re get an extra asterisk. Nonetheless, I think we should make that variable more user-friendly by accepting a list of environment names like ("foo" "bar" "baz") instead of one huge regexp.

Unfortunately, the patch does not have any impact on the numbering problem. I guess we actually need org-html--latex-environments-leave-unlabelled to get this running. I just want to give two examples of why it is not useful to label every environment.

1. Imagine you do not need links to your equations and use the \tag command in your equation environment to get the right equation numbers within the svg. This looks nicer anyway because the font is the same and the formatting is perfect. If we did not have org-html--latex-environments-leave-unlabelled, there would be both the equation number inside the svg and, below it, a second number in the HTML text on a new line. The same goes for the figure environment, which provides its own caption. In general, everything that has its own counter within the svg should not receive extra equation numbers.
2. There are some environments you do not want labelled. Imagine you compile a table summarizing the most important formulas of your post. This is not an equation and it should not be numbered, but it is. A number then appears below the table, which looks awful.

You will find an example of the problem here: https://uni-muenster.sciebo.de/s/8gW9dCx7q8NHIeV

Since I am setting up a blog just now, I can only stress that this variable is needed. I wish I could help you with the code more than I have done with my suggestions. But maybe someone else on the list has an idea?

Best Regards,
Vitus