emacs-orgmode@gnu.org archives
* [Tip] Export a bibliography to HTML with bibLaTeX and make4ht
@ 2021-01-23 11:03 Juan Manuel Macías
  2021-01-24 11:37 ` Gustavo Barros
  0 siblings, 1 reply; 7+ messages in thread
From: Juan Manuel Macías @ 2021-01-23 11:03 UTC (permalink / raw)
  To: orgmode

[-- Attachment #1: Type: text/plain, Size: 2693 bytes --]

Hi,

When I export an Org document that contains a bibliography to LaTeX, I
use bibLaTeX with a heavily customized style (i.e. quite a few lines of
bibLaTeX-related code in the preamble). I wanted to apply all those
bibLaTeX settings and styles when exporting to HTML too, so I came up
with this method, using make4ht. I share it here in case it is useful
to someone.

The idea is to compile with make4ht (see:
https://www.ctan.org/pkg/make4ht) a simple file with *only* the
bibliography, and "embed" the HTML output in the Org document. You need
to create a TeX file in the working directory, which serves as a
minimal preamble and also includes all the bibLaTeX-related code.
We can name it preamble.tex, and it would start like this:

#+begin_src latex
\documentclass{article}
\usepackage{fontspec}
\usepackage[<whatever-language>]{babel}
\usepackage[backend=biber,style=authortitle,dashed=true,sorting=nyt]{biblatex}
%% more code related to bibLaTeX...
#+end_src

We also need a small Lua file to control the make4ht compilation.
With this build file, draft mode runs a single LaTeX pass and does not
call Biber. The file can be named build.lua:

#+begin_src lua
if mode=="draft" then
Make:htlatex {}
else
Make:htlatex {}
Make:biber {}
Make:htlatex {}
end
#+end_src

Finally, the following Elisp function takes two arguments: the preamble
file and the *.bib file from which to generate the list of references.
The optional draft argument makes make4ht run in draft mode (that is,
the bibliography is not rebuilt). At the end, Pandoc is run on the HTML
that make4ht produces in order to simplify it, and its output is returned:

#+begin_src emacs-lisp
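  (defun my-biblio-html (preamble bib &optional draft)
    "Return an HTML bibliography built from BIB using PREAMBLE.
  With non-nil DRAFT, run make4ht in draft mode (Biber is skipped)."
    (when (org-export-derived-backend-p org-export-current-backend 'html)
      (let ((file (file-name-sans-extension bib))
            (d (if draft "-m draft " "")))
        ;; Write the wrapper file directly instead of through `echo',
        ;; whose handling of backslashes varies between shells.
        (with-temp-file (concat file "-bib.tex")
          (insert "\\input{" preamble "}\n"
                  "\\addbibresource{" bib "}\n"
                  "\\begin{document}\n"
                  "\\nocite{*}\n"
                  "\\printbibliography[heading=none]\n"
                  "\\end{document}\n"))
        ;; Build the HTML with make4ht, then let Pandoc simplify it.
        ;; Pandoc's output is the return value.
        (shell-command-to-string
         (concat "make4ht -ule build.lua " d
                 (shell-quote-argument (concat file "-bib.tex"))
                 " > /dev/null && pandoc -f html -t html "
                 (shell-quote-argument (concat file "-bib.html")))))))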
#+end_src

An example:

#+begin_src org
  ,#+HTML_HEAD: <style> dd { text-indent: -2em; margin-left: 2em; } </style>
  ,#+HTML_HEAD: <style> .rm-lmri-12{ font-style:italic;} </style>
  ,* References
  ,#+begin_src emacs-lisp :exports results :results html
  (my-biblio-html "preamble.tex" "file.bib")
  ,#+end_src
#+end_src

As you can see, the method is somewhat tricky, but it works well for now.
I hope it is useful!

Regards,

Juan Manuel



* Re: [Tip] Export a bibliography to HTML with bibLaTeX and make4ht
  2021-01-23 11:03 [Tip] Export a bibliography to HTML with bibLaTeX and make4ht Juan Manuel Macías
@ 2021-01-24 11:37 ` Gustavo Barros
  2021-01-24 13:00   ` Gustavo Barros
  0 siblings, 1 reply; 7+ messages in thread
From: Gustavo Barros @ 2021-01-24 11:37 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: orgmode


Hi Juan,

that's very interesting.  Thanks for sharing.

On Sat, 23 Jan 2021 at 12:03, Juan Manuel Macías <maciaschain@posteo.net> wrote:

> When I export to LaTeX an Org document that contains a bibliography, I
> use bibLaTeX with a very custom style (i.e. quite a few lines of code
> related to bibLaTeX in the preamble). I wanted to apply all that
> bibLaTeX setting and styles when exporting to HTML too, so I came up
> with this method, using make4ht. I share it here, in case it is useful
> to someone.
>
> The idea is to compile with make4ht (see:
> https://www.ctan.org/pkg/make4ht) a simple file with *only* the
> bibliography, and "embed" the HTML output in the Org document. You need
> to create in the working directory a tex file, which will serve as a
> minimal preamble and which also includes all code related to bibLaTeX.
> We can name it preamble.tex, and it would start like this:

Indeed, when one actually needs biblatex and biber to process a
bibliography, using Org is really hard.  I have some history with this
problem, as I initially approached Emacs (once upon a time) trying to
use Org as a single source for multiple outputs (mainly pdf and odt).
However, like you, I rely on heavily customized styles, which simply
won't work with pandoc/CSL, so I got stuck.  I eventually stayed in
Emacs and use Org for a number of things, but for my more formal writing
I use AUCTeX + RefTeX, which is great too (alas, no odt..., at least not
easily).

For a long time I fancied trying something about it, pretty much along
the same lines as you are doing here.  My idea was to use `preview-latex'
for this, which I still think is promising and, as far as I understand,
pretty much automates what you are doing, which is to generate a
stripped document, with a proper preamble, and run it on a piece of your
actual document.  It is used by AUCTeX and LyX (Org too, I presume) to
generate images, but I don't see why it could not be streamlined to
generate a dvi which could then be fed to tex4ht and friends, just as
you do.  I thought that this procedure could, in principle, be used
to export to other formats, but also to Org itself, generating either a
second version of the source document with the citations and
bibliography already processed as text (sort of an
'org-biblatex-citeproc'), or a preview, such as the ones for math.
Depending on how far you are willing to take your setup, this might be
one path.  It should handle two limitations of your procedure, which
are: restricting the bibliography to the entries actually cited in the
document, and citation callouts.  The first one is easy to handle in your
current approach by means of any of the multiple alternatives to
generate a bib file with only the cited entries.  The second one is much
harder, as far as I can see.
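
One such alternative, as a minimal sketch only: collect the keys used in
citation commands and keep just the matching entries of the bib file.
The names `cited_keys' and `filter_bib' are hypothetical, and the
parsing is deliberately naive (it assumes every entry starts with `@' at
the beginning of a line and that citations look like LaTeX \cite-style
commands), so a real bib file may well need a proper parser.

```python
import re

# Matches \cite{...}, \parencite[12]{...}, \textcite*{...} and similar.
CITE_RE = re.compile(r"\\[A-Za-z]*cite[A-Za-z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def cited_keys(text):
    """Return the citation keys found in text, in order of first use."""
    keys = []
    for group in CITE_RE.findall(text):
        for key in group.split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


def split_entries(bib_text):
    """Split bib source into (key, entry_text) pairs.

    Naive assumption: each entry begins with '@' at the start of a line.
    """
    entries = []
    for chunk in re.split(r"^(?=@)", bib_text, flags=re.MULTILINE):
        match = re.match(r"@\w+\s*\{\s*([^,\s]+)\s*,", chunk)
        if match:
            entries.append((match.group(1), chunk.strip()))
    return entries


def filter_bib(bib_text, keys):
    """Return bib source containing only the entries whose key is in keys."""
    wanted = set(keys)
    kept = [text for key, text in split_entries(bib_text) if key in wanted]
    return "\n\n".join(kept) + "\n"
```

The filtered text could then be written to a separate bib file and
passed to the bibliography-only compilation in place of the full file.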

To my dismay, my own style customizations for biblatex are mainly aimed
at citations (primary/archival sources for Economic History).  But it
was quite interesting to see your approach here.  So, again, thank you.

Best,
Gustavo.



* Re: [Tip] Export a bibliography to HTML with bibLaTeX and make4ht
  2021-01-24 11:37 ` Gustavo Barros
@ 2021-01-24 13:00   ` Gustavo Barros
  2021-01-24 19:20     ` Juan Manuel Macías
  0 siblings, 1 reply; 7+ messages in thread
From: Gustavo Barros @ 2021-01-24 13:00 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: orgmode


On Sun, 24 Jan 2021 at 08:37, Gustavo Barros <gusbrs.2016@gmail.com> 
wrote:

> It should handle two limitations of your procedure, which
> are: getting the bibliography with the entries actually cited in the
> document and citation callouts.  The first one is easy to handle in 
> your
> current approach by means of any of the multiple alternatives to
> generate a bib file with only the cited entries.  The second one, much
> harder, as far as I can see.

Thinking this through: there is actually a third challenge to the
approach, which is ensuring that the citation callouts and the
bibliography agree with each other.  For example, if using a numeric or
alpha style, how can we be sure the labels are the same in the citation
and in the bibliography?  Even in other styles, such as author-year, if
disambiguation rules come into play (e.g. (Smith 1987a, Smith 1987b)),
how can we be sure the same rules are being applied by pandoc/CSL (to
the citations) and by biblatex (to the bibliography)?  As far as I can
tell, this will hinge on sorting, at which biblatex is known to be more
capable than other tools, so I would expect differences (at least
potentially).  Styles such as verbose or author-title would probably be
safe, I guess.  Have you given some thought to this?  If so, how are
you handling the case?
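
To make the concern concrete, here is a toy model, not the actual
algorithm of biblatex or of any CSL processor: entries sharing author
and year get a letter suffix in the order in which they are sorted, so
two tools that sort differently hand out different labels.  The function
`author_year_labels' and the sample entries are made up for
illustration.

```python
from collections import defaultdict
from string import ascii_lowercase


def author_year_labels(entries, sort_key):
    """Assign 'Author Year' labels, adding a/b/c suffixes to duplicates.

    The suffix order follows sort_key, which is the whole point of the
    example.  Toy limit: at most 26 entries per author/year pair.
    """
    groups = defaultdict(list)
    for entry in sorted(entries, key=sort_key):
        groups[(entry["author"], entry["year"])].append(entry)
    labels = {}
    for (author, year), members in groups.items():
        for index, entry in enumerate(members):
            suffix = ascii_lowercase[index] if len(members) > 1 else ""
            labels[entry["key"]] = f"{author} {year}{suffix}"
    return labels


entries = [
    {"key": "smith-trade", "author": "Smith", "year": 1987, "title": "Trade and Empire"},
    {"key": "smith-credit", "author": "Smith", "year": 1987, "title": "Credit Networks"},
    {"key": "jones", "author": "Jones", "year": 2001, "title": "Ports"},
]

# One tool sorts by title, the other keeps the order of the input file.
by_title = author_year_labels(entries, sort_key=lambda e: e["title"])
by_input = author_year_labels(entries, sort_key=lambda e: entries.index(e))
```

With this data, `by_title' labels the "Credit Networks" entry as Smith
1987a, while `by_input' gives that label to "Trade and Empire", so a
callout produced by one tool would point at the wrong bibliography item
produced by the other.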

Best,
Gustavo.



* Re: [Tip] Export a bibliography to HTML with bibLaTeX and make4ht
  2021-01-24 13:00   ` Gustavo Barros
@ 2021-01-24 19:20     ` Juan Manuel Macías
  2021-01-24 22:44       ` Gustavo Barros
  0 siblings, 1 reply; 7+ messages in thread
From: Juan Manuel Macías @ 2021-01-24 19:20 UTC (permalink / raw)
  To: Gustavo Barros; +Cc: orgmode

Hi Gustavo,

Thank you for your interesting comments.

Gustavo Barros <gusbrs.2016@gmail.com> writes:

> On Sun, 24 Jan 2021 at 08:37, Gustavo Barros <gusbrs.2016@gmail.com>
> wrote:
>
>> It should handle two limitations of your procedure, which
>> are: getting the bibliography with the entries actually cited in the
>> document and citation callouts.  The first one is easy to handle in
>> your
>> current approach by means of any of the multiple alternatives to
>> generate a bib file with only the cited entries.  The second one, much
>> harder, as far as I can see.
>
> Thinking this through: there is actually a third challenge to the
> approach, which is ensuring the relation of the citation callouts and 
> the bibliography is correct.  For example, if using a numeric or alpha
> style, how to be sure the labels are the same in the citation and the 
> bibliography.  Even in other styles, such as author-year, if
> disambiguation rules come into play (e.g. (Smith 1987a, Smith 1987b)), 
> how to be sure the same rules are being applied by pandoc/CSL (on the
> citations) and biblatex (in the bibliography).  As far as I can tell, 
> this will hang on sorting, something which biblatex is known to be
> more capable than other tools, so that I would expect differences (at
> least potentially).  Styles such as verbose or author-title would
> probably be safe, I guess.  Have you given some thought about this?
> If so, how are you handling the case?
>

I agree with your comments here and in your previous message. In
fact, I'm afraid this (humble) approach of mine is focused only on
creating a plain list of references in HTML from a bib file, keeping the
same bibliography styles that I have customized in bibLaTeX. It does not
address citations throughout the text or the consistency between
citations and bibliographies. I would say that my method is not a good
starting point for a general solution. The essential problem, of course,
is that our customization is LaTeXcentric: it resides in LaTeX/bibLaTeX
and not in Org.

In my case, anyway, I had been using the TeX ecosystem almost
exclusively for my work in typesetting and editorial design (I do not
use DTP software, which is not intended to create books but magazines
and newspapers), and Org Mode for writing and notes. But in recent years
I have come to realize that a workflow based also on Org and Org-Publish
is tremendously productive for me to manage the typesetting of a book,
especially a complex book. Let's say now I also use Org as a high-level
interface for LaTeX. I'm currently working on the /Hispanic Dictionary
of Classical Tradition/ (/Diccionario Hispánico de la Tradición
Clásica/), a volume of multiple authorship and about 1200 pages. The
method I raised in this thread has to do with this scenario, where each
dictionary entry is accompanied by a bibliography. As the dictionary
will have an online secondary version, I wanted to keep the same
bibliography style that I had defined for bibLaTeX. I have not had the
problem of the citations here, since the entries do not contain
citations (bibliographies only). Otherwise, I think a stopgap
solution could be to export from Org to *.tex and then generate the
HTML from there using make4ht and another /ad hoc/ preamble, which is
better than using a mixed CSL/bibLaTeX method which, as you say, can
result in many inconsistencies.

Long ago I tended to be more in favor of the idea that a single
source-text should produce multiple identical or interchangeable
formats. I really still believe it with enthusiasm and I have not
completely lost faith in such a utopia ;-) But nuances are necessary and
it must be accepted that each format has its idiosyncrasies and
limitations. For example, TeX and what TeX produces is at a level (let's
say) higher than what can be achieved through HTML/CSS, odt, epub... It
is not only a question of typographic refinement or fancy appearance
(typical of TeX), but also (in my opinion) of the book typography itself as a
form of expression. The other formats will often lag behind TeX, and
this must be taken into account when exporting, pros and cons, etc. On
the other hand, bibLaTeX is powerful and highly customizable, but sadly
depends on LaTeX...

Regards,

Juan Manuel 


> Best,
> Gustavo.
>




* Re: [Tip] Export a bibliography to HTML with bibLaTeX and make4ht
  2021-01-24 19:20     ` Juan Manuel Macías
@ 2021-01-24 22:44       ` Gustavo Barros
  2021-01-25 17:46         ` Juan Manuel Macías
  0 siblings, 1 reply; 7+ messages in thread
From: Gustavo Barros @ 2021-01-24 22:44 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: orgmode

Hi Juan,

On Sun, 24 Jan 2021 at 16:20, Juan Manuel Macías 
<maciaschain@posteo.net> wrote:

> I agree with what you comment here and in your previous message. In
> fact, I'm afraid this (humble) approach of mine is focused only on
> creating a mere list of references in HTML from a bib file, keeping 
> the
> same bibliography styles that I have customized in bibLaTeX, but not 
> on
> everything related to citations throughout the text and on the
> consistency between citations and bibliographies. I would say that my
> method is not a good starting point to implement a solution. [...]
>
> In my case, anyway, I had been using the TeX ecosystem almost
> exclusively for my work in typesetting and editorial design (I do not
> use DTP software, which is not intended to create books but magazines
> and newspapers), and Org Mode for writing and notes. But in recent 
> years
> I have come to realize that a workflow based also on Org and 
> Org-Publish
> is tremendously productive for me to manage the typesetting of a book,
> especially a complex book. Let's say now I also use Org as a 
> high-level
> interface for LaTeX. I'm currently working on the /Hispanic Dictionary
> of Classical Tradition/ (/Diccionario Hispánico de la Tradición
> Clásica/), a volume of multiple authorship and about 1200 pages. The
> method I raised in this thread has to do with this scenario, where 
> each
> dictionary entry is accompanied by a bibliography. As the dictionary
> will have an online secondary version, I wanted to keep the same
> bibliography style that I had defined for bibLaTeX. I have not had the
> problem of the citations here, since the entries do not contain
> citations (bibliographies only). Otherwise, I think an emergency
> solution could be to export from Org to *.tex, and then generate the
> HTML from there using make4ht and another preamble /ad hoc/, better 
> than
> using a mixed csl/bibLaTeX method which, as you say, can result in 
> many
> inconsistencies.

Well, I think your approach should work quite well for your use case, 
and certainly a number of others. It is just a matter of being aware of 
the limitations of the tool. That given, it is great. Of course, I was 
also curious how you had figured things from a more general perspective.

> The
> essential problem, of course, is that our customization is 
> LaTeXcentric:
> it resides in LaTeX/bibLaTeX and not in Org. [...]
>

I think it is more than just being "LaTeXcentric".  Depending on 
requirements, there is really no choice.  We don't hear this often, but 
the fact is that Org does not support citation and bibliography by 
itself.  A lot of things "work", and in many requirements scenarios that 
seems to be enough, but what does work relies on outsourcing that task 
to other tools.  As far as I know, there are only two ways out of an Org 
document with citation and bibliography: LaTeX (and its related tools: 
bibtex, biblatex, biber, etc), and pandoc (which uses CSL to process 
these features).  The first option is extremely featureful, but
restricts us to .pdf output.  The only sufficiently general option with
multiple output formats is then pandoc, which in turn bypasses the whole
Org export infrastructure, implying its own trade-offs because of that.  Besides,
there is no real link between the LaTeX infrastructure and pandoc/CSL, 
so that if you want to reach "best results in LaTeX, and acceptable 
results in other formats", you are bound to live with differences in 
output for citation/references across formats and to remain under the 
restrictions of the least featureful backend.

> Long ago I tended to be more in favor of the idea that a single
> source-text should produce multiple identical or interchangeable
> formats. I really still believe it with enthusiasm and I have not
> completely lost faith in such a utopia ;-)

I'd also love to see that. ;-)

And I do think Org is, by far, the tool best placed to fill this role.
But I also think citations and bibliography are a big bottleneck in that
regard.  Of course, there is a long ongoing effort in that area, in the
`wip-cite' branch, and the related `org-citeproc' package.  I still hope
this will get merged in the not too distant future, as it would
change things in that regard.  Not in the sense of "magically solving
all of these problems", but in providing an agreed base upon which
people can then invest their time and effort, and try to figure each
case out, with time.

Best regards,
Gustavo.



* Re: [Tip] Export a bibliography to HTML with bibLaTeX and make4ht
  2021-01-24 22:44       ` Gustavo Barros
@ 2021-01-25 17:46         ` Juan Manuel Macías
  2021-01-25 18:30           ` Gustavo Barros
  0 siblings, 1 reply; 7+ messages in thread
From: Juan Manuel Macías @ 2021-01-25 17:46 UTC (permalink / raw)
  To: Gustavo Barros; +Cc: orgmode

Hi Gustavo,

Gustavo Barros <gusbrs.2016@gmail.com> writes:

> I'd also would love to see that. ;-)
>
> And I do think Org is, by far, the best placed tool to fill this
> place. But I also think citations and bibliography are a big
> bottleneck in that regard.  Of course, there is a long ongoing effort
> in that area, in the `wip-cite' branch, and the related `org-citeproc'
> package.  I'm still in the hope this will get merged in future not too
> distant, as it would change things in that regard.  Not in the sense
> of "magically solving all of these problems", but in providing a
> convened base upon which people can than invest their time and effort,
> and try to figure each case out, with time.

I totally agree.

By the way... I have written some code to export the citations using
make4ht. It's just a proof of concept, and not too elegant I'm afraid.
But I wanted to explore a bit more the use of make4ht in this context.

The idea is to write the citations in Org as mere bibLaTeX commands, but
between !!- ... -!! (a provisional regexp, for convenience, and to see
if it works). It can be tested in this Org file, which includes the code
(you have to give a value to the variables `bib' and `preamble'):

https://gitlab.com/-/snippets/2066135
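
Roughly, the marker convention can be pictured like this. It is not the
code from the snippet linked above, only an illustrative sketch: the
function name `marked_commands' and the exact regexp are assumptions.

```python
import re

# A bibLaTeX command wrapped in the provisional !!- ... -!! markers.
MARKER_RE = re.compile(r"!!-\s*(.*?)\s*-!!")


def marked_commands(org_text):
    """Return the bibLaTeX commands found between the markers, in order."""
    return MARKER_RE.findall(org_text)


sample = (
    "As argued !!- \\parencite[12]{smith1987} -!!, "
    "and later by !!- \\textcite{jones2001} -!!."
)
commands = marked_commands(sample)
```

Each extracted command could then go into a small wrapper file, be
compiled with make4ht in the same way as the bibliography, and have its
HTML output substituted for the marked text.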

Best regards,

Juan Manuel



* Re: [Tip] Export a bibliography to HTML with bibLaTeX and make4ht
  2021-01-25 17:46         ` Juan Manuel Macías
@ 2021-01-25 18:30           ` Gustavo Barros
  0 siblings, 0 replies; 7+ messages in thread
From: Gustavo Barros @ 2021-01-25 18:30 UTC (permalink / raw)
  To: Juan Manuel Macías; +Cc: orgmode

Hi Juan,

On Mon, 25 Jan 2021 at 14:46, Juan Manuel Macías 
<maciaschain@posteo.net> wrote:
>
> By the way... I have written some code to export the citations using
> make4ht. It's just a proof of concept, and not too elegant I'm afraid.
> But I wanted to explore a bit more the use of make4ht in this context.
>

Nice! I also think make4ht has potential for this 
purpose. tex4ht/make4ht is usually a somewhat delicate tool for a 
general LaTeX document (powerful, but complex), but the typical output 
of citation and bibliography is text with emphasis/bold etc, and perhaps 
a list, if we interpret the bibliography environment strictly. This is 
much simpler (again, typically) than an arbitrary document, to the point 
I believe it could be streamlined reliably for this subset of the 
document.

> The idea is to write the citations in Org as mere bibLaTeX commands, 
> but
> between !!- ... -!! (a provisional regexp, for convenience, and to see
> if it works). It can be tested in this Org file, which includes the 
> code
> (you have to give a value to the variables `bib' and `preamble'):
>
> https://gitlab.com/-/snippets/2066135
>

I understand using the regexp to separate the problems, provisionally,
as you said.  If it evolves, you might wish to go with the current state
of things in the wip-cite branch or, I reiterate the suggestion, look at
preview-latex, which allows you to specify the commands of interest, if
I recall correctly.

I hope you find your way through the approach. If you do, please let me
know. Or, if you wish to discuss a particular issue, feel free to write
to me directly.

Best regards,
Gustavo.




Thread overview: 7+ messages
2021-01-23 11:03 [Tip] Export a bibliography to HTML with bibLaTeX and make4ht Juan Manuel Macías
2021-01-24 11:37 ` Gustavo Barros
2021-01-24 13:00   ` Gustavo Barros
2021-01-24 19:20     ` Juan Manuel Macías
2021-01-24 22:44       ` Gustavo Barros
2021-01-25 17:46         ` Juan Manuel Macías
2021-01-25 18:30           ` Gustavo Barros
