From: Ken Mankoff <mankoff@gmail.com>
To: Emmanuel Charpentier <emm.charpentier@free.fr>
Cc: emacs-orgmode <emacs-orgmode@gnu.org>,
John Kitchin <jkitchin@andrew.cmu.edu>
Subject: Re: Slight problems with links
Date: Tue, 30 Apr 2019 04:43:14 -0200
Message-ID: <871s1kouwt.fsf@geus3064linuxwsm.geus.dk>
In-Reply-To: <450323f31e177a8a66aea436c688ec066caf4f26.camel@free.fr>
Hi Emmanuel,
Over the years I have looked into PDF and DOCX export repeatedly, with different versions of Org, Pandoc, and Google Docs, and I keep finding that the best method is Org -> LaTeX, then LaTeX -> DOCX with Pandoc. I use this Babel block to do it:
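#+BEGIN_SRC sh :results verbatim :var fn=(file-name-sans-extension (buffer-name))
# Extract only the references cited in this document (listed in the .aux file).
bibexport -o "${fn}.bib" "${fn}.aux"
# Convert the exported LaTeX to DOCX, resolving citations against that .bib file.
pandoc -f latex -t docx -o "${fn}.docx" --bibliography "./${fn}.bib" "${fn}.tex"
#+END_SRC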
The bibexport line may not be needed: it extracts the references used in this document from my single massive (3500-entry) bib file. The advantage of this Org -> LaTeX -> Pandoc -> DOCX method is that if you need to patch something for Pandoc (e.g. to fix your code captions), you can insert a NoWeb <<fix-tex-for-pandoc>> block before the pandoc line that generates a second, corrected LaTeX file.
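A minimal sketch of what such a patch step could look like, assuming a plain sed substitution is enough. The function name fix_tex_for_pandoc, the "-pandoc.tex" suffix, and the listing -> figure rename are all hypothetical placeholders, not something Pandoc is known to require; swap in whatever rewrite your document actually needs.

```sh
#!/bin/sh
# Hypothetical patch step: write a corrected copy of the exported LaTeX
# file and leave the original untouched.
fix_tex_for_pandoc() {
    src="$1"   # LaTeX file exported by Org
    dst="$2"   # corrected copy to hand to pandoc
    # Placeholder substitution (an assumption, for illustration only):
    # rename one environment before pandoc reads the file.
    sed -e 's/\\begin{listing}/\\begin{figure}/g' \
        -e 's/\\end{listing}/\\end{figure}/g' "$src" > "$dst"
}

# Intended use inside the Babel block, before the pandoc line:
#   fix_tex_for_pandoc "${fn}.tex" "${fn}-pandoc.tex"
# and then point pandoc at "${fn}-pandoc.tex" instead of "${fn}.tex".
```

Writing to a second file rather than editing in place keeps the LaTeX that produces the PDF unchanged, so the patch only affects the DOCX route.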
Note that I don't use org-ref, just the standard Org referencing code, but I don't see why this would change anything, if you're able to generate PDFs that meet your requirements.
-k.
On 2019-04-29 at 19:48 -0200, Emmanuel Charpentier <emm.charpentier@free.fr> wrote...
> Dear John,
> Indeed, I missed your point. I'll have to get back to you after
> reading, understanding the code (org-mode is a tall order...) and
> *thinking*.
> However, the troubling fact that ox-latex manages to export org's
> labelling correctly shows that its author might be up to something.
> Indeed I just checked that its exported LaTeX can be converted by pandoc
> into a "correct" docx (correct here meaning that my captions are
> correctly labelled and numbered).
> Have to think again...
> --Emmanuel Charpentier
> On Monday 29 April 2019 at 17:23 -0400, John Kitchin wrote:
>> I think you have missed the main point. My point was first to find
>> some format that pandoc faithfully converts to docx with all the
>> features you need, and then we can figure out how to turn org-ref/org
>> into that format. So, if you can write a LaTeX document that is
>> correctly converted to docx (correct bibliography, figure labels, and
>> cross-references, correct code, etc), then we can probably get org to
>> output the right latex. But if LaTeX isn't converted to docx
>> correctly in pandoc, it does not seem likely that org will either
>> with any simple exporter.
>> John
>>
>> -----------------------------------
>> Professor John Kitchin
>> Doherty Hall A207F
>> Department of Chemical Engineering
>> Carnegie Mellon University
>> Pittsburgh, PA 15213
>> 412-268-7803
>> @johnkitchin
>> http://kitchingroup.cheme.cmu.edu
>>
>>
>>
>> On Mon, Apr 29, 2019 at 5:19 PM Emmanuel Charpentier <
>> emm.charpentier@free.fr> wrote:
>> > Dear John,
>> > On Monday 29 April 2019 at 16:57 -0400, John Kitchin wrote:
>> > > For org-ref, there isn't much magic on what happens on export.
>> > > LaTeX is certainly the most well supported, and it seems like org
>> > > -> latex -> pandoc is the only way that makes sense to get to
>> > > docx to me. Using pandoc on org files directly is probably
>> > > hopeless unless you can get pandoc to include some definitions
>> > > for the org-ref links.
>> >
>> > This might be difficult: the development of ox-pandoc does not seem
>> > to be very active at the moment...
>> > > Some of the link types in org-ref have some exports defined for
>> > > org, html, latex, sometimes ascii. If one of these works well
>> > > with pandoc we could try to make them output something useful for
>> > > them, or at least make sure that org->org export turns them into
>> > > something useful.
>> >
>> > I'm currently looking at the ox-latex exporter in order to
>> > understand what it does for source blocks with org's names and
>> > captions (and try to fix the fact that they are labeled and numbered
>> > as figures...). This understanding might help me to go in the
>> > direction you suggest.
>> > > Getting figure/table numbers has always been tricky; I don't
>> > > think this worked well with pandoc, and handling it on the org
>> > > side requires some preprocessing to add numbers. For now, the
>> > > ox-word exporter in scimax comes closest, but it isn't a feature I
>> > > use a lot, so it hasn't been improved in a while.
>> >
>> > Again, looking at what ox-latex does for org's names and captions
>> > might be helpful. Ox-pandoc seems to do a decent job on docx
>> > output.
>> > > John
>> > >
>> > > -----------------------------------
>> > > Professor John Kitchin
>> > > Doherty Hall A207F
>> > > Department of Chemical Engineering
>> > > Carnegie Mellon University
>> > > Pittsburgh, PA 15213
>> > > 412-268-7803
>> > > @johnkitchin
>> > > http://kitchingroup.cheme.cmu.edu
>> > >
>> > >
>> > >
>> > > On Mon, Apr 29, 2019 at 1:06 PM Emmanuel Charpentier <
>> > > emm.charpentier@free.fr> wrote:
>> > > > Dear list,
>> > > >
>> > > > one of my uses of org-mode is to prepare documents wrapping R (and
>> > > > sometimes Sagemath) call results in interpretation text. My
>> > > > reference output is .pdf documents, but I *have* to prepare a .docx
>> > > > version (for use in managerial spheres, where computer literacy is
>> > > > *very* low). Cross-references and citations are a sine qua non,
>> > > > maths are useful.
>> > > >
>> > > > I have been annoyed by a couple of deficiencies and inconsistencies
>> > > > between exporters, so I prepared a test document testing various
>> > > > cases. This document and some exports are attached (NE = Native
>> > > > exporter, PE = ox-pandoc exporter).
>> > > >
>> > > > TL;DR:
>> > > >
>> > > > * I tested the built-in latex/pdf exporter as well as ox-pandoc,
>> > > > the latter both for .pdf and .docx export. The built-in ODT
>> > > > exporter doesn't export citations; therefore, I didn't test it
>> > > > further.
>> > > >
>> > > > * org-ref's :labels and :refs do not export to anything but the
>> > > > built-in latex exporter. The native system of #+NAME:s and
>> > > > #+CAPTION:s, a bit on the heavy side, seems not to fail (except
>> > > > that they do not expand in a caption...).
>> > > >
>> > > > * Maths, tables, figures are unproblematic.
>> > > >
>> > > > * The requirements of org-reftex, the built-in latex exporter and
>> > > > ox-pandoc are mutually incompatible, and some ingenuity is
>> > > > required; see the attached org source. Org-ref's requirements do
>> > > > not simplify the situation...
>> > > >
>> > > > * Code snippets (i.e. source blocks exporting code) have a
>> > > > captioning/numbering problem:
>> > > >
>> > > > - With the built-in latex exporter, they are numbered and
>> > > > labeled as figures.
>> > > >
>> > > > - The pandoc latex exporter numbers them separately (as seen by
>> > > > referencing them), but does not output this number (nor the
>> > > > category) before the caption.
>> > > >
>> > > > - The pandoc .docx exporter works as advertised.
>> > > >
>> > > > So I have a couple of questions:
>> > > >
>> > > > * What can be done to reconcile org-ref's, latex-exporter's and
>> > > > ox-pandoc's requirements for bibliographies?
>> > > >
>> > > > * How to fix the pdf exporters' quirks with code snippets?
>> > > >
>> > > > HTH,
>> > > >
>> > > > --
>> > > > Emmanuel Charpentier