emacs-orgmode@gnu.org archives
From: John Kitchin <jkitchin@andrew.cmu.edu>
To: Ken Mankoff <mankoff@gmail.com>
Cc: Emmanuel Charpentier <emm.charpentier@free.fr>,
	emacs-orgmode <emacs-orgmode@gnu.org>
Subject: Re: Slight problems with links
Date: Tue, 30 Apr 2019 14:11:54 -0400
Message-ID: <m25zqve51x.fsf@andrew.cmu.edu>
In-Reply-To: <871s1kouwt.fsf@geus3064linuxwsm.geus.dk>

I second the need for the first line. In my past efforts, pandoc would
choke on my large bibtex file, and trimming it down to just the
essential entries was helpful. I would achieve that with M-x
org-ref-extract-bibtex-to-file :). But it is otherwise the same.
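
For a quick check of what the trimmed file should contain, the cited keys can be read straight from the .aux file. This is only a minimal sketch: the function name cited_keys is made up for illustration, and it assumes a classic BibTeX-style .aux file in which LaTeX records each citation as a \citation{key1,key2} line (biblatex/biber setups may record citations differently).

```shell
#!/bin/sh
# cited_keys (hypothetical helper): list the unique citation keys
# recorded in a LaTeX .aux file, one per line.
# Assumption: citations appear as lines of the form \citation{key1,key2}.
cited_keys() {
    # Keep only the text between the braces of \citation{...} lines,
    # split comma-separated keys onto separate lines, drop empty lines,
    # then sort and remove duplicates.
    sed -n 's/^\\citation{\(.*\)}$/\1/p' "$1" \
        | tr ',' '\n' \
        | sed '/^$/d' \
        | sort -u
}
```

Running `cited_keys paper.aux` (paper.aux being a placeholder name) prints one key per line; comparing that list with the entries in the trimmed .bib file shows whether anything is missing.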

Ken Mankoff <mankoff@gmail.com> writes:

> Hi Emmanuel,
>
> I have looked into PDF and DOCX export repeatedly over the years with different versions of Org and Pandoc and Google Docs and keep finding that the best method is Org -> LaTeX, and then LaTeX -> DOCX with Pandoc. I use this babel block to achieve this:
>
> #+BEGIN_SRC sh :results verbatim :var fn=(file-name-sans-extension (buffer-name))
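> # Extract only the entries cited in this document into a small .bib file
> bibexport -o "${fn}.bib" "${fn}.aux"
> # Convert the exported LaTeX to DOCX, passing that file as the bibliography
> pandoc -f latex "${fn}.tex" -t docx -o "${fn}.docx" --bibliography "./${fn}.bib"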
> #+END_SRC
>
> The first line may not be needed. It extracts the references used in this document from my single massive (3500-entry) bib file. The advantage of this Org -> LaTeX -> Pandoc -> DOCX method is that if you need to patch something for Pandoc (e.g. to fix your code captions or something) you could insert a NoWeb <<fix-tex-for-pandoc>> block before the pandoc line that generates a second, corrected, LaTeX file.
>
> Note that I don't use org-ref, just the standard Org referencing code, but I don't see why this would change anything, if you're able to generate PDFs that meet your requirements.
>
>   -k.
>
>
> On 2019-04-29 at 19:48 -0200, Emmanuel Charpentier <emm.charpentier@free.fr> wrote...
>> Dear John,
>> Indeed, I missed your point. I'll have to get back to you after
>> reading, understanding the code (org-mode is a tall order...) and
>> *thinking*.
>> However, the troubling fact that ox-latex manages to export org's
>> labelling correctly shows that its author might be up to something.
>> Indeed I just checked that its exported LaTeX can be converted by pandoc
>> into a "correct" docx (correct here meaning that my captions are
>> correctly labelled and numbered).
>> Have to think again...
>> --Emmanuel Charpentier
>> On Monday 29 April 2019 at 17:23 -0400, John Kitchin wrote:
>>> I think you have missed the main point. My point was first to find
>>> some format that pandoc faithfully converts to docx with all the
>>> features you need, and then we can figure out how to turn org-ref/org
>>> into that format. So, if you can write a LaTeX document that is
>>> correctly converted to docx (correct bibliography, figure labels, and
>>> cross-references, correct code, etc), then we can probably get org to
>>> output the right latex. But if LaTeX isn't converted to docx
>>> correctly in pandoc, it does not seem likely that org will either
>>> with any simple exporter.
>>> John
>>>
>>> -----------------------------------
>>> Professor John Kitchin
>>> Doherty Hall A207F
>>> Department of Chemical Engineering
>>> Carnegie Mellon University
>>> Pittsburgh, PA 15213
>>> 412-268-7803
>>> @johnkitchin
>>> http://kitchingroup.cheme.cmu.edu
>>>
>>>
>>>
>>> On Mon, Apr 29, 2019 at 5:19 PM Emmanuel Charpentier <
>>> emm.charpentier@free.fr> wrote:
>>> > Dear John,
>>> > On Monday 29 April 2019 at 16:57 -0400, John Kitchin wrote:
>>> > > For org-ref, there isn't much magic on what happens on export.
>>> > > LaTeX is certainly the most well supported, and it seems like org
>>> > > -> latex -> pandoc is the only way that makes sense to get to
>>> > > docx to me.  Using pandoc on org files directly is probably
>>> > > hopeless unless you can get pandoc to include some definitions
>>> > > for the org-ref links.
>>> >
>>> > This might be difficult : the development of ox-pandoc seems to not
>>> > be very active at the moment...
>>> > > Some of the link types in org-ref have some exports defined for
>>> > > org, html, latex, sometimes ascii. If one of these works well
>>> > > with pandoc we could try to make them output something useful for
>>> > > them, or at least make sure that org->org export turns them into
>>> > > something useful.
>>> >
>>> > I'm currently looking at the ox-latex exporter in order to
>>> > understand what it does for source blocks with org's names and
>>> > captions (and try to fix the fact that they are labeled and numbered
>>> > as figures...). This understanding might help me to go in the
>>> > direction you suggest.
>>> > > Getting figure/table numbers has always been tricky; I don't
>>> > > think this worked well with pandoc, and handling it on the org
>>> > > side requires some preprocessing to add numbers. For now, the
>>> > > ox-word exporter in scimax comes closest, but it isn't a feature
>>> > > I use a lot, so it hasn't been improved in a while.
>>> >
>>> > Again, looking at what ox-latex does for org's names and captions
>>> > might be helpful. Ox-pandoc seems to do a decent job on docx
>>> > output.
>>> > > John
>>> > > On Mon, Apr 29, 2019 at 1:06 PM Emmanuel Charpentier <
>>> > > emm.charpentier@free.fr> wrote:
>>> > > > Dear list,
>>> > > >
>>> > > > one of my uses of org-mode is to prepare documents wrapping R (and
>>> > > > sometimes Sagemath) call results in interpretation text. My reference
>>> > > > output is .pdf documents, but I *have* to prepare a .docx version (for
>>> > > > use in managerial spheres, where computer literacy is *very* low).
>>> > > > Cross-references and citations are a sine qua non, maths are useful.
>>> > > >
>>> > > > I have been annoyed by a couple of deficiencies and inconsistencies
>>> > > > between exporters, so I prepared a test document testing various cases.
>>> > > > This document and some exports are attached (NE = Native exporter, PE
>>> > > > = ox-pandoc exporter).
>>> > > >
>>> > > > TL;DR :
>>> > > >
>>> > > >   * I tested the built-in latex/pdf exporter as well as ox-pandoc, the
>>> > > >     latter both for .pdf and .docx export. The built-in ODT exporter
>>> > > >     doesn't export citations; therefore, I didn't test it further.
>>> > > >
>>> > > >   * org-ref's :labels and :refs do not export to anything but the
>>> > > >     built-in latex exporter. The native system of #+NAME:s and
>>> > > >     #+CAPTION:s, a bit on the heavy side, seems not to fail (except
>>> > > >     that they do not expand in a caption...).
>>> > > >
>>> > > >   * Maths, tables, figures are unproblematic.
>>> > > >
>>> > > >   * The requirements of org-reftex, the built-in latex exporter and
>>> > > >     ox-pandoc are mutually incompatible, and some ingenuity is
>>> > > >     required; see the attached org source. Org-ref's requirements do
>>> > > >     not simplify the situation...
>>> > > >
>>> > > >   * Code snippets (i.e. source blocks exporting code) have a
>>> > > >     captioning/numbering problem:
>>> > > >
>>> > > >     - With the built-in latex exporter, they are numbered and labeled
>>> > > >       as figures.
>>> > > >
>>> > > >     - The pandoc latex exporter numbers them separately (as seen by
>>> > > >       referencing them), but does not output this number (nor the
>>> > > >       category) before the caption.
>>> > > >
>>> > > >     - The pandoc .docx exporter works as advertised.
>>> > > >
>>> > > > So I have a couple of questions:
>>> > > >
>>> > > >   * What can be done to reconcile org-ref's, latex-exporter's and
>>> > > >     ox-pandoc's requirements for bibliographies?
>>> > > >
>>> > > >   * How to fix the pdf exporters' quirks with code snippets?
>>> > > >
>>> > > > HTH,
>>> > > >
>>> > > > --
>>> > > > Emmanuel Charpentier
>>> > > >



Thread overview: 9+ messages
2019-04-29  6:33 Slight problems with links Emmanuel Charpentier
2019-04-29 20:57 ` John Kitchin
2019-04-29 21:18   ` Emmanuel Charpentier
2019-04-29 21:23     ` John Kitchin
2019-04-29 21:48       ` Emmanuel Charpentier
2019-04-30  6:43         ` Ken Mankoff
2019-04-30  7:45           ` Emmanuel Charpentier
2019-04-30 18:11           ` John Kitchin [this message]
  -- strict thread matches above, loose matches on Subject: below --
2019-04-29  7:35 Emmanuel Charpentier
