From: John Kitchin
Subject: Re: Slight problems with links
Date: Tue, 30 Apr 2019 14:11:54 -0400
To: Ken Mankoff
Cc: Emmanuel Charpentier, emacs-orgmode
List-Id: "General discussions about Org-mode."
References: <602b6645ab39fabcceb851b8e4f12a15b0c04c20.camel@free.fr> <450323f31e177a8a66aea436c688ec066caf4f26.camel@free.fr> <871s1kouwt.fsf@geus3064linuxwsm.geus.dk>
In-reply-to: <871s1kouwt.fsf@geus3064linuxwsm.geus.dk>

I second the need for the first line: in my past efforts pandoc would
choke on my large bibtex file, and trimming it down to just the
essential entries was helpful. I would achieve that with
M-x org-ref-extract-bibtex-to-file :). But it is otherwise the same.

Ken Mankoff writes:

> Hi Emmanuel,
>
> I have looked into PDF and DOCX export repeatedly over the years with
> different versions of Org and Pandoc and Google Docs and keep finding
> that the best method is Org -> LaTeX, and then LaTeX -> DOCX with Pandoc.
> I use this babel block to achieve this:
>
> #+BEGIN_SRC sh :results verbatim :var fn=(file-name-sans-extension (buffer-name))
> bibexport -o ${fn}.bib ${fn}.aux
> pandoc -f latex -i ${fn}.tex -t DOCX -o ${fn}.docx --bibliography ./${fn}.bib
> #+END_SRC
>
> The first line may not be needed. That extracts the references used in
> this document from my single massive (3500 element) bib file. The
> advantage of this Org -> LaTeX -> Pandoc -> DOCX method is that if you
> need to patch something for Pandoc (e.g. to fix your code captions or
> something) you could insert a NoWeb <> block before the pandoc line
> that generates a second, corrected, LaTeX file.
>
> Note that I don't use org-ref, just the standard Org referencing code,
> but I don't see why this would change anything, if you're able to
> generate PDFs that meet your requirements.
>
> -k.
>
>
> On 2019-04-29 at 19:48 -0200, Emmanuel Charpentier wrote...
>> Dear John,
>> Indeed, I missed your point. I'll have to get back to you after
>> reading, understanding the code (org-mode is a tall order...) and
>> *thinking*.
>> However, the troubling fact that ox-latex manages to export org's
>> labelling correctly shows that its author might be up to something.
>> Indeed I just checked that its exported docx can be converted by pandoc
>> into a "correct" docx (correct here meaning that my captions are
>> correctly labelled and numbered).
>> Have to think again...
>> --Emmanuel Charpentier
>> On Monday, 29 April 2019 at 17:23 -0400, John Kitchin wrote:
>>> I think you have missed the main point. My point was first to find
>>> some format that pandoc faithfully converts to docx with all the
>>> features you need, and then we can figure out how to turn org-ref/org
>>> into that format. So, if you can write a LaTeX document that is
>>> correctly converted to docx (correct bibliography, figure labels, and
>>> cross-references, correct code, etc), then we can probably get org to
>>> output the right latex.
>>> But if LaTeX isn't converted to docx
>>> correctly in pandoc, it does not seem likely that org will either
>>> with any simple exporter.
>>> John
>>>
>>> -----------------------------------
>>> Professor John Kitchin
>>> Doherty Hall A207F
>>> Department of Chemical Engineering
>>> Carnegie Mellon University
>>> Pittsburgh, PA 15213
>>> 412-268-7803
>>> @johnkitchin
>>> http://kitchingroup.cheme.cmu.edu
>>>
>>>
>>>
>>> On Mon, Apr 29, 2019 at 5:19 PM Emmanuel Charpentier <emm.charpentier@free.fr> wrote:
>>> > Dear John,
>>> > On Monday, 29 April 2019 at 16:57 -0400, John Kitchin wrote:
>>> > > For org-ref, there isn't much magic on what happens on export.
>>> > > LaTeX is certainly the most well supported, and it seems like org
>>> > > -> latex -> pandoc is the only way that makes sense to get to
>>> > > docx to me. Using pandoc on org files directly is probably
>>> > > hopeless unless you can get pandoc to include some definitions
>>> > > for the org-ref links.
>>> >
>>> > This might be difficult: the development of ox-pandoc seems to not
>>> > be very active at the moment...
>>> > > Some of the link types in org-ref have some exports defined for
>>> > > org, html, latex, sometimes ascii. If one of these works well
>>> > > with pandoc we could try to make them output something useful for
>>> > > them, or at least make sure that org->org export turns them into
>>> > > something useful.
>>> >
>>> > I'm currently looking at the ox-latex exporter in order to
>>> > understand what it does for source blocks with org's names and
>>> > captions (and try to fix the fact that they are labeled and numbered
>>> > as figures...). This understanding might help me to go in the
>>> > direction you suggest.
>>> > > Getting figure/table numbers has always been tricky; I don't
>>> > > think this worked well with pandoc, and handling it on the org
>>> > > side requires some preprocessing to add numbers.
>>> > > For now, the ox-word exporter in scimax comes closest, but it
>>> > > isn't a feature I use a lot, so it hasn't been improved in a
>>> > > while.
>>> >
>>> > Again, looking at what ox-latex does for org's names and captions
>>> > might be helpful. Ox-pandoc seems to do a decent job on docx
>>> > output.
>>> > > John
>>> > >
>>> > > -----------------------------------
>>> > > Professor John Kitchin
>>> > > Doherty Hall A207F
>>> > > Department of Chemical Engineering
>>> > > Carnegie Mellon University
>>> > > Pittsburgh, PA 15213
>>> > > 412-268-7803
>>> > > @johnkitchin
>>> > > http://kitchingroup.cheme.cmu.edu
>>> > >
>>> > >
>>> > >
>>> > > On Mon, Apr 29, 2019 at 1:06 PM Emmanuel Charpentier <emm.charpentier@free.fr> wrote:
>>> > > > Dear list,
>>> > > >
>>> > > > one of my uses of org-mode is to prepare documents wrapping R (and
>>> > > > sometimes Sagemath) call results in interpretation text. My reference
>>> > > > output is .pdf documents, but I *have* to prepare a .docx version (for
>>> > > > use in managerial spheres, where computer literacy is *very* low).
>>> > > > Cross-references and citations are a sine qua non, maths are useful.
>>> > > >
>>> > > > I have been annoyed by a couple of deficiencies and inconsistencies
>>> > > > between exporters, so I prepared a test document testing various cases.
>>> > > > This document and some exports are attached (NE = Native exporter, PE
>>> > > > = ox-pandoc exporter).
>>> > > >
>>> > > > TL;DR:
>>> > > >
>>> > > > * I tested the built-in latex/pdf exporter as well as ox-pandoc, the
>>> > > > latter both for .pdf and .docx export. The built-in ODT exporter
>>> > > > doesn't export citations; therefore, I didn't test it further.
>>> > > >
>>> > > > * org-ref's :labels and :refs do not export to anything but the
>>> > > > built-in latex exporter. The native system of #+NAME:s and #+CAPTION:s,
>>> > > > a bit on the heavy side, seems not to fail (except that they do not
>>> > > > expand in a caption...).
>>> > > >
>>> > > > * Maths, tables, figures are unproblematic.
>>> > > >
>>> > > > * The requirements of org-reftex, the built-in latex exporter and
>>> > > > ox-pandoc are mutually incompatible, and some ingenuity is required. See
>>> > > > the attached org source. Org-ref's requirements do not simplify the
>>> > > > situation...
>>> > > >
>>> > > > * Code snippets (i.e. source blocks exporting code) have a
>>> > > > captioning/numbering problem:
>>> > > >
>>> > > >   - With the built-in latex exporter, they are numbered and labeled
>>> > > > as figures.
>>> > > >
>>> > > >   - The pandoc latex exporter numbers them separately (as seen by
>>> > > > referencing them), but does not output this number (nor the category)
>>> > > > before the caption.
>>> > > >
>>> > > >   - The pandoc .docx exporter works as advertised.
>>> > > >
>>> > > > So I have a couple of questions:
>>> > > >
>>> > > > * What can be done to reconcile org-ref's, latex-exporter's and
>>> > > > ox-pandoc's requirements for bibliographies?
>>> > > >
>>> > > > * How to fix the pdf exporters' quirks with code snippets?
>>> > > >
>>> > > > HTH,
>>> > > >
>>> > > > --
>>> > > > Emmanuel Charpentier
>>> > > >


--
Professor John Kitchin
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu