From mboxrd@z Thu Jan 1 00:00:00 1970
From: Scott Otterson
Subject: Re: Multiple underscores crash org latex export; other exporters survive
Date: Mon, 12 Dec 2016 16:18:09 +0100
Message-ID:
References: <49b70a0c-f81b-660b-e2f5-9921ab488d65@gmail.com> <50e77033-c13c-c0be-5d4a-ec5c107e93ae@gmail.com> <87bmwsatox.fsf@nicolasgoaziou.fr> <87mvg8ipmf.fsf@nicolasgoaziou.fr> <084a9c31-e7b1-72af-8d78-9655dc006d00@gmail.com> <87fum0htmk.fsf@nicolasgoaziou.fr> <878trncou3.fsf@nicolasgoaziou.fr> <87bmwhbne5.fsf@nicolasgoaziou.fr>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=001a1145bc90e89a0e0543779ec2
Return-path:
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37776) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cGSOn-0003H7-Ls for emacs-orgmode@gnu.org; Mon, 12 Dec 2016 10:20:19 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cGSOi-0002bf-Jj for emacs-orgmode@gnu.org; Mon, 12 Dec 2016 10:20:17 -0500
Received: from mail-wm0-f52.google.com ([74.125.82.52]:38851) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1cGSOi-0002Ms-8e for emacs-orgmode@gnu.org; Mon, 12 Dec 2016 10:20:12 -0500
Received: by mail-wm0-f52.google.com with SMTP id f82so72889802wmf.1 for ; Mon, 12 Dec 2016 07:19:50 -0800 (PST)
In-Reply-To: <87bmwhbne5.fsf@nicolasgoaziou.fr>
List-Id: "General discussions about Org-mode."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: "Emacs-orgmode"
To: Scott Randby , "Emacs-orgmode@gnu.org"

--001a1145bc90e89a0e0543779ec2
Content-Type: text/plain; charset=UTF-8

Thanks to Nicolas and Scott for your painstaking efforts. At least for me, a fine stopgap measure is to simply avoid Latex crashes for orgmode contents that are not explicitly Latex.
Sometime after that, it would be ideal to produce similar output for all export types, insofar as that's possible.

I thought I'd see what ox-pandoc does. As I'm sure you know, pandoc converts all input formats to a master markup language, and then converts that to whatever output format is desired -- a design that makes output uniformity easier to obtain. Orgmode is already halfway there, since the master markup language is orgmode itself.

Here's what pandoc does in the three cases I've recently posted about:

1.) *Multiple underscores* (the subject of this thread): Pandoc doesn't crash and it exports the same thing for either html or latex: everything after the first underscore is subscripted and all underscores are deleted. I don't love that behavior but it's consistent.

2.) *Plain lists with more than four sublevels*: For html export, pandoc and orgmode do what you'd expect: produce a deeply nested html list. For (Windows) latex export, pandoc and orgmode also do the same thing: crash. Ideally, pandoc would have generated valid Latex for deep list nesting, but at least it's not completely ornery; it snips out the part of the original Latex error message that points to the cause.

Perhaps pandoc latex export wouldn't crash in Linux, just as orgmode latex export doesn't crash in Linux (from Nicolas). This is still a mystery. Nicolas's Linux-produced tex file is essentially the same as the one I got in Windows, and it crashes Windows latexmk just like mine does. *Nicolas*, could it be that you're not running latexmk on your exports?

3.) *Web link with a '#' in the URL*: Pandoc never crashes and it exports nearly the same thing for html or latex pdf: In either case, clicking on the link sends you to the right web page, and the only difference is that, in the output pdf, the link text isn't highlighted; instead there's a tooltip popup.

The reason pandoc latex export doesn't crash but orgmode does (in Windows) is that pandoc escapes the '#'.
In the example I posted last week, orgmode does this:

\section{Some section \href{http://orgmode.org/manual/Column-groups.html#Column-groups}{A random link}}

while pandoc does this:

\section{\texorpdfstring{Some section \href{http://orgmode.org/manual/Column-groups.html\#Column-groups}{A random link}}{Some section A random link}}

I don't understand why the escape prevents Windows crashes but doesn't appear to be needed for Linux. Nevertheless, it looks like pandoc does something special to prevent this crash.

Scott

--001a1145bc90e89a0e0543779ec2
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a1145bc90e89a0e0543779ec2--
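
For anyone who wants to reproduce the '#' case from item 3 outside of Org or pandoc, here is a minimal test file. It is only a sketch: it assumes a LaTeX distribution with the hyperref package installed, and it says nothing about why Windows and Linux behave differently. The heading is the escaped form pandoc emitted in the message above; the second link is an illustrative addition showing the unescaped '#' in ordinary text, where hyperref reads the URL argument itself.

```latex
% Minimal sketch; assumes hyperref is available. Not taken from either exporter's templates.
\documentclass{article}
\usepackage{hyperref}
\begin{document}

% Form pandoc emitted: '#' written as \# inside the heading, with
% \texorpdfstring giving a plain-text title for the PDF bookmarks.
\section{\texorpdfstring{Some section \href{http://orgmode.org/manual/Column-groups.html\#Column-groups}{A random link}}{Some section A random link}}

% In ordinary running text (not inside another command's argument),
% an unescaped '#' in the URL is normally accepted by \href.
See \href{http://orgmode.org/manual/Column-groups.html#Column-groups}{A random link}.

\end{document}
```

To test the failing case, swap the heading for the unescaped \section line that orgmode produced and run the same latexmk command on both platforms.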