From: brian powell
Subject: Re: Org mode, minted, and non-ASCII
Date: Thu, 5 Jan 2012 10:17:09 -0500
In-Reply-To: <87vcoqdeoq.fsf@iro.umontreal.ca>
To: François Pinard
Cc: emacs-orgmode@gnu.org

* Firstly, thanks for sending this issue to the group: pygments & minted
are very interesting tools for Org mode/LaTeX users.

** Read this:
http://ctan.mackichan.com/macros/latex/contrib/minted/minted.pdf

** And this: http://pygments.org/docs/unicode/

*** It seems that running pygmentize from the command line comes with
some provisos.

*** Also, I noticed this in bold type at http://pygments.org/docs/unicode
which might help you:

"Since Pygments 0.6, all lexers use unicode strings internally. Because
of that you might encounter the occasional UnicodeDecodeError if you
pass strings with the wrong encoding.
...
The formatters now send Unicode objects to the stream if you don't set
the output encoding. You can do so by passing the formatters an encoding
option:

    from pygments.formatters import HtmlFormatter
    f = HtmlFormatter(encoding='utf-8')

You will have to set this option if you have non-ASCII characters in the
source and the output stream does not accept Unicode written to it! This
is the case for all regular files and for terminals."

---------- Forwarded message ----------
From: François Pinard <pinard@iro.umontreal.ca>
Date: 2012/1/4
Subject: [O] Org mode, minted, and non-ASCII
To: emacs-orgmode@gnu.org

Hi, Org people.

Still experimenting around for this report, I installed *minted* so one
of the appendices might nicely display a bulky bit of Python code.

It works satisfactorily (and speedily enough) if I squash out all
diacriticized and other Unicode special symbols in the file. However,
no output is produced if I leave the tiniest non-ASCII character in the
file. OK, OK, don't kill me :-). Agreed that all non-ASCII characters
are neither tinier nor bigger than one another in this context.

The Org document, the Python sources, and the default charset for this
machine are all UTF-8. I saw no Unicode problems with LaTeX when minted
is not in the picture. pygmentize also appears to do well with Unicode
input.

So the problem likely lies either between Org mode and minted LaTeX, or
within minted. Is that a known problem or limitation?

This problem is a bit more hurtful here, as the Python code really uses
Unicode, and mangling out Unicode characters really changes the
semantics of the code as displayed in the report. Were it not for this
problem, the minted output is attractive, at least more so than what I
saw with the listings package. As a last resort, of course, I may still
include an unfontified Python source in the appendix, or produce it by
other means; not such a big deal, it's just that I would have liked to
impress my coworkers a bit more with Org mode integration and
capabilities. :-).

To confuse me a little more, I'm getting random (I mean, unpredictable
by me) "org-mode fontification error" diagnostics while creating the PDF
output. Perusing org.el tells me that this is likely a mere
coincidence, as those fontification errors seem wholly unrelated to
LaTeX processing.

François

P.S. Who is a bit tired right now, and maybe missing something trivial?
Tomorrow, I'll surely revisit most of today's experiments.
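
For anyone wanting to try the encoding option from the Pygments note quoted in the reply above, here is a minimal sketch. It only exercises Pygments directly and says nothing about how minted or Org mode invoke it. The sample source string and the helper name `highlight_to_bytes` are made up for illustration; `highlight`, `PythonLexer`, and `HtmlFormatter(encoding=...)` are the Pygments calls the quoted documentation refers to.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import PythonLexer

# Hypothetical sample: a line of Python containing a non-ASCII character.
SOURCE = 'name = "François"\n'


def highlight_to_bytes(code: str, encoding: str = "utf-8") -> bytes:
    """Highlight `code` as HTML and return it already encoded.

    Passing `encoding` to the formatter makes Pygments emit bytes in
    that encoding instead of a Unicode string, which is what a regular
    file or a byte-oriented stream expects.
    """
    formatter = HtmlFormatter(encoding=encoding)
    return highlight(code, PythonLexer(), formatter)


if __name__ == "__main__":
    encoded = highlight_to_bytes(SOURCE)
    # Write the bytes to a file opened in binary mode; no implicit
    # ASCII conversion is involved at this point.
    with open("highlighted.html", "wb") as out:
        out.write(encoded)
```

Without the `encoding` argument, `highlight` returns a Unicode string, and it is then up to whatever writes that string out to choose an encoding, which is where non-ASCII characters can go wrong.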