emacs-orgmode@gnu.org archives
* Org mode, minted, and non-ASCII
@ 2012-01-05  3:40 François Pinard
  2012-01-05 15:17 ` brian powell
  2012-01-05 19:47 ` Nick Dokos
  0 siblings, 2 replies; 4+ messages in thread
From: François Pinard @ 2012-01-05  3:40 UTC
  To: emacs-orgmode

Hi, Org people.

Still experimenting around for this report, I installed *minted* so one
of the appendices might nicely display a bulky bit of Python code.

It works satisfactorily (and speedily enough) if I squash out all
diacriticized and other Unicode special symbols in the file.  However,
no output is produced if I leave the tiniest non-ASCII character in the
file.  OK, OK, don't kill me :-).  Agreed that all non-ASCII characters
are neither tinier nor bigger than one another in this context.
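
For anyone reproducing the symptom, here is a minimal Python sketch (not part of the original message) that lists where the non-ASCII characters sit in a piece of source text, so they can be located rather than squashed blindly. The function name `non_ascii_report` and the sample string are made up for illustration.

```python
def non_ascii_report(text):
    """Return (line, column, character) for every non-ASCII character.

    Line and column numbers are 1-based. Hypothetical helper, for
    illustration only.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # anything outside the 7-bit ASCII range
                hits.append((lineno, col, ch))
    return hits


# Sample input written with escapes: \u00fc is u-umlaut, \u00e4 is a-umlaut.
sample = "x = '\u00fcml\u00e4ute'\nprint(x)\n"

for lineno, col, ch in non_ascii_report(sample):
    print("line %d, column %d: U+%04X" % (lineno, col, ord(ch)))
```

To check a real file, read it with an explicit encoding (for example `open(path, encoding="utf-8").read()`) and pass the result to the function.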

The Org document, the Python sources, and the default charset for this
machine are all UTF-8.  I saw no problem between Unicode and
LaTeX when minted is not in the picture.  pygmentize also appears to do
well with Unicode input.

So the problem likely lies either between Org mode and minted LaTeX, or
within minted.  Is that a known problem or limitation?

This problem is a bit more hurtful here, as the Python code really uses
Unicode, and mangling out Unicode characters really changes the semantics
of the code as displayed in the report.  Were it not for this problem,
the minted output would be attractive, at least more so than what I saw
with the listings package.  As a last resort, of course, I may still include
an unfontified Python source in the appendix, or produce it by other means;
not such a big deal, it's just that I would have liked to impress my
coworkers a bit more with Org mode integration and capabilities.  :-)

To confuse me a little more, I'm getting random (I mean, unpredictable
by me) "org-mode fontification error" diagnostics while creating the PDF
output.  Perusing org.el tells me that this is likely a mere
coincidence, as those fontification errors seem wholly unrelated to
LaTeX processing.

François

P.S. Who is a bit tired right now, and maybe missing something trivial?
Tomorrow, I'll surely revisit most of today's experiments.


* Re: Org mode, minted, and non-ASCII
  2012-01-05  3:40 Org mode, minted, and non-ASCII François Pinard
@ 2012-01-05 15:17 ` brian powell
  2012-01-05 19:47 ` Nick Dokos
  1 sibling, 0 replies; 4+ messages in thread
From: brian powell @ 2012-01-05 15:17 UTC
  To: François Pinard; +Cc: emacs-orgmode


* Firstly, thanks for sending this issue to the group: pygments & minted
are very interesting tools for OrgMode/LaTeX persons.

** Read this:
http://ctan.mackichan.com/macros/latex/contrib/minted/minted.pdf

** And this: http://pygments.org/docs/unicode/

*** It seems that running pygmentize from the command line has some
provisos.

*** Also, I noticed this in bold type at
http://pygments.org/docs/unicode/ which might help you:

"Since Pygments 0.6, all lexers use unicode strings internally. Because of
that you might encounter the occasional UnicodeDecodeError if you pass
strings with the wrong encoding.
...
The formatters now send Unicode objects to the stream if you don't set the
output encoding. You can do so by passing the formatters an encoding option:

    from pygments.formatters import HtmlFormatter
    f = HtmlFormatter(encoding='utf-8')

You will have to set this option if you have non-ASCII characters in the
source and the output stream does not accept Unicode written to it! This is
the case for all regular files and for terminals."
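
The point of the quoted passage can be illustrated without Pygments at all. The sketch below uses only the Python standard library and is not taken from the Pygments documentation. It shows that a byte stream accepts text only once it has been encoded, and that decoding those bytes with the wrong codec raises `UnicodeDecodeError`. The helper name `roundtrip` and the sample string are assumptions made for this example.

```python
import io


def roundtrip(text, write_encoding="utf-8", read_encoding="utf-8"):
    """Write text to a byte stream, then decode what was written.

    Hypothetical helper: io.BytesIO stands in for a regular file opened
    in binary mode, which likewise accepts bytes only.
    """
    stream = io.BytesIO()
    stream.write(text.encode(write_encoding))  # explicit encoding step
    return stream.getvalue().decode(read_encoding)


# \u00df is sharp s, \u00fc is u-umlaut, \u00e4 is a-umlaut.
text = "gro\u00dfen \u00fcml\u00e4ute"

# Matching encodings on both sides: the text survives unchanged.
print(roundtrip(text) == text)

# Mismatched encodings: the UTF-8 bytes are not valid ASCII.
try:
    roundtrip(text, read_encoding="ascii")
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)
```

Passing `encoding='utf-8'` to a formatter, as in the quoted passage, is what makes the formatter perform the encoding step on output.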



* Re: Org mode, minted, and non-ASCII
  2012-01-05  3:40 Org mode, minted, and non-ASCII François Pinard
  2012-01-05 15:17 ` brian powell
@ 2012-01-05 19:47 ` Nick Dokos
  2012-01-05 21:21   ` François Pinard
  1 sibling, 1 reply; 4+ messages in thread
From: Nick Dokos @ 2012-01-05 19:47 UTC
  To: François Pinard; +Cc: nicholas.dokos, emacs-orgmode

Yes, indeed it seems to be something that minted is doing (or not
doing).

The following tex file, python program and Makefile illustrate that
pygmentize and latex are fine as you stated. But when minted is inserted
into the mix, all hell breaks loose. I tried modifying minted.sty to
introduce utf-8 encoding options in the two places where pygmentize is
called, but this still does not work for me.

I never used pygmentize from the command line before. I believe the
Makefile describes the proper usage, but I'd appreciate corrections
before I dive into minted.

Nick

PS ... and yes, I know that the "german" in the following is just nonsense ;-)

Makefile:
--8<---------------cut here---------------start------------->8---
view:	fp.pdf
	xpdf fp.pdf

fp.pdf: fp.tex fp.py
	pygmentize -S default -f latex -P "encoding=utf-8" > fp.pyg
	pygmentize -l python -f latex -F tokenmerge -P "encoding=utf-8" -P "verboptions= " -o fp.out.pyg fp.py
	pdflatex -shell-escape fp.tex
	pdflatex -shell-escape fp.tex
	pdflatex -shell-escape fp.tex

clean:
	rm -f *~ fp.aux fp.pyg fp.out.pyg fp.log fp.toc fp.dvi fp.pdf


--8<---------------cut here---------------end--------------->8---


fp.py:
--8<---------------cut here---------------start------------->8---
#! /usr/bin/env python
# -*- coding: utf-8 -*-

"""
"""

import sys

x = 'This is a unicode string mit ümläute und großen problemen.'

def main(args):
    print(x)
    return 0

if __name__ == '__main__':
    status = main(sys.argv[1:])
    sys.exit(status)
--8<---------------cut here---------------end--------------->8---

fp.tex:
--8<---------------cut here---------------start------------->8---
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{fancyvrb}
\usepackage{color}

\begin{document}

\section{foo}
Pygmentize can deal with ünicode with nö problems (given -P ``encoding=utf-8'' options).

\input{fp.pyg}
\input{fp.out.pyg}

\end{document}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: 
--8<---------------cut here---------------end--------------->8---
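
The two pygmentize steps of the Makefile above can also be sketched in Python, which makes the argument lists explicit and avoids shell quoting questions around `verboptions= `. This is only an illustration: the flags are copied from the Makefile as posted, while the function names `style_command`, `body_command` and `build` are assumptions made for this sketch. It does not run pdflatex and does not involve minted.

```python
import shlex
import shutil
import subprocess

# Encoding option passed to pygmentize in both Makefile rules.
ENCODING_OPT = ["-P", "encoding=utf-8"]


def style_command():
    """Arguments that emit the LaTeX style definitions (fp.pyg in the Makefile)."""
    return ["pygmentize", "-S", "default", "-f", "latex"] + ENCODING_OPT


def body_command(source, output):
    """Arguments that highlight `source` as Python into `output`."""
    return (["pygmentize", "-l", "python", "-f", "latex", "-F", "tokenmerge"]
            + ENCODING_OPT
            + ["-P", "verboptions= ", "-o", output, source])


def build(source="fp.py", style_out="fp.pyg", body_out="fp.out.pyg"):
    """Run both steps; requires pygmentize on PATH and an existing source file."""
    if shutil.which("pygmentize") is None:
        raise RuntimeError("pygmentize not found on PATH")
    # The Makefile redirects stdout for the style step, so do the same here.
    with open(style_out, "wb") as fh:
        subprocess.run(style_command(), stdout=fh, check=True)
    subprocess.run(body_command(source, body_out), check=True)


# Show the commands without executing them.
for cmd in (style_command(), body_command("fp.py", "fp.out.pyg")):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Calling `build()` in a directory that contains fp.py should produce the two `.pyg` files that fp.tex reads with `\input`.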


* Re: Org mode, minted, and non-ASCII
  2012-01-05 19:47 ` Nick Dokos
@ 2012-01-05 21:21   ` François Pinard
  0 siblings, 0 replies; 4+ messages in thread
From: François Pinard @ 2012-01-05 21:21 UTC
  To: nicholas.dokos; +Cc: emacs-orgmode

Nick Dokos <nicholas.dokos@hp.com> writes:

> I never used pygmentize from the command line before. I believe the
> Makefile describes the proper usage, but I'd appreciate corrections
> before I dive into minted.

As this is all new to me, I'm not the one to correct you. :-) But I do
thank you for the hints and the complete example you provide!  They give
me useful directions.

François
