From: Nick Dokos <nicholas.dokos@hp.com>
To: François Pinard <pinard@iro.umontreal.ca>
Cc: nicholas.dokos@hp.com, emacs-orgmode@gnu.org
Subject: Re: Org mode, minted, and non-ASCII
Date: Thu, 05 Jan 2012 14:47:49 -0500
Message-ID: <7757.1325792869@alphaville.americas.hpqcorp.net>
In-Reply-To: Message from pinard@iro.umontreal.ca (François Pinard) of "Wed, 04 Jan 2012 22:40:56 EST." <87vcoqdeoq.fsf@iro.umontreal.ca>

François Pinard <pinard@iro.umontreal.ca> wrote:

> Hi, Org people.
> 
> Still experimenting around for this report, I installed *minted* so one
> of the appendices might nicely display a bulky bit of Python code.
> 
> It works satisfactorily (and speedily enough) if I squash out all
> diacriticized and other Unicode special symbols in the file.  However,
> no output is produced if I leave the tiniest non-ASCII character in the
> file.  OK, OK, don't kill me :-).  Agreed that all non-ASCII characters
> are neither tinier nor bigger than one another in this context.
> 
> The Org document, the Python sources, and the default charset for this
> machine are all UTF-8.  I saw no problem between Unicode and LaTeX
> when minted is not in the picture.  pygmentize also appears to do
> well with Unicode input.
> 
> So the problem likely lies either between Org mode and minted LaTeX, or
> within minted.  Is that a known problem or limitation?
> 
> This problem is a bit more hurtful here, as the Python code really uses
> Unicode, and mangling out Unicode characters really changes the semantics
> of the code as displayed in the report.  Apart from this problem,
> the minted output is attractive, at least more so than what I saw with the
> listings package.  As a last resort, of course, I may still include an
> unfontified Python source in the appendix, or produce it by other means;
> not such a big deal, it's just that I would have liked to impress my
> coworkers a bit more with Org mode integration and capabilities.  :-).
> 
> To confuse me a little more, I'm getting random (I mean, unpredictable
> by me) "org-mode fontification error" diagnostics while creating the PDF
> output.  Perusing org.el tells me that this is likely a mere
> coincidence, as those fontification errors seem wholly unrelated to
> LaTeX processing.
> 

Yes, indeed it seems to be something that minted is doing (or not
doing).

The following tex file, python program and Makefile illustrate that
pygmentize and latex are fine as you stated. But when minted is inserted
into the mix, all hell breaks loose. I tried modifying minted.sty to
introduce utf-8 encoding options in the two places where pygmentize is
called, but this still does not work for me.

I never used pygmentize from the command line before. I believe the
Makefile describes the proper usage, but I'd appreciate corrections
before I dive into minted.

Nick

PS ... and yes, I know that the "german" in the following is just nonsense ;-)

Makefile:
--8<---------------cut here---------------start------------->8---
view:	fp.pdf
	xpdf fp.pdf

fp.pdf: fp.tex fp.py
	pygmentize -S default -f latex -P "encoding=utf-8" > fp.pyg
	pygmentize -l python -f latex -F tokenmerge -P "encoding=utf-8" -P "verboptions= " -o fp.out.pyg fp.py
	pdflatex -shell-escape fp.tex
	pdflatex -shell-escape fp.tex
	pdflatex -shell-escape fp.tex

clean:
	rm -f *~ fp.aux fp.pyg fp.out.pyg fp.log fp.toc fp.dvi fp.pdf


--8<---------------cut here---------------end--------------->8---


fp.py:
--8<---------------cut here---------------start------------->8---
#! /usr/bin/env python
# -*- coding: utf-8 -*-

"""
"""

import sys

x = 'This is a unicode string mit ümläute und großen problemen.'

def main(args):
    print(x)  # parenthesized form runs under both Python 2 and Python 3
    return 0

if __name__ == '__main__':
    status = main(sys.argv[1:])
    sys.exit(status)
--8<---------------cut here---------------end--------------->8---

fp.tex:
--8<---------------cut here---------------start------------->8---
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{fancyvrb}
\usepackage{color}

\begin{document}

\section{foo}
Pygmentize can deal with ünicode with nö problems (given -P ``encoding=utf-8'' options).

\input{fp.pyg}
\input{fp.out.pyg}

\end{document}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: 
--8<---------------cut here---------------end--------------->8---
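
When narrowing down a case like this one, it can help to know exactly
which characters in the source are non-ASCII, since those are the ones
that make the minted output disappear. The following is only a minimal
sketch using the Python standard library; the function name
find_non_ascii is hypothetical and is not part of Org mode, minted, or
Pygments. It reports the line, column, and Unicode name of each
non-ASCII character in a piece of text, here the string from fp.py.

```python
# -*- coding: utf-8 -*-
"""Hypothetical diagnostic helper: locate non-ASCII characters in source text."""
import unicodedata


def find_non_ascii(text):
    """Return a list of (line, column, char, name) tuples, 1-based,
    one for each character outside the ASCII range."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                # Fall back to "UNKNOWN" for characters without a Unicode name.
                hits.append((lineno, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


if __name__ == "__main__":
    # Same string as in fp.py above.
    sample = "x = 'This is a unicode string mit ümläute und großen problemen.'"
    for lineno, col, ch, name in find_non_ascii(sample):
        # Print the code point rather than the character itself so the
        # report stays readable even on an ASCII-only terminal.
        print("%d:%d U+%04X %s" % (lineno, col, ord(ch), name))
```

Running this over the real file (read and decoded as UTF-8) gives a
short list of candidates to remove one at a time while testing whether
minted produces output again.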
