* Exporting non utf8 org documents
From: Francesco Pizzolante @ 2009-12-08 16:22 UTC
To: mailing-list-org-mode

Hi,

I have colleagues who write Org documents in latin-1 encoding, and when
I export these documents to LaTeX I run into problems, because Org
assumes utf8.

Here's a little example:

--8<---------------cut here---------------start------------->8---
#+LATEX_CLASS: article

* Ceci est un test

Voici un petit texte rédigé en français.

* COMMENT Setup

# This is for the sake of Emacs.
# Local Variables:
# coding: iso-latin-1
# End:
--8<---------------cut here---------------end--------------->8---

The export to LaTeX gives the following result:

--8<---------------cut here---------------start------------->8---
% Created 2009-12-08 mar. 17:10
\documentclass[11pt]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{graphicx}
\usepackage{longtable}
\usepackage{float}
\usepackage{wrapfig}
\usepackage{soul}
\usepackage{amssymb}
\usepackage{hyperref}
\usepackage{xcolor}
\usepackage{listings}

\title{org-french}
\author{Francesco Pizzolante}
\date{08 décembre 2009}

\begin{document}

\maketitle

\setcounter{tocdepth}{3}
\tableofcontents
\vspace*{1cm}

\section{Ceci est un test}
\label{sec-1}


Voici un petit texte rédigé en français.

\end{document}
--8<---------------cut here---------------end--------------->8---

When compiling, due to the \usepackage[utf8]{inputenc} directive, I get
this error:

ERROR: Package utf8x Error: MalformedUTF-8sequence.

To fix this issue, I see the following solutions:

- Would it be possible for Org to automatically get the coding system
  of the buffer and then generate the correct option for the inputenc
  package?

or

- Would it be possible to have a variable like
  #+CODING-SYSTEM: iso-latin-1
  which would be used to generate the correct option for the inputenc
  package?

Any other proposal or idea is welcome.

In addition, Org should use the `utf8x' option (instead of `utf8'),
which can handle non-breaking spaces (useful in French).

Thanks.

Regards,
Francesco

_______________________________________________
Emacs-orgmode mailing list
Please use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode

* Re: Exporting non utf8 org documents
From: Carsten Dominik @ 2010-01-04 14:16 UTC
To: Francesco Pizzolante; Cc: mailing-list-org-mode

Hi Francesco,

can you come up with code that maps the Emacs buffer coding system to
the inputenc option?  Then I can write the code to insert this into the
proper place in the LaTeX header.

- Carsten

On Dec 8, 2009, at 5:22 PM, Francesco Pizzolante wrote:

> - Would it be possible for Org to automatically get the coding
>   system of the buffer and then generate the correct option for the
>   inputenc package?

* Re: Exporting non utf8 org documents
From: Carsten Dominik @ 2010-01-06 8:26 UTC
To: Francesco Pizzolante; Cc: mailing-list-org-mode

Hi Francesco,

here is a possible solution:

Please get the latest git version of org-mode.  Then put the following
code into .emacs:

(defun my-org-export-latex-fix-inputenc ()
  "Set the coding system in inputenc to what the buffer is."
  (let* ((cs buffer-file-coding-system)
         (opt (latexenc-coding-system-to-inputenc cs)))
    (when opt
      (goto-char (point-min))
      ;; Replace whatever option is currently given to inputenc.
      (while (re-search-forward "\\\\usepackage\\[\\(.*?\\)\\]{inputenc}"
                                nil t)
        (goto-char (match-beginning 1))
        (delete-region (match-beginning 1) (match-end 1))
        (insert opt))
      (save-buffer))))

(eval-after-load "org-latex"
  '(add-hook 'org-export-latex-after-save-hook
             'my-org-export-latex-fix-inputenc))

Let me know how it goes.

- Carsten
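
To see what the lookup in the function above yields for a particular
buffer, it can be evaluated on its own.  This is only a minimal sketch:
it assumes an Emacs that ships latexenc.el (which defines
`latexenc-coding-system-to-inputenc'), and the command name
`my-show-inputenc-option' is made up for illustration and is not part
of Org.

```elisp
(require 'latexenc)

(defun my-show-inputenc-option ()
  "Report the inputenc option matching the current buffer's coding system.
Hypothetical helper for illustration only; not part of Org."
  (interactive)
  (let* ((cs buffer-file-coding-system)
         ;; The lookup returns an inputenc option name as a string,
         ;; or nil when latexenc knows no mapping for CS.
         (opt (and cs (latexenc-coding-system-to-inputenc cs))))
    (message "coding system: %s -> inputenc option: %s"
             cs (or opt "none found"))
    opt))
```

Running `M-x my-show-inputenc-option' in the Org buffer before exporting
shows which option the hook would insert, or that no mapping exists, in
which case the hook leaves the header untouched.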

* Re: Exporting non utf8 org documents
From: Francesco Pizzolante @ 2010-01-08 12:36 UTC
To: Carsten Dominik; Cc: mailing-list-org-mode

Hi Carsten,

> here is a possible solution:
>
> Please get the latest git version of org-mode.  Then put the following
> code into .emacs:
>
> [...]
>
> Let me know how it goes.....

Thanks for your solution.

I've tested with both latin1 and utf8 Org buffers and I get the correct
encoding passed to LaTeX in both cases.

Regarding the utf8 encoding, I made a remark in my first message:

>> In addition, Org should use the `utf8x' option (instead of `utf8'),
>> which can handle non-breaking spaces (useful in French).

Could you change that too?

Thanks a lot,
Francesco

* Re: Exporting non utf8 org documents
From: Carsten Dominik @ 2010-01-08 12:39 UTC
To: Francesco Pizzolante; Cc: mailing-list-org-mode

On Jan 8, 2010, at 1:36 PM, Francesco Pizzolante wrote:

> Regarding the utf8 encoding, I made a remark in my first message:
>
>>> In addition, Org should use the `utf8x' option (instead of `utf8'),
>>> which can handle non-breaking spaces (useful in French).
>
> Could you change that too?

No, because utf8x is not in all TeX distributions, so that is too
risky.  I was considering making it configurable, though.

- Carsten

* Re: Exporting non utf8 org documents
From: Carsten Dominik @ 2010-01-08 12:43 UTC
To: Francesco Pizzolante; Cc: mailing-list-org-mode

On Jan 8, 2010, at 1:36 PM, Francesco Pizzolante wrote:

>>> In addition, Org should use the `utf8x' option (instead of `utf8'),
>>> which can handle non-breaking spaces (useful in French).
>
> Could you change that too?

In fact, you can change it in the code I sent you.  Add

  (if (equal opt "utf8") (setq opt "utf8x"))

right before

  (when opt

I am still thinking about whether and how I can add this in a stable
way to the default code.

- Carsten
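
Putting the two messages together, the workaround with the suggested
line added might look as follows.  This is a sketch, not the code that
ended up in Org: it assumes the Org version of the time, where
`org-export-latex-after-save-hook' runs in the exported .tex buffer, and
a TeX distribution that actually provides utf8x (the concern raised in
the previous reply).

```elisp
(require 'latexenc)

(defun my-org-export-latex-fix-inputenc ()
  "Set the inputenc option to match the buffer's coding system.
Prefers utf8x over utf8, as suggested in the thread."
  (let* ((cs buffer-file-coding-system)
         (opt (latexenc-coding-system-to-inputenc cs)))
    ;; The added line: swap utf8 for utf8x before the header is edited.
    (if (equal opt "utf8") (setq opt "utf8x"))
    (when opt
      (goto-char (point-min))
      ;; Replace whatever option is currently given to inputenc.
      (while (re-search-forward "\\\\usepackage\\[\\(.*?\\)\\]{inputenc}"
                                nil t)
        (goto-char (match-beginning 1))
        (delete-region (match-beginning 1) (match-end 1))
        (insert opt))
      (save-buffer))))

;; Register the function once org-latex has been loaded.
(eval-after-load "org-latex"
  '(add-hook 'org-export-latex-after-save-hook
             'my-org-export-latex-fix-inputenc))
```

Buffers in any other encoding are unaffected by the added line, since it
only fires when the lookup returned exactly "utf8".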

* Re: Exporting non utf8 org documents
From: Carsten Dominik @ 2010-01-10 9:20 UTC
To: Francesco Pizzolante; Cc: mailing-list-org-mode

Hi Francesco,

you can remove the code I sent you again, and instead grab the latest
git release.

Then you can also do

  (setq org-export-latex-inputenc-alist '(("utf8" . "utf8x")))

to get utf8x instead of utf8.

HTH

- Carsten
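
How such an alist is consulted can be sketched in isolation.  The lookup
below mirrors the `(or (cdr (assoc ...)) opt)' form in Org's
`org-export-latex-fix-inputenc' as quoted later in this thread; the
names `my-inputenc-alist' and `my-translate-inputenc' are hypothetical
stand-ins so the snippet can be evaluated without Org loaded.

```elisp
;; Stand-in for `org-export-latex-inputenc-alist', using the value
;; suggested in the message above.
(defvar my-inputenc-alist '(("utf8" . "utf8x"))
  "Example translation table from detected to desired inputenc options.")

(defun my-translate-inputenc (opt)
  "Return the replacement for OPT from `my-inputenc-alist', or OPT itself."
  ;; `assoc' compares keys with `equal', so string keys work here.
  (or (cdr (assoc opt my-inputenc-alist)) opt))

(my-translate-inputenc "utf8")    ; => "utf8x"
(my-translate-inputenc "latin1")  ; => "latin1"
```

Options without an entry pass through unchanged, so the alist only needs
to list the encodings that should be overridden.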

* Re: Exporting non utf8 org documents
From: Francesco Pizzolante @ 2010-03-22 14:20 UTC
To: Carsten Dominik; Cc: mailing-list-org-mode

Hi Carsten,

I'm sorry for my very late reply on this topic, but I have only just
noticed a problem with this.

> you can remove the code I sent you again, and instead grab the latest
> git release.
>
> Then you can also do
>
> (setq org-export-latex-inputenc-alist '(("utf8" . "utf8x")))
>
> to get utf8x instead of utf8.

I still get the utf8 encoding even though, as you said, I set this:

--8<---------------cut here---------------start------------->8---
(setq org-export-latex-inputenc-alist '(("utf8" . "utf8x")))
--8<---------------cut here---------------end--------------->8---

So we end up in the following code:

--8<---------------cut here---------------start------------->8---
(defun org-export-latex-fix-inputenc ()
  "Set the codingsystem in inputenc to what the buffer is."
  (let* ((cs buffer-file-coding-system)
         (opt (or (ignore-errors (latexenc-coding-system-to-inputenc cs))
                  "utf8")))
    (when opt
      ;; Translate if that is requested
      (setq opt (or (cdr (assoc opt org-export-latex-inputenc-alist)) opt))
      ;; find the \usepackage statement and replace the option
      (goto-char (point-min))
      (while (re-search-forward "\\\\usepackage\\[\\(AUTO\\)\\]{inputenc}"
                                nil t)
        (goto-char (match-beginning 1))
        (delete-region (match-beginning 1) (match-end 1))
        (insert opt))
      (and buffer-file-name
           (save-buffer)))))
--8<---------------cut here---------------end--------------->8---

If I print the opt variable (message opt), I can see that its value is
correctly set to utf8x.

But the re-search-forward call always fails.  Indeed, if I change the
last argument from t to nil, I get the following error:

--8<---------------cut here---------------start------------->8---
while: Search failed: "\\\\usepackage\\[\\(AUTO\\)\\]{inputenc}"
--8<---------------cut here---------------end--------------->8---

I'm using an almost empty Org buffer with no options at all, so it
generates a simple article document class.

I would like to give you more input, but I don't know how to debug this
any further.  If you have any idea, please let me know.

Thanks a lot,
Francesco

* Re: Exporting non utf8 org documents
From: Carsten Dominik @ 2010-03-23 8:06 UTC
To: Francesco Pizzolante; Cc: mailing-list-org-mode

On Mar 22, 2010, at 3:20 PM, Francesco Pizzolante wrote:

> But the re-search-forward call always fails.  Indeed, if I change the
> last argument from t to nil, I get the following error:
>
> --8<---------------cut here---------------start------------->8---
> while: Search failed: "\\\\usepackage\\[\\(AUTO\\)\\]{inputenc}"
> --8<---------------cut here---------------end--------------->8---
>
> I'm using an almost empty Org buffer with no options at all, so it
> generates a simple article document class.
>
> I would like to give you more input, but I don't know how to debug
> this any further.  If you have any idea, please let me know.

Hi Francesco,

Maybe you have customized org-export-latex-classes?  Unfortunately, a
customized value hides the changes I make to the default value of that
variable.  So you need to do one of two things:

- remove your customizations of that variable

or

- edit your value so that all entries say

    \usepackage[AUTO]{inputenc}

  instead of a fixed [utf8] or whatever you have there.

HTH

- Carsten
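
The diagnosis above can be checked from Emacs.  The sketch below first
shows that the regexp from `org-export-latex-fix-inputenc' only matches
the AUTO placeholder, then lists any class whose header lacks it.  It
assumes the Org version of the time, where each entry of
`org-export-latex-classes' has the form (NAME HEADER SECTIONING...);
the helper name `my-inputenc-auto-p' is made up for illustration.

```elisp
(defun my-inputenc-auto-p (header)
  "Return non-nil if HEADER contains the AUTO inputenc placeholder line.
Hypothetical helper for illustration only; not part of Org."
  (with-temp-buffer
    (insert header)
    (goto-char (point-min))
    ;; Same regexp as in the quoted Org function; with the NOERROR
    ;; argument set to t a failed search returns nil instead of
    ;; signalling "Search failed".
    (re-search-forward "\\\\usepackage\\[\\(AUTO\\)\\]{inputenc}" nil t)))

(my-inputenc-auto-p "\\usepackage[AUTO]{inputenc}\n")  ; non-nil
(my-inputenc-auto-p "\\usepackage[utf8]{inputenc}\n")  ; nil

;; Report every class whose header still hard-codes the encoding.
;; Guarded with `boundp' because the variable only exists once the
;; old org-latex exporter has been loaded.
(when (boundp 'org-export-latex-classes)
  (dolist (entry org-export-latex-classes)
    (let ((header (nth 1 entry)))
      (unless (and (stringp header) (my-inputenc-auto-p header))
        (message "Class %s: header has no [AUTO] inputenc line"
                 (car entry))))))
```

Any class reported this way needs its header edited as described in the
reply above, or the customization removed so the default value applies.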

end of thread [~2010-03-23 9:25 UTC]

Thread overview: 9+ messages

2009-12-08 16:22 Exporting non utf8 org documents Francesco Pizzolante
2010-01-04 14:16 ` Carsten Dominik
2010-01-06  8:26 ` Carsten Dominik
2010-01-08 12:36   ` Francesco Pizzolante
2010-01-08 12:39     ` Carsten Dominik
2010-01-08 12:43     ` Carsten Dominik
2010-01-10  9:20     ` Carsten Dominik
2010-03-22 14:20       ` Francesco Pizzolante
2010-03-23  8:06         ` Carsten Dominik