> Hi Jambunathan,
>
> See comments below.
>
> Ciao,
> Renzo
> P.S. I'm on a camping-site right now, so I do not have good Internet access...
>
> On 16 July 2011 22:13, Jambunathan K wrote:
>>
>> Renzo
>>
>>> I just want to add one point that I did not find in the org-manual.  I tested
>>> some of my org-files and exported them to the OpenOffice format. When I tried to
>>> open these documents in OpenOffice, they were corrupt and could not be opened.
>>>
>>> I soon found out why. If you want to export an org-mode file to .odt, you need
>>> to explicitly set the file encoding to UTF-8 (I usually use iso-8859-1 encoding
>>> for my files), like:
>>> #-*- mode: org; coding: utf-8; -*-
>>> After that OpenOffice could open the files without any problems.
>>
>> I use English for communication and I have to admit that I have zero
>> understanding of things like character sets, encodings etc.
>
> As for communicating: I'm from the border regions of The Netherlands, Belgium
> and Germany... And therefore I'm multilingual, and often need to type words
> with accents.
>
>> Thanks for the above note. I surely see it as a bug but my poor
>> understanding prevents me from quantifying it further.
>
> Well... I would not really see it as a bug... As long as it is mentioned in the
> documentation that org-file encodings other than utf-8 could result in corrupt
> output-files.
>
>> Could you please send me a minimal iso-8859-1 test.org file and the
>> associated corrupted test.odt file? I will look into this issue.
>
> See attachment. I can only send you the org file, because I do not have access
> to a working Emacs at the moment...
>
>> 1. Do you have any specific requirement on how the component xml files
>>   be encoded?
>>   A cursory look at the odt exporter suggests that it could
>>   actually be emitting xml files in iso-8859-1 format while wrongly
>>   claiming UTF-8 encoding as below
>>
>> --8<---------------cut here---------------start------------->8---
>>
>> --8<---------------cut here---------------end--------------->8---
>>
>> 2. Should the xml file always be emitted in UTF-8 irrespective of how
>>   the original Org file is encoded?
>
> Yes, that would seem a good solution to me... If the odt-exporter checks the
> file's encoding, and then changes the encoding to utf-8 (maybe using a temporary
> buffer?) before the actual exporting, then there would be no further
> problems...
>
> As for the idea that the OpenOffice xml can actually be in another encoding
> than utf-8: I do not know how much work that would be for you to implement in
> the odt-exporter. It might be too much effort...
> Also I don't know if such an OpenOffice document will open with no problems in
> all OpenOffice applications.
>
>> [Notes to Self]
>> [Notes from odbook]
>>
>> Para 3 of http://books.evc-cit.info/odbook/apa.html#appc-11-fm2xml
>> says
>>
>> --8<---------------cut here---------------start------------->8---
>> OpenDocument files are always encoded in UTF-8.
>> --8<---------------cut here---------------end--------------->8---
>>
>> Para 2 of
>> http://books.evc-cit.info/odbook/apa.html#xml-other-char-encodings-section
>> says
>>
>> --8<---------------cut here---------------start------------->8---
>> XML 1.0 allows a document to be encoded in any character set registered
>> with the Internet Assigned Numbers Authority (IANA). European documents
>> are commonly encoded in one of the ISO Latin character sets, such as
>> ISO-8859-1. Japanese documents commonly use Shift-JIS, and Chinese
>> documents use GB2312 and Big 5.
>> --8<---------------cut here---------------end--------------->8---
>>
>> Para 4 of
>> http://books.evc-cit.info/odbook/apa.html#xml-other-char-encodings-section
>> says
>>
>> --8<---------------cut here---------------start------------->8---
>> XML processors are not required by the XML 1.0 specification to support
>> any more than UTF-8 and UTF-16, but most commonly support other
>> encodings, such as US-ASCII and ISO-8859-1.
>> --8<---------------cut here---------------end--------------->8---
>>
>>
>> [Notes from XMLmind XSL-FO Converter]
>>
>>
>> XFC supports outputting of content.xml and styles.xml in UTF-8 as well
>> as ISO-8859-1.
>>
>> http://xml.web.cern.ch/XML/www.xmlmind.com/xfc_perso_java-4_4_0/doc/user/command_line_java.html
>>
>> says
>>
>> ,---- [see outputEncoding section]
>> | For OpenDocument output (.odt), this option specifies the encoding of
>> | XML content (files styles.xml and content.xml) in the output
>> | document. All encodings available in the current JVM are supported. The
>> | option value may be either the encoding name (e.g. ISO8859_1) or the
>> | charset name (e.g. ISO-8859-1). The default value is UTF8.
>> `----
>>
>> --
> --
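
For illustration only, the mismatch suspected in the thread (XML bytes written in iso-8859-1 under a declaration that claims UTF-8) can be reproduced with a small Python sketch. This is not the odt exporter's code, and whether the exporter actually behaves this way is only the suspicion raised above. The sample text and the helper names build_xml, parses and reencode_to_utf8 are made up for this example. It only shows why a strict XML parser rejects such a file and why re-encoding to UTF-8 before writing avoids the problem.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text with accented characters (not taken from the thread).
TEXT = "café crème"


def build_xml(text: str, actual_encoding: str) -> bytes:
    """Build XML that always declares UTF-8 but is encoded as actual_encoding."""
    doc = '<?xml version="1.0" encoding="UTF-8"?>\n<p>' + text + "</p>"
    return doc.encode(actual_encoding)


def parses(data: bytes) -> bool:
    """Return True if the bytes are accepted by the XML parser."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


def reencode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from their real encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# Declaration says UTF-8, bytes are iso-8859-1: the accented bytes are invalid UTF-8.
mismatched = build_xml(TEXT, "iso-8859-1")
# Declaration and bytes agree.
consistent = build_xml(TEXT, "utf-8")

print(parses(mismatched))
print(parses(consistent))
print(parses(reencode_to_utf8(mismatched, "iso-8859-1")))
```

The first check fails because the iso-8859-1 bytes for the accented letters are not valid UTF-8 sequences. The other two succeed, which matches the observation that switching the Org file to utf-8 made the exported document open again.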