emacs-orgmode@gnu.org archives
* question about ODT export behavior
@ 2011-07-13 13:20 Rainer Stengele
  2011-07-13 14:23 ` Bastien
  2011-07-13 16:55 ` Jambunathan K
  0 siblings, 2 replies; 17+ messages in thread
From: Rainer Stengele @ 2011-07-13 13:20 UTC
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 645 bytes --]

Hi,

having this in an org file:
----------------------------------
* Test
** header 2
   - item 1
     * subitem 11
     * subitem 12
   - item 2
     * subitem 21
     * subitem 22
----------------------------------

and exporting to ODT, I get this (I simply copied the document contents via the clipboard):

----------------------------------
Test
Table of Contents
1. header 2
1.  header 2
item 1

subitem 11

subitem 12

item 2

subitem 21

subitem 22
----------------------------------

Why do I get extra lines between the items and subitems?
Where or how can I adjust the behaviour of the exporter?

I'll attach the org file.

- Rainer

[-- Attachment #2: test.odt --]
[-- Type: application/vnd.oasis.opendocument.text, Size: 8231 bytes --]


* Re: question about ODT export behavior
  2011-07-13 13:20 question about ODT export behavior Rainer Stengele
@ 2011-07-13 14:23 ` Bastien
  2011-07-13 15:04   ` Rainer Stengele
  2011-07-13 16:55 ` Jambunathan K
  1 sibling, 1 reply; 17+ messages in thread
From: Bastien @ 2011-07-13 14:23 UTC
  To: Rainer Stengele; +Cc: emacs-orgmode

Rainer Stengele <rainer.stengele@diplan.de> writes:

> I'll attach the org file.

You forgot the org file, you just attached the odt file.

I don't know when Jambunathan can have a look at this, but please
bear in mind that such formatting issues are relatively hard to fix
(talking from experience).

Thanks,

-- 
 Bastien


* Re: question about ODT export behavior
  2011-07-13 14:23 ` Bastien
@ 2011-07-13 15:04   ` Rainer Stengele
  2011-07-13 16:14     ` Bastien
  0 siblings, 1 reply; 17+ messages in thread
From: Rainer Stengele @ 2011-07-13 15:04 UTC
  To: Bastien; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 705 bytes --]

Am 13.07.2011 16:23, schrieb Bastien:
> Rainer Stengele <rainer.stengele@diplan.de> writes:
> 
>> I'll attach the org file.
> 
> You forgot the org file, you just attached the odt file.
> 
> I don't know when Jambunathan can have a look at this, but please
> bear in mind that such formatting issues are relatively hard to fix
> (talking from experience).
> 
> Thanks,
> 

Org file is attached.

I think this is about the simplest possible Org file, nothing special,
and the ODT output just adds some extra lines, so I thought this
was a fundamental issue.
I would like to be able to export the minutes of a meeting but cannot as my colleagues
will wonder why I put all these extra lines in ...

Thanks,
Rainer

[-- Attachment #2: test.org --]
[-- Type: text/org, Size: 115 bytes --]

* Test
** header 2
   - item 1
     * subitem 11
     * subitem 12
   - item 2
     * subitem 21
     * subitem 22


* Re: question about ODT export behavior
  2011-07-13 15:04   ` Rainer Stengele
@ 2011-07-13 16:14     ` Bastien
  2011-07-13 20:18       ` Jambunathan K
  0 siblings, 1 reply; 17+ messages in thread
From: Bastien @ 2011-07-13 16:14 UTC
  To: Rainer Stengele; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 124 bytes --]

I cannot reproduce your error.

Check the test_bastien.odt file I get by exporting your test.org;
there is no extra line.


[-- Attachment #2: test_bastien.odt --]
[-- Type: application/vnd.oasis.opendocument.text, Size: 8321 bytes --]

[-- Attachment #3: Type: text/plain, Size: 14 bytes --]


-- 
 Bastien


* Re: question about ODT export behavior
  2011-07-13 13:20 question about ODT export behavior Rainer Stengele
  2011-07-13 14:23 ` Bastien
@ 2011-07-13 16:55 ` Jambunathan K
  2011-07-13 20:15   ` Jambunathan K
  1 sibling, 1 reply; 17+ messages in thread
From: Jambunathan K @ 2011-07-13 16:55 UTC
  To: Rainer Stengele; +Cc: emacs-orgmode

Rainer Stengele <rainer.stengele@diplan.de> writes:

> Hi,
>
> having this in an org file:
> ----------------------------------
> * Test
> ** header 2
>    - item 1
>      * subitem 11
>      * subitem 12
>    - item 2
>      * subitem 21
>      * subitem 22
> ----------------------------------

Could you please post your complete #+OPTIONS line - specifically the
`H: ' and `toc: ' option? 

How exactly are you exporting? Are you exporting the file, a subtree,
a region, etc.?

What interactive command are you using for export?

> and exporting to ODT I get (I simply copied the Org doc contents via clipboard) I get this:
>
> ----------------------------------
> Test
> Table of Contents
> 1. header 2
> 1.  header 2
> item 1
>
> subitem 11
>
> subitem 12
>
> item 2
>
> subitem 21
>
> subitem 22
> ----------------------------------
>
> Why do I get extra lines between the items and subitems?

This is because there is an explicit line break at the end of the list
items.

If you open content.xml, remove the <text:line-break/> and save the
buffer, does the altered odt file match your expectations?

If you export the above outline with the same settings, does the HTML
exporter also introduce <br/> at the end of the list items?

The odt exporter is derived from the HTML exporter and mimics the HTML
exporter mindlessly. I believe the line breaks can be removed.

Once I get the relevant details from you I can post a patch after a
closer look.

> Where or how can I adjust the behaviour of the exporter?
>
> I'll attach the org file.
>
> - Rainer
>

-- 


* Re: question about ODT export behavior
  2011-07-13 16:55 ` Jambunathan K
@ 2011-07-13 20:15   ` Jambunathan K
  2011-07-14  6:50     ` Rainer Stengele
  0 siblings, 1 reply; 17+ messages in thread
From: Jambunathan K @ 2011-07-13 20:15 UTC
  To: Rainer Stengele; +Cc: emacs-orgmode

Jambunathan K <kjambunathan@gmail.com> writes:

> Rainer Stengele <rainer.stengele@diplan.de> writes:
>
>> Hi,
>>
>> having this in an org file:
>> ----------------------------------
>> * Test
>> ** header 2
>>    - item 1
>>      * subitem 11
>>      * subitem 12
>>    - item 2
>>      * subitem 21
>>      * subitem 22
>> ----------------------------------
>
> Could you please post your complete #+OPTIONS line - specifically the
> `H: ' and `toc: ' option? 

Looking at the code, I believe these options may not be relevant (for
the odt exporter)

Your org file doesn't use an explicit line break or timestamps. So the
only scenario under which line breaks can occur is precisely when you
have actually requested them.

Check this variable or the corresponding OPTIONS directive.


,----[ C-h v org-export-preserve-breaks RET ]
| org-export-preserve-breaks is a variable defined in `org-exp.el'.
| Its value is nil
| 
| Documentation:
| Non-nil means preserve all line breaks when exporting.
| Normally, in HTML output paragraphs will be reformatted.  In ASCII
| export, line breaks will always be preserved, regardless of this variable.
| 
| This option can also be set with the +OPTIONS line, e.g. "\n:t".
| 
| You can customize this variable.
| 
| [back]
`----
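
As a hedged illustration of the two places this setting can live,
assuming the Org version discussed in this thread (where the variable is
still called org-export-preserve-breaks):

```elisp
;; Global default, e.g. in your init file: reflow paragraphs on export
;; instead of turning every source line break into a hard break.
(setq org-export-preserve-breaks nil)

;; Per-file equivalent, written near the top of the Org file
;; (this is Org syntax, not Lisp, hence the comment):
;;   #+OPTIONS: \n:nil
```

If extra line breaks show up in the export, checking both places is a
reasonable first step.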


> How exactly are you exporting - Are you exporting the file, a subtree,
> region etc etc? 
>
> What interactive command are you using for export?
>
>> and exporting to ODT I get (I simply copied the Org doc contents via
>> clipboard) I get this:
>>
>> ----------------------------------
>> Test
>> Table of Contents
>> 1. header 2
>> 1.  header 2
>> item 1
>>
>> subitem 11
>>
>> subitem 12
>>
>> item 2
>>
>> subitem 21
>>
>> subitem 22
>> ----------------------------------
>>
>> Why do I get extra lines between the items and subitems?
>
> This is because there is an explicit line break at the end of the list
> items.
>
> If you open content.xml, remove the <text:line-break/> and save the
> buffer, does the altered odt file match your expectations?
>
> If you export the above outline with the same settings, does the HTML
> exporter also introduce <br/> at the end of the list items?

IIRC, the (x)html exporter adds a line break after emitting a headline
which is listified.

> The odt exporter is derived from the HTML exporter and mimics the HTML
> exporter mindlessly. I believe the line breaks can be removed.

The odt exporter doesn't emit the line break like the xhtml
exporter. (Maybe I was a bit mindful while typing out org-odt.el.) So
the options H, toc etc. may not be relevant.

>
> Once I get the relevant details from you I can post a patch after a
> closer look.
>
>> Where or how can I adjust the behaviour of the exporter?
>>
>> I'll attach the org file.
>>
>> - Rainer
>>

-- 


* Re: question about ODT export behavior
  2011-07-13 16:14     ` Bastien
@ 2011-07-13 20:18       ` Jambunathan K
  0 siblings, 0 replies; 17+ messages in thread
From: Jambunathan K @ 2011-07-13 20:18 UTC
  To: Bastien; +Cc: emacs-orgmode, Rainer Stengele

Bastien <bzg@altern.org> writes:

> I cannot reproduce your error.

Same here.  

Some custom setting (global or per-file) is kicking in in Rainer's
setup. It is most likely org-export-preserve-breaks.

Jambunathan K.


-- 


* Re: question about ODT export behavior
  2011-07-13 20:15   ` Jambunathan K
@ 2011-07-14  6:50     ` Rainer Stengele
  2011-07-14 15:44       ` Bastien
  0 siblings, 1 reply; 17+ messages in thread
From: Rainer Stengele @ 2011-07-14  6:50 UTC
  To: Jambunathan K; +Cc: emacs-orgmode

Am 13.07.2011 22:15, schrieb Jambunathan K:
> Jambunathan K <kjambunathan@gmail.com> writes:
>
>> Rainer Stengele <rainer.stengele@diplan.de> writes:
>>
>>> Hi,
>>>
>>> having this in an org file:
>>> ----------------------------------
>>> * Test
>>> ** header 2
>>>    - item 1
>>>      * subitem 11
>>>      * subitem 12
>>>    - item 2
>>>      * subitem 21
>>>      * subitem 22
>>> ----------------------------------
>> Could you please post your complete #+OPTIONS line - specifically the
>> `H: ' and `toc: ' option? 
> Looking at the code, I believe these options may not be relevant (for
> the odt exporter)
>
> Your org file doesn't use an explicit line break or timestamps. So the
> only scenario under which line breaks can occur is precisely when you
> have actually requested them.
>
> Check this variable or the corresponding OPTIONS directive.
>
>
> ,----[ C-h v org-export-preserve-breaks RET ]
> | org-export-preserve-breaks is a variable defined in `org-exp.el'.
> | Its value is nil
> | 
> | Documentation:
> | Non-nil means preserve all line breaks when exporting.
> | Normally, in HTML output paragraphs will be reformatted.  In ASCII
> | export, line breaks will always be preserved, regardless of this variable.
> | 
> | This option can also be set with the +OPTIONS line, e.g. "\n:t".
> | 
> | You can customize this variable.
> | 
> | [back]
> `----
>
>

Yes. That was it. After changing

org-export-preserve-breaks

to nil the breaks are gone.
Sorry, I did not know how the exporter works, so I did not check the export options.

Again, thank you for your excellent work!
Bastien, thanks for helping!

Can you give me a hint where I can find some documentation about changing the styles.xml?

- Rainer


* Re: question about ODT export behavior
  2011-07-14  6:50     ` Rainer Stengele
@ 2011-07-14 15:44       ` Bastien
  2011-07-15  5:54         ` Jambunathan K
  0 siblings, 1 reply; 17+ messages in thread
From: Bastien @ 2011-07-14 15:44 UTC
  To: Rainer Stengele; +Cc: emacs-orgmode, Jambunathan K

Hi Rainer,

Rainer Stengele <rainer.stengele@diplan.de> writes:

> Yes. That was it. After changing
>
> org-export-preserve-breaks
>
> to nil the breaks are gone.

Thanks for confirming.

> Sorry,  I did not know how the exporter works. I did therefore not
> check the export options.

Perhaps we could add a FAQ for this in Worg.

> Can you give me a hint where I can find some documentation about
> changing the styles.xml?

You can have a look at contrib/odt/README.org.  Jambunathan points 
to this message, which can help:

  http://lists.gnu.org/archive/html/emacs-orgmode/2011-03/msg01460.html

HTH,

-- 
 Bastien


* Re: question about ODT export behavior
  2011-07-14 15:44       ` Bastien
@ 2011-07-15  5:54         ` Jambunathan K
  2011-07-15 20:34           ` Renzo Been
  2011-07-22 14:38           ` [PATCH 1/2] org-odt: Improve customization of org-export-odt-styles-file Jambunathan K
  0 siblings, 2 replies; 17+ messages in thread
From: Jambunathan K @ 2011-07-15  5:54 UTC
  To: Rainer Stengele; +Cc: emacs-orgmode


>> Can you give me a hint where I can find some documentation about
>> changing the styles.xml?
>
> You can have a look at contrib/odt/README.org.  Jambunathan points 
> to this message, which can help:
>
>   http://lists.gnu.org/archive/html/emacs-orgmode/2011-03/msg01460.html

Here is the relevant variable.

,----[ C-h v org-export-odt-styles-file RET ]
| org-export-odt-styles-file is a variable defined in `org-odt.el'.
| Its value is nil
| 
| Documentation:
| Default style file for use with ODT exporter.
| Valid values are path to an styles.xml file or a path to a valid
| *.odt or a *.ott file or a list of the form (FILE (MEMBER1
| MEMBER2 ...)). In the last case, the specified FILE is unzipped
| and MEMBER1, MEMBER2 etc are copied in to the generated odt
| file. The last form is particularly useful if the styles.xml has
| reference to additional files like header and footer images.
| 
| 
| You can customize this variable.
| 
| [back]
`----

Here is a specific example:

,----
| (setq org-export-odt-styles-file
|       '("~/tmp-orgmode/Thu Thong Bao - Trai Ve Nguon XV (2011).odt"
|         ("styles.xml" "Pictures/10000000000002740000034B83A526F3.png")))
| 
| The styles.xml and header image would get copied into the generated
| odt file.
| 
| If the desired styles.xml makes no references to other files (unlike in
| the example above), then the above variable could be set to
| 
| (setq org-export-odt-styles-file
|       "~/tmp-orgmode/Thu Thong Bao - Trai Ve Nguon XV (2011).odt")
| 
| or 
| 
| (setq org-export-odt-styles-file "~/elisp/styles.xml")
`----

I see that the customization interface for org-export-odt-styles-file
is only partially done. If the customization interface doesn't
do the right thing for you, you can use the setq form temporarily.

The default styles file used by the odt exporter is:
org-root-dir/contrib/odt/styles/OrgOdtStyles.xml

There is also an automatic styles file in there.


If you are interested in customizing the styles you could do this:

1. Export test.org to test.odt

2. Open test.odt and use the OpenOffice stylist to change the
   user-defined styles. The Org-specific user styles have `Org' as a
   prefix.

   Hint: One can use M-x occur RET Org RET in OrgOdtStyles.xml buffer to
   see the Org specific styles.

   Caution: The style-names used in the styles.xml are the style-names
   written out in content.xml. So it is important that the style-names
   not be changed at all.

3. Save the newly styled test.odt as say ~/.emacs.d/org-odt/custom.odt
   or ~/.emacs.d/org-odt/custom.ott file.

4. Use one of the following:

   1. Customize org-export-odt-styles-file to the custom.odt or
      custom.ott file.

   2. If you are a bit adventurous, you can open the newly saved
      .odt file in archive-mode, extract the styles.xml and save it to
      ~/.emacs.d/org-odt/user-styles.xml.

      Then customize org-export-odt-styles-file to point to the above
      styles.xml file.
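
Step 4.2 can also be scripted. A minimal sketch, assuming an `unzip'
program is on your PATH, that ~/.emacs.d/org-odt/ already exists, and
that the styled document was saved as custom.odt as in step 3:

```elisp
(let* ((odt (expand-file-name "~/.emacs.d/org-odt/custom.odt"))
       (styles (expand-file-name "~/.emacs.d/org-odt/user-styles.xml"))
       ;; Copy the bytes verbatim, with no decoding or re-encoding.
       (coding-system-for-read 'binary)
       (coding-system-for-write 'binary))
  (with-temp-file styles
    (set-buffer-multibyte nil)
    ;; "unzip -p" prints the named archive member to standard output,
    ;; which call-process inserts into the temporary buffer.
    (unless (eq 0 (call-process "unzip" nil t nil "-p" odt "styles.xml"))
      (error "Could not extract styles.xml from %s" odt)))
  ;; Point the exporter at the extracted styles file.
  (setq org-export-odt-styles-file styles))
```

If the extraction fails, the error is raised before anything is written,
so an existing user-styles.xml is left untouched.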

I anticipate more questions and follow-on enhancements to styles
interface as more and more people start using it. So all inputs are
welcome.

Jambunathan K.

>
> HTH,

-- 


* Re: question about ODT export behavior
  2011-07-15  5:54         ` Jambunathan K
@ 2011-07-15 20:34           ` Renzo Been
  2011-07-16 20:13             ` ODT Charset/Encoding issues (was question about ODT export behavior) Jambunathan K
  2011-07-22 14:38           ` [PATCH 1/2] org-odt: Improve customization of org-export-odt-styles-file Jambunathan K
  1 sibling, 1 reply; 17+ messages in thread
From: Renzo Been @ 2011-07-15 20:34 UTC
  To: emacs-orgmode

Jambunathan K <kjambunathan <at> gmail.com> writes:

> >> Can you give me a hint where I can find some documentation about
> >> changing the styles.xml?
> >
> > You can have a look at contrib/odt/README.org.  Jambunathan points 
> > to this message, which can help:
> >
> >   http://lists.gnu.org/archive/html/emacs-orgmode/2011-03/msg01460.html
> 
> Here is the relevant variable.
> 
> ,----[ C-h v org-export-odt-styles-file RET ]
> | org-export-odt-styles-file is a variable defined in `org-odt.el'.
> `----
> 
> Here is a specific example:
> 
> ,----
> | (setq org-export-odt-styles-file
> |       '("~/tmp-orgmode/Thu Thong Bao - Trai Ve Nguon XV (2011).odt"
> `----
> 
> I am seeing that customization interface for org-export-odt-styles-file
> variable is only partially done. If the customization interface doesn't
> do the right thing for you, you can use the setq form temporarily.
> 
> The default styles file used by the odt exporter is:
> org-root-dir/contrib/odt/styles/OrgOdtStyles.xml
> 
> There is also an automatic styles file in there.
> 
> If you are interested in customizing the styles you could do this:
> 
> 1. Export test.org to test.odt
> 
> 2. Open test.odt and use the OpenOffice stylist to change the
>    user-defined styles. The Org-specific user stlyes have `Org' as a
>    prefix. 
> 
>    Hint: One can use M-x occur RET Org RET in OrgOdtStyles.xml buffer to
>    see the Org specific styles.
> 
>    Caution: The style-names used in the styles.xml are the style-names
>    written out in content.xml. So it is important that the style-names
>    be not changed at all. 
> 
> 3. Save the newly styled test.odt as say ~/.emacs.d/org-odt/custom.odt
>    or ~/.emacs.d/org-odt/custom.ott file.
> 
> 4. Use one of the following:
> 
>    1. Customize org-export-odt-styles-file to the custom.odt or
>       custom.ott file.
> 
>    2. If you are a bit adventurous, you can open the newly saved
>       test.odt in an archive-mode, extract the styles.xml and save it to
>       ~/.emacs.d/org-odt/user-styles.xml.
> 
>       Then customize org-export-odt-styles-file to point to the above
>       styles.xml file.
> 
> I anticipate more questions and follow-on enhancements to styles
> interface as more and more people start using it. So all inputs are
> welcome.
> 
> Jambunathan K.
> 
> >
> > HTH,

Hi,

First of all, I want to thank you for this great add-on to org-mode. Being able
to export directly to OpenOffice documents is good for sharing my document with
friends.

I just want to add one point that I did not find in the org-manual.  I tested
some of my org-files and exported them to the OpenOffice format. When I tried to
open these documents in OpenOffice, they were corrupt and could not be opened.

I soon found out why. If you want to export an org-mode file to .odt, you need
to explicitly set the file encoding to UTF-8 (I usually use iso-8859-1 encoding
for my files), like:
#-*- mode: org; coding: utf-8; -*-
After that OpenOffice could open the files without any problems.
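
For an existing iso-8859-1 file, the buffer can also be converted from
Lisp before exporting. A small sketch using standard Emacs commands (the
interactive equivalent is C-x RET f utf-8 RET followed by C-x C-s):

```elisp
;; Evaluate in the Org buffer (e.g. with M-:) before exporting to ODT.
(progn
  ;; Use UTF-8 the next time this buffer is written to disk.
  (set-buffer-file-coding-system 'utf-8)
  ;; Write the file out in the new encoding.
  (save-buffer))
```

This only changes how the file is saved; the text in the buffer stays
the same.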

Thanx again, and I will follow your explanation below, to create my own
styles.xml

Ciao,
Renzo


* Re: ODT Charset/Encoding issues (was question about ODT export behavior)
  2011-07-15 20:34           ` Renzo Been
@ 2011-07-16 20:13             ` Jambunathan K
  2011-07-17 14:12               ` Renzo Been
  0 siblings, 1 reply; 17+ messages in thread
From: Jambunathan K @ 2011-07-16 20:13 UTC
  To: Renzo Been; +Cc: emacs-orgmode


Renzo

> I just want to add one point that I did not find in the org-manual.  I tested
> some of my org-files and exported them to the OpenOffice format. When I tried to
> open these documents in OpenOffice, they were corrupt and could not be opened.
>
> I soon found out why. If you want to export an org-mode file to .odt, you need
> to explicitly set the file encoding to UTF-8 (I usually use iso-8859-1 encoding
> for my files), like:
> #-*- mode: org; coding: utf-8; -*-
> After that OpenOffice could open the files without any problems.

I use English for communication and I have to admit that I have zero
understanding of things like character sets, encodings etc. 

Thanks for the above note. I surely see this as a bug, but my poor
understanding prevents me from characterizing it further.

Could you please send me a minimal iso-8859-1 test.org file and the
associated corrupted test.odt file? I will look in to this issue.

1. Do you have any specific requirement on how the component xml files
   be encoded? A cursory look at the odt exporter suggests that it could
   actually be emitting xml files in iso-8859-1 format while wrongly
   claiming UTF-8 encoding as below

--8<---------------cut here---------------start------------->8---
<?xml version="1.0" encoding="UTF-8"?>
--8<---------------cut here---------------end--------------->8---

2. Should the xml file always be emitted in UTF-8, irrespective of how
   the original Org file is encoded?


[Notes to Self]
[Notes from odbook]

Para 3 of http://books.evc-cit.info/odbook/apa.html#appc-11-fm2xml
says

--8<---------------cut here---------------start------------->8---
OpenDocument files are always encoded in UTF-8. 
--8<---------------cut here---------------end--------------->8---

Para 2 of
http://books.evc-cit.info/odbook/apa.html#xml-other-char-encodings-section
says

--8<---------------cut here---------------start------------->8---
XML 1.0 allows a document to be encoded in any character set registered
with the Internet Assigned Numbers Authority (IANA). European documents
are commonly encoded in one of the ISO Latin character sets, such as
ISO-8859-1. Japanese documents commonly use Shift-JIS, and Chinese
documents use GB2312 and Big 5.
--8<---------------cut here---------------end--------------->8---

Para 4 of
http://books.evc-cit.info/odbook/apa.html#xml-other-char-encodings-section
says

--8<---------------cut here---------------start------------->8---
XML processors are not required by the XML 1.0 specification to support
any more than UTF-8 and UTF-16, but most commonly support other
encodings, such as US-ASCII and ISO-8859-1.
--8<---------------cut here---------------end--------------->8---


[Notes from XMLmind XSL-FO Converter]


XFC supports outputting of content.xml and styles.xml in UTF-8 as well
as ISO-8859-1.

http://xml.web.cern.ch/XML/www.xmlmind.com/xfc_perso_java-4_4_0/doc/user/command_line_java.html

says

,---- [see outputEncoding section]
| For OpenDocument output (.odt), this option specifies the encoding of
| XML content (files styles.xml and content.xml) in the output
| document. All encodings available in the current JVM are supported. The
| option value may be either the encoding name (e.g. ISO8859_1) or the
| charset name (e.g. ISO-8859-1). The default value is UTF8.
`----

-- 


* Re: ODT Charset/Encoding issues (was question about ODT export behavior)
  2011-07-16 20:13             ` ODT Charset/Encoding issues (was question about ODT export behavior) Jambunathan K
@ 2011-07-17 14:12               ` Renzo Been
  2011-07-17 19:13                 ` Jambunathan K
  0 siblings, 1 reply; 17+ messages in thread
From: Renzo Been @ 2011-07-17 14:12 UTC
  To: Jambunathan K; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 4764 bytes --]

Hi Jambunathan,

See comments below.

Ciao,
Renzo
P.S. I'm on a camping-site right now, so I do not have good Internet access...

On 16 July 2011 22:13, Jambunathan K <kjambunathan@gmail.com> wrote:
>
> Renzo
>
>> I just want to add one point that I did not find in the org-manual.  I tested
>> some of my org-files and exported them to the OpenOffice format. When I tried to
>> open these documents in OpenOffice, they were corrupt and could not be opened.
>>
>> I soon found out why. If you want to export an org-mode file to .odt, you need
>> to explicitly set the file encoding to UTF-8 (I usually use iso-8859-1 encoding
>> for my files), like:
>> #-*- mode: org; coding: utf-8; -*-
>> After that OpenOffice could open the files without any problems.
>
> I use English for communication and I have to admit that I have zero
> understanding of things like character sets, encodings etc.

As for communicating: I'm from the border region of the Netherlands, Belgium
and Germany, so I'm multilingual and often need to type words
with accents.

> Thanks for the above note. I surely see is a bug but my poor
> understanding prevents me from quantifying it further.

Well... I would not really see it as a bug, as long as the documentation
mentions that Org file encodings other than UTF-8 can result in corrupt
output files.

> Could you please send me a minimal iso-8859-1 test.org file and the
> associated corrupted test.odt file? I will look in to this issue.

See attachment. I can only send you the org file, because I do not have access
to a working Emacs at the moment...

> 1. Do you have any specific requirement on how the component xml files
>   be encoded? A cursory look at the odt exporter suggests that it could
>   actually be emitting xml files in iso-8859-1 format while wrongly
>   claiming UTF-8 encoding as below
>
> --8<---------------cut here---------------start------------->8---
> <?xml version="1.0" encoding="UTF-8"?>
> --8<---------------cut here---------------end--------------->8---
>
> 2. Should the xml file be always ejected in UTF-8 irrespective of how
>   the original Org file is encoded.

Yes, that would seem a good solution to me. If the odt exporter checks the
file's encoding and converts it to UTF-8 (maybe using a temporary
buffer?) before the actual export, then there would be no further
problems.

As for the idea that the OpenOffice XML could actually be in an encoding
other than UTF-8: I do not know how much work that would be for you to implement in
the odt exporter. It might be too much effort.
Also, I don't know whether such an OpenOffice document would open without problems in
all OpenOffice applications.

> [Notes to Self]
> [Notes from odbook]
>
> Para 3 of http://books.evc-cit.info/odbook/apa.html#appc-11-fm2xml
> says
>
> --8<---------------cut here---------------start------------->8---
> OpenDocument files are always encoded in UTF-8.
> --8<---------------cut here---------------end--------------->8---
>
> Para 2 of
> http://books.evc-cit.info/odbook/apa.html#xml-other-char-encodings-section
> says
>
> --8<---------------cut here---------------start------------->8---
> XML 1.0 allows a document to be encoded in any character set registered
> with the Internet Assigned Numbers Authority (IANA). European documents
> are commonly encoded in one of the ISO Latin character sets, such as
> ISO-8859-1. Japanese documents commonly use Shift-JIS, and Chinese
> documents use GB2312 and Big 5.
> --8<---------------cut here---------------end--------------->8---
>
> Para 4 of
> http://books.evc-cit.info/odbook/apa.html#xml-other-char-encodings-section
> says
>
> --8<---------------cut here---------------start------------->8---
> XML processors are not required by the XML 1.0 specification to support
> any more than UTF-8 and UTF-16, but most commonly support other
> encodings, such as US-ASCII and ISO-8859-1.
> --8<---------------cut here---------------end--------------->8---
>
>
> [Notes from XMLmind XSL-FO Converter]
>
>
> XFC supports outputting of content.xml and styles.xml in UTF-8 as well
> as ISO-8859-1.
>
> http://xml.web.cern.ch/XML/www.xmlmind.com/xfc_perso_java-4_4_0/doc/user/command_line_java.html
>
> says
>
> ,---- [see outputEncoding section]
> | For OpenDocument output (.odt), this option specifies the encoding of
> | XML content (files styles.xml and content.xml) in the output
> | document. All encodings available in the current JVM are supported. The
> | option value may be either the encoding name (e.g. ISO8859_1) or the
> | charset name (e.g. ISO-8859-1). The default value is UTF8.
> `----
>
> --

[-- Attachment #2: test-encoding.zip --]
[-- Type: application/zip, Size: 279 bytes --]


* Re: ODT Charset/Encoding issues (was question about ODT export behavior)
  2011-07-17 14:12               ` Renzo Been
@ 2011-07-17 19:13                 ` Jambunathan K
  2011-07-18  8:59                   ` Bastien
  0 siblings, 1 reply; 17+ messages in thread
From: Jambunathan K @ 2011-07-17 19:13 UTC
  To: Renzo Been, Christian Moe; +Cc: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 346 bytes --]


Hello Renzo & Christian

Thanks for the test files and sharing your views on this issue. With the
attached patch I can export the test files successfully.

The attached patch ensures that component xml files created by the odt
exporter are always utf-8 encoded. This is irrespective of the coding
system used by the Org buffer.
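
The general Emacs technique behind such a fix can be sketched as
follows. This is an illustration, not the patch itself: the helper name
my/write-xml-as-utf-8 is hypothetical, and it relies only on the
standard variable coding-system-for-write.

```elisp
(defun my/write-xml-as-utf-8 (xml-string file)
  "Write XML-STRING to FILE as UTF-8.
The coding system of the buffer you call this from is ignored."
  ;; A let-bound `coding-system-for-write' overrides whatever coding
  ;; system would otherwise be chosen when the file is written.
  (let ((coding-system-for-write 'utf-8))
    (with-temp-file file
      (insert xml-string))))
```

Called from an iso-8859-1 Org buffer, this still produces a UTF-8 file,
so the encoding="UTF-8" claim in the XML declaration stays true.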

Jambunathan K.


[-- Attachment #2: 0001-org-odt-Correctly-export-iso-8859-1-files-with-non-a.patch --]
[-- Type: text/plain, Size: 1182 bytes --]

From 1ec1e3c9248387ab2daabe7b9c7cc4a3c42b4998 Mon Sep 17 00:00:00 2001
From: Jambunathan K <kjambunathan@gmail.com>
Date: Mon, 18 Jul 2011 00:26:41 +0530
Subject: [PATCH] org-odt: Correctly export iso-8859-1 files with non-ascii chars

* contrib/lisp/org-odt.el (org-odt-get): Set
CODING-SYSTEM-FOR-WRITE and CODING-SYSTEM-FOR-SAVE to 'utf-8
irrespective of buffer-file-coding-system.

Fixes issue reported by Renzo Been in the following post.
http://lists.gnu.org/archive/html/emacs-orgmode/2011-07/msg00795.html
---
 contrib/lisp/org-odt.el |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/contrib/lisp/org-odt.el b/contrib/lisp/org-odt.el
index f3a4067..bd2ea33 100644
--- a/contrib/lisp/org-odt.el
+++ b/contrib/lisp/org-odt.el
@@ -1380,6 +1380,8 @@ MAY-INLINE-P allows inlining it as an image."
     (PLAIN-TEXT-MAP '(("&" . "&amp;") ("<" . "&lt;") (">" . "&gt;")))
     (TABLE-FIRST-COLUMN-AS-LABELS nil)
     (FOOTNOTE-SEPARATOR (org-lparse-format 'FONTIFY "," 'superscript))
+    (CODING-SYSTEM-FOR-WRITE 'utf-8)
+    (CODING-SYSTEM-FOR-SAVE 'utf-8)
     (t (error "Unknown property: %s"  what))))
 
 (defun org-odt-parse-label (label)
-- 
1.7.2.3


[-- Attachment #3: Type: text/plain, Size: 4920 bytes --]




> Hi Jambunathan,
>
> See comments below.
>
> Ciao,
> Renzo
> P.S. I'm on a camping-site right now, so I do not have good Internet access...
>
> On 16 July 2011 22:13, Jambunathan K <kjambunathan@gmail.com> wrote:
>>
>> Renzo
>>
>>> I just want to add one point that I did not find in the org-manual.  I tested
>>> some of my org-files and exported them to the OpenOffice format. When I tried to
>>> open these documents in OpenOffice, they were corrupt and could not be opened.
>>>
>>> I soon found out why. If you want to export an org-mode file to .odt, you need
>>> to explicitly set the file encoding to UTF-8 (I usually use iso-8859-1 encoding
>>> for my files), like:
>>> #-*- mode: org; coding: utf-8; -*-
>>> After that OpenOffice could open the files without any problems.
>>
>> I use English for communication and I have to admit that I have zero
>> understanding of things like character sets, encodings etc.
>
> As for communicating: I'm from the border region of The Netherlands, Belgium
> and Germany... I'm therefore multilingual, and often need to type words
> with accents.
>
>> Thanks for the above note. I surely see it is a bug, but my poor
>> understanding prevents me from quantifying it further.
>
> Well... I would not really see it as a bug... As long as it is mentioned in the
> documentation that org-file encodings other than utf-8 could result in corrupt
> output files.
>
>> Could you please send me a minimal iso-8859-1 test.org file and the
>> associated corrupted test.odt file? I will look into this issue.
>
> See attachment. I can only send you the org file, because I do not have access
> to a working Emacs at the moment...
>
>> 1. Do you have any specific requirement on how the component xml files
>>   be encoded? A cursory look at the odt exporter suggests that it could
>>   actually be emitting xml files in iso-8859-1 format while wrongly
>>   claiming UTF-8 encoding as below
>>
>> --8<---------------cut here---------------start------------->8---
>> <?xml version="1.0" encoding="UTF-8"?>
>> --8<---------------cut here---------------end--------------->8---
>>
>> 2. Should the xml file always be emitted in UTF-8 irrespective of how
>>   the original Org file is encoded?
>
> Yes, that would seem a good solution to me... If the odt-exporter checks the
> file's encoding, and then changes the encoding to utf-8 (maybe using a temporary
> buffer?) before the actual exporting, then there would be no further
> problems...
>
> As for the idea that the OpenOffice xml can actually be in an encoding other
> than utf-8: I do not know how much work it would be for you to implement that
> in the odt-exporter. It might be too much effort...
> Also, I don't know if such an OpenOffice document will open without problems in
> all OpenOffice applications.
>
>> [Notes to Self]
>> [Notes from odbook]
>>
>> Para 3 of http://books.evc-cit.info/odbook/apa.html#appc-11-fm2xml
>> says
>>
>> --8<---------------cut here---------------start------------->8---
>> OpenDocument files are always encoded in UTF-8.
>> --8<---------------cut here---------------end--------------->8---
>>
>> Para 2 of
>> http://books.evc-cit.info/odbook/apa.html#xml-other-char-encodings-section
>> says
>>
>> --8<---------------cut here---------------start------------->8---
>> XML 1.0 allows a document to be encoded in any character set registered
>> with the Internet Assigned Numbers Authority (IANA). European documents
>> are commonly encoded in one of the ISO Latin character sets, such as
>> ISO-8859-1. Japanese documents commonly use Shift-JIS, and Chinese
>> documents use GB2312 and Big 5.
>> --8<---------------cut here---------------end--------------->8---
>>
>> Para 4 of
>> http://books.evc-cit.info/odbook/apa.html#xml-other-char-encodings-section
>> says
>>
>> --8<---------------cut here---------------start------------->8---
>> XML processors are not required by the XML 1.0 specification to support
>> any more than UTF-8 and UTF-16, but most commonly support other
>> encodings, such as US-ASCII and ISO-8859-1.
>> --8<---------------cut here---------------end--------------->8---
>>
>>
>> [Notes from XMLmind XSL-FO Converter]
>>
>>
>> XFC supports outputting of content.xml and styles.xml in UTF-8 as well
>> as ISO-8859-1.
>>
>> http://xml.web.cern.ch/XML/www.xmlmind.com/xfc_perso_java-4_4_0/doc/user/command_line_java.html
>>
>> says
>>
>> ,---- [see outputEncoding section]
>> | For OpenDocument output (.odt), this option specifies the encoding of
>> | XML content (files styles.xml and content.xml) in the output
>> | document. All encodings available in the current JVM are supported. The
>> | option value may be either the encoding name (e.g. ISO8859_1) or the
>> | charset name (e.g. ISO-8859-1). The default value is UTF8.
>> `----
>>
>> --
>
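
The mismatch discussed in the quoted text can be reproduced in a few
lines. This is only an illustration, assuming the single character "é"
(written below as "\u00e9") as sample input; `my/roundtrip' is a made-up
name. Bytes produced under iso-8859-1 do not decode back to the same
text when read as UTF-8, which is what a consumer does when the XML
declaration claims UTF-8.

```elisp
;; Illustration only; `my/roundtrip' is a hypothetical name.
(defun my/roundtrip (text write-coding read-coding)
  "Encode TEXT with WRITE-CODING, then decode the bytes with READ-CODING."
  (decode-coding-string (encode-coding-string text write-coding)
                        read-coding))

;; Written as iso-8859-1 but read as UTF-8: the text does not survive.
(my/roundtrip "\u00e9" 'iso-8859-1 'utf-8)

;; Written and read as UTF-8: the text survives.
(my/roundtrip "\u00e9" 'utf-8 'utf-8)
```

The same character takes one byte in iso-8859-1 and two bytes in UTF-8,
so the declared and the actual encoding have to agree.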

-- 

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: ODT Charset/Encoding issues (was question about ODT export behavior)
  2011-07-17 19:13                 ` Jambunathan K
@ 2011-07-18  8:59                   ` Bastien
  0 siblings, 0 replies; 17+ messages in thread
From: Bastien @ 2011-07-18  8:59 UTC (permalink / raw)
  To: Jambunathan K; +Cc: Renzo Been, emacs-orgmode, Christian Moe

Hi Jambunathan,

Jambunathan K <kjambunathan@gmail.com> writes:

> The attached patch ensures that component xml files created by the odt
> exporter are always utf-8 encoded.

This looks like a reasonable change to me.

Please feel free to commit this patch.

Thanks,

-- 
 Bastien

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/2] org-odt: Improve customization of org-export-odt-styles-file
  2011-07-15  5:54         ` Jambunathan K
  2011-07-15 20:34           ` Renzo Been
@ 2011-07-22 14:38           ` Jambunathan K
  2011-07-22 15:49             ` Bastien
  1 sibling, 1 reply; 17+ messages in thread
From: Jambunathan K @ 2011-07-22 14:38 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 272 bytes --]


> I see that the customization interface for the org-export-odt-styles-file
> variable is only partially done. If the customization interface doesn't
> do the right thing for you, you can use the setq form temporarily.

The attached patch takes care of the above "issue".


[-- Attachment #2: 0001-org-odt-Improve-customization-of-org-export-odt-styl.patch --]
[-- Type: text/plain, Size: 3751 bytes --]

From fe6cc741850cdfca4bd9577430f744208957e3eb Mon Sep 17 00:00:00 2001
From: Jambunathan K <kjambunathan@gmail.com>
Date: Fri, 22 Jul 2011 16:37:33 +0530
Subject: [PATCH 1/2] org-odt: Improve customization of org-export-odt-styles-file

* contrib/lisp/org-odt.el (org-odt-data-dir)
(org-export-odt-automatic-styles-file): Update docstring.
(org-export-odt-use-bookmarks-for-internal-links): Update
docstring. Improve customization interface.
---
 contrib/lisp/org-odt.el |   60 ++++++++++++++++++++++++++++++++++++----------
 1 files changed, 47 insertions(+), 13 deletions(-)

diff --git a/contrib/lisp/org-odt.el b/contrib/lisp/org-odt.el
index ea4e32b..c1c5b7f 100644
--- a/contrib/lisp/org-odt.el
+++ b/contrib/lisp/org-odt.el
@@ -73,7 +73,16 @@
     (cond
      ((file-directory-p dir1) dir1)
      ((file-directory-p dir2) dir2)
-     (t (error "Cannot find factory styles file. Check package dir layout")))))
+     (t (error "Cannot find factory styles file. Check package dir layout"))))
+  "Directory that holds auxiliary files used by the ODT exporter.
+
+The 'styles' subdir contains the following xml files -
+ 'OrgOdtStyles.xml' and 'OrgOdtAutomaticStyles.xml' - which are
+ used as factory settings of `org-export-odt-styles-file' and
+ `org-export-odt-automatic-styles-file'.
+
+The 'etc/schema' subdir contains rnc files for validating
+OpenDocument xml files.")
 
 (defvar org-odt-file-extensions
   '(("odt" . "OpenDocument Text")
@@ -135,22 +144,47 @@
 (org-lparse-register-backend 'odt)
 
 (defcustom org-export-odt-automatic-styles-file nil
-  "Default style file for use with ODT exporter."
+  "Automatic styles for use with ODT exporter.
+If unspecified, the file under `org-odt-data-dir' is used."
   :type 'file
   :group 'org-export-odt)
 
-;; TODO: Make configuration user-friendly.
 (defcustom org-export-odt-styles-file nil
-  "Default style file for use with ODT exporter.
-Valid values are path to an styles.xml file or a path to a valid
-*.odt or a *.ott file or a list of the form (FILE (MEMBER1
-MEMBER2 ...)). In the last case, the specified FILE is unzipped
-and MEMBER1, MEMBER2 etc are copied in to the generated odt
-file. The last form is particularly useful if the styles.xml has
-reference to additional files like header and footer images.
-"
-  :type 'file
-  :group 'org-export-odt)
+  "Default styles file for use with ODT export.
+Valid values are one of:
+1. nil
+2. path to a styles.xml file
+3. path to a *.odt or a *.ott file
+4. list of the form (ODT-OR-OTT-FILE (FILE-MEMBER-1 FILE-MEMBER-2
+...))
+
+In case of option 1, an in-built styles.xml is used. See
+`org-odt-data-dir' for more information.
+
+In case of option 3, the specified file is unzipped and the
+styles.xml embedded therein is used.
+
+In case of option 4, the specified ODT-OR-OTT-FILE is unzipped
+and FILE-MEMBER-1, FILE-MEMBER-2 etc are copied in to the
+generated odt file.  Use relative path for specifying the
+FILE-MEMBERS.  styles.xml must be specified as one of the
+FILE-MEMBERS.
+
+Use options 1, 2 or 3 only if styles.xml alone suffices for
+achieving the desired formatting.  Use option 4, if the styles.xml
+references additional files like header and footer images for
+achieving the desired formatting."
+  :group 'org-export-odt
+  :type
+  '(choice
+    (const :tag "Factory settings" nil)
+    (file :must-match t :tag "styles.xml")
+    (file :must-match t :tag "ODT or OTT file")
+    (list :tag "ODT or OTT file + Members"
+	  (file :must-match t :tag "ODF Text or Text Template file")
+	  (cons :tag "Members"
+		(file :tag "	Member" "styles.xml")
+		(repeat (file :tag "Member"))))))
 
 (defconst org-export-odt-tmpdir-prefix "odt-")
 (defconst org-export-odt-bookmark-prefix "OrgXref.")
-- 
1.7.2.3


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/2] org-odt: Improve customization of org-export-odt-styles-file
  2011-07-22 14:38           ` [PATCH 1/2] org-odt: Improve customization of org-export-odt-styles-file Jambunathan K
@ 2011-07-22 15:49             ` Bastien
  0 siblings, 0 replies; 17+ messages in thread
From: Bastien @ 2011-07-22 15:49 UTC (permalink / raw)
  To: Jambunathan K; +Cc: emacs-orgmode

Jambunathan K <kjambunathan@gmail.com> writes:

>> I see that the customization interface for the org-export-odt-styles-file
>> variable is only partially done. If the customization interface doesn't
>> do the right thing for you, you can use the setq form temporarily.
>
> The attached patch takes care of the above "issue".

Applied, thanks!

-- 
 Bastien

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2011-07-22 15:48 UTC | newest]

Thread overview: 17+ messages
2011-07-13 13:20 question about ODT export behavior Rainer Stengele
2011-07-13 14:23 ` Bastien
2011-07-13 15:04   ` Rainer Stengele
2011-07-13 16:14     ` Bastien
2011-07-13 20:18       ` Jambunathan K
2011-07-13 16:55 ` Jambunathan K
2011-07-13 20:15   ` Jambunathan K
2011-07-14  6:50     ` Rainer Stengele
2011-07-14 15:44       ` Bastien
2011-07-15  5:54         ` Jambunathan K
2011-07-15 20:34           ` Renzo Been
2011-07-16 20:13             ` ODT Charset/Encoding issues (was question about ODT export behavior) Jambunathan K
2011-07-17 14:12               ` Renzo Been
2011-07-17 19:13                 ` Jambunathan K
2011-07-18  8:59                   ` Bastien
2011-07-22 14:38           ` [PATCH 1/2] org-odt: Improve customization of org-export-odt-styles-file Jambunathan K
2011-07-22 15:49             ` Bastien

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
