emacs-orgmode@gnu.org archives
* odt export bug, I think.
@ 2011-08-16 20:52 Matt Price
  2011-08-17 23:47 ` Jambunathan K
  0 siblings, 1 reply; 3+ messages in thread
From: Matt Price @ 2011-08-16 20:52 UTC
  To: Org Mode


[-- Attachment #1.1: Type: text/plain, Size: 848 bytes --]

Hi,

I think I've found an odt export bug.  Certain complex URLs stored within
links can end up being rendered with forbidden characters, e.g. '<' and
'>'.  For example, a link to this URL:

http://www.jstor.org.myaccess.library.utoronto.ca/sici?origin=sfx%253Asfx&sici=1363-3554%25281995%252939%253C182%253E1.0.CO%253B2-L&

was rendered in content.xml like this:
xlink:href="http://www.jstor.org/sici?origin=sfx:sfx&amp;sici=1363-3554(1995)39<182>1.0.CO;2-L&amp;"

resulting in a syntax error when libreoffice tries to load it.  I've
attached a minimal test file that reproduces the bug.
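
A minimal Python sketch of the failure described above. It assumes the exporter percent-decodes the link and writes the result into the attribute without escaping angle brackets; the thread does not confirm the exporter's internals, and the `<a href=...>` element is a hypothetical stand-in for the real content.xml markup.

```python
from urllib.parse import unquote
from xml.etree import ElementTree
from xml.sax.saxutils import quoteattr

# Query value from the report, percent-encoded twice (%253C is an encoded "%3C").
double_encoded = "sici=1363-3554%25281995%252939%253C182%253E1.0.CO%253B2-L"

# Decoding twice yields the literal characters seen in content.xml.
decoded = unquote(unquote(double_encoded))


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ElementTree.fromstring(xml_text)
        return True
    except ElementTree.ParseError:
        return False


base = "http://www.jstor.org/sici?"

# Writing the decoded text straight into an attribute leaves a raw "<" behind,
# which XML does not allow inside attribute values.
raw_xml = '<a href="' + base + decoded + '"/>'

# quoteattr escapes &, < and > and adds the surrounding quotes.
escaped_xml = "<a href=" + quoteattr(base + decoded) + "/>"

print(is_well_formed(raw_xml))
print(is_well_formed(escaped_xml))
```

Escaping at the point where the attribute is written is only one possible remedy; keeping the link percent-encoded would also avoid the raw '<'.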

This is happening under a recent git snapshot of org-mode, using an emacs
snapshot from 2011-04.  Not sure if there are other xml-related packages
whose versions I should be tracking.

Thanks as always, and let me know what I can do to help with this.

Matt

[-- Attachment #1.2: Type: text/html, Size: 1241 bytes --]

[-- Attachment #2: test.odt --]
[-- Type: application/vnd.oasis.opendocument.text, Size: 8346 bytes --]

[-- Attachment #3: test.org --]
[-- Type: application/octet-stream, Size: 268 bytes --]

- Massey, Doreen. “[[http://www.jstor.org/sici?origin=sfx%g3Asfx&sici=1363-3554%281995%2939%3C182%3E1.0.CO%3B2-L&][Places and Their Pasts.]] History Workshop Journal 39 (Spring 1995): 182-192
- Novick, Robert "The Defense of the West," ch. 10 of /That Noble Dream/ 


* Re: odt export bug, I think.
  2011-08-16 20:52 odt export bug, I think Matt Price
@ 2011-08-17 23:47 ` Jambunathan K
  2011-08-18  0:44   ` Matt Price
  0 siblings, 1 reply; 3+ messages in thread
From: Jambunathan K @ 2011-08-17 23:47 UTC
  To: Matt Price; +Cc: Org Mode


Hello Matt

> Hi,
>
> I think I've found an odt export bug.  Certain complex URL's stored
> within links can end up being rendered with forbidden characters,
> e.g. '<' and '>'.  so, e.g., a link to this URL:
>
> http://www.jstor.org.myaccess.library.utoronto.ca/sici?origin=
> sfx%253Asfx&sici=1363-3554%25281995%252939%253C182%253E1.0.CO%253B2-L
> &
>
> was rendered in content.xml like this:
> xlink:href="http://www.jstor.org/sici?origin=sfx:sfx&amp;sici=
> 1363-3554(1995)39<182>1.0.CO;2-L&amp;"


I have some understanding of what the issue is. I would like to
know/confirm a few things before proceeding ahead:

1. What does the original URL look like?
2. Where does the URL come from? Is it generated by an application or is
   it hand copied by you from your browser?
3. How do you enter the URL into the Org file? Specifically, do you

   - Simply type it, i.e. type the open brackets, paste the link, paste
     the description, close the brackets etc.
  
  Or

   - You use C-c l to store the link in Org file.

  Note that question 3 is crucial, because for the URL that you have
  provided, what you see with C-c l on the link is different from what
  is actually stored in the Org file. (You can see how Org actually
  stores the link by backspacing from beyond the link or by toggling
  descriptive/literal links in the menu bar.)

Please respond to Question 1 keeping the behaviour in 3 in mind. I am
specifically interested in seeing whether the app/database (if there is
one) actually provides a hexified link or not. I also see the
possibility that one could have handcrafted the URL as a one-off
by concatenating key/val pairs and forming the query string oneself. In
this case a (novice) user may not have hexified the URL to begin with.
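
A minimal Python sketch of what "hexified" means for this link, assuming it refers to percent-encoding. Python's `urllib.parse.quote` stands in for whatever Org or the browser actually does (that substitution is an assumption), and `hexify_level` is a hypothetical helper written for illustration.

```python
from urllib.parse import quote, unquote

# Final segment of the URL as it reads in the browser location bar.
plain = "1363-3554(1995)39<182>1.0.CO;2-L"

# Hexified once: the form that appears in the test.org link.
once = quote(plain, safe="")
# Hexified twice: every "%" becomes "%25", as in the URL from the first message.
twice = quote(once, safe="")


def hexify_level(text, limit=5):
    """Count how many rounds of decoding it takes until the text stops changing."""
    level = 0
    while level < limit and unquote(text) != text:
        text = unquote(text)
        level += 1
    return level


print(once)
print(twice)
print(hexify_level(twice))
```

Comparing the stored link against these forms is one way to tell whether a URL arrived unencoded, hexified once, or hexified twice.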

PS: If my understanding is correct, you are having similar problems
with the HTML export (M-x org-export-as-html) as well: either the HTML
file is malformed or the link in the exported HTML file simply doesn't
work. (TIP: the odt exporter is derived from the HTML exporter, so it is
always a good idea to check the status of the HTML export whenever one
runs into issues with the odt exporter.)

I anticipate that a fix for this issue might need some discussion with
Bastien, David Maus and maybe others.

Jambunathan K.




>
> resulting in a syntax error when libreoffice tries to load it.  I've
> attached a minimal test file that reproduces the bug. 
>
> This is happening under a recent git snapshot of org-mode, using an
> emacs snapshot from 2011-04.  Not sure if there are other xml-related
> packages whose versions I should be tracking.
>
> Thanks as always, and let me know what I can do to help with this.
>
> Matt
>
>
>
>

-- 


* Re: odt export bug, I think.
  2011-08-17 23:47 ` Jambunathan K
@ 2011-08-18  0:44   ` Matt Price
  0 siblings, 0 replies; 3+ messages in thread
From: Matt Price @ 2011-08-18  0:44 UTC
  To: Jambunathan K; +Cc: Org Mode


[-- Attachment #1.1: Type: text/plain, Size: 4280 bytes --]

Hi Jambunathan,

This is a little hard to do in Gmail, which auto-encodes everything!
Unfortunately Wanderlust has been broken for me for some time...

I've attached a text file that I think answers all your questions
appropriately.  Unfortunately my original file is a year old, so I'm
not entirely sure of the answer to question 3 (see below).

On Wed, Aug 17, 2011 at 7:47 PM, Jambunathan K <kjambunathan@gmail.com> wrote:

>
> Hello Matt
>
> > Hi,
> >
> > I think I've found an odt export bug.  Certain complex URL's stored
> > within links can end up being rendered with forbidden characters,
> > e.g. '<' and '>'.  so, e.g., a link to this URL:
> >
> > http://www.jstor.org.myaccess.library.utoronto.ca/sici?origin=
> > sfx%253Asfx&sici=1363-3554%25281995%252939%253C182%253E1.0.CO%253B2-L
> > &
> >
> > was rendered in content.xml like this:
> > xlink:href="http://www.jstor.org/sici?origin=sfx:sfx&amp;sici=
> > 1363-3554(1995)39<182>1.0.CO;2-L&amp;"
>
>
> I have some understanding of what the issue is. I would like to
> know/confirm a few things before proceeding ahead:
>
> 1. How does the original URL look like?
>
In my browser window, this is the way the link looks (pasted directly):
http://www.jstor.org.myaccess.library.utoronto.ca/sici?origin=sfx%25g3Asfx&sici=1363-3554%281995%2939%3C182%3E1.0.CO%3B2-L&

In the browser location bar, though, you can see the angle brackets (so the
final segment of the URL reads: 1363-3554(1995)39<182>1.0.CO;2-L& ).  For
reasons I don't understand, neither parentheses nor angle brackets can be
pasted into Emacs or any other editor (this is on Ubuntu Maverick, running
Gnome 2.something).

> 2. Where does the URL come from? Is it generated by an application or is
>   it hand copied by you from your browser.
>
I'm *pretty* sure I got it using org-capture or (possibly!) org-remember. In
those days I used org-protocol to capture links; I don't really do that
anymore, though not for any particular reason.

> 3. How do you enter the URL in to the org file. Specifically do you
>
>   - Simply type it. ie type the open brackets, paste the link, paste
>     the description, close  the brackets etc.
>
>  Or
>
>   - You use C-c l to store the link in Org file.
>
I am fairly certain I used org-protocol to capture the links. But note that
the error seems to persist even if I simply cut and paste directly from my
browser window.


>  Note that question 3 is very crucial because. This is because for the
>  URL that you have provided what you see with C-c l on the link is
>  different from what is actually stored in the Org file. (You can see
>  how actually Org stores the link by backspacing from beyond the link
>  or by toggling descriptive/literal links in the menu bar)
>
> Please respond to Question 1 keeping behaviour in 3 mind. I am
> specifically interested in seeing whether the app/database (if there is
> one) actually provides a hexified link or not. I also see the
> possibility that one could have handcrafted the URL in an one-off sense
> by concatenating key/val pairs and forming the query string oneself. In
> this case (a novice) user may not have hexified the URL to begin with.
>
> ps: If my understanding is correct you are also having similar problems
> with the html export (M-x org-export-as-html) as well. Either html file
> is malformed or the link in the html export file simply doesn't
> work.


This is precisely correct.


> (TIP: odt exporter is derived from the html exporter. So it is
> always a good idea to check the status of html export whenever one runs
> in to issues with odt exporter)
>

Thanks for this.

>
> I anticipate that fix for this issue might need some discussions with
> Bastien, David Maus and may be others.
>
If the issue originates in the manner in which the initial URL was entered
into org/emacs, then this might not be worth too much of everyone's time. I
think, in all likelihood, I used an outmoded manner of exchange between org
and firefox (using org 6.x!), which even I don't use anymore.  And it turns
out there is a simpler, permanent URL for the same resource, so even that
particular issue no longer matters much for me. If you think it's a
significant issue, though, I will certainly do what I can to track it down.


thanks again, so much,
Matt

[-- Attachment #1.2: Type: text/html, Size: 6169 bytes --]

[-- Attachment #2: testingurls.org --]
[-- Type: application/octet-stream, Size: 611 bytes --]

My original list item:
- Massey, Doreen. [http://www.jstor.org/sici?origin=sfx%g3Asfx&sici=1363-3554%281995%2939%3C182%3E1.0.CO%3B2-L&][Places and Their Pasts.]] History Workshop Journal 39 (Spring 1995): 182-192

The url as it appears when pasted directly from my browser:
http://www.jstor.org/sici?origin=sfx%25g3Asfx&sici=1363-3554%281995%2939%3C182%3E1.0.CO%3B2-L&

The list item as 
- Massey, Doreen. "[[http://www.jstor.org.myaccess.library.utoronto.ca/sici?origin=sfx%25g3Asfx&sici=1363-3554%281995%2939%3C182%3E1.0.CO%3B2-L&][Places and Their Pasts.]] History Workshop Journal 39 (Spring 1995): 182-192
