emacs-orgmode@gnu.org archives
* Encoding Problem in export?
@ 2013-07-23 23:17 Robert Eckl
  2013-07-23 23:35 ` Nicolas Goaziou
  0 siblings, 1 reply; 25+ messages in thread
From: Robert Eckl @ 2013-07-23 23:17 UTC (permalink / raw)
  To: emacs-orgmode

At least since release_8.0.6-478-g9ee8e2  
(encoding utf-8)

If I use a link which contains the character "=", that character is
replaced with "%3D" in the target, at least for export to HTML and LaTeX.

[[http://example.de/?idprop=222][http://example.de/picture.jpg]]

->

<a href="http://example.de/?idprop%3D222" >
   <img src="http://example.de/picture.jpg" />
</a>

----------------------------------

#+BEGIN_LaTeX
\begin{window}[0,r,\href{http://example.de/?idprop=222}{\includegraphics{picture}},{}]
#+END_LaTeX

->

\begin{window}[0,r,\href{http://example.de/?idprop%3D222}{\includegraphics{picture}},{}]

With a very old version of Org mode (7.93), this was standard behavior for
a while for the LaTeX exporter, but not for HTML.

Cu,
Robert


* Re: Encoding Problem in export?
  2013-07-23 23:17 Encoding Problem in export? Robert Eckl
@ 2013-07-23 23:35 ` Nicolas Goaziou
  2013-07-24  1:50   ` Robert Eckl
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Goaziou @ 2013-07-23 23:35 UTC (permalink / raw)
  To: Robert Eckl; +Cc: emacs-orgmode

Hello,

Robert Eckl <eckl.r@gmx.de> writes:

> At least since release_8.0.6-478-g9ee8e2  

This release number is suspicious. Latest is release_8.0.6-353-g2b5670.
You're 125 commits ahead of us.

> (encoding utf-8)
>
> If i'm using a link which contains the character "=" the character in the
> target is replaced with "%3D", at least for export to html and latex.

I cannot reproduce it. Could you upgrade and test again?


Regards,

-- 
Nicolas Goaziou


* Re: Encoding Problem in export?
  2013-07-23 23:35 ` Nicolas Goaziou
@ 2013-07-24  1:50   ` Robert Eckl
  2013-07-24  7:34     ` Nicolas Goaziou
  0 siblings, 1 reply; 25+ messages in thread
From: Robert Eckl @ 2013-07-24  1:50 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

Hello,

On 24.07.2013 01:35, Nicolas Goaziou wrote:
> Hello,
>
> Robert Eckl <eckl.r@gmx.de> writes:
>
>> At least since release_8.0.6-478-g9ee8e2
> This release number is suspicious. Latest is release_8.0.6-353-g2b5670.
> You're 125 commits ahead of us.
>
>> (encoding utf-8)
>>
>> If i'm using a link which contains the character "=" the character in the
>> target is replaced with "%3D", at least for export to html and latex.
> I cannot reproduce it. Could you upgrade and test again?
I did upgrade to release_8.0.6-353-g2b5670 and the issue persists.

The issue seems to have been introduced with
Org-mode version 8.0.6 (release_8.0.6-4-g21dd83
      org-element: Do not url-decode parsed links
The description sounds like the issue, no?

Org-mode version 8.0.6 (release_8.0.6-3-g40b44e  works fine.


I played a bit with the ELPA version
Org-mode version 8.0.6 (8.0.6-5-gb4a8ec-elpa @ 
/home/re/.emacs.d/elpa/org-20130722/)

but with this I wasn't able to run the exporter:

     Wrong type argument: arrayp, odt

Systems are Linux Mint Debian and Mac OS X 10.6.8 (Snow Leopard)

Regards,

Robert


* Re: Encoding Problem in export?
  2013-07-24  1:50   ` Robert Eckl
@ 2013-07-24  7:34     ` Nicolas Goaziou
  2013-07-24  8:46       ` Robert Eckl
                         ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Nicolas Goaziou @ 2013-07-24  7:34 UTC (permalink / raw)
  To: eckl.r; +Cc: emacs-orgmode

Hello,

Robert Eckl <eckl.r@gmx.de> writes:

> The issue seems to be introduced with
> Org-mode version 8.0.6 (release_8.0.6-4-g21dd83
>      org-element: Do not url-decode parsed links
> The description sounds like the issue, no?

IIUC, this is different from your issue. You write a URL that is not
encoded, and, somehow, it gets encoded in the export output. This patch
is about not decoding something already encoded.

My guess is that your URL is encoded in the original buffer already. Try
to use M-x visible-mode to see what is the real URL. If you see
"http://example.de/?idprop%3D222", it probably means that
`org-link-escape' is a bit too zealous (BTW why doesn't this function rely
on `url-encode-url'?)


Regards,

-- 
Nicolas Goaziou


* Re: Encoding Problem in export?
  2013-07-24  7:34     ` Nicolas Goaziou
@ 2013-07-24  8:46       ` Robert Eckl
  2013-07-24  9:16         ` Nicolas Goaziou
  2013-07-24  9:39       ` Nick Dokos
  2013-11-16 15:16       ` Michael Brand
  2 siblings, 1 reply; 25+ messages in thread
From: Robert Eckl @ 2013-07-24  8:46 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

On 24.07.2013 09:34, Nicolas Goaziou wrote:
> Hello,
>
> Robert Eckl <eckl.r@gmx.de> writes:
>
>> The issue seems to be introduced with
>> Org-mode version 8.0.6 (release_8.0.6-4-g21dd83
>>       org-element: Do not url-decode parsed links
>> The description sounds like the issue, no?
> IIUC, this is different from your issue. You write a URL that is not
> encoded, and, somehow, it gets encoded in the export output. This patch
> is about not decoding something already encoded.
You are right.
>
> My guess is that your URL is encoded in the original buffer already. Try
> to use M-x visible-mode to see what is the real URL. If you see
> "http://example.de/?idprop%3D222", it probably means that
> `org-link-escape' is a bit too zealous (BTW why doesn't this function rely
> on `url-encode-url'?)
OK, using visible-mode I see the link with %3D. I can replace it,
and then all works fine. If I insert a new link from the clipboard, I get %3D.


Regards,

Robert


* Re: Encoding Problem in export?
  2013-07-24  8:46       ` Robert Eckl
@ 2013-07-24  9:16         ` Nicolas Goaziou
  2013-07-24 10:27           ` Robert Eckl
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Goaziou @ 2013-07-24  9:16 UTC (permalink / raw)
  To: eckl.r; +Cc: emacs-orgmode

Robert Eckl <eckl.r@gmx.de> writes:

> Ok, using  visible-mode i see the link with %3D, i can replace it,
> then all works fine. If iinsert a new link from clipboard, i get %3D.

Then, as I said, `org-link-escape' (called from `org-insert-link', aka
C-c C-l) is buggy.

I think we should replace every occurrence of `org-link-escape' with
`url-encode-url' and `org-link-unescape' with `url-unhex-string'.
I can't see a reason to reinvent the wheel here.

AFAICT, only org-mobile.el uses optional arguments from
`org-link-escape'. It just begs for a new internal function in
org-mobile.el.

WDYT?


Regards,

-- 
Nicolas Goaziou


* Re: Encoding Problem in export?
  2013-07-24  7:34     ` Nicolas Goaziou
  2013-07-24  8:46       ` Robert Eckl
@ 2013-07-24  9:39       ` Nick Dokos
  2013-07-24 11:09         ` Nicolas Goaziou
  2013-11-16 15:16       ` Michael Brand
  2 siblings, 1 reply; 25+ messages in thread
From: Nick Dokos @ 2013-07-24  9:39 UTC (permalink / raw)
  To: emacs-orgmode

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> My guess is that your URL is encoded in the original buffer already. Try
> to use M-x visible-mode to see what is the real URL. If you see
> "http://example.de/?idprop%3D222", it probably means that
> `org-link-escape' is a bit too zealous (BTW why doesn't this function rely
> on `url-encode-url'?)

FWIW, I think this is correct: using org-insert-link runs
org-link-escape with the default org-link-escape-chars which includes
'='. 

I don't know why '=' is included, so I looked to see whether deleting it
from org-link-escape-chars would cause problems.  Looking at the call
sites, org-link-escape is called from:

o org-mobile.el but with its own set of escape chars, so that's no
  problem.

o ob-tangle.el from org-babel-tangle-comment-links, which in turn is
  called (twice) from org-babel-expand-noweb-references. The default
  org-link-escape-chars is used, so any '=' would get escaped, but
  whether leaving it out would cause any problems, I don't know.

o org-docview.el from org-docview-export with the default escape chars.
  Again, I don't know if leaving '=' out would cause any problems.

o org.el from org-make-link-string (thrice) with the default escape
  chars and another three times from org-open-at-point, once for a
  "mailto" link and another two times to open an http/https/ftp/news/doi
  url but with a smaller set of escape chars that does not include '='.

o org-make-link-string is itself called from a bunch of places:
  org-capture.el, org-clock.el, org.el, org-protocol.el, org-w3m.el.
  I didn't chase it down through these.

I tried a ``git blame'' to see whether the '=' was added for some
reason, but it looks as if it's been there ab initio.

Maybe the thing to do is to delete '=' from org-link-escape-chars and
see what problems arise.

But I did find that '%' was originally in org-link-escape-chars and
David Maus hardcoded it (commit 139cc1d4), so that it is *always*
escaped. I assume there is a good reason for that, but if so,
url-encode-url might not be enough - afaict, it leaves '%' signs alone:

,----
| (setq url "http://www.google.org/foo=bar 30%=2")
| "http://www.google.org/foo=bar 30%=2"
| 
| (org-link-escape url)
| "http://www.google.org/foo%3Dbar%2030%25%3D2"
| 
| (url-encode-url url)
| "http://www.google.org/foo=bar%2030%=2"
`----

-- 
Nick


* Re: Encoding Problem in export?
  2013-07-24  9:16         ` Nicolas Goaziou
@ 2013-07-24 10:27           ` Robert Eckl
  0 siblings, 0 replies; 25+ messages in thread
From: Robert Eckl @ 2013-07-24 10:27 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: emacs-orgmode

On 24.07.2013 11:16, Nicolas Goaziou wrote:
> Robert Eckl <eckl.r@gmx.de> writes:
>
>> Ok, using  visible-mode i see the link with %3D, i can replace it,
>> then all works fine. If iinsert a new link from clipboard, i get %3D.
> Then, as I said, `org-link-escape' (called from `org-insert-link', aka
> C-c C-l) is buggy.
>
> I think we should replace every occurrence of `org-link-escape' with
> `url-encode-url' and `org-link-unescape' with `url-unhex-string'.
> I can't see a reason to reinvent the wheel here.
>
> AFAICT, only org-mobile.el uses optional arguments from
> `org-link-escape'. It just begs for a new internal function in
> org-mobile.el.
>
> WDYT?
That's fine with me. IME you really know what you are doing.

Thank you very much,

Robert


* Re: Encoding Problem in export?
  2013-07-24  9:39       ` Nick Dokos
@ 2013-07-24 11:09         ` Nicolas Goaziou
  2013-07-25  4:05           ` David Maus
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Goaziou @ 2013-07-24 11:09 UTC (permalink / raw)
  To: Nick Dokos; +Cc: David Maus, emacs-orgmode

Hello,

Nick Dokos <ndokos@gmail.com> writes:

> Maybe the thing to do is to delete '=' from org-link-escape-chars and
> see what problems arise.

AFAICT, `url-encode-url' is subtler than that. It encodes characters
whenever they are really forbidden, which is not the case of
`org-link-escape'. Hence my initial question: do we need to reinvent the
wheel?

> But I did find that '%' was originally in org-link-escape-chars and
> David Maus hardcoded it (commit 139cc1d4), so that it is *always*
> escaped.

I'm Cc'ing David Maus in case he has time to enlighten us about his choice.

> I assume there is a good reason for that, but if so, url-encode-url
> might not be enough - afaict, it leaves '%' signs alone:

Yes, there is a comment in url-util.el:

  (defconst url-host-allowed-chars
    ;; Allow % to avoid re-encoding %-encoded sequences.
    (url--allowed-chars (append '(?% ?! ?$ ?& ?' ?\( ?\) ?* ?+ ?, ?\; ?=)
  			      url-unreserved-chars))
    "Allowed-character byte mask for the host segment of a URI.
  These characters are specified in RFC 3986, Appendix A.")

Not sure how it could affect URI correctness. I trust "url-util.el"
authors, though.


Regards,

-- 
Nicolas Goaziou


* Re: Encoding Problem in export?
  2013-07-24 11:09         ` Nicolas Goaziou
@ 2013-07-25  4:05           ` David Maus
  2013-07-25 21:46             ` Nicolas Goaziou
  0 siblings, 1 reply; 25+ messages in thread
From: David Maus @ 2013-07-25  4:05 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Nick Dokos, emacs-orgmode, David Maus

Hi Nicolas,
Hi Nick,

At Wed, 24 Jul 2013 13:09:05 +0200,
Nicolas Goaziou wrote:
> 
> Hello,
> 
> Nick Dokos <ndokos@gmail.com> writes:
> 
> > Maybe the thing to do is to delete '=' from org-link-escape-chars and
> > see what problems arise.
> 
> AFAICT, `url-encode-url' is subtler than that. It encodes characters
> whenever they are really forbidden, which is not the case of
> `org-link-escape'. Hence my initial question: do we need to reinvent the
> wheel?
> 
> > But I did find that '%' was originally in org-link-escape-chars and
> > David Maus hardcoded it (commit 139cc1d4), so that it is *always*
> > escaped.
> 
> I Cc David Maus in case he has time to enlighten us about his choice.
>

IIRC org-link-escape is not used to create URLs but to escape
characters in a link that would otherwise conflict with Orgmode syntax
(e.g. square brackets). Org applies percent escaping to a link before
it is stored in the buffer and applies unescaping when it reads a link
back.

The percent sign is hardcoded because if org-link-escape/unescape is
used in this way we must make sure that the identity of a link is
preserved. If we did *not* escape the percent sign, then an original
link with percent encoded characters would be read back wrongly,
i.e. with the percent escaped characters unescaped.

This broke links.

E.g. consider a redirector link to the target URL
`http://target.example.org?id=33&format=html':

,----
| http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml
`----

If we don't escape the percent sign but apply unescaping when, say,
the user opens the link we would get:

,----
| http://redirect.example.org?url=http://target.example.org?id=33&format=html
`----

And voila: The `format' parameter is turned into a query parameter of
redirect.example.org, not target.example.org.
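
The round trip described above can be sketched in Emacs Lisp. This is
only an illustration, not Org's actual code: `url-unhex-string' (from
url-util.el) stands in for whatever unescaping Org applies when it reads
a link back, and the `my/' names are hypothetical.

```elisp
(require 'url-util)

(defvar my/redirect-link
  "http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml"
  "Hypothetical already-encoded redirector link from the example above.")

(defun my/escape-percent (link)
  "Return LINK with every \"%\" replaced by \"%25\" (illustrative helper)."
  (replace-regexp-in-string "%" "%25" link t t))

;; Stored verbatim, then unescaped on read: the nested URL is decoded
;; and "&format=html" now belongs to redirect.example.org.
(url-unhex-string my/redirect-link)
;; => "http://redirect.example.org?url=http://target.example.org?id=33&format=html"

;; Percent signs escaped before storing: unescaping restores the link.
(url-unhex-string (my/escape-percent my/redirect-link))
;; => the original value of `my/redirect-link'
```

Escaping the escape character first is what makes the second call an
identity operation on links that were already percent-encoded.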

The specs (RFC3986) have to say the following about escaping:

,----
|    Because the percent ("%") character serves as the indicator for
|    percent-encoded octets, it must be percent-encoded as "%25" for that
|    octet to be used as data within a URI.  Implementations must not
|    percent-encode or decode the same string more than once, as decoding
|    an already decoded string might lead to misinterpreting a percent
|    data octet as the beginning of a percent-encoding, or vice versa in
|    the case of percent-encoding an already percent-encoded string.
`----

There is, of course, the nasty thing that we don't know if the link in
a buffer went through org-link-escape or not. E.g. if you paste

,----
| [[http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml]]
`----

into the buffer you'll get a broken link because org-link-open assumes
the link to be escaped by org.

The bottom line: Org creates links programmatically (org-store-link)
and needs a mechanism to protect conflicting characters. It chose
percent-escaping and in order to preserve the identity of a link Org
has to escape the escape-character.

Hope that helps!

Best,
  -- David
-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de


* Re: Encoding Problem in export?
  2013-07-25  4:05           ` David Maus
@ 2013-07-25 21:46             ` Nicolas Goaziou
  2013-07-26  4:03               ` David Maus
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Goaziou @ 2013-07-25 21:46 UTC (permalink / raw)
  To: David Maus; +Cc: Nick Dokos, emacs-orgmode

Hello,

David Maus <dmaus@ictsoc.de> writes:

> IIRC org-link-escape is not used to create URLs but to escape
> characters in a link that would otherwise conflict with Orgmode syntax
> (e.g. square brackets).

> Org applies percent escaping to a link before
> it is stored in the buffer and applies unescaping when it reads a link
> back.
>
> The percent sign is hardcoded because if org-link-escape/unescape is
> used in this way we must make sure that the identity of a link is
> preserved. If we did *not* escape the percent sign, then an original
> link with percent encoded characters would be read back wrongly,
> i.e. with the percent escaped characters unescaped.

[...]

> There is, of course, the nasty thing that we don't know if the link in
> a buffer went through org-link-escape or not. E.g. if you paste
>
> ,----
> | [[http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml]]
> `----
>
> into the buffer you'll get a broken link because org-link-open assumes
> the link to be escaped by org.
>
> The bottom line: Org creates links programmatically (org-store-link)
> and needs a mechanism to protect conflicting characters. It chose
> percent-escaping and in order to preserve the identity of a link Org
> has to escape the escape-character.
>
> Hope that helps!

It does.

I think we are hunting two hares and that's why we are failing so far.

There are two URI transformations involved. One is mandatory (escape
square brackets in URI), and the other one is optional (normalize URI
for external processes consumption). The former must be bi-directional,
as escaping brackets must be transparent to the user (e.g., when editing
a link with `org-insert-link'). The latter needn't and can happen on the
fly, just before the URI is sent to whatever needs it (e.g., a browser).

Therefore, I suggest using three functions:

  - `org-link-escape' will first %-escape "%" characters, and then "["
    and "]" characters. `org-link-unescape' will reverse the operation.

    These functions cannot break a link, encoded or not. They are applied
    when a link is created programmatically and read back for user
    editing.

  - `org-link-encode'[1] will %-escape every forbidden character in the
    URI. It doesn't need any "reverse" function. It will be called when
    opening a link, or parsing it.

    I think it shouldn't escape "%" characters, though, so that it can
    be applied on both encoded and plain strings. Since it isn't perfect
    (it doesn't parse URI), it should also be very conservative (i.e.
    allow more characters such as "=" or "&") and not get in the way.
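
A minimal sketch of the reversible escape/unescape pair proposed in the
first item, under the assumption that only "%", "[" and "]" are
touched. The names `my/link-escape' and `my/link-unescape' are
hypothetical and do not correspond to the existing Org functions.

```elisp
(defun my/link-escape (link)
  "Escape \"%\" first, then \"[\" and \"]\", in LINK."
  (let ((s (replace-regexp-in-string "%" "%25" link t t)))
    (setq s (replace-regexp-in-string (regexp-quote "[") "%5B" s t t))
    (replace-regexp-in-string (regexp-quote "]") "%5D" s t t)))

(defun my/link-unescape (link)
  "Reverse `my/link-escape': restore brackets first, then \"%\"."
  (let ((s (replace-regexp-in-string "%5B" "[" link t t)))
    (setq s (replace-regexp-in-string "%5D" "]" s t t))
    (replace-regexp-in-string "%25" "%" s t t)))

;; "=" is left alone; only "%", "[" and "]" are rewritten.
(my/link-escape "http://example.org/a[1]?x=50%25")
;; => "http://example.org/a%5B1%5D?x=50%2525"

(my/link-unescape (my/link-escape "http://example.org/a[1]?x=50%25"))
;; => "http://example.org/a[1]?x=50%25"
```

Because every literal "%" becomes "%25" before the brackets are encoded,
a "%5B" in the escaped text can only have come from a "[", so the
reverse operation is unambiguous for text produced by the escape
function.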

WDYT?


Regards,

[1] `url-encode-url' was introduced in Emacs 24.3. It is too young to be
used mainstream, even though it does a better job than
`org-link-escape'. We will benefit from it when Emacs 25 is out (i.e.
when Emacs 23 support is dropped).

-- 
Nicolas Goaziou


* Re: Encoding Problem in export?
  2013-07-25 21:46             ` Nicolas Goaziou
@ 2013-07-26  4:03               ` David Maus
  2013-07-26 10:20                 ` Nicolas Goaziou
  0 siblings, 1 reply; 25+ messages in thread
From: David Maus @ 2013-07-26  4:03 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: David Maus, emacs-orgmode, Nick Dokos

At Thu, 25 Jul 2013 23:46:34 +0200,
Nicolas Goaziou wrote:
> 
> Hello,
> 
> David Maus <dmaus@ictsoc.de> writes:
> 
> >
> > The bottom line: Org creates links programmatically (org-store-link)
> > and needs a mechanism to protect conflicting characters. It chose
> > percent-escaping and in order to preserve the identity of a link Org
> > has to escape the escape-character.
> >
> > Hope that helps!
> 
> It does.
> 
> I think we are hunting two hares and that's why we are failing so far.
>
>
> There are two URI transformations involved. One is mandatory (escape
> square brackets in URI), and the other one is optional (normalize URI
> for external processes consumption). The former must be bi-directional,
> as escaping brackets must be transparent to the user (e.g., when editing
> a link with `org-insert-link'). The latter needn't and can happen on the
> fly, just before the URI is sent to whatever needs it (e.g., a browser).
> 
> Therefore, I suggest using three functions:
> 
>   - `org-link-escape' will first %-escape "%" characters, and then "["
>     and "]" characters. `org-link-unescape' will reverse the operation.
> 
>     These functions cannot break a link, encoded or not. They are applied
>     when a link is created programmatically and read back for user
>     editing.

It's not just square brackets, but also non-ASCII
characters. Consider a link that contains UTF-8 encoded characters and
is inserted into an Org buffer encoded in ISO-8859-1.

Oh, and: ASCII control characters. A link description with newlines.

Obviously changing the algorithm of org-link-escape/unescape also
creates a backward-compatibility issue.

> 
>   - `org-link-encode'[1] will %-escape every forbidden character in the
>     URI. It doesn't need any "reverse" function. It will be called when
>     opening a link, or parsing it.
> 
>     I think it shouldn't escape "%" characters, though, so that it can
>     be applied on both encoded and plain strings. Since it isn't perfect
>     (it doesn't parse URI), it should also be very conservative (i.e.
>     allow more characters such as "=" or "&") and not get in the way.

You would have to select the list of forbidden characters based on the
link protocol. The assumption underlying the current implementation is
to delegate dealing with forbidden characters to the consuming
application. Thus I would limit this to known URI protocols,
i.e. http: and https:.

Best,
  -- David

> 
> WDYT?
> 
> 
> Regards,
> 
> [1] `url-encode-url' was introduced in Emacs 24.3. It is too young to be
> used mainstream, even though it does a better job than
> `org-link-escape'. We will benefit from it when Emacs 25 is out (i.e.
> when Emacs 23 support is dropped).
> 
> -- 
> Nicolas Goaziou
-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de


* Re: Encoding Problem in export?
  2013-07-26  4:03               ` David Maus
@ 2013-07-26 10:20                 ` Nicolas Goaziou
  2013-07-27  7:23                   ` David Maus
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Goaziou @ 2013-07-26 10:20 UTC (permalink / raw)
  To: David Maus; +Cc: Nick Dokos, emacs-orgmode

David Maus <dmaus@ictsoc.de> writes:

Thanks for your answer. It seems I got confused with the current state
of URI-encoding. Please scratch my previous suggestion and let's start
over.

> The assumption underlying the current implementation is
> to delegate dealing with forbidden characters to the consuming
> application.

I agree with this assumption, even though I think some URI-fixing (à la
`url-encode-url') would be nice too. But that's not the topic here.
Also, the current implementation doesn't totally follow this assumption
(e.g. `org-link-escape-chars-browser').

Alas, there is a serious flaw in the current implementation. As you
said:

> There is, of course, the nasty thing that we don't know if the link in
> a buffer went through org-link-escape or not. E.g. if you paste
>
>  ,----
> | [[http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml]]
>  `----
>
> into the buffer you'll get a broken link because org-link-open assumes
> the link to be escaped by org.

There is, indeed, no easy way to know if a link went through
`org-link-escape', so we cannot unescape it properly in every situation.
We could use text properties on escaped links, but that seems awkward.

I think there is a simpler solution: we never "unescape" links, which
means that escaping must be at its minimum. For example, we could only
replace "[" and "]" with, respectively, "%5B" and "%5D" and newlines
with spaces. It doesn't cripple the link's readability very much, and is
safe as "[", "]" and "\n" are always forbidden in URI anyway.

Replacing non-ascii characters would make the link unreadable to
a human. Also, we don't prevent encoding mismatch (e.g., from UTF-8 to
ISO-8859-1) when yanking regular text in an Org buffer, so there's no
particular reason to do it for links.

This operation is clearly idempotent.
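
As a rough sketch of that one-way protection (the name
`my/link-protect' is hypothetical, and this is not existing Org code),
applying it a second time changes nothing:

```elisp
(defun my/link-protect (link)
  "Return LINK with \"[\" and \"]\" %-encoded and newlines turned into spaces."
  (replace-regexp-in-string
   "[][\n]"                             ; matches "]", "[" or a newline
   (lambda (m)
     (cond ((string= m "[") "%5B")
           ((string= m "]") "%5D")
           (t " ")))
   link t t))

(my/link-protect "http://example.org/a[1]\nb")
;; => "http://example.org/a%5B1%5D b"

;; Applying it twice gives the same result as applying it once.
(string= (my/link-protect (my/link-protect "http://example.org/a[1]\nb"))
         (my/link-protect "http://example.org/a[1]\nb"))
;; => t
```

Since "%" itself is never rewritten, an already-encoded link passes
through unchanged, which is also why the result cannot be unescaped
reliably.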

When sending the URL to the consuming application, there will be
problems, according to the assumption at the beginning of this message.
But that is to be expected.


Regards,

-- 
Nicolas Goaziou


* Re: Encoding Problem in export?
  2013-07-26 10:20                 ` Nicolas Goaziou
@ 2013-07-27  7:23                   ` David Maus
  2013-07-27 11:09                     ` Nicolas Goaziou
  0 siblings, 1 reply; 25+ messages in thread
From: David Maus @ 2013-07-27  7:23 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: David Maus, emacs-orgmode, Nick Dokos

At Fri, 26 Jul 2013 12:20:37 +0200,
Nicolas Goaziou wrote:
> 
> David Maus <dmaus@ictsoc.de> writes:
> 
> Thanks for your answer. It seems I got confused with the current state
> of URI-encoding. Please scratch my previous suggestion and let's start
> over.

The more I think about it the more I grow certain that it is NOT about
URI encoding but protecting a string. Unless we parse the URI and know
the protocol we cannot tell if square brackets are allowed or not.

> 
> Alas, there is a serious flaw in the current implementation. As you
> said:
> 
> > There is, of course, the nasty thing that we don't know if the link in
> > a buffer went through org-link-escape or not. E.g. if you paste
> >
> >  ,----
> > | [[http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml]]
> >  `----
> >
> > into the buffer you'll get a broken link because org-link-open assumes
> > the link to be escaped by org.
> 
> There is, indeed, no easy way to know if a link went through
> `org-link-escape', so we cannot unescape it properly in every situation.
> We could use text properties on escaped links, but that seems awkward.
> 
> I think there is a simpler solution: we never "unescape" links,
> which means that escaping must be at its minimum. For example, we
> could only replace "[" and "]" with, respectively, "%5B" and "%5D"
> and newlines with spaces. It doesn't cripple link's readability very
> much, and is safe as "[", "]" and "\n" are always forbidden in URI
> anyway.

`[' and `]' are not forbidden per se, they belong to the set of
reserved characters (see RFC 3986, 2.2.).

"characters in the reserved set are protected from normalization and
are therefore safe to be used by scheme-specific and producer-specific
algorithms for delimiting data subcomponents within a URI."
(RFC 3986, p. 12)

Moreover they are explicitly required in the host part to denote an
IPv6 address literal (RFC 3986, 3.2.2).

If I am not mistaken then this is a valid http-URI with an XPointer
fragment pointing to the third `p' element in a locally hosted file:

http://[::1]/foo.xml#xpointer(//p[3])

,----[ http://www.w3.org/TR/xptr-framework/#escaping
| IRI references can be converted to URI references for consumption by
| URI resolvers. The disallowed characters in URI references include all
| non-ASCII characters, plus the excluded characters listed in Section
| 2.4 of [RFC 2396], except for the number sign (#) and percent sign (%)
| and the square bracket characters re-allowed in [RFC 2732]. 
`----

> When sending the URL to the consuming, there will be problems, according
> to the assumption at the beginning of this message. But that is to be
> expected.

If we escape but don't unescape there are *other* problems: Depending
on the protocol an escaped square bracket and an unescaped square
bracket can have different meanings. The assumption I mentioned refers
to unescaped characters. A consuming application knows the protocol
and can infer the characters that need to be escaped.

> Replacing non-ascii characters would make the link unreadable to a
> human. Also, we don't prevent encoding mismatch (e.g., from UTF-8 to
> ISO-8859-1) when yanking regular text in an Org buffer, so there's
> no particular reason to do it for links.

ACK. It's not about creating URIs but protecting strings, thus the
rules for percent escaping don't have to be applied.

Best,
  -- David

-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de


* Re: Encoding Problem in export?
  2013-07-27  7:23                   ` David Maus
@ 2013-07-27 11:09                     ` Nicolas Goaziou
  2013-07-28  8:36                       ` Jambunathan K
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Goaziou @ 2013-07-27 11:09 UTC (permalink / raw)
  To: David Maus; +Cc: Nick Dokos, emacs-orgmode

David Maus <dmaus@ictsoc.de> writes:

> The more I think about it the more I grow certain that it is NOT about
> URI encoding but protecting a string.

This is what I mean.

> `[' and `]' are not forbidden per se, they belong to the set of
> reserved characters (see RFC 3986, 2.2.).
>
> "characters in the reserved set are protected from normalization and
> are therefore safe to be used by scheme-specific and producer-specific
> algorithms for delimiting data subcomponents within a URI."
> (RFC 3986, p. 12)
>
> Moreover they are explicitly required in the host part to denote a
> IPv6 address literal (RFC 3986, 3.2.2).
>
> If I am not mistaken then this is a valid http-URI with a XPointer
> fragment pointing to the third `p' element in a locally hosted file:
>
> http://[::1]/foo.xml#xpointer(//p[3])

Thanks for the info. I didn't read RFC 3986 thoroughly.

> If we escape but don't unescape there are *other* problems: Depending
> on the protocol an escaped square bracket and a unescaped square
> bracket can have different meaning. The assumption I mentioned referes
> to unescaped characters. A consuming application knows the protocol
> and can infer the characters that need to be escaped.

We cannot unescape if we use %-encoding, as stated before.

> ACK. It's not about creating URIs but protecting strings, thus the
> rules for percent escaping don't have to be applied.

Indeed. Ideally, we need to encode "[" and "]" with strings that cannot
ever be found in a URI. Then, it will be possible to decode them safely.


Regards,

-- 
Nicolas Goaziou


* Re: Encoding Problem in export?
  2013-07-27 11:09                     ` Nicolas Goaziou
@ 2013-07-28  8:36                       ` Jambunathan K
  2013-07-28  8:54                         ` Jambunathan K
                                           ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Jambunathan K @ 2013-07-28  8:36 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: David Maus, emacs-orgmode, Nick Dokos


If Org links are escaped by Org will the URLs be functional outside of
Org?

i.e., if I am on some machine that has no Emacs or Org, or if I am using
a version of Org that uses the "new unescape" algorithm while the original
link was encoded with the "old escape" algorithm, will copy-pasting the
link into a browser still work?

If Org is a MUST to unescape the link, then it would be a good decision
to re-examine the link syntax so that the questions of escaping and
unescaping are dealt with squarely and have no reason to arise in future.

[1] IIRC, escaping doesn't happen if a URL is copy-pasted but only if it
is "inserted", i.e., escaping depends very much on the workflow of the
user, and the workflow of a user could depend on their whims and fancies,
the day of the week and the seasons of the year.

Just a cent from an Org user.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Encoding Problem in export?
  2013-07-28  8:36                       ` Jambunathan K
@ 2013-07-28  8:54                         ` Jambunathan K
  2013-07-28 11:16                         ` David Maus
  2013-07-28 11:22                         ` Nicolas Goaziou
  2 siblings, 0 replies; 25+ messages in thread
From: Jambunathan K @ 2013-07-28  8:54 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: David Maus, emacs-orgmode, Nick Dokos


I sense a design flaw.  Fix it rather than "escape" it.


Jambunathan K <kjambunathan@gmail.com> writes:

> If Org links are escaped by Org will the URLs be functional outside of
> Org?
>
> i.e., if I am on some machine that has no Emacs or Org, or if I am using
> a version of Org that uses a "new unescape" algorithm but the original
> link was encoded with the "old escape" algorithm, will copy-pasting the
> link into a browser still work?
>
> If Org is required to unescape the link, then it would be a good
> decision to revisit the link syntax so that the questions of escaping
> and unescaping are dealt with squarely and have no reason to arise in
> the future.
>
> [1] IIRC, escaping doesn't happen if a URL is copy-pasted but only if it
> is "inserted", i.e., escaping depends very much on the workflow of the
> user, and that workflow may in turn depend on whims and fancies, the day
> of the week and the season of the year.
>
> Just a cent from an Org user.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Encoding Problem in export?
  2013-07-28  8:36                       ` Jambunathan K
  2013-07-28  8:54                         ` Jambunathan K
@ 2013-07-28 11:16                         ` David Maus
  2013-07-28 11:22                         ` Nicolas Goaziou
  2 siblings, 0 replies; 25+ messages in thread
From: David Maus @ 2013-07-28 11:16 UTC (permalink / raw)
  To: Jambunathan K; +Cc: David Maus, emacs-orgmode, Nicolas Goaziou, Nick Dokos

At Sun, 28 Jul 2013 14:06:54 +0530,
Jambunathan K wrote:
> 
> 
> If Org links are escaped by Org will the URLs be functional outside of
> Org?
> 
> i.e., if I am on some machine that has no Emacs or Org, or if I am using
> a version of Org that uses a "new unescape" algorithm but the original
> link was encoded with the "old escape" algorithm, will copy-pasting the
> link into a browser still work?

I think this is a good point, or rather two good points. One is
backward compatibility: if we change the escaping algorithm we still
have to deal with possibly tons of old-style links in user files.

The other one is that leaving the edge cases aside it is possible to
just copy a link and paste it into the target application -- a percent
sign signifies percent encoding and the target application knows what
to do.

> If Org is required to unescape the link, then it would be a good
> decision to revisit the link syntax so that the questions of escaping
> and unescaping are dealt with squarely and have no reason to arise in
> the future.
> 

I'm not sure if it is worth the effort but, in theory, we could define
our own URI scheme `org' that disallows square brackets. If a link is
created programmatically (org-store-link et al.) we do not store the
URI as-is but as an "Orgmode-Link": Escape the square brackets and
prefix the link with `org:'. If we open a link we check for the
`org:'-prefix, reverse the escaping and hand the link to the
registered module. If the prefix is absent we skip the unescaping
step.
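
A rough sketch of that scheme, with assumptions spelled out: the
function names are hypothetical, and escaping "%" along with the two
brackets is my own addition so that the stored form can be reversed
without guessing which percent signs were already in the URI.

```elisp
;; Sketch only: hypothetical helpers, not existing Org functions.
(defun my-org-link-store-form (uri)
  "Return URI as a hypothetical `org:'-prefixed stored link.
Only \"%\", \"[\" and \"]\" are percent-escaped."
  (concat "org:"
          (replace-regexp-in-string
           "[][%]"
           (lambda (ch) (format "%%%02X" (aref ch 0)))
           uri t t)))

(defun my-org-link-open-form (link)
  "Return the URI to hand to the link handler for stored LINK.
Links without the `org:' prefix are returned unchanged."
  (if (string-prefix-p "org:" link)
      (replace-regexp-in-string
       "%\\(25\\|5B\\|5D\\)"
       (lambda (m) (string (string-to-number (substring m 1) 16)))
       (substring link 4) t t)
    link))

(my-org-link-store-form "http://example.de/?q=[a]%2B")
;; => "org:http://example.de/?q=%5Ba%5D%252B"

(my-org-link-open-form "org:http://example.de/?q=%5Ba%5D%252B")
;; => "http://example.de/?q=[a]%2B"

(my-org-link-open-form "http://example.de/?idprop=222")
;; => "http://example.de/?idprop=222"
```

The last call shows the backward-compatible path: an old link without
the prefix skips the unescaping step entirely.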

Best,
  -- David
-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Encoding Problem in export?
  2013-07-28  8:36                       ` Jambunathan K
  2013-07-28  8:54                         ` Jambunathan K
  2013-07-28 11:16                         ` David Maus
@ 2013-07-28 11:22                         ` Nicolas Goaziou
  2013-07-29  6:59                           ` Jambunathan K
  2 siblings, 1 reply; 25+ messages in thread
From: Nicolas Goaziou @ 2013-07-28 11:22 UTC (permalink / raw)
  To: Jambunathan K; +Cc: David Maus, emacs-orgmode, Nick Dokos

Hello,

Jambunathan K <kjambunathan@gmail.com> writes:

> If Org links are escaped by Org will the URLs be functional outside of
> Org?

If there is an "unencoding" part, and if that part cannot happen for
some reason, links will be unusable outside Org.

That's already the case with the current encoding, which will break, for
example, links already hexified if it cannot unencode them properly.

> If Org is required to unescape the link, then it would be a good
> decision to revisit the link syntax so that the questions of escaping
> and unescaping are dealt with squarely and have no reason to arise in
> the future.

We can also avoid any encoding, and extend link syntax so it allows
balanced square brackets (with a maximum depth, otherwise we cannot use
a regexp for bracket links anymore).
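
To illustrate why a maximum depth keeps this within reach of a regexp,
here is a small sketch. The function names are hypothetical and this
is not the regexp Org uses; it only shows that a bounded nesting depth
can be unrolled into a finite pattern.

```elisp
;; Sketch only: hypothetical helpers for illustration.
(defun my-balanced-brackets-regexp (depth)
  "Return a regexp matching text whose square brackets are balanced.
Nesting deeper than DEPTH levels is not matched."
  (if (<= depth 0)
      "[^][]*"                          ; no brackets allowed at all
    (format "\\(?:[^][]\\|\\[%s]\\)*"   ; plain char, or "[" inner "]"
            (my-balanced-brackets-regexp (1- depth)))))

(defun my-bracket-link-path-p (path depth)
  "Return t if PATH has balanced square brackets nested at most DEPTH deep."
  (and (string-match-p
        (concat "\\`" (my-balanced-brackets-regexp depth) "\\'")
        path)
       t))

(my-bracket-link-path-p "a[b[c]d]e" 2)   ; => t
(my-bracket-link-path-p "a[b[c[d]]]" 2)  ; => nil, three levels deep
(my-bracket-link-path-p "a[b" 2)         ; => nil, unbalanced
```

The pattern grows with each extra level, which is the practical reason
for capping the depth.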


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Encoding Problem in export?
  2013-07-28 11:22                         ` Nicolas Goaziou
@ 2013-07-29  6:59                           ` Jambunathan K
  0 siblings, 0 replies; 25+ messages in thread
From: Jambunathan K @ 2013-07-29  6:59 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: David Maus, emacs-orgmode, Nick Dokos


Nicolas, David

I just interjected.

Nicolas Goaziou <n.goaziou@gmail.com> writes:

> ...if that part cannot happen for some reason, links will be unusable
> outside Org.

Correctness should overrule compatibility.  In practice, we may have to
strike a balance, with more weight thrown in favor of correctness.  

I am stating the obvious here, but in a way that is practically useless
to the discussion at hand.

I trust that any solution that you come up with will be a good one.
Also the timing is right.  We can always say Org-8.0 makes a clear
departure from earlier versions for reasons of robustness and
correctness.

----------------------------------------------------------------

Speaking from gut (aka making things up)

Link (un)escaping also has something to do with org-protocol and how the
URL in the browser's address bar is "captured", "encoded"(?) and
"transferred" to Emacs proper via the bookmarklet.  So the browser (and
don't forget the clipboard) acts as an *active* intermediary as the URL
makes its way from the browser to the Org file, either by hand or
through emacsclient.  To complicate the issue, the browser, being
user-facing, may be expected to be very lenient with a URL or with how
it is "presented" to the user.

ps-1: Getting org-protocol to work on Windows is quite flaky.

ps-2: There are frequent posts to Emacs mailing lists where copying from
a browser to an Emacs buffer shows unreadable boxes.

----------------------------------------------------------------

Don't read further, if you are allergic to meta musings.

As far as Org is concerned, backward compatibility is not an issue.  The
community is being replaced *every* academic year.  New scholars
come and the old scholars leave.  The only steady part of the population
is the college dons.  They will not rely on Org *solely* for serious
publishing work.  They do revise their course support material, like
beamer presentations etc., every term.

In summary, the shelf life of an Org source file that is actually
exported is unlikely to be beyond 4-5 years.  The contents of such a
source file have fewer stable parts and more moving parts.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Encoding Problem in export?
  2013-07-24  7:34     ` Nicolas Goaziou
  2013-07-24  8:46       ` Robert Eckl
  2013-07-24  9:39       ` Nick Dokos
@ 2013-11-16 15:16       ` Michael Brand
  2013-11-16 20:43         ` Nicolas Goaziou
  2 siblings, 1 reply; 25+ messages in thread
From: Michael Brand @ 2013-11-16 15:16 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Org Mode

[-- Attachment #1: Type: text/plain, Size: 1813 bytes --]

Hi Nicolas

I would like to ask you to review the attached patch so I can change
it when necessary before I git push it.

> it probably means that `org-link-escape' is a bit too zealous (BTW
> why doesn't this function rely on `url-encode-url'?)

url-encode-url is a very good hint to solve a different issue that
I try to deal with:

(from http://lists.gnu.org/archive/html/emacs-orgmode/2013-10/msg00204.html)
On Sat, Oct 5, 2013 at 3:04 PM, Michael Brand
<michael.ch.brand@gmail.com> wrote:
> [...] related change that I will suggest with an ERT in a later
> patch: Just add "+" to org-link-escape-chars-browser.

"+" has not been added to org-link-escape-chars-browser yet and in the
meantime I realized that it should not be added in order to not break
existing Org links like:

    [[http://lists.gnu.org/archive/cgi-bin/namazu.cgi?idxname=emacs-orgmode&query="Release+8.2"]]

that work the same as

    [[http://lists.gnu.org/archive/cgi-bin/namazu.cgi?idxname=emacs-orgmode&query="Release 8.2"]]

A better approach is to have org-open-at-point use url-encode-url
instead of org-link-escape-browser when it is available (since Emacs
24.3). With this I can write my use case, opening a browser with
+subject:"Release 8.2" in the query field, as a manually written Org
link with %2B for the "+" at the beginning and %25 for its "%", like
this:

    [[http://lists.gnu.org/archive/cgi-bin/namazu.cgi?idxname=emacs-orgmode&query=%252Bsubject:"Release+8.2"]]

or

    [[http://lists.gnu.org/archive/cgi-bin/namazu.cgi?idxname=emacs-orgmode&query=%252Bsubject:"Release 8.2"]]

Link escaping in org-store-link and link unescaping in the first part
of org-open-at-point are not changed; this is important to keep
backward compatibility with old Org links. I have used this patch for
several weeks on Emacs 24.3.2 without any problem.
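
For readers wondering where the %252B comes from, a minimal sketch of
the double encoding follows. It uses the generic url-util helpers as
stand-ins, on the assumption that they show the same percent-encoding
round trip; they are not the Org functions that org-open-at-point
itself calls.

```elisp
(require 'url-util)

;; A literal "+" percent-encodes to %2B.
(url-hexify-string "+")        ; => "%2B"

;; Encoding an already-encoded string escapes its "%" as %25.
(url-hexify-string "%2B")      ; => "%252B"

;; One round of unescaping brings the written %252B back to %2B.
(url-unhex-string "%252B")     ; => "%2B"
```

So the link is written with %252B, one unescaping step leaves %2B, and
an encoder that respects existing escapes can pass that %2B on
unchanged.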

Michael

[-- Attachment #2: review.patch.txt --]
[-- Type: text/plain, Size: 4891 bytes --]

commit 686bd788243ef38951c3c529d4c67c3ad766f417
Author: Michael Brand <michael.ch.brand@gmail.com>
Date:   Sat Nov 16 16:13:57 2013 +0100

    Hyperlink: Use url-encode-url for browse-url
    
    * lisp/org.el (org-open-at-point): When available (Emacs 24.3) use
    `url-encode-url' instead of `org-link-escape-browser'.
    
    * testing/lisp/test-org.el
    (test-org/org-link-escape-url-with-escaped-char): Substitute repeated
    literal string with constant.
    (test-org/org-link-escape-chars-browser): Extend test coverage with
    `url-encode-url' and with "query="-space as plus sign or space.

diff --git a/lisp/org.el b/lisp/org.el
index a3c1958..aa91ffc 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -10468,11 +10468,20 @@ application the system uses for this file type."
 	      (apply cmd (nreverse args1))))
 
 	   ((member type '("http" "https" "ftp" "news"))
-	    (browse-url (concat type ":" (org-link-escape-browser path))))
+	    ;; see `ert-deftest'
+	    ;; `test-org/org-link-escape-chars-browser'
+	    (browse-url
+	     (if (fboundp 'url-encode-url)
+		 (url-encode-url (concat type ":" path))
+	       (org-link-escape-browser (concat type ":" path)))))
 
 	   ((string= type "doi")
-	    (browse-url (concat org-doi-server-url
-				(org-link-escape-browser path))))
+	    ;; see `ert-deftest'
+	    ;; `test-org/org-link-escape-chars-browser'
+	    (browse-url
+	     (if (fboundp 'url-encode-url)
+		 (url-encode-url (concat org-doi-server-url path))
+	       (org-link-escape-browser (concat org-doi-server-url path)))))
 
 	   ((member type '("message"))
 	    (browse-url (concat type ":" path)))
diff --git a/testing/lisp/test-org.el b/testing/lisp/test-org.el
index f4672eb..084e95d 100644
--- a/testing/lisp/test-org.el
+++ b/testing/lisp/test-org.el
@@ -552,21 +552,53 @@
 (ert-deftest test-org/org-link-escape-url-with-escaped-char ()
   "Escape and unescape a URL that includes an escaped char.
 http://article.gmane.org/gmane.emacs.orgmode/21459/"
-  (should
-   (string=
-    "http://some.host.com/form?&id=blah%2Bblah25"
-    (org-link-unescape
-     (org-link-escape "http://some.host.com/form?&id=blah%2Bblah25")))))
+  (let ((a "http://some.host.com/form?&id=blah%2Bblah25"))
+    (should (string= a (org-link-unescape (org-link-escape a))))))
 
 (ert-deftest test-org/org-link-escape-chars-browser ()
-  "Escape a URL to pass to `browse-url'."
-  (should
-   (string=
-    (concat "http://lists.gnu.org/archive/cgi-bin/namazu.cgi?query="
-	    "%22Release%208.2%22&idxname=emacs-orgmode")
-    (org-link-escape-browser
-     (concat "http://lists.gnu.org/archive/cgi-bin/namazu.cgi?query="
-	     "\"Release 8.2\"&idxname=emacs-orgmode")))))
+  "Escape a URL before passing it to `browse-url'.
+
+This test is to ensure that `org-open-at-point' on the Org links
+
+    [[http://lists.gnu.org/archive/cgi-bin/namazu.cgi?idxname=emacs-orgmode&query=%252Bsubject:\"Release+8.2\"]]
+    [[http://lists.gnu.org/archive/cgi-bin/namazu.cgi?idxname=emacs-orgmode&query=%252Bsubject:\"Release 8.2\"]]
+
+will open a browser with +subject:\"Release 8.2\" in the query
+field."
+
+  ;; Each string argument passed to `url-encode-url' or
+  ;; `org-link-escape-browser' in the tests below (or when
+  ;; `org-open-at-point' is used in an Org buffer) looks like after
+  ;; the Org link from the docstring has been unescaped by
+  ;; `org-link-unescape' in `org-open-at-point'
+  (let ((query (concat "http://lists.gnu.org/archive/cgi-bin/namazu.cgi?"
+		       "idxname=emacs-orgmode&query="))
+	(plus  "%2Bsubject:\"Release+8.2\"")   ; "query="-space as plus sign
+	(space "%2Bsubject:\"Release 8.2\""))  ; "query="-space as space
+
+    ;; This is the behavior of `org-open-at-point' when used together
+    ;; with an Emacs 24.3 or later where `url-encode-url' is available
+    (when (fboundp 'url-encode-url)
+      ;; "query="-space as plus sign
+      (should (string= (concat query "%2Bsubject:%22Release+8.2%22")
+		       (url-encode-url (concat query plus))))
+      ;; "query="-space as space
+      (should (string= (concat query "%2Bsubject:%22Release%208.2%22")
+		       (url-encode-url (concat query space)))))
+
+    ;; The %252B below returned from `org-link-escape-browser' is not
+    ;; desired and not working with some browser/OS but tested here to
+    ;; document what happens when the fallback to
+    ;; `org-link-escape-browser' in `org-open-at-point' is in use,
+    ;; which is the legacy behavior of `org-open-at-point' when used
+    ;; together with an Emacs before version 24.3
+    ;;
+    ;; "query="-space as plus sign
+    (should (string= (concat query "%252Bsubject:%22Release+8.2%22")
+		     (org-link-escape-browser (concat query plus))))
+    ;; "query="-space as space
+    (should (string= (concat query "%252Bsubject:%22Release%208.2%22")
+		     (org-link-escape-browser (concat query space))))))
 
 
 \f

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Encoding Problem in export?
  2013-11-16 15:16       ` Michael Brand
@ 2013-11-16 20:43         ` Nicolas Goaziou
  2013-11-17 11:06           ` Michael Brand
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Goaziou @ 2013-11-16 20:43 UTC (permalink / raw)
  To: Michael Brand; +Cc: Org Mode

Hello,

Michael Brand <michael.ch.brand@gmail.com> writes:

> I would like to ask you to review the attached patch so I can change
> it when necessary before I git push it.

Sure.

> -	    (browse-url (concat type ":" (org-link-escape-browser path))))
> +	    ;; see `ert-deftest'
> +	    ;; `test-org/org-link-escape-chars-browser'
> +	    (browse-url
> +	     (if (fboundp 'url-encode-url)
> +		 (url-encode-url (concat type ":" path))
> +	       (org-link-escape-browser (concat type ":" path)))))

IMO, the following is nicer:

(funcall (if (fboundp 'url-encode-url) #'url-encode-url #'org-link-escape-browser)
         (concat type ":" path))

Also, it's better to document this in the source code rather than in the
test suite. As a reminder, you could add that we can remove
`org-link-escape-browser' altogether once we drop support for Emacs 23.

> -	    (browse-url (concat org-doi-server-url
> -				(org-link-escape-browser path))))
> +	    ;; see `ert-deftest'
> +	    ;; `test-org/org-link-escape-chars-browser'
> +	    (browse-url
> +	     (if (fboundp 'url-encode-url)
> +		 (url-encode-url (concat org-doi-server-url path))
> +	       (org-link-escape-browser (concat org-doi-server-url path)))))

Ditto.

> -  (should
> -   (string=
> -    "http://some.host.com/form?&id=blah%2Bblah25"
> -    (org-link-unescape
> -     (org-link-escape "http://some.host.com/form?&id=blah%2Bblah25")))))
> +  (let ((a "http://some.host.com/form?&id=blah%2Bblah25"))
> +    (should (string= a (org-link-unescape (org-link-escape a))))))

No need to change this. Moreover, I tend to prefer `should' outside the
sexp because it is easier to debug, when needed (`should' is quite
opaque when stepping through the function).

> +    ;; This is the behavior of `org-open-at-point' when used together
> +    ;; with an Emacs 24.3 or later where `url-encode-url' is available
> +    (when (fboundp 'url-encode-url)
> +      ;; "query="-space as plus sign
> +      (should (string= (concat query "%2Bsubject:%22Release+8.2%22")
> +		       (url-encode-url (concat query plus))))
> +      ;; "query="-space as space
> +      (should (string= (concat query "%2Bsubject:%22Release%208.2%22")
> +		       (url-encode-url (concat query space)))))

You are testing `url-encode-url' here, not an Org function. Is it really
required?


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Encoding Problem in export?
  2013-11-16 20:43         ` Nicolas Goaziou
@ 2013-11-17 11:06           ` Michael Brand
  2013-11-17 11:46             ` Nicolas Goaziou
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Brand @ 2013-11-17 11:06 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Org Mode

[-- Attachment #1: Type: text/plain, Size: 373 bytes --]

Hi Nicolas

Thank you for reviewing.

On Sat, Nov 16, 2013 at 9:43 PM, Nicolas Goaziou <n.goaziou@gmail.com> wrote:
> You are testing `url-encode-url' here, not an Org function. Is it really
> required?

No, the point about documentation is now covered in org-open-at-point
like all other previous changes to the ERT. Please see the attached
changed local commit.

Michael

[-- Attachment #2: review2.patch.txt --]
[-- Type: text/plain, Size: 1900 bytes --]

commit 971a3a4e485c897b8b6c2c1c244d02cb8d943167
Author: Michael Brand <michael.ch.brand@gmail.com>
Date:   Sun Nov 17 12:00:18 2013 +0100

    Hyperlink: Use url-encode-url for browse-url
    
    * lisp/org.el (org-open-at-point): When available (Emacs 24.3.1) use
    `url-encode-url' instead of `org-link-escape-browser'.

diff --git a/lisp/org.el b/lisp/org.el
index ed3928f..5cfaa2c 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -10520,11 +10520,29 @@ application the system uses for this file type."
 	      (apply cmd (nreverse args1))))
 
 	   ((member type '("http" "https" "ftp" "news"))
-	    (browse-url (concat type ":" (org-link-escape-browser path))))
+	    ;; In the example of the http Org link
+	    ;; [[http://lists.gnu.org/archive/cgi-bin/namazu.cgi?idxname=emacs-orgmode&query=%252Bsubject:"Release+8.2"]]
+	    ;; to open a browser with +subject:"Release 8.2" in the
+	    ;; query field the variable `path' contains
+	    ;; [...]=%2Bsubject:"Release+8.2", `url-encode-url'
+	    ;; converts correct to [...]=%2Bsubject:%22Release+8.2%22
+	    ;; and `org-link-escape-browser' converts wrong to
+	    ;; [...]=%252Bsubject:%22Release+8.2%22.
+	    ;;
+	    ;; `url-encode-url' is available since Emacs 24.3.1 and
+	    ;; `org-link-escape-browser' can be removed altogether
+	    ;; once Org drops support for Emacs 24.1 and 24.2.
+	    (browse-url (funcall (if (fboundp 'url-encode-url)
+				     #'url-encode-url
+				   #'org-link-escape-browser)
+				 (concat type ":" path))))
 
 	   ((string= type "doi")
-	    (browse-url (concat org-doi-server-url
-				(org-link-escape-browser path))))
+	    ;; See comments for type http above
+	    (browse-url (funcall (if (fboundp 'url-encode-url)
+				     #'url-encode-url
+				   #'org-link-escape-browser)
+				 (concat org-doi-server-url path))))
 
 	   ((member type '("message"))
 	    (browse-url (concat type ":" path)))

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Encoding Problem in export?
  2013-11-17 11:06           ` Michael Brand
@ 2013-11-17 11:46             ` Nicolas Goaziou
  2013-11-17 11:51               ` Michael Brand
  0 siblings, 1 reply; 25+ messages in thread
From: Nicolas Goaziou @ 2013-11-17 11:46 UTC (permalink / raw)
  To: Michael Brand; +Cc: Org Mode

Hello,

Michael Brand <michael.ch.brand@gmail.com> writes:

> No, the point about documentation is now covered in org-open-at-point
> like all other previous changes to the ERT. Please see the attached
> changed local commit.

It looks good to me.

Thank you for the patch.


Regards,

-- 
Nicolas Goaziou

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Encoding Problem in export?
  2013-11-17 11:46             ` Nicolas Goaziou
@ 2013-11-17 11:51               ` Michael Brand
  0 siblings, 0 replies; 25+ messages in thread
From: Michael Brand @ 2013-11-17 11:51 UTC (permalink / raw)
  To: Nicolas Goaziou; +Cc: Org Mode

Hi Nicolas

On Sun, Nov 17, 2013 at 12:46 PM, Nicolas Goaziou <n.goaziou@gmail.com> wrote:
> It looks good to me.

Thank you, I just pushed.

Michael

^ permalink raw reply	[flat|nested] 25+ messages in thread

