From: Nicolas Goaziou <n.goaziou@gmail.com>
To: David Maus <dmaus@ictsoc.de>
Cc: Nick Dokos <ndokos@gmail.com>, emacs-orgmode@gnu.org
Subject: Re: Encoding Problem in export?
Date: Thu, 25 Jul 2013 23:46:34 +0200
Message-ID: <87li4u48jp.fsf@gmail.com>
In-Reply-To: <87y58vp9mj.wl%dmaus@ictsoc.de> (David Maus's message of "Thu, 25 Jul 2013 06:05:24 +0200")

Hello,

David Maus <dmaus@ictsoc.de> writes:

> IIRC org-link-escape is not used to create URLs but to escape
> characters in a link that would otherwise conflict with Orgmode syntax
> (e.g. square brackets).

> Org applies percent escaping to a link before
> it is stored in the buffer and applies unescaping when it reads a link
> back.
>
> The percent sign is hardcoded because if org-link-escape/unescape is
> used in this way we must make sure that the identity of a link is
> preserved. If we would *not* escape the percent sign, then an original
> link with percent encoded characters would be read back wrongly,
> i.e. with the percent escaped characters unescaped.

[...]

> There is, of course, the nasty thing that we don't know if the link in
> a buffer went through org-link-escape or not. E.g. if you paste
>
> ,----
> | [[http://redirect.example.org?url=http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml]]
> `----
>
> into the buffer you'll get a broken link because org-link-open assumes
> the link to be escaped by org.
>
> The bottom-line: Org creates links programmatically (org-store-link)
> and needs a mechanism to protect conflicting characters. It chose
> percent-escaping and in order to preserve the identity of a link Org
> has to escape the escape-character.
>
> Hope that helps!

It does.

I think we are hunting two hares and that's why we are failing so far.

There are two URI transformations involved. One is mandatory (escape
square brackets in the URI), and the other one is optional (normalize
the URI for consumption by external processes). The former must be
bi-directional, as escaping brackets must be transparent to the user
(e.g., when editing a link with `org-insert-link'). The latter needn't
be, and can happen on the fly, just before the URI is sent to whatever
needs it (e.g., a browser).
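
To make the identity problem concrete, here is a minimal sketch in
Python rather than Emacs Lisp, using the link David pasted above. It
only assumes that opening a link blindly percent-decodes it, which is a
simplification of what `org-link-open' does:

```python
from urllib.parse import unquote

# The link from David's example, pasted by hand and already percent-encoded.
pasted = ("http://redirect.example.org?url="
          "http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml")

# Blind unescaping, assuming the link went through Org's own escaping.
opened = unquote(pasted)

# The inner URL's "?" and "&" are now exposed, so "format=html" reads as
# a parameter of the redirector instead of the target.
print(opened)
```

The decoded string is no longer the link the user pasted, which is why
the bi-directional step must leave existing %-sequences alone.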

Therefore, I suggest using three functions:

  - `org-link-escape' will first %-escape "%" characters, and then "["
    and "]" characters. `org-link-unescape' will reverse the operation.

    These functions cannot break a link, encoded or not. They are applied
    when a link is created programmatically and read back for user
    editing.

  - `org-link-encode'[1] will %-escape every forbidden character in the
    URI. It doesn't need any "reverse" function. It will be called when
    opening a link, or parsing it.

    I think it shouldn't escape "%" characters, though, so that it can
    be applied to both encoded and plain strings. Since it isn't perfect
    (it doesn't parse the URI), it should also be very conservative
    (i.e. allow more characters such as "=" or "&") and not get in the
    way.
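
A rough sketch of the three functions, again in Python for brevity. The
names `link_escape', `link_unescape' and `link_encode' are hypothetical
stand-ins for the Org functions, and the set of characters left alone by
`link_encode' is my assumption (the RFC 3986 reserved characters plus
"%"), not a settled choice:

```python
from urllib.parse import quote

def link_escape(link: str) -> str:
    """Escape "%" first, then "[" and "]", so the step is reversible."""
    return link.replace("%", "%25").replace("[", "%5B").replace("]", "%5D")

def link_unescape(link: str) -> str:
    """Reverse link_escape: brackets first, "%25" last."""
    return link.replace("%5B", "[").replace("%5D", "]").replace("%25", "%")

# Assumed "allowed" set: RFC 3986 reserved characters plus "%".
# Letters, digits and "_.-~" are never escaped by quote().
SAFE = "%:/?#[]@!$&'()*+,;="

def link_encode(link: str) -> str:
    """One-way normalization; "%" is left alone, so encoded input is kept."""
    return quote(link, safe=SAFE)

pasted = ("http://redirect.example.org?url="
          "http%3A%2F%2Ftarget.example.org%3Fid%3D33%26format%3Dhtml")

# Round trip preserves the link, whether it was encoded or not.
assert link_unescape(link_escape(pasted)) == pasted
assert link_unescape(link_escape("file:a[1].org")) == "file:a[1].org"

# Encoding is safe on both plain and already encoded strings.
assert link_encode(pasted) == pasted
assert link_encode("http://example.org/a b") == "http://example.org/a%20b"
```

Because `link_encode' never touches "%", applying it twice gives the
same result as applying it once, so it does not matter whether the
stored link was already encoded.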

WDYT?


Regards,

[1] `url-encode-url' was introduced in Emacs 24.3. It is too recent to
be relied upon everywhere, even though it does a better job than
`org-link-escape'. We will benefit from it when Emacs 25 is out (i.e.
when Emacs 23 support is dropped).

-- 
Nicolas Goaziou
