From: Nicolas Goaziou
Subject: Re: Encoding Problem in export?
Date: Sat, 27 Jul 2013 13:09:28 +0200
Message-ID: <87iozwmf87.fsf@gmail.com>
References: <87bo5s27ey.fsf@sachwertpartner.de> <877ggg7suh.fsf@gmail.com>
 <51EF32F4.9030309@gmx.de> <87txjk5s2q.fsf@gmail.com>
 <87a9lcfg9g.fsf@gmail.com> <877ggg5i5q.fsf@gmail.com>
 <87y58vp9mj.wl%dmaus@ictsoc.de> <87li4u48jp.fsf@gmail.com>
 <87r4emdl2a.wl%dmaus@ictsoc.de> <87d2q54o7e.fsf@gmail.com>
 <87k3kc1n6f.wl%dmaus@ictsoc.de>
In-Reply-To: <87k3kc1n6f.wl%dmaus@ictsoc.de> (David Maus's message of "Sat, 27 Jul 2013 09:23:20 +0200")
List-Id: "General discussions about Org-mode."
To: David Maus
Cc: Nick Dokos, emacs-orgmode@gnu.org

David Maus writes:

> The more I think about it the more I grow certain that it is NOT about
> URI encoding but protecting a string.

This is what I mean.

> `[' and `]' are not forbidden per se, they belong to the set of
> reserved characters (see RFC 3986, 2.2.).
>
>   "characters in the reserved set are protected from normalization and
>   are therefore safe to be used by scheme-specific and producer-specific
>   algorithms for delimiting data subcomponents within a URI."
>   (RFC 3986, p. 12)
>
> Moreover they are explicitly required in the host part to denote an
> IPv6 address literal (RFC 3986, 3.2.2).
>
> If I am not mistaken then this is a valid http-URI with an XPointer
> fragment pointing to the third `p' element in a locally hosted file:
>
> http://[::1]/foo.xml#xpointer(//p[3])

Thanks for the info. I didn't read RFC 3986 thoroughly.

> If we escape but don't unescape there are *other* problems: Depending
> on the protocol an escaped square bracket and an unescaped square
> bracket can have different meanings. The assumption I mentioned refers
> to unescaped characters. A consuming application knows the protocol
> and can infer the characters that need to be escaped.

We cannot unescape if we use %-encoding, as stated before.

> ACK. It's not about creating URIs but protecting strings, thus the
> rules for percent escaping don't have to be applied.

Indeed. Ideally, we need to encode "[" and "]" with strings that cannot
ever be found in a URI. Then, it will be possible to decode them safely.


Regards,

-- 
Nicolas Goaziou
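
A minimal sketch of the point about decoding, written in Python rather
than Org's Emacs Lisp. The function names and the choice of "{" and "}"
as markers are hypothetical, not anything Org does. They rest on the
assumption that the link is a valid URI, in which RFC 3986 allows
neither character unencoded. One way to read the %-encoding problem is
that a link already containing "%5B" and a link containing "[" end up
identical after escaping, so there is nothing left to tell them apart
when decoding.

```python
# Hypothetical helpers, not Org's implementation.
# Assumption: "{" and "}" never appear in the link, because RFC 3986
# allows neither character unencoded in a URI.
PROTECT = {"[": "{", "]": "}"}
RESTORE = {marker: bracket for bracket, marker in PROTECT.items()}


def percent_escape_brackets(link: str) -> str:
    """Escape only the square brackets with %-encoding."""
    return link.replace("[", "%5B").replace("]", "%5D")


def protect(link: str) -> str:
    """Replace brackets with markers assumed absent from any URI."""
    if any(marker in link for marker in RESTORE):
        raise ValueError("link already contains a marker character")
    return "".join(PROTECT.get(ch, ch) for ch in link)


def restore(text: str) -> str:
    """Invert protect() exactly."""
    return "".join(RESTORE.get(ch, ch) for ch in text)


if __name__ == "__main__":
    plain = "http://[::1]/foo.xml#xpointer(//p[3])"
    encoded = "http://%5B::1%5D/foo.xml#xpointer(//p%5B3%5D)"
    # Both links collapse to the same string, so decoding is ambiguous.
    print(percent_escape_brackets(plain) == percent_escape_brackets(encoded))
    # The marker scheme round-trips both links unchanged.
    print(restore(protect(plain)) == plain)
    print(restore(protect(encoded)) == encoded)
```

The ValueError guard is there because the scheme is only reversible as
long as the markers really are absent from the input. A link typed by
hand may well contain a literal "{", in which case some other marker
would be needed.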