emacs-orgmode@gnu.org archives
From: David Maus <dmaus@ictsoc.de>
To: Sebastian Rose <sebastian_rose@gmx.de>
Cc: "David Maus" <dmaus@ictsoc.de>,
	carsten.dominik@gmail.com, emacs-orgmode@gnu.org,
	"Sébastien Vauban" <wxhgmqzgwmuf@spammotel.com>
Subject: Re: [bug] org-link-escape and (wrong-type-argument stringp nil)
Date: Thu, 04 Nov 2010 21:35:53 +0100
Message-ID: <877hgs25ee.wl%dmaus@ictsoc.de>
In-Reply-To: <8739swi0f0.fsf@gmx.de>



Okay, back to link escaping.

What this is about:

The current implementation of percent-escaping URIs uses a whitelist
approach, i.e. it percent-escapes only characters that are in
`org-link-escape-chars' or in a user-supplied list.  This is a problem
because using this function requires knowledge of all characters that
could occur in a URI -- and URIs are limited to plain ASCII, meaning a
call to the function must list literally all possible characters and
their escapes to get a properly percent-escaped string.

To solve this problem, the behavior of the function is changed to
percent-escape every character that is an ASCII control character or
not an ASCII character.  The unescaping function is changed
accordingly to handle percent-encoded multibyte Unicode characters.
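
To make the rule concrete, here is a minimal sketch in Python (not the
Emacs Lisp that would go into Org).  The names `link_escape' and
`link_unescape' are made up for illustration, and two things are
assumptions on my side: that the bytes being escaped are the UTF-8
encoding of the character, and that a literal "%" is always escaped so
that unescaping stays unambiguous.

```python
import re

def link_escape(text, extra=""):
    """Percent-escape control characters, non-ASCII characters, '%',
    and any character listed in `extra` (hypothetical parameter)."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 32 or code > 126 or ch == "%" or ch in extra:
            # Assumption: escape each byte of the UTF-8 encoding.
            out.append("".join("%%%02X" % b for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)

# A run of consecutive %XX escapes is decoded as one UTF-8 byte sequence.
_ESCAPE_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")

def link_unescape(text):
    """Reverse link_escape by decoding each run of escapes as UTF-8."""
    def decode(match):
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        return raw.decode("utf-8")
    return _ESCAPE_RUN.sub(decode, text)
```

With this sketch a caller no longer has to enumerate non-ASCII
characters; only additional ASCII characters such as "]" need to be
passed in `extra'.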

1/ I did some testing with the newly proposed org-link-escape and the
modified `org-protocol-unhex-string': create a random string with
ASCII and multibyte Unicode characters, randomly taken from
(ucs-names); perform escape-unescape; compare the result with the
original string.  Works perfectly.  Testing randomly created strings
with the old escaping of non-ASCII characters is still on my to-do
list.
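
The round-trip test can be sketched roughly like this, again in Python
rather than Emacs Lisp.  The helpers `escape' and `unescape' merely
restate the proposed rule so the block runs on its own; they are
hypothetical and assume UTF-8 bytes and an always-escaped "%".  Random
code points stand in for picking characters from (ucs-names).

```python
import random
import re

def escape(text):
    # Escape control chars, non-ASCII chars and '%' as UTF-8 bytes.
    return "".join(
        "".join("%%%02X" % b for b in ch.encode("utf-8"))
        if ord(ch) < 32 or ord(ch) > 126 or ch == "%" else ch
        for ch in text)

def unescape(text):
    # Decode each run of %XX escapes as one UTF-8 byte sequence.
    return re.sub(
        r"(?:%[0-9A-Fa-f]{2})+",
        lambda m: bytes.fromhex(m.group(0).replace("%", "")).decode("utf-8"),
        text)

def random_string(rng, length=40):
    """Build a string mixing ASCII and arbitrary Unicode characters."""
    chars = []
    while len(chars) < length:
        if rng.random() < 0.5:
            code = rng.randrange(128)        # ASCII, including controls
        else:
            code = rng.randrange(0x110000)   # any code point
        if 0xD800 <= code <= 0xDFFF:
            continue                         # surrogates cannot be encoded
        chars.append(chr(code))
    return "".join(chars)

def round_trip_ok(trials=200, seed=0):
    """Return True if escape followed by unescape is the identity."""
    rng = random.Random(seed)
    return all(unescape(escape(s)) == s
               for s in (random_string(rng) for _ in range(trials)))
```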

2/ Of course there could still be the problem that a user has created
a sequence of old escapes that the new unescaping function will
interpret wrongly.  Not sure how likely this is, but in theory this
could happen.  Personally I think we should risk breaking people's
links in this way.
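
To illustrate the kind of ambiguity meant here, a small Python sketch.
It assumes that old escapes encoded one byte per character (Latin-1 is
my assumption for the example) and that the new unescaping tries UTF-8
first and falls back to single bytes when that fails.  The function
name is made up.

```python
import re

def unescape_with_fallback(text):
    """Decode runs of %XX as UTF-8; fall back to Latin-1 on failure."""
    def decode(match):
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Assumption: an old-style escape, one byte per character.
            return raw.decode("latin-1")
    return re.sub(r"(?:%[0-9A-Fa-f]{2})+", decode, text)
```

An old escape like "caf%E9" still comes out right, because %E9 alone is
not valid UTF-8 and the fallback applies.  An old escape "%C3%A9" that
was meant as the two Latin-1 characters C3 and A9 happens to be valid
UTF-8 and would now be read as a single character.  That is the case
that would break.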

3/ I strongly suggest changing the syntax of `org-link-escape-chars'.
Currently it is a list of cons cells with the character in the car and
the replacement string in the cdr.  Using such a table for escaping is
easy (assq char table), but in the unescaping process it might get
tricky.

Moreover, if the function is to do percent-escaping, the escape
sequence is already determined by the character to replace.  The new
syntax would simply be a list of characters to escape in addition to
the rule mentioned above (< 32 or > 126).
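
A short Python sketch of why the replacement strings are redundant.
The table contents below are hypothetical examples, not the actual
value of `org-link-escape-chars', and UTF-8 is assumed for the bytes.

```python
def percent_sequence(ch):
    """Return the percent escape implied by the character itself."""
    return "".join("%%%02X" % b for b in ch.encode("utf-8"))

# Old syntax: (character, replacement) pairs. Hypothetical contents.
old_style = [("[", "%5B"), ("]", "%5D"), (" ", "%20")]

# New syntax: just the characters; the escapes can be derived.
new_style = ["[", "]", " "]
derived = [(ch, percent_sequence(ch)) for ch in new_style]
```

Here `derived' equals `old_style', so nothing is lost by storing only
the characters as long as the table is used for percent-escaping.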

This would break compatibility with functions that have used
org-link-escape/unescape for something other than percent-escaping
(e.g. replacing ] with %FF and not %5D and such).  But this again is
bearable: although the docstring talks about escaping things that
are problematic, the only way to do such escaping in a standardized
way is percent-escaping.

4/ If all agree that breaking backward compatibility in the case
mentioned above (or did I forget one?) is bearable, I would go ahead
and make the necessary changes:

  1. Use the new algorithm in `org-link-escape'.
  2. Modify the syntax of `org-link-escape-chars'.
  3. Issue a warning if someone calls `org-link-escape' with a table
     of the old syntax.
  4. Move the unescaping functions from org-protocol.el to org.el and
     rename them.
  5. Declare `org-protocol-unhex-string' and
     `org-protocol-unhex-compound' obsolete (make-obsolete).
  6. Drop a message to the list informing about these changes.
  7. Wait some months and purge the obsolete functions.
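
Step 3 of the list could look roughly like the following Python
sketch.  The function name and the warning text are made up; the real
change would be in Emacs Lisp.  An old-style table is recognized by its
pair entries, a warning is issued, and only the characters are kept.

```python
import warnings

def normalize_escape_table(table):
    """Return a list of characters from an old- or new-style table.

    Old-style entries are (character, replacement) pairs; new-style
    entries are plain characters. Both shapes are assumptions that
    mirror the description in this message.
    """
    chars = []
    old_syntax = False
    for entry in table:
        if isinstance(entry, tuple):
            old_syntax = True
            chars.append(entry[0])   # keep the character, drop the string
        else:
            chars.append(entry)
    if old_syntax:
        warnings.warn(
            "old-style escape table; replacement strings are ignored",
            DeprecationWarning, stacklevel=2)
    return chars
```

Note that a caller who mapped ] to %FF would silently get %5D after
this normalization, which is the incompatibility discussed in 3/.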

Best,
  -- David
--
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de

