emacs-orgmode@gnu.org archives
* Improve percent escaping links in Org mode (pull request / OK to push)
@ 2011-01-02 19:37 David Maus
  2011-02-12 22:17 ` Bastien
  0 siblings, 1 reply; 22+ messages in thread
From: David Maus @ 2011-01-02 19:37 UTC (permalink / raw)
  To: org-mode, bastien.guerry



This is a pull request or push announcement for the first set of
patches to improve Org mode's percent escaping functions.  This set of
changes solves the problems with percent escaping non-ASCII
characters.

git@github.com:dmj/dmj-org-mode.git feature/org-percent-escaping

I do have commit access but because this set of changes might break
things seriously I'd like to get an "OK to push" or have someone pull
and review the changeset.

The problem:

The current implementation of percent escaping URIs uses a whitelist
approach, i.e. it only percent escapes characters that are in
`org-link-escape-chars' or in a user-supplied list.  This is a problem
because using this function requires knowledge of all characters
that could occur in a URI -- and since URIs are limited to plain
ASCII, a call to the function must list literally every such
character and its escaping to get a properly percent escaped
string.

The changes:

- `org-link-escape' percent escapes every character that matches one
  of the following conditions:

  * equal 37 (percent sign)
  * equal 127 (DEL, control character)
  * below 32 (control character)
  * above 127 (non-ASCII character)
  * a character in the escaping table (e.g. `org-link-escape-chars')

  The character in question is first encoded in UTF-8, then all bytes
  of the resulting byte sequence are percent escaped.  If converting to
  UTF-8 fails, Org throws an error indicating this problem.

  The function got an optional third argument which can be set to merge
  the user-defined table with the default escaping table.

- `org-link-unescape' unescapes every percent-escape sequence.  It is
  no longer possible to supply a list of characters that should be
  unescaped.  No function in core used `org-link-unescape' with an
  unescaping table.

  Internally the `org-protocol-unhex-*' functions were renamed to
  `org-link-unescape-*', moved to org.el and refactored (thanks to
  Vincent Belaïche for suggesting some of the changes).  The old names
  are declared obsolete and aliased as of 2010-11-21.

  The unescaping function is backward compatible and unescapes the old
  percent escape format for non-ASCII characters (thanks to Sebastian
  Rose).

  It is possible that the new implementation will break links in at
  least this (known) case: If the user stored a link to a file or
  directory containing a percent sign.  Currently Org mode does not
  percent escape the percent sign and subsequently the new variant of
  `org-link-unescape' will try to unescape the alleged percent escape
  sequence.[1]

- `org-link-escape-chars' format changed.  It's just a list of
  characters to escape, the percent escape sequence is implied by the
  character.

  Functions in core that used a custom escaping table are changed
  accordingly to use the new table format.
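
As a rough illustration of the escaping rule listed above (a sketch
only, not the code from the patch set; the `my-' prefixed names are
hypothetical and assume the new plain-list table format):

```elisp
;; Hypothetical helpers for illustration; not part of Org mode.
(defun my-percent-escape-char-p (char table)
  "Return non-nil if CHAR must be percent escaped.
TABLE is a plain list of additional characters to escape."
  (or (= char ?%)          ; 37, the percent sign itself
      (= char 127)         ; DEL
      (< char 32)          ; control characters
      (> char 127)         ; non-ASCII characters
      (memq char table)))  ; characters from the escaping table

(defun my-percent-escape (text table)
  "Return TEXT with every character selected above percent escaped.
Each selected character is encoded as UTF-8 first, then every
byte of that encoding is written as %XX."
  (mapconcat
   (lambda (char)
     (if (my-percent-escape-char-p char table)
         (mapconcat (lambda (byte) (format "%%%02X" byte))
                    (encode-coding-string (char-to-string char) 'utf-8)
                    "")
       (char-to-string char)))
   text ""))
```

With a table that only contains the space character, "ö b" would come
out as "%C3%B6%20b": the two UTF-8 bytes of "ö" followed by the escaped
space.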

What is next:

  - check if we can fall back to use `url-hexify-string' and
    `url-unhex-string' instead of our own functions
  - check if the recent problems with percent escaping are solved

Best,
  -- David

[1] Not escaping the percent sign is actually a glitch: Try to store
and open a link to a file literally called "foo%20baz.org".
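
The glitch can be reproduced outside of Org with `url-hexify-string'
and `url-unhex-string' from url-util.el.  This is only a sketch of the
general problem, not of Org's own functions, and the variable name is
hypothetical:

```elisp
(require 'url-util)

;; Hypothetical variable, used only for this illustration.
(defvar my-demo-file-name "foo%20baz.org"
  "A file name that literally contains a percent sign.")

;; Stored unescaped, the name is misread on unescaping:
;; "%20" is taken for an escaped space.
(url-unhex-string my-demo-file-name)
;; => "foo baz.org"

;; If the percent sign is escaped when storing the link,
;; the round trip gives back the original name.
(url-unhex-string (url-hexify-string my-demo-file-name))
;; => "foo%20baz.org"
```

This is why the percent sign itself has to be in the set of characters
that are always escaped.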



_______________________________________________
Emacs-orgmode mailing list
Please use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode


* Re: Improve percent escaping links in Org mode (pull request / OK to push)
  2011-01-02 19:37 Improve percent escaping links in Org mode (pull request / OK to push) David Maus
@ 2011-02-12 22:17 ` Bastien
  2011-02-13 12:01   ` David Maus
                     ` (16 more replies)
  0 siblings, 17 replies; 22+ messages in thread
From: Bastien @ 2011-02-12 22:17 UTC (permalink / raw)
  To: David Maus; +Cc: org-mode

Hi David,

David Maus <dmaus@ictsoc.de> writes:

> This is a pull request or push announcement for the first set of
> patches to improve Org mode's percent escaping functions.  This set of
> changes solves the problems with percent escaping non-ASCII
> characters.

Wow... how could I have missed this email?  Thanks for the thorough details
about this change, which is more than welcome (I read Vincent's emails
about this.)

I hope you can rebase this on current head without too much headache,
and provide a set of patches.  I'd rather read patches than just test
from a branch...

Thanks a lot!

-- 
 Bastien


* Re: Improve percent escaping links in Org mode (pull request / OK to push)
  2011-02-12 22:17 ` Bastien
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 13:41     ` Bastien
  2011-02-13 12:01   ` [PATCH 01/16] Decode single byte sequence if decoding unicode failed David Maus
                     ` (15 subsequent siblings)
  16 siblings, 1 reply; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry

Hi Bastien,

> Wow... how could I have missed this email?  Thanks for the thorough details
> about this change, which is more than welcome (I read Vincent's emails
> about this.)

> I hope you can rebase this on current head without too much headache,
> and provide a set of patches.  I'd rather read patches than just test
> from a branch...

Rebased to current head and here we go.

Best,
  -- David


* [PATCH 01/16] Decode single byte sequence if decoding unicode failed.
  2011-02-12 22:17 ` Bastien
  2011-02-13 12:01   ` David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 02/16] New unicode aware percent encoding algorithm David Maus
                     ` (14 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry

From: Sebastian Rose <sebastian_rose@gmx.de>

* org-protocol.el (org-protocol-unhex-single-byte-sequence): New
function.  Decode hex-encoded single byte sequences.
(org-protocol-unhex-compound): Use new function if decoding sequence
as unicode character failed.
---
 lisp/org-protocol.el |   26 +++++++++++++++++++++++---
 1 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/lisp/org-protocol.el b/lisp/org-protocol.el
index 1c501f3..33878a8 100644
--- a/lisp/org-protocol.el
+++ b/lisp/org-protocol.el
@@ -305,7 +305,7 @@ part."
 
 (defun org-protocol-unhex-string(str)
   "Unhex hexified unicode strings as returned from the JavaScript function
-encodeURIComponent. E.g. `%C3%B6' is the german Umlaut `ü'."
+encodeURIComponent. E.g. `%C3%B6' is the german Umlaut `ö'."
   (setq str (or str ""))
   (let ((tmp "")
 	(case-fold-search t))
@@ -321,7 +321,9 @@ encodeURIComponent. E.g. `%C3%B6' is the german Umlaut `ü'."
 
 
 (defun org-protocol-unhex-compound (hex)
-  "Unhexify unicode hex-chars. E.g. `%C3%B6' is the German Umlaut `ü'."
+  "Unhexify unicode hex-chars. E.g. `%C3%B6' is the German Umlaut `ö'.
+Note: this function also decodes single byte encodings like
+`%E1' (\"á\") if not followed by another `%[A-F0-9]{2}' group."
   (let* ((bytes (remove "" (split-string hex "%")))
 	 (ret "")
 	 (eat 0)
@@ -353,12 +355,30 @@ encodeURIComponent. E.g. `%C3%B6' is the german Umlaut `ü'."
 	(setq val (logxor val xor))
 	(setq sum (+ (lsh sum shift) val))
 	(if (> eat 0) (setq eat (- eat 1)))
-	(when (= 0 eat)
+	(cond
+	 ((= 0 eat)                         ;multi byte
 	  (setq ret (concat ret (org-protocol-char-to-string sum)))
 	  (setq sum 0))
+	 ((not bytes)                       ; single byte(s)
+	  (setq ret (org-protocol-unhex-single-byte-sequence hex))))
 	)) ;; end (while bytes
     ret ))
 
+(defun org-protocol-unhex-single-byte-sequence(hex)
+  "Unhexify hex-encoded single byte character sequences."
+  (let ((bytes (remove "" (split-string hex "%")))
+	(ret ""))
+    (while bytes
+      (let* ((b (pop bytes))
+	     (a (elt b 0))
+	     (b (elt b 1))
+	     (c1 (if (> a ?9) (+ 10 (- a ?A)) (- a ?0)))
+	     (c2 (if (> b ?9) (+ 10 (- b ?A)) (- b ?0))))
+	(setq ret
+	      (concat ret (char-to-string
+			   (+ (lsh c1 4) c2))))))
+    ret))
+
 (defun org-protocol-flatten-greedy (param-list &optional strip-path replacement)
   "Greedy handlers might receive a list like this from emacsclient:
  '( (\"/dir/org-protocol:/greedy:/~/path1\" (23 . 12)) (\"/dir/param\")
-- 
1.7.2.3


* [PATCH 02/16] New unicode aware percent encoding algorithm
  2011-02-12 22:17 ` Bastien
  2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 01/16] Decode single byte sequence if decoding unicode failed David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 03/16] New format of percent escape table David Maus
                     ` (13 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org.el (org-link-escape): New unicode aware percent encoding
algorithm.
---
 lisp/org.el |   19 ++++++++-----------
 1 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 0c46eec..9aeeeda 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8576,17 +8576,14 @@ This is the list that is used before handing over to the browser.")
   (if (and org-url-encoding-use-url-hexify (not table))
       (url-hexify-string text)
     (setq table (or table org-link-escape-chars))
-    (when text
-      (let ((re (mapconcat (lambda (x) (regexp-quote
-					(char-to-string (car x))))
-			   table "\\|")))
-	(while (string-match re text)
-	  (setq text
-		(replace-match
-		 (cdr (assoc (string-to-char (match-string 0 text))
-			     table))
-	       t t text)))
-	text))))
+    (mapconcat
+     (lambda (char)
+       (if (or (assoc char table)
+	       (< char 32) (> char 126))
+	   (mapconcat (lambda (sequence)
+			(format "%%%.2X" sequence))
+		      (encode-coding-char char 'utf-8) "")
+	   (char-to-string char))) text "")))
 
 (defun org-link-unescape (text &optional table)
   "Reverse the action of `org-link-escape'."
-- 
1.7.2.3


* [PATCH 03/16] New format of percent escape table
  2011-02-12 22:17 ` Bastien
                     ` (2 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 02/16] New unicode aware percent encoding algorithm David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 04/16] Fixup doc string David Maus
                     ` (12 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org.el (org-link-escape-chars, org-link-escape-chars-browser): New
format of percent escape table.
(org-link-escape): Use new table format.

Just a plain list with the chars that should be escaped.
---
 lisp/org.el |   27 +++++----------------------
 1 files changed, 5 insertions(+), 22 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 9aeeeda..7d38907 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8543,32 +8543,15 @@ according to FMT (default from `org-email-link-description-format')."
 	  "]"))
 
 (defconst org-link-escape-chars
-  '((?\    . "%20")
-    (?\[   . "%5B")
-    (?\]   . "%5D")
-    (?\340 . "%E0")  ; `a
-    (?\342 . "%E2")  ; ^a
-    (?\347 . "%E7")  ; ,c
-    (?\350 . "%E8")  ; `e
-    (?\351 . "%E9")  ; 'e
-    (?\352 . "%EA")  ; ^e
-    (?\356 . "%EE")  ; ^i
-    (?\364 . "%F4")  ; ^o
-    (?\371 . "%F9")  ; `u
-    (?\373 . "%FB")  ; ^u
-    (?\;   . "%3B")
-;;  (??    . "%3F")
-    (?=    . "%3D")
-    (?+    . "%2B")
-    )
-  "Association list of escapes for some characters problematic in links.
+  '(?\ ?\[ ?\] ?\; ?\= ?\+)
+  "List of characters that should be escaped in link.
 This is the list that is used for internal purposes.")
 
 (defvar org-url-encoding-use-url-hexify nil)
 
 (defconst org-link-escape-chars-browser
-  '((?\  . "%20")) ; 32 for the SPC char
-  "Association list of escapes for some characters problematic in links.
+  '(?\ )
+  "List of escapes for characters that are problematic in links.
 This is the list that is used before handing over to the browser.")
 
 (defun org-link-escape (text &optional table)
@@ -8578,7 +8561,7 @@ This is the list that is used before handing over to the browser.")
     (setq table (or table org-link-escape-chars))
     (mapconcat
      (lambda (char)
-       (if (or (assoc char table)
+       (if (or (member char table)
 	       (< char 32) (> char 126))
 	   (mapconcat (lambda (sequence)
 			(format "%%%.2X" sequence))
-- 
1.7.2.3


* [PATCH 04/16] Fixup doc string
  2011-02-12 22:17 ` Bastien
                     ` (3 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 03/16] New format of percent escape table David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 05/16] New optional argument: Merge user table with default table David Maus
                     ` (11 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org.el (org-link-escape): Fixup doc string.
---
 lisp/org.el |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 7d38907..cafb673 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8555,7 +8555,10 @@ This is the list that is used for internal purposes.")
 This is the list that is used before handing over to the browser.")
 
 (defun org-link-escape (text &optional table)
-  "Escape characters in TEXT that are problematic for links."
+  "Return percent escaped representation of TEXT.
+TEXT is a string with the text to escape.
+Optional argument TABLE is a list with characters that should be
+escaped.  When nil, `org-link-escape-chars' is used."
   (if (and org-url-encoding-use-url-hexify (not table))
       (url-hexify-string text)
     (setq table (or table org-link-escape-chars))
-- 
1.7.2.3


* [PATCH 05/16] New optional argument: Merge user table with default table
  2011-02-12 22:17 ` Bastien
                     ` (4 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 04/16] Fixup doc string David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 06/16] Inline function to properly decode utf8 characters in Emacs 22 David Maus
                     ` (10 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org.el (org-link-escape): New optional argument.  Merge user table
with default table.
---
 lisp/org.el |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index cafb673..a29d429 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8554,14 +8554,22 @@ This is the list that is used for internal purposes.")
   "List of escapes for characters that are problematic in links.
 This is the list that is used before handing over to the browser.")
 
-(defun org-link-escape (text &optional table)
+(defun org-link-escape (text &optional table merge)
   "Return percent escaped representation of TEXT.
 TEXT is a string with the text to escape.
 Optional argument TABLE is a list with characters that should be
-escaped.  When nil, `org-link-escape-chars' is used."
+escaped.  When nil, `org-link-escape-chars' is used.
+If optional argument MERGE is set, merge TABLE into
+`org-link-escape-chars'."
   (if (and org-url-encoding-use-url-hexify (not table))
       (url-hexify-string text)
-    (setq table (or table org-link-escape-chars))
+    (cond
+     ((and table merge)
+      (mapc (lambda (defchr)
+	      (unless (member defchr table)
+		(setq table (cons defchr table)))) org-link-escape-chars))
+     ((null table)
+      (setq table org-link-escape-chars)))
     (mapconcat
      (lambda (char)
        (if (or (member char table)
-- 
1.7.2.3


* [PATCH 06/16] Inline function to properly decode utf8 characters in Emacs 22
  2011-02-12 22:17 ` Bastien
                     ` (5 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 05/16] New optional argument: Merge user table with default table David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 07/16] Unescape functions moved and renamed from org-protocol.el David Maus
                     ` (9 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org-macs.el (org-char-to-string): Inline function to properly decode
utf8 characters in Emacs 22.  Moved and renamed from org-protocol.el.

* org-protocol.el (org-protocol-unhex-compound): Use renamed inline
function.
---
 lisp/org-macs.el     |    9 ++++++++-
 lisp/org-protocol.el |   13 +------------
 2 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/lisp/org-macs.el b/lisp/org-macs.el
index 5a56123..4451a54 100644
--- a/lisp/org-macs.el
+++ b/lisp/org-macs.el
@@ -35,7 +35,14 @@
 
 (eval-and-compile
   (unless (fboundp 'declare-function)
-    (defmacro declare-function (fn file &optional arglist fileonly))))
+    (defmacro declare-function (fn file &optional arglist fileonly)))
+  (if (>= emacs-major-version 23)
+      (defsubst org-char-to-string(c)
+	"Defsubst to decode UTF-8 character values in emacs 23 and beyond."
+	(char-to-string c))
+    (defsubst org-char-to-string (c)
+      "Defsubst to decode UTF-8 character values in emacs 22."
+      (string (decode-char 'ucs c)))))
 
 (declare-function org-add-props "org-compat" (string plist &rest props))
 (declare-function org-string-match-p "org-compat" (&rest args))
diff --git a/lisp/org-protocol.el b/lisp/org-protocol.el
index 33878a8..eb77f02 100644
--- a/lisp/org-protocol.el
+++ b/lisp/org-protocol.el
@@ -292,17 +292,6 @@ part."
 	  (mapcar 'org-protocol-unhex-string split-parts))
       split-parts)))
 
-;; This inline function is needed in org-protocol-unhex-compound to do
-;; the right thing to decode UTF-8 char integer values.
-(eval-when-compile
-  (if (>= emacs-major-version 23)
-      (defsubst org-protocol-char-to-string(c)
-	"Defsubst to decode UTF-8 character values in emacs 23 and beyond."
-	(char-to-string c))
-    (defsubst org-protocol-char-to-string (c)
-      "Defsubst to decode UTF-8 character values in emacs 22."
-      (string (decode-char 'ucs c)))))
-
 (defun org-protocol-unhex-string(str)
   "Unhex hexified unicode strings as returned from the JavaScript function
 encodeURIComponent. E.g. `%C3%B6' is the german Umlaut `ö'."
@@ -357,7 +346,7 @@ Note: this function also decodes single byte encodings like
 	(if (> eat 0) (setq eat (- eat 1)))
 	(cond
 	 ((= 0 eat)                         ;multi byte
-	  (setq ret (concat ret (org-protocol-char-to-string sum)))
+	  (setq ret (concat ret (org-char-to-string sum)))
 	  (setq sum 0))
 	 ((not bytes)                       ; single byte(s)
 	  (setq ret (org-protocol-unhex-single-byte-sequence hex))))
-- 
1.7.2.3


* [PATCH 07/16] Unescape functions moved and renamed from org-protocol.el
  2011-02-12 22:17 ` Bastien
                     ` (6 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 06/16] Inline function to properly decode utf8 characters in Emacs 22 David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 08/16] Declare obsolete & alias to respective org-link-unescape-* functions David Maus
                     ` (8 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org.el (org-link-unescape, org-link-unescape-compound)
(org-link-unescape-single-byte-sequence): Functions moved and renamed
from org-protocol.el.
---
 lisp/org.el |   90 ++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 files changed, 74 insertions(+), 16 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index a29d429..602462d 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8579,22 +8579,80 @@ If optional argument MERGE is set, merge TABLE into
 		      (encode-coding-char char 'utf-8) "")
 	   (char-to-string char))) text "")))
 
-(defun org-link-unescape (text &optional table)
-  "Reverse the action of `org-link-escape'."
-  (if (and org-url-encoding-use-url-hexify (not table))
-      (url-unhex-string text)
-    (setq table (or table org-link-escape-chars))
-    (when text
-      (let ((case-fold-search t)
-	    (re (mapconcat (lambda (x) (regexp-quote (downcase (cdr x))))
-			   table "\\|")))
-	(while (string-match re text)
-	  (setq text
-		(replace-match
-		 (char-to-string (car (rassoc (upcase (match-string 0 text))
-					      table)))
-		 t t text)))
-	text))))
+(defun org-link-unescape (str)
+  "Unhex hexified unicode strings as returned from the JavaScript function
+encodeURIComponent. E.g. `%C3%B6' is the german Umlaut `ö'."
+  (setq str (or str ""))
+  (let ((tmp "")
+	(case-fold-search t))
+    (while (string-match "\\(%[0-9a-f][0-9a-f]\\)+" str)
+      (let* ((start (match-beginning 0))
+	     (end (match-end 0))
+	     (hex (match-string 0 str))
+	     (replacement (org-link-unescape-compound (upcase hex))))
+	(setq tmp (concat tmp (substring str 0 start) replacement))
+	(setq str (substring str end))))
+    (setq tmp (concat tmp str))
+    tmp))
+
+(defun org-link-unescape-compound (hex)
+  "Unhexify unicode hex-chars. E.g. `%C3%B6' is the German Umlaut `ö'.
+Note: this function also decodes single byte encodings like
+`%E1' (\"á\") if not followed by another `%[A-F0-9]{2}' group."
+  (let* ((bytes (remove "" (split-string hex "%")))
+	 (ret "")
+	 (eat 0)
+	 (sum 0))
+    (while bytes
+      (let* ((b (pop bytes))
+	     (a (elt b 0))
+	     (b (elt b 1))
+	     (c1 (if (> a ?9) (+ 10 (- a ?A)) (- a ?0)))
+	     (c2 (if (> b ?9) (+ 10 (- b ?A)) (- b ?0)))
+	     (val (+ (lsh c1 4) c2))
+	     (shift
+	      (if (= 0 eat) ;; new byte
+		  (if (>= val 252) 6
+		    (if (>= val 248) 5
+		      (if (>= val 240) 4
+			(if (>= val 224) 3
+			  (if (>= val 192) 2 0)))))
+		6))
+	     (xor
+	      (if (= 0 eat) ;; new byte
+		  (if (>= val 252) 252
+		    (if (>= val 248) 248
+		      (if (>= val 240) 240
+			(if (>= val 224) 224
+			  (if (>= val 192) 192 0)))))
+		128)))
+	(if (>= val 192) (setq eat shift))
+	(setq val (logxor val xor))
+	(setq sum (+ (lsh sum shift) val))
+	(if (> eat 0) (setq eat (- eat 1)))
+	(cond
+	 ((= 0 eat)                         ;multi byte
+	  (setq ret (concat ret (org-char-to-string sum)))
+	  (setq sum 0))
+	 ((not bytes)                       ; single byte(s)
+	  (setq ret (org-link-unescape-single-byte-sequence hex))))
+	)) ;; end (while bytes
+    ret ))
+
+(defun org-link-unescape-single-byte-sequence (hex)
+  "Unhexify hex-encoded single byte character sequences."
+  (let ((bytes (remove "" (split-string hex "%")))
+	(ret ""))
+    (while bytes
+      (let* ((b (pop bytes))
+	     (a (elt b 0))
+	     (b (elt b 1))
+	     (c1 (if (> a ?9) (+ 10 (- a ?A)) (- a ?0)))
+	     (c2 (if (> b ?9) (+ 10 (- b ?A)) (- b ?0))))
+	(setq ret
+	      (concat ret (char-to-string
+			   (+ (lsh c1 4) c2))))))
+    ret))
 
 (defun org-xor (a b)
   "Exclusive or."
-- 
1.7.2.3


* [PATCH 08/16] Declare obsolete & alias to respective org-link-unescape-* functions
  2011-02-12 22:17 ` Bastien
                     ` (7 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 07/16] Unescape functions moved and renamed from org-protocol.el David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 09/16] Remove obsolete argument in call to org-link-unescape David Maus
                     ` (7 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org-protocol.el (org-protocol-unhex-string)
(org-protocol-unhex-compound)
(org-protocol-unhex-single-byte-sequence): Declare obsolete and
alias to respective org-link-unescape-* functions.
---
 lisp/org-protocol.el |   88 +++++++-------------------------------------------
 1 files changed, 12 insertions(+), 76 deletions(-)

diff --git a/lisp/org-protocol.el b/lisp/org-protocol.el
index eb77f02..078905a 100644
--- a/lisp/org-protocol.el
+++ b/lisp/org-protocol.el
@@ -130,6 +130,18 @@
 		  (filename &optional up))
 (declare-function server-edit "server" (&optional arg))
 
+(define-obsolete-function-alias
+  'org-protocol-unhex-compound 'org-link-unescape-compound
+  "2010-11-21")
+
+(define-obsolete-function-alias
+  'org-protocol-unhex-string 'org-link-unescape
+  "2010-11-21")
+
+(define-obsolete-function-alias
+  'org-protocol-unhex-single-byte-sequence
+  'org-link-unescape-single-byte-sequence
+  "2010-11-21")
 
 (defgroup org-protocol nil
   "Intercept calls from emacsclient to trigger custom actions.
@@ -292,82 +304,6 @@ part."
 	  (mapcar 'org-protocol-unhex-string split-parts))
       split-parts)))
 
-(defun org-protocol-unhex-string(str)
-  "Unhex hexified unicode strings as returned from the JavaScript function
-encodeURIComponent. E.g. `%C3%B6' is the german Umlaut `ö'."
-  (setq str (or str ""))
-  (let ((tmp "")
-	(case-fold-search t))
-    (while (string-match "\\(%[0-9a-f][0-9a-f]\\)+" str)
-      (let* ((start (match-beginning 0))
-	     (end (match-end 0))
-	     (hex (match-string 0 str))
-	     (replacement (org-protocol-unhex-compound (upcase hex))))
-	(setq tmp (concat tmp (substring str 0 start) replacement))
-	(setq str (substring str end))))
-    (setq tmp (concat tmp str))
-    tmp))
-
-
-(defun org-protocol-unhex-compound (hex)
-  "Unhexify unicode hex-chars. E.g. `%C3%B6' is the German Umlaut `ö'.
-Note: this function also decodes single byte encodings like
-`%E1' (\"á\") if not followed by another `%[A-F0-9]{2}' group."
-  (let* ((bytes (remove "" (split-string hex "%")))
-	 (ret "")
-	 (eat 0)
-	 (sum 0))
-    (while bytes
-      (let* ((b (pop bytes))
-	     (a (elt b 0))
-	     (b (elt b 1))
-	     (c1 (if (> a ?9) (+ 10 (- a ?A)) (- a ?0)))
-	     (c2 (if (> b ?9) (+ 10 (- b ?A)) (- b ?0)))
-	     (val (+ (lsh c1 4) c2))
-	     (shift
-	      (if (= 0 eat) ;; new byte
-		  (if (>= val 252) 6
-		    (if (>= val 248) 5
-		      (if (>= val 240) 4
-			(if (>= val 224) 3
-			  (if (>= val 192) 2 0)))))
-		6))
-	     (xor
-	      (if (= 0 eat) ;; new byte
-		  (if (>= val 252) 252
-		    (if (>= val 248) 248
-		      (if (>= val 240) 240
-			(if (>= val 224) 224
-			  (if (>= val 192) 192 0)))))
-		128)))
-	(if (>= val 192) (setq eat shift))
-	(setq val (logxor val xor))
-	(setq sum (+ (lsh sum shift) val))
-	(if (> eat 0) (setq eat (- eat 1)))
-	(cond
-	 ((= 0 eat)                         ;multi byte
-	  (setq ret (concat ret (org-char-to-string sum)))
-	  (setq sum 0))
-	 ((not bytes)                       ; single byte(s)
-	  (setq ret (org-protocol-unhex-single-byte-sequence hex))))
-	)) ;; end (while bytes
-    ret ))
-
-(defun org-protocol-unhex-single-byte-sequence(hex)
-  "Unhexify hex-encoded single byte character sequences."
-  (let ((bytes (remove "" (split-string hex "%")))
-	(ret ""))
-    (while bytes
-      (let* ((b (pop bytes))
-	     (a (elt b 0))
-	     (b (elt b 1))
-	     (c1 (if (> a ?9) (+ 10 (- a ?A)) (- a ?0)))
-	     (c2 (if (> b ?9) (+ 10 (- b ?A)) (- b ?0))))
-	(setq ret
-	      (concat ret (char-to-string
-			   (+ (lsh c1 4) c2))))))
-    ret))
-
 (defun org-protocol-flatten-greedy (param-list &optional strip-path replacement)
   "Greedy handlers might receive a list like this from emacsclient:
  '( (\"/dir/org-protocol:/greedy:/~/path1\" (23 . 12)) (\"/dir/param\")
-- 
1.7.2.3


* [PATCH 09/16] Remove obsolete argument in call to org-link-unescape
  2011-02-12 22:17 ` Bastien
                     ` (8 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 08/16] Declare obsolete & alias to respective org-link-unescape-* functions David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 10/16] Use new percent escape character table format David Maus
                     ` (6 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org-mobile.el (org-mobile-locate-entry): Remove obsolete argument in
call to org-link-unescape.

`org-link-unescape' always unescapes all percent escaped sequences.
---
 lisp/org-mobile.el |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/lisp/org-mobile.el b/lisp/org-mobile.el
index a278fb1..6616876 100644
--- a/lisp/org-mobile.el
+++ b/lisp/org-mobile.el
@@ -969,11 +969,10 @@ is currently a noop.")
     (if (not (string-match "\\`olp:\\(.*?\\):\\(.*\\)$" link))
 	nil
       (let ((file (match-string 1 link))
-	    (path (match-string 2 link))
-	    (table '((?: . "%3a") (?\[ . "%5b") (?\] . "%5d") (?/ . "%2f"))))
-	(setq file (org-link-unescape file table))
+	    (path (match-string 2 link)))
+	(setq file (org-link-unescape file))
 	(setq file (expand-file-name file org-directory))
-	(setq path (mapcar (lambda (x) (org-link-unescape x table))
+	(setq path (mapcar 'org-link-unescape
 			   (org-split-string path "/")))
 	(org-find-olp (cons file path))))))
 
-- 
1.7.2.3


* [PATCH 10/16] Use new percent escape character table format
  2011-02-12 22:17 ` Bastien
                     ` (9 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 09/16] Remove obsolete argument in call to org-link-unescape David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 11/16] Add percent sign to list of escape chars David Maus
                     ` (5 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org-mobile.el (org-mobile-escape-olp): Use new percent escape
character table format.
---
 lisp/org-mobile.el |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/lisp/org-mobile.el b/lisp/org-mobile.el
index 6616876..fe0a287 100644
--- a/lisp/org-mobile.el
+++ b/lisp/org-mobile.el
@@ -660,7 +660,7 @@ The table of checksums is written to the file mobile-checksums."
 	    (org-mobile-escape-olp (nth 4 (org-heading-components))))))
 
 (defun org-mobile-escape-olp (s)
-  (let  ((table '((?: . "%3a") (?\[ . "%5b") (?\] . "%5d") (?/ . "%2f"))))
+  (let  ((table '(?: ?/)))
     (org-link-escape s table)))
 
 ;;;###autoload
-- 
1.7.2.3


* [PATCH 11/16] Add percent sign to list of escape chars
  2011-02-12 22:17 ` Bastien
                     ` (10 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 10/16] Use new percent escape character table format David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 12/16] Rename lambda argument David Maus
                     ` (4 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org.el (org-link-escape-chars-browser, org-link-escape-chars): Add
percent sign to list of escape chars.
---
 lisp/org.el |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 602462d..370109b 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8543,14 +8543,14 @@ according to FMT (default from `org-email-link-description-format')."
 	  "]"))
 
 (defconst org-link-escape-chars
-  '(?\ ?\[ ?\] ?\; ?\= ?\+)
+  '(?\ ?\[ ?\] ?\; ?\= ?\+ ?\%)
   "List of characters that should be escaped in link.
 This is the list that is used for internal purposes.")
 
 (defvar org-url-encoding-use-url-hexify nil)
 
 (defconst org-link-escape-chars-browser
-  '(?\ )
+  '(?\ ?\%)
   "List of escapes for characters that are problematic in links.
 This is the list that is used before handing over to the browser.")
 
-- 
1.7.2.3

* [PATCH 12/16] Rename lambda argument
  2011-02-12 22:17 ` Bastien
                     ` (11 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 11/16] Add percent sign to list of escape chars David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 13/16] Refactor unescaping functions David Maus
                     ` (3 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org.el (org-link-escape): Rename lambda argument.
---
 lisp/org.el |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 1b5c3a8..8d49c05 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8576,8 +8576,8 @@ If optional argument MERGE is set, merge TABLE into
      (lambda (char)
        (if (or (member char table)
 	       (< char 32) (> char 126))
-	   (mapconcat (lambda (sequence)
-			(format "%%%.2X" sequence))
+	   (mapconcat (lambda (sequence-element)
+			(format "%%%.2X" sequence-element))
 		      (encode-coding-char char 'utf-8) "")
 	   (char-to-string char))) text "")))
 
-- 
1.7.2.3

* [PATCH 13/16] Refactor unescaping functions
  2011-02-12 22:17 ` Bastien
                     ` (12 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 12/16] Rename lambda argument David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 14/16] Always percent escape the percent sign David Maus
                     ` (2 subsequent siblings)
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* org.el (org-link-unescape): Simpler algorithm for replacing percent
escapes.
(org-link-unescape-compound): Use cond statements instead of nested
if, convert hex string with string-to-number, save match data.
(org-link-unescape-single-byte-sequence): Use mapconcat and
string-to-number for unescaping single byte sequence.
---
 lisp/org.el |  102 ++++++++++++++++++++++------------------------------------
 1 files changed, 39 insertions(+), 63 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index fcd421f..f35f898 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8584,77 +8584,53 @@ If optional argument MERGE is set, merge TABLE into
 (defun org-link-unescape (str)
   "Unhex hexified unicode strings as returned from the JavaScript function
 encodeURIComponent. E.g. `%C3%B6' is the german Umlaut `ö'."
-  (setq str (or str ""))
-  (let ((tmp "")
-	(case-fold-search t))
-    (while (string-match "\\(%[0-9a-f][0-9a-f]\\)+" str)
-      (let* ((start (match-beginning 0))
-	     (end (match-end 0))
-	     (hex (match-string 0 str))
-	     (replacement (org-link-unescape-compound (upcase hex))))
-	(setq tmp (concat tmp (substring str 0 start) replacement))
-	(setq str (substring str end))))
-    (setq tmp (concat tmp str))
-    tmp))
+  (unless (or (null str) (string= "" str))
+    (let ((pos 0) (case-fold-search t) unhexed)
+      (while (setq pos (string-match "\\(%[0-9a-f][0-9a-f]\\)+" str pos))
+	(setq unhexed (org-link-unescape-compound (match-string 0 str)))
+	(setq str (replace-match unhexed t t str))
+	(setq pos (+ pos (length unhexed))))))
+  str)
 
 (defun org-link-unescape-compound (hex)
   "Unhexify unicode hex-chars. E.g. `%C3%B6' is the German Umlaut `ö'.
 Note: this function also decodes single byte encodings like
 `%E1' (\"á\") if not followed by another `%[A-F0-9]{2}' group."
-  (let* ((bytes (remove "" (split-string hex "%")))
-	 (ret "")
-	 (eat 0)
-	 (sum 0))
-    (while bytes
-      (let* ((b (pop bytes))
-	     (a (elt b 0))
-	     (b (elt b 1))
-	     (c1 (if (> a ?9) (+ 10 (- a ?A)) (- a ?0)))
-	     (c2 (if (> b ?9) (+ 10 (- b ?A)) (- b ?0)))
-	     (val (+ (lsh c1 4) c2))
-	     (shift
-	      (if (= 0 eat) ;; new byte
-		  (if (>= val 252) 6
-		    (if (>= val 248) 5
-		      (if (>= val 240) 4
-			(if (>= val 224) 3
-			  (if (>= val 192) 2 0)))))
-		6))
-	     (xor
-	      (if (= 0 eat) ;; new byte
-		  (if (>= val 252) 252
-		    (if (>= val 248) 248
-		      (if (>= val 240) 240
-			(if (>= val 224) 224
-			  (if (>= val 192) 192 0)))))
-		128)))
-	(if (>= val 192) (setq eat shift))
-	(setq val (logxor val xor))
-	(setq sum (+ (lsh sum shift) val))
-	(if (> eat 0) (setq eat (- eat 1)))
-	(cond
-	 ((= 0 eat)                         ;multi byte
-	  (setq ret (concat ret (org-char-to-string sum)))
-	  (setq sum 0))
-	 ((not bytes)                       ; single byte(s)
-	  (setq ret (org-link-unescape-single-byte-sequence hex))))
-	)) ;; end (while bytes
-    ret ))
+  (save-match-data
+    (let* ((bytes (cdr (split-string hex "%")))
+	   (ret "")
+	   (eat 0)
+	   (sum 0))
+      (while bytes
+	(let* ((val (string-to-number (pop bytes) 16))
+	       (shift-xor
+		(if (= 0 eat)
+		    (cond
+		     ((>= val 252) (cons 6 252))
+		     ((>= val 248) (cons 5 248))
+		     ((>= val 240) (cons 4 240))
+		     ((>= val 224) (cons 3 224))
+		     ((>= val 192) (cons 2 192))
+		     (t (cons 0 0)))
+		  (cons 6 128))))
+	  (if (>= val 192) (setq eat (car shift-xor)))
+	  (setq val (logxor val (cdr shift-xor)))
+	  (setq sum (+ (lsh sum (car shift-xor)) val))
+	  (if (> eat 0) (setq eat (- eat 1)))
+	  (cond
+	   ((= 0 eat)			;multi byte
+	    (setq ret (concat ret (org-char-to-string sum)))
+	    (setq sum 0))
+	   ((not bytes)			; single byte(s)
+	    (setq ret (org-link-unescape-single-byte-sequence hex))))
+	  )) ;; end (while bytes
+      ret )))
 
 (defun org-link-unescape-single-byte-sequence (hex)
   "Unhexify hex-encoded single byte character sequences."
-  (let ((bytes (remove "" (split-string hex "%")))
-	(ret ""))
-    (while bytes
-      (let* ((b (pop bytes))
-	     (a (elt b 0))
-	     (b (elt b 1))
-	     (c1 (if (> a ?9) (+ 10 (- a ?A)) (- a ?0)))
-	     (c2 (if (> b ?9) (+ 10 (- b ?A)) (- b ?0))))
-	(setq ret
-	      (concat ret (char-to-string
-			   (+ (lsh c1 4) c2))))))
-    ret))
+  (mapconcat (lambda (byte)
+	       (char-to-string (string-to-number byte 16)))
+	     (cdr (split-string hex "%")) ""))
 
 (defun org-xor (a b)
   "Exclusive or."
-- 
1.7.2.3
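
As a worked example of the shift/xor arithmetic in
`org-link-unescape-compound', here is a minimal sketch for the
two-byte case only.  The helper name `my/decode-two-byte-sequence' is
hypothetical and not part of the patch; it assumes a well-formed
two-byte UTF-8 sequence such as `%C3%B6' and does no validation.

```elisp
(defun my/decode-two-byte-sequence (hex)
  "Return the code point for HEX, a two-byte sequence like \"%C3%B6\".
Hypothetical helper for illustration; handles two-byte input only."
  (let* ((bytes (mapcar (lambda (b) (string-to-number b 16))
                        ;; "%C3%B6" splits into ("" "C3" "B6"); drop the
                        ;; leading empty string as the patch does.
                        (cdr (split-string hex "%"))))
         (lead (car bytes))     ; #xC3 = 195, >= 192: shift 2, xor 192
         (cont (cadr bytes)))   ; #xB6 = 182, continuation: shift 6, xor 128
    ;; (195 xor 192) = 3, shifted left by 6 gives 192;
    ;; (182 xor 128) = 54; the sum is 246, the code point of `ö'.
    (+ (ash (logxor lead 192) 6)
       (logxor cont 128))))

(char-to-string (my/decode-two-byte-sequence "%C3%B6")) ; => "ö"
```

The sketch uses `ash' where the patch uses `lsh'; for these
non-negative values both give the same result.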

* [PATCH 14/16] Always percent escape the percent sign
  2011-02-12 22:17 ` Bastien
                     ` (13 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 13/16] Refactor unescaping functions David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 15/16] Use `org-link-unescape' instead of obsolete unhex string function David Maus
  2011-02-13 12:01   ` [PATCH 16/16] Throw error if encoding character in utf8 fails David Maus
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* lisp/org.el (org-link-escape, org-link-escape-chars-browser)
(org-link-escape-chars): Always percent escape the percent sign.
---
 lisp/org.el |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 8fcb9c4..1415eb1 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8565,14 +8565,14 @@ according to FMT (default from `org-email-link-description-format')."
 	  "]"))
 
 (defconst org-link-escape-chars
-  '(?\ ?\[ ?\] ?\; ?\= ?\+ ?\%)
+  '(?\ ?\[ ?\] ?\; ?\= ?\+)
   "List of characters that should be escaped in link.
 This is the list that is used for internal purposes.")
 
 (defvar org-url-encoding-use-url-hexify nil)
 
 (defconst org-link-escape-chars-browser
-  '(?\ ?\%)
+  '(?\ )
   "List of escapes for characters that are problematic in links.
 This is the list that is used before handing over to the browser.")
 
@@ -8595,7 +8595,7 @@ If optional argument MERGE is set, merge TABLE into
     (mapconcat
      (lambda (char)
        (if (or (member char table)
-	       (< char 32) (> char 126))
+	       (< char 32) (= char 37) (> char 126))
 	   (mapconcat (lambda (sequence-element)
 			(format "%%%.2X" sequence-element))
 		      (encode-coding-char char 'utf-8) "")
-- 
1.7.2.3

* [PATCH 15/16] Use `org-link-unescape' instead of obsolete unhex string function
  2011-02-12 22:17 ` Bastien
                     ` (14 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 14/16] Always percent escape the percent sign David Maus
@ 2011-02-13 12:01   ` David Maus
  2011-02-13 12:01   ` [PATCH 16/16] Throw error if encoding character in utf8 fails David Maus
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* lisp/org-protocol.el (org-protocol-split-data) (org-protocol-open-source):
Use `org-link-unescape' instead of obsolete unhex string function.
---
 lisp/org-protocol.el |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lisp/org-protocol.el b/lisp/org-protocol.el
index 46441db..b1ad0a9 100644
--- a/lisp/org-protocol.el
+++ b/lisp/org-protocol.el
@@ -301,7 +301,7 @@ part."
     (if unhexify
 	(if (fboundp unhexify)
 	    (mapcar unhexify split-parts)
-	  (mapcar 'org-protocol-unhex-string split-parts))
+	  (mapcar 'org-link-unescape split-parts))
       split-parts)))
 
 (defun org-protocol-flatten-greedy (param-list &optional strip-path replacement)
@@ -476,7 +476,7 @@ The location for a browser's bookmark should look like this:
   ;; As we enter this function for a match on our protocol, the return value
   ;; defaults to nil.
   (let ((result nil)
-        (f (org-protocol-unhex-string fname)))
+        (f (org-link-unescape fname)))
     (catch 'result
       (dolist (prolist org-protocol-project-alist)
         (let* ((base-url (plist-get (cdr prolist) :base-url))
-- 
1.7.2.3

* [PATCH 16/16] Throw error if encoding character in utf8 fails
  2011-02-12 22:17 ` Bastien
                     ` (15 preceding siblings ...)
  2011-02-13 12:01   ` [PATCH 15/16] Use `org-link-unescape' instead of obsolete unhex string function David Maus
@ 2011-02-13 12:01   ` David Maus
  16 siblings, 0 replies; 22+ messages in thread
From: David Maus @ 2011-02-13 12:01 UTC (permalink / raw)
  To: emacs-orgmode, bastien.guerry; +Cc: David Maus

* lisp/org.el (org-link-escape): Throw error if encoding character in
utf8 fails.
---
 lisp/org.el |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 1415eb1..0eb3a2b 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8598,8 +8598,10 @@ If optional argument MERGE is set, merge TABLE into
 	       (< char 32) (= char 37) (> char 126))
 	   (mapconcat (lambda (sequence-element)
 			(format "%%%.2X" sequence-element))
-		      (encode-coding-char char 'utf-8) "")
-	   (char-to-string char))) text "")))
+		      (or (encode-coding-char char 'utf-8)
+			  (error "Unable to percent escape character: %s"
+				 (char-to-string char))) "")
+	 (char-to-string char))) text "")))
 
 (defun org-link-unescape (str)
   "Unhex hexified unicode strings as returned from the JavaScript function
-- 
1.7.2.3
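
Taken together, patches 12, 14 and 16 leave the escaping rule as:
escape a character if it is in the table, below 32, equal to 37, or
above 126, and signal an error if UTF-8 encoding fails.  A minimal
standalone sketch of that rule follows.  The name `my/percent-escape'
is hypothetical, and the sketch leaves out the default table and the
MERGE argument of the real `org-link-escape'.

```elisp
(defun my/percent-escape (text &optional table)
  "Percent escape TEXT; TABLE is a list of extra characters to escape.
Hypothetical standalone sketch, not the Org function itself."
  (mapconcat
   (lambda (char)
     (if (or (memq char table)
             (< char 32) (= char 37) (> char 126))
         ;; Encode the character as UTF-8 and escape every byte.
         (mapconcat (lambda (byte) (format "%%%02X" byte))
                    (or (encode-coding-char char 'utf-8)
                        (error "Unable to percent escape character: %c"
                               char))
                    "")
       (char-to-string char)))
   text ""))

(my/percent-escape "a b%" '(?\s)) ; => "a%20b%25"
```

The percent sign is escaped whether or not it is in TABLE, which is
what keeps the output unambiguous for the unescaping side.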

* Re: Improve percent escaping links in Org mode (pull request / OK to push)
  2011-02-13 12:01   ` David Maus
@ 2011-02-13 13:41     ` Bastien
  2011-02-14  6:38       ` David Maus
  0 siblings, 1 reply; 22+ messages in thread
From: Bastien @ 2011-02-13 13:41 UTC (permalink / raw)
  To: David Maus; +Cc: emacs-orgmode

Hi David,

David Maus <dmaus@ictsoc.de> writes:

> Rebased to current head and here we go.

Wow, great work -- thanks for the perfect changelogs!

I've been through the patches, everything looks good, feel
free to push (and to mark patches as "accepted" in patchwork.)

You mentioned some possible backward compatibility issues with
a few existing links before in this thread, any update on this?

Thanks a lot to you, Sebastian -- and Vincent B. for bringing 
up this issue!

-- 
 Bastien

* Re: Improve percent escaping links in Org mode (pull request / OK to push)
  2011-02-13 13:41     ` Bastien
@ 2011-02-14  6:38       ` David Maus
  2011-02-14 10:09         ` Bastien
  0 siblings, 1 reply; 22+ messages in thread
From: David Maus @ 2011-02-14  6:38 UTC (permalink / raw)
  To: Bastien; +Cc: David Maus, emacs-orgmode


At Sun, 13 Feb 2011 14:41:14 +0100,
Bastien wrote:
>
> Hi David,
>
> David Maus <dmaus@ictsoc.de> writes:
>
> > Rebased to current head and here we go.
>
> Wow, great work -- thanks for the perfect changelogs!
>
> I've been through the patches, everything looks good, feel
> free to push (and to mark patches as "accepted" in patchwork.)

Thanks for the quick review. I won't be available until Wednesday, so
I will most likely push Wednesday or Thursday evening with a short
warning notice.

> You mentioned some possible backward compatibility issues with
> a few existing links before in this thread, any update on this?

Nope, but it just occurred to me that we might provide a small elisp
command that users can run in a buffer to check for possible problems.

The elisp could check each link for a substring that matches the
definition of a percent escaped character (%[0-9a-fA-F]{2}) and is not
in the old `org-link-escape-chars' list. Such links might pose a
problem because the new unescaping function will unescape this
sequence.
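
The check described above might be sketched as follows.  This is only
an illustration under stated assumptions: the function name
`my/find-suspect-percent-escapes' is hypothetical, the old escape list
is passed in as a plain list of characters, and only bracket links of
the form [[target]] or [[target][description]] are scanned.

```elisp
(defun my/find-suspect-percent-escapes (old-escaped-chars)
  "Return a list of (POSITION . SEQUENCE) for suspect escapes.
Scan bracket links in the current buffer for percent escape
sequences that do not correspond to a character in
OLD-ESCAPED-CHARS.  Hypothetical helper, not part of Org."
  (let ((old-sequences (mapcar (lambda (c) (format "%%%02X" c))
                               old-escaped-chars))
        (case-fold-search t)
        hits)
    (save-excursion
      (goto-char (point-min))
      ;; Match the target part of [[target]] and [[target][desc]].
      (while (re-search-forward "\\[\\[\\([^]\n]+\\)\\]" nil t)
        (let ((target (match-string 1))
              (target-start (match-beginning 1))
              (start 0))
          (while (string-match "%[0-9a-f][0-9a-f]" target start)
            (let ((seq (upcase (match-string 0 target))))
              ;; Sequences the old code could have produced are fine.
              (unless (member seq old-sequences)
                (push (cons (+ target-start (match-beginning 0)) seq)
                      hits)))
            (setq start (match-end 0))))))
    (nreverse hits)))
```

Called in an Org buffer with the characters of the old list, it
returns the buffer positions of sequences that the new
`org-link-unescape' would decode although the old escaping code could
not have produced them.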

Best,
  -- David
--
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de


* Re: Improve percent escaping links in Org mode (pull request / OK to push)
  2011-02-14  6:38       ` David Maus
@ 2011-02-14 10:09         ` Bastien
  0 siblings, 0 replies; 22+ messages in thread
From: Bastien @ 2011-02-14 10:09 UTC (permalink / raw)
  To: David Maus; +Cc: emacs-orgmode

Hi David,

David Maus <dmaus@ictsoc.de> writes:

> Thanks for the quick review. I won't be available until Wednesday, so
> I will most likely push Wednesday or Thursday evening with a short
> warning notice.

Looks good, thanks.

>> You mentioned some possible backward compatibility issues with
>> a few existing links before in this thread, any update on this?
>
> Nope, but it just occurred to me that we might provide a small elisp
> command that users can run in a buffer to check for possible problems.
>
> The elisp could check each link for a substring that matches the
> definition of a percent escaped character (%[0-9a-fA-F]{2}) and is not
> in the old `org-link-escape-chars' list. Such links might pose a
> problem because the new unescaping function will unescape this
> sequence.

Good idea!

-- 
 Bastien

Thread overview: 22+ messages
2011-01-02 19:37 Improve percent escaping links in Org mode (pull request / OK to push) David Maus
2011-02-12 22:17 ` Bastien
2011-02-13 12:01   ` David Maus
2011-02-13 13:41     ` Bastien
2011-02-14  6:38       ` David Maus
2011-02-14 10:09         ` Bastien
2011-02-13 12:01   ` [PATCH 01/16] Decode single byte sequence if decoding unicode failed David Maus
2011-02-13 12:01   ` [PATCH 02/16] New unicode aware percent encoding algorithm David Maus
2011-02-13 12:01   ` [PATCH 03/16] New format of percent escape table David Maus
2011-02-13 12:01   ` [PATCH 04/16] Fixup doc string David Maus
2011-02-13 12:01   ` [PATCH 05/16] New optional argument: Merge user table with default table David Maus
2011-02-13 12:01   ` [PATCH 06/16] Inline function to properly decode utf8 characters in Emacs 22 David Maus
2011-02-13 12:01   ` [PATCH 07/16] Unescape functions moved and renamed from org-protocol.el David Maus
2011-02-13 12:01   ` [PATCH 08/16] Declare obsolete & alias to respective org-link-unescape-* functions David Maus
2011-02-13 12:01   ` [PATCH 09/16] Remove obsolete argument in call to org-link-unescape David Maus
2011-02-13 12:01   ` [PATCH 10/16] Use new percent escape character table format David Maus
2011-02-13 12:01   ` [PATCH 11/16] Add percent sign to list of escape chars David Maus
2011-02-13 12:01   ` [PATCH 12/16] Rename lambda argument David Maus
2011-02-13 12:01   ` [PATCH 13/16] Refactor unescaping functions David Maus
2011-02-13 12:01   ` [PATCH 14/16] Always percent escape the percent sign David Maus
2011-02-13 12:01   ` [PATCH 15/16] Use `org-link-unescape' instead of obsolete unhex string function David Maus
2011-02-13 12:01   ` [PATCH 16/16] Throw error if encoding character in utf8 fails David Maus
