emacs-orgmode@gnu.org archives
* RE: %20 in file://... URL
@ 2010-11-22 15:46 Vincent Belaïche
  2010-11-22 18:16 ` David Maus
  0 siblings, 1 reply; 21+ messages in thread
From: Vincent Belaïche @ 2010-11-22 15:46 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche

[-- Attachment #1: Type: text/plain, Size: 1500 bytes --]

> Date: Wed, 17 Nov 2010 21:43:59 +0100
> From: dmaus@ictsoc.de
> To: vincent.b.1@hotmail.fr
> Subject: Re: [Orgmode] %20 in file://... URL
> CC: emacs-orgmode@gnu.org; carsten.dominik@gmail.com
> 

[...]

Hello,

Sorry for the delay, I was on a business trip.
>  
> Thanks for sending the patch, but it won't provide a clean solution to
> the problem: The function modified by your patch works under the
> assumption, that for example the sequence %3A represents a percent
> escaped colon.  But the function that creates the link in the first
> place does not percent-escape chars 

Er, in my situation I create the link with another package, and I *did*
escape the colon.

> -- If we use just this patch, opening a link to a file literally
> called "%3A.org" will fail.
>  
> So we need to modify all functions that create links to properly
> percent-escape the part of a link that follows the link type in order
> to make all functions unescape the link.
>  
> Good news: Reworking the percent-escaping is a work in progress on my
> list[1] and if it is finished and accepted, the problem should be
> solved.
>  
> Best,
>   -- David

I see, so I understand that you will someday modify a function creating
links in order to implement character escaping. I can give a hand if
you tell me the function name.

I am also sending you my patch as a git diff, just in case (with the same
changelog attached again). Sorry for using `diff -c', I just followed
the info node `(emacs) Sending Patches'.

   Vincent.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Patch --]
[-- Type: text/x-patch, Size: 1448 bytes --]

diff --git a/lisp/org.el b/lisp/org.el
index 201dd87..4e2e2c4 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -9639,9 +9639,28 @@ to search for.  If LINE or SEARCH is given, the file will be
 opened in Emacs, unless an entry from org-file-apps that makes
 use of groups in a regexp matches.
 If the file does not exist, an error is thrown."
-  (let* ((file (if (equal path "")
+  (let* ((%xx-decoded-path 
+	  (let ((pos 0) (%xx-decoded-path path))
+	    (setq %xx-decoded-path path)
+	    (while (setq pos (string-match "%\\([0-9A-F]\\)\\([0-9A-F]\\)" %xx-decoded-path pos))
+	      (setq pos (1+ pos)
+		    %xx-decoded-path (replace-match 
+				      (string (let ((code 0) digit)
+						(dotimes (i 2)
+						  (setq 
+						   digit (aref (match-string (1+ i) %xx-decoded-path) 0)
+						   code (+ (if (<= digit ?9)
+							       (- digit ?0)
+							     (- digit 55))
+							   (* 16 code)))) code))
+				      t t %xx-decoded-path)))
+	    ;; remove //localhost/ prefix if any
+	    (and (string-match "\\`//localhost/" %xx-decoded-path)
+		 (setq %xx-decoded-path (substring %xx-decoded-path 12)))
+	    %xx-decoded-path))
+	 (file (if (equal path "")
 		   buffer-file-name
-		 (substitute-in-file-name (expand-file-name path))))
+		 (substitute-in-file-name (expand-file-name %xx-decoded-path))))
 	 (file-apps (append org-file-apps (org-default-apps)))
 	 (apps (org-remove-if
 		'org-file-apps-entry-match-against-dlink-p file-apps))

[-- Attachment #3: ChangeLog --]
[-- Type: text/plain, Size: 218 bytes --]

2010-11-13  Vincent Belaïche  <vincentb1@users.sourceforge.net>

	* org.el (org-open-file): Decode %XX escapes in URL with file
	type, so that applications other than browsers are not confused with the filename.


[-- Attachment #4: Type: text/plain, Size: 201 bytes --]

_______________________________________________
Emacs-orgmode mailing list
Please use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode


* Re: %20 in file://... URL
  2010-11-22 15:46 %20 in file://... URL Vincent Belaïche
@ 2010-11-22 18:16 ` David Maus
  2011-02-12 15:02   ` Bastien
  0 siblings, 1 reply; 21+ messages in thread
From: David Maus @ 2010-11-22 18:16 UTC (permalink / raw)
  To: Vincent Belaïche; +Cc: Org mode


[-- Attachment #1.1: Type: text/plain, Size: 938 bytes --]

At Mon, 22 Nov 2010 16:46:44 +0100,
Vincent Belaïche wrote:
> I see, so I understand that you will someday modify a function creating
> links in order to implement character escaping. I can give a hand if
> you tell me the function name.

To be exact: Org already escapes some characters (C-h v
org-link-escape-chars RET) and the colon is a candidate for being on
the list.  The functions responsible for escaping/unescaping are
`org-link-escape' and `org-link-unescape' and the new implementations
of these functions can be found in

https://github.com/dmj/dmj-org-mode/tree/feature/org-percent-escaping

The task at hand: Anticipate the consequences of the new implementation,
i.e. what will happen to links created with the old algorithm.

Patches, ideas, and comments on the modifications are welcome.

Best,
  -- David
-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de


* Re: %20 in file://... URL
  2010-11-22 18:16 ` David Maus
@ 2011-02-12 15:02   ` Bastien
  0 siblings, 0 replies; 21+ messages in thread
From: Bastien @ 2011-02-12 15:02 UTC (permalink / raw)
  To: David Maus; +Cc: Vincent Belaïche, Org mode

Hi David,

David Maus <dmaus@ictsoc.de> writes:

> To be exact: Org already escapes some characters (C-h v
> org-link-escape-chars RET) and the colon is a candidate for being on
> the list.  The functions responsible for escaping/unescaping are
> `org-link-escape' and `org-link-unescape' and the new implementations
> of these functions can be found in
>
> https://github.com/dmj/dmj-org-mode/tree/feature/org-percent-escaping
>
> The task at hand: Anticipate the consequences of the new implementation,
> i.e. what will happen to links created with the old algorithm.

Did you have time to progress on this front?  Is there something I can
test right now?

Thanks!

-- 
 Bastien


* Re: %20 in file://... URL
  2010-11-24 20:57 ` David Maus
@ 2011-02-12 14:36   ` Bastien
  0 siblings, 0 replies; 21+ messages in thread
From: Bastien @ 2011-02-12 14:36 UTC (permalink / raw)
  To: David Maus; +Cc: Vincent Belaïche, Org mode

Hi David,

did you find time to work on character escaping in links?

Is there some code we can test by patching the current git HEAD?

Thanks!

-- 
 Bastien


* Re: %20 in file://... URL
@ 2010-12-30  5:29 Vincent Belaïche
  0 siblings, 0 replies; 21+ messages in thread
From: Vincent Belaïche @ 2010-12-30  5:29 UTC (permalink / raw)
  To: Org mode, David Maus


[...]

>> 
>> hoping that the above helps.
>
>Definitely.
>
>Last but not least: On this mailing list you should normally Cc: answers
>to the original poster -- some are not subscribed to the list at all,
>some (like me) read the list in a different account than their main
>mail account and miss answers etc.
>
>Best and thanks,
>  -- David

By the way, I realized that Emacs ships a "URL" package that already
has a URL parsing function, url-generic-parse-url.

Wouldn't it be better if Org just relied on this function and/or
extended it, or at least if Org offered the same API as url and tried to
align on the same conventions for non-standard URLs, so that Org could
be a replacement for the URL package?

I noticed that the URL package does not seem to do any %XX decoding,
for instance on my machine:

(url-generic-parse-url "file:c%3A/toto.html")

evaluates to 

[cl-struct-url "file" nil nil nil 21 "c%3A/toto.html" nil nil nil]
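
A minimal sketch of how the parser and a separate decoding step could be
combined, assuming only the url package that ships with Emacs
(`url-generic-parse-url', `url-filename', `url-unhex-string').  The
helper name `my/url-decoded-filename' is made up for this illustration
and is not part of Org or of the url package.

```elisp
(require 'url-parse)
(require 'url-util)

(defun my/url-decoded-filename (url-string)
  "Parse URL-STRING and return its filename part with %XX escapes decoded.
Hypothetical helper: the parser itself leaves the escapes untouched,
so decoding is done as an explicit second step."
  (url-unhex-string (url-filename (url-generic-parse-url url-string))))

;; The filename slot still contains \"%3A\" after parsing; the helper
;; decodes it afterwards.
(my/url-decoded-filename "file:c%3A/toto.html")
```

Keeping parsing and decoding separate means the caller decides when a
%XX sequence is to be read as an escape, which is the question this
thread is about.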

I also noticed that the info:FILE#NODE does not seem to be supported by
Org, but it is by URL. 

Actually it would be even more useful to have also info:FILE#NODE::NNN
with NNN being the line number within the info NODE, but url does not
support the ::NNN extension which seems to be defined only in Org.

VBR, 
   Vincent.


* Re: %20 in file://... URL
       [not found] ` <BLU104-W15A3F7F6097ED8F6D95CEB84210@phx.gbl>
@ 2010-11-29 20:03   ` David Maus
  0 siblings, 0 replies; 21+ messages in thread
From: David Maus @ 2010-11-29 20:03 UTC (permalink / raw)
  To: Vincent Belaïche; +Cc: David Maus, emacs-orgmode


[-- Attachment #1.1: Type: text/plain, Size: 2596 bytes --]

At Fri, 26 Nov 2010 23:11:13 +0100,
Vincent Belaïche wrote:
> 
> [...]
> 
> 
> > 1. The percent escaping/unescaping functions are not unicode aware;
> 
> My understanding/feeling is that a link in a file foo.org should be
> interpreted with the coding scheme of this file. 

I think this is not reasonable: The information about the coding system
of the file where the link was created is not carried with the link.
E.g. the unescaping function would have no idea about how to properly
unescape the escaped chars.

> Now I am surprised that you write that there is no unicode support, because
> some code like this looks like it handles unicode:

You are right: I should have said: The *escaping* function is not
unicode aware.  The unescaping function wasn't either, but in the
development version on GitHub I replaced the old `org-link-unescape'
with the function formerly known as `org-protocol-unhex-string'.

> > 2. The percent escaping/unescaping functions require a user to
> >    explicitly tell which characters should be escaped;
> 
> That should be dependent on the type of link: file and http should support
> either escaping all characters, or escaping no character but % and ].

Correct.  The new algorithm escapes characters if one of these
conditions is true:

 - the character is an ASCII control character (<32, 127)
 - the character is the percent sign
 - the character is a non-ASCII character (>127, unicode)
 - the character is in the user supplied list

For unescaping there is no table, it just unescapes all percent
escaped characters.
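
The four conditions above can be sketched as follows.  This is not the
code from the GitHub branch: the names `my/link-escape-char-p' and
`my/link-escape' are hypothetical, and encoding the escaped characters
as UTF-8 (one %XX per byte) is an assumption made for the example.

```elisp
(defun my/link-escape-char-p (char extra)
  "Return non-nil if CHAR must be escaped; EXTRA is the user supplied list."
  (or (< char 32) (= char 127)   ; ASCII control characters
      (= char ?%)                ; the percent sign itself
      (> char 127)               ; non-ASCII characters
      (memq char extra)))        ; characters requested by the user

(defun my/link-escape (text &optional extra)
  "Percent-escape TEXT following the four conditions listed above.
Assumption: escaped characters are encoded as UTF-8, one %XX per byte."
  (mapconcat
   (lambda (char)
     (if (my/link-escape-char-p char extra)
         ;; Encode first, so a non-ASCII character yields one escape
         ;; per UTF-8 byte.
         (mapconcat (lambda (byte) (format "%%%02X" byte))
                    (encode-coding-string (char-to-string char) 'utf-8)
                    "")
       (char-to-string char)))
   text ""))

;; The colon is only escaped when the caller asks for it.
(my/link-escape "c:/100%.org")
(my/link-escape "c:/100%.org" '(?:))
```

Because the percent sign is always escaped, a plain "unescape
everything" step is enough on the reading side and no table is needed
there.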

> 
> I have a question for you: Emacs has a url package to interpret URLs. Why
> does Org not rely on it?

Good question.  This is something to find out: There is C-h v
org-url-encoding-use-url-hexify RET

org-url-encoding-use-url-hexify is a variable defined in `org.el'.
Its value is nil

Documentation:
Not documented as a variable.

This variable was added back in 2009 (commit b077f710) but does not seem
to be used at all.

The only difference I can see is that you can pass org-link-escape a
table of user-defined characters that should be escaped -- but I am not
sure if this functionality is really needed.

So the next step is to check all functions that use escape/unescape and
see if the calls to org-link-escape/unescape can be replaced with calls
to url-hexify-string/url-unhex-string.

Thanks,
  -- David
-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de


* Re: %20 in file://... URL
  2010-11-23  5:25 Vincent Belaïche
@ 2010-11-24 20:57 ` David Maus
  2011-02-12 14:36   ` Bastien
       [not found] ` <BLU104-W15A3F7F6097ED8F6D95CEB84210@phx.gbl>
  1 sibling, 1 reply; 21+ messages in thread
From: David Maus @ 2010-11-24 20:57 UTC (permalink / raw)
  To: Vincent Belaïche; +Cc: Org mode


[-- Attachment #1.1: Type: text/plain, Size: 5984 bytes --]

At Tue, 23 Nov 2010 06:25:40 +0100,
Vincent Belaïche wrote:
> 
> >From: 	David Maus
> >Subject: 	Re: [Orgmode] %20 in file://... URL
> >Date: 	Mon, 22 Nov 2010 19:16:09 +0100
> >User-agent: 	Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (Gojō) APEL/10.8 Emacs/23.2 (i486-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
> >
> >At Mon, 22 Nov 2010 16:46:44 +0100,
> >Vincent Belaïche wrote:
> >> I see, so I understand that you will someday modify a function creating
> >> links in order to implement character escaping. I can give a hand if
> >> you tell me the function name.
> >
> >To be exact: Org already escapes some characters (C-h v
> >org-link-escape-chars RET) and the colon is a candidate for being on
> >the list.  
> 
> What exactly does "already" mean?

It means that Org performs percent escaping in some cases but there
are currently three problems:

 1. The percent escaping/unescaping functions are not unicode aware;
 2. The percent escaping/unescaping functions require a user to
    explicitly tell which characters should be escaped;
 3. There is no clear rule in place when to escape/unescape -- that is
    the problem you've hit.

The solution for the 3rd problem is not to modify `org-open-file' but
to implement the rule that says:

 - If a link is written in an Org file, then everything after the
   scheme (type) is percent escaped.  IIRC this is already done for
   characters that would break the parser (square brackets) but some
   chars are missing (the percent sign, obviously).
 - If a link is read from an Org file and passed to another function,
   the part after the scheme is unescaped.

These two rules would cover the problem you face: Although the link is
not created by Org, it is unescaped before it is opened, so %3A would
expand to ":".
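
A minimal sketch of the reading side of this rule, assuming the
unescaping is delegated to `url-unhex-string' from Emacs's url package
rather than to Org's own `org-link-unescape'.  The function name
`my/link-split-and-unescape' is hypothetical and only serves the
illustration.

```elisp
(require 'url-util)

(defun my/link-split-and-unescape (link)
  "Return (SCHEME . REST) for LINK, with REST percent-unescaped.
SCHEME is nil when LINK has no scheme prefix.  Hypothetical helper,
not part of Org."
  (if (string-match "\\`\\([a-zA-Z][a-zA-Z0-9+.-]*\\):" link)
      ;; Only the part after the scheme (type) is unescaped.
      (cons (match-string 1 link)
            (url-unhex-string (substring link (match-end 0))))
    (cons nil (url-unhex-string link))))

;; The link from this thread: the scheme is kept as is, and %3A in
;; the remainder is decoded before the path is handed on.
(my/link-split-and-unescape "file://localhost/c%3A/msys/1.0/temp/jay.html")
```

The writing side would be the mirror image: escape everything after the
scheme, including the percent sign, when the link is stored.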

> 
> Ok, you mean that some version of org already does the job, but not the
> org that is in the official Git repository?
> 

Yes, kind of.  It's a personal working copy of the official repository
and when I am finished I either merge it into the official one or ask
someone to review the changes and "pull" my version into the official
repo.  Think of it as the bleeding bleeding edge -- highly unstable,
not guaranteed to work at all.


> 
> The following is just comments on the code, most of it is a matter of
> taste, which you may well disagree with.
> 
> 1. In the org.el file at the link which you provided I also found the
>    functions org-entry-protect-space & org-entry-restore-space, which
>    also do some escaping. Why not use a single function?

Good point.

> 
> 2. In the function org-link-escape, there is a lambda expression  
> 
>    (lambda (sequence)
>    			(format "%%%.2X" sequence))
> 
>    The argument name should be sequence-element rather than sequence.

Ditto.  Changed it.

> 
> 3. In org-link-unescape, there are 3 substringing-or-concatenations, but
>    you could make it simpler by using a single replace-match and a
>    start position in the string-match. That would look like this
>    (*not tested*):
> 
> (defun org-link-unescape (str)
>   "Unhex hexified unicode strings as returned from the JavaScript function
> encodeURIComponent. E.g. `%C3%B6' is the German umlaut `ö'."
>   (setq str (or str ""))
>   (let ((case-fold-search t)
>         (pos 0))
>     (while (string-match "\\(%[0-9a-f][0-9a-f]\\)+" str pos)
>       ;; Store the length of the untouched tail as a negative offset,
>       ;; then resume right after the decoded text so that it is never
>       ;; decoded twice.  save-match-data protects the match data from
>       ;; string functions called while decoding.
>       (setq pos (- (match-end 0) (length str))
>             str (replace-match
>                  (save-match-data
>                    (org-link-unescape-compound (upcase (match-string 0 str))))
>                  t t str)
>             pos (+ pos (length str)))))
>   str)
> 
> My feeling is that the kind of code above is slightly simpler in
> execution as there is only one string manipulation at each
> iteration instead of two, and also easier to maintain as it has
> fewer use cases (i.e. it does not really matter if the escaped
> sequence is at the end of string or not). You also avoid some
> intermediate variables like `replacement' as the use of
> replace-match makes it self-explanatory that the result of
> org-link-unescape-compound is a replacement.

Agreed, refactoring the unescape functions is on the list.

> 
> 4. In org-link-unescape-compound,
> 
>     (remove "" (split-string hex "%"))
> 
> 
>    can be replaced by (cdr  (split-string hex "%")) because there is
>    always only one empty string in the sequence and it is in the 1st
>    place.

Agreed.

> 
> 5. In org-link-unescape-compound, you could make fewer comparisons
>    by replacing the code
> 
> 	     (shift
> 	      (if (= 0 eat) ;; new byte
> 		  (if (>= val 252) 6
> 		    (if (>= val 248) 5
> 		      (if (>= val 240) 4
> 			(if (>= val 224) 3
> 			  (if (>= val 192) 2 0)))))
> 		6))
> 	     (xor
> 	      (if (= 0 eat) ;; new byte
> 		  (if (>= val 252) 252
> 		    (if (>= val 248) 248
> 		      (if (>= val 240) 240
> 			(if (>= val 224) 224
> 			  (if (>= val 192) 192 0)))))
> 		128)))
> 
> by (*not tested*):
> 
> 	     (shift-xor
> 	      (if (= 0 eat) ;; new byte
> 		  (if (>= val 252) '(6 . 252)
> 		    (if (>= val 248) '(5 . 248)
> 		      (if (>= val 240) '(4 . 240)
> 			(if (>= val 224) '(3 . 224)
> 			  (if (>= val 192) '(2 . 192) '(0 . 0))))))
> 		'(6 . 128)))
> 	     (shift (car shift-xor))
> 	     (xor (cdr shift-xor))
> 
> 
> The code above looks more concise to me; depending on val it may also
> run faster.

Okay, I have to look at this suggestion.

> 
> hoping that the above helps.

Definitely.

Last but not least: On this mailing list you should normally Cc: answers
to the original poster -- some are not subscribed to the list at all,
some (like me) read the list in a different account than their main
mail account and miss answers etc.

Best and thanks,
  -- David
-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de


* Re: %20 in file://... URL
@ 2010-11-23  5:25 Vincent Belaïche
  2010-11-24 20:57 ` David Maus
       [not found] ` <BLU104-W15A3F7F6097ED8F6D95CEB84210@phx.gbl>
  0 siblings, 2 replies; 21+ messages in thread
From: Vincent Belaïche @ 2010-11-23  5:25 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche

>From: 	David Maus
>Subject: 	Re: [Orgmode] %20 in file://... URL
>Date: 	Mon, 22 Nov 2010 19:16:09 +0100
>User-agent: 	Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (Gojō) APEL/10.8 Emacs/23.2 (i486-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
>
>At Mon, 22 Nov 2010 16:46:44 +0100,
>Vincent Belaïche wrote:
>> I see, so I understand that you will someday modify a function creating
>> links in order to implement character escaping. I can give a hand if
>> you tell me the function name.
>
>To be exact: Org already escapes some characters (C-h v
>org-link-escape-chars RET) and the colon is a candidate for being on
>the list.  

What exactly does "already" mean? I pushed the colon '(?: . "%3A")
onto this org-link-escape-chars list, and I tried a link like this:

[[file://localhost/c%3A/msys/1.0/temp/jay.html][link]]

I get this message: "if: No such file:
//localhost/c%3A/msys/1.0/temp/jay.html". Evaluating the full org.el from
the link you gave does not work either, because I get a message that
org-complete cannot be loaded.

>The functions responsible for escaping/unescaping are `org-link-escape'
>and `org-link-unescape' and the new implementations of these functions
>can be found in
>
>https://github.com/dmj/dmj-org-mode/tree/feature/org-percent-escaping
>

Ok, you mean that some version of org already does the job, but not the
org that is in the official Git repository?


>The task at hand: Anticipate the consequences of the new implementation.
>I.e.  what will happen to links created with the old algorithm.
>

I have no idea of the consequences, I can be a beta tester of it, but
for the time being this code does not work with the kind of link which I
use.

>Patches, ideas, and comments on the modifications are welcome.
>

The following is just comments on the code, most of it is a matter of
taste, which you may well disagree with.

1. In the org.el file at the link which you provided I also found the
   functions org-entry-protect-space & org-entry-restore-space, which
   also do some escaping. Why not use a single function?

2. In the function org-link-escape, there is a lambda expression  

   (lambda (sequence)
   			(format "%%%.2X" sequence))

   The argument name should be sequence-element rather than sequence.

3. In org-link-unescape, there are 3 substringing-or-concatenations, but
   you could make it simpler by a single replace-match and using a start-position in the
   string-match. That would look like this (*not tested*):

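(defun org-link-unescape (str)
  "Unhex hexified unicode strings as returned from the JavaScript function
encodeURIComponent. E.g. `%C3%B6' is the German umlaut `ö'."
  (setq str (or str ""))
  (let ((case-fold-search t)
        (pos 0))
    (while (string-match "\\(%[0-9a-f][0-9a-f]\\)+" str pos)
      ;; Store the length of the untouched tail as a negative offset,
      ;; then resume right after the decoded text so that it is never
      ;; decoded twice.  save-match-data protects the match data from
      ;; string functions called while decoding.
      (setq pos (- (match-end 0) (length str))
            str (replace-match
                 (save-match-data
                   (org-link-unescape-compound (upcase (match-string 0 str))))
                 t t str)
            pos (+ pos (length str)))))
  str)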

My feeling is that the kind of code above is slightly simpler in
execution as there is only one string manipulation at each
iteration instead of two, and also easier to maintain as it has
fewer use cases (i.e. it does not really matter if the escaped
sequence is at the end of string or not). You also avoid some
intermediate variables like `replacement' as the use of
replace-match makes it self-explanatory that the result of
org-link-unescape-compound is a replacement.

4. In org-link-unescape-compound,

    (remove "" (split-string hex "%"))


   can be replaced by (cdr  (split-string hex "%")) because there is
   always only one empty string in the sequence and it is in the 1st
   place.
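
The `split-string' behaviour this relies on can be checked with a small
stand-in.  The sketch below is not `org-link-unescape-compound': the
name `my/unhex-utf8' is hypothetical, and it swaps the manual
shift/xor arithmetic for `decode-coding-string', purely to keep the
example short.

```elisp
(defun my/unhex-utf8 (hex)
  "Decode HEX, a run of %XX escapes such as %C3%B6, as UTF-8 text.
Hypothetical stand-in that uses `decode-coding-string' instead of
manual bit shifting."
  (let ((bytes (mapcar (lambda (digits) (string-to-number digits 16))
                       ;; HEX starts with a percent sign, so the first
                       ;; element of the split is the empty string.
                       (cdr (split-string hex "%")))))
    (decode-coding-string (apply #'unibyte-string bytes) 'utf-8)))

;; With an explicit separator, the leading empty string is kept, which
;; is why a plain `cdr' is enough to drop it.
(split-string "%C3%B6" "%")
(my/unhex-utf8 "%C3%B6")
```

This only holds as long as the argument always starts with a percent
sign, which the `\\(%[0-9a-f][0-9a-f]\\)+' match guarantees.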

5. In org-link-unescape-compound, you could make fewer comparisons
   by replacing the code

	     (shift
	      (if (= 0 eat) ;; new byte
		  (if (>= val 252) 6
		    (if (>= val 248) 5
		      (if (>= val 240) 4
			(if (>= val 224) 3
			  (if (>= val 192) 2 0)))))
		6))
	     (xor
	      (if (= 0 eat) ;; new byte
		  (if (>= val 252) 252
		    (if (>= val 248) 248
		      (if (>= val 240) 240
			(if (>= val 224) 224
			  (if (>= val 192) 192 0)))))
		128)))

by (*not tested*):

	     (shift-xor
	      (if (= 0 eat) ;; new byte
		  (if (>= val 252) '(6 . 252)
		    (if (>= val 248) '(5 . 248)
		      (if (>= val 240) '(4 . 240)
			(if (>= val 224) '(3 . 224)
			  (if (>= val 192) '(2 . 192) '(0 . 0))))))
		'(6 . 128)))
	     (shift (car shift-xor))
	     (xor (cdr shift-xor))


The code above looks more concise to me; depending on val it may also
run faster.

hoping that the above helps.

>
>Best,
>  -- David
>-- 

BR,
   Vincent.

[...]


* Re: %20 in file://... URL
  2010-11-13  6:18 Vincent Belaïche
                   ` (2 preceding siblings ...)
  2010-11-17 20:43 ` David Maus
@ 2010-11-17 20:43 ` David Maus
  3 siblings, 0 replies; 21+ messages in thread
From: David Maus @ 2010-11-17 20:43 UTC (permalink / raw)
  To: Vincent Belaïche; +Cc: Org mode, carsten.dominik


[-- Attachment #1.1: Type: text/plain, Size: 1067 bytes --]

At Sat, 13 Nov 2010 07:18:42 +0100,
Vincent Belaïche wrote:
> 
> Herein attached follows my patch. Please feel free for brickbats...
> 

Thanks for sending the patch, but it won't provide a clean solution to
the problem: The function modified by your patch works under the
assumption that, for example, the sequence %3A represents a percent
escaped colon.  But the function that creates the link in the first
place does not percent-escape chars -- If we use just this patch,
opening a link to a file literally called "%3A.org" will fail.

So we need to modify all functions that create links to properly
percent-escape the part of a link that follows the link type in order
to make all functions unescape the link.
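
The ambiguity can be sketched with the escaping functions from Emacs's
own url package (`url-hexify-string' and `url-unhex-string').  This is
only an illustration and assumes nothing about how Org's
`org-link-escape' and `org-link-unescape' behave; the helper name
`my/roundtrip-ok-p' is made up for the example.

```elisp
(require 'url-util)

(defun my/roundtrip-ok-p (name)
  "Return non-nil if NAME survives percent-escaping then unescaping.
Hypothetical helper used only for this illustration."
  (equal name (url-unhex-string (url-hexify-string name))))

;; A file literally called \"%3A.org\": unescaping a link that was
;; never escaped produces a different file name.
(url-unhex-string "%3A.org")

;; Escaping on creation also escapes the percent sign itself, so the
;; unescape step restores the original name.
(url-hexify-string "%3A.org")
(my/roundtrip-ok-p "%3A.org")
```

Unescaping is therefore only safe when every function that creates a
link has escaped it first, which is the point made above.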

Good news: Reworking the percent-escaping is a work in progress on my
list[1] and if it is finished and accepted, the problem should be
solved.

Best,
  -- David

[1] http://thread.gmane.org/gmane.emacs.orgmode/30694/focus=33179
-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de


* Re: %20 in file://... URL
  2010-11-13  6:18 Vincent Belaïche
  2010-11-13  6:28 ` Vincent Belaïche
@ 2010-11-14 17:30 ` David Maus
  2010-11-17 20:43 ` David Maus
  2010-11-17 20:43 ` David Maus
  3 siblings, 0 replies; 21+ messages in thread
From: David Maus @ 2010-11-14 17:30 UTC (permalink / raw)
  To: Vincent Belaïche; +Cc: Org mode


[-- Attachment #1.1: Type: text/plain, Size: 383 bytes --]

At Sat, 13 Nov 2010 07:18:42 +0100,
Vincent Belaïche wrote:
> 
> Herein attached follows my patch. Please feel free for brickbats...
> 

Could I ask you to resend the patch in a format that can be applied
with Git?  E.g. try:

git diff > my-new-patch.patch

Best,
  -- David
-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de

[-- Attachment #1.2: Type: application/pgp-signature, Size: 230 bytes --]


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: %20 in file://... URL
  2010-11-13  6:18 Vincent Belaïche
@ 2010-11-13  6:28 ` Vincent Belaïche
  2010-11-14 17:30 ` David Maus
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 21+ messages in thread
From: Vincent Belaïche @ 2010-11-13  6:28 UTC (permalink / raw)
  To: emacs-orgmode, Giovanni Ridolfi


[-- Attachment #1.1: Type: text/plain, Size: 776 bytes --]




>From: vincent.b.1@hotmail.fr
>To: emacs-orgmode@gnu.org; giovanni.ridolfi@yahoo.it
>Subject: Re: [Orgmode] %20 in file://... URL
>Date: Sat, 13 Nov 2010 07:18:42 +0100
>CC: vincent.b.1@hotmail.fr
>
> 
>[...]
> 
>>
>>Please, do! :-)
>>
>>Giovanni
>>
> 
>Herein attached follows my patch. Please feel free for brickbats...
> 
>   Vincent.
> 

BTW, I made the patch based on a GIT pull, so the org.el I used
should be almost the latest one --- unless people have pushed new
versions since the time when I made the pull.

  Vincent

[-- Attachment #1.2: Type: text/html, Size: 1114 bytes --]


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: %20 in file://... URL
@ 2010-11-13  6:18 Vincent Belaïche
  2010-11-13  6:28 ` Vincent Belaïche
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Vincent Belaïche @ 2010-11-13  6:18 UTC (permalink / raw)
  To: Org mode, giovanni.ridolfi; +Cc: Vincent Belaïche

[-- Attachment #1: Type: text/plain, Size: 124 bytes --]


[...]

>
>Please, do! :-)
>
>Giovanni
>

Herein attached follows my patch. Please feel free for brickbats...

   Vincent.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: %xx decode --]
[-- Type: text/x-patch, Size: 1738 bytes --]

*** org.el.old	Fri Nov  5 19:16:29 2010
--- org.el	Sat Nov 13 05:50:54 2010
***************
*** 9639,9647 ****
  opened in Emacs, unless an entry from org-file-apps that makes
  use of groups in a regexp matches.
  If the file does not exist, an error is thrown."
!   (let* ((file (if (equal path "")
  		   buffer-file-name
! 		 (substitute-in-file-name (expand-file-name path))))
  	 (file-apps (append org-file-apps (org-default-apps)))
  	 (apps (org-remove-if
  		'org-file-apps-entry-match-against-dlink-p file-apps))
--- 9639,9666 ----
  opened in Emacs, unless an entry from org-file-apps that makes
  use of groups in a regexp matches.
  If the file does not exist, an error is thrown."
!   (let* ((%xx-decoded-path 
! 	  (let ((pos 0) (%xx-decoded-path path))
! 	    (setq %xx-decoded-path path)
! 	    (while (setq pos (string-match "%\\([0-9A-F]\\)\\([0-9A-F]\\)" %xx-decoded-path pos))
! 	      (setq pos (1+ pos)
! 		    %xx-decoded-path (replace-match 
! 				      (string (let ((code 0) digit)
! 						(dotimes (i 2)
! 						  (setq 
! 						   digit (aref (match-string (1+ i) %xx-decoded-path) 0)
! 						   code (+ (if (<= digit ?9)
! 							       (- digit ?0)
! 							     (- digit 55))
! 							   (* 16 code)))) code))
! 				      t t %xx-decoded-path)))
! 	    ;; remove //localhost/ prefix if any
! 	    (and (string-match "\\`//localhost/" %xx-decoded-path)
! 		 (setq %xx-decoded-path (substring %xx-decoded-path 12)))
! 	    %xx-decoded-path))
! 	 (file (if (equal path "")
  		   buffer-file-name
! 		 (substitute-in-file-name (expand-file-name %xx-decoded-path))))
  	 (file-apps (append org-file-apps (org-default-apps)))
  	 (apps (org-remove-if
  		'org-file-apps-entry-match-against-dlink-p file-apps))

[-- Attachment #3: Change log --]
[-- Type: text/plain, Size: 218 bytes --]

2010-11-13  Vincent Belaïche  <vincentb1@users.sourceforge.net>

	* org.el (org-open-file): Decode %XX escapes in URL with file
	type, so that applications other than browsers are not confused
	by the filename.
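
For readers who want to try the decoding step outside of org.el, here
is a standalone sketch of what the patch does inside `org-open-file'.
The function name is hypothetical. Unlike the patch, it also accepts
lower-case hex digits. Like the patch, it decodes each escape to a
single character, so multi-byte UTF-8 escape sequences are not handled.

```elisp
;; Hypothetical standalone version of the decoding done by the patch.
(defun my/percent-decode-file-path (path)
  "Return PATH with %XX escapes decoded and a leading //localhost/ removed."
  (let ((pos 0)
        (decoded path))
    (while (string-match "%\\([0-9A-Fa-f][0-9A-Fa-f]\\)" decoded pos)
      (let ((char (string-to-number (match-string 1 decoded) 16)))
        ;; Resume just after the decoded character, so that a decoded
        ;; "%" is never decoded a second time.
        (setq pos (1+ (match-beginning 0))
              decoded (replace-match (string char) t t decoded))))
    ;; Remove the //localhost/ prefix, if any.
    (if (string-match "\\`//localhost/" decoded)
        (substring decoded (match-end 0))
      decoded)))

(my/percent-decode-file-path "//localhost/c%3A/msys/1.0/temp/foo.html")
;; => "c:/msys/1.0/temp/foo.html"
```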



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: %20 in file://... URL
  2010-11-05  6:42 Vincent Belaïche
@ 2010-11-05  8:39 ` Giovanni Ridolfi
  0 siblings, 0 replies; 21+ messages in thread
From: Giovanni Ridolfi @ 2010-11-05  8:39 UTC (permalink / raw)
  To: Vincent Belaïche; +Cc: Org mode

Vincent Belaïche <vincent.b.1@hotmail.fr> writes:

> Sorry to dwell on it: I am just wondering, is there any reason why %20
> and the like are not supported with the file: protocol?
>
> If not, I can submit a patch to correct this and have the % constructs
> decoded.

Please, do! :-)

Giovanni

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: %20 in file://... URL
@ 2010-11-05  6:42 Vincent Belaïche
  2010-11-05  8:39 ` Giovanni Ridolfi
  0 siblings, 1 reply; 21+ messages in thread
From: Vincent Belaïche @ 2010-11-05  6:42 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche

Hello,

Sorry to dwell on it: I am just wondering, is there any reason why %20
and the like are not supported with the file: protocol?

If not, I can submit a patch to correct this and have the % constructs
decoded.

BR,
   Vincent.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: %20 in file://... URL
@ 2010-10-27 21:19 Vincent Belaïche
  0 siblings, 0 replies; 21+ messages in thread
From: Vincent Belaïche @ 2010-10-27 21:19 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche



> From: giovanni.ridolfi@yahoo.it
> To: vincent.b.1@hotmail.fr
> Subject: Re: [Orgmode] %20 in file://... URL
> Date: Tue, 26 Oct 2010 17:39:55 +0200
> CC: emacs-orgmode@gnu.org
>

[...]

>
> *But*, Vincent, why do you use "%3A" when the colon ":" works? ?-/
>

The reason is quite simple: I wrote a package called w32utils.el which
does several things useful for MSWindows users, among them converting
the paths of marked files in Dired mode to various formats, such as a
URL for a browser, for LaTeX hyperref, or for Org mode, and a
backslashed MSWindows path (that was the primary purpose).

This package also makes it easier to open bash shell buffers (using MSYS
bash) under Emacs on MSWindows, and makes it easier to keep the
default-directory variable up to date when you cd to some path (for
example when changing the drive letter, or using the MSYS fstab links).

If you are interested, I can put w32utils.el on my page and send you a
link. This is still very experimental, and the manual is not up to date.

Well, this package makes a strict and complete conversion of paths to
URL, and this is the reason for the %3A.
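
To illustrate what such a strict conversion might look like (this is
only a sketch, not the actual w32utils.el code, and the function name
is hypothetical): every path segment is percent-encoded with
`url-hexify-string', which also encodes the colon after the drive
letter.

```elisp
(require 'url-util)

;; Hypothetical sketch; w32utils.el itself is not shown in this thread.
(defun my/windows-path-to-file-url (path)
  "Convert a Windows PATH into a strictly escaped file://localhost/ URL."
  (concat "file://localhost/"
          (mapconcat #'url-hexify-string
                     ;; Split on backslashes and slashes, dropping
                     ;; empty segments.
                     (split-string path "[\\\\/]" t)
                     "/")))

(my/windows-path-to-file-url "c:\\msys\\1.0\\temp\\foo.html")
;; => "file://localhost/c%3A/msys/1.0/temp/foo.html"
```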

> cheers,
>
> Giovanni
>

BR,
  Vincent.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: %20 in file://... URL
  2010-10-26  5:15 Vincent Belaïche
@ 2010-10-26 15:39 ` Giovanni Ridolfi
  0 siblings, 0 replies; 21+ messages in thread
From: Giovanni Ridolfi @ 2010-10-26 15:39 UTC (permalink / raw)
  To: Vincent Belaïche; +Cc: Org mode

Vincent Belaïche <vincent.b.1@hotmail.fr> writes:

>>> My Org mode version is not able to interpret any `%20' or suchlike
>>> escape codes in file://... URL

>>Which Org mode version are you using?
>> 
>>M-x org-version RET
>> 
>>And can you give an example of a link that does not work as expected?
>> 
>
> [[file://localhost/c%3A/msys/1.0/temp/foo.html][link]]
>
> the file exists on my PC as 
>
> c:\msys\1.0\temp\foo.html
>
> I am under MSWindows XP.
> M-x org-version
> => Org-mode version 7.01
>
Emacs version?
Here:
Org-mode version 7.01trans commit-4cd56cfa7b93902544acb32848e36ee4004239a3
GNU Emacs 23.2.1 (i386-mingw-nt5.1.2600) of 2010-05-08 on G41R2F1
Windows XP 

I can confirm the error. 

[[file://localhost/c:/Documents and Settings/A/Documenti/z-emacs/a.org][link-localh-:]]   works

[[file://c:/Documents and Settings/A/Documenti/z-emacs/a.org][link-c:]]      works, faster

[[file://localhost/c%3/Documents and Settings/A/Documenti/z-emacs/a.org][link-localh% ]]
does not work as expected: it opens an "a.org" buffer, but the path is
wrong: if you try to save the "a.org" buffer with C-x w, the directory
proposed is:

.../A/Documenti/z-emacs/

*But*, Vincent, why do you use "%3A" when the colon ":" works? ?-/

cheers,

Giovanni

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: %20 in file://... URL
@ 2010-10-26  5:15 Vincent Belaïche
  2010-10-26 15:39 ` Giovanni Ridolfi
  0 siblings, 1 reply; 21+ messages in thread
From: Vincent Belaïche @ 2010-10-26  5:15 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche


>Which Org mode version are you using?
> 
>M-x org-version RET
> 
>And can you give an example of a link that does not work as expected?
> 
>Best,
>  -- David
>
Hello,

Thanks for the feedback. Here is an example of a failing link:

[[file://localhost/c%3A/msys/1.0/temp/foo.html][link]]

the file exists on my PC as 

c:\msys\1.0\temp\foo.html


I am under MSWindows XP.

M-x org-version
=> Org-mode version 7.01

This is more or less the latest version on emacs trunk.

BR,
   Vincent.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: %20 in file://... URL
  2010-10-24 20:49 Vincent Belaïche
@ 2010-10-24 21:02 ` David Maus
  0 siblings, 0 replies; 21+ messages in thread
From: David Maus @ 2010-10-24 21:02 UTC (permalink / raw)
  To: Vincent Belaïche; +Cc: Org mode


[-- Attachment #1.1: Type: text/plain, Size: 456 bytes --]

At Sun, 24 Oct 2010 22:49:12 +0200,
Vincent Belaïche wrote:
> 
> Hello,
> 
> My Org mode version is not able to interpret any `%20' or suchlike
> escape codes in file://... URL, is that normal?

Which Org mode version are you using?

M-x org-version RET

And can you give an example of a link that does not work as expected?

Best,
  -- David
-- 
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email..... dmaus@ictsoc.de

[-- Attachment #1.2: Type: application/pgp-signature, Size: 230 bytes --]


^ permalink raw reply	[flat|nested] 21+ messages in thread

* %20 in file://... URL
@ 2010-10-24 20:49 Vincent Belaïche
  2010-10-24 21:02 ` David Maus
  0 siblings, 1 reply; 21+ messages in thread
From: Vincent Belaïche @ 2010-10-24 20:49 UTC (permalink / raw)
  To: Org mode; +Cc: Vincent Belaïche

Hello,

My Org mode version is not able to interpret any `%20' or suchlike
escape codes in file://... URL, is that normal?

      Vincent.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2011-02-12 15:02 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-22 15:46 %20 in file://... URL Vincent Belaïche
2010-11-22 18:16 ` David Maus
2011-02-12 15:02   ` Bastien
  -- strict thread matches above, loose matches on Subject: below --
2010-12-30  5:29 Vincent Belaïche
2010-11-23  5:25 Vincent Belaïche
2010-11-24 20:57 ` David Maus
2011-02-12 14:36   ` Bastien
     [not found] ` <BLU104-W15A3F7F6097ED8F6D95CEB84210@phx.gbl>
2010-11-29 20:03   ` David Maus
2010-11-22 15:46 Vincent Belaïche
2010-11-13  6:18 Vincent Belaïche
2010-11-13  6:28 ` Vincent Belaïche
2010-11-14 17:30 ` David Maus
2010-11-17 20:43 ` David Maus
2010-11-17 20:43 ` David Maus
2010-11-05  6:42 Vincent Belaïche
2010-11-05  8:39 ` Giovanni Ridolfi
2010-10-27 21:19 Vincent Belaïche
2010-10-26  5:15 Vincent Belaïche
2010-10-26 15:39 ` Giovanni Ridolfi
2010-10-24 20:49 Vincent Belaïche
2010-10-24 21:02 ` David Maus

Code repositories for project(s) associated with this inbox:

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
