emacs-orgmode@gnu.org archives
* link syntax fixing bug?
@ 2021-03-20 22:46 Samuel Wales
  2021-04-17 19:27 ` Kyle Meyer
  2021-04-25 10:46 ` Maxim Nikulin
  0 siblings, 2 replies; 3+ messages in thread
From: Samuel Wales @ 2021-03-20 22:46 UTC
  To: emacs-orgmode

in recent maint, i am trying the code included with the maint release
to update org link escaping syntax.

the issue is that when i click on google, the space before "hi" does
not show up in the search box.  ergo, different results.

*** should be orig
[[http://www.google.com/search?q=%7E%22retroactive%20whatever%22%20%22hi%22][retro
original]]
*** should be fixed, is not?
[[http://www.google.com/search?q=~"retroactive whatever" "hi"][retro
original]]
*** [[https://orgmode.org/Changes.html][Org mode for Emacs – Release notes]]
The following function will help switching your links to the new syntax:

(defun org-update-link-syntax (&optional no-query)
  "Update syntax for links in current buffer.
Query before replacing a link, unless optional argument NO-QUERY
is non-nil."
  (interactive "P")
  (org-with-point-at 1
    (let ((case-fold-search t))
      (while (re-search-forward "\\[\\[[^]]*?%\\(?:2[05]\\|5[BD]\\)" nil t)
        (let ((object (save-match-data (org-element-context))))
          (when (and (eq 'link (org-element-type object))
                     (= (match-beginning 0)
                        (org-element-property :begin object)))
            (goto-char (org-element-property :end object))
            (let* ((uri-start (+ 2 (match-beginning 0)))
                   (uri-end (save-excursion
                              (goto-char uri-start)
                              (re-search-forward "\\][][]" nil t)
                              (match-beginning 0)))
                   (uri (buffer-substring-no-properties uri-start uri-end)))
              (when (or no-query
                        (y-or-n-p
                         (format "Possibly obsolete URI syntax: %S.  Fix? "
                                 uri)))
                (setf (buffer-substring uri-start uri-end)
                      (org-link-escape (org-link-decode uri)))))))))))

i'm kind of clueless about what the issue is.
thank you.

-- 
The Kafka Pandemic

Please learn what misopathy is.
https://thekafkapandemic.blogspot.com/2013/10/why-some-diseases-are-wronged.html



* Re: link syntax fixing bug?
From: Kyle Meyer @ 2021-04-17 19:27 UTC
  To: Samuel Wales; +Cc: emacs-orgmode

Samuel Wales writes:

> in recent maint, i am trying the code included with the maint release
> to update org link escaping syntax.
>
> the issue is that when i click on google, the space before "hi" does
> not show up in the search box.  ergo, different results.
>
> *** should be orig
> [[http://www.google.com/search?q=%7E%22retroactive%20whatever%22%20%22hi%22][retro
> original]]
> *** should be fixed, is not?
> [[http://www.google.com/search?q=~"retroactive whatever" "hi"][retro
> original]]

My understanding is that the Org 9.3 changes were about moving away from
the percent-encoding that Org used to avoid "[" and "]" in the link.  It
looks like the URL you're showing above should be left as is because it
is the usual URL percent-encoding, without the pre-9.3 Org
percent-encoding on top.
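
To make the effect concrete, here is a minimal sketch of what the
release-notes function does to a URI. It assumes Org 9.3 or later, where
`org-link-decode' and `org-link-escape' are provided by the `ol'
library; the helper name `my/org-link-updated-uri' is hypothetical and
only mirrors the last form of `org-update-link-syntax'.

```elisp
(require 'ol)  ; assumption: Org 9.3+, which provides the two functions below

(defun my/org-link-updated-uri (uri)
  "Return URI as the release-notes updater would rewrite it.
Hypothetical helper: decode percent-encoding, then backslash-escape."
  (org-link-escape (org-link-decode uri)))

;; Ordinary URL percent-encoding is decoded as well, which is the
;; result reported at the top of this thread.
(my/org-link-updated-uri
 "http://www.google.com/search?q=%7E%22retroactive%20whatever%22%20%22hi%22")
;; => "http://www.google.com/search?q=~\"retroactive whatever\" \"hi\""

;; The case the updater targets: brackets hidden by the pre-9.3 Org
;; encoding come back as backslash-escaped brackets.
(my/org-link-updated-uri "file:notes%5B1%5D.org")
;; => "file:notes\\[1\\].org"
```

The function cannot tell the two cases apart, because both contain one
of the sequences its regexp looks for (%20, %25, %5B, %5D).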

Here are some threads related to this:

  https://orgmode.org/list/87tvguyohn.fsf@nicolasgoaziou.fr/T/#u
  https://orgmode.org/list/87sgvusl43.fsf@nicolasgoaziou.fr/T/#u



* Re: link syntax fixing bug?
From: Maxim Nikulin @ 2021-04-25 10:46 UTC
  To: emacs-orgmode

On 21/03/2021 05:46, Samuel Wales wrote:
> the issue is that when i click on google, the space before "hi" does
> not show up in the search box.  ergo, different results.
> 
> *** should be orig
> [[http://www.google.com/search?q=%7E%22retroactive%20whatever%22%20%22hi%22][retro
> original]]
> *** should be fixed, is not?
> [[http://www.google.com/search?q=~"retroactive whatever" "hi"][retro
> original]]

Reading Kyle's response, I realized that you might have an unsafe URL
handler. I hope I am wrong. To rule out some excessively smart JS, I
tried

     firefox 'http:/127.0.0.1/search?q=~"retroactive whatever" "hi"'

and I got the expected result in the URL bar. With the following test
script, "fake-browser",

#!/bin/sh
exec kdialog --title "Fake Browser" --msgbox "Args $#: '$*'"

and some customization:

  '(browse-url-browser-function (quote browse-url-generic))
  '(browse-url-generic-program "fake-browser")

I did not see any whitespace problem with the following link:

[[http:/127.0.0.1/search?q=~"retroactive whatever" "hi"][retro-original]]

So neither passing the URL to the handler nor Firefox's handling of the
URL causes a problem.

However, protecting spaces in URLs from the `org-fill-paragraph'
function was mentioned in the mailing list archive as one of the reasons
for introducing the second pass of percent-encoding. Double
percent-encoding is clearly a problem, since there is no way to reliably
guess whether the second pass was applied or not. My impression is that
there would be no problem if only the characters that are "offensive" to
Org, "][ \", were replaced by their percent-encoded equivalents in URLs.
Maybe I have just missed cases where mixing percent-encoded and Unicode
characters leads to some problem, but I believe it is safe. My
hypothesis is that replacing just "[", "]", and "\" with their
percent-encoded equivalents in any URL does not cause any issue, since
web servers are able to decode them (selective encoding, not a second
pass over the whole URL). File links on Windows may be an exception.
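
The selective encoding described above could be sketched as follows.
This is only an illustration of the hypothesis, not anything Org does;
the function name `my/percent-encode-org-specials' is made up for the
example.

```elisp
(defun my/percent-encode-org-specials (url)
  "Percent-encode only square brackets and backslashes in URL.
Hypothetical helper.  Everything else, including sequences that
are already percent-encoded, is left untouched, so no second
encoding pass is applied to the URL as a whole."
  (replace-regexp-in-string
   "[][\\]"                      ; character class: ], [ and backslash
   (lambda (match)
     ;; "%%" is a literal percent sign, "%02X" the two-digit hex code.
     (format "%%%02X" (string-to-char match)))
   url t t))

(my/percent-encode-org-specials "http://127.0.0.1/a[1]?q=~\"x%20y\"")
;; => "http://127.0.0.1/a%5B1%5D?q=~\"x%20y\""
```

Because "%" itself is never re-encoded, decoding the result once gives
back the URL the server is expected to see.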

My opinion is that `org-lint' gives false positives for URLs with
percent-encoded characters. Such URLs are rather widespread, e.g. in
search queries.

> *** [[https://orgmode.org/Changes.html][Org mode for Emacs – Release notes]]
> The following function will help switching your links to the new syntax:
> 
> (defun org-update-link-syntax (&optional no-query)
...
>        (while (re-search-forward "\\[\\[[^]]*?%\\(?:2[05]\\|5[BD]\\)" nil t)

I believe the logic, at least for the space character (%20), should be
more sophisticated. Maybe URLs with "%20" should be decoded only if the
decoded URL still contains percent-encoded characters. Maybe decoding
should be prevented if any of the characters for which percent-encoding
is mandatory ("[]?/", etc.) is present alongside percent-encoded
sequences. Maybe the only way is an interactive comparison of the
original and decoded URLs.
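
The first of these ideas could be sketched as a predicate like the one
below. It is a heuristic only, and it would miss old-style links in
which the extra encoding touched nothing but brackets. The name
`my/looks-double-encoded-p' is hypothetical; `url-unhex-string' comes
from Emacs' url-util library and decodes each "%XX" sequence once.

```elisp
(require 'url-util)

(defun my/looks-double-encoded-p (uri)
  "Return t if URI still has a %XX sequence after one decoding pass.
Hypothetical helper illustrating the heuristic above."
  (let ((case-fold-search t))
    (and (string-match-p "%[0-9a-f][0-9a-f]" (url-unhex-string uri))
         t)))

;; "%2520" decodes to "%20", so a second layer of encoding is likely.
(my/looks-double-encoded-p "http://example.com/?q=a%2520b")
;; => t

;; The URL from this thread decodes to plain text, so leave it alone.
(my/looks-double-encoded-p
 "http://www.google.com/search?q=%7E%22retroactive%20whatever%22%20%22hi%22")
;; => nil
```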

I do not think that the particular example you provided

http://www.google.com/search?q=%7E%22retroactive%20whatever%22%20%22hi%22

needs decoding. It is not human-friendly, but it is safer and quite
widespread. On the other hand, the decoded variant should not lead to
any problem either, unless something is misconfigured:

[[http://www.google.com/search?q=~"retroactive whatever" "hi"][retro 
original]]


