From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Maus
Subject: Re: Clicking on URL does convert some special characters
Date: Sun, 11 Sep 2011 19:40:10 +0200
Message-ID: <87bour566d.wl%dmaus@ictsoc.de>
References: <80liu0aev0.fsf@somewhere.org> <8162l41w4r.fsf@gmail.com>
Mime-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka")
Content-Type: multipart/signed;
 boundary="pgp-sign-Multipart_Sun_Sep_11_19:40:06_2011-1";
 micalg=pgp-sha256; protocol="application/pgp-signature"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from eggs.gnu.org ([140.186.70.92]:60053)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1R2o18-0001YV-Fz for emacs-orgmode@gnu.org;
 Sun, 11 Sep 2011 13:40:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1R2o17-0003IG-0L for emacs-orgmode@gnu.org;
 Sun, 11 Sep 2011 13:40:30 -0400
Received: from plane.gmane.org ([80.91.229.3]:33738)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1R2o16-0003I0-Nm for emacs-orgmode@gnu.org;
 Sun, 11 Sep 2011 13:40:28 -0400
Received: from public by plane.gmane.org with local (Exim 4.69)
 (envelope-from ) id 1R2o12-0003w4-Dj for emacs-orgmode@gnu.org;
 Sun, 11 Sep 2011 19:40:24 +0200
In-Reply-To: <8162l41w4r.fsf@gmail.com>
List-Id: "General discussions about Org-mode."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: Jambunathan K
Cc: public-emacs-orgmode-mXXj517/zsQ@plane.gmane.org, Sebastien Vauban

--pgp-sign-Multipart_Sun_Sep_11_19:40:06_2011-1
Content-Type: multipart/mixed;
 boundary="Multipart_Sun_Sep_11_19:40:06_2011-1"

--Multipart_Sun_Sep_11_19:40:06_2011-1
Content-Type: text/plain; charset=US-ASCII

At Wed, 07 Sep 2011 16:01:48 +0530,
Jambunathan K wrote:
> > Hello,
> >
> > I just realized a diff in behavior between 3 URL entered in the Org buffer
> > with slight differences:
> >
> > - http://web.com/file.php?name=Rep&path=%2FPROJ%2FSomeFile.txt
> >   This one is correctly exported, but when clicking on it from the Org buffer,
> >   the URL opened in the browser is
> >   http://web.com/file.php?name=Rep&path=%252FPROJ%252FSomeFile.txt,
> >                                          ^^       ^^
> >   hence path not found error.
> >
> > - [[http://web.com/file.php?name=Rep&path=%2FPROJ%2FSomeFile.txt]]
> >   Works OK in Org and in exported HTML file.
> >
> > - [[http://web.com/file.php?name=Rep&path=%2FPROJ%2FSomeFile.txt][Description]]
> >   Idem.
>
> 2. When the Org buffer is exported to html or odt
>
> ,---- In org-html-handle-links
> |       (setq path (save-match-data (org-link-unescape <==
> |                                    (match-string 3 line))))
> |       (setq type (cond
> |                   ((match-end 2) (match-string 2 line))
> |                   ((save-match-data
> |                      (or (file-name-absolute-p path)
> |                          (string-match "^\\.\\.?/" path)))
> |                    "file")
> |                   (t "internal")))
> |       (setq path (org-extract-attributes (org-link-unescape path))) <==
> `----
>
> link unescape happens twice. Asymmetry due to One link escape + two link
> unescape asymmetry creates problem on export.
>
> Based on historical research, the second org-link-unescape can be
> removed.
The fact that attributes can be entered at C-c C-l prompt is
> largely documented and so the second call to org-link-unescape can
> largely be removed.

The three issues (plain links, entering a link via C-c C-l, and
double-unescape) are not related in a strict sense.

I just pushed:

 - a fix for `org-open-at-point' and plain links; the problem was that,
   in contrast to bracket links, the plain link was not unescaped
   when read from the buffer

 - removed the second `org-link-unescape' in `org-html-handle-links';
   PATH is already unescaped, does not change between the first and
   third `setq', and should always be unescaped only once.

Attached patch is for org-lparse.el.

The inconsistency between C-c C-l, copy'n'paste, and manually entering
a link is under further review. The basic problem is that we

 (a) need to escape certain characters for Org mode (i.e. square
     brackets)

 (b) need to treat links in an Org buffer either as escaped -or- as
     unescaped; you can't always tell the difference from the string
     alone (e.g. "%25" could be the escaped percent sign or the
     unescaped sequence "%25")

 (c) don't know if the user enters or pastes an escaped or unescaped
     link; if the user manually enters a link with the sequence "%5B"
     and we later read that link, we can't tell if it is a bracket
     escaped by us or a percent-escaped bracket in the original link

Best,
  -- David
--
OpenPGP... 0x99ADB83B5A4478E6
Jabber.... dmjena@jabber.org
Email.....
dmaus@ictsoc.de

--Multipart_Sun_Sep_11_19:40:06_2011-1
Content-Type: text/plain; type=patch; charset=US-ASCII
Content-Disposition: attachment; filename="0001-Remove-unecessary-link-unescape.patch"
Content-Transfer-Encoding: base64

RnJvbSA2NmYwOWY0NjA4ZGFlMjcyYTBlYWM0MzJkZDA5N2EwMGY2MzJmMWQ2IE1vbiBTZXAgMTcg
MDA6MDA6MDAgMjAwMQpGcm9tOiBEYXZpZCBNYXVzIDxkbWF1c0BpY3Rzb2MuZGU+CkRhdGU6IFN1
biwgMTEgU2VwIDIwMTEgMTY6NTc6MDEgKzAyMDAKU3ViamVjdDogW1BBVENIXSBSZW1vdmUgdW5l
Y2Vzc2FyeSBsaW5rIHVuZXNjYXBlCgoqIG9yZy1scGFyc2UuZWwgKG9yZy1scGFyc2UtZm9ybWF0
LW9yZy1saW5rKTogUmVtb3ZlIHVuZWNlc3NhcnkgbGluawp1bmVzY2FwZS4KLS0tCiBjb250cmli
L2xpc3Avb3JnLWxwYXJzZS5lbCB8ICAgIDIgKy0KIDEgZmlsZXMgY2hhbmdlZCwgMSBpbnNlcnRp
b25zKCspLCAxIGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL2NvbnRyaWIvbGlzcC9vcmctbHBh
cnNlLmVsIGIvY29udHJpYi9saXNwL29yZy1scGFyc2UuZWwKaW5kZXggMzlkOTQwMy4uYTM2YjBk
NyAxMDA3NTUKLS0tIGEvY29udHJpYi9saXNwL29yZy1scGFyc2UuZWwKKysrIGIvY29udHJpYi9s
aXNwL29yZy1scGFyc2UuZWwKQEAgLTE5OSw3ICsxOTksNyBAQCBPUFQtUExJU1QgaXMgdGhlIGV4
cG9ydCBvcHRpb25zIGxpc3QuIgogCQkJIChzdHJpbmctbWF0Y2ggIl5cXC5cXC4/LyIgcGF0aCkp
KQogCQkgICAiZmlsZSIpCiAJCSAgKHQgImludGVybmFsIikpKQotICAgICAgKHNldHEgcGF0aCAo
b3JnLWV4dHJhY3QtYXR0cmlidXRlcyAob3JnLWxpbmstdW5lc2NhcGUgcGF0aCkpKQorICAgICAg
KHNldHEgcGF0aCAob3JnLWV4dHJhY3QtYXR0cmlidXRlcyBwYXRoKSkKICAgICAgIChzZXRxIGF0
dHIgKGdldC10ZXh0LXByb3BlcnR5IDAgJ29yZy1hdHRyaWJ1dGVzIHBhdGgpKQogICAgICAgKHNl
dHEgZGVzYzEgKGlmIChtYXRjaC1lbmQgNSkgKG1hdGNoLXN0cmluZyA1IGxpbmUpKQogCSAgICBk
ZXNjMiAoaWYgKG1hdGNoLWVuZCAyKSAoY29uY2F0IHR5cGUgIjoiIHBhdGgpIHBhdGgpCi0tIAox
LjcuMi41Cgo=

--Multipart_Sun_Sep_11_19:40:06_2011-1--

--pgp-sign-Multipart_Sun_Sep_11_19:40:06_2011-1
Content-Type: application/pgp-signature
Content-Transfer-Encoding: 7bit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iF4EABEIAAYFAk5s8nYACgkQma24O1pEeOad4gD+OJKT8wgdJJpvjnGWVWEetXsT
H6rbMY70j+Yw3hOZArMA/3bbxxQkIGVxTE7nuvSLXyEv264Ik4pP6WIfCBN1t8D+
=aehE
-----END PGP SIGNATURE-----

--pgp-sign-Multipart_Sun_Sep_11_19:40:06_2011-1--