From: Nicolas Goaziou
Subject: Re: [BUG] External unicode links without a description in ox-html
Date: Mon, 25 Jul 2016 14:52:59 +0200
Message-ID: <87y44pvqyy.fsf@saiph.selenimh>
References: <87k2gg9xnb.fsf@systemreboot.net> <87fur46pq6.fsf@saiph.selenimh>
In-Reply-To: (Michael Brand's message of "Sat, 23 Jul 2016 17:34:09 +0200")
To: Michael Brand
Cc: org mode

Hello,

Michael Brand writes:

> There seems to be a related issue with an inconsistency between HTML
> and other export formats in using org-link-unescape for the link
> _destination_ part: With the Org file
>
> 1) https://duckduckgo.com/?q=Org+mode+%252B+Worg
> 2) https://duckduckgo.com/?q=Org+mode+%2B+Worg
>
> org-open-at-point on link 1) opens a web browser with the search field
> filled with "Org mode + Worg" as expected by me.

This looks like an error to me. If I type

  https://duckduckgo.com/?q=Org+mode+%252B+Worg

in my browser, I get "Org mode %2B Worg" as the search string. It should
be the same when opening the link from an Org document. These URIs are
/not/ equivalent.

> The same happens when using link 1) of the HTML export. But when
> exporting to PDF (via LaTeX), ODT or ASCII (browse-url-at-point)
> I have to use link 2) to get the same result. I think one should be
> able to consistently use link 1) for all export formats.

It looks as if we're trying to paper over an Org problem here, which is
the redundant link escaping that happens when calling `org-insert-link'
(C-c C-l).

AFAICT, there are two reasons for Org to escape a link: when the link
contains either "]]" or multiple consecutive spaces. The former
obviously breaks Org link syntax. The latter doesn't survive a call to
`fill-paragraph'.

Alas, Org handles it the wrong way, by using a mechanism that cannot be
properly undone; you cannot possibly know how many times the desired URI
has been encoded, if at all. Moreover, this mechanism isn't user
friendly, i.e., you cannot reasonably ask a user to encode a URI on the
fly when jotting notes.

I can see two ways out:

1. Do not escape anything. This prevents any link with a description
   from containing either "]]" or multiple spaces, but these cases are
   so uncommon we probably shouldn't bother.

2. Use a different internal escape mechanism. By providing our own
   simple escape mechanism, e.g., \]\], we can solve the issues raised
   above.

In any case, Org should not create something like

  https://duckduckgo.com/?q=Org+mode+%252B+Worg

if the real URI is

  https://duckduckgo.com/?q=Org+mode+%2B+Worg

WDYT?

Regards,

-- 
Nicolas Goaziou
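
As an illustration of the two points above (percent-encoding cannot be undone without knowing how many times it was applied, and a dedicated escape can be), here is a minimal sketch in Python. It is not Org's code: `urllib.parse.quote` only stands in for the escaping done on insertion, and `escape_link` / `unescape_link` are hypothetical names for one possible backslash scheme in the spirit of option 2. Handling of multiple spaces is left out.

```python
from urllib.parse import quote, unquote

# The URI the user actually means.
real = "https://duckduckgo.com/?q=Org+mode+%2B+Worg"

# Stand-in for escaping on insertion: "%" is not in `safe`,
# so the existing "%2B" is encoded a second time as "%252B".
stored = quote(real, safe=":/?=+")

# Decoding once gives the real URI back, but only if we happen to know
# that the stored text was encoded exactly once.
decoded_once = unquote(stored)

# Decoding a URI that was never escaped silently turns it into a
# different URI ("%2B" becomes a literal "+").
decoded_unescaped = unquote(real)


def escape_link(uri: str) -> str:
    """Hypothetical internal escape: protect "\\" first, then "]"."""
    return uri.replace("\\", "\\\\").replace("]", "\\]")


def unescape_link(text: str) -> str:
    """Inverse of escape_link: a backslash makes the next char literal."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(stored)
print(decoded_once == real)
print(decoded_unescaped == real)
print(escape_link("a]]b"))
print(escape_link(real) == real)
```

Under this sketch an ordinary URI passes through `escape_link` unchanged, so percent sequences written by the user are never touched, and `unescape_link(escape_link(x))` returns `x` for any string because the escape character is itself escaped.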