From: stardiviner
Subject: Re: [RFC] Fixing link encoding once and for all
Date: Mon, 25 Feb 2019 16:54:59 +0800
Message-ID: <877edo6ye4.fsf@gmail.com>
In-reply-to: <87tvguyohn.fsf@nicolasgoaziou.fr>
To: emacs-orgmode@gnu.org

Nicolas Goaziou writes:

> Hello,
>
> Recently[1], issues about link escaping have resurfaced. I'd like to
> solve this once and for all.
>
> As a reminder, the initial issue is that bracket links, i.e., "[[path]]"
> or "[[path][description]]", cannot contain square brackets, for obvious
> reasons. Therefore, they need to be escaped somehow. For some historical
> reason, the "somehow" settled, for the path part[2], on URL encoding.
> Therefore [ and ] in a link must appear as, respectively, "%5B" and
> "%5D". Of course, the initial link could already contain any of these
> strings, so percent signs also need to be escaped, as "%25".
> Eventually, consecutive spaces are not handled very gracefully by the
> `fill-paragraph' function, so it is also useful, but not mandatory, to
> be able to escape white spaces, with "%20". It can sadly be confusing
> when Org encoding is applied on top of an already encoded URI.
>
> To sum it up, `org-link-escape', by default, URL encodes only square
> brackets, percent signs and white spaces. Note that, however,
> `org-link-unescape' is not its reciprocal function, despite its
> docstring. It URL decodes every percent encoded combination.
>
> Anyway, square brackets in a bracket link almost look like a solved
> problem. Alas, if some links are inserted by helper functions, such as
> `org-insert-link', others could have been typed right into the buffer.
> Therefore, there is usually no way to know if a link is already
> Org-encoded or not. Consequently, there is usually no way to know when
> a link needs to be Org-decoded. This is the root of all evil, or at
> least, all bugs encountered so far. Some links end up being encoded or
> decoded one time too many.
>
> To solve this, we must assume that every bracket link is properly
> Org-encoded in a buffer. In other words, when typing, or yanking,
> a bracket link right into a buffer, users are required to use %5B, %5D,
> and %25 in the path part of the link, if necessary. I understand it will
> bite some users, but using `org-insert-link' would mitigate the pain. It
> is also limited to square brackets, which, I assume, is not the type of
> link you usually yank.
>
> With that assumption, the parser can safely Org-decode links
> appropriately, and store paths in their decoded form. Consumers, like
> export back-ends, need not call `org-link-unescape' anymore. In fact,
> the only situation where `org-link-unescape' is still needed is when
> extracting the path part of a bracket link from the buffer, e.g.,
> through regexp matching.
>
> Of course, the manual should mention this assumption, if we agree on it.
>
> Thoughts?
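
To make the encode/decode asymmetry described above concrete, here is a rough sketch in Python. It is not Org's actual Emacs Lisp code: `org_escape' and `org_unescape' are hypothetical stand-ins that only mimic the behavior as described in the quoted message, and white space is simplified to the plain space character.

```python
from urllib.parse import unquote

# Assumption: these four characters stand in for the default set described
# above (square brackets, percent signs, white space as a plain space).
_ESCAPES = {"%": "%25", "[": "%5B", "]": "%5D", " ": "%20"}


def org_escape(path: str) -> str:
    """Hypothetical stand-in: percent-encode only the characters in _ESCAPES."""
    return "".join(_ESCAPES.get(ch, ch) for ch in path)


def org_unescape(path: str) -> str:
    """Hypothetical stand-in: decode every percent-encoded sequence."""
    return unquote(path)


if __name__ == "__main__":
    raw = "file:notes [draft].org"
    encoded = org_escape(raw)
    print(encoded)                # brackets and space are percent-encoded
    print(org_unescape(encoded))  # one decode restores the raw path

    # A URI that was already percent-encoded before Org encoding is applied.
    already = "https://example.com/a%20b"
    twice = org_escape(already)   # the percent sign itself becomes %25
    print(twice)
    print(org_unescape(twice))                # decoded once: the original URI
    print(org_unescape(org_unescape(twice)))  # decoded twice: URI is altered

    # Not reciprocal: a hand-typed %2F is decoded, but never re-encoded.
    print(org_unescape("a%2Fb"), org_escape("a/b"))
```

The last two cases are the ones that matter: without knowing whether the text in the buffer is already Org-encoded, nothing tells the caller how many times to decode, which is why the proposal fixes the assumption that buffer text is always encoded exactly once.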
>
> Regards,
>

I agree and upvote this. Using `org-insert-link' as the single entry
point will help unify the behavior.

The only inconvenience of typing a link literally arises where the user
can't access `org-insert-link', for example on the web or in another
editor. But Org Mode is largely limited to Emacs already, so I don't
think this requirement adds much of a burden. Also, in the end, if other
clients want to support Org Mode, they can insert links in encoded form
and handle this properly.

WDYT?

> Footnotes:
>
> [1] E.g.,
> or .
>
> [2] There is no clear mechanism for the description part.
> `org-insert-link' will replace square brackets with curly ones. We could
> also use entities, but none of them appears as a square bracket. Anyway,
> I'll ignore this issue for the time being.

-- 
[ stardiviner ]
       I try to make every word tell the meaning what I want to express.

Blog: https://stardiviner.github.io/
IRC(freenode): stardiviner, Matrix: stardiviner
GPG: F09F650D7D674819892591401B5DF1C95AE89AC3