From: Neil Jerram
Subject: Re: [RFC] Fixing link encoding once and for all
Date: Mon, 4 Mar 2019 23:16:07 +0000
References: <87tvguyohn.fsf@nicolasgoaziou.fr> <87sgw9cxr8.fsf@nicolasgoaziou.fr> <87lg1znh9t.fsf@nicolasgoaziou.fr>
In-Reply-To: <87lg1znh9t.fsf@nicolasgoaziou.fr>
List-Id: "General discussions about Org-mode."
To: Neil Jerram, Org Mode List

On Fri, 1 Mar 2019 at 08:14, Nicolas Goaziou wrote:
>
> Hello,
>
> Neil Jerram writes:
>
> > Do you mean Windows file names in existing Org files?  I.e. the
> > back-compatibility concern?
> >
> > If so, yes, I confess I didn't think at all about back-compatibility,
> > with my suggestion above.  So perhaps that rules my idea out.
> >
> > If we were starting from scratch, however,
> > - I believe it would technically be fine; i.e. it's a complete and
> >   unambiguous encoding
> > - it might be considered awkward for Windows users to have to write
> >   c:\\system32\\mydoc.txt instead of c:\system32\mydoc.txt, but I don't
> >   know how big a concern that would be.
>
> Thinking a bit more about it, we don't need to escape /all/ square
> brackets, only "]]" and "][" constructs. Therefore, we don't need to
> escape every backslash either.

Agreed.

> The regexp for bracket links could be, in its simple (!) form:
>
>   \[\[\(.*?[^\\]\(?:\\\)*\)\]\(?:\[\([^\000]+?\)\]\)?\]

[then a bit later]

> Small update, in its string form now:
>
>   "\\[\\[\\([^\000]*?[^\\]\\(\\\\\\\\\\)*\\)\\]\\(?:\\[\\([^\000]+?\\)\\]\\)?\\]"

Is [^\000] the only (or best) way of saying "any character, including
newlines"?  Could there be actual NUL characters in the document?

More generally I'm not sure I'm fully understanding the regex.  I
_think_ it breaks down like this:

  \[\[          # literal [[
  \(            # begin group 1
  [^\000]*?     # non-greedy any characters (0 or more)
  [^\]          # something not a backslash
  \(            # begin group 2
  \\\\          # literal \\
  \)*           # end group 2, and allow 0 or more of it
  \)            # end group 1
  \]            # literal ]
  \(            # begin group 3
  ?             # don't understand
  :\[           # literal :[
  \(            # begin group 4
  [^\000]+?     # non-greedy any characters (1 or more)
  \)            # end group 4
  \]            # literal ]
  \)?           # end group 3, and allow 0 or 1 of it
  \]            # literal ]

but there's at least a ? that I don't understand, and I'm afraid I'm
not seeing how it's useful.

> Most links would need no change. I see one notable exception:
> directories in Windows:
>
>   [[c:\system32\\]] for "c:\system32\"

But I guess it would be unusual to write a trailing backslash like that.

> Some further notes:
>
> 1. Macros already use backslashes to escape commas in arguments, so it
>    is at least consistent with this part of Org.
>
> 2. The description part of the link, like most parts of Org, does not
>    use backslash escaping. If needed, we can implement an entity for
>    a square bracket.
>
> 3. There will be some backward compatibility issues. We can add
>    a checker in Org Lint to catch most of those. For example, we could
>    look at URI where every percent is followed only by 25, 5B, and 5D.
>
> WDYT?

If you think it works, I'm happy to defer to your judgement on that!
Although I suggested the idea, I don't know Org nearly well enough to
be sure that I haven't missed problems; but I guess that you would
know that.

Best wishes,
    Neil
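
For anyone who wants to poke at the proposed regexp outside Emacs, here
is a minimal sketch in Python.  It is only a transliteration under
assumptions, not Org's implementation: the pattern is the string form
quoted above rewritten in Python `re` syntax, reading `\(?:` as Emacs's
non-capturing ("shy") group, and the names LINK_RE and
parse_bracket_link are made up for illustration.

```python
import re

# Hypothetical Python transliteration of the proposed Emacs regexp.
# Emacs \( \) become ( ), and the Emacs shy group \(?: \) becomes (?: ).
# [^\x00] stands in for [^\000]: any character except NUL, newlines included.
LINK_RE = re.compile(
    r"\[\["                    # literal [[
    r"([^\x00]*?[^\\](\\\\)*)"  # group 1: target, ending in a non-backslash
                               #   plus zero or more backslash pairs (group 2)
    r"\]"                      # literal ]
    r"(?:\[([^\x00]+?)\])?"    # optional, non-capturing: [description] (group 3)
    r"\]"                      # literal ]
)


def parse_bracket_link(text):
    """Return (raw_target, description) for the first bracket link, or None.

    The target is returned exactly as written, without any unescaping.
    """
    match = LINK_RE.search(text)
    if match is None:
        return None
    return match.group(1), match.group(3)


if __name__ == "__main__":
    print(parse_bracket_link("[[http://orgmode.org][Org]]"))
    print(parse_bracket_link(r"[[c:\system32\\]]"))
    print(parse_bracket_link(r"[[c:\system32\]]"))
```

With this pattern the doubled trailing backslash in [[c:\system32\\]]
is accepted as part of the target, while the single trailing backslash
in [[c:\system32\]] does not match at all, which is the Windows
directory case mentioned above.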