From: Carsten Dominik
Subject: Re: [RFC] Alternative to sub/superscript regexp
Date: Tue, 26 Nov 2013 10:20:20 +0100
Message-ID: <7CF22FEC-70F1-46A5-BDE0-A07832F9EDAF@gmail.com>
References: <87wqjw8kuk.fsf@gmail.com>
In-Reply-To: <87wqjw8kuk.fsf@gmail.com>
List-Id: "General discussions about Org-mode."
To: Nicolas Goaziou
Cc: Org Mode List

Hi Nicolas,

I have tested this a bit, and it does pretty much what I want.
Just to be sure: We will also support expressions with braces, right?

- Carsten

On Nov 25, 2013, at 6:14 PM, Nicolas Goaziou wrote:

> Hello,
>
> For the record `org-match-substring-regexp' is a variation on:
>
>   "\\(\\S-\\)\\([_^]\\)\\(\\(?:\\*\\|[-+]?[^-+*!@#$%^_ \t\r\n,:\"?<>~;./{}=()]+\\)\\)\\)"
>
> I think it is a bit convoluted and therefore difficult to predict.
> For example, as a recent bug report showed, you may tend to interpret
> a_b[fn:1] as
>
>   a_{b}[fn:1]
>
> but, in fact, it is equivalent to
>
>   a_{b[fn}:1]
>
> Of course, we can prevent this by forbidding "[" and "]" in the last
> part of the regexp. But I wonder if there's something better to do.
>
> The idea behind this regexp is that we should be able to write simple
> sub/superscripts, including numbers and entities, without requiring curly
> braces (see `org-use-sub-superscripts' docstring for details). Maybe
> something like the following could be an interesting alternative:
>
>   "\\(\\S-\\)\\([_^]\\)\\(\\*\\|[+-]?\\(?:\\w\\|[0-9.,\\]\\)*\\(\\w\\|[0-9]\\)\\)"
>
> That is, without braces, either an asterisk or any combination of word,
> number, dot, comma and backslash characters, which may start with either
> a plus or a minus sign but cannot end with either a dot or a comma.
>
> I find it arguably more predictable (no inverted class). Also, we "gain"
> the following:
>
>   a^3.14. <=> a^{3.14}.
>
> At the moment, a^3.14. <=> a^{3}.14.
>
> What do you think?
>
>
> Regards,
>
> --
> Nicolas Goaziou
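
The difference between the two regexps quoted above can be checked with a
rough translation into Python's `re` syntax. This is only a sketch under
stated assumptions, not the code Org uses: the names OLD_SCRIPT, NEW_SCRIPT
and script_text are made up for illustration, only the braceless branch is
translated, and Emacs' syntax-table-dependent \w is approximated by
[^\W_] (letters and digits, no underscore).

```python
import re

# Approximate Python translations of the two Emacs regexps quoted above.
# Assumption: Emacs \w (syntax-table dependent) is modeled as [^\W_].
# All names here are hypothetical and exist only for this illustration.

# Current style: optional sign, then anything outside an inverted class.
OLD_SCRIPT = re.compile(
    r'(\S)([_^])((?:\*|[-+]?[^-+*!@#$%^_ \t\r\n,:"?<>~;./{}=()]+))'
)

# Proposed style: an asterisk, or an optional sign followed by word
# characters, dots, commas and backslashes, ending in a word character.
NEW_SCRIPT = re.compile(
    r'(\S)([_^])(\*|[+-]?(?:[^\W_]|[.,\\])*[^\W_])'
)


def script_text(pattern, text):
    """Return the text taken as sub/superscript, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(3) if match else None


for sample in ("a_b[fn:1]", "a^3.14."):
    print(sample, "| old:", script_text(OLD_SCRIPT, sample),
          "| new:", script_text(NEW_SCRIPT, sample))
```

With these translations, a_b[fn:1] gives "b[fn" under the old pattern and
"b" under the new one, and a^3.14. gives "3" versus "3.14", which agrees
with the behaviour described in the quoted message.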