From mboxrd@z Thu Jan 1 00:00:00 1970
From: Carsten Dominik
Subject: Re: Question aboug Regexp
Date: Tue, 23 May 2006 05:58:17 +0200
Message-ID: <7f55ba6da92bc0c958dea27173f9e47b@science.uva.nl>
References: <87lkstxx24.fsf@tolchz.net>
In-Reply-To: <87lkstxx24.fsf@tolchz.net>
Mime-Version: 1.0 (Apple Message framework v623)
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: "General discussions about Org-mode."
To: Todd Neal
Cc: emacs-orgmode@gnu.org

On May 23, 2006, at 5:02, Todd Neal wrote:

>
> I am looking at why the following link does not work:
>
> [[elisp: (+ 1 2 3)]]
>
> I think that the problem lies with this regexp:
>
> 1 (defconst org-link-re-with-space2

The regexp org-link-re-with-space2 requires that the first character
after elisp: is not a space character.  This was originally to make
sure that the following would not be matched as a link:

    I can explain you a feature in elisp: Parenthesis are everything.

This is not documented properly, thanks for reporting this.
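
As a rough illustration of the first-character rule described above,
here is a minimal sketch in Emacs Lisp.  It is not the actual
definition of org-link-re-with-space2: the literal "elisp:" prefix is
a hypothetical stand-in for the part of the regexp that matches the
link type, and the local variable names are made up for this example.
Only the three trailing pieces and the value of org-non-link-chars are
taken from the code quoted in this thread.

```elisp
;; Sketch only: "elisp:" is an assumed stand-in for the link-type part
;; of the real regexp; `non-link-chars' and `re' are illustrative names.
(let* ((non-link-chars "]\t\n\r<>")       ; value of org-non-link-chars as quoted in the thread
       (re (concat "elisp:"
                   "\\([^" non-link-chars " ]"    ; first char: no space, no non-link char
                   "[^]\t\n\r]*"                  ; middle: anything but ], tab, newline, CR
                   "[^" non-link-chars " ]\\)>?"))) ; last char: same rule as the first
  (list
   ;; No space after the colon: the regexp matches at position 0.
   (string-match re "elisp:(+ 1 2 3)")
   ;; Space after the colon: the first character set rejects it.
   (string-match re "elisp: (+ 1 2 3)")))
;; => (0 nil)
```

Evaluating the form (for example with C-x C-e) should give (0 nil):
the link text is accepted only when no space follows the colon, which
is why the link in the question is not recognized.
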

In a regexp character class, the first character of the class is
special and can be used to include characters in the class which are
otherwise difficult to get into a class, for example the minus "-" or a
square bracket.  Since an empty character class [] or [^] would be
meaningless, this is treated as a special case, so that []] matches the
closing bracket and [^]] matches everything besides the closing
bracket.

- Carsten

From the Emacs manual, node "Regexps".  I have marked the important
parts with "!" in the first column:

`[ ... ]'
     is a "character set", which begins with `[' and is terminated by
     `]'.  In the simplest case, the characters between the two
     brackets are what this set can match.

     Thus, `[ad]' matches either one `a' or one `d', and `[ad]*'
     matches any string composed of just `a's and `d's (including the
     empty string), from which it follows that `c[ad]*r' matches `cr',
     `car', `cdr', `caddaar', etc.

     You can also include character ranges in a character set, by
     writing the starting and ending characters with a `-' between
     them.  Thus, `[a-z]' matches any lower-case ASCII letter.  Ranges
     may be intermixed freely with individual characters, as in
     `[a-z$%.]', which matches any lower-case ASCII letter or `$', `%'
     or period.

     Note that the usual regexp special characters are not special
     inside a character set.  A completely different set of special
     characters exists inside character sets: `]', `-' and `^'.

!    To include a `]' in a character set, you must make it the first
!    character.  For example, `[]a]' matches `]' or `a'.  To include a
!    `-', write `-' as the first or last character of the set, or put
!    it after a range.  Thus, `[]-]' matches both `]' and `-'.

     To include `^' in a set, put it anywhere but at the beginning of
     the set.  (At the beginning, it complements the set--see below.)

     When you use a range in case-insensitive search, you should write
     both ends of the range in upper case, or both in lower case, or
     both should be non-letters.
     The behavior of a mixed-case range such as `A-z' is somewhat
     ill-defined, and it may change in future Emacs versions.

`[^ ... ]'
     `[^' begins a "complemented character set", which matches any
     character except the ones specified.  Thus, `[^a-z0-9A-Z]'
     matches all characters _except_ ASCII letters and digits.

!    `^' is not special in a character set unless it is the first
!    character.  The character following the `^' is treated as if it
!    were first (in other words, `-' and `]' are not special there).

     A complemented character set can match a newline, unless newline
     is mentioned as one of the characters not to match.  This is in
     contrast to the handling of regexps in programs such as `grep'.

> 2 (concat
> 3 "
> 4 "\\([^" org-non-link-chars " ]"
> 5 "[^]\t\n\r]*"
> 6 "[^" org-non-link-chars " ]\\)>?")
> 7 "Matches a link with spaces, optional angular brackets
> around it.")
>
>
> I am more used to PCRE so I may be incorrect, but is the "[^]" a typo?
>
>
> Also we have the following definition:
>
> (defconst org-non-link-chars "]\t\n\r<>")
>
>
> Doesn't this make line 4 evaluate to:
>
> "\\([^]\t\n\r<> ]"
>
> or is the right-bracket escaped somehow?
>
>
> Todd
>
>
>
> _______________________________________________
> Emacs-orgmode mailing list
> Emacs-orgmode@gnu.org
> http://lists.gnu.org/mailman/listinfo/emacs-orgmode
>
>

--
Carsten Dominik
Sterrenkundig Instituut "Anton Pannekoek"
Universiteit van Amsterdam
Kruislaan 403
NL-1098SJ Amsterdam
phone: +31 20 525 7477
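
The question about line 4 and the character-set rules quoted from the
manual can be checked with a short Emacs Lisp sketch.  This is only an
illustration under stated assumptions: the local names
`non-link-chars' and `line4' are made up for the example, and the
string value is the one given for org-non-link-chars in the quoted
message.

```elisp
;; Sketch only: `non-link-chars' and `line4' are illustrative names.
(let* ((non-link-chars "]\t\n\r<>")   ; value of org-non-link-chars as quoted
       (line4 (concat "\\([^" non-link-chars " ]")))
  (list
   ;; Line 4 does expand to the string suggested in the question.
   (string= line4 "\\([^]\t\n\r<> ]")
   ;; A leading `]' is a literal member of the set.
   (string-match "[]a]" "]")
   ;; Directly after `^', `]' is literal too: the set excludes `]'.
   (string-match "[^]]" "]")
   (string-match "[^]]" "a]")
   ;; `[]-]' contains both `]' and `-'.
   (string-match "[]-]" "-")))
;; => (t 0 nil 0 0)
```

So the right bracket is not escaped.  It is taken literally because it
is the first character after the `^', and the set is closed only by
the second `]'.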