* Question about Regexp
From: Todd Neal @ 2006-05-23 3:02 UTC
To: emacs-orgmode
I am looking at why the following link does not work:
[[elisp: (+ 1 2 3)]]
I think that the problem lies with this regexp:
1 (defconst org-link-re-with-space2
2 (concat
3 "<?\\(" (mapconcat 'identity org-link-types "\\|") "\\):"
4 "\\([^" org-non-link-chars " ]"
5 "[^]\t\n\r]*"
6 "[^" org-non-link-chars " ]\\)>?")
7 "Matches a link with spaces, optional angular brackets around it.")
I am more used to PCRE so I may be incorrect, but is the "[^]" a typo?
Also we have the following definition:
(defconst org-non-link-chars "]\t\n\r<>")
Doesn't this make line 4 evaluate to:
"\\([^]\t\n\r<> ]"
or is the right-bracket escaped somehow?
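One way to check the expansion asked about above is to evaluate the `concat` for line 4 directly. This is only a minimal sketch: the `my-` names are hypothetical, and the value of the non-link characters is copied from the `defconst` quoted in this message.

```elisp
;; Value copied from the `org-non-link-chars' definition quoted above.
(defconst my-non-link-chars "]\t\n\r<>")

;; Hypothetical name for the piece of the regexp built on line 4.
(defconst my-line-4 (concat "\\([^" my-non-link-chars " ]"))

;; Compare the concatenation with the string written out by hand.
(string= my-line-4 "\\([^]\t\n\r<> ]")   ; => t
```

So line 4 does expand to the string shown; no backslash is added in front of the right bracket.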
Todd
* Re: Question about Regexp
From: Carsten Dominik @ 2006-05-23 3:58 UTC
To: Todd Neal; +Cc: emacs-orgmode
On May 23, 2006, at 5:02, Todd Neal wrote:
>
>
> I am looking at why the following link does not work:
>
> [[elisp: (+ 1 2 3)]]
>
> I think that the problem lies with this regexp:
>
> 1 (defconst org-link-re-with-space2
The regexp org-link-re-with-space2 requires that the first character
after elisp: is not a space character. This was originally to make
sure that the following would not be matched as a link:
I can explain you a feature in elisp: Parenthesis are everything.
This is not documented properly; thanks for reporting it.
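The restriction on the first character can be tried out with a small sketch. The link-type list below is a shortened, hypothetical stand-in for `org-link-types`, and all `my-` names are made up for illustration; the regexp body follows the definition quoted in this thread.

```elisp
;; Hypothetical, shortened stand-in for `org-link-types'.
(defconst my-link-types '("elisp" "http"))

;; Copied from the `org-non-link-chars' definition quoted in this thread.
(defconst my-non-link-chars "]\t\n\r<>")

;; Same construction as `org-link-re-with-space2', under the names above.
(defconst my-link-re
  (concat
   "<?\\(" (mapconcat #'identity my-link-types "\\|") "\\):"
   "\\([^" my-non-link-chars " ]"
   "[^]\t\n\r]*"
   "[^" my-non-link-chars " ]\\)>?"))

(defun my-link-path (string)
  "Return the text after the link type if STRING contains a link, else nil."
  (when (string-match my-link-re string)
    (match-string 2 string)))

(my-link-path "[[elisp:(+ 1 2 3)]]")    ; => "(+ 1 2 3)"
(my-link-path "[[elisp: (+ 1 2 3)]]")   ; => nil, a space follows the colon
(my-link-path "a feature in elisp: Parenthesis are everything.")  ; => nil
```

With these assumptions, only the form without a space after `elisp:` is recognized, which is consistent with the behavior reported at the top of the thread.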
In a regexp character class, the first position is special: it can be
used to include characters in the class that are otherwise difficult to
get into one, for example the minus "-" or a closing square bracket.
Since an empty class [] or [^] would be meaningless, a "]" in that
position is taken literally, so []] matches the closing bracket and
[^]] matches everything except the closing bracket.
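As a small illustration of the special first position, the following sketch runs `string-match` on a few throwaway strings. The function name `my-class-positions` is hypothetical; the third pattern is the middle part of the link regexp quoted in this thread.

```elisp
(defun my-class-positions ()
  "Return match positions showing how a leading `]' behaves in a class."
  (list
   ;; "[]]" is a class containing only the closing bracket.
   (string-match "[]]" "ab]c")
   ;; "[^]]" matches the first character that is not a closing bracket.
   (string-match "[^]]" "]]x")
   ;; The middle part of the link regexp stops at a closing bracket.
   (progn (string-match "[^]\t\n\r]*" "(+ 1 2 3)]]")
          (match-end 0))))

(my-class-positions)   ; => (2 2 9)
```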
- Carsten
From the Emacs manual, node "Regexps". I have marked the important
parts with "!" in the first column:
`[ ... ]'
is a "character set", which begins with `[' and is terminated by
`]'. In the simplest case, the characters between the two
brackets are what this set can match.
Thus, `[ad]' matches either one `a' or one `d', and `[ad]*'
matches any string composed of just `a's and `d's (including the
empty string), from which it follows that `c[ad]*r' matches `cr',
`car', `cdr', `caddaar', etc.
You can also include character ranges in a character set, by
writing the starting and ending characters with a `-' between
them. Thus, `[a-z]' matches any lower-case ASCII letter. Ranges
may be intermixed freely with individual characters, as in
`[a-z$%.]', which matches any lower-case ASCII letter or `$', `%'
or period.
Note that the usual regexp special characters are not special
inside a character set. A completely different set of special
characters exists inside character sets: `]', `-' and `^'.
! To include a `]' in a character set, you must make it the first
! character. For example, `[]a]' matches `]' or `a'. To include a
! `-', write `-' as the first or last character of the set, or put
! it after a range. Thus, `[]-]' matches both `]' and `-'.
To include `^' in a set, put it anywhere but at the beginning of
the set. (At the beginning, it complements the set--see below.)
When you use a range in case-insensitive search, you should write
both ends of the range in upper case, or both in lower case, or
both should be non-letters. The behavior of a mixed-case range
such as `A-z' is somewhat ill-defined, and it may change in future
Emacs versions.
`[^ ... ]'
`[^' begins a "complemented character set", which matches any
character except the ones specified. Thus, `[^a-z0-9A-Z]' matches
all characters _except_ ASCII letters and digits.
! `^' is not special in a character set unless it is the first
! character. The character following the `^' is treated as if it
! were first (in other words, `-' and `]' are not special there).
A complemented character set can match a newline, unless newline is
mentioned as one of the characters not to match. This is in
contrast to the handling of regexps in programs such as `grep'.
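The marked rules from the manual excerpt can be exercised with a few more cases. This is only a sketch; `my-set-matches-p` is a hypothetical helper that tests whether a one-character string matches a given character set.

```elisp
(defun my-set-matches-p (set char)
  "Return t if the character set regexp SET matches the whole string CHAR."
  ;; Anchor SET at both ends so that exactly one character is tested.
  (and (string-match-p (concat "\\`" set "\\'") char) t))

(my-set-matches-p "[]a]" "]")    ; => t, a bracket placed first is literal
(my-set-matches-p "[]a]" "a")    ; => t
(my-set-matches-p "[]-]" "-")    ; => t, a minus placed last is literal
(my-set-matches-p "[a^]" "^")    ; => t, a caret not at the start is literal
(my-set-matches-p "[^a]" "\n")   ; => t, a complemented set can match a newline
(my-set-matches-p "[^]a]" "]")   ; => nil, the bracket right after the caret is literal
```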
> 2 (concat
> 3 "<?\\(" (mapconcat 'identity org-link-types "\\|") "\\):"
> 4 "\\([^" org-non-link-chars " ]"
> 5 "[^]\t\n\r]*"
> 6 "[^" org-non-link-chars " ]\\)>?")
> 7 "Matches a link with spaces, optional angular brackets
> around it.")
>
>
> I am more used to PCRE so I may be incorrect, but is the "[^]" a typo?
>
>
> Also we have the following definition:
>
> (defconst org-non-link-chars "]\t\n\r<>")
>
>
> Doesn't this make line 4 evaluate to:
>
> "\\([^]\t\n\r<> ]"
>
> or is the right-bracket escaped somehow?
>
>
> Todd
--
Carsten Dominik
Sterrenkundig Instituut "Anton Pannekoek"
Universiteit van Amsterdam
Kruislaan 403
NL-1098SJ Amsterdam
phone: +31 20 525 7477
* Re: Question about Regexp
From: Todd Neal @ 2006-05-23 4:10 UTC
To: emacs-orgmode
Carsten Dominik <dominik@science.uva.nl> writes:
> On May 23, 2006, at 5:02, Todd Neal wrote:
>
>>
>>
>> I am looking at why the following link does not work:
>>
>> [[elisp: (+ 1 2 3)]]
>>
>> I think that the problem lies with this regexp:
>>
>> 1 (defconst org-link-re-with-space2
>
>
> The regexp org-link-re-with-space2 requires that the first character
> after elisp: is not a space character. This was originally to make
> sure that the following would not be matched as a link:
>
> I can explain you a feature in elisp: Parenthesis are everything.
>
> This is not documented properly; thanks for reporting it.
>
> In a regexp character class, the first position is special: it can be
> used to include characters in the class that are otherwise difficult to
> get into one, for example the minus "-" or a closing square bracket.
> Since an empty class [] or [^] would be meaningless, a "]" in that
> position is taken literally, so []] matches the closing bracket and
> [^]] matches everything except the closing bracket.
>
> - Carsten
Thanks, I should have checked the Info manual a bit more carefully.
Todd