I am writing an application layer on top of org that uses the entry mapping API, but I needed both negation of complex selections and heading searches. Because the current tag query parser does not handle parenthesized expressions, it does not allow negating complex queries. At first, I wrote a workaround solution that mimics/specializes the mapping API, but that approach seemed inelegant and harder to maintain. So instead I implemented a full parser for tag queries with a number of useful features (see the labeled Notes at the bottom for further comments on these features): 1. Parenthesized expressions to arbitrary depth are allowed. 2. A '-' can be used to negate a parenthesized term. [Note a] 3. Regex's in {} can contain braces escaped by doubling: {{ }}. [Note b] 4. Supports fast property search on HEADING and PRIORITY. [Note c] 5. Handles hyphens in property names properly. [Note d,h] 6. Allows only the proper comparison operators, including ==. [Note e,h] 7. Allows spaces around operators and terms for readability. [Note f] 8. Matchers use the original expression order; not a big deal, but free. 9. The error messages during parsing are reasonably helpful. 10. Several bug fixes and a cleaner `org-make-tags-matcher'. [Note h] I'm submitting the code for your consideration, with the goal of eventually incorporating this into org.el. I would be happy to hear any comments or suggestions you have. As I'll describe below, this involves relatively minor changes to two existing functions and adding a few new support functions. I've attached two files org-tag-query-parse.el (the code) and tag-query-tests.el (a collection of tests built on a simple framework). I've also put the files in http://www.stat.cmu.edu/~genovese/emacs/. The comments in both files will I hope be helpful. At the risk of going on too long, I'd like to add a few comments about the code and tests. First, the two existing functions that are affected in the code are `org-make-tags-matcher' and `org-scan-tags'. In the new version of the former, I've extracted out both kinds of query parsing, leading to a shorter and cleaner function. The new version of the latter differs in only a couple *very minor* places that capture two values that were already being computed anyway (see the diff reproduced in the comments). Btw, I'm working from the 7.8.11 code. Loading org-tag-query-parse.el does not change the original functions. Instead, I've added a `-NEW' to the names of these functions and saved the originals also with a `-ORIGINAL' added. After loading the file, you can choose a version to try by doing (org-tmp-use-tag-parser 'new) and (org-tmp-use-tag-parser 'original) or do (org-tmp-use-tag-parser) to toggle between versions. You can also just use the names with suffixes directly. I'd also suggest byte-compiling the file. I think the place to start looking at the code is the new version of `org-make-tags-matcher'. The main entry function for the new parser is `org-tag-query-parse', though the real workhorse is actually the function `org-tag-query-parse-1'. There is also a new function `org-todo-query-parse' which just extracts the existing todo matching method. (I didn't do anything with that method as the manual makes it clear that it is of secondary importance.) I think the modularity here makes `org-make-tags-matcher' and each separate parser easier to read and understand. The other substantial piece (in terms of lines of code) is a utility macro `org-match-cond' that is used throughout and makes the main parser much more readable IMHO. Admittedly, I went a bit overboard in optimizing it; the first version worked fine but this one produces really nice code. I'd suggest ignoring this code (in section "Parsing utility for readable matchers") on first pass. The docstring is pretty complete, and its use is more or less self-explanatory. Most of its work is done at compile time. To run the tests, load org-tag-query-parse.el and tag-query-tests.el and do (tag-test-run :results) ; use :summary for a brief summary of all runs (tag-test-other-tests) ; miscellaneous other tests, including scanning or name individual suites. They are at the moment: (tag-test-run :results 'org-comparison-1) ; or use :summary (tag-test-run :results 'org-comparison-2) (tag-test-run :results 'match-results-1) (tag-test-run :results 'match-results-2) (tag-test-run :results 'should-error-1) If you have other ideas for tests or find any bugs, please let me know. Sorry for the homegrown framework; it just sort of grew and then I was too tired to rewrite the tests. One complication here is that the original and new algorithms produce different term orders and use a few different functions. The function tag-test-transform transforms original results to the new algorithms conventions, but it does not handle PRIORITY or HEADING matches at the moment. Use the tree form of the tess (see match-results-1 for example) on these. Btw, I've run the tests on GNU Emacs 23.2 and 24.1 (running on OS X lion). Notes: a. There is no need to introduce a new character such as ! for negation because the semantics of the - are clear and are consistent with its use for tags. A - binds more tightly than & which in turn binds more tightly than |. A + selector can also be used for positive selection of a parenthesized term but it is equivalent to using no selector, just as for tags. b. Because \'s are so heavily used in regex's and because they have to be doubled in strings, using \'s for an additional escape layer would be messy, ambiguous, and hard to read. Only the {}'s need to be escaped and the doubling escapes {{ -> { and }} -> } are simple, readable, and fast to parse. For example: "+{abc\\{{3,7\\}}}" gives the regex "abc\\{3,7\\}". Parity makes correctness clear at a glance. c. Because headline (and priority) searches can be useful and powerful, and because the information on those fields is *already processed* in `org-scan-tags', we get those special searches *essentially for free*, requiring only two minor changes to `org-scan-tags'. See the unified diff in comments. The special PRIORITY property already exists; I added the special HEADING property for these purposes. I'm open to changing the name of course, but I do think the feature is both useful and elegant. (I'm using it in my application, for instance.) d. I did not see it in the manual, but I think that property names with hyphens should have these \-escaped -'s in the query string, with the escaping slashes removed in the produced matcher. This is not currently done, but the new version does. See Note h for details. e. It seems desirable to support both = and == as equality operators since the latter is so common by habit. The new version allows this explicitly. The original version does as well, but the regex for the comparison operator also allows other operators <<, ><, >>, =>, and >= as well, which can produce bad matchers. See Note h for details. f. Currently, spaces are ignored around &, |, the implicit & between terms, around the comparison operators in property searches, and around +/- selectors. Spaces are not ignored inside {}'s for a regexp match. g. The current code also allows +/- selectors before property comparisons. I don't really like this because +PROP<>"something" and -PROP="something" have the same meaning but look very different. But the new code does support this. As a side note, there's really no need for the & characters as +/- serve the and/and-not function completely. But again, no prob. h. A few bugs detected in the 7.8.11 code: + Faulty test for todo matcher in org-make-tags-matcher (string-match "/+" match) Ex: (org-make-tags-matcher "PROP={^\\s-*// .*$}") produces an erroneous matcher: ("PROP={^\\s-*// .*$}" progn (setq org-cached-props nil) (member "PROP" tags-list)) For all practical purposes it will be enough to do: (string-match "\\(/\\(!\\)?\\s-*\\)[^{}\"]*$" match) instead of the current test in org-make-tags-matcher. This works as long as the TODO keywords do not contain a right brace or quotation marks. (In most other cases, the new parser should give an error message at parse time.) A technicality: this is /not/ a complete solution because arbitrary strings can be TODO keywords. For instance, both PROP={/!} and PROP="/!{/!}" are valid TODO keywords (it works!) *and* valid property comparisons. So, a pattern alone is insufficient. We want to find the first slash that is not enclosed in {}'s or ""'s; if found, a todo match is needed. The function `org-find-todo-query' does this and (org-find-todo-query match) can be plugged in directly replacing the above (string-match ...) in then new `org-make-tags-matcher'. But because the todo parsing uses {}'s for regex matches, TODO keywords with {}'s are ignored anyway. So there's no need to go beyond the fixed string-match above. The function `org-todo-query-parse', which handles todo parsing in the new version, makes this explicit. + Property names with -'s are not handled properly (cf. Note d) Specifically, the escapes are not removed. Example: (org-make-tags-matcher "PROP\\-WITH\\-HYPHENS=2") produces ("PROP\\-WITH\\-HYPHENS=2" and (progn (setq org-cached-props nil) (= (string-to-number (or (org-cached-entry-get nil "PROP\\-WITH\\-HYPHENS") "")) 2)) t) The original code /does/ instead remove -'s from tag names, which shouldn't have them anyway. I suspect that this was intended for property names rather than tag names. The new version fixes up property names but does not allow -'s in tags. + Incorrect comparison operators allowed (cf. Note e) The regular expression used is "[<=>]\\{1,2\\}" is used to detect the comparison operators. But this can produce bad matchers that fail opaquely at match time rather than giving an appropriate error message at parse time. Ex: (org-make-tags-matcher "P<<2") produces ("P<<2" and (progn (setq org-cached-props nil) (nil (string-to-number (or (org-cached-entry-get nil "P") "")) 2)) t) This is fixed in the new version and delivers an error message at parse time. + missing org-re (line 7179 in org.el) with posix classes Minor consistency issue. This line does not occur in the new code. Thanks and regards, Christopher Genovese