* new tag query parser [1/5] -- the motivating issues
From: Christopher Genovese @ 2012-08-16 3:57 UTC
To: emacs-orgmode

My proposed changes to the tag query parser are motivated by the need
and/or desire to do the following. (The example strings work with the
new parser.)

1. Combine and modify tag queries programmatically.

   The leading case is that a function is given a tag query string and
   needs to *exclude* lines matching that query. To do this, we can
   transform query strings like so:

      "foo+bar+zap/TODO"  -->  "-(foo+bar+zap/TODO)"
      "foo|bar|zap"       -->  "-(foo|bar|zap)"

   The key is that we want to do this programmatically while still
   using the mapping or agenda search command.^* I use this a lot in
   my GTD layer for org; other combinations and transformations come
   up as well.

2. Write complex queries as simply as possible (i.e., using parens).

   Parentheses aren't always necessary, but they can make things nicer.

      "(xyz|{^a}-abc) & LEVEL > 1"
   versus
      "xyz&LEVEL>1|{^a}-abc&LEVEL>1"

3. Make *fast* heading and priority searches.

   That information is *already matched* in the current code, but
   access to it is not given (or is slow, in the case of PRIORITY).

      "LEVEL == 2 & HEADING <> {<.*>} & PRIORITY <> \"A\" "

4. Include braces in regular expression matches.

      "+{abc\\{{3,7\\}}}"              ->  regex "abc\\{3,7\\}"
      "{[A-Z]+\\S-+{{template}}.*$}"   ->  regex "[A-Z]+\\S-+{template}.*$"

   Because \ escapes are used so heavily in regexes, and because
   strings require doubling them, using additional \'s would be messy,
   ambiguous, and hard to read. Instead, exploit that we only need to
   protect {}'s by *doubling* them: {{ -> { and }} -> }. This is
   simple, readable, fast, and parity makes correctness clear at a
   glance.^**

5. Allow spaces in query strings for readability.

   Not a big deal, but easy. See the above examples.

6. Get helpful error messages at parse time when there is a problem.

^*  It is of course possible to create a matcher from the string and
    do the search directly with lower-level functions, but that ends
    up being a clunky solution.

^** The doubling strategy is also familiar from the doubling of \'s in
    quoted strings.
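
The brace-doubling rule from item 4 can be illustrated with a short sketch. This is Python written for illustration only; it is not the parser code from the patch (which is Emacs Lisp in org.el), and the names read_regex_term and escape_braces are hypothetical. It assumes that inside a {...} term the pairs {{ and }} stand for literal braces, that a single } closes the term, and that a single { is an error.

```python
from typing import Tuple


def read_regex_term(query: str, start: int = 0) -> Tuple[str, int]:
    """Read a {...} regex term that begins at query[start].

    Returns (regex, index just past the closing brace).
    Hypothetical helper: a sketch of the doubling rule, not org.el code.
    """
    if start >= len(query) or query[start] != "{":
        raise ValueError("regex term must start with '{'")
    out = []
    i = start + 1
    n = len(query)
    while i < n:
        ch = query[i]
        if ch in "{}":
            # A doubled brace stands for one literal brace.
            if i + 1 < n and query[i + 1] == ch:
                out.append(ch)
                i += 2
                continue
            # A single closing brace ends the term.
            if ch == "}":
                return "".join(out), i + 1
            # A single opening brace is not allowed inside the term.
            raise ValueError("unescaped '{' at position %d" % i)
        out.append(ch)
        i += 1
    raise ValueError("unterminated regex term")


def escape_braces(regex: str) -> str:
    """Wrap a regex as a {...} term, doubling any braces it contains."""
    return "{" + regex.replace("{", "{{").replace("}", "}}") + "}"


if __name__ == "__main__":
    # Raw strings hold the characters the parser sees after the
    # string reader has already collapsed "\\" to "\".
    regex, end = read_regex_term(r"{abc\{{3,7\}}}")
    print(regex, end)
    print(escape_braces(regex))
```

Because escaping is a plain doubling, building a term programmatically (as in item 1) only needs two string replacements, and reading it back is a single left-to-right pass with one character of lookahead.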

* Re: new tag query parser [1/5] -- the motivating issues
From: Martin Pohlack @ 2012-08-18 14:18 UTC
To: Christopher Genovese; +Cc: emacs-orgmode

Hi Christopher,

If I understand your descriptions correctly, your proposed changes are
very cool.

Could you elaborate a little bit on performance?

* Are we going to see speedups? In what cases? How much?

* If we lose performance, could you quantify that a bit with some
  examples?

A question regarding backwards compatibility (I might have missed that
in the description, sorry): Are you converting existing queries on the
fly each time, or do we have to convert our queries once? If the
latter, is there some assisting code?

Thanks,
Martin

* Re: new tag query parser [1/5] -- the motivating issues
From: Christopher Genovese @ 2012-08-18 18:10 UTC
To: Martin Pohlack; +Cc: emacs-orgmode

Hi Martin,

Assuming that org.el (with the new parser code) is byte-compiled, the
performance difference is very minor. The only difference comes in
converting the query string to a matcher form. The new parser has some
additional overhead in function calls and keeping track of state, but
in practice it is negligible. For example, in some basic benchmarks,
both parsers can convert 10,000 fairly complex query strings in a
second or two *total*. If you run the tests, you'll see that it does
over 200 cases plus comparisons and a good deal of other stuff in the
blink of an eye. So for any given agenda search or entry mapping, users
will not notice any real difference.

Regarding backward compatibility, there is no conversion necessary.
All currently valid queries produce equivalent matchers with the new
code. The new parser extends the grammar to incorporate features that
would not produce valid matchers with current code: parenthesized
expressions, spaces, and {}-escapes in regexp matches.

The only issue in this regard is that I added the name HEADING to the
list of special properties (like LEVEL, CATEGORY, PRIORITY, etc.).
This allows heading matches, which is one of my favorite features. So
existing queries with a user-defined property HEADING would match the
real heading rather than the property. This seems like a minor issue
to me, but it would need to be noted.

Regards,

Christopher

P.S. The provision above (and in the original posts) about byte
compiling the parser code (which would be in org.el) relates to
macro-expansion overhead. I use a macro that makes the new parser
function more readable and maintainable, and does much of its work at
compile time to produce faster code. In interpreted code that macro is
expanded each pass through the loop. The macro could be eliminated if
necessary, or made faster in interpreted code by various tricks (that
would add some overhead to compiled code). But since org.el is
typically byte compiled during installation, this doesn't seem to me
to be a problem. Performance is fine in practice either way, though
faster in the typical compiled case, and I think the clarity gained
from the macro is worthwhile. But definitely byte compile the new code
before testing, as I advise in the posts.

On Sat, Aug 18, 2012 at 10:18 AM, Martin Pohlack <mp26@os.inf.tu-dresden.de> wrote:

> Hi Christopher,
>
> If I understand your descriptions correctly, your proposed changes are
> very cool.
>
> Could you elaborate a little bit on performance?
>
> * Are we going to see speedups? In what cases? How much?
>
> * If we lose performance, could you quantify that a bit with some examples?
>
> A question regarding backwards compatibility (I might have missed that
> in the description, sorry): Are you converting existing queries on the
> fly each time, or do we have to convert our queries once? If yes, is
> there some assisting code?
>
> Thanks,
> Martin