From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christopher Genovese Subject: Re: new tag query parser [1/5] -- the motivating issues Date: Sat, 18 Aug 2012 14:10:33 -0400 Message-ID: References: <502FA424.4010004@os.inf.tu-dresden.de> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=047d7b10ccd58b563304c78e32df Return-path: Received: from eggs.gnu.org ([208.118.235.92]:33577) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1T2nU9-0006ms-IK for emacs-orgmode@gnu.org; Sat, 18 Aug 2012 14:10:58 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1T2nU8-0000Fb-21 for emacs-orgmode@gnu.org; Sat, 18 Aug 2012 14:10:57 -0400 Received: from mail-pb0-f41.google.com ([209.85.160.41]:35089) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1T2nU7-0000Et-Nu for emacs-orgmode@gnu.org; Sat, 18 Aug 2012 14:10:55 -0400 Received: by pbbro12 with SMTP id ro12so5535752pbb.0 for ; Sat, 18 Aug 2012 11:10:54 -0700 (PDT) In-Reply-To: <502FA424.4010004@os.inf.tu-dresden.de> List-Id: "General discussions about Org-mode." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org To: Martin Pohlack Cc: emacs-orgmode@gnu.org --047d7b10ccd58b563304c78e32df Content-Type: text/plain; charset=ISO-8859-1 Hi Martin, Assuming that org.el (with the new parser code) is byte-compiled, the performance difference is very minor. The only difference comes in converting the query string to a matcher form. The new parser has some additional overhead in function calls and keeping track of state, but in practice it is negligible. For example, in some basic benchmarks, both parsers can convert 10,000 fairly complex query strings in a second or two *total*. If you run the tests, you'll see that it does over 200 cases plus comparisons and a good deal of other stuff in a blink of an eye. So for any given agenda search or entry mapping, users will not notice any real difference. Regarding backward compatibility, there is no conversion necessary. All currently valid queries produce equivalent matchers with the new code. The new parser extends the grammar to incorporate features that would not produce valid matchers with current code: parenthesized expressions, spaces, and {}-escapes in regexp matches. The only issue in this regard is that I added the name HEADING to the list of special properties (like LEVEL, CATEGORY, PRIORITY, etc.). This allows heading matches, which is one of my favorite features. So existing queries with a user-defined property HEADING would match the real heading rather than the property. This seems like a minor issue to me, but it would need to be noted. Regards, Christopher P.S. The provision above (and in the original posts) about byte compiling the parser code (which would be in org.el) relates to macro-expansion overhead. I use a macro that makes the new parser function more readable and maintainable, and does much of its work at compile time to produce faster code. In interpreted code that macro is expanded each pass through the loop. The macro could be eliminated if necessary, or made faster in interpreted code by various tricks (that would add some overhead to compiled code). But since org.el is typically byte compiled during installation, this doesn't seem to me to be a problem. Performance is fine in practice either way, though faster in the typical compiled case, and I think the clarity gained from the macro is worthwhile. But definitely byte compile the new code before testing, as I advise in the posts. On Sat, Aug 18, 2012 at 10:18 AM, Martin Pohlack wrote: > Hi Christopher, > > If I understand your descriptions correctly, your proposed changes are > very cool. > > Could you elaborate a little bit on performance? > > * Are we going to see speedups? In what cases? How much? > > * If we lose performance, could you quantify that a bit with some examples? > > A question regarding backwards compatibility (I might have missed that > in the description, sorry): Are you converting existing queries on the > fly each time, or do we have to convert our queries once? If yes, is > there some assisting code? > > Thanks, > Martin > > --047d7b10ccd58b563304c78e32df Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi Martin,

=A0=A0 Assuming that org.el (with the new parser code) is= byte-compiled, the performance
difference is very minor. The only diffe= rence comes in converting the query string
to a matcher form. The new pa= rser has some additional overhead in function calls and
keeping track of state, but in practice it is negligible.

=A0=A0 For= example, in some basic benchmarks, both parsers can convert 10,000 fairly<= br>complex query strings in a second or two *total*. If you run the tests, = you'll see that it does
over 200 cases plus comparisons and a good deal of other stuff in a blink o= f an eye.

=A0=A0 So for any given agenda search or entry mapping, us= ers will not notice any real difference.

=A0=A0 Regarding backward c= ompatibility, there is no conversion necessary. All currently
valid queries produce equivalent matchers with the new code. The new parser= extends
the grammar to incorporate features that would not produce vali= d matchers with current
code: parenthesized expressions, spaces, and {}-= escapes in regexp matches.

=A0=A0=A0 The only issue in this regard is that I added the name HEADIN= G
to the list of special properties (like LEVEL, CATEGORY, PRIORITY, etc= .).
This allows heading matches, which is one of my favorite features. S= o existing
queries with a user-defined property HEADING would match the real heading r= ather
than the property. This seems like a minor issue to me, but it wou= ld need to be noted.

=A0=A0=A0=A0 Regards,

=A0=A0=A0=A0=A0 Ch= ristopher

P.S. The provision above (and in the original posts) about byte compiling t= he
parser code (which would be in org.el) relates to macro-expansion ove= rhead.
I use a macro that makes the new parser function more readable an= d maintainable,
and does much of its work at compile time to produce faster code.
In in= terpreted code that macro is expanded each pass through the loop.
The ma= cro could be eliminated if necessary, or made faster in interpreted code by=
various tricks (that would add some overhead to compiled code).
But sin= ce org.el is typically=A0 byte compiled during installation, this doesn'= ;t seem
to me to be a problem. Performance is fine in practice either wa= y, though faster in
the typical compiled case, and I think the clarity gained from the macro is= worthwhile.

But definitely byte compile the new code before testing= , as I advise in the posts.

On Sat, Aug 1= 8, 2012 at 10:18 AM, Martin Pohlack <mp26@os.inf.tu-dresden.de= > wrote:
Hi Christopher,

If I understand your descriptions correctly, your proposed changes are
very cool.

Could you elaborate a little bit on performance?

* Are we going to see speedups? =A0In what cases? =A0How much?

* If we lose performance, could you quantify that a bit with some examples?=

A question regarding backwards compatibility (I might have missed that
in the description, sorry): =A0Are you converting existing queries on the fly each time, or do we have to convert our queries once? =A0If yes, is
there some assisting code?

Thanks,
Martin


--047d7b10ccd58b563304c78e32df--