From mboxrd@z Thu Jan  1 00:00:00 1970
From: Christopher Genovese <genovese@cmu.edu>
Subject: Re: new tag query parser [1/5] -- the motivating issues
Date: Sat, 18 Aug 2012 14:10:33 -0400
Message-ID: <CAPum5FgXdxW3=w33wSC7bdL7E4baSx2warPV6SjjHq6Wx767Lg@mail.gmail.com>
References: <CAPum5Fg053gQzGgsVjROg1exAbLeYQq5Qw7DjP=-MLiRyYzgBA@mail.gmail.com>
	<502FA424.4010004@os.inf.tu-dresden.de>
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7b10ccd58b563304c78e32df
Return-path: <emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org>
Received: from eggs.gnu.org ([208.118.235.92]:33577)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <genovese.cr@gmail.com>) id 1T2nU9-0006ms-IK
	for emacs-orgmode@gnu.org; Sat, 18 Aug 2012 14:10:58 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <genovese.cr@gmail.com>) id 1T2nU8-0000Fb-21
	for emacs-orgmode@gnu.org; Sat, 18 Aug 2012 14:10:57 -0400
Received: from mail-pb0-f41.google.com ([209.85.160.41]:35089)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <genovese.cr@gmail.com>) id 1T2nU7-0000Et-Nu
	for emacs-orgmode@gnu.org; Sat, 18 Aug 2012 14:10:55 -0400
Received: by pbbro12 with SMTP id ro12so5535752pbb.0
	for <emacs-orgmode@gnu.org>; Sat, 18 Aug 2012 11:10:54 -0700 (PDT)
In-Reply-To: <502FA424.4010004@os.inf.tu-dresden.de>
List-Id: "General discussions about Org-mode." <emacs-orgmode.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-orgmode>,
	<mailto:emacs-orgmode-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-orgmode>
List-Post: <mailto:emacs-orgmode@gnu.org>
List-Help: <mailto:emacs-orgmode-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-orgmode>,
	<mailto:emacs-orgmode-request@gnu.org?subject=subscribe>
Errors-To: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
Sender: emacs-orgmode-bounces+geo-emacs-orgmode=m.gmane.org@gnu.org
To: Martin Pohlack <mp26@os.inf.tu-dresden.de>
Cc: emacs-orgmode@gnu.org

--047d7b10ccd58b563304c78e32df
Content-Type: text/plain; charset=ISO-8859-1

Hi Martin,

   Assuming that org.el (with the new parser code) is byte-compiled, the
performance
difference is very minor. The only difference comes in converting the query
string
to a matcher form. The new parser has some additional overhead in function
calls and
keeping track of state, but in practice it is negligible.

   For example, in some basic benchmarks, both parsers can convert 10,000
fairly
complex query strings in a second or two *total*. If you run the tests,
you'll see that it does
over 200 cases plus comparisons and a good deal of other stuff in a blink
of an eye.

   So for any given agenda search or entry mapping, users will not notice
any real difference.

   Regarding backward compatibility, there is no conversion necessary. All
currently
valid queries produce equivalent matchers with the new code. The new parser
extends
the grammar to incorporate features that would not produce valid matchers
with current
code: parenthesized expressions, spaces, and {}-escapes in regexp matches.

    The only issue in this regard is that I added the name HEADING
to the list of special properties (like LEVEL, CATEGORY, PRIORITY, etc.).
This allows heading matches, which is one of my favorite features. So
existing
queries with a user-defined property HEADING would match the real heading
rather
than the property. This seems like a minor issue to me, but it would need
to be noted.

     Regards,

      Christopher

P.S. The provision above (and in the original posts) about byte compiling
the
parser code (which would be in org.el) relates to macro-expansion overhead.
I use a macro that makes the new parser function more readable and
maintainable,
and does much of its work at compile time to produce faster code.
In interpreted code that macro is expanded each pass through the loop.
The macro could be eliminated if necessary, or made faster in interpreted
code by
various tricks (that would add some overhead to compiled code).
But since org.el is typically  byte compiled during installation, this
doesn't seem
to me to be a problem. Performance is fine in practice either way, though
faster in
the typical compiled case, and I think the clarity gained from the macro is
worthwhile.

But definitely byte compile the new code before testing, as I advise in the
posts.

On Sat, Aug 18, 2012 at 10:18 AM, Martin Pohlack
<mp26@os.inf.tu-dresden.de>wrote:

> Hi Christopher,
>
> If I understand your descriptions correctly, your proposed changes are
> very cool.
>
> Could you elaborate a little bit on performance?
>
> * Are we going to see speedups?  In what cases?  How much?
>
> * If we lose performance, could you quantify that a bit with some examples?
>
> A question regarding backwards compatibility (I might have missed that
> in the description, sorry):  Are you converting existing queries on the
> fly each time, or do we have to convert our queries once?  If yes, is
> there some assisting code?
>
> Thanks,
> Martin
>
>

--047d7b10ccd58b563304c78e32df
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hi Martin,<br><br>=A0=A0 Assuming that org.el (with the new parser code) is=
 byte-compiled, the performance<br>difference is very minor. The only diffe=
rence comes in converting the query string<br>to a matcher form. The new pa=
rser has some additional overhead in function calls and<br>

keeping track of state, but in practice it is negligible.<br><br>=A0=A0 For=
 example, in some basic benchmarks, both parsers can convert 10,000 fairly<=
br>complex query strings in a second or two *total*. If you run the tests, =
you&#39;ll see that it does<br>

over 200 cases plus comparisons and a good deal of other stuff in a blink o=
f an eye.<br><br>=A0=A0 So for any given agenda search or entry mapping, us=
ers will not notice any real difference.<br><br>=A0=A0 Regarding backward c=
ompatibility, there is no conversion necessary. All currently<br>

valid queries produce equivalent matchers with the new code. The new parser=
 extends<br>the grammar to incorporate features that would not produce vali=
d matchers with current<br>code: parenthesized expressions, spaces, and {}-=
escapes in regexp matches.<br>

<br>=A0=A0=A0 The only issue in this regard is that I added the name HEADIN=
G<br>to the list of special properties (like LEVEL, CATEGORY, PRIORITY, etc=
.).<br>This allows heading matches, which is one of my favorite features. S=
o existing<br>

queries with a user-defined property HEADING would match the real heading r=
ather<br>than the property. This seems like a minor issue to me, but it wou=
ld need to be noted.<br><br>=A0=A0=A0=A0 Regards,<br><br>=A0=A0=A0=A0=A0 Ch=
ristopher<br>
<br>
P.S. The provision above (and in the original posts) about byte compiling t=
he<br>parser code (which would be in org.el) relates to macro-expansion ove=
rhead.<br>I use a macro that makes the new parser function more readable an=
d maintainable,<br>

and does much of its work at compile time to produce faster code. <br>In in=
terpreted code that macro is expanded each pass through the loop.<br>The ma=
cro could be eliminated if necessary, or made faster in interpreted code by=
<br>

various tricks (that would add some overhead to compiled code). <br>But sin=
ce org.el is typically=A0 byte compiled during installation, this doesn&#39=
;t seem<br>to me to be a problem. Performance is fine in practice either wa=
y, though faster in<br>

the typical compiled case, and I think the clarity gained from the macro is=
 worthwhile.<br><br>But definitely byte compile the new code before testing=
, as I advise in the posts.<br><br><div class=3D"gmail_quote">On Sat, Aug 1=
8, 2012 at 10:18 AM, Martin Pohlack <span dir=3D"ltr">&lt;<a href=3D"mailto=
:mp26@os.inf.tu-dresden.de" target=3D"_blank">mp26@os.inf.tu-dresden.de</a>=
&gt;</span> wrote:<br>

<blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1p=
x #ccc solid;padding-left:1ex">Hi Christopher,<br>
<br>
If I understand your descriptions correctly, your proposed changes are<br>
very cool.<br>
<br>
Could you elaborate a little bit on performance?<br>
<br>
* Are we going to see speedups? =A0In what cases? =A0How much?<br>
<br>
* If we lose performance, could you quantify that a bit with some examples?=
<br>
<br>
A question regarding backwards compatibility (I might have missed that<br>
in the description, sorry): =A0Are you converting existing queries on the<b=
r>
fly each time, or do we have to convert our queries once? =A0If yes, is<br>
there some assisting code?<br>
<br>
Thanks,<br>
Martin<br>
<br>
</blockquote></div><br>

--047d7b10ccd58b563304c78e32df--