emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: Christopher Genovese <genovese@cmu.edu>
To: Samuel Wales <samologist@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: full parser implementation for tag queries (parentheses, fast heading match, and more)
Date: Thu, 16 Aug 2012 13:47:54 -0400	[thread overview]
Message-ID: <CAPum5FhU+xUeYYapVhhr4E20oaZBTcEB3386tvKSct7FivFNrg@mail.gmail.com> (raw)
In-Reply-To: <CAJcAo8u9ZxUtfiXeWqaUACd_6UTiGo+fL_f-h4KjAko1p8Hq5A@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 10508 bytes --]

Hi Samuel,

  Thanks for your note.  Just FYI, the earlier parser code in this thread
has
been superseded by the code in the post "new tag query parser [3/5]".

   The sexp input is a nice idea, and it would be very easy I think.
The function org-make-tags-matcher now takes a query string
but could easily be modified to operate on a form as well.

    I've included some code below that basically does the job. It defines a
function
`mtrans' that transforms a sexp representation into a matcher.  It can
certainly be
better optimized (and fully tested), but I think it would do just what you
want
if inserted in org-make-tags-matcher. (Note: The car of the matcher is the
query string,
for reasons that aren't entirely clear. Because this is dropped anyway in
practice,
I don't bother making it accurate in this code. As such, I'll just give the
cdr in the examples below.)

    A few examples follow to give the idea and show the results. It seems
to handle
all the cases nicely. In the sexp representation, strings stand for exact
string matches
and  both [<string>] and (re <string>) stand for regex matches, with
symbols
for properties and standard boolean and comparison ops in the form.
The keyword :todo-only acts like /! in the query strings, and = and <> also
allow arbitrary lisp code for the property comparison with equal (as long
as the
form does not start with or, and, not, or re but then it can be shielded
with (identity ...)).

    I then append the code, which is also attached. I see no problems
with adding this to org-make-tags-matcher and would be interested in other
opinions.

    Best,

      Christopher

;;; Examples

(mtrans "foo")     ; corresponds to query string  "foo"
  => (member "foo" tags-list)

(mtrans ["^f"])   ; or (mtrans '(re "^f"))  corresponds to query string
"{^f}"
  =>  (progn
        (setq org-cached-props nil)
        (org-match-any-p "^f" tags-list))

(mtrans '(and "foo" (not "bar") (re "^u.*v")))  ; query string
"foo-bar+{^u.*v}
  => (progn
       (setq org-cached-props nil)
       (and
        (member "foo" tags-list)
        (not (member "bar" tags-list))
        (org-match-any-p "^u.*v" tags-list)))

(mtrans '(or (and "xyz" (= TODO ["^T"]) ["u\\{2,4\\}"] (<= LEVEL 3))
             (> APROP "foo")
             (and (= BPROP 4) (<> HEADING "ignore"))))
                                          ; query string
"xyz+TODO={^T}+{u\\{{2,4\\}}}+LEVEL<=3 | APROP > \"foo\" | BPROP=4&HEADING
<> \"ignore\""
  => (progn
       (setq org-cached-props nil)
       (or
        (and
         (member "xyz" tags-list)
         (org-string-match= (or todo "") "^T")
         (org-match-any-p "u\\{2,4\\}" tags-list)
         (<= level 3))
        (org-string> (or (org-cached-entry-get nil "APROP") "") "foo")
        (and
         (= (org-cached-entry-get nil "BPROP") 4)
         (org-string<> (or heading "") "ignore"))))

(mtrans '(or (and "foo" (not "bar") ["^u.*v"] (> LEVEL 2))
             (= APROP "foo")
             (and (= BPROP ["/.*/"]) (<> BPROP "/ignore/"))
             (<> TODO "TODO")
             (> SCHEDULED "<2008-11-12>")))
              ; query string "foo-bar+{^u.*v}+LEVEL>2 | APROP=\"foo\"|
BPROP={/.*/} & BPROP <> "/ignore/" | TODO<>\"TODO\" | SCHEDULED >
\"<2008-11-12>\""
  => (progn
       (setq org-cached-props nil)
       (or
        (and
         (member "foo" tags-list)
         (not (member "bar" tags-list))
         (org-match-any-p "^u.*v" tags-list)
         (> level 2))
        (string= (or (org-cached-entry-get nil "APROP") "") "foo")
        (and
         (org-string-match= (or (org-cached-entry-get nil "BPROP") "")
"/.*/")
         (org-string<> (or (org-cached-entry-get nil "BPROP") "")
"/ignore/"))
        (org-string<> (or todo "") "TODO")
        (org-time> (or (org-cached-entry-get nil "SCHEDULED") "")
1226466000.0)))

(mtrans '(and :todo-only
              (or (and (not ["^abc"]) ["ex"] (= A_PROP "foo"))
                  (>= B_PROP 1.2e10)
                  (and (< D_PROP "<2008-12-24 18:30>") (= FOO (call other
lisp code here))))))
           ; except for FOO part which has no analogue, query string
"-{^abc}+{ex}&A_PROP=\"foo\" | B_PROP > 1.2e10 | DROP < \"<2008-12-24
18:30>\" & FOO = ..."
  => (progn
       (setq org-cached-props nil)
       (and
        (member todo org-not-done-keywords)
        (or
         (and
          (not (org-match-any-p "^abc" tags-list))
          (org-match-any-p "ex" tags-list)
          (string= (or (org-cached-entry-get nil "A_PROP") "") "foo"))
         (>= (org-cached-entry-get nil "B_PROP") 12000000000.0)
         (and
          (org-time< (or (org-cached-entry-get nil "D_PROP") "")
1230094800.0)
          (equal (org-cached-entry-get nil "FOO") (call other lisp code
here))))))


;;; The Code

(eval-when-compile (require 'cl))

(defun mtrans (matcher-sexp)
  "Create a tag/todo matcher from a sexp representation.
In the sexp representation, components are transformed as follows:

  + A literal string becomes an exact tag match.
  + A [<string>] or (re <string>) becomes a tag regex match
  + (or <item>...), (and <item>...), (not <item>)
    act as boolean operators, and processing continues on the <item>'s
  + (<op> <lhs> <rhs>) is a property comparison, where op must be
    one of   <, <=, >=, >, =, ==, or <>. One of lhs or rhs must be a
    symbol naming a property and the other must be either a number,
    string, [<string>] or (re <string>) for regex, or a generic form.
    (Only =, ==, and <> are allowed on the latter two.) Property
    symbols TODO, PRIORITY, HEADING, CATEGORY, are handled specially,
    otherwise, the symbol name is used as the property name.

  + A keyword :todo-only restricts attention to not done todo keywords,
    like /! does in standard queries.

Returns a tags matcher in the standard form, although the string
in the car of the matcher is (for now) fake, i.e., the query
string would not generate the same (or any useful) matcher."
  (let ((query "!ignored!")) ; ignore making this now, as it is not really
used anyway
    (cons query
          (cond
           ((atom matcher-sexp) (mtrans-1 matcher-sexp))
           ((and (consp matcher-sexp) (listp (cdr matcher-sexp)))
            `(progn
               (setq org-cached-props nil)
               ,(mtrans-1 matcher-sexp)))
           (t (error "Badly formed matcher sexp"))))))

(defun mtrans-1 (mitem)
  (if (atom mitem)
      (cond
       ((eq mitem :todo-only)
        '(member todo org-not-done-keywords))
       ((stringp mitem)
        `(member ,mitem tags-list))
       ((and (vectorp mitem) (stringp (aref mitem 0)))
        `(org-match-any-p ,(aref mitem 0) tags-list))
       (t mitem))
    (let ((head (car mitem)))
      (case head
       ((or and)
        `(,head ,@(mapcar 'mtrans-1 (cdr mitem))))
       (not
        (when (cddr mitem) (error "not is a unary operator"))
        `(not ,(mtrans-1 (cadr mitem))))
       ((< <= >= > = == <>)
        (let* ((arg1 (cadr mitem))
               (arg2 (car (cddr mitem)))
               (rhs  (or (mtrans-cmp-rhs-p arg1)
                         (mtrans-cmp-rhs-p arg2))))
          (cond
           ((and (symbolp arg1) rhs)
            (mtrans-cmp head arg1 rhs))
           ((and (symbolp arg2) rhs)
            (mtrans-cmp head arg2 rhs))
           (t (error "Badly formed property comparison"))
           (mtrans-cmp head (cadr mitem) (car (cddr mitem))))))
       (re
        `(org-match-any-p ,(cadr mitem) tags-list))
       (t mitem)))))

(defun mtrans-cmp-rhs-p (item)
  (cond
   ((numberp item)
    `(number ,item))
   ((and (stringp item) (string-match-p "[[<].*?[]>]" item))
    `(time ,(org-matcher-time item)))
   ((stringp item)
    `(string ,item))
   ((and (vectorp item) (stringp (aref item 0)))
    `(re ,(aref item 0)))
   ((consp item)
    `(form ,item))
   (t nil)))

(defun org-not-equal (a b) (not (equal a b)))

(defvar mtrans-op-alist
  '((<  (number . <)  (string . string<)          (time . org-time<)  (re .
nil)                (form . nil))
    (<= (number . <=) (string . org-string<=)     (time . org-time<=) (re .
nil)                (form . nil))
    (>= (number . >=) (string . org-string>=)     (time . org-time>=) (re .
nil)                (form . nil))
    (>  (number . >)  (string . org-string>)      (time . org-time>)  (re .
nil)                (form . nil))
    (=  (number . =)  (string . string=)          (time . org-time>)  (re .
org-string-match=)  (form . equal))
    (== (number . =)  (string . string=)          (time . org-time=)  (re .
org-string-match=)  (form . equal))
    (<> (number . org<>)  (string . org-string<>) (time . org-time<>) (re .
org-string-match<>) (form . org-not-equal)))
  "Maps comparison operators and types to suitable comparison function.
A nil value means the comparison is erroneous.")

(defvar mtrans-special-props-alist
  `((TODO todo string re form)
    (LEVEL level number)
    (HEADING heading string re form)
    (PRIORITY priority string re form)
    (CATEGORY (get-text-property (point) 'org-category) string re form))
  "Maps special property names to their matcher symbol and constraints.
Each value is of the form (MATCHER-SYMBOL TYPE...), where TYPE is
a symbol for an allowed comparison value type.")

(defun mtrans-cmp (op prop obj)
  (let* ((type (car obj))
         (val  (cadr obj))
         (special (cdr (assoc prop mtrans-special-props-alist)))
         (prop-ref (or (car special)
                       `(org-cached-entry-get nil ,(symbol-name prop))))
         (func-alist (cdr (assq op mtrans-op-alist)))
         (func (cdr (assoc type func-alist))))
    (when (and special (not (memq type (cdr special))))
      (error "Type mismatch in %s comparison" prop))
    (when (null func)
      (error "Improper operator for %s comparison" type))
    `(,func ,(if (memq type '(number form)) prop-ref `(or ,prop-ref ""))
,val)))


On Thu, Aug 16, 2012 at 1:02 AM, Samuel Wales <samologist@gmail.com> wrote:

> You have really dived into this.  I think it's excellent to allow more
> flexibility in searches.
>
> Just a brainstorm question, but having just modified the code, how
> difficult do you think it would be to provide a sexp syntax?
>
> Despite all of your obvious hard work, I'd find sexp easier to look
> up, make sense of, and remember.  I favor identifiers-like-this over
> single-character symbols, and (expressions (like this)) over
> precedence rules.
>
> Maybe just me though.  :)
>
> Samuel
>
> --
> The Kafka Pandemic: http://thekafkapandemic.blogspot.com
>

[-- Attachment #1.2: Type: text/html, Size: 30205 bytes --]

[-- Attachment #2: tag-sexp-matchers.el --]
[-- Type: application/octet-stream, Size: 5102 bytes --]

(eval-when-compile (require 'cl))

(defun mtrans (matcher-sexp)
  "Create a tag/todo matcher from a sexp representation.
In the sexp representation, components are transformed as follows:

  + A literal string becomes an exact tag match.
  + A [<string>] or (re <string>) becomes a tag regex match
  + (or <item>...), (and <item>...), (not <item>)
    act as boolean operators, and processing continues on the <item>'s
  + (<op> <lhs> <rhs>) is a property comparison, where op must be
    one of   <, <=, >=, >, =, ==, or <>. One of lhs or rhs must be a
    symbol naming a property and the other must be either a number,
    string, [<string>] or (re <string>) for regex, or a generic form.
    (Only =, ==, and <> are allowed on the latter two.) Property
    symbols TODO, PRIORITY, HEADING, CATEGORY, are handled specially, 
    otherwise, the symbol name is used as the property name.

  + A keyword :todo-only restricts attention to not done todo keywords,
    like /! does in standard queries.

Returns a tags matcher in the standard form, although the string
in the car of the matcher is (for now) fake, i.e., the query
string would not generate the same (or any useful) matcher."
  (let ((query "!ignored!")) ; ignore making this now, as it is not really used anyway
    (cons query
          (cond
           ((atom matcher-sexp) (mtrans-1 matcher-sexp))
           ((and (consp matcher-sexp) (listp (cdr matcher-sexp)))
            `(progn
               (setq org-cached-props nil)
               ,(mtrans-1 matcher-sexp)))
           (t (error "Badly formed matcher sexp"))))))

(defun mtrans-1 (mitem)
  (if (atom mitem)
      (cond
       ((eq mitem :todo-only)
        '(member todo org-not-done-keywords))
       ((stringp mitem)
        `(member ,mitem tags-list))
       ((and (vectorp mitem) (stringp (aref mitem 0)))
        `(org-match-any-p ,(aref mitem 0) tags-list))
       (t mitem))
    (let ((head (car mitem)))
      (case head
       ((or and)
        `(,head ,@(mapcar 'mtrans-1 (cdr mitem))))
       (not
        (when (cddr mitem) (error "not is a unary operator"))
        `(not ,(mtrans-1 (cadr mitem))))
       ((< <= >= > = == <>)
        (let* ((arg1 (cadr mitem))
               (arg2 (car (cddr mitem)))
               (rhs  (or (mtrans-cmp-rhs-p arg1)
                         (mtrans-cmp-rhs-p arg2))))
          (cond
           ((and (symbolp arg1) rhs)
            (mtrans-cmp head arg1 rhs))
           ((and (symbolp arg2) rhs)
            (mtrans-cmp head arg2 rhs))
           (t (error "Badly formed property comparison"))
           (mtrans-cmp head (cadr mitem) (car (cddr mitem))))))
       (re
        `(org-match-any-p ,(cadr mitem) tags-list))
       (t mitem)))))

(defun mtrans-cmp-rhs-p (item)
  (cond
   ((numberp item)
    `(number ,item))
   ((and (stringp item) (string-match-p "[[<].*?[]>]" item))
    `(time ,(org-matcher-time item)))
   ((stringp item)
    `(string ,item))
   ((and (vectorp item) (stringp (aref item 0)))
    `(re ,(aref item 0)))
   ((consp item)
    `(form ,item))
   (t nil)))

(defun org-not-equal (a b) (not (equal a b)))

(defvar mtrans-op-alist
  '((<  (number . <)  (string . string<)          (time . org-time<)  (re . nil)                (form . nil))
    (<= (number . <=) (string . org-string<=)     (time . org-time<=) (re . nil)                (form . nil))
    (>= (number . >=) (string . org-string>=)     (time . org-time>=) (re . nil)                (form . nil))
    (>  (number . >)  (string . org-string>)      (time . org-time>)  (re . nil)                (form . nil))
    (=  (number . =)  (string . string=)          (time . org-time>)  (re . org-string-match=)  (form . equal))
    (== (number . =)  (string . string=)          (time . org-time=)  (re . org-string-match=)  (form . equal))
    (<> (number . org<>)  (string . org-string<>) (time . org-time<>) (re . org-string-match<>) (form . org-not-equal)))
  "Maps comparison operators and types to suitable comparison function.
A nil value means the comparison is erroneous.")

(defvar mtrans-special-props-alist
  `((TODO todo string re form)
    (LEVEL level number)
    (HEADING heading string re form)
    (PRIORITY priority string re form)
    (CATEGORY (get-text-property (point) 'org-category) string re form))
  "Maps special property names to their matcher symbol and constraints.
Each value is of the form (MATCHER-SYMBOL TYPE...), where TYPE is
a symbol for an allowed comparison value type.")

(defun mtrans-cmp (op prop obj)
  (let* ((type (car obj))
         (val  (cadr obj))
         (special (cdr (assoc prop mtrans-special-props-alist)))
         (prop-ref (or (car special)
                       `(org-cached-entry-get nil ,(symbol-name prop))))
         (func-alist (cdr (assq op mtrans-op-alist)))
         (func (cdr (assoc type func-alist))))
    (when (and special (not (memq type (cdr special))))
      (error "Type mismatch in %s comparison" prop))
    (when (null func)
      (error "Improper operator for %s comparison" type))
    `(,func ,(if (memq type '(number form)) prop-ref `(or ,prop-ref "")) ,val)))


      reply	other threads:[~2012-08-16 17:48 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-04  7:50 Christopher Genovese
2012-08-04 10:07 ` Christopher Genovese
2012-08-04 18:32   ` Christopher Genovese
2012-08-16  5:02     ` Samuel Wales
2012-08-16 17:47       ` Christopher Genovese [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPum5FhU+xUeYYapVhhr4E20oaZBTcEB3386tvKSct7FivFNrg@mail.gmail.com \
    --to=genovese@cmu.edu \
    --cc=emacs-orgmode@gnu.org \
    --cc=samologist@gmail.com \
    --subject='Re: full parser implementation for tag queries (parentheses, fast heading match, and more)' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Code repositories for project(s) associated with this inbox:

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).