emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
* [PATCH] org-element.el; significant optimizations for org-element--interpret-affiliated-keywords
@ 2024-11-29 22:38 Dwarshuis, Nathan J
  2024-12-25 12:20 ` Ihor Radchenko
  0 siblings, 1 reply; 2+ messages in thread
From: Dwarshuis, Nathan J @ 2024-11-29 22:38 UTC (permalink / raw)
  To: org-mode-email

[-- Attachment #1: Type: text/plain, Size: 809 bytes --]

Hi all,

I noticed that calling `org-element-interpret-data' on objects is generally 5-10x faster than when calling on elements. The reason seems to be that `org-element--interpret-affiliated-keywords' (which is only called on elements) does alot of unnecessary work. Namely, it runs on all elements (including those that should never have an affiliated keyword) and also loops over :standard-properties which should not be relevant here.

The attached patch addresses this.

I also attached some benchmarks pre and post-patch and accompanying code, which shows many elements (specifically those without children) approaching the execution time of objects. For the graphs, just pay attention to the 'org-element' bars (red); the others were from me trying to optimize the `org-ml` package.

Thank you,
Nate



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-org-element.el-Make-affiliated-keyword-interpreter-f.patch --]
[-- Type: text/x-patch, Size: 5720 bytes --]

From 1b4fb607bd6e233a15a569960517a9388a80974b Mon Sep 17 00:00:00 2001
From: ndwarshuis <ndwar@yavin4.ch>
Date: Mon, 25 Nov 2024 22:04:09 -0500
Subject: [PATCH] org-element.el: Make affiliated keyword interpreter faster

* lisp/org-element.el (org-element--interpret-affiliated-keywords):
Optimize performance by bypassing unnecessary types and reducing
loop complexity. Added new constant
`org-element-elements-no-affiliated` which stores the types to
be bypassed.

This function was doing redundant work on several levels which
dramatically reduced performance of interpreting element nodes
relative to object nodes.

First, all types were interpreted regardless of if they could
possibly contain affiliated keywords. Skipping these types
dramatically speeds up typical use cases since many of these
skipped types are common (headline, item, etc).

Second, the loop was much more complex than needed. The loop included
:standard-properties which should not be necessary here. It also
duplicated some work between calls to `org-element--properties-mapc`
and `mapconcat` (the code was moved entirely under the former). The
result should be faster and more readable.

TINYCHANGE
---
 lisp/org-element.el | 82 +++++++++++++++++++++++----------------------
 1 file changed, 42 insertions(+), 40 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 9f8f52745..3b90dce2a 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -335,6 +335,11 @@ specially in `org-element--object-lex'.")
   (append org-element-recursive-objects '(paragraph table-row verse-block))
   "List of object or element types that can directly contain objects.")
 
+(defconst org-element-elements-no-affiliated
+  '(org-data comment clock headline inlinetask item
+             node-property planning property-drawer
+             section table-row))
+
 (defconst org-element-affiliated-keywords
   '("CAPTION" "DATA" "HEADER" "HEADERS" "LABEL" "NAME" "PLOT" "RESNAME" "RESULT"
     "RESULTS" "SOURCE" "SRCNAME" "TBLNAME")
@@ -5517,49 +5522,46 @@ to interpret.  Return Org syntax as a string."
 			      (make-string blank ?\n)))))))))
     (funcall fun data nil)))
 
+(defun org-element--keyword-to-org (key value)
+  (let (dual)
+    (when (member key org-element-dual-keywords)
+      (setq dual (cdr value) value (car value)))
+    (concat "#+" (downcase key)
+            (and dual
+                 (format "[%s]" (org-element-interpret-data dual)))
+            ": "
+            (if (member key org-element-parsed-keywords)
+                (org-element-interpret-data value)
+              value)
+            "\n")))
+
 (defun org-element--interpret-affiliated-keywords (element)
   "Return ELEMENT's affiliated keywords as Org syntax.
 If there is no affiliated keyword, return the empty string."
-  (let ((keyword-to-org
-	 (lambda (key value)
-	   (let (dual)
-	     (when (member key org-element-dual-keywords)
-	       (setq dual (cdr value) value (car value)))
-	     (concat "#+" (downcase key)
-		     (and dual
-			  (format "[%s]" (org-element-interpret-data dual)))
-		     ": "
-		     (if (member key org-element-parsed-keywords)
-			 (org-element-interpret-data value)
-		       value)
-		     "\n")))))
-    (mapconcat
-     (lambda (prop)
-       (let ((value (org-element-property prop element))
-	     (keyword (upcase (substring (symbol-name prop) 1))))
-	 (when value
-	   (if (or (member keyword org-element-multiple-keywords)
-		   ;; All attribute keywords can have multiple lines.
-		   (string-match-p "^ATTR_" keyword))
-	       (mapconcat (lambda (line) (funcall keyword-to-org keyword line))
-			  value "")
-	     (funcall keyword-to-org keyword value)))))
-     ;; List all ELEMENT's properties matching an attribute line or an
-     ;; affiliated keyword, but ignore translated keywords since they
-     ;; cannot belong to the property list.
-     (let (acc)
-       (org-element-properties-mapc
-        (lambda (prop _ __)
-          (let  ((keyword (upcase (substring (symbol-name prop) 1))))
-            (when (or (string-match-p "^ATTR_" keyword)
-		      (and
-		       (member keyword org-element-affiliated-keywords)
-		       (not (assoc keyword
-			         org-element-keyword-translation-alist))))
-              (push prop acc))))
-        element t)
-       (nreverse acc))
-     "")))
+  ;; there are some elements that will never have affiliated keywords,
+  ;; so do nothing for these
+  (if (member (org-element-type element) org-element-elements-no-affiliated)
+      ""
+    (let (acc)
+      (org-element-properties-resolve element t)
+      (org-element--properties-mapc
+       (lambda (prop value)
+         (when value
+           (let ((keyword (upcase (substring (symbol-name prop) 1))))
+             (when (or (string-match-p "^ATTR_" keyword)
+                       (and
+                        (member keyword org-element-affiliated-keywords)
+                        (not (assoc keyword
+                                    org-element-keyword-translation-alist))))
+               (push (if (or (member keyword org-element-multiple-keywords)
+                             ;; All attribute keywords can have multiple lines.
+                             (string-match-p "^ATTR_" keyword))
+                         (mapconcat (lambda (line) (org-element--keyword-to-org keyword line))
+                                    value "")
+                       (org-element--keyword-to-org keyword value))
+                     acc)))))
+      element nil t)
+    (apply #'concat (nreverse acc)))))
 
 ;; Because interpretation of the parse tree must return the same
 ;; number of blank lines between elements and the same number of white
-- 
2.47.0


[-- Attachment #3: org-ml-postpatch.pdf --]
[-- Type: application/pdf, Size: 10220 bytes --]

[-- Attachment #4: org-ml-prepatch.pdf --]
[-- Type: application/pdf, Size: 10187 bytes --]

[-- Attachment #5: benchmark.el --]
[-- Type: text/plain, Size: 9161 bytes --]

(defmacro time-stringy (class what thing)
  `(progn
     (let ((reps 5000)
           t0 t1 t2 t3)
       (garbage-collect)
       (setq t0 (float-time))
       (--dotimes reps (org-element-interpret-data ,thing))
       (setq t1 (float-time))
       (garbage-collect)
       (setq t2 (float-time))
       (--dotimes reps (org-ml-to-string ,thing))
       (setq t3 (float-time))
       (format "%s, %s, %f, %f" ,class ,what (- t3 t2) (- t1 t0)))))

(defun run-tests ()
  (let* ((text "abc")
         ;; value objects
         (code (org-ml-build-code "abc"))
         (verbatim (org-ml-build-verbatim "abc"))
         (target (org-ml-build-target "abc"))
         (ibcall (org-ml-build-inline-babel-call "abc"))
         (isblock (org-ml-build-inline-src-block "abc" :value "xyz"))
         (break (org-ml-build-line-break))
         (macro (org-ml-build-macro "abc"))
         (stat (org-ml-build-statistics-cookie '(1 2)))

         ;; recursive objects
         (bold (org-ml-build-bold "abc"))
         (radio-target (org-ml-build-radio-target "abc"))
         (strike (org-ml-build-strike-through "abc"))
         (italic (org-ml-build-italic "abc"))
         (underline (org-ml-build-underline "abc"))
         (subscript (org-ml-build-subscript :use-brackets-p t "abc"))
         (superscript (org-ml-build-superscript :use-brackets-p t "abc"))
         (cell (org-ml-build-table-cell "abc"))
         (link (org-ml-build-link "https://downloadmoreram.com" "abc"))
         (fnref (org-ml-build-footnote-reference "abc"))

         ;; other objects
         (latex-frag (org-ml-build-latex-fragment "abc"))
         (keyword (org-ml-build-keyword "thing" "abc"))
         (timestamp (org-ml-build-timestamp! '(2024 1 1 0 0)))
         (timestamp-long (org-ml-build-timestamp!
                          '(2024 1 1 0 0)
                          :end '(2024 1 1 1 0)
                          :collapsed nil
                          :repeater '(cumulate 10 day)
                          :deadline '(14 day)
                          :warning '(all 2 day)
                          ))
         (entity (org-ml-build-entity "Agrave"))
         (export-snip (org-ml-build-export-snippet "abc" "xyz"))

         ;; leaf elements
         (comment (org-ml-build-comment "abc"))
         (comment-block (org-ml-build-comment-block :value "abc"))
         (example-block (org-ml-build-example-block :value "abc"))
         (horz (org-ml-build-horizontal-rule))
         (export-block (org-ml-build-export-block "abc" "xyz"))
         (fixed-width (org-ml-build-fixed-width "abc"))
         (clock (org-ml-build-clock! '(2024 1 1 0 0)))
         (clock-long (org-ml-build-clock! '(2024 1 1 0 0) :end '(2024 1 1 1 1)))
         (bcall (org-ml-build-babel-call "abc"))
         (diary-sexp (org-ml-build-diary-sexp :value '(poop)))
         (latex-env (org-ml-build-latex-environment (list "abc" "xyz")))
         (node-prop (org-ml-build-node-property "abc" "xyz"))
         (plan (org-ml-build-planning! :closed '(2024 1 1)))
         (plan-long (org-ml-build-planning! :closed '(2024 1 1) :scheduled '(2024 1 1) :deadline '(2024 1 1)))
         (src-block (org-ml-build-src-block :value "abc"))

         ;; elements with objects
         (para0 (org-ml-build-paragraph))
         (row0 (org-ml-build-table-row))
         (verse0 (org-ml-build-verse-block))
         (para3 (org-ml-build-paragraph text text text))
         (row3 (org-ml-build-table-row cell cell cell))
         (verse3 (org-ml-build-verse-block text text text))
         (para6 (org-ml-build-paragraph text text text text text text))
         (row6 (org-ml-build-table-row cell cell cell cell cell cell))
         (verse6 (org-ml-build-verse-block text text text text text text))

         ;; greater elements
         (item (org-ml-build-item! :bullet '- :checkbox 'on :paragraph "abc"))
         (hl0 (org-ml-build-headline! :title-text "abc"
                                      :todo-keyword "TODO"
                                      :tags '("poopy" "butthole")))
         (hl3 (org-ml-build-headline! :title-text "abc"
                                      :todo-keyword "TODO"
                                      :tags '("poopy" "butthole")
                                      :section-children (-repeat 3 para3)))
         (hl6 (org-ml-build-headline! :title-text "abc"
                                      :todo-keyword "TODO"
                                      :tags '("poopy" "butthole")
                                      :section-children (-repeat 3 para3)))
         (pd0 (org-ml-build-property-drawer))
         (pd3 (apply #'org-ml-build-property-drawer! (-repeat 3 '("abc" "xyz"))))
         (pd6 (apply #'org-ml-build-property-drawer! (-repeat 6 '("abc" "xyz"))))
         (pl3 (apply #'org-ml-build-plain-list (-repeat 3 item)))
         (pl6 (apply #'org-ml-build-plain-list (-repeat 6 item)))
         (drawer0 (org-ml-build-drawer "abc"))
         (drawer3 (org-ml-build-drawer "abc" pl3))
         (drawer6 (org-ml-build-drawer "abc" pl6))
         (table3 (apply #'org-ml-build-table (-repeat 3 row3)))
         (table6 (apply #'org-ml-build-table (-repeat 6 row6)))
         )
    (->> (list
          (time-stringy "plain" "text" text)
          (time-stringy "object" "code" code)
          (time-stringy "object" "verbatim" verbatim)
          (time-stringy "object" "target" target)
          (time-stringy "object" "inline-babel-call" ibcall)
          (time-stringy "object" "inline-src-block" isblock)
          (time-stringy "object" "entity" entity)
          (time-stringy "object" "keyword" keyword)
          (time-stringy "object" "timestamp" timestamp)
          (time-stringy "object" "timestamp-long" timestamp-long)
          (time-stringy "object" "latex-frag" latex-frag)
          (time-stringy "object" "export-snippet" export-snip)
          (time-stringy "object" "line-break" break)
          (time-stringy "object" "macro" macro)
          (time-stringy "object" "statcookie" stat)

          (time-stringy "recursive object" "bold" bold)
          (time-stringy "recursive object" "strikethru" strike)
          (time-stringy "recursive object" "italic" italic)
          (time-stringy "recursive object" "underline" underline)
          (time-stringy "recursive object" "superscript" superscript)
          (time-stringy "recursive object" "subscript" subscript)
          (time-stringy "recursive object" "cell" cell)
          (time-stringy "recursive object" "link" link)
          (time-stringy "recursive object" "footnote-ref" fnref)
          (time-stringy "recursive object" "radio-target" radio-target)

          (time-stringy "element" "babel call" bcall)
          (time-stringy "element" "comment" comment)
          (time-stringy "element" "comment-block" comment-block)
          (time-stringy "element" "clock" clock)
          (time-stringy "element" "clock-long" clock-long)
          (time-stringy "element" "diary-sexp" diary-sexp)
          (time-stringy "element" "export-block" export-block)
          (time-stringy "element" "example-block" example-block)
          (time-stringy "element" "fixed-width" fixed-width)
          (time-stringy "element" "horizontal rule" horz)
          (time-stringy "element" "latex env" latex-env)
          (time-stringy "element" "prop" node-prop)
          (time-stringy "element" "planning" plan)
          (time-stringy "element" "planning-long" plan-long)
          (time-stringy "element" "src-block" src-block)

          (time-stringy "element with objects" "paragraph0" para0)
          (time-stringy "element with objects" "table-row0" row0)
          (time-stringy "element with objects" "verse-block0" verse0)
          (time-stringy "element with objects" "paragraph3" para3)
          (time-stringy "element with objects" "table-row3" row3)
          (time-stringy "element with objects" "verse-block3" verse3)
          (time-stringy "element with objects" "paragraph6" para6)
          (time-stringy "element with objects" "table-row6" row6)
          (time-stringy "element with objects" "verse-block6" verse6)

          (time-stringy "greater element" "headline0" hl0)
          (time-stringy "greater element" "headline3" hl3)
          ;; (time-stringy "greater element" "headline6" hl6)
          (time-stringy "greater element" "drawer0" drawer0)
          (time-stringy "greater element" "drawer3" drawer3)
          ;; (time-stringy "greater element" "drawer6" drawer6)
          (time-stringy "greater element" "item" item)
          (time-stringy "greater element" "property-drawer0" pd0)
          (time-stringy "greater element" "property-drawer3" pd3)
          ;; (time-stringy "greater element" "property-drawer6" pd6)
          (time-stringy "greater element" "plain-list3" pl3)
          ;; (time-stringy "greater element" "plain-list6" pl6)
          (time-stringy "greater element" "table3" table3)
          ;; (time-stringy "greater element" "table6" table6)
          )
         (s-join "\n"))))

;; (run-tests)
       
(--> (-repeat 5 (run-tests))
     (s-join "\n" it)
     (f-write-text it 'utf-8 "/tmp/runs_interpret.out"))

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] org-element.el; significant optimizations for org-element--interpret-affiliated-keywords
  2024-11-29 22:38 [PATCH] org-element.el; significant optimizations for org-element--interpret-affiliated-keywords Dwarshuis, Nathan J
@ 2024-12-25 12:20 ` Ihor Radchenko
  0 siblings, 0 replies; 2+ messages in thread
From: Ihor Radchenko @ 2024-12-25 12:20 UTC (permalink / raw)
  To: Dwarshuis, Nathan J; +Cc: org-mode-email

[-- Attachment #1: Type: text/plain, Size: 800 bytes --]

"Dwarshuis, Nathan J" <ndwar@yavin4.ch> writes:

> I noticed that calling `org-element-interpret-data' on objects is
> generally 5-10x faster than when calling on elements. The reason seems
> to be that `org-element--interpret-affiliated-keywords' (which is only
> called on elements) does alot of unnecessary work. Namely, it runs on
> all elements (including those that should never have an affiliated
> keyword)
>
> The attached patch addresses this.

Thanks!
I am attaching some extra suggestions on top of the patch.

> ...  and also loops over :standard-properties which should not be
> relevant here.

There is nothing stopping us from adding some affiliated keywords to
standard properties in future. What happens if you drop this
optimization? Does the benchmark still show an improvement?


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-suggestions.patch --]
[-- Type: text/x-patch, Size: 3776 bytes --]

From 0301efb86b994e2c79a37c21f17c664c1193d4c0 Mon Sep 17 00:00:00 2001
Message-ID: <0301efb86b994e2c79a37c21f17c664c1193d4c0.1735129004.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Wed, 25 Dec 2024 13:16:36 +0100
Subject: [PATCH] suggestions

---
 lisp/org-element.el | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 3b90dce2a2..d386ee4184 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -338,7 +338,8 @@ (defconst org-element-object-containers
 (defconst org-element-elements-no-affiliated
   '(org-data comment clock headline inlinetask item
              node-property planning property-drawer
-             section table-row))
+             section table-row)
+  "List of paragraph-level node types that cannot have affiliated keywords.")
 
 (defconst org-element-affiliated-keywords
   '("CAPTION" "DATA" "HEADER" "HEADERS" "LABEL" "NAME" "PLOT" "RESNAME" "RESULT"
@@ -5522,7 +5523,8 @@ (defun org-element-interpret-data (data)
 			      (make-string blank ?\n)))))))))
     (funcall fun data nil)))
 
-(defun org-element--keyword-to-org (key value)
+(defun org-element--interpret-affiliated-keyword (key value)
+  "Interpret affiliated keyword with KEY and VALUE."
   (let (dual)
     (when (member key org-element-dual-keywords)
       (setq dual (cdr value) value (car value)))
@@ -5540,28 +5542,30 @@ (defun org-element--interpret-affiliated-keywords (element)
 If there is no affiliated keyword, return the empty string."
   ;; there are some elements that will never have affiliated keywords,
   ;; so do nothing for these
-  (if (member (org-element-type element) org-element-elements-no-affiliated)
+  (if (member (org-element-type element)
+              org-element-elements-no-affiliated)
       ""
     (let (acc)
       (org-element-properties-resolve element t)
       (org-element--properties-mapc
        (lambda (prop value)
          (when value
-           (let ((keyword (upcase (substring (symbol-name prop) 1))))
-             (when (or (string-match-p "^ATTR_" keyword)
+           (let* ((keyword (upcase (substring (symbol-name prop) 1)))
+                  (attrp (string-match-p "^ATTR_" keyword)))
+             (when (or attrp
                        (and
                         (member keyword org-element-affiliated-keywords)
                         (not (assoc keyword
-                                    org-element-keyword-translation-alist))))
-               (push (if (or (member keyword org-element-multiple-keywords)
-                             ;; All attribute keywords can have multiple lines.
-                             (string-match-p "^ATTR_" keyword))
-                         (mapconcat (lambda (line) (org-element--keyword-to-org keyword line))
-                                    value "")
-                       (org-element--keyword-to-org keyword value))
+                                  org-element-keyword-translation-alist))))
+               (push (if (or attrp ; All attribute keywords can have multiple lines.
+                             (member keyword org-element-multiple-keywords))
+                         (mapconcat
+                          (lambda (line) (org-element--interpret-affiliated-keyword keyword line))
+                          value "")
+                       (org-element--interpret-affiliated-keyword keyword value))
                      acc)))))
-      element nil t)
-    (apply #'concat (nreverse acc)))))
+       element nil t)
+      (apply #'concat (nreverse acc)))))
 
 ;; Because interpretation of the parse tree must return the same
 ;; number of blank lines between elements and the same number of white
-- 
2.47.1


[-- Attachment #3: Type: text/plain, Size: 223 bytes --]


-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2024-12-25 12:19 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-29 22:38 [PATCH] org-element.el; significant optimizations for org-element--interpret-affiliated-keywords Dwarshuis, Nathan J
2024-12-25 12:20 ` Ihor Radchenko

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).