emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
* [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
@ 2021-01-01 19:58 Emily Bourke
  2021-01-04  3:28 ` Kyle Meyer
  0 siblings, 1 reply; 14+ messages in thread
From: Emily Bourke @ 2021-01-01 19:58 UTC (permalink / raw)
  To: emacs-orgmode@gnu.org

[-- Attachment #1: Type: text/plain, Size: 474 bytes --]

Hi,

I found publishing when there were no changes to be slower than expected. Profiling showed me that `org-publish-cache-file-needs-publishing' was invoking the `after-find-file' hooks, which I don't think is necessary.

I've changed it to avoid doing that, by using `with-temp-buffer' and `insert-file-contents', and noticed a significant increase in speed.

Is there any reason I'm missing for using `find-file-noselect' in this case?

Best wishes,
Emily Bourke

[-- Attachment #2: 0001-ox-publish.el-Speed-up-org-publish-cache-file-needs-.patch --]
[-- Type: application/octet-stream, Size: 2742 bytes --]

From 0ae9d031ec368ddfe6295dce76aafbd368c5eb25 Mon Sep 17 00:00:00 2001
From: Emily Bourke <undergroundquizscene@gmail.com>
Date: Fri, 1 Jan 2021 18:47:45 +0000
Subject: [PATCH] ox-publish.el: Speed up
 org-publish-cache-file-needs-publishing

* lisp/ox-publish.el (org-publish-cache-file-needs-publishing): Use
`with-temp-buffer' with `insert-file-contents' instead of
`find-file-noselect'.  This avoids running the `after-find-file' hook,
which can make it significantly faster.

TINYCHANGE
---
 lisp/ox-publish.el | 43 ++++++++++++++++++++-----------------------
 1 file changed, 20 insertions(+), 23 deletions(-)

diff --git a/lisp/ox-publish.el b/lisp/ox-publish.el
index 7bb2fed6e..e967286cf 100644
--- a/lisp/ox-publish.el
+++ b/lisp/ox-publish.el
@@ -1290,29 +1290,26 @@ the file including them will be republished as well."
 	 (org-inhibit-startup t)
 	 included-files-ctime)
     (when (equal (file-name-extension filename) "org")
-      (let ((visiting (find-buffer-visiting filename))
-	    (buf (find-file-noselect filename))
-	    (case-fold-search t))
-	(unwind-protect
-	    (with-current-buffer buf
-	      (goto-char (point-min))
-	      (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
-		(let ((element (org-element-at-point)))
-		  (when (eq 'keyword (org-element-type element))
-		    (let* ((value (org-element-property :value element))
-			   (filename
-			    (and (string-match "\\`\\(\".+?\"\\|\\S-+\\)" value)
-				 (let ((m (org-strip-quotes
-					   (match-string 1 value))))
-				   ;; Ignore search suffix.
-				   (if (string-match "::.*?\\'" m)
-				       (substring m 0 (match-beginning 0))
-				     m)))))
-		      (when filename
-			(push (org-publish-cache-ctime-of-src
-			       (expand-file-name filename))
-			      included-files-ctime)))))))
-	  (unless visiting (kill-buffer buf)))))
+      (let ((case-fold-search t))
+	(with-temp-buffer
+	  (insert-file-contents filename)
+	  (goto-char (point-min))
+	  (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
+	    (let ((element (org-element-at-point)))
+	      (when (eq 'keyword (org-element-type element))
+		(let* ((value (org-element-property :value element))
+		       (filename
+			(and (string-match "\\`\\(\".+?\"\\|\\S-+\\)" value)
+			     (let ((m (org-strip-quotes
+				       (match-string 1 value))))
+			       ;; Ignore search suffix.
+			       (if (string-match "::.*?\\'" m)
+				   (substring m 0 (match-beginning 0))
+				 m)))))
+		  (when filename
+		    (push (org-publish-cache-ctime-of-src
+			   (expand-file-name filename))
+			  included-files-ctime)))))))))
     (or (null pstamp)
 	(let ((ctime (org-publish-cache-ctime-of-src filename)))
 	  (or (time-less-p pstamp ctime)
-- 
2.21.1 (Apple Git-122.3)


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-01 19:58 [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing Emily Bourke
@ 2021-01-04  3:28 ` Kyle Meyer
  2021-01-06 20:58   ` Emily Bourke
  2021-01-07  1:11   ` Dr. Arne Babenhauserheide
  0 siblings, 2 replies; 14+ messages in thread
From: Kyle Meyer @ 2021-01-04  3:28 UTC (permalink / raw)
  To: Emily Bourke; +Cc: emacs-orgmode@gnu.org

Thank you for the patch.

Emily Bourke writes:

> I found publishing when there were no changes to be slower than
> expected. Profiling showed me that
> `org-publish-cache-file-needs-publishing' was invoking the
> `after-find-file' hooks, which I don't think is necessary.
>
> I've changed it to avoid doing that, by using `with-temp-buffer' and
> `insert-file-contents', and noticed a significant increase in speed.
>
> Is there any reason I'm missing for using `find-file-noselect' in this
> case?

Nothing jumps out to me.  For large files that are already visited, I
suppose find-file-noselect returning an existing buffer can be faster,
so relevant factors would include how many Org files a project has, how
large they are, and how many of those are visited in the current
session.  My guess is that using with-temp-buffer and
insert-file-contents would be a net gain, though that gain would be
narrowed some if the temporary buffer was put into org-mode rather than
kept in fundamental-mode (more below).

> Subject: [PATCH] ox-publish.el: Speed up
>  org-publish-cache-file-needs-publishing
>
> * lisp/ox-publish.el (org-publish-cache-file-needs-publishing): Use
> `with-temp-buffer' with `insert-file-contents' instead of
> `find-file-noselect'.  This avoids running the `after-find-file' hook,
> which can make it significantly faster.

This reads to me like after-find-file is the hook itself.  Perhaps
something like this would be clearer: "... avoids calling
after-find-file and running find-file-hook, ...".

> diff --git a/lisp/ox-publish.el b/lisp/ox-publish.el
> index 7bb2fed6e..e967286cf 100644
> --- a/lisp/ox-publish.el
> +++ b/lisp/ox-publish.el
> @@ -1290,29 +1290,26 @@ the file including them will be republished as well."
>  	 (org-inhibit-startup t)
>  	 included-files-ctime)
>      (when (equal (file-name-extension filename) "org")
> -      (let ((visiting (find-buffer-visiting filename))
> -	    (buf (find-file-noselect filename))
> -	    (case-fold-search t))
> -	(unwind-protect
> -	    (with-current-buffer buf
> -	      (goto-char (point-min))
> -	      (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
> -		(let ((element (org-element-at-point)))
> -		  (when (eq 'keyword (org-element-type element))
> -		    (let* ((value (org-element-property :value element))
> -			   (filename
> -			    (and (string-match "\\`\\(\".+?\"\\|\\S-+\\)" value)
> -				 (let ((m (org-strip-quotes
> -					   (match-string 1 value))))
> -				   ;; Ignore search suffix.
> -				   (if (string-match "::.*?\\'" m)
> -				       (substring m 0 (match-beginning 0))
> -				     m)))))
> -		      (when filename
> -			(push (org-publish-cache-ctime-of-src
> -			       (expand-file-name filename))
> -			      included-files-ctime)))))))
> -	  (unless visiting (kill-buffer buf)))))
> +      (let ((case-fold-search t))
> +	(with-temp-buffer
> +	  (insert-file-contents filename)
> +	  (goto-char (point-min))

The goto-char call can be dropped now because insert-file-contents inserts
after point.

Unlike the previous code, this doesn't activate org-mode in the buffer.
That gives a speedup.  And I don't spot any code downstream that depends
on the major mode being org-mode, so it's probably safe, though perhaps
there's a subtle change in behavior here (e.g., related to syntax
table).

If org-mode isn't called, the org-inhibit-startup binding above could be
dropped.

> +	  (while (re-search-forward "^[ \t]*#\\+INCLUDE:" nil t)
> +	    (let ((element (org-element-at-point)))
> +	      (when (eq 'keyword (org-element-type element))
> +		(let* ((value (org-element-property :value element))
> +		       (filename
> +			(and (string-match "\\`\\(\".+?\"\\|\\S-+\\)" value)
> +			     (let ((m (org-strip-quotes
> +				       (match-string 1 value))))
> +			       ;; Ignore search suffix.
> +			       (if (string-match "::.*?\\'" m)
> +				   (substring m 0 (match-beginning 0))
> +				 m)))))
> +		  (when filename
> +		    (push (org-publish-cache-ctime-of-src
> +			   (expand-file-name filename))

This introduces a regression.  With the previous code, the
find-file-noselect call led to default-directory being set to the Org
file's directory, and then this expand-file call on the included file
was relative to that.  With the new code, default-directory isn't
changed, so it points to a non-existing or incorrect file unless the
current default-directory and the Org file's happen to match.

> +			  included-files-ctime)))))))))


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-04  3:28 ` Kyle Meyer
@ 2021-01-06 20:58   ` Emily Bourke
  2021-05-01 10:22     ` Bastien
  2021-01-07  1:11   ` Dr. Arne Babenhauserheide
  1 sibling, 1 reply; 14+ messages in thread
From: Emily Bourke @ 2021-01-06 20:58 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: emacs-orgmode@gnu.org

Thanks for the feedback!

> Nothing jumps out to me. For large files that are already visited, I
> suppose find-file-noselect returning an existing buffer can be faster,
> so relevant factors would include how many Org files a project has, how
> large they are, and how many of those are visited in the current
> session. My guess is that using with-temp-buffer and
> insert-file-contents would be a net gain, though that gain would be
> narrowed some if the temporary buffer was put into org-mode rather than
> kept in fundamental-mode (more below).

I'll do some testing with some large org files and see how things compare – it might be worth switching to the buffer for the file if there is one already.

> This reads to me like after-find-file is the hook itself. Perhaps
> something like this would be clearer: "... avoids calling
> after-find-file and running find-file-hook, ...".

Ah yes, I had misunderstood – I'll rephrase it.

> The goto-char call can be dropped now because insert-file-contents inserts
> after point.

I follow, will remove.

> Unlike the previous code, this doesn't activate org-mode in the buffer.
> That gives a speedup. And I don't spot any code downstream that depends
> on the major mode being org-mode, so it's probably safe, though perhaps
> there's a subtle change in behavior here (e.g., related to syntax
> table).
>
> If org-mode isn't called, the org-inhibit-startup binding above could be
> dropped.

Yeah, if you're worried about it I could try manually activating org mode in the temp buffer – I'm not confident I could predict any problems there might be from not activating it.

> This introduces a regression. With the previous code, the
> find-file-noselect call led to default-directory being set to the Org
> file's directory, and then this expand-file call on the included file
> was relative to that. With the new code, default-directory isn't
> changed, so it points to a non-existing or incorrect file unless the
> current default-directory and the Org file's happen to match.

Ah, I hadn't noticed this – I'll change it to set default-directory manually.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-04  3:28 ` Kyle Meyer
  2021-01-06 20:58   ` Emily Bourke
@ 2021-01-07  1:11   ` Dr. Arne Babenhauserheide
  1 sibling, 0 replies; 14+ messages in thread
From: Dr. Arne Babenhauserheide @ 2021-01-07  1:11 UTC (permalink / raw)
  To: Kyle Meyer; +Cc: Emily Bourke, emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 913 bytes --]


Kyle Meyer <kyle@kyleam.com> writes:

> Nothing jumps out to me.  For large files that are already visited, I
> suppose find-file-noselect returning an existing buffer can be faster,
> so relevant factors would include how many Org files a project has, how
> large they are, and how many of those are visited in the current
> session.  My guess is that using with-temp-buffer and
> insert-file-contents would be a net gain, though that gain would be
> narrowed some if the temporary buffer was put into org-mode rather than
> kept in fundamental-mode (more below).

If you want to test a rather complex setup, you can try my website build:

hg clone http://hg.sr.ht/~arnebab/draketo; cd draketo/; autoreconf -i; ./configure ; time make

It takes around 3 minutes to build on my machine.

(hg is Mercurial)

Best wishes,
Arne
-- 
Unpolitisch sein
heißt politisch sein
ohne es zu merken

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 1125 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-01-06 20:58   ` Emily Bourke
@ 2021-05-01 10:22     ` Bastien
  2021-05-05 18:55       ` Emily Bourke
  0 siblings, 1 reply; 14+ messages in thread
From: Bastien @ 2021-05-01 10:22 UTC (permalink / raw)
  To: Emily Bourke; +Cc: Kyle Meyer, emacs-orgmode@gnu.org

Hi Emily,

thanks for the patch and sorry to reactivate this old thread.

Did you find time to make the tests and, perhaps, to update the patch?

Please let us know - thanks!

-- 
 Bastien


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-05-01 10:22     ` Bastien
@ 2021-05-05 18:55       ` Emily Bourke
  2021-09-26 10:26         ` Bastien
  0 siblings, 1 reply; 14+ messages in thread
From: Emily Bourke @ 2021-05-05 18:55 UTC (permalink / raw)
  To: Bastien; +Cc: Kyle Meyer, emacs-orgmode@gnu.org

Hi Bastien,

> thanks for the patch and sorry to reactivate this old thread.

No problem, thanks for getting in touch - the reminder is appreciated.

> Did you find time to make the tests and, perhaps, to update the patch?

I'm afraid I haven't had a chance to look at this any further since my last email. I'll try to find some time this week.

Best wishes,
Emily


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-05-05 18:55       ` Emily Bourke
@ 2021-09-26 10:26         ` Bastien
  2021-09-26 11:58           ` Emily Bourke
  0 siblings, 1 reply; 14+ messages in thread
From: Bastien @ 2021-09-26 10:26 UTC (permalink / raw)
  To: Emily Bourke; +Cc: Kyle Meyer, emacs-orgmode@gnu.org

Hi Emily,

sorry to bump an old thread...

Emily Bourke <undergroundquizscene@protonmail.com> writes:

> I'm afraid I haven't had a chance to look at this any further since
> my last email. I'll try to find some time this week.

... did you have time to take a closer look at this?

We are working on preparing Org 9.5, hopefully releasing it quite
soon, it would be great if we can apply and test your patch before
that.

Thanks a lot!

-- 
 Bastien


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-09-26 10:26         ` Bastien
@ 2021-09-26 11:58           ` Emily Bourke
  2021-09-26 13:22             ` Bastien
  0 siblings, 1 reply; 14+ messages in thread
From: Emily Bourke @ 2021-09-26 11:58 UTC (permalink / raw)
  To: Bastien; +Cc: Kyle Meyer, emacs-orgmode@gnu.org

Hi Bastien,

Thanks for getting in touch!

> ... did you have time to take a closer look at this?

I’m afraid I haven’t had any look at this since. However, I’ll have time tomorrow and later in the week to take a look – will that be early enough for the 9.5 release?

Best wishes,
Emily


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-09-26 11:58           ` Emily Bourke
@ 2021-09-26 13:22             ` Bastien
  2021-09-26 15:14               ` Emily Bourke
  0 siblings, 1 reply; 14+ messages in thread
From: Bastien @ 2021-09-26 13:22 UTC (permalink / raw)
  To: Emily Bourke; +Cc: Kyle Meyer, emacs-orgmode@gnu.org

Hi Emily,

thanks for your quick feedback!

Emily Bourke <undergroundquizscene@protonmail.com> writes:

>> ... did you have time to take a closer look at this?
>
> I’m afraid I haven’t had any look at this since. However, I’ll have
> time tomorrow and later in the week to take a look – will that be
> early enough for the 9.5 release?

Tomorrow for sure, and probably next week will be okay too.

To be clear, I hope we can release 9.5 soon to be able to make it
included in the next Emacs 28.1 version.

Best,

-- 
 Bastien


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-09-26 13:22             ` Bastien
@ 2021-09-26 15:14               ` Emily Bourke
  2021-09-27  8:33                 ` Emily Bourke
  0 siblings, 1 reply; 14+ messages in thread
From: Emily Bourke @ 2021-09-26 15:14 UTC (permalink / raw)
  To: bzg; +Cc: kyle, emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 317 bytes --]

Hi Bastien,

> Tomorrow for sure, and probably next week will be okay too.

Perfect, I’ll get back to you once I’ve taken a look tomorrow.

> To be clear, I hope we can release 9.5 soon to be able to make it
> included in the next Emacs 28.1 version.

Understood, thanks for the clarification.

Best wishes,
Emily

[-- Attachment #2: Type: text/html, Size: 366 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-09-26 15:14               ` Emily Bourke
@ 2021-09-27  8:33                 ` Emily Bourke
  2021-09-27  8:48                   ` Bastien
  0 siblings, 1 reply; 14+ messages in thread
From: Emily Bourke @ 2021-09-27  8:33 UTC (permalink / raw)
  To: bzg; +Cc: kyle, emacs-orgmode

Hi Bastien,

It looks like the same changes have been made separately and merged into master since my original email – see commit aa0fa8c75360d2aa491b9ae10e59d22de2aedc92 by Gustav Wikström.

Best wishes,
Emily


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-09-27  8:33                 ` Emily Bourke
@ 2021-09-27  8:48                   ` Bastien
  2021-09-27  8:50                     ` Emily Bourke
       [not found]                     ` <DB6PR02MB3174059E6DD21014657284A0DAA79@DB6PR02MB3174.eurprd02.prod.outlook.com>
  0 siblings, 2 replies; 14+ messages in thread
From: Bastien @ 2021-09-27  8:48 UTC (permalink / raw)
  To: Emily Bourke; +Cc: kyle, emacs-orgmode, Gustav Wikström

Hi Emily,

Emily Bourke <undergroundquizscene@protonmail.com> writes:

> It looks like the same changes have been made separately and merged
> into master since my original email – see commit
> aa0fa8c75360d2aa491b9ae10e59d22de2aedc92 by Gustav Wikström.

oh, you're right!  Sorry I pinged you for nothing on this.

Gustav, for next time, don't forget to add the original author as the
author of the commit, thanks.

-- 
 Bastien


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
  2021-09-27  8:48                   ` Bastien
@ 2021-09-27  8:50                     ` Emily Bourke
       [not found]                     ` <DB6PR02MB3174059E6DD21014657284A0DAA79@DB6PR02MB3174.eurprd02.prod.outlook.com>
  1 sibling, 0 replies; 14+ messages in thread
From: Emily Bourke @ 2021-09-27  8:50 UTC (permalink / raw)
  To: Bastien; +Cc: kyle, emacs-orgmode, Gustav Wikström

Hi Bastien,

> oh, you're right! Sorry I pinged you for nothing on this.

No worries!

Best wishes,
Emily


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Fwd: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing
       [not found]                           ` <874ka5pwxo.fsf@gnu.org>
@ 2021-10-03  1:50                             ` Gustav Wikström
  0 siblings, 0 replies; 14+ messages in thread
From: Gustav Wikström @ 2021-10-03  1:50 UTC (permalink / raw)
  To: emacs-orgmode@gnu.org; +Cc: undergroundquizscene@protonmail.com

[-- Attachment #1: Type: text/plain, Size: 732 bytes --]

FYI. It seems we fixed the same issue half a year apart by chance..!

Maybe if we had a forge or smth this would have been solved already by Emily's patch beginning 2021. One can always wish, right? ;-)

Best
Gustav

________________________________
From: Bastien <bzg@gnu.org>
Sent: Monday, September 27, 2021 10:48 PM
To: Gustav Wikström
Subject: Re: [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing

Gustav Wikström <gustav@whil.se> writes:

> Aha, hmm... I don't remember commiting any other persons code. I
> honestly think it's just me fixing the same thing thing that
> apparently already had a patch without me knowing it!

I'm glad to read this :) - thanks, then!

--
 Bastien

[-- Attachment #2: Type: text/html, Size: 2170 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-10-03  2:02 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-01 19:58 [PATCH] ox-publish.el: Speed up org-publish-cache-file-needs-publishing Emily Bourke
2021-01-04  3:28 ` Kyle Meyer
2021-01-06 20:58   ` Emily Bourke
2021-05-01 10:22     ` Bastien
2021-05-05 18:55       ` Emily Bourke
2021-09-26 10:26         ` Bastien
2021-09-26 11:58           ` Emily Bourke
2021-09-26 13:22             ` Bastien
2021-09-26 15:14               ` Emily Bourke
2021-09-27  8:33                 ` Emily Bourke
2021-09-27  8:48                   ` Bastien
2021-09-27  8:50                     ` Emily Bourke
     [not found]                     ` <DB6PR02MB3174059E6DD21014657284A0DAA79@DB6PR02MB3174.eurprd02.prod.outlook.com>
     [not found]                       ` <87czotpyk0.fsf@bzg.fr>
     [not found]                         ` <A2476131-8275-48D6-AF10-902CF347C1F9@whil.se>
     [not found]                           ` <874ka5pwxo.fsf@gnu.org>
2021-10-03  1:50                             ` Fwd: " Gustav Wikström
2021-01-07  1:11   ` Dr. Arne Babenhauserheide

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).