emacs-orgmode@gnu.org archives
From: Ihor Radchenko <yantar92@gmail.com>
To: Max Nikulin <manikulin@gmail.com>
Cc: Greg Minshall <minshall@umich.edu>,  Org Mode <emacs-orgmode@gnu.org>
Subject: [PATCH] Re: tangle option to not write a file with same contents?
Date: Sun, 08 May 2022 12:42:53 +0800
Message-ID: <878rrcws6q.fsf@localhost>
In-Reply-To: <12cd2bb0-584e-dcff-baca-1ed27d0281ff@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 846 bytes --]

Max Nikulin <manikulin@gmail.com> writes:

> On 28/10/2021 11:04, Greg Minshall wrote:
>> 
>> i wonder if it would be reasonable to add an option such that, when
>> tangling, `org-babel-tangle` would not write a file with the
>> already-existing contents of the target file?
>> 
>> this would be helpful, e.g., for those of us who use make(1)-based work
>> flows.
>
> It was not obvious to me earlier that it should specifically be an
> *option*, not just a change of behavior, since e.g.
> `org-babel-load-file' relies on timestamp comparison of the source
> .org file and the derived .el file. I am unsure about the default
> value of such a setting.

I agree that it should be the default behaviour.
The patch is attached.

On SSD, when tangling into ~200 files, the patch speeds up tangling
by about 1.7x: 7.6 sec per run before; 4.4 sec after.

Best,
Ihor


[-- Attachment #2: 0001-org-tangle-Do-not-overwrite-when-contents-does-not-c.patch --]
[-- Type: text/x-patch, Size: 2212 bytes --]

From 68d90e73da17e423220211897ad1e86a4eb2e5a1 Mon Sep 17 00:00:00 2001
Message-Id: <68d90e73da17e423220211897ad1e86a4eb2e5a1.1651984776.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Sun, 8 May 2022 12:32:40 +0800
Subject: [PATCH] org-tangle: Do not overwrite when contents does not change

* lisp/ob-tangle.el (org-babel-tangle): Do not overwrite existing
tangled files if their contents are exactly the same as what we are
going to write during the tangle process.  This avoids unneeded disk
writes and can speed up tangling significantly when many small files
are tangled from a single .org source.

An example of performance improvement when tangling an .org file into
~200 files:
(benchmark-run 10 (org-babel-tangle))
Before the commit (on SSD): (76.33826743 8 11.551725374)
After the commit:           (43.628606052 4 5.751274237)
---
 lisp/ob-tangle.el | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index 16d938afb..c011bf662 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -282,11 +282,17 @@ (defun org-babel-tangle (&optional arg target-file lang-re)
 		    lspecs)
 		   (when make-dir
 		     (make-directory fnd 'parents))
-                   ;; erase previous file
-                   (when (file-exists-p file-name)
-                     (delete-file file-name))
-		   (write-region nil nil file-name)
-		   (mapc (lambda (mode) (set-file-modes file-name mode)) modes)
+                   (unless
+                       (let ((new-contents-hash (buffer-hash)))
+                         (with-temp-buffer
+                           (when (file-exists-p file-name)
+                             (insert-file-contents file-name))
+                           (equal (buffer-hash) new-contents-hash)))
+                     ;; erase previous file
+                     (when (file-exists-p file-name)
+                       (delete-file file-name))
+		     (write-region nil nil file-name)
+		     (mapc (lambda (mode) (set-file-modes file-name mode)) modes))
                    (push file-name path-collector))))))
 	 (if (equal arg '(4))
 	     (org-babel-tangle-single-block 1 t)
-- 
2.35.1
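
For readers who want to try the idea outside Org, the compare-before-write
step in the patch can be sketched as a standalone helper.  This is a minimal
illustration, not the code from the patch: the function name
`demo-write-file-if-changed' is hypothetical, it compares strings directly
instead of using `buffer-hash', and it assumes that a missing target file
always counts as changed.

```elisp
;; Hypothetical helper (not part of Org): write STRING to FILE only when
;; FILE is missing or its current contents differ from STRING.
(defun demo-write-file-if-changed (file string)
  "Write STRING to FILE unless FILE already holds exactly STRING.
Return t when the file was written, nil when it was left untouched."
  (let ((unchanged
         (and (file-exists-p file)
              (with-temp-buffer
                ;; Read the existing contents into a scratch buffer.
                (insert-file-contents file)
                (string= (buffer-string) string)))))
    (unless unchanged
      ;; A string as the first argument makes `write-region' write that
      ;; string; a non-nil, non-t VISIT suppresses the "Wrote" message.
      (write-region string nil file nil 'silent)
      t)))

;; Usage: `make-temp-file' creates an empty file, so the first call writes.
(let ((file (make-temp-file "tangle-demo-")))
  (unwind-protect
      (list (demo-write-file-if-changed file "echo hello\n")  ; t, written
            (demo-write-file-if-changed file "echo hello\n")) ; nil, skipped
    (delete-file file)))
```

Because an unchanged file is never rewritten, its modification time stays
put, which is what a make(1)-based workflow needs to avoid rebuilding
downstream targets.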



Thread overview: 14+ messages
2021-10-28  4:04 tangle option to not write a file with same contents? Greg Minshall
2021-10-29 16:21 ` Max Nikulin
2021-10-29 17:58   ` Greg Minshall
2021-10-30 15:13     ` Max Nikulin
2021-10-30 16:13       ` Greg Minshall
2022-05-07  8:05 ` Max Nikulin
2022-05-08  4:42   ` Ihor Radchenko [this message]
2022-05-17 15:39     ` [PATCH] " Max Nikulin
2022-05-30  3:14       ` Ihor Radchenko
2022-05-31 16:07         ` Max Nikulin
2022-06-03  7:04           ` Ihor Radchenko
2022-06-07  3:47             ` Tom Gillespie
2022-06-01 13:18     ` Greg Minshall
2022-09-12 17:36       ` Org version mismatch -- hooray! Greg Minshall
