From: Max Nikulin <email@example.com>
Subject: Re: [PATCH] Re: tangle option to not write a file with same contents?
Date: Tue, 17 May 2022 22:39:30 +0700 [thread overview]
Message-ID: <firstname.lastname@example.org> (raw)
On 08/05/2022 11:42, Ihor Radchenko wrote:
>> On 28/10/2021 11:04, Greg Minshall wrote:
>>> i wonder if it would be reasonable to add an option such that, when
>>> tangling, `org-babel-tangle` would not write a file with the
>>> already-existing contents of the target file?
> The patch is attached.
> On SSD, when tangling into ~200 files, the patch speeds up tangling
> by almost 2x: before 7.6 sec; after 4.4 sec.
By mistake I sent my reply to Ihor off-list, so a part of discussion is
missed in the list archives. The only excuse is that a copy of message
received as Cc does not have List-Post header, so reply action works as
for private messages.
Ihor Radchenko. [BUG] org-babel-load-file can not compile file. Fri, 13
May 2022 18:38:14 +0800. https://list.orgmode.org/878rr5ra3t.fsf@localhost
This is a patch from another thread that should be a part of this change.
> diff --git a/lisp/org.el b/lisp/org.el
> index 47a16e94b..09a001414 100644
> --- a/lisp/org.el
> +++ b/lisp/org.el
> @@ -256,6 +256,11 @@ (defun org-babel-load-file (file &optional compile)
> (file-attributes (file-truename file))))
> + ;; Make sure that tangled file modification time is
> + ;; updated even when `org-babel-tangle-file' does not make changes.
> + ;; This avoids re-tangling changed FILE where the changes did
> + ;; not affect the tangled code.
> + (set-file-times tangled-file)
> (org-babel-tangle-file file
> (rx string-start
`set-file-times' signals if the file does not exist, so I expect that
the call should be after `org-babel-tangle-file' otherwise first
invocation for a new org file will fail. I would prefer to avoid
touching the tangled file at all, but it makes impossible to check if
the file is up to date (at least without saving hashes somewhere, that
is unnecessary complication here). With optimizing of writhing of the
tangled file overall behavior is rather close to original approach, so
`set-file-times' should be OK.
Ihor Radchenko, off-list [PATCH v2] Re: tangle option to not write a
file with same contents? Mon, 09 May 2022 21:22:55 +0800.
> diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
> index 16d938afb..76243f83f 100644
> --- a/lisp/ob-tangle.el
> +++ b/lisp/ob-tangle.el
> @@ -282,11 +282,24 @@ (defun org-babel-tangle (&optional arg target-file lang-re)
> (when make-dir
> (make-directory fnd 'parents))
> - ;; erase previous file
> - (when (file-exists-p file-name)
> - (delete-file file-name))
> - (write-region nil nil file-name)
> - (mapc (lambda (mode) (set-file-modes file-name mode)) modes)
> + (unless
> + (when (file-exists-p file-name)
> + (let ((tangle-buf (current-buffer)))
> + (with-temp-buffer
> + (insert-file-contents file-name)
> + (and
> + (equal (buffer-size)
> + (buffer-size tangle-buf))
> + (= 0
> + (let (case-fold-search)
> + (compare-buffer-substrings
> + nil nil nil
> + tangle-buf nil nil)))))))
> + ;; erase previous file
> + (when (file-exists-p file-name)
> + (delete-file file-name))
> + (write-region nil nil file-name)
> + (mapc (lambda (mode) (set-file-modes file-name mode)) modes))
> (push file-name path-collector))))))
> (if (equal arg '(4))
> (org-babel-tangle-single-block 1 t)
I do not like (unless (when ...)) composition. If I remember correctly,
`when' should be used for side effects, so `and' may be more suitable
here. Otherwise it looks like what Greg suggested and should work faster
than first variant of this patch.
My fault caused significant delay, so feel free to ignore comments below.
I still had a hope that `org-babel-load-file' might be improved a bit by
using `byte-recompile-file' with 0 passed for ARG (previously I
incorrectly wrote FORCE). The goal is to avoid recompiling the tangled
.el file if it is not changed.
I am still curious if it is reliable to compare file size from
`file-attributes' with (+ 1 (bufferpos-to-filepos (buffer-size))) for
tangle result prior to loading existing file. I am unsure due to
variations in encodings and newline formats, however it might further
improve performance then tangle result changes.
I have noticed that `org-babel-tangle-file' may create empty org file if
it does not exist. From my point of view it is questionable behavior.
Finally some comments on performance numbers. Ihor, your test simulates
iterative debugging. Tangle results were likely in disk caches. Another
case may give different numbers. Consider single pass after small
modification of the source .org file. For comparison existing files are
mostly should be loaded from disk. I did not mean disabling disk caches
completely. After tangling it may take some time till files are actually
written to disk. I am unsure if during repetitive benchmarking some
files may be replaced in caches without writing to disk at all, likely
timeout for dirty cache pages is small enough.
Outline of more fair performance test (however I do not think that such
accuracy is really required):
- purge disk caches, so earlier tangled files have to be loaded from disk
- flush caches (sync) to complete cycle.
And of course, tangling to single large file is not the same as multiple
Leaving aside further changes and details of benchmarking, I hope these
2 patches will improve experience for make users and will not break
anything in Org.
next prev parent reply other threads:[~2022-05-17 15:40 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-28 4:04 tangle option to not write a file with same contents? Greg Minshall
2021-10-29 16:21 ` Max Nikulin
2021-10-29 17:58 ` Greg Minshall
2021-10-30 15:13 ` Max Nikulin
2021-10-30 16:13 ` Greg Minshall
2022-05-07 8:05 ` Max Nikulin
2022-05-08 4:42 ` [PATCH] " Ihor Radchenko
2022-05-17 15:39 ` Max Nikulin [this message]
2022-05-30 3:14 ` Ihor Radchenko
2022-05-31 16:07 ` Max Nikulin
2022-06-03 7:04 ` Ihor Radchenko
2022-06-07 3:47 ` Tom Gillespie
2022-06-01 13:18 ` Greg Minshall
2022-09-12 17:36 ` Org version mismatch -- hooray! Greg Minshall
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
List information: https://www.orgmode.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).