To: emacs-orgmode@gnu.org
From: Max Nikulin
Subject: Re: [PATCH] Re: tangle option to not write a file with same contents?
Date: Tue, 17 May 2022 22:39:30 +0700
References: <583051.1635393898@apollo2.minshall.org> <12cd2bb0-584e-dcff-baca-1ed27d0281ff@gmail.com> <878rrcws6q.fsf@localhost>
In-Reply-To: <878rrcws6q.fsf@localhost>

On 08/05/2022 11:42, Ihor Radchenko wrote:
>> On 28/10/2021 11:04, Greg Minshall wrote:
>>>
>>> i wonder if it would be reasonable to add an option such that, when
>>> tangling, `org-babel-tangle` would not write a file with the
>>> already-existing contents of the target file?
[...]
> The patch is attached.
>
> On SSD, when tangling into ~200 files, the patch speeds up tangling
> by almost 2x: before 7.6 sec; after 4.4 sec.

By mistake I sent my reply to Ihor off-list, so part of the discussion
is missing from the list archives. My only excuse is that a copy of a
message received as Cc does not have a List-Post header, so the reply
action works as it does for private messages.

Ihor Radchenko. [BUG] org-babel-load-file can not compile file.
Fri, 13 May 2022 18:38:14 +0800.
https://list.orgmode.org/878rr5ra3t.fsf@localhost

This is a patch from another thread that should be a part of this
change.

> diff --git a/lisp/org.el b/lisp/org.el
> index 47a16e94b..09a001414 100644
> --- a/lisp/org.el
> +++ b/lisp/org.el
> @@ -256,6 +256,11 @@ (defun org-babel-load-file (file &optional compile)
>              tangled-file
>              (file-attribute-modification-time
>               (file-attributes (file-truename file))))
> +      ;; Make sure that tangled file modification time is
> +      ;; updated even when `org-babel-tangle-file' does not make changes.
> +      ;; This avoids re-tangling changed FILE where the changes did
> +      ;; not affect the tangled code.
> +      (set-file-times tangled-file)
>        (org-babel-tangle-file file
>                               tangled-file
>                               (rx string-start

`set-file-times' signals an error if the file does not exist, so I
expect that the call should come after `org-babel-tangle-file';
otherwise the first invocation for a new Org file will fail.
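The ordering suggested above can be sketched as follows. This is only
an illustration, not the actual body of `org-babel-load-file': the
helper name `my/tangle-then-touch' is hypothetical, and the
`file-exists-p' guard is an added assumption for Org files that
produce no tangled output.

```elisp
(require 'ob-tangle)

;; Hypothetical helper: tangle first, then refresh the modification time.
(defun my/tangle-then-touch (org-file tangled-file)
  "Tangle ORG-FILE into TANGLED-FILE, then update its modification time.
The time stamp is refreshed even when tangling left the file untouched."
  (org-babel-tangle-file org-file tangled-file)
  ;; Assumption: TANGLED-FILE may be absent when ORG-FILE has no
  ;; tangleable blocks, so avoid signaling in that case.
  (when (file-exists-p tangled-file)
    (set-file-times tangled-file)))
```

Because the time stamp is set only after tangling, the first run for a
new Org file has nothing to fail on.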
I would prefer to avoid touching the tangled file at all, but that
makes it impossible to check whether the file is up to date (at least
without saving hashes somewhere, which is an unnecessary complication
here). With the optimized writing of the tangled file, the overall
behavior is rather close to the original approach, so `set-file-times'
should be OK.

Ihor Radchenko, off-list. [PATCH v2] Re: tangle option to not write a
file with same contents? Mon, 09 May 2022 21:22:55 +0800.

> diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
> index 16d938afb..76243f83f 100644
> --- a/lisp/ob-tangle.el
> +++ b/lisp/ob-tangle.el
> @@ -282,11 +282,24 @@ (defun org-babel-tangle (&optional arg target-file lang-re)
>                          lspecs)
>                    (when make-dir
>                      (make-directory fnd 'parents))
> -                  ;; erase previous file
> -                  (when (file-exists-p file-name)
> -                    (delete-file file-name))
> -                  (write-region nil nil file-name)
> -                  (mapc (lambda (mode) (set-file-modes file-name mode)) modes)
> +                  (unless
> +                      (when (file-exists-p file-name)
> +                        (let ((tangle-buf (current-buffer)))
> +                          (with-temp-buffer
> +                            (insert-file-contents file-name)
> +                            (and
> +                             (equal (buffer-size)
> +                                    (buffer-size tangle-buf))
> +                             (= 0
> +                                (let (case-fold-search)
> +                                  (compare-buffer-substrings
> +                                   nil nil nil
> +                                   tangle-buf nil nil)))))))
> +                    ;; erase previous file
> +                    (when (file-exists-p file-name)
> +                      (delete-file file-name))
> +                    (write-region nil nil file-name)
> +                    (mapc (lambda (mode) (set-file-modes file-name mode)) modes))
>                    (push file-name path-collector))))))
>      (if (equal arg '(4))
>          (org-babel-tangle-single-block 1 t)

I do not like the (unless (when ...)) composition. If I remember
correctly, `when' should be used for side effects, so `and' may be more
suitable here. Otherwise it looks like what Greg suggested, and it
should work faster than the first variant of this patch.

My mistake caused a significant delay, so feel free to ignore the
comments below.
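A minimal sketch of the `and'-based form mentioned above, with the
comparison pulled out into a predicate. Both function names are
hypothetical, and unlike the quoted patch this sketch does not delete
the old file or restore file modes; it only shows the control flow.

```elisp
;; Hypothetical predicate: the comparison from the patch, using `and'
;; instead of `when' because only the return value matters.
(defun my/file-matches-buffer-p (file-name &optional buffer)
  "Return non-nil if FILE-NAME exists and its contents equal BUFFER's.
BUFFER defaults to the current buffer."
  (let ((buf (or buffer (current-buffer))))
    (and (file-exists-p file-name)
         (with-temp-buffer
           (insert-file-contents file-name)
           (and (= (buffer-size) (buffer-size buf))
                ;; Bind `case-fold-search' to nil so that the
                ;; comparison is case-sensitive.
                (let ((case-fold-search nil))
                  (zerop (compare-buffer-substrings
                          nil nil nil buf nil nil))))))))

;; Hypothetical wrapper showing how the predicate would be used.
(defun my/write-buffer-if-changed (file-name)
  "Write the current buffer to FILE-NAME unless the contents match.
Return non-nil if the file was written."
  (unless (my/file-matches-buffer-p file-name)
    (write-region nil nil file-name)
    t))
```

With the predicate separated, the caller reads as a plain
(unless PREDICATE WRITE) without a nested `when'.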
I still had a hope that `org-babel-load-file' might be improved a bit
by using `byte-recompile-file' with 0 passed for ARG (previously I
incorrectly wrote FORCE). The goal is to avoid recompiling the tangled
.el file if it has not changed.

I am still curious whether it is reliable to compare the file size from
`file-attributes' with (+ 1 (bufferpos-to-filepos (buffer-size))) for
the tangle result prior to loading the existing file. I am unsure due
to variations in encodings and newline formats; however, it might
further improve performance when the tangle result changes.

I have noticed that `org-babel-tangle-file' may create an empty Org
file if it does not exist. From my point of view, this is questionable
behavior.

Finally, some comments on the performance numbers. Ihor, your test
simulates iterative debugging. Tangle results were likely in disk
caches. Another case may give different numbers. Consider a single pass
after a small modification of the source .org file. For the comparison,
the existing files mostly have to be loaded from disk.

I did not mean disabling disk caches completely. After tangling it may
take some time until files are actually written to disk. I am unsure
whether during repetitive benchmarking some files may be replaced in
caches without being written to disk at all; likely the timeout for
dirty cache pages is small enough.

Outline of a fairer performance test (however, I do not think that
such accuracy is really required):
- purge disk caches, so earlier tangled files have to be loaded from
  disk
- tangle
- flush caches (sync) to complete the cycle.

And of course, tangling to a single large file is not the same as
tangling to multiple small ones.

Leaving aside further changes and details of benchmarking, I hope these
2 patches will improve the experience for make users and will not break
anything in Org.
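The `byte-recompile-file' idea from earlier in this message could look
roughly like the sketch below. It is not how `org-babel-load-file' is
implemented; the helper name `my/compile-if-stale-and-load' is
hypothetical, and the sketch assumes the tangled .el file already
exists.

```elisp
;; Hypothetical helper: compile only when needed, then load.
(defun my/compile-if-stale-and-load (el-file)
  "Byte-compile EL-FILE only if its .elc is missing or older, then load it."
  ;; FORCE is nil, so an up-to-date .elc is left alone.  ARG is 0, so
  ;; the file is compiled even if no .elc exists yet.
  (byte-recompile-file el-file nil 0)
  ;; Without an extension `load' tries the .elc file before the .el file.
  (load (file-name-sans-extension el-file)))
```

If the tangled file keeps its old modification time when tangling
changes nothing, a second call skips the compilation step entirely.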