emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: Ihor Radchenko <yantar92@gmail.com>
To: Jamie Matthews <jdm204@cam.ac.uk>,
	Nicolas Goaziou <mail@nicolasgoaziou.fr>
Cc: "emacs-orgmode@gnu.org" <emacs-orgmode@gnu.org>
Subject: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
Date: Sat, 19 Mar 2022 19:47:35 +0800	[thread overview]
Message-ID: <877d8qf9k8.fsf@localhost> (raw)
In-Reply-To: <LO2P265MB1758E12E04832DC712F5A8E9DC149@LO2P265MB1758.GBRP265.PROD.OUTLOOK.COM>

[-- Attachment #1: Type: text/plain, Size: 1400 bytes --]

Jamie Matthews <jdm204@cam.ac.uk> writes:

> I can confirm that the key turns red on insert when I altered the key outside of emacs (with that second version of `org-cite-basic--parse-bibliography`).

Great. Then, I am attaching the patch with the new version of the
function. I will let Nicolas decide if it is good enough to be merged. I
do not feel confident enough with org-cite code to judge if my approach
is not missing some edge cases.

> However, I'm now noticing that the hang improvement earlier (< a second down from ~10) doesn't always occur.
> Specifically, if I
>   1.  emacs -Q
>   2.  eval your code in scratch
>   3.  C-x C-f to the org file
> I get the hang. However, if I then
>   1.  kill the org buffer
>   2.  eval the code again
>   3.  re-find the org file
> the hang is gone. Without eval​ing your code in between, killing the org buffer and finding it again in the same emacs session reproduces the hang everytime, which was probably what I did the first time I report the improvement, as in I didn't check it worked from startup.

It is most likely because you defun the function after emacs -Q before
org is loaded, then open Org file (opening Org file autoloads org-mode),
and then org-mode overwrites the manual defun.
If I am correct, putting (require 'oc-basic) before defun will fix what
you are seeing.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-oc-basic-Speed-up-cached-bibliography-retrival.patch --]
[-- Type: text/x-patch, Size: 4266 bytes --]

From 100b722708c68bc65af637c3ad4e289943cccd7c Mon Sep 17 00:00:00 2001
Message-Id: <100b722708c68bc65af637c3ad4e289943cccd7c.1647690044.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Sat, 19 Mar 2022 19:24:55 +0800
Subject: [PATCH] oc-basic: Speed up cached bibliography retrival

* lisp/oc-basic.el (org-cite-basic--file-id-cache): New variable
storing hash of bibliography file contents.
(org-cite-basic--parse-bibliography): Skip buffer hash calculation
when bibliography file is unchanged on disk.  Calculating buffer hash
on every call is slow when bibliography file is large.

* lisp/org-compat.el:
(org-file-has-changed-p--hash-table, org-file-has-changed-p): Define
`file-has-changed-p' from Emacs 29 if necessary.

See https://list.orgmode.org/LO2P265MB1758E12E04832DC712F5A8E9DC149@LO2P265MB1758.GBRP265.PROD.OUTLOOK.COM/T/#t
 lisp/oc-basic.el   | 11 +++++++++--
 lisp/org-compat.el | 29 +++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el
index e2dfc0603..d1ea620ad 100644
--- a/lisp/oc-basic.el
+++ b/lisp/oc-basic.el
@@ -233,6 +233,8 @@ (defun org-cite-basic--parse-bibtex (dialect)
+(defvar org-cite-basic--file-id-cache nil
+  "Hash table linking files to their hash.")
 (defun org-cite-basic--parse-bibliography (&optional info)
   "List all entries available in the buffer.
@@ -245,14 +247,19 @@ (defun org-cite-basic--parse-bibliography (&optional info)
 as symbols, and values as strings or nil.
 Optional argument INFO is the export state, as a property list."
+  (unless (hash-table-p org-cite-basic--file-id-cache)
+    (setq org-cite-basic--file-id-cache (make-hash-table :test #'equal)))
   (if (plist-member info :cite-basic/bibliography)
       (plist-get info :cite-basic/bibliography)
     (let ((results nil))
       (dolist (file (org-cite-list-bibliography-files))
         (when (file-readable-p file)
-            (insert-file-contents file)
-	    (let* ((file-id (cons file (org-buffer-hash)))
+            (when (or (org-file-has-changed-p file)
+                      (not (gethash file org-cite-basic--file-id-cache)))
+              (insert-file-contents file)
+              (puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
+	    (let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache)))
                     (or (cdr (assoc file-id org-cite-basic--bibliography-cache))
                         (let ((table
diff --git a/lisp/org-compat.el b/lisp/org-compat.el
index 38d330de6..e96dcba8b 100644
--- a/lisp/org-compat.el
+++ b/lisp/org-compat.el
@@ -73,6 +73,35 @@ (defvar org-table-dataline-regexp)
 (defvar org-table-tab-recognizes-table.el)
 (defvar org-table1-hline-regexp)
+;;; Emacs < 29 compatibility
+(defvar org-file-has-changed-p--hash-table (make-hash-table :test #'equal)
+  "Internal variable used by `org-file-has-changed-p'.")
+(if (fboundp 'file-has-changed-p)
+    (defalias org-file-has-changed-p #'file-has-changed-p)
+  (defun org-file-has-changed-p (file &optional tag)
+    "Return non-nil if FILE has changed.
+The size and modification time of FILE are compared to the size
+and modification time of the same FILE during a previous
+invocation of `org-file-has-changed-p'.  Thus, the first invocation
+of `org-file-has-changed-p' always returns non-nil when FILE exists.
+The optional argument TAG, which must be a symbol, can be used to
+limit the comparison to invocations with identical tags; it can be
+the symbol of the calling function, for example."
+    (let* ((file (directory-file-name (expand-file-name file)))
+           (remote-file-name-inhibit-cache t)
+           (fileattr (file-attributes file 'integer))
+	   (attr (and fileattr
+                      (cons (file-attribute-size fileattr)
+		            (file-attribute-modification-time fileattr))))
+	   (sym (concat (symbol-name tag) "@" file))
+	   (cachedattr (gethash sym org-file-has-changed-p--hash-table)))
+      (when (not (equal attr cachedattr))
+        (puthash sym attr org-file-has-changed-p--hash-table)))))
 ;;; Emacs < 28.1 compatibility

  reply	other threads:[~2022-03-19 12:14 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-14 14:45 [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)] Jamie Matthews
2022-03-16 13:01 ` Bruce D'Arcus
2022-03-19  8:28 ` Ihor Radchenko
2022-03-19  8:57   ` Jamie Matthews
2022-03-19  9:23     ` Ihor Radchenko
2022-03-19  9:25       ` Jamie Matthews
2022-03-19  9:57         ` Ihor Radchenko
2022-03-19 10:12           ` Jamie Matthews
2022-03-19 10:28             ` Ihor Radchenko
2022-03-19 11:17               ` Jamie Matthews
2022-03-19 11:47                 ` Ihor Radchenko [this message]
2022-03-19 12:01                   ` [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library Jamie Matthews
2022-03-19 12:12                     ` Ihor Radchenko
2022-03-19 20:13                       ` psychosis
2022-03-20  4:20                         ` Ihor Radchenko
2022-03-21 16:51                           ` psychosis
2022-03-22 12:27                             ` Ihor Radchenko
2022-03-22 16:42                               ` psychosis
2022-03-23 11:07                                 ` Ihor Radchenko
2022-04-16 10:11                   ` Ihor Radchenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877d8qf9k8.fsf@localhost \
    --to=yantar92@gmail.com \
    --cc=emacs-orgmode@gnu.org \
    --cc=jdm204@cam.ac.uk \
    --cc=mail@nicolasgoaziou.fr \


* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).