From: Ihor Radchenko <yantar92@gmail.com>
To: Jamie Matthews <jdm204@cam.ac.uk>,
Nicolas Goaziou <mail@nicolasgoaziou.fr>
Cc: "emacs-orgmode@gnu.org" <emacs-orgmode@gnu.org>
Subject: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
Date: Sat, 19 Mar 2022 19:47:35 +0800 [thread overview]
Message-ID: <877d8qf9k8.fsf@localhost> (raw)
In-Reply-To: <LO2P265MB1758E12E04832DC712F5A8E9DC149@LO2P265MB1758.GBRP265.PROD.OUTLOOK.COM>
[-- Attachment #1: Type: text/plain, Size: 1400 bytes --]
Jamie Matthews <jdm204@cam.ac.uk> writes:
> I can confirm that the key turns red on insert when I altered the key outside of emacs (with that second version of `org-cite-basic--parse-bibliography`).
Great. Then, I am attaching the patch with the new version of the
function. I will let Nicolas decide if it is good enough to be merged. I
do not feel confident enough with org-cite code to judge if my approach
is not missing some edge cases.
> However, I'm now noticing that the hang improvement earlier (< a second down from ~10) doesn't always occur.
>
> Specifically, if I
>
> 1. emacs -Q
> 2. eval your code in scratch
> 3. C-x C-f to the org file
>
> I get the hang. However, if I then
>
>
> 1. kill the org buffer
> 2. eval the code again
> 3. re-find the org file
>
> the hang is gone. Without evaling your code in between, killing the org buffer and finding it again in the same emacs session reproduces the hang everytime, which was probably what I did the first time I report the improvement, as in I didn't check it worked from startup.
It is most likely because you defun the function after emacs -Q before
org is loaded, then open Org file (opening Org file autoloads org-mode),
and then org-mode overwrites the manual defun.
If I am correct, putting (require 'oc-basic) before defun will fix what
you are seeing.
Best,
Ihor
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-oc-basic-Speed-up-cached-bibliography-retrival.patch --]
[-- Type: text/x-patch, Size: 4266 bytes --]
From 100b722708c68bc65af637c3ad4e289943cccd7c Mon Sep 17 00:00:00 2001
Message-Id: <100b722708c68bc65af637c3ad4e289943cccd7c.1647690044.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Sat, 19 Mar 2022 19:24:55 +0800
Subject: [PATCH] oc-basic: Speed up cached bibliography retrival
* lisp/oc-basic.el (org-cite-basic--file-id-cache): New variable
storing hash of bibliography file contents.
(org-cite-basic--parse-bibliography): Skip buffer hash calculation
when bibliography file is unchanged on disk. Calculating buffer hash
on every call is slow when bibliography file is large.
* lisp/org-compat.el:
(org-file-has-changed-p--hash-table, org-file-has-changed-p): Define
`file-has-changed-p' from Emacs 29 if necessary.
See https://list.orgmode.org/LO2P265MB1758E12E04832DC712F5A8E9DC149@LO2P265MB1758.GBRP265.PROD.OUTLOOK.COM/T/#t
---
lisp/oc-basic.el | 11 +++++++++--
lisp/org-compat.el | 29 +++++++++++++++++++++++++++++
2 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el
index e2dfc0603..d1ea620ad 100644
--- a/lisp/oc-basic.el
+++ b/lisp/oc-basic.el
@@ -233,6 +233,8 @@ (defun org-cite-basic--parse-bibtex (dialect)
entries)))
entries))
+(defvar org-cite-basic--file-id-cache nil
+ "Hash table linking files to their hash.")
(defun org-cite-basic--parse-bibliography (&optional info)
"List all entries available in the buffer.
@@ -245,14 +247,19 @@ (defun org-cite-basic--parse-bibliography (&optional info)
as symbols, and values as strings or nil.
Optional argument INFO is the export state, as a property list."
+ (unless (hash-table-p org-cite-basic--file-id-cache)
+ (setq org-cite-basic--file-id-cache (make-hash-table :test #'equal)))
(if (plist-member info :cite-basic/bibliography)
(plist-get info :cite-basic/bibliography)
(let ((results nil))
(dolist (file (org-cite-list-bibliography-files))
(when (file-readable-p file)
(with-temp-buffer
- (insert-file-contents file)
- (let* ((file-id (cons file (org-buffer-hash)))
+ (when (or (org-file-has-changed-p file)
+ (not (gethash file org-cite-basic--file-id-cache)))
+ (insert-file-contents file)
+ (puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
+ (let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache)))
(entries
(or (cdr (assoc file-id org-cite-basic--bibliography-cache))
(let ((table
diff --git a/lisp/org-compat.el b/lisp/org-compat.el
index 38d330de6..e96dcba8b 100644
--- a/lisp/org-compat.el
+++ b/lisp/org-compat.el
@@ -73,6 +73,35 @@ (defvar org-table-dataline-regexp)
(defvar org-table-tab-recognizes-table.el)
(defvar org-table1-hline-regexp)
+\f
+;;; Emacs < 29 compatibility
+
+(defvar org-file-has-changed-p--hash-table (make-hash-table :test #'equal)
+ "Internal variable used by `org-file-has-changed-p'.")
+
+(if (fboundp 'file-has-changed-p)
+ (defalias org-file-has-changed-p #'file-has-changed-p)
+ (defun org-file-has-changed-p (file &optional tag)
+ "Return non-nil if FILE has changed.
+The size and modification time of FILE are compared to the size
+and modification time of the same FILE during a previous
+invocation of `org-file-has-changed-p'. Thus, the first invocation
+of `org-file-has-changed-p' always returns non-nil when FILE exists.
+The optional argument TAG, which must be a symbol, can be used to
+limit the comparison to invocations with identical tags; it can be
+the symbol of the calling function, for example."
+ (let* ((file (directory-file-name (expand-file-name file)))
+ (remote-file-name-inhibit-cache t)
+ (fileattr (file-attributes file 'integer))
+ (attr (and fileattr
+ (cons (file-attribute-size fileattr)
+ (file-attribute-modification-time fileattr))))
+ (sym (concat (symbol-name tag) "@" file))
+ (cachedattr (gethash sym org-file-has-changed-p--hash-table)))
+ (when (not (equal attr cachedattr))
+ (puthash sym attr org-file-has-changed-p--hash-table)))))
+
+
\f
;;; Emacs < 28.1 compatibility
--
2.34.1
next prev parent reply other threads:[~2022-03-19 12:14 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-14 14:45 [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)] Jamie Matthews
2022-03-16 13:01 ` Bruce D'Arcus
2022-03-19 8:28 ` Ihor Radchenko
2022-03-19 8:57 ` Jamie Matthews
2022-03-19 9:23 ` Ihor Radchenko
2022-03-19 9:25 ` Jamie Matthews
2022-03-19 9:57 ` Ihor Radchenko
2022-03-19 10:12 ` Jamie Matthews
2022-03-19 10:28 ` Ihor Radchenko
2022-03-19 11:17 ` Jamie Matthews
2022-03-19 11:47 ` Ihor Radchenko [this message]
2022-03-19 12:01 ` [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library Jamie Matthews
2022-03-19 12:12 ` Ihor Radchenko
2022-03-19 20:13 ` psychosis
2022-03-20 4:20 ` Ihor Radchenko
2022-03-21 16:51 ` psychosis
2022-03-22 12:27 ` Ihor Radchenko
2022-03-22 16:42 ` psychosis
2022-03-23 11:07 ` Ihor Radchenko
2022-04-16 10:11 ` Ihor Radchenko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: https://www.orgmode.org/
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=877d8qf9k8.fsf@localhost \
--to=yantar92@gmail.com \
--cc=emacs-orgmode@gnu.org \
--cc=jdm204@cam.ac.uk \
--cc=mail@nicolasgoaziou.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).