* [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
@ 2022-03-14 14:45 Jamie Matthews
2022-03-16 13:01 ` Bruce D'Arcus
2022-03-19 8:28 ` Ihor Radchenko
0 siblings, 2 replies; 20+ messages in thread
From: Jamie Matthews @ 2022-03-14 14:45 UTC (permalink / raw)
To: emacs-orgmode@gnu.org
[-- Attachment #1: Type: text/plain, Size: 5598 bytes --]
# Issue
Starting emacs with emacs -Q, then navigating to a minimal example org file with C-x C-f yields a ~10 second hang on an 8-core/16GB RAM machine with nothing else running. Also, scrolling commands like C-v are often laggy after the initial hang.
The minimal org file begins:
"""
#+bibliography: ~/cloud/library/lib.bib
[cite:@tillyPrimaryAnaplasticLargeCell1997]
"""
with another 80 citations afterwards, one-by-line, but nothing else. As mentioned in the title, the lib.bib file is ~10MB - if I swap this out for a non-existent or tiny bibtex file the problem goes away, and the in-buffer citations are rendered in a red face. Clearly from this and the below profile the issue is something to do with checking etc the citations for fontification purposes.
# profile
## cpu
1,542,884,496 99% - redisplay_internal (C function)
1,542,860,504 99% - jit-lock-function
1,542,860,504 99% - jit-lock-fontify-now
1,542,860,504 99% - jit-lock--run-functions
1,542,860,504 99% - run-hook-wrapped
1,542,860,504 99% - #<compiled 0x19ba3bad288bcf3d>
1,542,860,504 99% - font-lock-fontify-region
1,542,860,504 99% - font-lock-default-fontify-region
1,542,792,728 99% - font-lock-fontify-keywords-region
1,542,661,211 99% - org-cite-activate
1,542,542,267 99% - org-cite-basic-activate
787,037,416 50% - org-cite-basic--get-entry
4,065,194 0% + org-cite-basic--parse-bibliography
754,769,872 48% - org-cite-basic--all-keys
6,151,200 0% + seq-mapcat
3,850,126 0% + org-cite-basic--parse-bibliography
481,332 0% + org-element-interpret-data
129,376 0% + org-cite-basic--print-entry
109,615 0% + org-cite-get-references
87,264 0% org-element-citation-parser
5,988 0% org-activate-links
5,677 0% org-do-emphasis-faces
3,524 0% org-fontify-meta-lines-and-blocks
3,072 0% + org-activate-footnote-links
3,072 0% org-do-latex-and-related
7,392 0% - eval
7,392 0% if
3,072 0% kill-this-buffer-enabled-p
2,112 0% - tool-bar-make-keymap
2,112 0% - tool-bar-make-keymap-1
2,112 0% - mapcar
2,112 0% #<compiled 0x1dd4863a123beee7>
1,176 0% menu-bar-update-buffers
6,219,935 0% + command-execute
1,280 0% + timer-event-handler
24 0% + eldoc-schedule-timer
21 0% + #<compiled -0xa96583997f59b99>
0 0% ...
## memory
5778 95% - redisplay_internal (C function)
5778 95% - jit-lock-function
5778 95% - jit-lock-fontify-now
5778 95% - jit-lock--run-functions
5778 95% - run-hook-wrapped
5778 95% - #<compiled 0x19ba3bad288bcf3d>
5778 95% - font-lock-fontify-region
5778 95% - font-lock-default-fontify-region
5778 95% - font-lock-fontify-keywords-region
5774 95% - org-cite-activate
5762 95% - org-cite-basic-activate
2939 48% - org-cite-basic--get-entry
1631 26% - org-cite-basic--parse-bibliography
400 6% - set-auto-coding
400 6% find-auto-coding
20 0% + org-cite-list-bibliography-files
4 0% + #<compiled -0x1c8cad8ab65dbcdf>
4 0% after-insert-file-set-coding
2819 46% - org-cite-basic--all-keys
1515 25% - org-cite-basic--parse-bibliography
364 6% - set-auto-coding
364 6% find-auto-coding
16 0% - org-cite-list-bibliography-files
8 0% + org-collect-keywords
4 0% #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_23>
32 0% + seq-mapcat
4 0% + org-cite-basic--print-entry
4 0% org-element-citation-parser
4 0% org-activate-links
233 3% - command-execute
218 3% - byte-code
149 2% + read-extended-command
69 1% - find-file-read-args
69 1% - read-file-name
69 1% - read-file-name-default
15 0% - completing-read-default
1 0% - command-execute
1 0% - funcall-interactively
1 0% - minibuffer-complete
1 0% - completion-in-region
1 0% - completion--in-region
1 0% + #<compiled -0xf4b91251a4f5a2>
15 0% - funcall-interactively
6 0% + find-file
5 0% + execute-extended-command
4 0% scroll-up-command
26 0% + ...
4 0% + timer-event-handler
4 0% + #<compiled -0xa96583997f59b99>
# Version info:
Emacs : GNU Emacs 29.0.50 (build 3, x86_64-pc-linux-gnu, GTK+ Version 3.24.30, cairo version 1.16.0)
of 2022-03-11
Package: Org mode version 9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)
[-- Attachment #2: Type: text/html, Size: 11541 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
2022-03-14 14:45 [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)] Jamie Matthews
@ 2022-03-16 13:01 ` Bruce D'Arcus
2022-03-19 8:28 ` Ihor Radchenko
1 sibling, 0 replies; 20+ messages in thread
From: Bruce D'Arcus @ 2022-03-16 13:01 UTC (permalink / raw)
To: Jamie Matthews; +Cc: emacs-orgmode@gnu.org
Thanks for posting this!
For Nicolas et al, my more general question about performance with
oc-basic was related to this, so you can ignore that.
On Tue, Mar 15, 2022 at 1:01 PM Jamie Matthews <jdm204@cam.ac.uk> wrote:
>
> # Issue
> Starting emacs with emacs -Q, then navigating to a minimal example org file with C-x C-f yields a ~10 second hang on an 8-core/16GB RAM machine with nothing else running. Also, scrolling commands like C-v are often laggy after the initial hang.
>
> The minimal org file begins:
> """
> #+bibliography: ~/cloud/library/lib.bib
>
> [cite:@tillyPrimaryAnaplasticLargeCell1997]
> """
> with another 80 citations afterwards, one-by-line, but nothing else. As mentioned in the title, the lib.bib file is ~10MB - if I swap this out for a non-existent or tiny bibtex file the problem goes away, and the in-buffer citations are rendered in a red face. Clearly from this and the below profile the issue is something to do with checking etc the citations for fontification purposes.
>
> # profile
>
> ## cpu
> 1,542,884,496 99% - redisplay_internal (C function)
> 1,542,860,504 99% - jit-lock-function
> 1,542,860,504 99% - jit-lock-fontify-now
> 1,542,860,504 99% - jit-lock--run-functions
> 1,542,860,504 99% - run-hook-wrapped
> 1,542,860,504 99% - #<compiled 0x19ba3bad288bcf3d>
> 1,542,860,504 99% - font-lock-fontify-region
> 1,542,860,504 99% - font-lock-default-fontify-region
> 1,542,792,728 99% - font-lock-fontify-keywords-region
> 1,542,661,211 99% - org-cite-activate
> 1,542,542,267 99% - org-cite-basic-activate
> 787,037,416 50% - org-cite-basic--get-entry
> 4,065,194 0% + org-cite-basic--parse-bibliography
> 754,769,872 48% - org-cite-basic--all-keys
> 6,151,200 0% + seq-mapcat
> 3,850,126 0% + org-cite-basic--parse-bibliography
> 481,332 0% + org-element-interpret-data
> 129,376 0% + org-cite-basic--print-entry
> 109,615 0% + org-cite-get-references
> 87,264 0% org-element-citation-parser
> 5,988 0% org-activate-links
> 5,677 0% org-do-emphasis-faces
> 3,524 0% org-fontify-meta-lines-and-blocks
> 3,072 0% + org-activate-footnote-links
> 3,072 0% org-do-latex-and-related
> 7,392 0% - eval
> 7,392 0% if
> 3,072 0% kill-this-buffer-enabled-p
> 2,112 0% - tool-bar-make-keymap
> 2,112 0% - tool-bar-make-keymap-1
> 2,112 0% - mapcar
> 2,112 0% #<compiled 0x1dd4863a123beee7>
> 1,176 0% menu-bar-update-buffers
> 6,219,935 0% + command-execute
> 1,280 0% + timer-event-handler
> 24 0% + eldoc-schedule-timer
> 21 0% + #<compiled -0xa96583997f59b99>
> 0 0% ...
>
> ## memory
>
> 5778 95% - redisplay_internal (C function)
> 5778 95% - jit-lock-function
> 5778 95% - jit-lock-fontify-now
> 5778 95% - jit-lock--run-functions
> 5778 95% - run-hook-wrapped
> 5778 95% - #<compiled 0x19ba3bad288bcf3d>
> 5778 95% - font-lock-fontify-region
> 5778 95% - font-lock-default-fontify-region
> 5778 95% - font-lock-fontify-keywords-region
> 5774 95% - org-cite-activate
> 5762 95% - org-cite-basic-activate
> 2939 48% - org-cite-basic--get-entry
> 1631 26% - org-cite-basic--parse-bibliography
> 400 6% - set-auto-coding
> 400 6% find-auto-coding
> 20 0% + org-cite-list-bibliography-files
> 4 0% + #<compiled -0x1c8cad8ab65dbcdf>
> 4 0% after-insert-file-set-coding
> 2819 46% - org-cite-basic--all-keys
> 1515 25% - org-cite-basic--parse-bibliography
> 364 6% - set-auto-coding
> 364 6% find-auto-coding
> 16 0% - org-cite-list-bibliography-files
> 8 0% + org-collect-keywords
> 4 0% #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_23>
> 32 0% + seq-mapcat
> 4 0% + org-cite-basic--print-entry
> 4 0% org-element-citation-parser
> 4 0% org-activate-links
> 233 3% - command-execute
> 218 3% - byte-code
> 149 2% + read-extended-command
> 69 1% - find-file-read-args
> 69 1% - read-file-name
> 69 1% - read-file-name-default
> 15 0% - completing-read-default
> 1 0% - command-execute
> 1 0% - funcall-interactively
> 1 0% - minibuffer-complete
> 1 0% - completion-in-region
> 1 0% - completion--in-region
> 1 0% + #<compiled -0xf4b91251a4f5a2>
> 15 0% - funcall-interactively
> 6 0% + find-file
> 5 0% + execute-extended-command
> 4 0% scroll-up-command
> 26 0% + ...
> 4 0% + timer-event-handler
> 4 0% + #<compiled -0xa96583997f59b99>
>
>
>
> # Version info:
> Emacs : GNU Emacs 29.0.50 (build 3, x86_64-pc-linux-gnu, GTK+ Version 3.24.30, cairo version 1.16.0)
> of 2022-03-11
> Package: Org mode version 9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
2022-03-14 14:45 [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)] Jamie Matthews
2022-03-16 13:01 ` Bruce D'Arcus
@ 2022-03-19 8:28 ` Ihor Radchenko
2022-03-19 8:57 ` Jamie Matthews
1 sibling, 1 reply; 20+ messages in thread
From: Ihor Radchenko @ 2022-03-19 8:28 UTC (permalink / raw)
To: Jamie Matthews; +Cc: emacs-orgmode@gnu.org
Jamie Matthews <jdm204@cam.ac.uk> writes:
> # Issue
> Starting emacs with emacs -Q, then navigating to a minimal example org file with C-x C-f yields a ~10 second hang on an 8-core/16GB RAM machine with nothing else running. Also, scrolling commands like C-v are often laggy after the initial hang.
>
> The minimal org file begins:
> """
> #+bibliography: ~/cloud/library/lib.bib
>
> [cite:@tillyPrimaryAnaplasticLargeCell1997]
> """
> with another 80 citations afterwards, one-by-line, but nothing else. As mentioned in the title, the lib.bib file is ~10MB - if I swap this out for a non-existent or tiny bibtex file the problem goes away, and the in-buffer citations are rendered in a red face. Clearly from this and the below profile the issue is something to do with checking etc the citations for fontification purposes.
>
> # profile
>
> ## cpu
> 1,542,542,267 99% - org-cite-basic-activate
> 787,037,416 50% - org-cite-basic--get-entry
> 4,065,194 0% + org-cite-basic--parse-bibliography
> 754,769,872 48% - org-cite-basic--all-keys
Could you try the following:
1. Execute the following code:
(require 'elp)
(elp-restore-all)
(elp-instrument-function #'org-cite-basic--get-entry)
(elp-instrument-function #'org-cite-basic-activate)
(elp-instrument-function #'org-cite-basic--parse-bibliography)
(elp-instrument-function #'org-cite-basic--all-keys)
2. Open your org file where Emacs hangs
3. Run M-x elp-results
4. Report the ELP buffer contents here
Best,
Ihor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
2022-03-19 8:28 ` Ihor Radchenko
@ 2022-03-19 8:57 ` Jamie Matthews
2022-03-19 9:23 ` Ihor Radchenko
0 siblings, 1 reply; 20+ messages in thread
From: Jamie Matthews @ 2022-03-19 8:57 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: emacs-orgmode@gnu.org
[-- Attachment #1: Type: text/plain, Size: 2543 bytes --]
Hi Ihor,
I tried to run that code but get errors for the elp-instrument-function calls (both with my config and emacs -Q):
```
Debugger entered--Lisp error: (error "ELP cannot profile the function: org-cite-basic--g...")
error("ELP cannot profile the function: %s" org-cite-basic--get-entry)
elp-instrument-function(org-cite-basic--get-entry)
(progn (elp-instrument-function #'org-cite-basic--get-entry))
elisp--eval-last-sexp(nil)
eval-last-sexp(nil)
funcall-interactively(eval-last-sexp nil)
command-execute(eval-last-sexp)
```
Best,
Jamie
________________________________
From: Ihor Radchenko <yantar92@gmail.com>
Sent: Saturday, March 19, 2022 8:28 AM
To: Jamie Matthews <jdm204@cam.ac.uk>
Cc: emacs-orgmode@gnu.org <emacs-orgmode@gnu.org>
Subject: Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
Jamie Matthews <jdm204@cam.ac.uk> writes:
> # Issue
> Starting emacs with emacs -Q, then navigating to a minimal example org file with C-x C-f yields a ~10 second hang on an 8-core/16GB RAM machine with nothing else running. Also, scrolling commands like C-v are often laggy after the initial hang.
>
> The minimal org file begins:
> """
> #+bibliography: ~/cloud/library/lib.bib
>
> [cite:@tillyPrimaryAnaplasticLargeCell1997]
> """
> with another 80 citations afterwards, one-by-line, but nothing else. As mentioned in the title, the lib.bib file is ~10MB - if I swap this out for a non-existent or tiny bibtex file the problem goes away, and the in-buffer citations are rendered in a red face. Clearly from this and the below profile the issue is something to do with checking etc the citations for fontification purposes.
>
> # profile
>
> ## cpu
> 1,542,542,267 99% - org-cite-basic-activate
> 787,037,416 50% - org-cite-basic--get-entry
> 4,065,194 0% + org-cite-basic--parse-bibliography
> 754,769,872 48% - org-cite-basic--all-keys
Could you try the following:
1. Execute the following code:
(require 'elp)
(elp-restore-all)
(elp-instrument-function #'org-cite-basic--get-entry)
(elp-instrument-function #'org-cite-basic-activate)
(elp-instrument-function #'org-cite-basic--parse-bibliography)
(elp-instrument-function #'org-cite-basic--all-keys)
2. Open your org file where Emacs hangs
3. Run M-x elp-results
4. Report the ELP buffer contents here
Best,
Ihor
[-- Attachment #2: Type: text/html, Size: 5022 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
2022-03-19 8:57 ` Jamie Matthews
@ 2022-03-19 9:23 ` Ihor Radchenko
2022-03-19 9:25 ` Jamie Matthews
0 siblings, 1 reply; 20+ messages in thread
From: Ihor Radchenko @ 2022-03-19 9:23 UTC (permalink / raw)
To: Jamie Matthews; +Cc: emacs-orgmode@gnu.org
Jamie Matthews <jdm204@cam.ac.uk> writes:
> Hi Ihor,
>
> I tried to run that code but get errors for the elp-instrument-function calls (both with my config and emacs -Q):
>
> ```
> Debugger entered--Lisp error: (error "ELP cannot profile the function: org-cite-basic--g...")
Then also add (require 'oc-basic) before the code I provided. I seems
that org-cite has not been loaded yet when you tried to execute the
code.
Best,
Ihor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
2022-03-19 9:23 ` Ihor Radchenko
@ 2022-03-19 9:25 ` Jamie Matthews
2022-03-19 9:57 ` Ihor Radchenko
0 siblings, 1 reply; 20+ messages in thread
From: Jamie Matthews @ 2022-03-19 9:25 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: emacs-orgmode@gnu.org
[-- Attachment #1: Type: text/plain, Size: 1154 bytes --]
Thanks:
```
org-cite-basic-activate 59 10.724349447 0.1817686346
org-cite-basic--parse-bibliography 129 10.559936049 0.0818599693
org-cite-basic--all-keys 59 7.830202561 0.1327152976
org-cite-basic--get-entry 70 2.7772344940 0.0396747784
```
________________________________
From: Ihor Radchenko <yantar92@gmail.com>
Sent: Saturday, March 19, 2022 9:23 AM
To: Jamie Matthews <jdm204@cam.ac.uk>
Cc: emacs-orgmode@gnu.org <emacs-orgmode@gnu.org>
Subject: Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
Jamie Matthews <jdm204@cam.ac.uk> writes:
> Hi Ihor,
>
> I tried to run that code but get errors for the elp-instrument-function calls (both with my config and emacs -Q):
>
> ```
> Debugger entered--Lisp error: (error "ELP cannot profile the function: org-cite-basic--g...")
Then also add (require 'oc-basic) before the code I provided. I seems
that org-cite has not been loaded yet when you tried to execute the
code.
Best,
Ihor
[-- Attachment #2: Type: text/html, Size: 2664 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
2022-03-19 9:25 ` Jamie Matthews
@ 2022-03-19 9:57 ` Ihor Radchenko
2022-03-19 10:12 ` Jamie Matthews
0 siblings, 1 reply; 20+ messages in thread
From: Ihor Radchenko @ 2022-03-19 9:57 UTC (permalink / raw)
To: Jamie Matthews; +Cc: emacs-orgmode@gnu.org
Jamie Matthews <jdm204@cam.ac.uk> writes:
> Thanks:
>
> ```
> org-cite-basic-activate 59 10.724349447 0.1817686346
> org-cite-basic--parse-bibliography 129 10.559936049 0.0818599693
> org-cite-basic--all-keys 59 7.830202561 0.1327152976
> org-cite-basic--get-entry 70 2.7772344940 0.0396747784
> ```
org-cite-basic--parse-bibliography appears to be the main bottleneck.
I tried to write a quick fix (untested).
Can you try to redefine org-cite-basic--parse-bibliography to the
version below (note an extra defvar) and let me know how it goes:
(defvar org-cite-basic--file-id-cache nil
"Hash table linking files to their hash.")
(defun org-cite-basic--parse-bibliography (&optional info)
"List all entries available in the buffer.
Each association follows the pattern
(FILE . ENTRIES)
where FILE is the absolute file name of the BibTeX file, and ENTRIES is a hash
table where keys are references and values are association lists between fields,
as symbols, and values as strings or nil.
Optional argument INFO is the export state, as a property list."
(unless (hash-table-p org-cite-basic--file-id-cache)
(setq org-cite-basic--file-id-cache (make-hash-table :test #'equal)))
(if (plist-member info :cite-basic/bibliography)
(plist-get info :cite-basic/bibliography)
(let ((results nil))
(dolist (file (org-cite-list-bibliography-files))
(when (file-readable-p file)
(with-temp-buffer
(when (or (file-has-changed-p file)
(not (gethash file org-cite-basic--file-id-cache)))
(insert-file-contents file))
(unless (gethash file org-cite-basic--file-id-cache)
(puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
(let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache)))
(entries
(or (cdr (assoc file-id org-cite-basic--bibliography-cache))
(let ((table
(pcase (file-name-extension file)
("json" (org-cite-basic--parse-json))
("bib" (org-cite-basic--parse-bibtex 'biblatex))
("bibtex" (org-cite-basic--parse-bibtex 'BibTeX))
(ext
(user-error "Unknown bibliography extension: %S"
ext)))))
(push (cons file-id table) org-cite-basic--bibliography-cache)
table))))
(push (cons file entries) results)))))
(when info (plist-put info :cite-basic/bibliography results))
results)))
Best,
Ihor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
2022-03-19 9:57 ` Ihor Radchenko
@ 2022-03-19 10:12 ` Jamie Matthews
2022-03-19 10:28 ` Ihor Radchenko
0 siblings, 1 reply; 20+ messages in thread
From: Jamie Matthews @ 2022-03-19 10:12 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: emacs-orgmode@gnu.org
[-- Attachment #1: Type: text/plain, Size: 3838 bytes --]
It seems a very significant improvement - the lag on scrolling after the buffer is loaded is not noticeable, and the initial loading hang essentially disappears!
I'm slightly surprised that elp-results doesn't seem to show as dramatic an improvement as it feels for me:
```
org-cite-basic-activate 123 6.5652978389 0.0533764051
org-cite-basic--all-keys 123 6.2679872230 0.0509592457
org-cite-basic--parse-bibliography 270 6.2513217860 0.0231530436
org-cite-basic--get-entry 147 0.1550045990 0.0010544530
```
________________________________
From: Ihor Radchenko <yantar92@gmail.com>
Sent: Saturday, March 19, 2022 9:57 AM
To: Jamie Matthews <jdm204@cam.ac.uk>
Cc: emacs-orgmode@gnu.org <emacs-orgmode@gnu.org>
Subject: Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
Jamie Matthews <jdm204@cam.ac.uk> writes:
> Thanks:
>
> ```
> org-cite-basic-activate 59 10.724349447 0.1817686346
> org-cite-basic--parse-bibliography 129 10.559936049 0.0818599693
> org-cite-basic--all-keys 59 7.830202561 0.1327152976
> org-cite-basic--get-entry 70 2.7772344940 0.0396747784
> ```
org-cite-basic--parse-bibliography appears to be the main bottleneck.
I tried to write a quick fix (untested).
Can you try to redefine org-cite-basic--parse-bibliography to the
version below (note an extra defvar) and let me know how it goes:
(defvar org-cite-basic--file-id-cache nil
"Hash table linking files to their hash.")
(defun org-cite-basic--parse-bibliography (&optional info)
"List all entries available in the buffer.
Each association follows the pattern
(FILE . ENTRIES)
where FILE is the absolute file name of the BibTeX file, and ENTRIES is a hash
table where keys are references and values are association lists between fields,
as symbols, and values as strings or nil.
Optional argument INFO is the export state, as a property list."
(unless (hash-table-p org-cite-basic--file-id-cache)
(setq org-cite-basic--file-id-cache (make-hash-table :test #'equal)))
(if (plist-member info :cite-basic/bibliography)
(plist-get info :cite-basic/bibliography)
(let ((results nil))
(dolist (file (org-cite-list-bibliography-files))
(when (file-readable-p file)
(with-temp-buffer
(when (or (file-has-changed-p file)
(not (gethash file org-cite-basic--file-id-cache)))
(insert-file-contents file))
(unless (gethash file org-cite-basic--file-id-cache)
(puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
(let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache)))
(entries
(or (cdr (assoc file-id org-cite-basic--bibliography-cache))
(let ((table
(pcase (file-name-extension file)
("json" (org-cite-basic--parse-json))
("bib" (org-cite-basic--parse-bibtex 'biblatex))
("bibtex" (org-cite-basic--parse-bibtex 'BibTeX))
(ext
(user-error "Unknown bibliography extension: %S"
ext)))))
(push (cons file-id table) org-cite-basic--bibliography-cache)
table))))
(push (cons file entries) results)))))
(when info (plist-put info :cite-basic/bibliography results))
results)))
Best,
Ihor
[-- Attachment #2: Type: text/html, Size: 8737 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
2022-03-19 10:12 ` Jamie Matthews
@ 2022-03-19 10:28 ` Ihor Radchenko
2022-03-19 11:17 ` Jamie Matthews
0 siblings, 1 reply; 20+ messages in thread
From: Ihor Radchenko @ 2022-03-19 10:28 UTC (permalink / raw)
To: Jamie Matthews; +Cc: emacs-orgmode@gnu.org
Jamie Matthews <jdm204@cam.ac.uk> writes:
> It seems a very significant improvement - the lag on scrolling after the buffer is loaded is not noticeable, and the initial loading hang essentially disappears!
Great! Do you mean that there is no apparent slowness at all?
> I'm slightly surprised that elp-results doesn't seem to show as dramatic an improvement as it feels for me:
>
> ```
> org-cite-basic-activate 123 6.5652978389 0.0533764051
> org-cite-basic--all-keys 123 6.2679872230 0.0509592457
> org-cite-basic--parse-bibliography 270 6.2513217860 0.0231530436
> org-cite-basic--get-entry 147 0.1550045990 0.0010544530
> ```
It does. You need to look at the last column. The total time in the
second column is large probably because you interacted (scrolled?) with
the buffer a bit more than first time and the functions got called more
(see second column).
Also, can you try the following version of
org-cite-basic--parse-bibliography again and try to change cite key of
one of the cited entries of the bibliography outside emacs? The key
should be marked red if you try to insert the changed entry in Org.
(defvar org-cite-basic--file-id-cache nil
"Hash table linking files to their hash.")
(defun org-cite-basic--parse-bibliography (&optional info)
"List all entries available in the buffer.
Each association follows the pattern
(FILE . ENTRIES)
where FILE is the absolute file name of the BibTeX file, and ENTRIES is a hash
table where keys are references and values are association lists between fields,
as symbols, and values as strings or nil.
Optional argument INFO is the export state, as a property list."
(unless (hash-table-p org-cite-basic--file-id-cache)
(setq org-cite-basic--file-id-cache (make-hash-table :test #'equal)))
(if (plist-member info :cite-basic/bibliography)
(plist-get info :cite-basic/bibliography)
(let ((results nil))
(dolist (file (org-cite-list-bibliography-files))
(when (file-readable-p file)
(with-temp-buffer
(when (or (file-has-changed-p file)
(not (gethash file org-cite-basic--file-id-cache)))
(insert-file-contents file)
(puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
(let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache)))
(entries
(or (cdr (assoc file-id org-cite-basic--bibliography-cache))
(let ((table
(pcase (file-name-extension file)
("json" (org-cite-basic--parse-json))
("bib" (org-cite-basic--parse-bibtex 'biblatex))
("bibtex" (org-cite-basic--parse-bibtex 'BibTeX))
(ext
(user-error "Unknown bibliography extension: %S"
ext)))))
(push (cons file-id table) org-cite-basic--bibliography-cache)
table))))
(push (cons file entries) results)))))
(when info (plist-put info :cite-basic/bibliography results))
results)))
Best,
Ihor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
2022-03-19 10:28 ` Ihor Radchenko
@ 2022-03-19 11:17 ` Jamie Matthews
2022-03-19 11:47 ` [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library Ihor Radchenko
0 siblings, 1 reply; 20+ messages in thread
From: Jamie Matthews @ 2022-03-19 11:17 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: emacs-orgmode@gnu.org
[-- Attachment #1: Type: text/plain, Size: 4537 bytes --]
I can confirm that the key turns red on insert when I altered the key outside of emacs (with that second version of `org-cite-basic--parse-bibliography`).
However, I'm now noticing that the hang improvement earlier (< a second down from ~10) doesn't always occur.
Specifically, if I
1. emacs -Q
2. eval your code in scratch
3. C-x C-f to the org file
I get the hang. However, if I then
1. kill the org buffer
2. eval the code again
3. re-find the org file
the hang is gone. Without evaling your code in between, killing the org buffer and finding it again in the same emacs session reproduces the hang everytime, which was probably what I did the first time I report the improvement, as in I didn't check it worked from startup.
________________________________
From: Ihor Radchenko <yantar92@gmail.com>
Sent: Saturday, March 19, 2022 10:28 AM
To: Jamie Matthews <jdm204@cam.ac.uk>
Cc: emacs-orgmode@gnu.org <emacs-orgmode@gnu.org>
Subject: Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
Jamie Matthews <jdm204@cam.ac.uk> writes:
> It seems a very significant improvement - the lag on scrolling after the buffer is loaded is not noticeable, and the initial loading hang essentially disappears!
Great! Do you mean that there is no apparent slowness at all?
> I'm slightly surprised that elp-results doesn't seem to show as dramatic an improvement as it feels for me:
>
> ```
> org-cite-basic-activate 123 6.5652978389 0.0533764051
> org-cite-basic--all-keys 123 6.2679872230 0.0509592457
> org-cite-basic--parse-bibliography 270 6.2513217860 0.0231530436
> org-cite-basic--get-entry 147 0.1550045990 0.0010544530
> ```
It does. You need to look at the last column. The total time in the
second column is large probably because you interacted (scrolled?) with
the buffer a bit more than first time and the functions got called more
(see second column).
Also, can you try the following version of
org-cite-basic--parse-bibliography again and try to change cite key of
one of the cited entries of the bibliography outside emacs? The key
should be marked red if you try to insert the changed entry in Org.
(defvar org-cite-basic--file-id-cache nil
"Hash table linking files to their hash.")
(defun org-cite-basic--parse-bibliography (&optional info)
"List all entries available in the buffer.
Each association follows the pattern
(FILE . ENTRIES)
where FILE is the absolute file name of the BibTeX file, and ENTRIES is a hash
table where keys are references and values are association lists between fields,
as symbols, and values as strings or nil.
Optional argument INFO is the export state, as a property list."
(unless (hash-table-p org-cite-basic--file-id-cache)
(setq org-cite-basic--file-id-cache (make-hash-table :test #'equal)))
(if (plist-member info :cite-basic/bibliography)
(plist-get info :cite-basic/bibliography)
(let ((results nil))
(dolist (file (org-cite-list-bibliography-files))
(when (file-readable-p file)
(with-temp-buffer
(when (or (file-has-changed-p file)
(not (gethash file org-cite-basic--file-id-cache)))
(insert-file-contents file)
(puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
(let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache)))
(entries
(or (cdr (assoc file-id org-cite-basic--bibliography-cache))
(let ((table
(pcase (file-name-extension file)
("json" (org-cite-basic--parse-json))
("bib" (org-cite-basic--parse-bibtex 'biblatex))
("bibtex" (org-cite-basic--parse-bibtex 'BibTeX))
(ext
(user-error "Unknown bibliography extension: %S"
ext)))))
(push (cons file-id table) org-cite-basic--bibliography-cache)
table))))
(push (cons file entries) results)))))
(when info (plist-put info :cite-basic/bibliography results))
results)))
Best,
Ihor
[-- Attachment #2: Type: text/html, Size: 9333 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-19 11:17 ` Jamie Matthews
@ 2022-03-19 11:47 ` Ihor Radchenko
2022-03-19 12:01 ` Jamie Matthews
2022-04-16 10:11 ` Ihor Radchenko
0 siblings, 2 replies; 20+ messages in thread
From: Ihor Radchenko @ 2022-03-19 11:47 UTC (permalink / raw)
To: Jamie Matthews, Nicolas Goaziou; +Cc: emacs-orgmode@gnu.org
[-- Attachment #1: Type: text/plain, Size: 1400 bytes --]
Jamie Matthews <jdm204@cam.ac.uk> writes:
> I can confirm that the key turns red on insert when I altered the key outside of emacs (with that second version of `org-cite-basic--parse-bibliography`).
Great. Then, I am attaching the patch with the new version of the
function. I will let Nicolas decide if it is good enough to be merged. I
do not feel confident enough with org-cite code to judge if my approach
is not missing some edge cases.
> However, I'm now noticing that the hang improvement earlier (< a second down from ~10) doesn't always occur.
>
> Specifically, if I
>
> 1. emacs -Q
> 2. eval your code in scratch
> 3. C-x C-f to the org file
>
> I get the hang. However, if I then
>
>
> 1. kill the org buffer
> 2. eval the code again
> 3. re-find the org file
>
> the hang is gone. Without evaling your code in between, killing the org buffer and finding it again in the same emacs session reproduces the hang everytime, which was probably what I did the first time I report the improvement, as in I didn't check it worked from startup.
It is most likely because you defun the function after emacs -Q before
org is loaded, then open Org file (opening Org file autoloads org-mode),
and then org-mode overwrites the manual defun.
If I am correct, putting (require 'oc-basic) before defun will fix what
you are seeing.
Best,
Ihor
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-oc-basic-Speed-up-cached-bibliography-retrival.patch --]
[-- Type: text/x-patch, Size: 4266 bytes --]
From 100b722708c68bc65af637c3ad4e289943cccd7c Mon Sep 17 00:00:00 2001
Message-Id: <100b722708c68bc65af637c3ad4e289943cccd7c.1647690044.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Sat, 19 Mar 2022 19:24:55 +0800
Subject: [PATCH] oc-basic: Speed up cached bibliography retrival
* lisp/oc-basic.el (org-cite-basic--file-id-cache): New variable
storing hash of bibliography file contents.
(org-cite-basic--parse-bibliography): Skip buffer hash calculation
when bibliography file is unchanged on disk. Calculating buffer hash
on every call is slow when bibliography file is large.
* lisp/org-compat.el:
(org-file-has-changed-p--hash-table, org-file-has-changed-p): Define
`file-has-changed-p' from Emacs 29 if necessary.
See https://list.orgmode.org/LO2P265MB1758E12E04832DC712F5A8E9DC149@LO2P265MB1758.GBRP265.PROD.OUTLOOK.COM/T/#t
---
lisp/oc-basic.el | 11 +++++++++--
lisp/org-compat.el | 29 +++++++++++++++++++++++++++++
2 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el
index e2dfc0603..d1ea620ad 100644
--- a/lisp/oc-basic.el
+++ b/lisp/oc-basic.el
@@ -233,6 +233,8 @@ (defun org-cite-basic--parse-bibtex (dialect)
entries)))
entries))
+(defvar org-cite-basic--file-id-cache nil
+ "Hash table linking files to their hash.")
(defun org-cite-basic--parse-bibliography (&optional info)
"List all entries available in the buffer.
@@ -245,14 +247,19 @@ (defun org-cite-basic--parse-bibliography (&optional info)
as symbols, and values as strings or nil.
Optional argument INFO is the export state, as a property list."
+ (unless (hash-table-p org-cite-basic--file-id-cache)
+ (setq org-cite-basic--file-id-cache (make-hash-table :test #'equal)))
(if (plist-member info :cite-basic/bibliography)
(plist-get info :cite-basic/bibliography)
(let ((results nil))
(dolist (file (org-cite-list-bibliography-files))
(when (file-readable-p file)
(with-temp-buffer
- (insert-file-contents file)
- (let* ((file-id (cons file (org-buffer-hash)))
+ (when (or (org-file-has-changed-p file)
+ (not (gethash file org-cite-basic--file-id-cache)))
+ (insert-file-contents file)
+ (puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
+ (let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache)))
(entries
(or (cdr (assoc file-id org-cite-basic--bibliography-cache))
(let ((table
diff --git a/lisp/org-compat.el b/lisp/org-compat.el
index 38d330de6..e96dcba8b 100644
--- a/lisp/org-compat.el
+++ b/lisp/org-compat.el
@@ -73,6 +73,35 @@ (defvar org-table-dataline-regexp)
(defvar org-table-tab-recognizes-table.el)
(defvar org-table1-hline-regexp)
+\f
+;;; Emacs < 29 compatibility
+
+(defvar org-file-has-changed-p--hash-table (make-hash-table :test #'equal)
+ "Internal variable used by `org-file-has-changed-p'.")
+
+(if (fboundp 'file-has-changed-p)
+ (defalias org-file-has-changed-p #'file-has-changed-p)
+ (defun org-file-has-changed-p (file &optional tag)
+ "Return non-nil if FILE has changed.
+The size and modification time of FILE are compared to the size
+and modification time of the same FILE during a previous
+invocation of `org-file-has-changed-p'. Thus, the first invocation
+of `org-file-has-changed-p' always returns non-nil when FILE exists.
+The optional argument TAG, which must be a symbol, can be used to
+limit the comparison to invocations with identical tags; it can be
+the symbol of the calling function, for example."
+ (let* ((file (directory-file-name (expand-file-name file)))
+ (remote-file-name-inhibit-cache t)
+ (fileattr (file-attributes file 'integer))
+ (attr (and fileattr
+ (cons (file-attribute-size fileattr)
+ (file-attribute-modification-time fileattr))))
+ (sym (concat (symbol-name tag) "@" file))
+ (cachedattr (gethash sym org-file-has-changed-p--hash-table)))
+ (when (not (equal attr cachedattr))
+ (puthash sym attr org-file-has-changed-p--hash-table)))))
+
+
\f
;;; Emacs < 28.1 compatibility
--
2.34.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-19 11:47 ` [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library Ihor Radchenko
@ 2022-03-19 12:01 ` Jamie Matthews
2022-03-19 12:12 ` Ihor Radchenko
2022-04-16 10:11 ` Ihor Radchenko
1 sibling, 1 reply; 20+ messages in thread
From: Jamie Matthews @ 2022-03-19 12:01 UTC (permalink / raw)
To: Ihor Radchenko, Nicolas Goaziou; +Cc: emacs-orgmode@gnu.org
[-- Attachment #1: Type: text/plain, Size: 2348 bytes --]
> Great. Then, I am attaching the patch with the new version of the
> function. I will let Nicolas decide if it is good enough to be merged. I
> do not feel confident enough with org-cite code to judge if my approach
> is not missing some edge cases.
Great, thanks for your help!
> If I am correct, putting (require 'oc-basic) before defun will fix what
you are seeing.
Hmm, this doesn't fix what I'm seeing - still hang on first load and quicker after killing and finding again. Very possible there's something else obvious I'm missing though, I'm very much an emacs 'end user'!
________________________________
From: Ihor Radchenko <yantar92@gmail.com>
Sent: Saturday, March 19, 2022 11:47 AM
To: Jamie Matthews <jdm204@cam.ac.uk>; Nicolas Goaziou <mail@nicolasgoaziou.fr>
Cc: emacs-orgmode@gnu.org <emacs-orgmode@gnu.org>
Subject: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
Jamie Matthews <jdm204@cam.ac.uk> writes:
> I can confirm that the key turns red on insert when I altered the key outside of emacs (with that second version of `org-cite-basic--parse-bibliography`).
Great. Then, I am attaching the patch with the new version of the
function. I will let Nicolas decide if it is good enough to be merged. I
do not feel confident enough with org-cite code to judge if my approach
is not missing some edge cases.
> However, I'm now noticing that the hang improvement earlier (< a second down from ~10) doesn't always occur.
>
> Specifically, if I
>
> 1. emacs -Q
> 2. eval your code in scratch
> 3. C-x C-f to the org file
>
> I get the hang. However, if I then
>
>
> 1. kill the org buffer
> 2. eval the code again
> 3. re-find the org file
>
> the hang is gone. Without evaling your code in between, killing the org buffer and finding it again in the same emacs session reproduces the hang everytime, which was probably what I did the first time I report the improvement, as in I didn't check it worked from startup.
It is most likely because you defun the function after emacs -Q before
org is loaded, then open Org file (opening Org file autoloads org-mode),
and then org-mode overwrites the manual defun.
If I am correct, putting (require 'oc-basic) before defun will fix what
you are seeing.
Best,
Ihor
[-- Attachment #2: Type: text/html, Size: 4214 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-19 12:01 ` Jamie Matthews
@ 2022-03-19 12:12 ` Ihor Radchenko
2022-03-19 20:13 ` psychosis
0 siblings, 1 reply; 20+ messages in thread
From: Ihor Radchenko @ 2022-03-19 12:12 UTC (permalink / raw)
To: Jamie Matthews; +Cc: emacs-orgmode@gnu.org, Nicolas Goaziou
Jamie Matthews <jdm204@cam.ac.uk> writes:
> Hmm, this doesn't fix what I'm seeing - still hang on first load and quicker after killing and finding again. Very possible there's something else obvious I'm missing though, I'm very much an emacs 'end user'!
This is strange. I would then try to run the profiler and look at the
results. The very first run might be somewhat slow because org-cite
still needs to parse the whole bibliography at the very beginning. But
it should be still faster than 10 seconds you observed earlier.
Best,
Ihor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-19 12:12 ` Ihor Radchenko
@ 2022-03-19 20:13 ` psychosis
2022-03-20 4:20 ` Ihor Radchenko
0 siblings, 1 reply; 20+ messages in thread
From: psychosis @ 2022-03-19 20:13 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: Jamie Matthews, emacs-orgmode, Nicolas Goaziou
Hello,
I have a related problem (see
https://list.orgmode.org/77KM6R.3BGD0Z3Z4PHF1@disroot.org/ for
reference) and Ihor’s patch reduced the initial load time of my
biblatex bibliography file from almost ten minutes to about 10 to 20
seconds.
For me the changed cite-key turns only red if I restart Emacs.
Thanks for the patch! Please let me know, if I can provide information
for further improvement.
Regards
Paul
On Sa, Mär 19 2022 at 20:12:39 +0800, Ihor Radchenko
<yantar92@gmail.com> wrote:
> Jamie Matthews <jdm204@cam.ac.uk> writes:
>
>> Hmm, this doesn't fix what I'm seeing - still hang on first load
>> and quicker after killing and finding again. Very possible there's
>> something else obvious I'm missing though, I'm very much an emacs
>> 'end user'!
>
> This is strange. I would then try to run the profiler and look at the
> results. The very first run might be somewhat slow because org-cite
> still needs to parse the whole bibliography at the very beginning. But
> it should be still faster than 10 seconds you observed earlier.
>
> Best,
> Ihor
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-19 20:13 ` psychosis
@ 2022-03-20 4:20 ` Ihor Radchenko
2022-03-21 16:51 ` psychosis
0 siblings, 1 reply; 20+ messages in thread
From: Ihor Radchenko @ 2022-03-20 4:20 UTC (permalink / raw)
To: psychosis; +Cc: emacs-orgmode, Nicolas Goaziou, Jamie Matthews
psychosis <psychosis@disroot.org> writes:
> I have a related problem (see
> https://list.orgmode.org/77KM6R.3BGD0Z3Z4PHF1@disroot.org/ for
> reference) and Ihor’s patch reduced the initial load time of my
> biblatex bibliography file from almost ten minutes to about 10 to 20
> seconds.
I have seen your email. Persisting the cache between sessions is a more
tricky. It will require a dedicated effort - significantly more than the
few-liner patch here.
I assume that your bibliography is much larger than 10Mb. Am I correct?
> For me the changed cite-key turns only red if I restart Emacs.
This is bad. Would you mind constructing a reproducer? (using the
patched Org version)
> Thanks for the patch! Please let me know, if I can provide information
> for further improvement.
You definitely can.
1. You may share your bibtex file. It will make things much easier for
me. I only have 1Mb example bibliography, which I used before I switched
to storing references in Org files.
2. You may provide profiling information using M-x profiler-start M-x
profiler-report and using ELP as I described in the previous messages.
3. We may arrange a live jitsi meeting and profile things together.
Best,
Ihor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-20 4:20 ` Ihor Radchenko
@ 2022-03-21 16:51 ` psychosis
2022-03-22 12:27 ` Ihor Radchenko
0 siblings, 1 reply; 20+ messages in thread
From: psychosis @ 2022-03-21 16:51 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: Jamie Matthews, emacs-orgmode, Nicolas Goaziou
[-- Attachment #1: Type: text/plain, Size: 1891 bytes --]
Hello,
On So, Mär 20 2022 at 12:20:42 +0800, Ihor Radchenko
<yantar92@gmail.com> wrote:
> I have seen your email. Persisting the cache between sessions is a
> more
> tricky. It will require a dedicated effort - significantly more than
> the
> few-liner patch here.
Ah, I am not programmer and made an uneducated assumption because of
the names “org-persist”, “org-cite-basic--bibliography-cache”
and “org-cite-basic--completion-cache”.
> I assume that your bibliography is much larger than 10Mb. Am I
> correct?
Actually, it is only 2Mb (7000 entries). Probably there is a problem
with the data structure of my .bib-file? I create it from Zotero and
the BibLaTeX-export of the BetterBibTeX-Plugin.
> This is bad. Would you mind constructing a reproducer? (using the
> patched Org version)
I made a mistake, I inserted the changed cite-key on a new line, which
does not work. Inserting the new key on the same line as the old one
works. I don’t know if this is the intended behavior.
> 2. You may provide profiling information using M-x profiler-start M-x
> profiler-report and using ELP as I described in the previous messages.
I attached the profiler-reports and the ELP-reports for “emacs -Q”
with and without the patch. I use the latest Emacs version, built with
“native-compilation” and “pgtk” (GNU Emacs 29.0.50,
x86_64-pc-linux-gnu, GTK+ Version 3.24.31, cairo version 1.17.4; Org
mode version 9.5.2, release_9.5.2-24-g668205) with an .org-file that
only contains one citation and the location of the bibliography. The
CPU is an Intel Core i5-3320M @ 2.60GHz × 4.
Thank you very much for your help! I should add, I am content with the
improvement of the parsing time that your patch provides. So if you
assume that the problem is only on my end, please attend to more
pressing issues.
Kind regards
Paul
[-- Attachment #2.1: reports_with_patch.org --]
[-- Type: text/plain, Size: 11418 bytes --]
* CPU-Profiler-Report
8720 53% + ...
7498 45% - command-execute
7060 42% - funcall-interactively
6681 40% - execute-extended-command
6681 40% - command-execute
6681 40% - funcall-interactively
6666 40% - org-cite-insert
6665 40% + #<compiled -0x1f00c9f117456103>
1 0% - org-element-context
1 0% - org-element--object-lex
1 0% org-element-citation-parser
373 2% - find-file
373 2% - find-file-noselect
372 2% - find-file-noselect-1
371 2% - after-find-file
370 2% - normal-mode
370 2% - set-auto-mode
369 2% - set-auto-mode--apply-alist
369 2% - set-auto-mode-0
330 2% - org-mode
236 1% - org-load-modules-maybe
224 1% - require
224 1% - byte-code
203 1% - require
202 1% - byte-code
164 0% - require
163 0% - byte-code
128 0% - require
124 0% - byte-code
101 0% - require
101 0% - byte-code
75 0% - require
73 0% - byte-code
47 0% - require
46 0% - byte-code
29 0% - require
29 0% - byte-code
29 0% - require
18 0% byte-code
8 0% - require
8 0% - byte-code
2 0% - require
1 0% - load-with-code-conversion
1 0% - hack-read-symbol-shorthands
1 0% hack-local-variables--find-variables
3 0% - eieio-defclass-internal
3 0% eieio--add-new-slot
1 0% eval-after-load
1 0% defvar
1 0% - custom-declare-group
1 0% custom-handle-keyword
1 0% - defvar
1 0% byte-code
1 0% - easy-menu-do-define
1 0% - easy-menu-create-menu
1 0% easy-menu-convert-item
2 0% - defvar
2 0% - byte-code
1 0% - sort-coding-systems
1 0% + #<compiled -0x6cdb28b2d51a5ea>
1 0% - mapcar
1 0% #<compiled -0x1344a1181027709e>
1 0% load-with-code-conversion
2 0% - custom-declare-variable
1 0% custom-handle-keyword
1 0% - custom-initialize-reset
1 0% - eval
1 0% - funcall
1 0% + #<compiled 0x160ff216c5504571>
1 0% custom-declare-face
1 0% - define-keymap
1 0% keymap-set
1 0% easy-menu-do-define
1 0% - defvar
1 0% - byte-code
1 0% - define-keymap
1 0% - keymap-set
1 0% - keymap--check
1 0% key-valid-p
2 0% - define-keymap
2 0% - keymap-set
1 0% - keymap--check
1 0% key-valid-p
1 0% - custom-declare-group
1 0% custom-handle-keyword
1 0% - custom-declare-face
1 0% - custom-handle-all-keywords
1 0% custom-handle-keyword
1 0% - easy-menu-do-define
1 0% - easy-menu-create-menu
1 0% - easy-menu-convert-item
1 0% - easy-menu-convert-item-1
1 0% easy-menu-create-menu
79 0% - byte-code
32 0% - require
24 0% - byte-code
9 0% - require
7 0% - byte-code
3 0% - require
3 0% - byte-code
1 0% - require
1 0% - byte-code
1 0% - custom-declare-variable
1 0% - custom-initialize-reset
1 0% - eval
1 0% funcall
3 0% - custom-declare-variable
3 0% - custom-initialize-reset
3 0% - eval
3 0% - funcall
3 0% + #<compiled -0x607663b6614a677>
3 0% - org-link-set-parameters
3 0% - org-link-make-regexps
1 0% - rx-to-string
1 0% - rx--translate
1 0% - rx--translate-form
1 0% rx--translate-seq
3 0% - load
2 0% load-with-code-conversion
2 0% - defvar
2 0% byte-code
1 0% - do-after-load-evaluation
1 0% elisp--font-lock-flush-elisp-buffers
5 0% - custom-declare-variable
5 0% - custom-initialize-reset
5 0% - org-babel-do-load-languages
1 0% - require
1 0% byte-code
4 0% - load
3 0% load-with-code-conversion
4 0% - org-set-regexps-and-options
2 0% - cl-mapcan
2 0% - byte-code
1 0% - require
1 0% - defvar
1 0% byte-code
4 0% - org-macro-initialize-templates
4 0% - byte-code
2 0% require
2 0% - defconst
2 0% - org-version
1 0% locate-library
1 0% hack-dir-local--get-variables
1 0% - run-hooks
1 0% - vc-refresh-state
1 0% - vc-backend
1 0% - vc-registered
1 0% - mapc
1 0% + #<compiled 0x176cf3f2aee2e339>
1 0% - insert-file-contents
1 0% make-lock-file-name
1 0% - find-buffer-visiting
1 0% - file-truename
1 0% - file-truename
1 0% - file-truename
1 0% file-truename
4 0% - next-line
1 0% - line-move
1 0% - line-move-visual
1 0% - eval
1 0% if
438 2% - byte-code
339 2% - read-extended-command
126 0% - completing-read-default
3 0% + redisplay_internal (C function)
1 0% - undo-auto--add-boundary
1 0% undo-auto--boundaries
1 0% - command-execute
1 0% - funcall-interactively
1 0% - minibuffer-complete
1 0% - completion-in-region
1 0% - completion--in-region
1 0% + #<compiled -0x73104f7a7357822>
99 0% - find-file-read-args
99 0% - read-file-name
99 0% - read-file-name-default
34 0% - completing-read-default
2 0% rfn-eshadow-update-overlay
1 0% - command-execute
1 0% - funcall-interactively
1 0% - minibuffer-complete
1 0% - completion-in-region
1 0% - completion--in-region
1 0% + #<compiled -0x73104f7a7357822>
1 0% - frame-windows-min-size
1 0% - window-min-size
1 0% - window--min-size-1
1 0% window-size-fixed-p
132 0% - timer-event-handler
132 0% - apply
130 0% + #<compiled -0x1426c3f9fec74ca4>
2 0% + #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_12>
81 0% + redisplay_internal (C function)
1 0% clear-minibuffer-message
* Memory-Profiler-Report
1,294,962,301 99% + command-execute
6,842,988 0% + redisplay_internal (C function)
257,232 0% - timer-event-handler
247,296 0% - apply
246,096 0% + #<compiled -0x1426c3f9fec74ca4>
1,056 0% show-paren-function
144 0% - blink-cursor-start
144 0% - blink-cursor--start-timer
144 0% - run-with-timer
144 0% - run-at-time
96 0% timer-set-time
48 0% - timer-activate
48 0% timer--activate
6,240 0% timer-inc-time
3,168 0% - timer-activate-when-idle
3,168 0% - timer--activate
3,168 0% timer--time-less-p
216 0% - timer-activate
216 0% timer--activate
4,165 0% - help-command-error-confusable-suggestions
4,165 0% substitute-command-keys
1,056 0% - undo-auto--add-boundary
1,056 0% undo-auto--boundaries
24 0% - eldoc-schedule-timer
24 0% - run-with-idle-timer
24 0% timer-set-idle-time
0 0% ...
* ELP Profiling Results
org-cite-basic--parse-bibliography 37183 14.256997996 0.0003834278
org-cite-basic--get-entry 37180 10.932912038 0.0002940535
org-cite-basic-activate 1 3.796198629 3.796198629
org-cite-basic--all-keys 2 3.65558611 1.827793055
[-- Attachment #2.2: reports_without_patch.org --]
[-- Type: text/plain, Size: 7553 bytes --]
* CPU-Profiler-Report
555055 98% - command-execute
554536 98% - funcall-interactively
554148 98% - execute-extended-command
554147 98% - command-execute
554147 98% - funcall-interactively
554136 98% - org-cite-insert
554136 98% + #<compiled -0x1c5f6fba0011a443>
1 0% - run-at-time
1 0% timer-set-time
382 0% - find-file
382 0% - find-file-noselect
380 0% - find-file-noselect-1
380 0% - after-find-file
379 0% - normal-mode
379 0% - set-auto-mode
378 0% - set-auto-mode--apply-alist
378 0% - set-auto-mode-0
339 0% - org-mode
227 0% - org-load-modules-maybe
215 0% - require
215 0% - byte-code
195 0% - require
192 0% - byte-code
153 0% - require
152 0% - byte-code
122 0% - require
118 0% - byte-code
96 0% - require
95 0% - byte-code
69 0% - require
68 0% - byte-code
39 0% - require
39 0% - byte-code
26 0% - require
26 0% - byte-code
25 0% - require
23 0% - byte-code
3 0% require
2 0% - require
2 0% byte-code
1 0% cl-generic-define
2 0% - custom-declare-variable
2 0% - custom-initialize-reset
2 0% - eval
2 0% - funcall
1 0% + #<compiled 0xe378fc5bb636585>
1 0% + #<compiled 0x4db6beb5f185dde>
1 0% - defvar
1 0% - byte-code
1 0% - define-keymap
1 0% keymap-set
1 0% do-after-load-evaluation
3 0% - defvar
3 0% - byte-code
2 0% - sort-coding-systems
2 0% + #<compiled -0x6cdb28b2d51a5ea>
1 0% - mapcar
1 0% #<compiled -0x16dce84f3afb709d>
1 0% load-with-code-conversion
1 0% - custom-declare-face
1 0% - face-spec-set
1 0% - face-spec-recalc
1 0% face-spec-reset-face
1 0% - define-keymap
1 0% keymap-set
1 0% - custom-declare-variable
1 0% - custom-initialize-reset
1 0% - eval
1 0% - funcall
1 0% + #<compiled -0x1c24747dda5cfbb4>
1 0% - defvar
1 0% - byte-code
1 0% - define-keymap
1 0% keymap-set
3 0% - define-keymap
3 0% - keymap-set
1 0% key-parse
1 0% custom-declare-variable
2 0% + defvar
1 0% + do-after-load-evaluation
100 0% + byte-code
5 0% + org-macro-initialize-templates
2 0% + defconst
1 0% + run-mode-hooks
1 0% + hack-local-variables
1 0% + file-truename
1 0% create-file-buffer
3 0% + save-buffer
1 0% next-line
519 0% - byte-code
414 0% - read-extended-command
141 0% - completing-read-default
59 0% - command-execute
59 0% - funcall-interactively
35 0% - minibuffer-complete
35 0% - completion-in-region
35 0% - completion--in-region
35 0% + #<compiled -0x73d91f7d4123822>
22 0% - next-line-or-history-element
22 0% - next-history-element
22 0% - goto-history-element
21 0% + #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_42>
19 0% + #<compiled -0xdc802aa55ec78c6>
3 0% - timer-event-handler
3 0% - apply
2 0% - show-paren-function
2 0% - show-paren--default
2 0% show-paren--locate-near-paren
1 0% - blink-cursor-start
1 0% - blink-cursor--start-timer
1 0% - run-with-timer
1 0% - run-at-time
1 0% timer-set-time
2 0% + redisplay_internal (C function)
104 0% - find-file-read-args
104 0% - read-file-name
104 0% - read-file-name-default
25 0% - completing-read-default
2 0% - timer-event-handler
2 0% - apply
2 0% #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_10>
9833 1% + ...
331 0% + redisplay_internal (C function)
131 0% - timer-event-handler
131 0% - apply
131 0% + #<compiled -0x81af4a86cf8d641>
* Memory-Profiler-Report
82,333,670,467 99% + command-execute
8,492,807 0% + redisplay_internal (C function)
200,120 0% - timer-event-handler
197,640 0% - apply
197,448 0% + #<compiled -0x81af4a86cf8d641>
192 0% - blink-cursor-start
192 0% - blink-cursor--start-timer
192 0% - run-with-timer
192 0% - run-at-time
120 0% timer-set-time
72 0% - timer-activate
72 0% timer--activate
1,760 0% timer-inc-time
192 0% - timer-activate
192 0% timer--activate
24 0% - eldoc-schedule-timer
24 0% - run-with-idle-timer
24 0% timer-set-idle-time
0 0% ...
* ELP Profiling Results
org-cite-basic--parse-bibliography 37185 556.48202320 0.0149652285
org-cite-basic--get-entry 37181 553.60200644 0.0148893791
org-cite-basic-activate 2 3.9012992189 1.9506496094
org-cite-basic--all-keys 3 3.74166148 1.2472204933
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-21 16:51 ` psychosis
@ 2022-03-22 12:27 ` Ihor Radchenko
2022-03-22 16:42 ` psychosis
0 siblings, 1 reply; 20+ messages in thread
From: Ihor Radchenko @ 2022-03-22 12:27 UTC (permalink / raw)
To: psychosis; +Cc: emacs-orgmode, Nicolas Goaziou, Jamie Matthews
psychosis <psychosis@disroot.org> writes:
>> I assume that your bibliography is much larger than 10Mb. Am I
>> correct?
>
> Actually, it is only 2Mb (7000 entries). Probably there is a problem
> with the data structure of my .bib-file? I create it from Zotero and
> the BibLaTeX-export of the BetterBibTeX-Plugin.
> * ELP Profiling Results
>
> org-cite-basic--parse-bibliography 37183 14.256997996 0.0003834278
> org-cite-basic--get-entry 37180 10.932912038 0.0002940535
> org-cite-basic-activate 1 3.796198629 3.796198629
> org-cite-basic--all-keys 2 3.65558611 1.827793055
> I attached the profiler-reports and the ELP-reports for “emacs -Q”
> with and without the patch. I use the latest Emacs version, built with
> “native-compilation” and “pgtk” (GNU Emacs 29.0.50,
> x86_64-pc-linux-gnu, GTK+ Version 3.24.31, cairo version 1.17.4; Org
> mode version 9.5.2, release_9.5.2-24-g668205) with an .org-file that
> only contains one citation and the location of the bibliography. The
> CPU is an Intel Core i5-3320M @ 2.60GHz × 4.
Judging from ELP data, most of the time is spent fontifying individual
citations, not reading the .bib file. And the fontification gets called
37k times! This is fishy. It is not supposed to happen.
Could you provide your example file and the relevant .bib record?
>> This is bad. Would you mind constructing a reproducer? (using the
>> patched Org version)
>
> I made a mistake, I inserted the changed cite-key on a new line, which
> does not work. Inserting the new key on the same line as the old one
> works. I don’t know if this is the intended behavior.
To clarify, org-cite must detect changes in your .bib file and if a
citation record key in the .bib files was changed, trying to insert it
should trigger org-cite to mark the new citation in red. Regardless if
the citation gets inserted at the same of other line.
Or maybe I misunderstand what you did.
>> 2. You may provide profiling information using M-x profiler-start M-x
>> profiler-report and using ELP as I described in the previous messages.
>
> Thank you very much for your help! I should add, I am content with the
> improvement of the parsing time that your patch provides. So if you
> assume that the problem is only on my end, please attend to more
> pressing issues.
It's fine. I plan to work on org-cite in future, so I use such bug
reports as an opportunity to familiarise myself with the source code.
Best,
Ihor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-22 12:27 ` Ihor Radchenko
@ 2022-03-22 16:42 ` psychosis
2022-03-23 11:07 ` Ihor Radchenko
0 siblings, 1 reply; 20+ messages in thread
From: psychosis @ 2022-03-22 16:42 UTC (permalink / raw)
To: Ihor Radchenko; +Cc: emacs-orgmode, Nicolas Goaziou, Jamie Matthews
[-- Attachment #1: Type: text/plain, Size: 734 bytes --]
Hello,
On Di, Mär 22 2022 at 20:27:27 +0800, Ihor Radchenko
<yantar92@gmail.com> wrote:
> To clarify, org-cite must detect changes in your .bib file and if a
> citation record key in the .bib files was changed, trying to insert it
> should trigger org-cite to mark the new citation in red. Regardless if
> the citation gets inserted at the same of other line.
>
> Or maybe I misunderstand what you did.
Ok, for me it does not seem to work as intended. The old cite-key turns
immediately red if and if only I insert the new cite-key on the same
line. For it to turn red if I insert it on other lines, I have to kill
the buffer and reload it. I used “emacs -Q” with the attached files.
Kind regards
Paul
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2.1: test.bib --]
[-- Type: text/x-bibtex, Size: 229 bytes --]
@book{abelson_et_al_2000,
title = {Structure and Interpretation of Computer Programs},
author = {Abelson, Harold and Sussman, Gerald Jay and Sussman, Julie},
date = {2000},
publisher = {{MIT P}},
abstract = {2012}
}
[-- Attachment #2.2: test.org --]
[-- Type: text/plain, Size: 90 bytes --]
#+bibliography: ~/org/test.bib
[cite:@abelson_et_al_2012]
test
[cite:@abelson_et_al_2000]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-22 16:42 ` psychosis
@ 2022-03-23 11:07 ` Ihor Radchenko
0 siblings, 0 replies; 20+ messages in thread
From: Ihor Radchenko @ 2022-03-23 11:07 UTC (permalink / raw)
To: psychosis; +Cc: emacs-orgmode, Jamie Matthews, Nicolas Goaziou
psychosis <psychosis@disroot.org> writes:
> Ok, for me it does not seem to work as intended. The old cite-key turns
> immediately red if and if only I insert the new cite-key on the same
> line. For it to turn red if I insert it on other lines, I have to kill
> the buffer and reload it. I used “emacs -Q” with the attached files.
Thanks! I understand now. What you see is expected. This behaviour has
been there before the patch.
What I was asking to test is whether org-cite can detect changes in
bibliography made outside Emacs. Those changes are only reflected in the
link colour during next fontification. Already fontified links are not
going to be updated (we can actually do it, but it can degrade
performance even further).
When you insert a link at a new line, only that newly inserted link will
be fontified taking into account changes in the bibliography. The old
link is already fontified from Emacs' point of view and hence will not
be updated.
When you insert a link at the same line with previous, Emacs detects
changes made in the line and re-fontifies the whole line (single line is
the minimal possible fontified region in Emacs by default). Hence, both
the old link and the newly inserted link get fontified according to the
updated bibliography.
Best,
Ihor
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library
2022-03-19 11:47 ` [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library Ihor Radchenko
2022-03-19 12:01 ` Jamie Matthews
@ 2022-04-16 10:11 ` Ihor Radchenko
1 sibling, 0 replies; 20+ messages in thread
From: Ihor Radchenko @ 2022-04-16 10:11 UTC (permalink / raw)
To: Jamie Matthews; +Cc: emacs-orgmode@gnu.org, Nicolas Goaziou
Ihor Radchenko <yantar92@gmail.com> writes:
> Jamie Matthews <jdm204@cam.ac.uk> writes:
>
>> I can confirm that the key turns red on insert when I altered the key outside of emacs (with that second version of `org-cite-basic--parse-bibliography`).
>
> Great. Then, I am attaching the patch with the new version of the
> function. I will let Nicolas decide if it is good enough to be merged. I
> do not feel confident enough with org-cite code to judge if my approach
> is not missing some edge cases.
It has been almost one month since I posted the patch.
I took a freedom to merge it (+ one more extra optimisation) given the
feedback and other complaints about oc-basic performance (e.g. By Prof.
Kitchin).
Best,
Ihor
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2022-04-16 10:11 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-14 14:45 [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)] Jamie Matthews
2022-03-16 13:01 ` Bruce D'Arcus
2022-03-19 8:28 ` Ihor Radchenko
2022-03-19 8:57 ` Jamie Matthews
2022-03-19 9:23 ` Ihor Radchenko
2022-03-19 9:25 ` Jamie Matthews
2022-03-19 9:57 ` Ihor Radchenko
2022-03-19 10:12 ` Jamie Matthews
2022-03-19 10:28 ` Ihor Radchenko
2022-03-19 11:17 ` Jamie Matthews
2022-03-19 11:47 ` [PATCH] Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library Ihor Radchenko
2022-03-19 12:01 ` Jamie Matthews
2022-03-19 12:12 ` Ihor Radchenko
2022-03-19 20:13 ` psychosis
2022-03-20 4:20 ` Ihor Radchenko
2022-03-21 16:51 ` psychosis
2022-03-22 12:27 ` Ihor Radchenko
2022-03-22 16:42 ` psychosis
2022-03-23 11:07 ` Ihor Radchenko
2022-04-16 10:11 ` Ihor Radchenko
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs/org-mode.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).