From: Ihor Radchenko <yantar92@gmail.com>
To: Jamie Matthews <jdm204@cam.ac.uk>
Cc: "emacs-orgmode@gnu.org" <emacs-orgmode@gnu.org>
Subject: Re: [BUG] org-cite: 10 second hang opening a ~4k org file with 10MB bibtex library [9.5.2 (9.5.2-g91681f @ /home/jdm204/.config/emacs/straight/build/org/)]
Date: Sat, 19 Mar 2022 17:57:15 +0800
Message-ID: <87cziifeo4.fsf@localhost>
In-Reply-To: <LO2P265MB1758DAB8941C29D634E23298DC149@LO2P265MB1758.GBRP265.PROD.OUTLOOK.COM>

Jamie Matthews <jdm204@cam.ac.uk> writes:

> Thanks:
>
> ```
> org-cite-basic-activate             59          10.724349447  0.1817686346
> org-cite-basic--parse-bibliography  129         10.559936049  0.0818599693
> org-cite-basic--all-keys            59          7.830202561   0.1327152976
> org-cite-basic--get-entry           70          2.7772344940  0.0396747784
> ```

org-cite-basic--parse-bibliography appears to be the main bottleneck:
its 129 calls take about 10.56 of the 10.72 seconds measured for
org-cite-basic-activate.

I wrote a quick fix (untested).
Could you redefine org-cite-basic--parse-bibliography as the
version below (note the extra defvar) and let me know how it goes:

(defvar org-cite-basic--file-id-cache nil
  "Hash table linking files to their hash.")
(defun org-cite-basic--parse-bibliography (&optional info)
  "List all entries available in the buffer.

Each association follows the pattern

  (FILE . ENTRIES)

where FILE is the absolute file name of the BibTeX file, and ENTRIES is a hash
table where keys are references and values are association lists between fields,
as symbols, and values as strings or nil.

Optional argument INFO is the export state, as a property list."
  (unless (hash-table-p org-cite-basic--file-id-cache)
    (setq org-cite-basic--file-id-cache (make-hash-table :test #'equal)))
  (if (plist-member info :cite-basic/bibliography)
      (plist-get info :cite-basic/bibliography)
    (let ((results nil))
      (dolist (file (org-cite-list-bibliography-files))
        (when (file-readable-p file)
          (with-temp-buffer
            ;; Note: `file-has-changed-p' is only available in Emacs 29 and later.
            (when (or (file-has-changed-p file)
                      (not (gethash file org-cite-basic--file-id-cache)))
              (insert-file-contents file)
              ;; Refresh the stored hash whenever the file is (re)read, so
              ;; that a changed file is not served from a stale cache entry.
              (puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
            (let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache)))
                   (entries
                    (or (cdr (assoc file-id org-cite-basic--bibliography-cache))
                        (let ((table
                               (pcase (file-name-extension file)
                                 ("json" (org-cite-basic--parse-json))
                                 ("bib" (org-cite-basic--parse-bibtex 'biblatex))
                                 ("bibtex" (org-cite-basic--parse-bibtex 'BibTeX))
                                 (ext
                                  (user-error "Unknown bibliography extension: %S"
                                              ext)))))
                          (push (cons file-id table) org-cite-basic--bibliography-cache)
                          table))))
              (push (cons file entries) results)))))
      (when info (plist-put info :cite-basic/bibliography results))
      results)))
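
For reference, the caching idea behind the function above can be sketched
in isolation. This is only an illustration under stated assumptions, not
the code that goes into Org: the names my/parse-cache and my/cached-parse
are hypothetical, and the sketch compares modification times via
file-attributes instead of calling file-has-changed-p. A file is read
only when its modification time differs from the recorded one, and it is
parsed again only when its content hash differs as well.

```elisp
;; Sketch only: the `my/' names are hypothetical and not part of Org.
(defvar my/parse-cache (make-hash-table :test #'equal)
  "Map FILE to (MTIME HASH RESULT) from the last time FILE was parsed.")

(defun my/cached-parse (file parser)
  "Return the result of calling PARSER in a buffer holding FILE.
PARSER is a function of no arguments.  FILE is read only when its
modification time differs from the recorded one, and PARSER is called
only when the content hash differs as well."
  (let ((mtime (file-attribute-modification-time (file-attributes file)))
        (entry (gethash file my/parse-cache)))
    (if (and entry (equal (nth 0 entry) mtime))
        ;; Same modification time: skip reading the file altogether.
        (nth 2 entry)
      (with-temp-buffer
        (insert-file-contents file)
        (let* ((hash (secure-hash 'sha1 (current-buffer)))
               (result (if (and entry (equal (nth 1 entry) hash))
                           ;; File was touched but its content is identical.
                           (nth 2 entry)
                         (funcall parser))))
          ;; Record the new state only after PARSER returned successfully.
          (puthash file (list mtime hash result) my/parse-cache)
          result)))))

;; Hypothetical usage: count the lines of a bibliography file, reading
;; the file again only after it has been modified.
;; (my/cached-parse "/path/to/library.bib"
;;                  (lambda () (count-lines (point-min) (point-max))))
```

Keeping both the modification time and the content hash means an
unchanged file costs one file-attributes call, while a file that was
merely touched costs a read and a hash but no parse.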

Best,
Ihor