I'm trying to figure out what causes high latency while typing in large org-mode files.  The issue is clearly caused by something in my large config file, but I'm not sure how to track it down with any precision.

My main literate config file is ~/.emacs.d/emacs-init.org, currently 15000 lines, 260 src blocks. 
If I create a ~minimal.el~ config like this:

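;; Put the development checkout of Org ahead of the bundled version.
(let* ((all-paths
        '("/home/matt/src/org-mode/emacs/site-lisp/org")))
  (dolist (p all-paths)
    (add-to-list 'load-path p)))

;; Load Org and open the large literate config as the test file.
(require 'org)
(find-file "~/.emacs.d/emacs-init.org")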

then I do not notice any latency while typing.  If I run the profiler while using the minimal config, the profile looks roughly like this at a high level:

        1397  71% - command-execute
         740  37%  - funcall-interactively
         718  36%   - org-self-insert-command
         686  34%    + org-element--cache-after-change
          10   0%    + org-fold-core--fix-folded-region
           3   0%    + blink-paren-post-self-insert-function
           2   0%    + jit-lock-after-change
           1   0%      org-fold-check-before-invisible-edit--text-properties
           9   0%   + previous-line
           6   0%   + minibuffer-complete
           3   0%   + org-return
           3   0%   + execute-extended-command
         657  33%  - byte-code
         657  33%   - read-extended-command
          64   3%    - completing-read-default
          14   0%     + redisplay_internal (C function)
           1   0%     + timer-event-handler
         371  18% - redisplay_internal (C function)
         251  12%  + jit-lock-function
          90   4%  + assq
           7   0%  + substitute-command-keys
           3   0%  + eval
         125   6% + timer-event-handler
          69   3% + ...
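
For anyone who wants to reproduce this, a profile like the one above can be collected with Emacs's built-in CPU profiler.  The following is only a sketch: the two command names (my/typing-profile-start and my/typing-profile-report) are hypothetical helpers made up for illustration, while profiler-start, profiler-report, profiler-stop, profiler-reset and profiler-running-p come from the built-in profiler library and are also available directly via M-x.

```elisp
;; Sketch only: the my/... command names are hypothetical helpers.
(require 'profiler)

(defun my/typing-profile-start ()
  "Discard any earlier samples and start the CPU profiler."
  (interactive)
  ;; `profiler-start' signals an error if a profiler is already running.
  (when (profiler-running-p)
    (profiler-stop))
  (profiler-reset)
  (profiler-start 'cpu))

(defun my/typing-profile-report ()
  "Show the CPU samples collected so far, then stop the profiler."
  (interactive)
  (profiler-report)
  (profiler-stop))
```

The idea is to call the first command, type ordinary text in the large org buffer for a little while, and then call the second; the resulting report buffer is the tree view that the listing above was copied from.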

--------------------------
However, if I instead use my fairly extensive main config, latency is high enough that there's a noticeable delay while typing ordinary words. I see this regardless of whether I build from main or from Ihor's org-fold feature branch on GitHub. The profiler overview here is quite different: redisplay_internal accounts for a much higher percentage of the CPU samples:

         3170  56% - redisplay_internal (C function)
         693  12%  - substitute-command-keys
         417   7%   + #<compiled -0x1c8b98a4b03336f3>
          59   1%  + assq
          49   0%  + org-in-subtree-not-table-p
          36   0%  + tab-bar-make-keymap
          35   0%    and
          24   0%  + not
          16   0%    org-at-table-p
          13   0%  + jit-lock-function
           8   0%    keymap-canonicalize
           7   0%  + #<compiled 0x74a551771c7fdf1>
           4   0%  + funcall
           4   0%    display-graphic-p
           3   0%  + #<compiled 0xe5940664f7881ee>
           3   0%    file-readable-p
           3   0%  + table--probe-cell
           3   0%    table--row-column-insertion-point-p
        1486  26% - command-execute
        1200  21%  - byte-code
        1200  21%   - read-extended-command
        1200  21%    - completing-read-default
        1200  21%     - apply
        1200  21%      - vertico--advice
         475   8%       + #<subr completing-read-default>

----------------------
I've almost never used the profiler and am not quite sure how I should proceed to debug this.  I realize I can comment out parts of the config one at a time, but that is not so easy for me to do in my current setup, and I suppose there are likely to be multiple contributing causes, which I may not really notice except in the aggregate. 

If anyone has suggestions, I would love to hear them!

Thanks,

Matt