From: Ihor Radchenko <yantar92@gmail.com>
To: Max Nikulin <manikulin@gmail.com>
Cc: emacs-orgmode@gnu.org
Subject: Re: profiling latency in large org-mode buffers (under both main & org-fold feature)
Date: Sun, 27 Feb 2022 14:43:29 +0800
Message-ID: <87y21wkdwu.fsf@localhost>
In-Reply-To: <svd7ds$2vl$1@ciao.gmane.io>

Max Nikulin <manikulin@gmail.com> writes:

>> Max Nikulin writes:
>>> Actually I suspect that markers may have a similar problem during regexp
>>> searches. I am curious if it is possible to invoke a kind of "vacuum"
>>> (in SQL parlance). Folding all headings and resetting refile cache does
>>> not restore performance to the initial state at session startup. Maybe
>>> it is effect of incremental searches.
>> 
>> I doubt that markers have anything to do with regexp search itself
>> (directly). They should only come into play when editing text in buffer,
>> where their performance is also O(N_markers).
>
> I believed you confirmed my conclusion earlier:
>
> Ihor Radchenko. Re: [BUG] org-goto slows down org-set-property.
> Sun, 11 Jul 2021 19:49:08 +0800.
> https://list.orgmode.org/orgmode/87lf6dul3f.fsf@localhost/

I confirmed that invoking org-refile-get-targets slows down your nm-tst
looping over the headlines.

However, the issue there is not with outline-next-heading. Profiling
shows that the slowdown mostly happens in org-get-property-block.

I had looked into the regexp search C source and did not find anything
that could depend on the number of markers in the buffer.
However, after further analysis prompted by your email, I found that I
may have been wrong and regexp search might actually be affected.

I have now done more extensive profiling with perf:

;; perf cpu with refile cache (using your previous code on my largest Org buffer)
    19.68%   [.] mark_object
     6.20%   [.] buf_bytepos_to_charpos
     5.66%   [.] re_match_2_internal
     5.33%   [.] exec_byte_code
     5.07%   [.] rpl_re_search_2
     3.09%   [.] Fmemq
     2.56%   [.] allocate_vectorlike
     1.86%   [.] sweep_vectors
     1.47%   [.] mark_objects
     1.45%   [.] pdumper_marked_p_impl

;; perf cpu without refile cache (removing getting refile targets from the code)
    18.79%   [.] mark_object
     8.23%   [.] re_match_2_internal
     5.88%   [.] rpl_re_search_2
     4.06%   [.] buf_bytepos_to_charpos
     3.06%   [.] Fmemq
     2.45%   [.] allocate_vectorlike
     1.63%   [.] exec_byte_code
     1.50%   [.] pdumper_marked_p_impl

The bottleneck appears to be buf_bytepos_to_charpos (6.20% with the
refile cache vs. 4.06% without it), called by the BYTE_TO_CHAR macro,
which, in turn, is used by set_search_regs.
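
For context, a byte position and a character position differ whenever
the buffer contains multibyte text, so the conversion cannot be a plain
subtraction. Below is a minimal sketch of the naive conversion, assuming
valid UTF-8 input. naive_bytepos_to_charpos is a hypothetical name used
only for illustration and is not an Emacs function.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Emacs: convert a 0-based byte
   offset into a 0-based character offset by scanning valid UTF-8
   text from the start.  Continuation bytes have the bit pattern
   10xxxxxx and do not start a new character, so only the other
   bytes are counted.  */
static size_t
naive_bytepos_to_charpos (const unsigned char *text, size_t bytepos)
{
  size_t charpos = 0;
  for (size_t i = 0; i < bytepos; i++)
    if ((text[i] & 0xC0) != 0x80)
      charpos++;
  return charpos;
}
```

Such a scan costs time proportional to the distance scanned, which is
presumably why known (bytepos, charpos) pairs, such as those stored in
markers, are consulted as starting points first.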

buf_bytepos_to_charpos contains the following loop:

  for (tail = BUF_MARKERS (b); tail; tail = tail->next)
    {
      CONSIDER (tail->bytepos, tail->charpos);

      /* If we are down to a range of 50 chars,
         don't bother checking any other markers;
         scan the intervening chars directly now.  */
      if (best_above - bytepos < distance
          || bytepos - best_below < distance)
        break;
      else
        distance += BYTECHAR_DISTANCE_INCREMENT;
    }

I am not sure I understand the code correctly, but the running time of
that loop clearly scales with the number of markers: unless a marker
close to the target position is found early, every marker in the chain
may be visited.

Finally, FYI: I plan to work on an alternative mechanism to access Org
headings, a generic Org query library. It will not use markers and will
implement ideas from org-ql. org-refile will eventually use that generic
library instead of the current mechanism.
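
The scaling can be illustrated with a toy model of the marker chain.
This is only a sketch under stated assumptions: the struct, the function
names, and the simplified bracket update standing in for CONSIDER are
all hypothetical, and the initial threshold and increment of 50 are
assumed from the "50 chars" comment rather than taken from the Emacs
source.

```c
#include <stdlib.h>

/* Toy model only: names are illustrative and do not match the real
   Emacs structures.  */
struct toy_marker
{
  long bytepos;
  struct toy_marker *next;
};

/* Push a marker onto the front of the chain.  */
static struct toy_marker *
toy_add_marker (struct toy_marker *chain, long bytepos)
{
  struct toy_marker *m = malloc (sizeof *m);
  if (!m)
    abort ();
  m->bytepos = bytepos;
  m->next = chain;
  return m;
}

/* Walk the chain with the same early-exit rule as the loop above and
   return how many markers were visited.  */
static long
toy_markers_visited (const struct toy_marker *chain, long bytepos,
                     long buf_bytes)
{
  long best_below = 0, best_above = buf_bytes;
  long distance = 50;           /* assumed initial threshold */
  long visited = 0;

  for (const struct toy_marker *tail = chain; tail; tail = tail->next)
    {
      visited++;

      /* Simplified stand-in for CONSIDER: tighten the bracket of
         known positions around BYTEPOS.  */
      if (tail->bytepos <= bytepos && tail->bytepos > best_below)
        best_below = tail->bytepos;
      else if (tail->bytepos > bytepos && tail->bytepos < best_above)
        best_above = tail->bytepos;

      if (best_above - bytepos < distance
          || bytepos - best_below < distance)
        break;
      else
        distance += 50;         /* assumed increment */
    }
  return visited;
}
```

In this model, a chain of N markers that all sit far from the target
position is walked in full, so the cost is linear in N until the growing
distance threshold finally cuts the walk off. A single nearby marker at
the head of the chain ends the walk after one step.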

Best,
Ihor


