emacs-orgmode@gnu.org archives
From: Ihor Radchenko <yantar92@gmail.com>
To: Nicolas Goaziou <mail@nicolasgoaziou.fr>
Cc: emacs-orgmode@gnu.org
Subject: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers
Date: Tue, 19 May 2020 00:52:08 +0800
Message-ID: <87tv0d2nk7.fsf@localhost>
In-Reply-To: <87r1vhqpja.fsf@nicolasgoaziou.fr>

> As you noticed, using Org Element is a no-go, unfortunately. Parsing an
> element is a O(N) operation by the number of elements before it in
> a section. In particular, it is not bounded, and not mitigated by
> a cache. For large documents, it is going to be unbearably slow, too.

Ouch. I thought it was faster.
What do you mean by "not mitigated by a cache"?

The reason I would like to use the org-element parser is to make tracking
modifications more robust. Relying on details of the syntax would make the
code fragile if the syntax is ever changed in the future.
Debugging modification functions is not easy, in my experience.

One possible way to avoid performance issues during modification is
to run the parser in advance. For example, folding an element may
as well add information about the element to its text properties.
This will not degrade the performance of folding, since we are already
parsing the element during folding (at least, in
org-hide-drawer-toggle).
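
A minimal sketch of what I mean, using only core Emacs functions. The
helper names (my/remember-element, my/element-info-at) and the property
name my-element-info are hypothetical and not part of Org; the caller
is assumed to be whatever code already knows the element boundaries at
folding time:

```elisp
;; Hypothetical helpers for illustration; not part of Org.
(defun my/remember-element (beg end type)
  "Store TYPE and boundary markers on the text between BEG and END.
Return the stored property list."
  (let ((info (list :type type
                    :begin (copy-marker beg)
                    ;; Insertion type t: END moves along with text
                    ;; inserted right at the end of the element.
                    :end (copy-marker end t))))
    ;; Do not mark the buffer as modified just for bookkeeping.
    (with-silent-modifications
      (put-text-property beg end 'my-element-info info))
    info))

(defun my/element-info-at (pos)
  "Return the info stored by `my/remember-element' at POS, or nil."
  (get-text-property pos 'my-element-info))
```

A change function could then call (my/element-info-at (point)) and get
the element type and boundaries without parsing anything.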

The problem with parsing an element during folding is that we cannot
easily detect changes like the ones below without re-parsing.

:PROPERTIES: <folded>
:CREATED: [2020-05-18 Mon]
:END: <- added line
:ID: test
:END:

or even

:PROPERTIES:
:CREATED: [2020-05-18 Mon]
:ID: test
:END: <- delete this line

:DRAWER: <folded, cannot be unfolded if we don't re-parse after deletion>
test
:END:

The re-parsing can be done via regexp, as you suggested, but I don't
like this idea, because it will end up re-implementing
org-element-*-parser. Would it be acceptable to run org-element-*-parser
in after-change-functions?

> If you use modification-hooks and al., you don't need to parse anything,
> because you can store information as text properties. Therefore, once
> the modification happens, you already know where you are (or, at least
> where you were before the change).

> The ideas I suggested about sensitive parts of elements are worth
> exploring, IMO. Do you have any issue with them?

If I understand correctly, it is not that easy.
Consider the following example:

:PROPERTIES:
:CREATED: [2020-05-18 Mon]
<region-beginning>
:ID: example
:END:

<... a lot of text, maybe containing other drawers ...>

Nullam rutrum.
Pellentesque dapibus suscipit ligula.
<region-end>
Proin quam nisl, tincidunt et, mattis eget, convallis nec, purus.

If the region gets deleted, the modification hooks from the characters
inside the drawer will be called as (hook-function <region-beginning>
<region-end>). So, there is still a need to find the drawer somehow to
mark it as about to be modified (modification hooks are run before the
actual modification).
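
This can be checked with a small experiment. The sketch below is only
an illustration: my/hook-calls, my/log-modification and
my/demo-region-deletion are hypothetical names, and the buffer contents
are made up. It puts a modification-hooks property on a drawer, deletes
a region that starts inside the drawer and ends after it, and records
the arguments the hook receives:

```elisp
(defvar my/hook-calls nil
  "Hypothetical log of (BEG END) pairs seen by `my/log-modification'.")

(defun my/log-modification (beg end)
  "Record the bounds BEG and END of the change about to happen."
  (push (list beg end) my/hook-calls))

(defun my/demo-region-deletion ()
  "Delete a region that starts inside a drawer and ends after it.
Return a list (REGION-BEG REGION-END CALLS)."
  (setq my/hook-calls nil)
  (with-temp-buffer
    (insert "* Heading\n:PROPERTIES:\n:CREATED: [2020-05-18 Mon]\n"
            ":ID: example\n:END:\n\nNullam rutrum.\nProin quam nisl.\n")
    (goto-char (point-min))
    (let* ((drawer-beg (progn (search-forward ":PROPERTIES:")
                              (match-beginning 0)))
           (region-beg (progn (search-forward ":ID:")
                              (match-beginning 0)))
           (drawer-end (progn (search-forward ":END:")
                              (match-end 0)))
           (region-end (progn (search-forward "rutrum.")
                              (match-end 0))))
      ;; Only the drawer text carries the hook.
      (put-text-property drawer-beg drawer-end
                         'modification-hooks '(my/log-modification))
      (delete-region region-beg region-end)
      (list region-beg region-end my/hook-calls))))
```

Every recorded pair should equal the bounds of the deleted region, not
the bounds of the drawer. How many times the hook runs for a single
deletion is not something I would rely on.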

The only difference between using modification hooks and
before-change-functions is that modification hooks will trigger less
frequently. Considering the performance of org-element-at-point, that is
probably worth doing. Initially, I wanted to avoid it because setting a
single before-change-functions hook sounded cleaner than setting
modification-hooks, insert-behind-hooks, and insert-in-front-hooks.
Moreover, these text properties would be copied by default if one uses
buffer-substring. Then, the hooks would also trigger later in the yanked
text, which may cause all kinds of bugs.
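
To illustrate the copying concern, here is a small sketch
(my/demo-copied-properties is a hypothetical name; `ignore' merely
stands in for a real hook function). It compares buffer-substring with
buffer-substring-no-properties on text that carries modification-hooks:

```elisp
(defun my/demo-copied-properties ()
  "Return the `modification-hooks' found on two copies of the same text.
The first copy is made with `buffer-substring', the second with
`buffer-substring-no-properties'."
  (with-temp-buffer
    (insert ":ID: test\n")
    (put-text-property (point-min) (point-max)
                       'modification-hooks '(ignore))
    (let ((with-props (buffer-substring (point-min) (point-max)))
          (without-props (buffer-substring-no-properties
                          (point-min) (point-max))))
      (list (get-text-property 0 'modification-hooks with-props)
            (get-text-property 0 'modification-hooks without-props)))))
```

The first element of the result is the hook list and the second is nil,
so any code that copies text with buffer-substring carries the hooks
along unless it strips them explicitly.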

> `org-element-at-point' is local, `org-element-parse-buffer' is global.
> They are not equivalent, but is it an issue?

It was mostly an annoyance, because they returned different results on
the same element. Specifically, they returned different :post-blank and
:end properties, which does not sound right.

Best,
Ihor




Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

> Hello,
>
> Ihor Radchenko <yantar92@gmail.com> writes:
>
>> Apparently my previous email was again refused by your mail server (I
>> tried to add patch as attachment this time).
>
> Ah. This is annoying, for you and for me.
>
>> The patch is in
>> https://gist.github.com/yantar92/6447754415457927293acda43a7fcaef
>
> Thank you.
>
>>> I have finished a seemingly stable implementation of handling changes
>>> inside drawer and block elements. For now, I did not bother with
>>> 'modification-hooks and 'insert-in-front/behind-hooks, but simply used
>>> before/after-change-functions.
>>>
>>> The basic idea is saving parsed org-elements before the modification
>>> (with :begin and :end replaced by markers) and comparing them with the 
>>> versions of the same elements after the modification.
>>> Any valid org element can be examined in such way by an arbitrary
>>> function (see org-track-modification-elements) [1].
>
> As you noticed, using Org Element is a no-go, unfortunately. Parsing an
> element is a O(N) operation by the number of elements before it in
> a section. In particular, it is not bounded, and not mitigated by
> a cache. For large documents, it is going to be unbearably slow, too.
>
> I don't think the solution is to use combine-after-change-calls either,
> because even a single call to `org-element-at-point' can be noticeable
> in a very large section. Such low-level code should avoid using the
> Element library altogether, except for the initial folding part, which
> is interactive.
>
> If you use modification-hooks and al., you don't need to parse anything,
> because you can store information as text properties. Therefore, once
> the modification happens, you already know where you are (or, at least
> where you were before the change).
>
> The ideas I suggested about sensitive parts of elements are worth
> exploring, IMO. Do you have any issue with them?
>
>>> For (2), I have introduced org--property-drawer-modified-re to override
>>> org-property-drawer-re in relevant *-change-function. This seems to work
>>> for property drawers. However, I am not sure if similar problem may
>>> happen in some border cases with ordinary drawers or blocks. 
>
> I already specified what parts were "sensitive" in a previous message.
>
>>> 2. I have noticed that results of org-element-at-point and
>>> org-element-parse-buffer are not always consistent.
>
> `org-element-at-point' is local, `org-element-parse-buffer' is global.
> They are not equivalent, but is it an issue?
>
>
> Regards,
>
> -- 
> Nicolas Goaziou

-- 
Ihor Radchenko,
PhD,
Center for Advancing Materials Performance from the Nanoscale (CAMP-nano)
State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an, China
Email: yantar92@gmail.com, ihor_radchenko@alumni.sutd.edu.sg

