From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp2 ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id SBgEKEQc1l7iCQAA0tVLHw (envelope-from ) for ; Tue, 02 Jun 2020 09:30:44 +0000 Received: from aspmx1.migadu.com ([2001:41d0:2:4a6f::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp2 with LMTPS id OF/gI0Qc1l6ZDAAAB5/wlQ (envelope-from ) for ; Tue, 02 Jun 2020 09:30:44 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id CD5C294036B for ; Tue, 2 Jun 2020 09:30:43 +0000 (UTC) Received: from localhost ([::1]:55760 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jg3Fq-0001im-Nt for larch@yhetil.org; Tue, 02 Jun 2020 05:30:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44016) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jg3FI-0001iH-TB for emacs-orgmode@gnu.org; Tue, 02 Jun 2020 05:30:08 -0400 Received: from mail-pg1-x52b.google.com ([2607:f8b0:4864:20::52b]:38976) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jg3FH-0004fY-00 for emacs-orgmode@gnu.org; Tue, 02 Jun 2020 05:30:08 -0400 Received: by mail-pg1-x52b.google.com with SMTP id w20so4828057pga.6 for ; Tue, 02 Jun 2020 02:30:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:in-reply-to:references:date:message-id :mime-version:content-transfer-encoding; bh=T8T5UX8HsVJKt5IKwITnL9hTObQr7THKKu3gSbaLSH4=; b=czV5bHj2HftT+mVY8wsd0ir8Exq3iZBe3yHHKnnhyhCuxsChiShILt4HHegl4M4pJZ qwaw1EYX077FpwdIPUKGOyhsuTymmEOfn97eKl0a3e1rfUWF7tHR72WKWQqCzQGegTaM LM829OXGGiG2Pxv5sZxSzlmSM9TKFVCzkHyGlVablq1vPmr0w8mCyfIOdCZLikX5ueQc QBCG4VQRZhtixhJyxlHBllaaFcyF/y+hB/KOJvRQkyZM287GFZawVpiW5nFLHCvDKpk7 iDAMtfN3grSsF/i1e59IUuEFOG1yIU9hUgGNB5sEfaZ/E/sWMc9a8imZkJbBWH/hWAoa RBbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references:date :message-id:mime-version:content-transfer-encoding; bh=T8T5UX8HsVJKt5IKwITnL9hTObQr7THKKu3gSbaLSH4=; b=gkJntReTBYaxwIVzZ23UosyDawtprpTlqn0s4qTJznNTTQbpUg2cj4a/DdR3k7udXY VUFKUZ+vcydwX4ajGOev77zDvX9ceSVZZawpIGSM5I4essJGA5XHWK/r3FJ1lvwMLv2Z zyixLDJiFVQHH8EJatIijoQ9ZRwxGmatdiWDIO2RAQGXrlJpto6vB3tA8xEOAkIPqW9f ET+h00JSqURPi2vGnR4FT7K3Djxgxc1YGt5BKiGghU8tSSevC/ZgzC10kZ19IXcvMdxA ZdCyRcCHf6Ymfpgu+5VpBY1Q/ZgpPA77gyyIIO29l0Fnr/tJwsVK9qy70Y7/oYyup/oc rxzg== X-Gm-Message-State: AOAM532duG7C3Dze3NOfqvozL7YqBG59uoGg1ODMpLW5CypO3wMrFWcC dUkvlX9/4V0PbcefsTN6bDu1HGszguX+PUZI X-Google-Smtp-Source: ABdhPJzURtzjnfBX8+uCM/fb6JBrNdMEGoEFR6nRdzij1oWArtwizBOOwrptV5V8T83LzR6y4SuzCA== X-Received: by 2002:a63:f959:: with SMTP id q25mr22639840pgk.137.1591090204518; Tue, 02 Jun 2020 02:30:04 -0700 (PDT) Received: from localhost ([101.99.64.65]) by smtp.gmail.com with ESMTPSA id 140sm1819483pfv.38.2020.06.02.02.30.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Jun 2020 02:30:04 -0700 (PDT) From: Ihor Radchenko To: Nicolas Goaziou Subject: Re: [patch suggestion] Mitigating the poor Emacs performance on huge org files: Do not use overlays for PROPERTY and LOGBOOK drawers In-Reply-To: <87367d4ydc.fsf@localhost> References: <87h7x9e5jo.fsf@localhost> <875zdpia5i.fsf@nicolasgoaziou.fr> <87y2qi8c8w.fsf@localhost> <87r1vu5qmc.fsf@nicolasgoaziou.fr> <87imh5w1zt.fsf@localhost> <87blmxjckl.fsf@localhost> <87y2q13tgs.fsf@nicolasgoaziou.fr> <878si1j83x.fsf@localhost> <87d07bzvhd.fsf@nicolasgoaziou.fr> <87imh34usq.fsf@localhost> <87pnbby49m.fsf@nicolasgoaziou.fr> <87tv0efvyd.fsf@localhost> <874kse1seu.fsf@localhost> <87r1vhqpja.fsf@nicolasgoaziou.fr> <87tv0d2nk7.fsf@localhost> <87o8qkhy3g.fsf@nicolasgoaziou.fr> <87sgfqu5av.fsf@localhost> <87sgfn6qpc.fsf@nicolasgoaziou.fr> <87367d4ydc.fsf@localhost> Date: Tue, 02 Jun 2020 17:25:18 +0800 Message-ID: <87v9k93jlt.fsf@localhost> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Received-SPF: pass client-ip=2607:f8b0:4864:20::52b; envelope-from=yantar92@gmail.com; helo=mail-pg1-x52b.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: emacs-orgmode@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "General discussions about Org-mode." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: emacs-orgmode@gnu.org Errors-To: emacs-orgmode-bounces+larch=yhetil.org@gnu.org Sender: "Emacs-orgmode" X-Scanner: scn0 Authentication-Results: aspmx1.migadu.com; dkim=pass header.d=gmail.com header.s=20161025 header.b=czV5bHj2; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (aspmx1.migadu.com: domain of emacs-orgmode-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=emacs-orgmode-bounces@gnu.org X-Spam-Score: -1.21 X-TUID: OudlMYgreojV Github link to the patch: https://gist.github.com/yantar92/6447754415457927293acda43a7fcaef=20 Ihor Radchenko writes: > Hello, > > [The patch itself will be provided in the following email] > > I have three updates from the previous version of the patch: > > 1. I managed to implement buffer-local text properties. > Now, outline folding also uses text properties without a need to give > up independent folding in indirect buffers. > > 2. The code handling modifications in folded drawers/blocks was > rewritten. The new code uses after-change-functions to re-hide text > inserted in the middle of folded regions; and text properties to > unfold folded drawers/blocks if one changes BEGIN/END line. > > 3. [experimental] Started working on improving memory and cpu footprint > of the old code related to folding/unfolding. org-hide-drawer-all now > works significantly faster because I can utilise simplified drawer > parser, which require a lot less memory. Overall, I managed to reduce > Emacs memory footprint after loading all my agenda_files twice. The > loading is also noticeably faster. > > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > > More details on the buffer-local text properties: > > I have found char-property-alias-alist variable that controls how Emacs > calculates text property value if the property is not set. This variable > can be buffer-local, which allows independent 'invisible states in > different buffers. > > All the implementation stays in > org--get-buffer-local-text-property-symbol, which takes care about > generating unique property name and mapping it to 'invisible (or any > other) text property. > > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > > More details on the new implementation for tracking changes: > > I simplified the code as suggested, without using pairs of before- and > after-change-functions. > > Handling text inserted into folded/invisible region is handled by a > simple after-change function. After testing, it turned out that simple > re-hiding text based on 'invisible property of the text before/after the > inserted region works pretty well. > > Modifications to BEGIN/END line of the drawers and blocks is handled via > 'modification-hooks + 'insert-behind-hooks text properties (there is no > after-change-functions analogue for text properties in Emacs). The > property is applied during folding and the modification-hook function is > made aware about the drawer/block boundaries (via apply-partially > passing element containing :begin :end markers for the current > drawer/block). Passing the element boundary is important because the > 'modification-hook will not directly know where it belongs to. Only the > modified region (which can be larger than the drawer) is passed to the > function. In the worst case, the region can be the whole buffer (if one > runs revert-buffer). > > It turned out that adding 'modification-hook text property takes a > significant cpu time (partially, because we need to take care about > possible existing 'modification-hook value, see > org--add-to-list-text-property). For now, I decided to not clear the > modification hooks during unfolding because of poor performance. > However, this approach would lead to partial unfolding in the following > case: > > :asd: > :drawer: > lksjdfksdfjl > sdfsdfsdf > :end: > > If :asd: was inserted in front of folded :drawer:, changes in :drawer: > line of the new folded :asd: drawer would reveal the text between > :drawer: and :end:. > > Let me know what you think on this. > >> You shouldn't be bothered by the case you're describing here, for >> multiple reasons. >>=20 >> First, this issue already arises in the current implementation. No one >> bothered so far: this change is very unlikely to happen. If it becomes >> an issue, we could make sure that `org-reveal' handles this. >>=20 >> But, more importantly, we actually /want it/ as a feature. Indeed, if >> DRAWER is expanded every time ":BLAH:" is inserted above, then inserting >> a drawer manually would unfold /all/ drawers in the section. The user is >> more likely to write first ":BLAH:" (everything is unfolded) then >> ":END:" than ":END:", then ":BLAH:". > > Agree. This allowed me to simplify the code significantly. > >> It seems you're getting it backwards. `before-change-functions' are the >> functions being called with a possibly wide, imprecise, region to >> handle: >>=20 >> When that happens, the arguments to =E2=80=98before-change-functions= =E2=80=99 will >> enclose a region in which the individual changes are made, but won= =E2=80=99t >> necessarily be the minimal such region >>=20 >> however, after-change-functions calls are always minimal: >>=20 >> and the arguments to each successive call of >> =E2=80=98after-change-functions=E2=80=99 will then delimit the part = of text being >> changed exactly. >>=20 >> If you stick to `after-change-functions', there will be no such thing as >> you describe. > > You are right here, I missed that before-change-functions are likely to > be called on large regions. I thought that the regions are same for > before/after-change-functions, but after-change-functions could be > called more than 1 time. After second thought, your vision that it is > mostly 0 or 1 times should be the majority of cases in practice. > > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > > More details on reducing cpu and memory footprint of org buffers: > > My simplified implementation of element boundary parser > (org--get-element-region-at-point) appears to be much faster and also > uses much less memory in comparison with org-element-at-point. > Moreover, not all the places where org-element-at-point is called > actually need the full parsed element. For example, org-hide-drawer-all, > org-hide-drawer-toggle, org-hide-block-toggle, and > org--hide-wrapper-toggle only need element type and some information > about the element boundaries - the information we can get from > org--get-element-region-at-point. > > The following version of org-hide-drawer-all seems to work much faster > in comparison with original: > > (defun org-hide-drawer-all () > "Fold all drawers in the current buffer." > (save-excursion > (goto-char (point-min)) > (while (re-search-forward org-drawer-regexp nil t) > (when-let* ((drawer (org--get-element-region-at-point '(property-dr= awer drawer))) > (type (org-element-type drawer))) > (org-hide-drawer-toggle t nil drawer) > ;; Make sure to skip drawer entirely or we might flag it > ;; another time when matching its ending line with > ;; `org-drawer-regexp'. > (goto-char (org-element-property :end drawer)))))) > > What do you think about the idea of making use of > org--get-element-region-at-point in org code base? > > ----------------------------------------------------------------------- > ----------------------------------------------------------------------- > > Further work: > > 1. Look into other code using overlays. Specifically, > org-toggle-custom-properties, Babel hashes, and narrowed table columns. > > Best, > Ihor > > Nicolas Goaziou writes: > >> Hello, >> >> Ihor Radchenko writes: >> >>> I have five updates from the previous version of the patch: >> >> Thank you. >> >>> 1. I implemented a simplified version of element parsing to detect >>> changes in folded drawers or blocks. No computationally expensive calls >>> of org-element-at-point or org-element-parse-buffer are needed now. >>> >>> 2. The patch is now compatible with master (commit 2e96dc639). I >>> reverted the earlier change in folding drawers and blocks. Now, they are >>> back to using 'org-hide-block and 'org-hide-drawer. Using 'outline would >>> achieve nothing when we use text properties. >>> >>> 3. 'invisible text property can now be nested. This is important, for >>> example, when text inside drawers contains fontified links (which also >>> use 'invisible text property to hide parts of the link). Now, the old >>> 'invisible spec is recovered after unfolding. >> >> Interesting. I'm running out of time, so I cannot properly inspect the >> code right now. I'll try to do that before the end of the week. >> >>> 4. Some outline-* function calls in org referred to outline-flag-region >>> implementation, which is not in sync with org-flag-region in this patch. >>> I have implemented their org-* versions and replaced the calls >>> throughout .el files. Actually, some org-* versions were already >>> implemented in org, but not used for some reason (or not mentioned in >>> the manual). I have updated the relevant sections of manual. These >>> changes might be relevant to org independently of this feature branch. >> >> Yes, we certainly want to move to org-specific versions in all cases. >> >>> 5. I have managed to get a working version of outline folding via text >>> properties. However, that approach has a big downside - folding state >>> cannot be different in indirect buffer when we use text properties. I >>> have seen packages relying on this feature of org and I do not see any >>> obvious way to achieve different folding state in indirect buffer while >>> using text properties for outline folding. >> >> Hmm. Good point. This is a serious issue to consider. Even if we don't >> use text properties for outline, this also affects drawers and blocks. >> >>> For now, I still used before/after-change-functions combination. >> >> You shouldn't. >> >>> I see the following problems with using only after-change-functions:=20 >>> >>> 1. They are not guaranteed to be called after every single change: >> >> Of course they are! See below. >> >>> From (elisp) Change Hooks: >>> "... some complex primitives call =E2=80=98before-change-functions=E2= =80=99 once before >>> making changes, and then call =E2=80=98after-change-functions=E2=80=99 = zero or more >>> times" >> >> "zero" means there are no changes at all, so, `after-change-functions' >> are not called, which is expected. >> >>> The consequence of it is a possibility that region passed to the >>> after-change-functions is quite big (including all the singular changes, >>> even if they are distant). This region may contain changed drawers as >>> well and unchanged drawers and needs to be parsed to determine which >>> drawers need to be re-folded. >> >> It seems you're getting it backwards. `before-change-functions' are the >> functions being called with a possibly wide, imprecise, region to >> handle: >> >> When that happens, the arguments to =E2=80=98before-change-functions= =E2=80=99 will >> enclose a region in which the individual changes are made, but won= =E2=80=99t >> necessarily be the minimal such region >> >> however, after-change-functions calls are always minimal: >> >> and the arguments to each successive call of >> =E2=80=98after-change-functions=E2=80=99 will then delimit the part = of text being >> changed exactly. >> >> If you stick to `after-change-functions', there will be no such thing as >> you describe. >> >>>> And, more importantly, they are not meant to be used together, i.e., y= ou >>>> cannot assume that a single call to `before-change-functions' always >>>> happens before calling `after-change-functions'. This can be tricky if >>>> you want to use the former to pass information to the latter. >>> >>> The fact that before-change-functions can be called multiple times >>> before after-change-functions, is trivially solved by using buffer-local >>> changes register (see org--modified-elements). >> >> Famous last words. Been there, done that, and it failed. >> >> Let me quote the manual: >> >> In general, we advise to use either before- or the after-change >> hooks, but not both. >> >> So, let me insist: don't do that. If you don't agree with me, let's at >> least agree with Emacs developers. >> >>> The register is populated by before-change-functions and cleared by >>> after-change-functions. >> >> You cannot expect `after-change-functions' to clear what >> `before-change-functions' did. This is likely to introduce pernicious >> bugs. Sorry if it sounds like FUD, but bugs in those areas are just >> horrible to squash. >> >>>> Well, `before-change-fuctions' and `after-change-functions' are not >>>> clean at all: you modify an unrelated part of the buffer, but still ca= ll >>>> those to check if a drawer needs to be unfolded somewhere. >>> >>> 2. As you pointed, instead of global before-change-functions, we can use >>> modification-hooks text property on sensitive parts of the >>> drawers/blocks. This would work, but I am concerned about one annoying >>> special case: >>> >>> -----------------------------------------------------------------------= -- >>> :BLAH: >>> >>> >>> >>> :DRAWER: >>> Donec at pede. >>> :END: >>> -----------------------------------------------------------------------= -- >>> In this example, the user would not be able to unfold the folder DRAWER >>> because it will technically become a part of a new giant BLAH drawer. >>> This may be especially annoying if is more than one screen >>> long and there is no easy way to identify why unfolding does not work >>> (with point at :DRAWER:). >> >> You shouldn't be bothered by the case you're describing here, for >> multiple reasons. >> >> First, this issue already arises in the current implementation. No one >> bothered so far: this change is very unlikely to happen. If it becomes >> an issue, we could make sure that `org-reveal' handles this. >> >> But, more importantly, we actually /want it/ as a feature. Indeed, if >> DRAWER is expanded every time ":BLAH:" is inserted above, then inserting >> a drawer manually would unfold /all/ drawers in the section. The user is >> more likely to write first ":BLAH:" (everything is unfolded) then >> ":END:" than ":END:", then ":BLAH:". >> >>> Because of this scenario, limiting before-change-functions to folded >>> drawers is not sufficient. Any change in text may need to trigger >>> unfolding. >> >> after-change-functions is more appropriate than before-change-functions, >> and local parsing, as explained in this thread, is more efficient than >> re-inventing the parser. >> >>> In the patch, I always register possible modifications in the >>> blocks/drawers intersecting with the modified region + a drawer/block >>> right next to the region. >>> >>> ----------------------------------------------------------------------- >>> ----------------------------------------------------------------------- >>> >>> More details on the nested 'invisible text property implementation. >>> >>> The idea is to keep 'invisible property stack push and popping from it >>> as we add/remove 'invisible text property. All the work is done in >>> org-flag-region. >> >> This sounds like a good idea. >> >>> This was originally intended for folding outlines via text properties. >>> Since using text properties for folding outlines is not a good idea, >>> nested text properties have much less use. >> >> AFAIU, they have. You mention link fontification, but there are other >> pieces that we could switch to text properties instead of overlays, >> e.g., Babel hashes, narrowed table columns=E2=80=A6 >> >>> 3. Multiple calls to before/after-change-functions is still a problem. I >>> am looking into following ways to reduce this number: >>> - reduce the number of elements registered as potentially modified >>> + do not add duplicates to org--modified-elements >>> + do not add unfolded elements to org--modified-elements >>> + register after-change-function as post-command hook and remove it >>> from global after-change-functions. This way, it will be called >>> twice per command only. >>> - determine common region containing org--modified-elements. if change >>> is happening within that region, there is no need to parse >>> drawers/blocks there again. >> >> This is over-engineering. Again, please focus on local changes, as >> discussed before. >> >>> Recipe to have different (org-element-at-point) and >>> (org-element-parse-buffer 'element) >>> -----------------------------------------------------------------------= -- >>> >>> :PROPERTIES: >>> :CREATED: [2020-05-23 Sat 02:32] >>> :END: >>> >>> >>> >>> -----------------------------------------------------------------------= -- >> >> I didn't look at this situation in particular, but there are cases where >> different :post-blank values are inevitable, for example at the end of >> a section. >> >> Regards, >> >> --=20 >> Nicolas Goaziou > > --=20 > Ihor Radchenko, > PhD, > Center for Advancing Materials Performance from the Nanoscale (CAMP-nano) > State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong= University, Xi'an, China > Email: yantar92@gmail.com, ihor_radchenko@alumni.sutd.edu.sg --=20 Ihor Radchenko, PhD, Center for Advancing Materials Performance from the Nanoscale (CAMP-nano) State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong U= niversity, Xi'an, China Email: yantar92@gmail.com, ihor_radchenko@alumni.sutd.edu.sg