* How to make agenda generation faster
From: Marcin Borkowski @ 2018-10-07 4:53 UTC
To: Org-Mode mailing list
(3 replies; 24+ messages in thread)

Hi Orgers,

my agenda takes almost 10 seconds to show up.  Are there any ideas for
profiling that?

I suspect that archiving a lot of old entries I don't use anymore might
help, but is there any way to e.g. display some stats on which
file/headline took how much time?

TIA,

--
Marcin Borkowski
http://mbork.pl

* Re: How to make agenda generation faster
From: Michael Welle @ 2018-10-08 7:20 UTC
To: emacs-orgmode

Hello,

Marcin Borkowski <mbork@mbork.pl> writes:

> Hi Orgers,
>
> my agenda takes almost 10 seconds to show up.  Are there any ideas for
> profiling that?
>
> I suspect that archiving a lot of old entries I don't use anymore might
> help, but is there any way to e.g. display some stats on which
> file/headline took how much time?

since no one answered yet, there are some similar threads. IIRC the way
to go is to use elp for profiling.

Well, on my laptop the initial agenda run takes about 7s or so (150
agenda files) using the current day/week agenda ("a"). All subsequent
(after loading the files) agenda runs are fast (split second I would
say). I had some performance issues in the past caused by SCM. Emacs
tried to check if every file is checked out in the latest version. That
slowed down the process a lot (starting 150 mercurial processes in
sequential order, checking results, etc.). The initial run doesn't
bother me much. I bound the initial agenda run to an idle timer at Emacs
start.

Regards
hmw
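
The two ideas in this message (profiling with elp, and finding out which
file costs how much) can be sketched roughly as below. This is only a
sketch, assuming a stock Emacs with Org available. The `my/` functions
are hypothetical helpers, not part of Org. `elp-instrument-package`,
`elp-results`, `elp-restore-all`, `benchmark-elapse`, `org-agenda-files`
and `org-agenda-list` are existing functions. Timing is per file only;
per-headline timing is not attempted here.

```elisp
(require 'benchmark)
(require 'elp)

;; Declare the Org variable as special so the `let' below binds it
;; dynamically even before Org is loaded.
(defvar org-agenda-files)

(defun my/time-each (items fn)
  "Call FN on each element of ITEMS and time each call.
Return an alist of (ITEM . SECONDS), slowest first."
  (sort (mapcar (lambda (item)
                  (cons item (benchmark-elapse (funcall fn item))))
                items)
        (lambda (a b) (> (cdr a) (cdr b)))))

(defun my/agenda-time-per-file ()
  "Build the day/week agenda once per agenda file and report timings.
`org-agenda-files' is rebound to a single file for each run, so each
run only scans that file."
  (interactive)
  (require 'org-agenda)
  (let ((timings (my/time-each
                  (org-agenda-files)
                  (lambda (file)
                    (let ((org-agenda-files (list file)))
                      (org-agenda-list))))))
    (dolist (entry timings)
      (message "%6.2fs  %s" (cdr entry) (car entry)))
    timings))

(defun my/agenda-profile ()
  "Profile one agenda run with elp and show the results buffer."
  (interactive)
  (require 'org-agenda)
  ;; Instrument every function whose name starts with "org".
  (elp-instrument-package "org")
  (unwind-protect
      (progn (org-agenda-list)
             (elp-results))
    (elp-restore-all)))
```

Running `M-x my/agenda-profile` should show call counts and elapsed
times per Org function, while `M-x my/agenda-time-per-file` prints one
line per agenda file. The first run for a file also includes the cost of
visiting it, so running the per-file helper twice gives a fairer picture.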

* Re: How to make agenda generation faster
From: Marcin Borkowski @ 2018-10-10 20:03 UTC
To: Michael Welle
Cc: emacs-orgmode

On 2018-10-08, at 09:20, Michael Welle <mwe012008@gmx.net> wrote:

> Hello,
>
> Marcin Borkowski <mbork@mbork.pl> writes:
>
>> Hi Orgers,
>>
>> my agenda takes almost 10 seconds to show up.  Are there any ideas for
>> profiling that?
>>
>> I suspect that archiving a lot of old entries I don't use anymore might
>> help, but is there any way to e.g. display some stats on which
>> file/headline took how much time?
> since no one answered yet, there are some similar threads. IIRC the way
> to go is to use elp for profiling.
>
> Well, on my laptop the initial agenda run takes about 7s or so (150
> agenda files) using the current day/week agenda ("a"). All subsequent
> (after loading the files) agenda runs are fast (split second I would
> say). I had some performance issues in the past caused by SCM. Emacs
> tried to check if every file is checked out in the latest version. That
> slowed down the process a lot (starting 150 mercurial processes in
> sequential order, checking results, etc.). The initial run doesn't
> bother me much. I bound the initial agenda run to an idle timer at Emacs
> start.

Interesting.  I did not notice such differences between the first and
subsequent runs.

Anyway, thanks for your input (to all people who replied, actually).

--
Marcin Borkowski
http://mbork.pl

* Re: How to make agenda generation faster
From: Samuel Wales @ 2018-10-10 21:01 UTC
To: Marcin Borkowski
Cc: emacs-orgmode, Michael Welle

for cleaning logbook entries, i'd enjoy having an agenda view that shows
every entry that has state changes [above a minimum number of them to
keep it small], with the size of the logbook drawer in the prefix or so
next to the category, sorted by that size.  there would be a
corresponding agenda batch command that would archive, delete, or
archive all except most recent for the marked entries.

is it the number of headlines in a file or the total number in agenda
files?

i think it's great to have org-ql.  lispy query is great.  although
mostly i just use text search, it would be more memorizable syntax for
tags type search [and custom sorts?].  is this a suitable start for
agenda-ng?  will it be cleaner and faster?

another speedup possibility might be to allow redoing the agenda with a
new sorting strategy without having to redo the scanning of agenda
files.

i agree not scanning unchanged buffers could really speed up the agenda
in principle.  [it'd be great if emacs could parallelize across smp
cores in addition.  :]]

--
The Kafka Pandemic: <http://thekafkapandemic.blogspot.com>

The disease DOES progress.  MANY people have died from it.  And
ANYBODY can get it at any time.

"You’ve really gotta quit this and get moving, because this is murder
by neglect." ---
<http://www.meaction.net/2017/02/03/pwme-people-with-me-are-being-murdered-by-neglect>.

* Re: How to make agenda generation faster
From: Michael Welle @ 2018-10-11 6:48 UTC
To: emacs-orgmode

Hello,

Marcin Borkowski <mbork@mbork.pl> writes:

> On 2018-10-08, at 09:20, Michael Welle <mwe012008@gmx.net> wrote:
[...]
>> Well, on my laptop the initial agenda run takes about 7s or so (150
>> agenda files) using the current day/week agenda ("a"). All subsequent
>> (after loading the files) agenda runs are fast (split second I would
>> say). I had some performance issues in the past caused by SCM. Emacs
>> tried to check if every file is checked out in the latest version. That
>> slowed down the process a lot (starting 150 mercurial processes in
>> sequential order, checking results, etc.). The initial run doesn't
>> bother me much. I bound the initial agenda run to an idle timer at Emacs
>> start.
>
> Interesting.  I did not notice such differences between the first and
> subsequent runs.

I thought that behaviour is natural, scanning dirs for files and opening
them is a costly operation. But a week ago I changed from rotating rust
to solid state disks and that behaviour did not change much. I expected
a speed up, but meh.

Regards
hmw

* Re: How to make agenda generation faster
From: Marcin Borkowski @ 2018-10-11 8:48 UTC
To: Michael Welle
Cc: emacs-orgmode

On 2018-10-11, at 08:48, Michael Welle <mwe012008@gmx.net> wrote:

> Hello,
>
> Marcin Borkowski <mbork@mbork.pl> writes:
>
>> On 2018-10-08, at 09:20, Michael Welle <mwe012008@gmx.net> wrote:
> [...]
>>> Well, on my laptop the initial agenda run takes about 7s or so (150
>>> agenda files) using the current day/week agenda ("a"). All subsequent
>>> (after loading the files) agenda runs are fast (split second I would
>>> say). I had some performance issues in the past caused by SCM. Emacs
>>> tried to check if every file is checked out in the latest version. That
>>> slowed down the process a lot (starting 150 mercurial processes in
>>> sequential order, checking results, etc.). The initial run doesn't
>>> bother me much. I bound the initial agenda run to an idle timer at Emacs
>>> start.
>>
>> Interesting.  I did not notice such differences between the first and
>> subsequent runs.
> I thought that behaviour is natural, scanning dirs for files and opening
> them is a costly operation. But a week ago I changed from rotating rust
> to solid state disks and that behaviour did not change much. I expected
> a speed up, but meh.

Ah, I have /visiting/ all my agenda files (but not generating the agenda
itself) in my init.el.

That explains a lot.

Best,

--
Marcin Borkowski
http://mbork.pl
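
Pre-visiting the agenda files, as described above, might look roughly
like this in an init file. It is a sketch under assumptions: the `my/`
names are hypothetical, the 10-second idle delay is arbitrary, and
neither poster's actual configuration is shown in the thread.
`run-with-idle-timer`, `find-file-noselect` and `org-agenda-files` are
existing functions.

```elisp
(defvar my/agenda-warmup-timer nil
  "Idle timer that pre-loads the agenda files (hypothetical helper).")

(defun my/visit-agenda-files ()
  "Visit every agenda file in the background, without displaying it."
  (require 'org)
  (dolist (file (org-agenda-files))
    (when (file-readable-p file)
      (find-file-noselect file))))

;; Run once, after Emacs has been idle for 10 seconds (arbitrary delay).
(setq my/agenda-warmup-timer
      (run-with-idle-timer 10 nil #'my/visit-agenda-files))
```

To build the agenda itself during idle time, as Michael describes, the
timer function could call `org-agenda-list` instead. If version-control
checks are what make visiting slow, as in the Mercurial case quoted
above, removing the relevant backend from `vc-handled-backends` is one
commonly used option.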

* Re: How to make agenda generation faster
From: Samuel Wales @ 2018-10-11 19:59 UTC
To: Marcin Borkowski
Cc: emacs-orgmode, Michael Welle

i too visit all files when emacs starts.

are we saying that the speed depends on the number of headlines total
or the number of headlines in a single file among the agenda files?

* Re: How to make agenda generation faster
From: Marcin Borkowski @ 2018-10-14 8:51 UTC
To: Samuel Wales
Cc: emacs-orgmode, Michael Welle

On 2018-10-11, at 21:59, Samuel Wales <samologist@gmail.com> wrote:

> i too visit all files when emacs starts.
>
> are we saying that the speed depends on the number of headlines total
> or the number of headlines in a single file among the agenda files?

Probably the former...?

--
Marcin Borkowski
http://mbork.pl

* Re: How to make agenda generation faster
From: Adam Porter @ 2018-10-09 6:37 UTC
To: emacs-orgmode

Hi Marcin,

My feedback is: there be dragons.  ;)  The Agenda code is very
complicated and hard to follow, and it's hard to optimize something that
is hard to understand.

In the long run, to get significant speed improvements, I think it may
be necessary to reimplement the Agenda.  However, due to the nature of
it (i.e. regexp searches through buffers to find entries), I don't know
how much faster it can be made.  I don't mean that I doubt it can be--I
mean that, truly, I don't know, because it's hard to understand the flow
of the code.

I think that it is already fairly well optimized, given its limitations.
However, an example of a potential improvement would be to refactor it
to work with lexical-binding enabled (which didn't exist when it was
first created); I can't say how much of an improvement it would make,
but my understanding is that code that runs with lexical-binding enabled
is generally faster.  But doing that would be a non-trivial project, I
think, requiring the fixing of many inevitable regressions in the
process.

If you haven't seen them already, you may find my org-ql and
org-ql-agenda code useful.  org-ql-agenda presents an Agenda-like
buffer.  N.B. It does *not* implement most of the Agenda features, but
it does emulate an Org Agenda buffer by setting the appropriate text
properties on entries and formatting them in a similar way.

It's built on org-ql, which provides per-buffer query caching, which
means that generating an org-ql-agenda view for Org buffers that haven't
changed since the last view was generated is very fast.  It's also
written in a more functional way, which I think is easier to follow and
modify.

Performance of uncached queries/buffers depends on the query--some are
relatively fast, while others are slower than the "real" Org Agenda.  I
think there is significant potential for optimizations, and I'm hoping
to implement some in the future.

Your feedback would be appreciated!

https://github.com/alphapapa/org-ql

* Re: How to make agenda generation faster
From: Nicolas Goaziou @ 2018-10-09 16:11 UTC
To: Adam Porter
Cc: emacs-orgmode

Hello,

Adam Porter <adam@alphapapa.net> writes:

> My feedback is: there be dragons.  ;)  The Agenda code is very
> complicated and hard to follow, and it's hard to optimize something that
> is hard to understand.

And hard to maintain. We should really do something about it.

> In the long run, to get significant speed improvements, I think it may
> be necessary to reimplement the Agenda.

Agreed.

> However, due to the nature of it (i.e. regexp searches through buffers
> to find entries), I don't know how much faster it can be made.  I don't
> mean that I doubt it can be--I mean that, truly, I don't know, because
> it's hard to understand the flow of the code.
>
> I think that it is already fairly well optimized, given its limitations.
> However, an example of a potential improvement would be to refactor it
> to work with lexical-binding enabled (which didn't exist when it was
> first created); I can't say how much of an improvement it would make,
> but my understanding is that code that runs with lexical-binding enabled
> is generally faster.

Not really. But it's certainly easier to understand since it removes one
class of problems.

> But doing that would be a non-trivial project, I
> think, requiring the fixing of many inevitable regressions in the
> process.
>
> If you haven't seen them already, you may find my org-ql and
> org-ql-agenda code useful.  org-ql-agenda presents an Agenda-like
> buffer.  N.B. It does *not* implement most of the Agenda features, but
> it does emulate an Org Agenda buffer by setting the appropriate text
> properties on entries and formatting them in a similar way.

Instead of re-inventing the wheel, or putting efforts into a wheel-like
invention, wouldn't it make sense to actually work on Org Agenda itself?

I didn't look closely at org-ql, but I had the idea of splitting the
Agenda in two distinct parts. One would be responsible for collecting,
possibly asynchronously, and caching data from Org documents. The other
one would provide a DSL to query and display the results extracted from
the output of the first part. The second part could even be made generic
enough to be extracted from Org and become some part of Emacs.
Displaying filtered data, maybe in a timeline, could be useful for other
packages. Unfortunately, I don't have time to work on this. Ah well.

So again, wouldn't it be nice to think about Org Agenda-ng?

Regards,

--
Nicolas Goaziou

* Re: How to make agenda generation faster
From: Marcin Borkowski @ 2018-10-10 20:01 UTC
To: Nicolas Goaziou
Cc: Adam Porter, emacs-orgmode

On 2018-10-09, at 18:11, Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote:

> Hello,
>
> Adam Porter <adam@alphapapa.net> writes:
>
>> My feedback is: there be dragons.  ;)  The Agenda code is very
>> complicated and hard to follow, and it's hard to optimize something that
>> is hard to understand.
>
> And hard to maintain. We should really do something about it.
>
>> In the long run, to get significant speed improvements, I think it may
>> be necessary to reimplement the Agenda.
>
> Agreed.

+1

> [...]
>
> I didn't look closely at org-ql, but I had the idea of splitting the
> Agenda in two distinct parts. One would be responsible for collecting,
> possibly asynchronously, and caching data from Org documents. The other
> one would provide a DSL to query and display the results extracted from
> the output of the first part. The second part could even be made generic
> enough to be extracted from Org and become some part of Emacs.
> Displaying filtered data, maybe in a timeline, could be useful for other
> packages. Unfortunately, I don't have time to work on this. Ah well.
>
> So again, wouldn't it be nice to think about Org Agenda-ng?

That is a great idea!

In general, I find Org-mode to be lacking APIs.  I'd love to build some
applications on top of it, but getting some information is very
difficult.  (For instance, I'd like to get info about clocks for all
headlines in the agenda.  It seems I have to implement parsing clocks
myself, at least partially.)

Best,

--
Marcin Borkowski
http://mbork.pl
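
For the clock example, one possible route is Org's own parser in
org-element.el rather than hand-written parsing. The sketch below is an
assumption about what would be useful here, not something proposed in
the thread. `my/clocks-in-buffer` is a hypothetical helper, while
`org-element-parse-buffer`, `org-element-map`, `org-element-lineage` and
`org-element-property` are existing org-element functions. Whether this
covers the use case Marcin has in mind is not established.

```elisp
(require 'org)
(require 'org-element)

(defun my/clocks-in-buffer ()
  "Return a list of (HEADLINE-TITLE . DURATION) for closed clocks.
Must be called in an Org buffer.  DURATION is the string Org recorded
after \"=>\" on the CLOCK line; running clocks have none and are
skipped."
  (org-element-map (org-element-parse-buffer) 'clock
    (lambda (clock)
      ;; Walk up from the clock line to the enclosing headline.
      (let ((headline (org-element-lineage clock '(headline)))
            (duration (org-element-property :duration clock)))
        (when (and headline duration)
          (cons (org-element-property :raw-value headline)
                duration))))))
```

Calling it inside `(with-current-buffer (find-file-noselect FILE) ...)`
for each element of `(org-agenda-files)` would collect the clocks of all
agenda files. Parsing whole buffers this way is not fast, so it does not
help with the speed problem that started the thread.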

* Re: How to make agenda generation faster
From: Adam Porter @ 2018-10-16 20:35 UTC
To: emacs-orgmode

Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:

>> my understanding is that code that runs with lexical-binding enabled
>> is generally faster.
>
> Not really. But it's certainly easier to understand since it removes one
> class of problems.

From what I've read, the byte-compiler can optimize better when
lexical-binding is used.

> Instead of re-inventing the wheel, or putting efforts into a
> wheel-like invention, wouldn't it make sense to actually work on Org
> Agenda itself?
>
> So again, wouldn't it be nice to think about Org Agenda-ng?

As a matter of fact, what's now called org-ql-agenda was originally
called org-agenda-ng.  I factored org-ql out of it and realized that it
should probably be its own, standalone package.  Then I renamed
org-agenda-ng to org-ql-agenda, so I could reasonably keep them in the
same repo, and because I don't know if I will ever develop it far enough
to be worthy of the name org-agenda-ng.  It started as an experiment to
build a foundation for a new, modular agenda implementation, and maybe
it could be.

> I didn't look closely at org-ql, but I had the idea of splitting the
> Agenda in two distinct parts. One would be responsible for collecting,
> possibly asynchronously, and caching data from Org documents. The other
> one would provide a DSL to query and display the results extracted from
> the output of the first part. The second part could even be made generic
> enough to be extracted from Org and become some part of Emacs.
> Displaying filtered data, maybe in a timeline, could be useful for other
> packages. Unfortunately, I don't have time to work on this. Ah well.

I've thought about this for a while.  It seems to me that the issue is
that Org buffers are, of course, plain-text buffers.  There is no
persistent, in-memory representation other than the buffer, so whenever
Org needs structured/semantic data, it must parse it out of the buffer,
which is necessarily rather slow.  If there were a way to keep an
outline tree in memory, parallel to the buffer itself, that would allow
operations like search, agenda, etc. to be greatly sped up.

But how would that work in Emacs?  Theoretically, we could write some
code, applied on self-insert-command, to update the "parallel tree
structure" as the user manipulates the plain-text in the buffer
(e.g. add a new node when the user types a "*" to create a new heading),
and also apply it to functions that manipulate the outline structurally
in the buffer.  But, of course, that sounds very complicated.  I would
not relish the idea of debugging code to keep a cached tree in sync with
a plain-text buffer outline.  :)

Besides that, AFAIK there would be no way to do it asynchronously other
than calling out to a child Emacs process (because elisp is still
single-threaded), printing and reading the data back and forth (which
would tie up the parent process when reading).  Maybe in the future
elisp will be multithreaded...

Anyway, org-ql tries to do some of what you mentioned.  It does
rudimentary, per-buffer, per-query caching (as long as the buffer is not
modified, the cache remains valid), which helps when there are several
Org files open that are referred to often but not as often modified.
And the query and presentation code are separated (org-ql and
org-ql-agenda).

I don't know how widely it's used, but the repo is getting some regular
traffic, and I'm using it as the backend for my org-sidebar package.
I'd be happy if it could be made more generally useful, or if it could
be helpful to Org itself in some way.  Contributions are welcome.
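
The per-buffer, per-query caching idea described above can be
illustrated with a small sketch. This is not org-ql's actual code: the
`my/` names are hypothetical, and the use of
`buffer-chars-modified-tick` (an existing Emacs function) as the
validity check is an assumption about one way to implement "valid as
long as the buffer is not modified".

```elisp
(defvar my/query-cache (make-hash-table :test #'equal)
  "Maps (BUFFER . QUERY) to (TICK . RESULT).  Hypothetical sketch.")

(defun my/cached-query (buffer query fn)
  "Return the result of calling FN in BUFFER, cached per QUERY.
QUERY is any `equal'-comparable key that names the search.  The cached
value is reused until the text of BUFFER changes."
  (with-current-buffer buffer
    (let* ((key (cons buffer query))
           (tick (buffer-chars-modified-tick))
           (hit (gethash key my/query-cache)))
      (if (and hit (= (car hit) tick))
          ;; Buffer text unchanged since the result was stored.
          (cdr hit)
        (let ((result (funcall fn)))
          (puthash key (cons tick result) my/query-cache)
          result)))))
```

A caller would wrap an expensive scan, for example
`(my/cached-query buf '(todo "TODO") #'my-scan-function)`, where
`my-scan-function` stands for whatever search is being cached. Entries
for killed buffers are never removed in this sketch, so a real
implementation would need some cleanup.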

* Re: How to make agenda generation faster
From: Ihor Radchenko @ 2018-10-17 7:04 UTC
To: Adam Porter, emacs-orgmode

> I've thought about this for a while.  It seems to me that the issue is
> that Org buffers are, of course, plain-text buffers.  There is no
> persistent, in-memory representation other than the buffer, so whenever
> Org needs structured/semantic data, it must parse it out of the buffer,
> which is necessarily rather slow.  If there were a way to keep an
> outline tree in memory, parallel to the buffer itself, that would allow
> operations like search, agenda, etc. to be greatly sped up.

FYI

A while ago I saw some cache implementation in org-element.el.
Take a look at org-element--cache variable definition and the code
below.

````
(defvar org-element--cache nil
  "AVL tree used to cache elements.
Each node of the tree contains an element.  Comparison is done
with `org-element--cache-compare'.  This cache is used in
`org-element-at-point'.")
````

Best,
Ihor

* Re: How to make agenda generation faster
From: Nicolas Goaziou @ 2018-10-17 13:01 UTC
To: Adam Porter
Cc: emacs-orgmode

Hello,

Adam Porter <adam@alphapapa.net> writes:

> From what I've read, the byte-compiler can optimize better when
> lexical-binding is used.

It can, but AFAIK, it doesn't yet. It also means un-optimized lexical
binding may be slightly slower than dynamic scoping for the time being.

> I've thought about this for a while.  It seems to me that the issue is
> that Org buffers are, of course, plain-text buffers.  There is no
> persistent, in-memory representation other than the buffer, so whenever
> Org needs structured/semantic data, it must parse it out of the buffer,
> which is necessarily rather slow.  If there were a way to keep an
> outline tree in memory, parallel to the buffer itself, that would allow
> operations like search, agenda, etc. to be greatly sped up.

I don't think that's necessary. File caching as you suggest below, can
go a long way. Filling cache during idle time, too.

> But how would that work in Emacs?  Theoretically, we could write some
> code, applied on self-insert-command, to update the "parallel tree
> structure" as the user manipulates the plain-text in the buffer
> (e.g. add a new node when the user types a "*" to create a new heading),
> and also apply it to functions that manipulate the outline structurally
> in the buffer.  But, of course, that sounds very complicated.  I would
> not relish the idea of debugging code to keep a cached tree in sync with
> a plain-text buffer outline.  :)

My over-engineering-o-meter flashes red, too.

> Anyway, org-ql tries to do some of what you mentioned.  It does
> rudimentary, per-buffer, per-query caching (as long as the buffer is not
> modified, the cache remains valid), which helps when there are several
> Org files open that are referred to often but not as often modified.

That's what I did in an agenda upgrade I tried a few months ago.
Unfortunately, caching is not compatible with the underlying logic of
current Agenda, in particular with `org-agenda-skip-function'.

> And the query and presentation code are separated (org-ql and
> org-ql-agenda).

That's a very good thing.

> I don't know how widely it's used, but the repo is getting some regular
> traffic, and I'm using it as the backend for my org-sidebar package.
> I'd be happy if it could be made more generally useful, or if it could
> be helpful to Org itself in some way.  Contributions are welcome.

That's not exactly what I'm suggesting. I suggest to move the work in
Org tree, e.g., as an org-agenda-ng.el library, and, from there,
implement back most of the features of the current agenda.

Org cannot really benefit from libraries living outside Emacs, as we
recently learnt with htmlize issue.

Regards,

--
Nicolas Goaziou
* Re: How to make agenda generation faster 2018-10-17 13:01 ` Nicolas Goaziou @ 2018-10-17 19:12 ` Adam Porter 2018-10-18 22:48 ` Nicolas Goaziou 0 siblings, 1 reply; 24+ messages in thread From: Adam Porter @ 2018-10-17 19:12 UTC (permalink / raw) To: emacs-orgmode Nicolas Goaziou <mail@nicolasgoaziou.fr> writes: > It can, but AFAIK, it doesn't yet. It also means un-optimized lexical > binding may be slightly slower than dynamic scoping for the time > being. Well, I can't vouch for it myself, because I haven't studied the code. But here's one of the resources that suggests it is faster to use lexical binding: https://emacs.stackexchange.com/questions/2129/why-is-let-faster-with-lexical-scope > That's not exactly what I'm suggesting. I suggest to move the work in > Org tree, e.g., as an org-agenda-ng.el library, and, from there, > implement back most of the features of the current agenda. > > Org cannot really benefit from libraries living outside Emacs, as we > recently learnt with htmlize issue. Org is welcome to take any of the org-ql or org-ql-agenda code you think would be useful. However, before it could be suitable as a possible replacement, it will likely require more optimization. Some queries, especially more complex ones, are slower than the equivalent searches and agendas in the current Org Agenda code. This is because of the way the queries run predicates on each heading. Despite the current Org Agenda code's complexity, it is well optimized and hard to beat. I have a proof-of-concept branch that begins to implement a relatively simple optimization that converts one suitable predicate in a query to a buffer-global regexp search. It significantly improves speed in some cases, but a query with several predicates still has to run all but one of them as predicates. 
Another possible optimization would be to convert as many predicates in a query to buffer regexp searches as possible, collecting a list of heading positions in the buffer, and then do a final pass with the appropriate union/intersection/difference operations on the lists. Then the list of positions could be used to gather the heading data. I use a similar technique in helm-org-rifle, and it seems to work quickly. It would require some work on a sort of "query compiler" to do the transformation and optimization. I don't have much experience with that kind of programming; maybe someone else would be interested in helping with that. So before taking any of the code into Org itself, you might want to consider these issues and decide whether it could be a suitable approach. Let me know what you'd like to do and how I can help. Thanks, Adam ^ permalink raw reply [flat|nested] 24+ messages in thread
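The set-based optimization described in the message above can be sketched outside Emacs. What follows is only a minimal Python illustration, not org-ql or helm-org-rifle code: each predicate becomes one regexp pass over the whole text, every hit is mapped back to its enclosing heading, and the per-predicate sets of heading positions are combined with intersection and difference. The sample outline, the helper names (`heading_positions`, `entries_matching`, `heading_line`) and the TODO/tag/SCHEDULED patterns are assumptions made up for the example.

```python
import bisect
import re

# Hypothetical sample outline; headings start with one or more "*".
ORG_TEXT = """\
* TODO Write report :work:
SCHEDULED: <2018-10-20 Sat>
* DONE File taxes :home:
* TODO Buy milk :home:
** TODO Call plumber :home:
SCHEDULED: <2018-10-21 Sun>
"""

HEADING_RE = re.compile(r"^\*+ .*$", re.MULTILINE)


def heading_positions(text):
    """Return the start offset of every heading line, in text order."""
    return [m.start() for m in HEADING_RE.finditer(text)]


def entries_matching(text, pattern, heads):
    """Return the set of heading offsets whose entry contains PATTERN.

    One regexp pass over the whole text; each hit is attributed to the
    nearest heading at or before the match position.
    """
    hits = set()
    for m in re.finditer(pattern, text, re.MULTILINE):
        idx = bisect.bisect_right(heads, m.start()) - 1
        if idx >= 0:  # ignore matches before the first heading
            hits.add(heads[idx])
    return hits


def heading_line(text, pos):
    """Return the heading line that starts at offset POS."""
    end = text.find("\n", pos)
    return text[pos:] if end == -1 else text[pos:end]


heads = heading_positions(ORG_TEXT)
todo = entries_matching(ORG_TEXT, r"^\*+ TODO ", heads)
home = entries_matching(ORG_TEXT, r":home:", heads)
scheduled = entries_matching(ORG_TEXT, r"^SCHEDULED:", heads)

# Query: TODO entries tagged :home: that are not scheduled.
positions = sorted((todo & home) - scheduled)
RESULT = [heading_line(ORG_TEXT, pos) for pos in positions]
print(RESULT)
```

The final pass only touches the headings that survive the set operations, which is the point of the approach: heading data is gathered once, at the end, for matching positions only. A real "query compiler" would still have to decide which predicates can be expressed as a regexp at all.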
* Re: How to make agenda generation faster 2018-10-17 19:12 ` Adam Porter @ 2018-10-18 22:48 ` Nicolas Goaziou 2018-10-19 0:04 ` stardiviner 2018-10-20 2:12 ` Adam Porter 0 siblings, 2 replies; 24+ messages in thread From: Nicolas Goaziou @ 2018-10-18 22:48 UTC (permalink / raw) To: Adam Porter; +Cc: emacs-orgmode Hello, Adam Porter <adam@alphapapa.net> writes: > Org is welcome to take any of the org-ql or org-ql-agenda code you think > would be useful. Thank you. > However, before it could be suitable as a possible replacement, it will > likely require more optimization. Some queries, especially more complex > ones, are slower than the equivalent searches and agendas in the current > Org Agenda code. This is because of the way the queries run predicates > on each heading. Despite the current Org Agenda code's complexity, it > is well optimized and hard to beat. Are you saying that queries are turned into regexp searches within Org files? If so, I don't think they should. Queries should only operate on the output of the data extraction, possibly a list of defstructs. I.e., you first extract all meaningful data from the document (during idle time, with cache, or whatever optimization would be chosen), store it in an appropriate format, then query it. WDYT? Regards, -- Nicolas Goaziou ^ permalink raw reply [flat|nested] 24+ messages in thread
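The extract-then-query split proposed in the message above might look roughly like the following. This is a hedged Python sketch, not Org code: a dataclass stands in for the defstruct mentioned in the message, and the `Entry` fields, the `extract` function and the simplified heading and SCHEDULED patterns are all assumptions for illustration. Extraction is still a regexp scan, but it runs once and queries then operate on plain records only.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical sample outline.
ORG_TEXT = """\
* TODO Write report :work:
SCHEDULED: <2018-10-20 Sat>
* DONE File taxes :home:
* TODO Buy milk :home:
** TODO Call plumber :home:
SCHEDULED: <2018-10-21 Sun>
"""


@dataclass
class Entry:
    """Record extracted from one heading (stand-in for a defstruct)."""
    level: int
    todo: Optional[str]
    title: str
    tags: List[str] = field(default_factory=list)
    scheduled: Optional[str] = None


# Simplified patterns: only TODO/DONE keywords and ISO dates are recognized.
HEADING_RE = re.compile(r"^(\*+) (?:(TODO|DONE) )?(.*?)(?:\s+:([\w:]+):)?\s*$")
SCHEDULED_RE = re.compile(r"^\s*SCHEDULED: <(\d{4}-\d{2}-\d{2})")


def extract(text):
    """Extraction step: scan the text once and return a list of Entry."""
    entries = []
    for line in text.splitlines():
        m = HEADING_RE.match(line)
        if m:
            stars, todo, title, tags = m.groups()
            entries.append(
                Entry(len(stars), todo, title, tags.split(":") if tags else [])
            )
            continue
        m = SCHEDULED_RE.match(line)
        if m and entries:
            # Planning line belongs to the most recent heading.
            entries[-1].scheduled = m.group(1)
    return entries


# Query step: no regexps, just filtering the extracted records.
entries = extract(ORG_TEXT)
unscheduled_home_todos = [
    e.title
    for e in entries
    if e.todo == "TODO" and "home" in e.tags and e.scheduled is None
]
print(unscheduled_home_todos)
```

Because the query never looks at the text, the list returned by `extract` can be cached or filled during idle time, and several queries can share it.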
* Re: How to make agenda generation faster 2018-10-18 22:48 ` Nicolas Goaziou @ 2018-10-19 0:04 ` stardiviner 2018-10-20 2:12 ` Adam Porter 1 sibling, 0 replies; 24+ messages in thread From: stardiviner @ 2018-10-19 0:04 UTC (permalink / raw) To: Nicolas Goaziou; +Cc: Adam Porter, emacs-orgmode >> However, before it could be suitable as a possible replacement, it will >> likely require more optimization. Some queries, especially more complex >> ones, are slower than the equivalent searches and agendas in the current >> Org Agenda code. This is because of the way the queries run predicates >> on each heading. Despite the current Org Agenda code's complexity, it >> is well optimized and hard to beat. > > Are you saying that queries are turned into regexp searches within Org > files? If so, I don't think they should. > > Queries should only operate on the output of the data extraction, > possibly a list of defstructs. I.e., you first extract all meaningful > data from the document (during idle time, with cache, or whatever > optimization would be chosen), store it in an appropriate format, then > query it. > I think the same way. Some libraries, such as Clojure's enlive, handle HTML strings the same way. -- [ stardiviner ] don't need to convince with trends. Blog: https://stardiviner.github.io/ IRC(freenode): stardiviner GPG: F09F650D7D674819892591401B5DF1C95AE89AC3 ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: How to make agenda generation faster 2018-10-18 22:48 ` Nicolas Goaziou 2018-10-19 0:04 ` stardiviner @ 2018-10-20 2:12 ` Adam Porter 2018-10-20 8:12 ` Nicolas Goaziou 1 sibling, 1 reply; 24+ messages in thread From: Adam Porter @ 2018-10-20 2:12 UTC (permalink / raw) Cc: emacs-orgmode On Oct 18, 2018 5:48 PM, "Nicolas Goaziou" <mail@nicolasgoaziou.fr> wrote: > Are you saying that queries are turned into regexp searches within Org files? If so, I don't think they should. Yes, because this is the fastest way to search for matching entries in a buffer, when it's possible to use a regexp search. > Queries should only operate on the output of the data extraction, possibly a list of defstructs. I.e., you first extract all meaningful data from the document (during idle time, with cache, or whatever optimization would be chosen), store it in an appropriate format, then query it. > > WDYT? That would be ideal. The problem I foresee is that, when a buffer's cache is not up-to-date, and the user runs an agenda query, the user will have to wait for the buffer to be parsed and cached, which is much slower than a regexp search through the buffer. That was what I first tried with org-agenda-ng: I parsed the whole buffer with org-element and ran predicates against the element tree. It was much too slow to be practical, so I switched to the current approach, which runs predicates against each node, only checking the necessary metadata. It's fast enough to be useful, but can still be slow in some cases, and I don't think it would be fast enough as a replacement for the current agenda code. But with further optimization, like using whole-buffer regexp searches when possible, it might be. Another idea I've had, similar to yours, would be to pre-process buffers, adding metadata as text-properties on heading lines. However, I haven't tested it, and I don't know what the performance would be like. 
And it would still suffer from the caching problem I mentioned. I think the fundamental problems are 1) keeping the cache in sync with the raw buffer, and 2) the slow speed of parsing an entire buffer's metadata at once (depending on the size of the files, of course, but mine are big enough to be slow, and I'm sure many users have larger ones). Of course, maybe someone cleverer than me can figure out a clever solution to these problems. :) ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: How to make agenda generation faster 2018-10-20 2:12 ` Adam Porter @ 2018-10-20 8:12 ` Nicolas Goaziou 0 siblings, 0 replies; 24+ messages in thread From: Nicolas Goaziou @ 2018-10-20 8:12 UTC (permalink / raw) To: Adam Porter; +Cc: emacs-orgmode Hello, Adam Porter <adam@alphapapa.net> writes: > Yes, because this is the fastest way to search for matching entries in a > buffer, when it's possible to use a regexp search. You would still do regexp searches, but not at the time of queries. > That would be ideal. The problem I foresee is that, when a buffer's cache > is not up-to-date, and the user runs an agenda query, the user will have to > wait for the buffer to be parsed and cached, which is much slower than a > regexp search through the buffer. No, because filling cache is still a regexp search. > That was what I first tried with org-agenda-ng: I parsed the whole buffer > with org-element and ran predicates against the element tree. Org Element is not needed, and even shouldn't be used, to retrieve most agenda related data. There are exceptions of course, mainly plain timestamps and clocks. This is where the current agenda is hard to beat, because 1. it cheats and includes timestamps without checking context, 2. it only searches for timestamps related to the day being displayed in the agenda view. The last point makes it particularly fast for single day views. > Another idea I've had, similar to yours, would be to pre-process buffers, > adding metadata as text-properties on heading lines. However, I haven't > tested it, and I don't know what the performance would be like. And it > would still suffer from the caching problem I mentioned. It is still a way to cache stuff. The difficulty here is to keep data up-to-date with changes. Storing per-node cache could be nice, nevertheless. 
> I think the fundamental problems are 1) keeping the cache in sync with the > raw buffer, Yes, whole buffer caching is simpler here: drop all cached data if buffer contents differ from the cached one. That's what I did in my last attempt to speed up agenda, comparing md5sums. It works reasonably well. I also cached per agenda data type (schedules, deadlines, clocks…) but that means you know something about the query. I think querying and searching should be separated, so it shouldn't be done. > and 2) the slow speed of parsing an entire buffer's metadata at > once (depending on the size of the files, of course, but mine are big > enough to be slow, and I'm sure many users have larger ones). I think this could be solved by fetching data preemptively during idle time. It would also work well with per-node caching, since you can interrupt fetching easily. Regards, -- Nicolas Goaziou ^ permalink raw reply [flat|nested] 24+ messages in thread
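The whole-buffer invalidation rule from the message above (drop everything cached for a buffer when its contents no longer match the cached digest) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the code from that agenda experiment: the `BufferCache` class, its `get` method and the toy `list_headings` extractor are hypothetical names, and MD5 is used only because the message mentions comparing md5sums.

```python
import hashlib


class BufferCache:
    """Whole-buffer cache: data is reused only for identical contents."""

    def __init__(self, extractor):
        self._extractor = extractor  # function: contents -> extracted data
        self._store = {}             # buffer name -> (digest, data)
        self.misses = 0              # number of (re)extractions, for inspection

    def get(self, name, contents):
        """Return extracted data for NAME, re-extracting if contents changed."""
        digest = hashlib.md5(contents.encode("utf-8")).hexdigest()
        cached = self._store.get(name)
        if cached is not None and cached[0] == digest:
            return cached[1]
        # Contents differ (or were never seen): drop the old data wholesale.
        self.misses += 1
        data = self._extractor(contents)
        self._store[name] = (digest, data)
        return data


def list_headings(contents):
    """Toy extractor: return all heading lines."""
    return [line for line in contents.splitlines() if line.startswith("*")]


cache = BufferCache(list_headings)
first = cache.get("notes.org", "* TODO Buy milk\nsome text\n")
second = cache.get("notes.org", "* TODO Buy milk\nsome text\n")   # cache hit
third = cache.get("notes.org", "* TODO Buy milk\n* DONE Taxes\n")  # invalidated
print(cache.misses)
```

Hashing the contents costs one pass over the text on every lookup, which is the price of not having to track individual edits. A per-node cache would avoid that pass but needs the finer-grained synchronization discussed in the thread.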
* Re: How to make agenda generation faster 2018-10-09 6:37 ` Adam Porter 2018-10-09 16:11 ` Nicolas Goaziou @ 2018-10-10 19:59 ` Marcin Borkowski 1 sibling, 0 replies; 24+ messages in thread From: Marcin Borkowski @ 2018-10-10 19:59 UTC (permalink / raw) To: Adam Porter; +Cc: emacs-orgmode On 2018-10-09, at 08:37, Adam Porter <adam@alphapapa.net> wrote: > Hi Marcin, > > [...] > > If you haven't seen them already, you may find my org-ql and > org-ql-agenda code useful. org-ql-agenda presents an Agenda-like > buffer. N.B. It does *not* implement most of the Agenda features, but > it does emulate an Org Agenda buffer by setting the appropriate text > properties on entries and formatting them in a similar way. > > It's built on org-ql, which provides per-buffer query caching, which > means that generating an org-ql-agenda view for Org buffers that haven't > changed since the last view was generated is very fast. It's also > written in a more functional way, which I think is easier to follow and > modify. Performance of uncached queries/buffers depends on the > query--some are relatively fast, while others are slower than the "real" > Org Agenda. I think there is significant potential for optimizations, > and I'm hoping to implement some in the future. Your feedback would be > appreciated! > > https://github.com/alphapapa/org-ql Thanks, I'll check those out! Best, -- Marcin Borkowski http://mbork.pl ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: How to make agenda generation faster 2018-10-07 4:53 How to make agenda generation faster Marcin Borkowski 2018-10-08 7:20 ` Michael Welle 2018-10-09 6:37 ` Adam Porter @ 2018-10-09 11:47 ` Julius Dittmar 2018-10-10 20:03 ` Marcin Borkowski 2 siblings, 1 reply; 24+ messages in thread From: Julius Dittmar @ 2018-10-09 11:47 UTC (permalink / raw) To: emacs-orgmode Hi Marcin, I can't advise as to profiling to find out what really bogs down agenda building. I found that log messages do bog it down. I have a lot of recurring tasks, which accumulate log entries for every closing (which in fact means rescheduling to the next day). Every two to three months I prune my org files of those log entries. This significantly speeds up agenda building. HTH, Julius ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: How to make agenda generation faster 2018-10-09 11:47 ` Julius Dittmar @ 2018-10-10 20:03 ` Marcin Borkowski 2018-10-11 6:40 ` Michael Welle 0 siblings, 1 reply; 24+ messages in thread From: Marcin Borkowski @ 2018-10-10 20:03 UTC (permalink / raw) To: Julius Dittmar; +Cc: emacs-orgmode On 2018-10-09, at 13:47, Julius Dittmar <Julius.Dittmar@gmx.de> wrote: > Hi Marcin, > > I can't advise as to profiling to find out what really bogs down agenda > building. > > I found that log messages do bog it down. > > I have a lot of recurring tasks, which accumulate log entries for every > closing (which in fact means rescheduling to the next day). Every two to > three months I prune my org files of those log entries. This > significantly speeds up agenda building. By experiments, I found that the main bottleneck was a file with lots (= a few thousand) headlines. Best, -- Marcin Borkowski http://mbork.pl ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: How to make agenda generation faster 2018-10-10 20:03 ` Marcin Borkowski @ 2018-10-11 6:40 ` Michael Welle 2018-10-14 7:42 ` Marcin Borkowski 0 siblings, 1 reply; 24+ messages in thread From: Michael Welle @ 2018-10-11 6:40 UTC (permalink / raw) To: emacs-orgmode Hello, Marcin Borkowski <mbork@mbork.pl> writes: > On 2018-10-09, at 13:47, Julius Dittmar <Julius.Dittmar@gmx.de> wrote: > >> Hi Marcin, >> >> I can't advise as to profiling to find out what really bogs down agenda >> building. >> >> I found that log messages do bog it down. >> >> I have a lot of recurring tasks, which accumulate log entries for every >> closing (which in fact means rescheduling to the next day). Every two to >> three months I prune my org files of those log entries. This >> significantly speeds up agenda building. > > By experiments, I found that the main bottleneck was a file with lots (= > a few thousand) headlines. ah, interesting. My org files usually aren't that deeply structured, so I don't get hit by that. Hm, I guess regexps are used to find headlines? Regards hmw ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: How to make agenda generation faster 2018-10-11 6:40 ` Michael Welle @ 2018-10-14 7:42 ` Marcin Borkowski 0 siblings, 0 replies; 24+ messages in thread From: Marcin Borkowski @ 2018-10-14 7:42 UTC (permalink / raw) To: Michael Welle; +Cc: emacs-orgmode On 2018-10-11, at 08:40, Michael Welle <mwe012008@gmx.net> wrote: > Hello, > > Marcin Borkowski <mbork@mbork.pl> writes: > >> On 2018-10-09, at 13:47, Julius Dittmar <Julius.Dittmar@gmx.de> wrote: >> >>> Hi Marcin, >>> >>> I can't advise as to profiling to find out what really bogs down agenda >>> building. >>> >>> I found that log messages do bog it down. >>> >>> I have a lot of recurring tasks, which accumulate log entries for every >>> closing (which in fact means rescheduling to the next day). Every two to >>> three months I prune my org files of those log entries. This >>> significantly speeds up agenda building. >> >> By experiments, I found that the main bottleneck was a file with lots (= >> a few thousand) headlines. > ah, interesting. My org files usually aren't that deeply structured, so > I don't get hit by that. Hm, I guess regexps are used to find headlines? Mine were very flat - I had *lots* of captured links to websites. Best, -- Marcin Borkowski http://mbork.pl ^ permalink raw reply [flat|nested] 24+ messages in thread