From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nicolas Goaziou
Subject: Re: org-element checks make flyspell prohibitively slow
Date: Mon, 17 Mar 2014 22:22:26 +0100
Message-ID: <87siqgcygd.fsf@gmail.com>
References: <87zjkolwum.fsf@fastmail.fm>
In-Reply-To: <87zjkolwum.fsf@fastmail.fm> (Matt Lundin's message of "Mon, 17 Mar 2014 09:32:17 -0500")
List-Id: "General discussions about Org-mode."
To: Matt Lundin
Cc: Emacs-orgmode list

Hello,

Matt Lundin writes:

> The rewrite of org-mode-flyspell-verify in commit
> 4a27c2b4b67201e0b23f431bdaeb6460b31e1394 (Nov 21, 2013) makes navigating
> org-mode files with large chunks of text very slow.

[...]

> => Org-mode version 8.2.5h (release_8.2.5h-757-gc444e4 @
> /home/matt/org-mode/lisp/)

Could you update and try again? The parser's cache was inadvertently
disabled; I re-enabled it.

> I open a test.org file containing the following.
>
> * A headline
> * Arch packages
> * Another headline
>
> After opening a line under "Arch Packages" I call...
>
>   C-u M-! pacman -Ss [RET]
>
> (Of course, this only works with archlinux.) This inserts a long list of
> packages that look like this:
>
> core/acl 2.2.52-2 [installed]
>     Access control list utilities, libraries and headers
> core/archlinux-keyring 20140220-1 [installed]
>     Arch Linux PGP keyring
> core/attr 2.4.47-1 [installed]
>     Extended attribute support library for ACL support
> core/autoconf 2.69-1 (base-devel) [installed]
>     A GNU tool for automatically configuring source code
> core/automake 1.14.1-1 (base-devel) [installed]
>     A GNU tool for automatically creating Makefiles
>
> All in all, it's 12680 lines....

Note that it is a contrived example: the whole buffer is a single
paragraph containing around 150 objects. The current algorithm for
`org-element-context' is clearly not up to such a density of objects
per paragraph.

Also, the cache cannot help here: each time you edit a paragraph, all
objects within it are removed from the cache and have to be parsed
again. AFAIK, there is no way to know whether the edit altered a
previously parsed object, so, as a safety measure, all of them are
wiped out.

Therefore, navigation should be fast, but editing (with flyspell-mode
enabled) is going to be slow.

> But this works (more or less) with other very large chunks of text.
> E.g.,
>
>   C-u M-! w3m -dump http://www.gnu.org/software/emacs/manual/html_mono/emacs.html

This one should be reasonably fast in both cases.

> Is it possible to speed up org-element-context here?

Certainly. `org-element-context' is the least optimized part of the
parser code. There is room for improvement.

> For something called as often as org-mode-flyspell-verify, do we need
> all the overhead of the org-element parser?

Of course.

> Or would a hack optimized for speed (which is what the older version
> of org-mode-flyspell-verify represented) be enough?

IMO, the old version of this function was annoying as soon as you
switched to a non-English language. YMMV.

> I recall (though my memory may be faulty) discussions on the list
> quite some time back in which we decided to prioritize
> speed/efficiency over thoroughness/completeness in the checks run by
> org-mode-flyspell-verify.

Why prioritize when we can have both?

I agree that `org-mode-flyspell-verify' is not fast enough at the
moment, but it is quite usable anyway. Also, as a very demanding
function, it is a good benchmark for the parser.

To improve the current state, reports (like those in your message)
help a lot. You can also help improve the algorithms.

Thank you.


Regards,

-- 
Nicolas Goaziou