From: Eric Schulte
Subject: Re: Discussion request: 15m tangle time, details follow
Date: Wed, 18 Jun 2014 16:59:16 -0400
Message-ID: <87wqce0w9n.fsf@gmail.com>
In-Reply-To: <87ppi76irx.fsf@gmail.com> (Aaron Ecay's message of "Tue, 17 Jun 2014 22:41:54 -0400")
To: Grant Rettke
Cc: "emacs-orgmode@gnu.org"

Aaron Ecay writes:

> Hi Grant,
>
> On 17 June 2014, Grant Rettke wrote:
>>
>> Good evening,
>>
>> Over the past few months I've been working on the same literate
>> document. It has been a learning experience for me; trial and error
>> has abounded. The key tenet that I've adhered to, though, is to truly
>> embrace literate programming, and the more I learn the more it makes
>> sense. The document has grown quite organically and it has been and
>> continues to be a wonderful experience.
>> What I need help, feedback, discussion, and more on is the build time.
>>
>> The average build takes 15m.
>
> Here you mean time to tangle, correct?  (As opposed to exporting to
> HTML/LaTeX/etc.)
>
> I can confirm very long times to tangle a document with a structure like
> yours.  I ran the emacs profiler while tangling the document for 30 secs,
> then interrupted with C-g and generated a report.  That is attached.
>
> I did two non-standard things to this profile.  The first was:
>
>   (setq profiler-report-cpu-line-format
>         '((100 left)
>           ;; The 100 above is increased from the default of 50
>           ;; to allow the deeply nested call tree to be seen
>           (24 right ((19 right)
>                      (5 right)))))
>
> The second was to convert an anonymous lambda found in
> org-babel-params-from-properties into a named function, so that it would
> show up in the profiling results on its own line:
>
>   (defun org-babel-params-from-properties-inner1 (header-arg)
>     (let (val)
>       (and (setq val (org-entry-get (point) header-arg t))
>            (cons (intern (concat ":" header-arg))
>                  (org-babel-read val)))))
>
> The profile shows that most of the slowdown is in org-entry-get.  Indeed,
> org-babel-params-from-properties calls this function ~30 times per source
> block.  When called with the inherit arg set to t (as here), this function
> takes time (at least) proportional to the number of headings dominating
> the source block, which in your document can be up to 5.
>

Thanks for taking the time to profile this.  It's nice to have more
evidence that the use of properties is definitely the culprit here.

>
> I think there are two problems here.  The first is the situation where
> babel needs to fetch 30 properties per source block.  Indeed, this is
> marked “deprecated” in the source, in favor of a system where there is
> only one header arg.
> This has been marked deprecated for almost exactly
> a year in the code (Achim’s commit 90b16870 of 2013-06-23), but I don’t
> know of any prominent announcement of the deprecation.  So I doubt the
> old slow code could be removed without breaking many people’s setups,
> although possibly a customization variable could be introduced to allow
> users to opt in to the new, faster system.  You’d then have to update
> your file:
>
>   :PROPERTIES:
>   :exports: none
>   :tangle: no
>   :END:
>
> becomes
>
>   :PROPERTIES:
>   :header-args: :exports none :tangle no
>   :END:
>
> The new system is also a bit inferior, in that it doesn’t allow header
> arg inheritance as easily.  So with the one-prop-per-arg system the
> following works as expected:
>
>   * foo
>     :PROPERTIES:
>     :exports: none
>     :END:
>   ** bar
>      :PROPERTIES:
>      :tangle: no
>      :END:
>
>   (src block here)
>
> On the other hand, in the new system there’s no way to specify some
> header args at foo and some at bar; the lowest header-args property
> wins.  (At least as far as I can see)
>

As I recall this inheritance issue is the wall that we ran up against.
The deprecation comment in the code was premature.

>
> The second issue is that it’s desirable to memoize calls to
> org-entry-get.  Probably the easiest way to do this is to use the
> org-element cache.  Indeed, a quick and hacky test that I did seemed to
> confirm that this yields some speedup.  There are conceptual issues
> though – org-element forces all property keys to be uppercase, whereas
> org-entry-get (as near as I can tell...) follows the user’s
> customization of case-fold-search to determine its case sensitivity.  So
> one has to think carefully about how a rewrite to use org-element might
> affect the case-sensitivity of the property API (although code relying
> on the API to be sensitive to case of property keys might be rare in
> practice).
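
The memoization idea in the paragraph above could be sketched in Emacs
Lisp roughly as follows.  This is only an illustration under stated
assumptions, not code from Org and not the org-element approach Aaron
tested: the names my/org-entry-get-memo, my/org-entry-get-memoized and
my/org-babel-tangle-memoized are hypothetical, advice-add is assumed to
be available (Emacs 24.4 or later), and the buffer is assumed not to
change while tangling runs.

```elisp
(require 'org)

(defvar my/org-entry-get-memo nil
  "Hypothetical cache.  When bound to a hash table, `org-entry-get'
results are stored in it; when nil, lookups are not cached.")

(defun my/org-entry-get-memoized (orig pom property &optional inherit literal-nil)
  "Around-advice for `org-entry-get' that reuses earlier results.
Results are keyed by buffer, position of the enclosing heading, and
the arguments.  Caveat: on a cache hit the original function is not
called, so side effects such as `org-entry-property-inherited-from'
are not refreshed."
  (if (not (hash-table-p my/org-entry-get-memo))
      ;; No cache bound: behave exactly like the original function.
      (funcall orig pom property inherit literal-nil)
    (org-with-point-at pom
      (let* ((heading (if (org-before-first-heading-p)
                          (point-min)
                        (save-excursion (org-back-to-heading t) (point))))
             (key (list (current-buffer) heading property inherit literal-nil))
             ;; `miss' is a sentinel so that a cached nil still counts as a hit.
             (hit (gethash key my/org-entry-get-memo 'miss)))
        (if (eq hit 'miss)
            (puthash key (funcall orig (point) property inherit literal-nil)
                     my/org-entry-get-memo)
          hit)))))

(advice-add 'org-entry-get :around #'my/org-entry-get-memoized)

(defun my/org-babel-tangle-memoized ()
  "Run `org-babel-tangle' with a cache that lives only for this call."
  (interactive)
  (let ((my/org-entry-get-memo (make-hash-table :test #'equal)))
    (org-babel-tangle)))
```

Because the key is the enclosing heading, a sketch like this only saves
work when the same property is requested more than once under one
heading (several source blocks in one entry, or repeated lookups for
the same block).  It does not shorten the walk up the outline that an
inherited lookup needs the first time, and it sidesteps rather than
answers the case-sensitivity question raised above.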
>

Thanks, it does sound like the org-element cache could be useful here; I
don't believe it existed the last time we wrestled with this performance
issue.  The only other options I can think of are:

- introduce a customization variable to eliminate or limit the use of
  property lookup for code blocks, so that it either performs no lookup
  or searches only for a limited set of properties

- possibly extend org-element-get (or provide an alternative) which
  takes multiple keys (which may be more efficient depending on the
  implementation)

>
> TL;DR:
>
> 1. I see the same slowness you report
> 2. It seems like an architectural issue rather than one of
>    (mis)configuration
> 3. There are broad fixes available, but they require potentially
>    compatibility-breaking changes to Org
> 4. (maybe with this analysis someone can come up with a more targeted
>    fix for your use case)
>
> Hope this is helpful,

Very helpful, thanks for providing both empirical data and useful
analysis.

Best,
Eric

-- 
Eric Schulte
https://cs.unm.edu/~eschulte
PGP: 0x614CA05D (see https://u.fsf.org/yw)