emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: Eric Schulte <schulte.eric@gmail.com>
To: Grant Rettke <gcr@wisdomandwonder.com>
Cc: "emacs-orgmode@gnu.org" <emacs-orgmode@gnu.org>
Subject: Re: Discussion request: 15m tangle time, details follow
Date: Wed, 18 Jun 2014 16:59:16 -0400	[thread overview]
Message-ID: <87wqce0w9n.fsf@gmail.com> (raw)
In-Reply-To: <87ppi76irx.fsf@gmail.com> (Aaron Ecay's message of "Tue, 17 Jun 2014 22:41:54 -0400")

Aaron Ecay <aaronecay@gmail.com> writes:

> Hi Grant,
>
> 2014ko ekainak 17an, Grant Rettke-ek idatzi zuen:
>> 
>> Good evening,
>> 
>> Over the past few months I've been working on the same literate
>> document. It has been a learning
>> experience for me, trial and error has abounded. The key tenet that
>> I've adhered too though is to truly
>> embrace literate programming, and the more I learn the more it makes
>> sense. The document has
>> grown quite organically and it has been and continues to be a
>> wonderful experience. What I need
>> help, feedback, discussion, and more on is the build time.
>> 
>> The average build takes 15m. 
>
> Here you mean time to tangle, correct?  (As opposed to exporting to
> HTML/LaTeX/etc.)
>
> I can confirm very long times to tangle a document with a structure like
> yours.  I ran the emacs profiler
> <https://www.gnu.org/software/emacs/manual/html_node/elisp/Profiling.html>
> while tangling the document for 30 secs, then interrupted with C-g and
> generated a report.  That is attached.
>
>
>
> I did two non-standard things to this profile.  The first was:
>
> (setq profiler-report-cpu-line-format
>   '((100 left)
>     ;; The 100 above is increased from the default of 50
>     ;; to allow the deeply nested call tree to be seen
>     (24 right ((19 right)
> 	       (5 right)))))
>
> The second was to convert an anonymous lambda found in
> org-babel-params-from-properties into a named function, so that it would
> show up in the profiling results on its own line:
>
> (defun org-babel-params-from-properties-inner1 (header-arg)
>   (let (val)
>     (and (setq val (org-entry-get (point) header-arg t))
> 	 (cons (intern (concat ":" header-arg))
> 	       (org-babel-read val)))))
>
> The profile shows that most of the slowdown is in org-entry-get.  Indeed,
> org-babel-params-from-properties calls this function ~30 times per source
> block.  When called with the inherit arg set to t (as here), this function
> takes time (at least) proportional to the number of headings dominating
> the source block, which in your document can be up to 5.
>

Thanks for taking the time to profile this.  It's nice to have more
evidence that the use of properties is definitely the culprit here.

>
> I think there are two problems here.  The first is the situation where
> babel needs to fetch 30 properties per source block.  Indeed, this is
> marked “deprecated” in the source, in favor of a system where there is
> only one header arg.  This has been marked deprecated for almost exactly
> a year in the code (Achim’s commit 90b16870 of 2013-06-23), but I don’t
> know of any prominent announcement of the deprecation.  So I doubt the
> old slow code could be removed without breaking many people’s setups,
> although possibly a customization variable could be introduced to allow
> users to opt in to the new, faster system.  You’d then have to update
> your file:
>
>   :PROPERTIES:
>   :exports: none
>   :tangle: no
>   :END:
>
> becomes
>
>   :PROPERTIES:
>   :header-args: :exports none :tangle no
>   :END:
>
> The new system is also a bit inferior, in that it doesn’t allow header
> arg inheritance as easily.  So with the one-prop-per-arg system the
> following works as expected:
>
>   * foo
>     :PROPERTIES:
>     :exports: none
>     :END:
>   ** bar
>      :PROPERTIES:
>      :tangle: no
>      :END:
>
>   (src block here)
>
> On the other hand, in the new system there’s no way to specify some
> header args at foo and some at bar; the lowest header-args property
> wins.  (At least as far as I can see)
>

As I recall this inheritance issue is the wall that we ran up against.
The deprecation comment in the code was premature.

>
> The second issue is that it’s desirable to memoize calls to
> org-entry-get.  Probably the easiest way to do this is to use the
> org-element cache.  Indeed, a quick and hacky test that I did seemed to
> confirm that this yields some speedup.  There are conceptual issues
> though – org-element forces all property keys to be uppercase, whereas
> org-entry-get (as near as I can tell...) follows the user’s
> customization of case-fold-search to determine its case sensitivity.  So
> one has to think carefully about how a rewrite to use org-element might
> affect the case-sensitivity of the property API (although code relying
> on the API to be sensitive to case of property keys might be rare in
> practice).
>

Thanks, it does sound like org-element cache could be useful here, I
don't believe this existed last time we wrestled with this performance
issue.

The only other options I can think of are;

- introduce a customization variable to eliminate or limit the use of
  property lookup for code blocks to either perform none or to only
  search for a limited set of properties

- possibly extend org-element-get (or provide an alternative) which
  takes multiple keys (which may be more efficient depending on the
  implementation)

>
> TL;DR:
>
> 1. I see the same slowness you report
> 2. It seems like an architectural issue rather than one of
>    (mis)configuration
> 3. There are broad fixes available, but they require potentially
>    compatibility-breaking changes to Org
> 4. (maybe with this analysis someone can come up with a more targeted
>    fix for your use case)
>
> Hope this is helpful,

Very helpful, thanks for providing both empirical data and useful
analysis.

Best,
Eric

-- 
Eric Schulte
https://cs.unm.edu/~eschulte
PGP: 0x614CA05D (see https://u.fsf.org/yw)

  parent reply	other threads:[~2014-06-18 21:00 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-18  0:16 Discussion request: 15m tangle time, details follow Grant Rettke
2014-06-18  2:41 ` Aaron Ecay
2014-06-18  8:13   ` Sebastien Vauban
2014-06-18 17:47     ` Aaron Ecay
2014-06-18 20:59   ` Eric Schulte [this message]
2014-06-19  1:05     ` Grant Rettke
2014-06-19 13:44       ` Eric Schulte
2014-06-20  0:01         ` Grant Rettke
2014-06-20  1:26           ` Grant Rettke
2014-06-20 14:50             ` Grant Rettke
2014-06-19  0:56   ` Grant Rettke
2014-06-18  2:54 ` Nick Dokos
2014-06-18  8:18 ` Thorsten Jolitz
2014-06-18  9:34   ` Thorsten Jolitz
2014-06-19  1:00   ` Grant Rettke
2014-06-18  8:20 ` Andreas Leha
2014-06-19  0:58   ` Grant Rettke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wqce0w9n.fsf@gmail.com \
    --to=schulte.eric@gmail.com \
    --cc=emacs-orgmode@gnu.org \
    --cc=gcr@wisdomandwonder.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).