From: Thierry Banel <tbanelwebmin@free.fr>
To: Nicolas Goaziou <mail@nicolasgoaziou.fr>
Cc: emacs-orgmode@gnu.org
Subject: Re: speeding up Babel Gnuplot
Date: Wed, 04 Jan 2017 00:06:40 +0100
Message-ID: <586C2E80.4050805@free.fr>
In-Reply-To: <87wpebbygr.fsf@nicolasgoaziou.fr>

On 03/01/2017 22:55, Nicolas Goaziou wrote:
> Hello,
>
> Thierry Banel <tbanelwebmin@free.fr> writes:
>
>> Here is a patch to avoid generating temporary files multiple times.
>>
>> There is no way to ensure a single call to
>> (org-babel-gnuplot-process-vars) without modifying ob-core.el. I don't
>> want to do that because I would have to change a lot of babel backends.
>> Thus, I come back to my first light patch.
>>
>> A 'param' list is passed around. It reflects the #+BEGIN_SRC header. My
>> patch changes it in-place from:
>>   (((:var data (3000) (2999) (2998) (2997) ...
>> to:
>>   (((:var data . "/tmp/babel-16991kSr/gnuplot-16991YBq") ...
>>
>> The 'param' list behaves as a cache. There is nothing wrong with that.
>> The worst thing that can happen is the caching no longer working in case
>> 'param' would be copied some day. Results would stay correct.
> Thank you.
>
> What is the benefit of this patch? I mean,
> `org-babel-gnuplot-process-vars' is already quite fast here. Do you have
> some benchmark for that?
The benefit is that Babel Gnuplot runs twice as fast on large Org tables
(thousands of rows). On small tables there is no real benefit. Without
the patch, two temporary files are left over in /tmp. They have identical
content: data suitably formatted for Gnuplot. Creating such large temp
files outweighs any other Babel processing.

Granted, you have already sped up `org-babel-gnuplot-process-vars'
quite a lot by reworking `org-export-table-row-number'. Still, going down
from 4 seconds to 2 seconds on a 5000-row table (on my computer) is
quite pleasant.

My patch is very light: `org-babel-gnuplot-process-vars' is the only
modified function, and the change only adds a (setcdr) call to cache
the result of an expensive computation.
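
As a rough illustration of the in-place caching idea, here is a
simplified sketch. It is not the actual patch: `my/dump-table',
`my/process-vars' and `my/dump-count' are hypothetical names, and the
real function formats the data for Gnuplot in more detail.

```emacs-lisp
;; Hypothetical sketch, not the actual ob-gnuplot.el code.
(require 'cl-lib)

(defvar my/dump-count 0
  "Number of times a table was written to disk (for illustration only).")

(defun my/dump-table (table)
  "Write TABLE (a list of rows) to a fresh temp file; return its name."
  (cl-incf my/dump-count)
  (let ((file (make-temp-file "gnuplot-")))
    (with-temp-file file
      (dolist (row table)
        (insert (mapconcat (lambda (cell) (format "%s" cell)) row "\t")
                "\n")))
    file))

(defun my/process-vars (vars)
  "Return an alist of (NAME . VALUE-OR-FILE) built from the alist VARS.
A list value is dumped to a file once.  The file name then replaces
the list inside VARS itself, so a later call finds a string and
skips the dump."
  (mapcar (lambda (pair)
            (when (consp (cdr pair))
              ;; Cache in place: overwrite the table with the file name.
              (setcdr pair (my/dump-table (cdr pair))))
            (cons (car pair) (cdr pair)))
          vars))
```

Calling `my/process-vars' twice on the same alist writes the table
only once. If the alist were copied between the two calls, the second
call would simply dump the table again, so results stay correct.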


>>  	(car pair) ;; variable name
>> -	(let* ((val (cdr pair)) ;; variable value
>> -	       (lp  (listp val)))
>> -	  (if lp
>> +	(let ((val (cdr pair))) ;; variable value
>> +	  (if (not (listp val))
>> +	      val
>> +	    (let ((temp-file (org-babel-temp-file "gnuplot-"))
>> +		  (first  (car val)))
>> +	      (setcdr pair temp-file) ;; <------ caching here
> It would be nice to expunge the comment a bit.

Yes, sure. If the patch is accepted, I'll clean it up.

> Another option would be to generate a file according to the hash of
> contents so `org-babel-gnuplot-process-vars' knows when to create a new
> file.
Your proposal provides an additional benefit: caching file generation
between several invocations of Babel. (The cache in my patch is intended
to be used within a single Babel invocation, and is then garbage
collected.) The drawback is that we need to go through all rows of the
table and compute the hash, just to discover that the hash was already
known. The purpose of the cache was precisely to avoid going through the
table again.
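
To make the trade-off concrete, here is a minimal sketch of what a
hash-named file could look like. It is only an assumption about how
the proposal might be implemented: `my/table-file-by-hash' is a
hypothetical name, and the choice of SHA-1 and of the file name
layout is arbitrary.

```emacs-lisp
;; Hypothetical sketch of content-addressed temp files.
(defun my/table-file-by-hash (table)
  "Return a file holding TABLE, named after a hash of its contents.
The file is written only when no file with that name exists yet."
  (let* ((text (mapconcat
                (lambda (row)
                  (mapconcat (lambda (cell) (format "%s" cell)) row "\t"))
                table "\n"))
         ;; Building TEXT and hashing it still visits every row,
         ;; even when the file turns out to exist already.
         (hash (secure-hash 'sha1 text))
         (file (expand-file-name (concat "gnuplot-" hash)
                                 temporary-file-directory)))
    (unless (file-exists-p file)
      (with-temp-file file
        (insert text "\n")))
    file))
```

A second call with an equal table returns the same file name without
rewriting the file, but only after formatting and hashing the whole
table again.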

Your proposal involves substantial work. We may also want to extend it
to all other Babel backends (R, shell, C, etc.). I may help if enough
users need it.

Regards
Thierry
