* [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
@ 2013-03-06 4:07 Aaron Ecay
2013-03-08 21:25 ` Aaron Ecay
2013-03-08 21:53 ` Achim Gratz
0 siblings, 2 replies; 27+ messages in thread
From: Aaron Ecay @ 2013-03-06 4:07 UTC (permalink / raw)
To: emacs-orgmode
In order for the cache feature to work, the hash of a finished
computation must be inserted. But, this is not currently done for src
blocks which have the option :results none. Thus, we should insert a
dummy empty result for these blocks, which will hold the hash.
---
lisp/ob-core.el | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lisp/ob-core.el b/lisp/ob-core.el
index 3b7c463..eabfc05 100644
--- a/lisp/ob-core.el
+++ b/lisp/ob-core.el
@@ -576,7 +576,10 @@ block."
(if (member "none" result-params)
(progn
(funcall cmd body params)
- (message "result silenced"))
+ (message "result silenced")
+ (when cachep
+ (org-babel-insert-result
+ "" result-params info new-hash indent lang)))
(setq result
((lambda (result)
(if (and (eq (cdr (assoc :result-type params)) 'value)
--
1.8.1.5
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-06 4:07 [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results Aaron Ecay
@ 2013-03-08 21:25 ` Aaron Ecay
2013-03-08 22:07 ` Eric Schulte
2013-03-08 21:53 ` Achim Gratz
1 sibling, 1 reply; 27+ messages in thread
From: Aaron Ecay @ 2013-03-08 21:25 UTC (permalink / raw)
To: emacs-orgmode
On Tue, Mar 5, 2013 at 11:07 PM, Aaron Ecay <aaronecay@gmail.com> wrote:
> In order for the cache feature to work, the hash of a finished
> computation must be inserted. But, this is not currently done for src
> blocks which have the option :results none. Thus, we should insert a
> dummy empty result for these blocks, which will hold the hash.
> ---
> lisp/ob-core.el | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/lisp/ob-core.el b/lisp/ob-core.el
> index 3b7c463..eabfc05 100644
> --- a/lisp/ob-core.el
> +++ b/lisp/ob-core.el
> @@ -576,7 +576,10 @@ block."
> (if (member "none" result-params)
> (progn
> (funcall cmd body params)
> - (message "result silenced"))
> + (message "result silenced")
> + (when cachep
The above should be cache-p (with hyphen).
> + (org-babel-insert-result
> + "" result-params info new-hash indent lang)))
> (setq result
> ((lambda (result)
> (if (and (eq (cdr (assoc :result-type params)) 'value)
> --
> 1.8.1.5
>
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-06 4:07 [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results Aaron Ecay
2013-03-08 21:25 ` Aaron Ecay
@ 2013-03-08 21:53 ` Achim Gratz
2013-03-08 22:09 ` Eric Schulte
1 sibling, 1 reply; 27+ messages in thread
From: Achim Gratz @ 2013-03-08 21:53 UTC (permalink / raw)
To: emacs-orgmode
Aaron Ecay writes:
> In order for the cache feature to work, the hash of a finished
> computation must be inserted. But, this is not currently done for src
> blocks which have the option :results none. Thus, we should insert a
> dummy empty result for these blocks, which will hold the hash.
Getting a results block when specifying ":results none" feels a bit
strange. Since it is not the results that are hashed, but the effective
parameters of the invocation, wouldn't it make more sense to store the
parameter hash with the source block or call rather than the result?
That would free up the current place to hash the actual result to check
if the results have been tampered with.
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-08 21:25 ` Aaron Ecay
@ 2013-03-08 22:07 ` Eric Schulte
0 siblings, 0 replies; 27+ messages in thread
From: Eric Schulte @ 2013-03-08 22:07 UTC (permalink / raw)
To: Aaron Ecay; +Cc: emacs-orgmode
Aaron Ecay <aaronecay@gmail.com> writes:
> On Tue, Mar 5, 2013 at 11:07 PM, Aaron Ecay <aaronecay@gmail.com> wrote:
>> In order for the cache feature to work, the hash of a finished
>> computation must be inserted. But, this is not currently done for src
>> blocks which have the option :results none. Thus, we should insert a
>> dummy empty result for these blocks, which will hold the hash.
>> ---
>> lisp/ob-core.el | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/lisp/ob-core.el b/lisp/ob-core.el
>> index 3b7c463..eabfc05 100644
>> --- a/lisp/ob-core.el
>> +++ b/lisp/ob-core.el
>> @@ -576,7 +576,10 @@ block."
>> (if (member "none" result-params)
>> (progn
>> (funcall cmd body params)
>> - (message "result silenced"))
>> + (message "result silenced")
>> + (when cachep
>
> The above should be cache-p (with hyphen).
>
The hyphen should only be required for multi-word functions, e.g.,
`listp' has no hyphen but `hash-table-p' does have a hyphen.
>
>> + (org-babel-insert-result
>> + "" result-params info new-hash indent lang)))
>> (setq result
>> ((lambda (result)
>> (if (and (eq (cdr (assoc :result-type params)) 'value)
>> --
>> 1.8.1.5
>>
>
--
Eric Schulte
http://cs.unm.edu/~eschulte
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-08 21:53 ` Achim Gratz
@ 2013-03-08 22:09 ` Eric Schulte
2013-03-08 22:24 ` aaronecay
2013-03-09 0:57 ` Achim Gratz
0 siblings, 2 replies; 27+ messages in thread
From: Eric Schulte @ 2013-03-08 22:09 UTC (permalink / raw)
To: Achim Gratz; +Cc: emacs-orgmode
Achim Gratz <Stromeko@nexgo.de> writes:
> Aaron Ecay writes:
>> In order for the cache feature to work, the hash of a finished
>> computation must be inserted. But, this is not currently done for src
>> blocks which have the option :results none. Thus, we should insert a
>> dummy empty result for these blocks, which will hold the hash.
>
> Getting a results block when specifying ":results none" feels a bit
> strange.
I would agree. I don't believe *any* changes should take place in the
buffer when a code block is executed with ":results none".
> Since it is not the results that are hashed, but the effective
> parameters of the invocation, wouldn't it make more sense to store the
> parameter hash with the source block or call rather than the result?
> That would free up the current place to hash the actual result to
> check if the results have been tampered with.
>
I prefer leaving the hash with the results, as it is the results which
are "hashed". Also, same input does not always guarantee same output,
e.g.,
#+begin_src sh
date
#+end_src
>
>
> Regards,
> Achim.
--
Eric Schulte
http://cs.unm.edu/~eschulte
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-08 22:09 ` Eric Schulte
@ 2013-03-08 22:24 ` aaronecay
2013-03-09 17:45 ` Eric Schulte
2013-03-09 0:57 ` Achim Gratz
1 sibling, 1 reply; 27+ messages in thread
From: aaronecay @ 2013-03-08 22:24 UTC (permalink / raw)
To: Eric Schulte; +Cc: Achim Gratz, emacs-orgmode
On 8 March 2013, Eric Schulte wrote:
>
> I would agree. I don't believe *any* changes should take place in the
> buffer when a code block is executed with ":results none".
A common use case for me is to use a babel block to load a large dataset
into R. I want this to be cached, in the sense that I want it not to be
run again (by e.g. C-c C-v C-b) unless the code changes. But I also
don’t want to see its result in the (mini)buffer. Is there a way to
accommodate this usage of the cache functionality?
> I prefer leaving the hash with the results, as it is the results which
> are "hashed". Also, same input does not always guarantee same output,
> e.g.,
>
> #+begin_src sh
> date
> #+end_src
In this case, the code block shouldn’t be marked :cache. Unless the
desired (and odd, IMO) behavior is to have a datestamp that is only
updated when the user forcibly re-evaluates the block (with C-u C-c
C-c).
Also, with regard to:
> The hyphen should only be required for multi-word functions, e.g.,
> `listp' has no hyphen but `hash-table-p' does have a hyphen.
The context surrounding this code binds cache-p; the lack of a hyphen
was just a typo in the patch. I agree that cachep is more idiomatic (in
fact, that is what led to the typo), but I tried to make the smallest
possible patch to address my intention.
--
Aaron Ecay
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-08 22:09 ` Eric Schulte
2013-03-08 22:24 ` aaronecay
@ 2013-03-09 0:57 ` Achim Gratz
2013-03-09 18:35 ` Eric Schulte
1 sibling, 1 reply; 27+ messages in thread
From: Achim Gratz @ 2013-03-09 0:57 UTC (permalink / raw)
To: emacs-orgmode
Eric Schulte writes:
> I prefer leaving the hash with the results, as it is the results which
> are "hashed". Also, same input does not always guarantee same output,
> e.g.,
>
> #+begin_src sh
> date
> #+end_src
That's not what I'm seeing, but I may be missing something again. The
hash is for the parameters of the call, not the result. If I'm editing
the result, Babel still marks the cache valid and does not re-compute
it. It does re-compute if I change the parameters explicitly or
implicitly, even if the result will not change.
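A minimal sketch of the behaviour described above, written in Python rather than Emacs Lisp and not taken from ob-core.el. The cache key is derived only from the block body and its header arguments, so the stored result never enters the comparison. The names block_hash and needs_rerun are hypothetical, and hashing a sorted parameter list with SHA-1 is an assumption about the general shape, not a description of Babel's actual hashing code.

```python
import hashlib


def block_hash(body, params):
    """Hash the block body plus its header arguments; the result is not an input."""
    # Sort the parameters so that their order in the header does not matter.
    canonical = body + "\n" + repr(sorted(params.items()))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def needs_rerun(body, params, stored_hash):
    """Re-run only when the inputs changed; the stored result is never inspected."""
    return block_hash(body, params) != stored_hash


stored = block_hash("date", {":results": "output"})

# Editing the result text in the buffer has no effect on this check.
unchanged = needs_rerun("date", {":results": "output"}, stored)

# Changing a header argument invalidates the cache, even if the output
# of the block would have been identical.
changed = needs_rerun("date", {":results": "output", ":dir": "/tmp"}, stored)
```

Under this model the hash acts as a signature of the invocation, which is why a hand-edited result is still treated as current.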
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
Wavetables for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-08 22:24 ` aaronecay
@ 2013-03-09 17:45 ` Eric Schulte
2013-03-09 18:56 ` Aaron Ecay
2013-03-09 20:03 ` Achim Gratz
0 siblings, 2 replies; 27+ messages in thread
From: Eric Schulte @ 2013-03-09 17:45 UTC (permalink / raw)
To: Achim Gratz; +Cc: emacs-orgmode
aaronecay@gmail.com writes:
> On 8 March 2013, Eric Schulte wrote:
>>
>> I would agree. I don't believe *any* changes should take place in the
>> buffer when a code block is executed with ":results none".
>
> A common use case for me is to use a babel block to load a large dataset
> into R. I want this to be cached, in the sense that I want it not to be
> run again (by e.g. C-c C-v C-b) unless the code changes. But I also
> don’t want to see its result in the (mini)buffer. Is there a way to
> accommodate this usage of the cache functionality?
>
Maybe a better solution would be to add a feature to avoid echoing very
large results to the minibuffer. It should be very straightforward to
add a user customizable variable (e.g., `org-babel-max-echo-length' or
somesuch) which limits the number of characters echo'd to the
minibuffer.
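The suggestion could look roughly like the following, shown as a language-neutral Python sketch rather than as Emacs Lisp. MAX_ECHO_LENGTH stands in for the proposed (and at this point purely hypothetical) `org-babel-max-echo-length' variable, and echo_text is an invented helper name. Note that this only shortens what is displayed; it does nothing about the cost of reading a large result in the first place.

```python
MAX_ECHO_LENGTH = 40  # hypothetical stand-in for `org-babel-max-echo-length'


def echo_text(result, limit=MAX_ECHO_LENGTH):
    """Return the text to echo, cut to at most `limit` characters plus an ellipsis."""
    text = str(result)
    if len(text) <= limit:
        return text
    # Mark the truncation so the user can tell the echo is incomplete.
    return text[:limit] + "..."
```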
>> The hyphen should only be required for multi-word functions, e.g.,
>> `listp' has no hyphen but `hash-table-p' does have a hyphen.
>
> The context surrounding this code binds cache-p; the lack of a hyphen
> was just a typo in the patch. I agree that cachep is more idiomatic (in
> fact, that is what led to the typo), but I tried to make the smallest
> possible patch to address my intention.
Ah, my fault for not completely reading and understanding your previous
post. I'm currently working on a set of patches with Achim which should
(I believe) resolve this issue.
--
Eric Schulte
http://cs.unm.edu/~eschulte
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-09 0:57 ` Achim Gratz
@ 2013-03-09 18:35 ` Eric Schulte
2013-03-09 19:22 ` Aaron Ecay
2013-03-10 8:52 ` Achim Gratz
0 siblings, 2 replies; 27+ messages in thread
From: Eric Schulte @ 2013-03-09 18:35 UTC (permalink / raw)
To: Achim Gratz; +Cc: emacs-orgmode
Achim Gratz <Stromeko@nexgo.de> writes:
> Eric Schulte writes:
>> I prefer leaving the hash with the results, as it is the results which
>> are "hashed". Also, same input does not always guarantee same output,
>> e.g.,
>>
>> #+begin_src sh
>> date
>> #+end_src
>
> That's not what I'm seeing, but I may be missing something again. The
> hash is for the parameters of the call, not the result. If I'm editing
> the result, Babel still marks the cache valid and does not re-compute
> it. It does re-compute if I change the parameters explicitly or
> implicitly, even if the result will not change.
>
A hash marks a *result* with an indication of what was used to generate
it (code block & parameters). The point of a hash is to allow the
result to be returned without having to re-execute. For this reason, I
think that the hash should live with the result. In general a hash
without a result doesn't make sense (because then what is cached?).
If one did want to move hashes to code blocks it would be a major
refactoring which would (in my opinion) require significant
justification.
As I understand this particular case, the OP is using a hash not to mark
a result as up to date, but rather to mark a side effect (loading data
into R) as having taken place. I think this is a misuse of a cache.
What if the R process restarts? The hash would still be valid, but the
side effects have been lost. I think a better approach would be to
implement the logic in R required to check if data is present and
conditionally load it if not. Then simply re-run this conditional
reloading code in full every time.
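The conditional-reload idea can be sketched as follows. This is Python standing in for the R code, with a plain dictionary playing the role of the R session's environment; ensure_loaded and load_big_dataset are hypothetical names. The point is only that the block is safe to re-run in full every time, because the expensive step is skipped when the data is already present.

```python
session = {}  # stands in for the interpreter session's global environment


def ensure_loaded(name, loader, env=session):
    """Load `name` with `loader` only if the session does not already hold it."""
    if name not in env:
        env[name] = loader()
    return env[name]


load_calls = []


def load_big_dataset():
    # Placeholder for an expensive read; records each real load.
    load_calls.append(1)
    return [1, 2, 3]


first = ensure_loaded("dataset", load_big_dataset)   # loads
second = ensure_loaded("dataset", load_big_dataset)  # reuses the loaded data

session.clear()                                      # simulate a session restart
third = ensure_loaded("dataset", load_big_dataset)   # loads again
```

Because the check looks at the live session rather than at a hash stored in the buffer, a restarted process is handled without any manual re-evaluation.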
It is very possible I've missed something.
I hope these comments are helpful.
Best,
--
Eric Schulte
http://cs.unm.edu/~eschulte
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-09 17:45 ` Eric Schulte
@ 2013-03-09 18:56 ` Aaron Ecay
2013-03-09 20:03 ` Achim Gratz
1 sibling, 0 replies; 27+ messages in thread
From: Aaron Ecay @ 2013-03-09 18:56 UTC (permalink / raw)
To: Eric Schulte; +Cc: Achim Gratz, emacs-orgmode
On 9 March 2013, Eric Schulte wrote:
> Maybe a better solution would be to add a feature to avoid echoing
> very large results to the minibuffer. It should be very
> straightforward to add a user customizable variable (e.g.,
> `org-babel-max-echo-length' or somesuch) which limits the number of
> characters echo'd to the minibuffer.
If a very large result is read by emacs, it slows down drastically. This
is in fact the raison d’être of :results none
(http://thread.gmane.org/gmane.emacs.orgmode/62115/focus=62665). So I’m
afraid this doesn’t help.
--
Aaron Ecay
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-09 18:35 ` Eric Schulte
@ 2013-03-09 19:22 ` Aaron Ecay
2013-03-09 20:26 ` Eric Schulte
2013-03-10 8:52 ` Achim Gratz
1 sibling, 1 reply; 27+ messages in thread
From: Aaron Ecay @ 2013-03-09 19:22 UTC (permalink / raw)
To: Eric Schulte; +Cc: Achim Gratz, emacs-orgmode
On 9 March 2013, Eric Schulte wrote:
> A hash marks a *result* with an indication of what was used to generate
> it (code block & parameters). The point of a hash is to allow the
> result to be returned without having to re-execute. For this reason, I
> think that the hash should live with the result. In general a hash
> without a result doesn't make sense (because then what is cached?).
A :results none code block is run for its side effects (by definition).
Caching a code block with results says “I do not want to recalculate
this value unless the code changes.” Caching a null result, by analogy,
says “I do not want these side effects again, unless the code changes”.
>
> As I understand this particular case, the OP is using a hash not to mark
> a result as up to date, but rather to mark a side effect (loading data
> into R) as having taken place. I think this is a misuse of a cache.
It depends on whether one looks at a cache as “a place to store results”
or “a way to conditionally rerun code blocks only when they change”, I
suppose. I guess you hold with the former; I think the latter is a
useful conceptual extension. Knitr (http://yihui.name/knitr/) is a very
useful literate programming tool for R, and it supports “caching” code
with side-effects using clever means. I don’t think org should do all
the tricks knitr does, but it would be useful to be able to
conditionally reexecute code with no results/with side effects.
>
> What if the R process restarts? The hash would still be valid, but the
> side effects have been lost.
This is also an issue if the external data files have changed, the RNG
seed is no longer the same, etc. In such cases, the user has to be
clever. But the same is true of any cached code that is not a pure
function.
In practice, if the R process is restarted the “variable not found”
errors quickly become apparent, and reloading the data is a simple C-u
C-c C-c away.
(That being said, including the PID of the R process in the results
hash, to the effect that the code would be rerun in the case you
mention, might not be a bad idea. But that is a separate discussion.)
--
Aaron Ecay
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-09 17:45 ` Eric Schulte
2013-03-09 18:56 ` Aaron Ecay
@ 2013-03-09 20:03 ` Achim Gratz
1 sibling, 0 replies; 27+ messages in thread
From: Achim Gratz @ 2013-03-09 20:03 UTC (permalink / raw)
To: emacs-orgmode
Eric Schulte writes:
> Ah, my fault for not completely reading and understanding your previous
> post. I'm currently working on a set of patches with Achim which should
> (I believe) resolve this issue.
It doesn't yet, but I'll add another patch to rename this binding.
IIRC, the naming was so that it would rhyme more easily with
cache-current-p which has the hyphen (and should keep it, I think). But
if cachep is more idiomatic, I'm not the one to argue.
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
Waldorf MIDI Implementation & additional documentation:
http://Synth.Stromeko.net/Downloads.html#WaldorfDocs
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-09 19:22 ` Aaron Ecay
@ 2013-03-09 20:26 ` Eric Schulte
2013-03-13 3:55 ` Aaron Ecay
0 siblings, 1 reply; 27+ messages in thread
From: Eric Schulte @ 2013-03-09 20:26 UTC (permalink / raw)
To: Eric Schulte; +Cc: Achim Gratz, emacs-orgmode
>>
>> As I understand this particular case, the OP is using a hash not to mark
>> a result as up to date, but rather to mark a side effect (loading data
>> into R) as having taken place. I think this is a misuse of a cache.
>
> It depends on whether one looks at a cache as “a place to store results”
> or “a way to conditionally rerun code blocks only when they change”, I
> suppose. I guess you hold with the former; I think the latter is a
> useful conceptual extension. Knitr (http://yihui.name/knitr/) is a very
> useful literate programming tool for R, and it supports “caching” code
> with side-effects using clever means. I don’t think org should do all
> the tricks knitr does, but it would be useful to be able to
> conditionally reexecute code with no results/with side effects.
>
Could something like the following work? It removes ":results none" and
returns something small, which can easily be parsed and placed in the
buffer without problems.
#+begin_src R :cache yes
# code to perform side effect
x <- 'side effect'
'done'
#+end_src
#+RESULTS[9f4e5b4b07e93c680ab37fc4ba1f75e1bfc0ee0a]:
: done
>
>>
>> What if the R process restarts? The hash would still be valid, but the
>> side effects have been lost.
>
> This is also an issue if the external data files have changed, the RNG
> seed is no longer the same, etc. In such cases, the user has to be
> clever. But the same is true of any cached code that is not a pure
> function.
>
> In practice, if the R process is restarted the “variable not found”
> errors quickly become apparent, and reloading the data is a simple C-u
> C-c C-c away.
>
> (That being said, including the PID of the R process in the results
> hash, to the effect that the code would be rerun in the case you
> mention, might not be a bad idea. But that is a separate discussion.)
This does not need special built in support, e.g.,
#+name: R-pid
#+begin_src sh :var R="/usr/lib64/R/bin/exec/R"
ps auxwww|grep "$R"|grep -v 'grep'|awk '{print $2}'
#+end_src
#+begin_src R :cache yes :var pid=R-pid
# code to perform side effect
x <- 'side effect'
'done'
#+end_src
#+RESULTS[da16f09882a6295815db51247592b77c80ed0056]:
: done
Best,
--
Eric Schulte
http://cs.unm.edu/~eschulte
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-09 18:35 ` Eric Schulte
2013-03-09 19:22 ` Aaron Ecay
@ 2013-03-10 8:52 ` Achim Gratz
2013-03-10 20:14 ` Sebastien Vauban
2013-03-13 4:12 ` Aaron Ecay
1 sibling, 2 replies; 27+ messages in thread
From: Achim Gratz @ 2013-03-10 8:52 UTC (permalink / raw)
To: emacs-orgmode
Eric Schulte writes:
> A hash marks a *result* with an indication of what was used to generate
> it (code block & parameters). The point of a hash is to allow the
> result to be returned without having to re-execute. For this reason, I
> think that the hash should live with the result.
Here Babel is assuming a very specific execution model, namely a
functional one (a function with the same parameters always delivers the
same results, so if you see the same function invoked with the same
parameters you can just substitute the result from an earlier
invocation). Babel isn't a functional language however, so it is both
possible and done in practice to use it for side-effects or even
side-effects only.
But back to my earlier remark about the hash value actually being a
signature of the source block and not the result. If I use noweb
references, the reference text is cached, not its expansion. See the
example below where after the first invocation I change the source block
referenced to deliver a different result. That invalidates the cache
for direct invocation of that block, but fails to do so for the indirect
invocation. If you look at the two result blocks, you see that the same
hash is added to two different blocks.
--8<---------------cut here---------------start------------->8---
#+name: list
#+header: :exports none :results yes :eval query :cache yes
#+begin_src emacs-lisp
'(a b c d)
#+end_src
#+RESULTS[6bd0507c2cc972cc7647a9c2c169a1095bab5941]: list
| a | b | c | d |
#+RESULTS[d8dad02c5c6fd93a991a4bb23471f273cc0b3415]: list-1
| a | b | c |
#+name: indirect
#+header: :noweb yes
#+header: :exports none :results yes :eval query :cache yes
#+begin_src emacs-lisp
<<list>>
#+end_src
#+RESULTS[0b6ada101242e80d4d50f4909f33d8819a88ea4e]: indirect
| a | b | c | d |
#+RESULTS[0b6ada101242e80d4d50f4909f33d8819a88ea4e]: indirect-1
| a | b | c |
--8<---------------cut here---------------end--------------->8---
I'm not saying this needs fixing (expanding references could easily be
the most costly step in a re-evaluation), but the description in the
manual talks about caching in terms of results which is not what is
actually implemented, as demonstrated above.
> In general a hash without a result doesn't make sense (because then
> what is cached?).
If the question was meant as "did this code block already run?" and the
invocation was for side-effects only, then it does make sense to me.
> If one did want to move hashes to code blocks it would be a major
> refactoring which would (in my opinion) require significant
> justification.
I'm not disputing that it requires significant effort. The benefits
would be that we might have a chance to clear up some confusion over the
code execution model of Babel and better support different ones.
> As I understand this particular case, the OP is using a hash not to mark
> a result as up to date, but rather to mark a side effect (loading data
> into R) as having taken place. I think this is a misuse of a cache.
Or you might call it a clever hack. But I think the general problem of
needing one-time invocations of source blocks is one that comes up often
when programming with side-effects that are not directly observable.
This again comes in different shades, I'd often want to run some blocks
only when the document is first opened, but then only again if something
changes. This would require that the hash value was a property of the
buffer text, not actual buffer text, I'd think. Sure, you can use hooks
to nuke the caches on load, but that only works when you are using your
own Emacs configuration.
> What if the R process restarts? The hash would still be valid, but the
> side effects have been lost. I think a better approach would be to
> implement the logic in R required to check if data is present and
> conditionally load it if not. Then simply re-run this conditional
> reloading code in full every time.
Oh yes, there's a whole set of _other_ problems that are waiting to be
solved. :-)
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
SD adaptations for KORG EX-800 and Poly-800MkII V0.9:
http://Synth.Stromeko.net/Downloads.html#KorgSDada
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-10 8:52 ` Achim Gratz
@ 2013-03-10 20:14 ` Sebastien Vauban
2013-03-10 21:06 ` Achim Gratz
2013-03-13 4:12 ` Aaron Ecay
1 sibling, 1 reply; 27+ messages in thread
From: Sebastien Vauban @ 2013-03-10 20:14 UTC (permalink / raw)
To: emacs-orgmode
Achim, Eric,
Achim Gratz wrote:
> Eric Schulte writes:
>> A hash marks a *result* with an indication of what was used to generate
>> it (code block & parameters). The point of a hash is to allow the
>> result to be returned without having to re-execute. For this reason, I
>> think that the hash should live with the result.
>
> Here Babel is assuming a very specific execution model, namely a
> functional one (a function with the same parameters always delivers the
> same results, so if you see the same function invoked with the same
> parameters you can just substitute the result from an earlier
> invocation). Babel isn't a functional language however, so it is both
> possible and done in practice to use it for side-effects or even
> side-effects only.
>
> But back to my earlier remark about the hash value actually being a
> signature of the source block and not the result. If I use noweb
> references, the reference text is cached, not its expansion.
Well seen... I wouldn't have thought of that...
A more general question: shouldn't cache be unusable (generate an error) when
there is a session? In the presence of a session, I've the impression that
caching results is always wrong. Who knows its contents before executing the
code, in the next Emacs session?
Best regards,
Seb
--
Sebastien Vauban
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-10 20:14 ` Sebastien Vauban
@ 2013-03-10 21:06 ` Achim Gratz
0 siblings, 0 replies; 27+ messages in thread
From: Achim Gratz @ 2013-03-10 21:06 UTC (permalink / raw)
To: emacs-orgmode
Sebastien Vauban writes:
> A more general question: shouldn't cache be unusable (generate an
> error) when there is a session? In the presence of a session, I've
> the impression that caching results is always wrong. Who knows its
> contents before executing the code, in the next Emacs session?
That depends on what execution model you have in mind. Generally
caching results from a session has questionable utility as you mention,
since you still have to arrange certain things by hand. Now, if Babel
would provide a way to check a session ID and have the hash depend on it
also, then suddenly this becomes an extremely useful tool. One
application would be to ensure that certain initializations have taken
place (but only once) in that session even though the code calling into
the session hasn't set it up.
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
Factory and User Sound Singles for Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-09 20:26 ` Eric Schulte
@ 2013-03-13 3:55 ` Aaron Ecay
2013-03-13 14:45 ` Eric Schulte
0 siblings, 1 reply; 27+ messages in thread
From: Aaron Ecay @ 2013-03-13 3:55 UTC (permalink / raw)
To: Eric Schulte; +Cc: Achim Gratz, emacs-orgmode
Hi Eric,
On 9 March 2013, Eric Schulte wrote:
> Could something like the following work? It removes ":results none" and
> returns something small, which can easily be parsed and placed in the
> buffer without problems.
>
> #+begin_src R :cache yes
> # code to perform side effect
> x <- 'side effect'
> 'done'
> #+end_src
>
> #+RESULTS[9f4e5b4b07e93c680ab37fc4ba1f75e1bfc0ee0a]:
> : done
It works, but it is a kludge. In fact, it is the same kludge that we
used to need before :results none (to avoid emacs choking on reading a
monster data frame).
> This does not need special built in support, e.g.,
>
> #+name: R-pid
> #+begin_src sh :var R="/usr/lib64/R/bin/exec/R"
> ps auxwww|grep "$R"|grep -v 'grep'|awk '{print $2}'
> #+end_src
>
> #+begin_src R :cache yes :var pid=R-pid
> # code to perform side effect
> x <- 'side effect'
> 'done'
> #+end_src
>
> #+RESULTS[da16f09882a6295815db51247592b77c80ed0056]:
> : done
Now *this* is a kludge! Since babel involves executing arbitrary code,
the question to ask is not “Is this possible in babel?”. The answer is
always “yes.” The right question is instead “What does it make the most
sense for babel to do?” I think Achim’s contributions to this thread,
which push us to think about what the execution model is, are exactly
what is needed.
For cached code running in a session, I think a sensible model is:
- Code should be re-run once after each session startup
- Other than that, code should be re-run only if it changes, or if the
user explicitly requests it to be re-run.
In order to implement this, it is necessary to figure out how to hash
the contents of :results none blocks, and include the session process id
in the hash. If you have a different model in mind, then you will want
different behavior. But I think (thanks to Achim’s clarifying comments)
we can’t really discuss what is the “right” behavior without also
discussing which is the “right” model.
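The model above can be sketched as follows, in Python for brevity and under stated assumptions: session_hash and should_run are hypothetical names, the session is identified by its process id, and force represents an explicit user request to re-evaluate. Nothing here reflects how ob-core.el currently computes its hash.

```python
import hashlib


def session_hash(body, params, session_pid):
    """Hash the code, its header arguments and the id of the session process."""
    payload = "\n".join([body, repr(sorted(params.items())), str(session_pid)])
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def should_run(body, params, session_pid, stored_hash, force=False):
    """Run when forced, when the code changed, or when the session is new."""
    return force or session_hash(body, params, session_pid) != stored_hash


params = {":session": "*R*", ":results": "none"}
stored = session_hash("load_data()", params, 4242)

same_session = should_run("load_data()", params, 4242, stored)          # skip
new_session = should_run("load_data()", params, 5151, stored)           # run once
edited_code = should_run("load_data(fast=TRUE)", params, 4242, stored)  # run
forced = should_run("load_data()", params, 4242, stored, force=True)    # run
```

Since the stored hash is replaced after each run, a restarted session triggers exactly one re-execution and then settles again.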
--
Aaron Ecay
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-10 8:52 ` Achim Gratz
2013-03-10 20:14 ` Sebastien Vauban
@ 2013-03-13 4:12 ` Aaron Ecay
2013-03-13 7:50 ` Achim Gratz
2013-03-13 14:42 ` Eric Schulte
1 sibling, 2 replies; 27+ messages in thread
From: Aaron Ecay @ 2013-03-13 4:12 UTC (permalink / raw)
To: Achim Gratz; +Cc: emacs-orgmode
Hi Achim,
On 10 March 2013, Achim Gratz wrote:
> But back to my earlier remark about the hash value actually being a
> signature of the source block and not the result. If I use noweb
> references, the reference text is cached, not its expansion. See the
> example below where after the first invocation I change the source block
> referenced to deliver a different result. That invalidates the cache
> for direct invocation of that block, but fails to do so for the indirect
> invocation. If you look at the two result blocks, you see that the same
> hash is added to two different blocks.
I think this points in the direction of having the notion of
dependencies among source blocks. This is an idea that knitr
(http://yihui.name/knitr/) implements. The idea would be to include in
the hash of a source block X (in addition to all the pieces that are
already in the hash) the hash of the blocks that X depends on. So in
your example, the data that generated the hashes beginning 0bd... would
be made distinct, because they would include in one case the hash
6bd... and in the other d8d... .
As in knitr, I think that manual dependency specification (e.g. in the
header args of the block) should be possible. But it would also be
possible to automatically infer that a block depends on any block that
it references via a :var header or noweb reference – which would in turn
automatically fix the case you discussed.
And when evaluating a block, the dependencies should be (recursively)
evaluated first, in case any of them has changed.
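A minimal sketch of the hashing half of this idea, in Python rather than elisp. The `blocks` mapping and the name `dep_hash` are hypothetical, and this is neither knitr's nor Babel's actual implementation.

```python
import hashlib

def dep_hash(name, blocks, _seen=()):
    """Hash block `name` together with the hashes of the blocks it depends on.

    `blocks` maps a block name to (body, [names of blocks it depends on]).
    """
    if name in _seen:
        raise ValueError(f"circular dependency through {name!r}")
    body, deps = blocks[name]
    digest = hashlib.sha1(body.encode("utf-8"))
    for dep in sorted(deps):
        # Folding in each dependency's hash makes upstream edits propagate.
        digest.update(dep_hash(dep, blocks, _seen + (name,)).encode("ascii"))
    return digest.hexdigest()
```

With this, editing a referenced block changes the hash of every block that depends on it, directly or indirectly, while unrelated blocks keep their hashes.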
Is it clear what I am describing, and do you have thoughts on it?
>
>> If one did want to move hashes to code blocks it would be a major
>> refactoring which would (in my opinion) require significant
>> justification.
>
> I'm not disputing that it requires significant effort. The benefits
> would be that we might have a chance to clear up some confusion over the
> code execution model of Babel and better support different ones.
FWIW, I think that hashes shouldn’t be stored in the buffer text at all.
They’re not really part of the document data or metadata. Rather, they
are information about how the content of the document (code and its
results) was instantiated/computed in a particular environment/occasion.
I’d rather see them stored in a lisp data structure. They could be
written out to an invisible file when the org buffer is saved, and
re-read on load.
> Oh yes, there's a whole set of _other_ problems that are waiting to be
> solved. :-)
There always is. :-)
--
Aaron Ecay
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-13 4:12 ` Aaron Ecay
@ 2013-03-13 7:50 ` Achim Gratz
2013-03-13 14:42 ` Eric Schulte
1 sibling, 0 replies; 27+ messages in thread
From: Achim Gratz @ 2013-03-13 7:50 UTC (permalink / raw)
To: emacs-orgmode
Aaron Ecay writes:
> I think this points in the direction of having the notion of
> dependencies among source blocks.
[...]
I know nothing about knitr, but the problem at hand is both well studied
and has numerous solutions[*]. That is, once we've decided on what the
execution model is.
[*] Not necessarily efficient ones.
> FWIW, I think that hashes shouldn’t be stored in the buffer text at
> all.
I beg to differ. Org is a text format and should stay that way, you
should be able to take just all Org files with you and have everything
you need.
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-13 4:12 ` Aaron Ecay
2013-03-13 7:50 ` Achim Gratz
@ 2013-03-13 14:42 ` Eric Schulte
2013-03-13 18:25 ` Achim Gratz
1 sibling, 1 reply; 27+ messages in thread
From: Eric Schulte @ 2013-03-13 14:42 UTC (permalink / raw)
To: Achim Gratz; +Cc: emacs-orgmode
Aaron Ecay <aaronecay@gmail.com> writes:
> Hi Achim,
>
> On 10 March 2013, Achim Gratz wrote:
>> But back to my earlier remark about the hash value actually being a
>> signature of the source block and not the result. If I use noweb
>> references, the reference text is cached, not its expansion. See the
>> example below where after the first invocation I change the source block
>> referenced to deliver a different result. That invalidates the cache
>> for direct invocation of that block, but fails to do so for the indirect
>> invocation. If you look at the two result blocks, you see that the same
>> hash is added to two different blocks.
>
> I think this points in the direction of having the notion of
> dependencies among source blocks. This is an idea that knitr
> (http://yihui.name/knitr/) implements. The idea would be to include in
> the hash of a source block X (in addition to all the pieces that are
> already in the hash) the hash of the blocks that X depends on. So in
> your example, the data that generated the hashes beginning 0bd... would
> be made distinct, because they would include in one case the hash
> 6bd... and in the other d8d... .
>
> As in knitr, I think that manual dependency specification (e.g. in the
> header args of the block) should be possible. But it would also be
> possible to automatically infer that a block depends on any block that
> it references via a :var header or noweb reference – which would in turn
> automatically fix the case you discussed.
>
This is what is already taking place. The :var header arguments are
automatically expanded into dependencies between code blocks, and the
results of previous code blocks are included in the hash calculation of
the current code block.
From re-looking at Achim's previous noweb example, it seems that we
currently do *not* include the values of noweb expansions in code block
hash calculations, I think this is a bug which should be fixed.
>
> And when evaluating a block, the dependencies should be (recursively)
> evaluated first, in case any of them has changed.
>
This is exactly what happens currently with previous blocks referenced
through :var header arguments.
>
> Is it clear what I am describing, and do you have thoughts on it?
>
Very, thank you for spelling it out. I believe that given the bug fix
just mentioned, the current model indeed does support automatic
inference of dependencies between blocks.
>
>>
>>> If one did want to move hashes to code blocks it would be a major
>>> refactoring which would (in my opinion) require significant
>>> justification.
>>
>> I'm not disputing that it requires significant effort. The benefits
>> would be that we might have a chance to clear up some confusion over the
>> code execution model of Babel and better support different ones.
>
> FWIW, I think that hashes shouldn’t be stored in the buffer text at
> all.
To echo Achim's response, you've accidentally uttered Org-mode heresy.
A core design principle is that everything be represented as plain text
in the buffer. That said, the hashes should be largely hidden by
default, and the degree of hiding can be controlled by the
`org-babel-hash-show' variable.
>
> They’re not really part of the document data or metadata. Rather,
> they are information about how the content of the document (code and
> its results) was instantiated/computed in a particular
> environment/occasion. I’d rather see them stored in a lisp data
> structure. They could be written out to an invisible file when the
> org buffer is saved, and re-read on load.
>
>> Oh yes, there's a whole set of _other_ problems that are waiting to be
>> solved. :-)
>
> There always is. :-)
I think Org-mode already provides the bulk of what is desired. If we
agree to treat ":cache yes :results none" as obviously taking place for
side effects, and then sticking a hash behind the :cache header argument
with the code block, then what functionality would be missing?
Thanks,
--
Eric Schulte
http://cs.unm.edu/~eschulte
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-13 3:55 ` Aaron Ecay
@ 2013-03-13 14:45 ` Eric Schulte
2013-03-19 4:49 ` Aaron Ecay
0 siblings, 1 reply; 27+ messages in thread
From: Eric Schulte @ 2013-03-13 14:45 UTC (permalink / raw)
To: Achim Gratz; +Cc: emacs-orgmode
Aaron Ecay <aaronecay@gmail.com> writes:
> Hi Eric,
>
> On 9 March 2013, Eric Schulte wrote:
>> Could something like the following work? Removing ":results none" and
>> adding something small as the returned result which may easily be parsed
>> and placed in the buffer w/o problem.
>>
>> #+begin_src R :cache yes
>> # code to perform side effect
>> x <- 'side effect'
>> 'done'
>> #+end_src
>>
>> #+RESULTS[9f4e5b4b07e93c680ab37fc4ba1f75e1bfc0ee0a]:
>> : done
>
> It works, but it is a kludge. In fact, it is the same kludge that we
> used to need before :results none (to avoid emacs choking on reading a
> monster data frame).
>
Well, I suppose one man's dirty kludge is another's beautiful hack. The
question here is whether the complexity lies in the implementation (and
thus the interface) or in the code block itself. While I generally
prefer the latter, in this case of ":results none :cache yes" I would be
open to placing some custom logic in the backend, which stores the hash
value with the code block, possibly changing
#+begin_src R :cache yes
# code to perform side effect
#+end_src
to
#+begin_src R :cache 9f4e5b4b07e93c680ab37fc4ba1f75e1bfc0ee0a
# code to perform side effect
#+end_src
keeping in mind that the actual hash value should be hidden after the
first couple of characters.
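As a rough illustration of the header rewrite only (not of how the hash would be hidden), here is a sketch in Python. The helper names `store_hash_in_header` and `stored_hash` are invented for this example and do not exist in Babel.

```python
import re

# Matches the :cache header argument and its value on a #+begin_src line.
CACHE_ARG = re.compile(r"(:cache\s+)(\S+)")

def store_hash_in_header(header, new_hash):
    """Return the #+begin_src line with the :cache value replaced by new_hash."""
    if not CACHE_ARG.search(header):
        raise ValueError("header has no :cache argument")
    return CACHE_ARG.sub(lambda m: m.group(1) + new_hash, header, count=1)

def stored_hash(header):
    """Return the hash kept after :cache, or None for a plain yes/no value."""
    match = CACHE_ARG.search(header)
    if match is None or match.group(2) in ("yes", "no"):
        return None
    return match.group(2)
```

On the next evaluation, the value from `stored_hash` would be compared against a freshly computed hash to decide whether the block runs again.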
>
>
>> This does not need special built in support, e.g.,
>>
>> #+name: R-pid
>> #+begin_src sh :var R="/usr/lib64/R/bin/exec/R"
>> ps auxwww|grep "$R"|grep -v 'grep'|awk '{print $2}'
>> #+end_src
>>
>> #+begin_src R :cache yes :var pid=R-pid
>> # code to perform side effect
>> x <- 'side effect'
>> 'done'
>> #+end_src
>>
>> #+RESULTS[da16f09882a6295815db51247592b77c80ed0056]:
>> : done
>
> Now *this* is a kludge!
I was actually very proud of this solution. It is what would be done by
the framework if we did implement custom support, but by doing it with
code blocks the exact mechanics are visible to the user.
> Since babel involves executing arbitrary code, the question to ask is
> not “Is this possible in babel?”. The answer is always “yes.”
Thank you very much. :)
> The right question is instead “What does it make the most sense for
> babel to do?” I think Achim’s contributions to this thread pushing us
> in the direction of thinking about what the execution model is are
> exactly what is needed.
>
> For cached code running in a session, I think a sensible model is:
> - Code should be re-run once after each session startup
> - Other than that, code should be re-run only if it changes, or if the
> user explicitly requests it to be re-run.
>
How should session startup be determined if not through inclusion of the
session PID in the code block hash? Perhaps the above could be made
more elegant through the addition of an elisp function which returns the
pid of the current R session, allowing the above to be truncated to
something like the following.
#+begin_src R :cache yes :session foo :var pid=(R-pid "foo")
# code to perform side effect
x <- 'side effect'
'done'
#+end_src
I don't suppose ESS provides such a function?
>
> In order to implement this, it is necessary to figure out how to hash
> the contents of :results none blocks, and include the session process id
> in the hash. If you have a different model in mind, then you will want
> different behavior. But I think (thanks to Achim’s clarifying comments)
> we can’t really discuss what is the “right” behavior without also
> discussing which is the “right” model.
Perhaps what we want is a ":results hash" header argument, which returns
the hash of the code block *as* the code block's result? I'm not yet
convinced that the existing variable/results support with dummy values
is insufficient to structure dependencies between blocks.
Thanks,
--
Eric Schulte
http://cs.unm.edu/~eschulte
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-13 14:42 ` Eric Schulte
@ 2013-03-13 18:25 ` Achim Gratz
2013-03-14 19:52 ` Eric Schulte
0 siblings, 1 reply; 27+ messages in thread
From: Achim Gratz @ 2013-03-13 18:25 UTC (permalink / raw)
To: emacs-orgmode
Eric Schulte writes:
> From re-looking at Achim's previous noweb example, it seems that we
> currently do *not* include the values of noweb expansions in code block
> hash calculations, I think this is a bug which should be fixed.
It could very well have been a conscious decision, given that this can
lead to exponential complexity (I guess it's too late to ask Dan Davison
about that). That's why I said we need to clarify what we want this to
do first and then see how we implement it.
Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
SD adaptations for KORG EX-800 and Poly-800MkII V0.9:
http://Synth.Stromeko.net/Downloads.html#KorgSDada
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-13 18:25 ` Achim Gratz
@ 2013-03-14 19:52 ` Eric Schulte
0 siblings, 0 replies; 27+ messages in thread
From: Eric Schulte @ 2013-03-14 19:52 UTC (permalink / raw)
To: Achim Gratz; +Cc: emacs-orgmode
Achim Gratz <Stromeko@nexgo.de> writes:
> Eric Schulte writes:
>> From re-looking at Achim's previous noweb example, it seems that we
>> currently do *not* include the values of noweb expansions in code block
>> hash calculations, I think this is a bug which should be fixed.
>
> It could very well have been a conscious decision, given that this can
> lead to exponential complexity (I guess it's too late to ask Dan Davison
> about that). That's why I said we need to clarify what we want this to
> do first and then see how we implement it.
>
I do think that the noweb expanded body should be used in code block
hashing. This is not very different from the expansion of included :var
header arguments before hashing takes place.
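To illustrate what hashing the expanded body would mean, here is a small sketch in Python. It assumes the simplest `<<name>>` form of noweb reference (no arguments), and `expand_noweb` and `expanded_hash` are illustrative names, not the functions in ob-core.el.

```python
import hashlib
import re

# Simplified noweb reference: <<name>> with no arguments.
NOWEB_REF = re.compile(r"<<([^<>\s]+)>>")

def expand_noweb(body, blocks, _seen=()):
    """Replace each <<name>> with the recursively expanded body of that block."""
    def replace(match):
        name = match.group(1)
        if name in _seen:
            raise ValueError(f"circular noweb reference to {name!r}")
        return expand_noweb(blocks[name], blocks, _seen + (name,))
    return NOWEB_REF.sub(replace, body)

def expanded_hash(name, blocks):
    """Hash the expanded body, so edits to referenced blocks change the hash."""
    expanded = expand_noweb(blocks[name], blocks, (name,))
    return hashlib.sha1(expanded.encode("utf-8")).hexdigest()
```

Hashing the reference text alone would leave the hash of the outer block untouched when the inner block is edited, which is the behavior Achim's example exposed.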
Cheers,
>
>
> Regards,
> Achim.
--
Eric Schulte
http://cs.unm.edu/~eschulte
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-13 14:45 ` Eric Schulte
@ 2013-03-19 4:49 ` Aaron Ecay
2013-03-23 22:34 ` Eric Schulte
0 siblings, 1 reply; 27+ messages in thread
From: Aaron Ecay @ 2013-03-19 4:49 UTC (permalink / raw)
To: Eric Schulte; +Cc: Achim Gratz, emacs-orgmode
Hi Eric,
I’m jointly replying to 2 of your emails.
On 13 March 2013, Eric Schulte wrote:
> This is what is already taking place. The :var header arguments are
> automatically expanded into dependencies between code blocks, and the
> results of previous code blocks are included in the hash calculation of
> the current code block.
Wow, I did not realize that the :var handling was so sophisticated.
Would it be possible to introduce a :depends code-block-name header
argument, which recycles the same dependency calculation but does not
bind a variable in the code block? Many of the variables that I rely on
between blocks are large data frames, and I worry that dumping them into
the org buffer and then reloading them into R[fn:1] will result in a
slowdown and/or loss of some structure in the data.
[fn:1] My understanding of the :var-handling code is that this is how it
acquires the values to assign to the variables, as opposed to re-using a
variable that is present in a session. But the code is complex, so
maybe I’m wrong (again).
I also think this would make the feature more discoverable: a :var is
just a sub-type of :depends, with extra functionality. Coming from a
Sweave/knitr background, I expected something like :depends, and thus
didn’t notice :var.
>
> From re-looking at Achim's previous noweb example, it seems that we
> currently do *not* include the values of noweb expansions in code block
> hash calculations, I think this is a bug which should be fixed.
+1
> To echo Achim's response, you've accidentally uttered Org-mode heresy.
Oh no. The good news is that thanks to your and Achim’s explanation, I
think I now understand this principle better.
>>> Oh yes, there's a whole set of _other_ problems that are waiting to be
>>> solved. :-)
>>
>> There always is. :-)
>
> I think Org-mode already provides the bulk of what is desired. If we
> agree to treat ":cache yes :results none" as obviously taking place for
> side effects, and then sticking a hash behind the :cache header argument
> with the code block, then what functionality would be missing?
This was more of a joke on my part: life gets boring when you run out of
problems to work on. In this specific case, though:
1) a :depends header argument
2) including the session PID in results hashes by default (because it is
the only sensible thing to do)
On 13 March 2013, Eric Schulte wrote:
> Well, I suppose one man's dirty kludge is another's beautiful hack. The
> question here is whether the complexity lies in the implementation (and
> thus the interface) or in the code block itself. While I generally
> prefer the latter, in this case of ":results none :cache yes" I would be
> open to placing some custom logic in the backend, which stores the hash
> value with the code block, possibly changing
>
> #+begin_src R :cache yes
> # code to perform side effect
> #+end_src
>
> to
>
> #+begin_src R :cache 9f4e5b4b07e93c680ab37fc4ba1f75e1bfc0ee0a
> # code to perform side effect
> #+end_src
>
> keeping in mind that the actual hash value should be hidden after the
> first couple of characters.
If you like this solution, may I try once more to convince you of the
empty #+RESULTS solution I originally proposed? I looked at the code
for inserting/hiding/finding hash values, and it looks like it would be
complicated to change. Empty #+RESULTS is easy, although perhaps less
conceptually pure.
If you want the cache in the header, I think I can try to work on a
patch, but it does look tricky. So I am not sure I will have time to
work on it until next week. (If anyone else wants to beat me to the
punch, please feel free!)
One question: should we have the cache in the header only for :results
none blocks, or for all blocks?
> I was actually very proud of this solution. It is what would be done by
> the framework if we did implement custom support, but by doing it with
> code blocks the exact mechanics are visible to the user.
Agreed. But if it is the only “right” thing to do, or one of a very
small set of “right” things, I think it’s a win in terms of
conciseness/ease of use to do it automatically. And I think this is the
case here: the presence of :session yes is a clear signal that there is
out-of-band (from org’s perspective) communication happening between
code blocks, and that the invariance of a result can’t be relied on in a
different session process. So when the session PID changes, the hash
value should change as well, to trigger reevaluation.
> How should session startup be determined if not through inclusion of the
> session PID in the code block hash? Perhaps the above could be made
> more elegant through the addition of an elisp function which returns the
> pid of the current R session, allowing the above to be truncated to
> something like the following.
>
> #+begin_src R :cache yes :session foo :var pid=(R-pid "foo")
> # code to perform side effect
> x <- 'side effect'
> 'done'
> #+end_src
>
> I don't suppose ESS provides such a function?
You can get the value with
(process-id (get-process ess-current-process-name)), which you have to
evaluate in the current session buffer (the one that C-c C-v C-z takes
you to). Generally speaking, I guess each ob-foo should provide a
function to retrieve this value, since it will be different for
different languages.
--
Aaron Ecay
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-19 4:49 ` Aaron Ecay
@ 2013-03-23 22:34 ` Eric Schulte
2013-04-01 5:10 ` Aaron Ecay
0 siblings, 1 reply; 27+ messages in thread
From: Eric Schulte @ 2013-03-23 22:34 UTC (permalink / raw)
To: emacs-orgmode
Aaron Ecay <aaronecay@gmail.com> writes:
> Hi Eric,
>
> I’m jointly replying to 2 of your emails.
>
> On 13 March 2013, Eric Schulte wrote:
>> This is what is already taking place. The :var header arguments are
>> automatically expanded into dependencies between code blocks, and the
>> results of previous code blocks are included in the hash calculation of
>> the current code block.
>
> Wow, I did not realize that the :var handling was so sophisticated.
> Would it be possible to introduce a :depends code-block-name header
> argument, which recycles the same dependency calculation but does not
> bind a variable in the code block? Many of the variables that I rely on
> between blocks are large data frames, and I worry that dumping them into
> the org buffer and then reloading them into R[fn:1] will result in a
> slowdown and/or loss of some structure in the data.
>
Unless you actually try :var and find it lacking in some way, I'd prefer
to stick with simply using :var to identify dependencies between code
blocks. We've seen in other places how providing multiple aliases for
header arguments increases rather than reduces confusion.
>
> [fn:1] My understanding of the :var-handling code is that this is how it
> acquires the values to assign to the variables, as opposed to re-using a
> variable that is present in a session. But the code is complex, so
> maybe I’m wrong (again).
>
> I also think this would make the feature more discoverable: a :var is
> just a sub-type of :depends, with extra functionality. Coming from a
> Sweave/knitr background, I expected something like :depends, and thus
> didn’t notice :var.
>
Maybe the documentation of :var should be improved to enhance
discoverability. I would be happy to apply a patch to this effect.
>
>>
>> From re-looking at Achim's previous noweb example, it seems that we
>> currently do *not* include the values of noweb expansions in code block
>> hash calculations, I think this is a bug which should be fixed.
>
> +1
>
>> To echo Achim's response, you've accidentally uttered Org-mode heresy.
>
> Oh no. The good news is that thanks to your and Achim’s explanation, I
> think I now understand this principle better.
>
>>>> Oh yes, there's a whole set of _other_ problems that are waiting to be
>>>> solved. :-)
>>>
>>> There always is. :-)
>>
>> I think Org-mode already provides the bulk of what is desired. If we
>> agree to treat ":cache yes :results none" as obviously taking place for
>> side effects, and then sticking a hash behind the :cache header argument
>> with the code block, then what functionality would be missing?
>
> This was more of a joke on my part: life gets boring when you run out of
> problems to work on. In this specific case, though:
> 1) a :depends header argument
> 2) including the session PID in results hashes by default (because it is
> the only sensible thing to do)
>
I personally think automatically invalidating all hashes simply because
a new session has been started is surprising and counter-intuitive.
There is a "library of Babel", in which common code blocks (such as one
returning session ID for hashing) may be placed so that they can be
easily re-used across files and users.
>
> On 13 March 2013, Eric Schulte wrote:
>> Well, I suppose one man's dirty kludge is another's beautiful hack. The
>> question here is whether the complexity lies in the implementation (and
>> thus the interface) or in the code block itself. While I generally
>> prefer the latter, in this case of ":results none :cache yes" I would be
>> open to placing some custom logic in the backend, which stores the hash
>> value with the code block, possibly changing
>>
>> #+begin_src R :cache yes
>> # code to perform side effect
>> #+end_src
>>
>> to
>>
>> #+begin_src R :cache 9f4e5b4b07e93c680ab37fc4ba1f75e1bfc0ee0a
>> # code to perform side effect
>> #+end_src
>>
>> keeping in mind that the actual hash value should be hidden after the
>> first couple of characters.
>
> If you like this solution, may I try once more to convince you of the
> empty #+RESULTS solution I originally proposed? I looked at the code
> for inserting/hiding/finding hash values, and it looks like it would be
> complicated to change. Empty #+RESULTS is easy, although perhaps less
> conceptually pure.
>
Why not just return a dummy value at the end of the code block?
#+begin_src R :cache yes
# code to perform side effect
"done"
#+end_src
>
> If you want the cache in the header, I think I can try to work on a
> patch, but it does look tricky. So I am not sure I will have time to
> work on it until next week. (If anyone else wants to beat me to the
> punch, please feel free!)
>
> One question: should we have the cache in the header only for :results
> none blocks, or for all blocks?
>
I'm just as happy raising an error or warning when the :cache and
":results none" options are found together, and doing no caching in that
case. Users can always just return a dummy value and remove ":results
none".
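A sketch of what that check could look like, written in Python for illustration. `effective_cache` is a hypothetical name and the wording of the warning is only a placeholder.

```python
import warnings

def effective_cache(params):
    """Return whether caching applies; warn and disable it for :results none."""
    cache = params.get(":cache") == "yes"
    silent = "none" in params.get(":results", "").split()
    if cache and silent:
        # There is no result to hang the hash on, so caching is switched off.
        warnings.warn(":cache has no effect with :results none; "
                      "return a dummy value and drop :results none instead")
        return False
    return cache
```

Raising an error instead of warning would be a one-line change, replacing the `warnings.warn` call with a `raise`.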
>
>> I was actually very proud of this solution. It is what would be done by
>> the framework if we did implement custom support, but by doing it with
>> code blocks the exact mechanics are visible to the user.
>
> Agreed. But if it is the only “right” thing to do, or one of a very
> small set of “right” things, I think it’s a win in terms of
> conciseness/ease of use to do it automatically. And I think this is the
> case here: the presence of :session yes is a clear signal that there is
> out-of-band (from org’s perspective) communication happening between
> code blocks, and that the invariance of a result can’t be relied on in a
> different session process. So when the session PID changes, the hash
> value should change as well, to trigger reevaluation.
>
>> How should session startup be determined if not through inclusion of the
>> session PID in the code block hash? Perhaps the above could be made
>> more elegant through the addition of an elisp function which returns the
>> pid of the current R session, allowing the above to be truncated to
>> something like the following.
>>
>> #+begin_src R :cache yes :session foo :var pid=(R-pid "foo")
>> # code to perform side effect
>> x <- 'side effect'
>> 'done'
>> #+end_src
>>
>> I don't suppose ESS provides such a function?
>
> You can get the value with
> (process-id (get-process ess-current-process-name)), which you have to
> evaluate in the current session buffer (the one that C-c C-v C-z takes
> you to). Generally speaking, I guess each ob-foo should provide a
> function to retrieve this value, since it will be different for
> different languages.
It sounds like such an (R-pid "foo") function would be easy to
implement. I'd vote for that solution (implementing this function in
your .emacs, and then sharing it if necessary) for now. If this need to
associate PIDs with results becomes more widespread (in a couple of
years of Org-mode code blocks this is the first time I've seen it
arise), then a built-in solution becomes more appealing.
Hope this is helpful,
--
Eric Schulte
http://cs.unm.edu/~eschulte
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-03-23 22:34 ` Eric Schulte
@ 2013-04-01 5:10 ` Aaron Ecay
2013-04-02 22:14 ` Eric Schulte
0 siblings, 1 reply; 27+ messages in thread
From: Aaron Ecay @ 2013-04-01 5:10 UTC (permalink / raw)
To: Eric Schulte; +Cc: emacs-orgmode
Hi Eric,
On 23 March 2013, Eric Schulte wrote:
> Unless you actually try :var and find it lacking in some way, I'd prefer
> to stick with simply using :var to identify dependencies between code
> blocks. We've seen in other places how providing multiple aliases for
> header arguments increases rather than reduces confusion.
I’m uneasy with how magic :var is, in the sense that it does a lot of
heavy lifting with interconversion to/from org syntax, table formats,
etc. What if a special convention was introduced, whereby
:var _=whatever would not result in any variable binding being introduced
into the code block (but would behave the same wrt. dependencies)? This
is similar to the syntax for discarding unused values in some
programming languages (python comes to mind):
#+begin_src python
_, foo, _ = iOnlyCareAboutTheSecondValue()
#+end_src
So, this would look like:
#+begin_src R :var a=123 :var _=(R-pid) :var _=(something-else)
# code which can access a, but has no access to (R-pid) or (something-else)
#+end_src
If this doesn’t resonate with you, I’ll just drop this suggestion. I
will of course certainly report any problems I have using :var in
practice as well, with patches to fix them insofar as it is within my
ability to provide them.
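To spell out the intended semantics of the `_` convention, here is a rough sketch in Python. The functions `split_vars` and `block_hash` are invented for illustration and say nothing about how ob-core.el actually processes :var arguments.

```python
import hashlib

def split_vars(var_args):
    """Split (name, value) pairs into bound variables and hash inputs.

    Variables named "_" feed the hash but are never bound in the block.
    """
    bindings = {name: value for name, value in var_args if name != "_"}
    hash_inputs = [f"{name}={value}" for name, value in var_args]
    return bindings, hash_inputs

def block_hash(body, var_args):
    """Hash the body together with every :var value, including the "_" ones."""
    _, hash_inputs = split_vars(var_args)
    text = "\n".join([body] + hash_inputs)
    return hashlib.sha1(text.encode("utf-8")).hexdigest()
```

So a change in the value of (R-pid) would invalidate the cached result, yet the code block itself would only ever see `a`.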
> Maybe the documentation of :var should be improved to enhance
> discoverability. I would be happy to apply a patch to this effect.
Patch is on the way.
> Why not just return a dummy value at the end of the code block?
>
> #+begin_src R :cache yes
> # code to perform side effect
> "done"
> #+end_src
This would require the user to add this dummy result redundantly to many
code blocks, for no reason. That is cognitively burdensome (user must
remember when to add it) and ugly, if the source code is to be exported
in the document (or tangled).
But this case is straightforward to detect on org’s end, and fairly
straightforward to work around (this is in fact what my original patch
was). So I am still not sure why this burden should be imposed.
>>> Well, I suppose one man's dirty kludge is another's beautiful hack. The
>>> question here is whether the complexity lies in the implementation (and
>>> thus the interface) or in the code block itself. While I generally
>>> prefer the latter, in this case of ":results none :cache yes" I would be
>>> open to placing some custom logic in the backend, which stores the hash
>>> value with the code block, possibly changing
>>>
>>> #+begin_src R :cache yes
>>> # code to perform side effect
>>> #+end_src
>>>
>>> to
>>>
>>> #+begin_src R :cache 9f4e5b4b07e93c680ab37fc4ba1f75e1bfc0ee0a
>>> # code to perform side effect
>>> #+end_src
>>>
>>> keeping in mind that the actual hash value should be hidden after the
>>> first couple of characters.
[...]
>
>>
>> If you want the cache in the header, I think I can try to work on a
>> patch, but it does look tricky. So I am not sure I will have time to
>> work on it until next week. (If anyone else wants to beat me to the
>> punch, please feel free!)
>>
>> One question: should we have the cache in the header only for :results
>> none blocks, or for all blocks?
>>
>
> I'm just as happy raising an error or warning when the :cache and
> ":results none" options are found together, and doing no caching in that
> case. Users can always just return a dummy value and remove ":results
> none".
So should I not work on this modified version of my original patch? I
am genuinely trying to be helpful, so that my own modest contribution
can make even more useful what is already a very useful tool thanks to
the efforts of many people, including you. Maybe I am barking up the
wrong tree. I am certainly sorry if you are upset by something I have
said – such was never my intention.
> It sounds like such an (R-pid "foo") function would be easy to
> implement. I'd vote for that solution (implementing this function in
> your .emacs, and then sharing it if necessary) for now. If this need to
> associate PIDs with results becomes more widespread (in a couple of
> years of Org-mode code blocks this is the first time I've seen it
> arise), then a built-in solution becomes more appealing.
This part of the solution I have implemented:
#+name: R-pid
#+BEGIN_SRC emacs-lisp
(let* ((info (org-babel-get-src-block-info 'light))
       (session-name (cdr (assoc :session (nth 2 info)))))
  (if (and session-name
           (not (equal session-name "none")))
      (progn
        (org-babel-R-initiate-session session-name (nth 2 info))
        (with-current-buffer (get-buffer session-name)
          (process-id (get-process ess-current-process-name))))
    "none"))
#+END_SRC
And in my init file:
#+BEGIN_SRC emacs-lisp
(setq org-babel-default-header-args:R '((:var . "R.pid=R-pid")))
#+END_SRC
I’d prefer to use the proposed _ for the var here, but otherwise this
seems to work.
Thanks,
--
Aaron Ecay
* Re: [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results
2013-04-01 5:10 ` Aaron Ecay
@ 2013-04-02 22:14 ` Eric Schulte
0 siblings, 0 replies; 27+ messages in thread
From: Eric Schulte @ 2013-04-02 22:14 UTC (permalink / raw)
To: emacs-orgmode
Aaron Ecay <aaronecay@gmail.com> writes:
> Hi Eric,
>
> On 23 March 2013, Eric Schulte wrote:
>
>> Unless you actually try :var and find it lacking in some way, I'd prefer
>> to stick with simply using :var to identify dependencies between code
>> blocks. We've seen in other places how providing multiple aliases for
>> header arguments increases rather than reduces confusion.
>
> I’m uneasy with how magic :var is, in the sense that it does a lot of
> heavy lifting with interconversion to/from org syntax, table formats,
> etc. What if a special convention was introduced, whereby
> :var _=whatever would not result in any variable binding being introduced
> into the code block (but would behave the same wrt. dependencies)? This
> is similar to the syntax for discarding unused values in some
> programming languages (Python comes to mind):
>
I think the following is the simplest and clearest solution in these
cases (certainly more straightforward than the syntax below).
#+begin_src R
x<-"make something big"
"done"
#+end_src
#+RESULTS:
: done
>
> #+begin_src python
> _, foo, _ = iOnlyCareAboutTheSecondValue()
> #+end_src
>
> So, this would look like:
> #+begin_src R :var a=123 :var _=(R-pid) :var _=(something-else)
> # code which can access a, but has no access to (R-pid) or (something-else)
> #+end_src
>
> If this doesn’t resonate with you, I’ll just drop this suggestion.
To me this sounds like a solution in search of a problem. If you
actually run into a problem in real life then we can consider if an
Org-mode solution is necessary.
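For anyone reading along, the proposed convention amounts to separating the :var entries named "_" from the ones that become real bindings. A rough sketch of that split, assuming the specs arrive as plain "name=value" strings; `my/split-var-specs' is a hypothetical helper written for this message and is not part of Org:

```emacs-lisp
;; Sketch only: `my/split-var-specs' is a hypothetical helper, not part of Org.
(defun my/split-var-specs (specs)
  "Split SPECS, a list of \"name=value\" strings, on the name \"_\".
Return (BOUND . DEPENDENCY-ONLY): BOUND holds the specs that would become
variables in the block, DEPENDENCY-ONLY those named \"_\"."
  (let (bound dependency-only)
    (dolist (spec specs)
      (if (string-prefix-p "_=" spec)
          (push spec dependency-only)
        (push spec bound)))
    ;; `push' builds the lists in reverse, so restore the original order.
    (cons (nreverse bound) (nreverse dependency-only))))
```

With the header line from the example above, the first element would hold only "a=123", and the two "_=" entries would be kept solely for dependency tracking.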
> I will of course certainly report any problems I have using :var in
> practice as well, with patches to fix them insofar as it is within my
> ability to provide them.
>
Great, thanks.
>
>> Maybe the documentation of :var should be improved to enhance
>> discoverability. I would be happy to apply a patch to this effect.
>
> Patch is on the way.
>
>> Why not just return a dummy value at the end of the code block?
>>
>> #+begin_src R :cache yes
>> # code to perform side effect
>> "done"
>> #+end_src
>
> This would require the user to add this dummy result redundantly to many
> code blocks, for no reason. That is cognitively burdensome (user must
> remember when to add it) and ugly, if the source code is to be exported
> in the document (or tangled).
>
> But this case is straightforward to detect on org’s end, and fairly
> straightforward to work around (this is in fact what my original patch
> was). So I am still not sure why this burden should be imposed.
>
Again, I think you're anticipating problems which don't crop up in
actuality (e.g., in the many years of Org-mode code block usage by me
and many others). Please just get to using Org-mode code blocks to do
something, and then much more attention will be paid to *experienced*
rather than *anticipated* problems.
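As background for the :cache discussion, the general idea is that a block is skipped when the hash of its body and header arguments matches the hash recorded after the previous run, which is why the hash needs somewhere to live. The following is a minimal sketch of that idea only; it is not how ob-core.el implements caching, it keeps hashes in a table rather than in the buffer, and every `my/' name is made up for illustration:

```emacs-lisp
;; Sketch only: NOT Org's implementation.  All `my/' names are hypothetical.
(defvar my/block-hashes (make-hash-table :test 'equal)
  "Map from block name to the hash recorded at its last execution.")

(defun my/block-hash (body params)
  "Return a SHA-1 hex digest of BODY and the header-argument alist PARAMS."
  (secure-hash 'sha1 (format "%S" (list body params))))

(defun my/run-cached (name body params run-fn)
  "Call RUN-FN on BODY unless block NAME was already run with the same hash.
Return `ran' when RUN-FN was called and `cached' when it was skipped."
  (let ((new-hash (my/block-hash body params)))
    (if (equal new-hash (gethash name my/block-hashes))
        'cached
      (funcall run-fn body)
      ;; Record the hash only after the computation has finished.
      (puthash name new-hash my/block-hashes)
      'ran)))
```

Changing either the body or a header argument changes the digest, so the block runs again; an unchanged block is skipped on the second call.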
>
>>>> Well, I suppose one man's dirty kludge is another's beautiful hack. The
>>>> question here is whether the complexity lies in the implementation (and
>>>> thus the interface) or in the code block itself. While I generally
>>>> prefer the latter, in this case of ":results none :cache yes" I would be
>>>> open to placing some custom logic in the backend, which stores the hash
>>>> value with the code block, possibly changing
>>>>
>>>> #+begin_src R :cache yes
>>>> # code to perform side effect
>>>> #+end_src
>>>>
>>>> to
>>>>
>>>> #+begin_src R :cache 9f4e5b4b07e93c680ab37fc4ba1f75e1bfc0ee0a
>>>> # code to perform side effect
>>>> #+end_src
>>>>
>>>> keeping in mind that the actual hash value should be hidden after the
>>>> first couple of characters.
>
> [...]
>
>>
>>>
>>> If you want the cache in the header, I think I can try to work on a
>>> patch, but it does look tricky. So I am not sure I will have time to
>>> work on it until next week. (If anyone else wants to beat me to the
>>> punch, please feel free!)
>>>
>>> One question: should we have the cache in the header only for :results
>>> none blocks, or for all blocks?
>>>
>>
>> I'm just as happy raising an error or warning when the :cache and
>> ":results none" options are found together, and doing no caching in that
>> case. Users can always just return a dummy value and remove ":results
>> none".
>
> So should I not work on this modified version of my original patch? I
> am genuinely trying to be helpful, so that my own modest contribution
> can make even more useful what is already a very useful tool thanks to
> the efforts of many people, including you. Maybe I am barking up the
> wrong tree.
Correct, let's not work on implementing this cache-in-the-header idea.
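The alternative mentioned in the quoted text, warning when :cache and ":results none" are found together, could be checked roughly as follows. This is a sketch under assumptions: it takes the header arguments as an alist with string values such as ((:cache . "yes") (:results . "none")), and both function names are hypothetical rather than anything in ob-core.el:

```emacs-lisp
;; Sketch only: both functions are hypothetical and not part of Org.
(defun my/cache-with-results-none-p (params)
  "Return non-nil if PARAMS asks for caching while discarding results.
PARAMS is an alist of header arguments such as
\((:cache . \"yes\") (:results . \"none\"))."
  (let ((cache (cdr (assq :cache params)))
        (results (cdr (assq :results params))))
    (and (stringp cache)
         (string= cache "yes")
         (stringp results)
         ;; :results may carry several words, e.g. "output none".
         (member "none" (split-string results))
         t)))

(defun my/warn-on-cache-with-results-none (params)
  "Emit a message when PARAMS combines :cache yes with :results none."
  (when (my/cache-with-results-none-p params)
    (message "Ignoring :cache: there is no result to attach the hash to")))
```

A caller would run the second function before executing the block and simply skip caching when it fires.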
> I am certainly sorry if you are upset by something I have said – such
> was never my intention.
>
You misread my tone; I'm not upset.
>
>> It sounds like such an (R-pid "foo") function would be easy to
>> implement. I'd vote for that solution (implementing this function in
>> your .emacs, and then sharing it if necessary) for now. If this need to
>> associate PIDs with results becomes more widespread (in a couple of
>> years of Org-mode code blocks this is the first time I've seen it
>> arise), then a built-in solution becomes more appealing.
>
> This part of the solution I have implemented:
>
> #+name: R-pid
> #+BEGIN_SRC emacs-lisp
> (let* ((info (org-babel-get-src-block-info 'light))
>        (session-name (cdr (assoc :session (nth 2 info)))))
>   (if (and session-name
>            (not (equal session-name "none")))
>       (progn
>         (org-babel-R-initiate-session session-name (nth 2 info))
>         (with-current-buffer (get-buffer session-name)
>           (process-id (get-process ess-current-process-name))))
>     "none"))
> #+END_SRC
>
> And in my init file:
>
> #+BEGIN_SRC emacs-lisp
> (setq org-babel-default-header-args:R '((:var . "R.pid=R-pid")))
> #+END_SRC
>
Sounds great.
>
> I’d prefer to use the proposed _ for the var here, but otherwise this
> seems to work.
>
> Thanks,
--
Eric Schulte
http://cs.unm.edu/~eschulte
Thread overview: 27+ messages
2013-03-06 4:07 [PATCH] * lisp/ob-core.el (org-babel-execute-src-block): insert hash for silent results Aaron Ecay
2013-03-08 21:25 ` Aaron Ecay
2013-03-08 22:07 ` Eric Schulte
2013-03-08 21:53 ` Achim Gratz
2013-03-08 22:09 ` Eric Schulte
2013-03-08 22:24 ` aaronecay
2013-03-09 17:45 ` Eric Schulte
2013-03-09 18:56 ` Aaron Ecay
2013-03-09 20:03 ` Achim Gratz
2013-03-09 0:57 ` Achim Gratz
2013-03-09 18:35 ` Eric Schulte
2013-03-09 19:22 ` Aaron Ecay
2013-03-09 20:26 ` Eric Schulte
2013-03-13 3:55 ` Aaron Ecay
2013-03-13 14:45 ` Eric Schulte
2013-03-19 4:49 ` Aaron Ecay
2013-03-23 22:34 ` Eric Schulte
2013-04-01 5:10 ` Aaron Ecay
2013-04-02 22:14 ` Eric Schulte
2013-03-10 8:52 ` Achim Gratz
2013-03-10 20:14 ` Sebastien Vauban
2013-03-10 21:06 ` Achim Gratz
2013-03-13 4:12 ` Aaron Ecay
2013-03-13 7:50 ` Achim Gratz
2013-03-13 14:42 ` Eric Schulte
2013-03-13 18:25 ` Achim Gratz
2013-03-14 19:52 ` Eric Schulte
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs/org-mode.git