From: Rainer M Krug
Subject: Re: Emacs/ESS/org freezes/hangs on big data/ RAM(~256GB) processes when run in org/babel
Date: Sat, 20 Jun 2015 17:05:56 +0200
In-Reply-To: (Andreas Leha's message of "Fri, 19 Jun 2015 23:31:06 +0100")
List-Id: "General discussions about Org-mode."
To: Andreas Leha
Cc: emacs-orgmode@gnu.org

Andreas Leha writes:

> Hi Rainer,

Hi Andreas,

>
> Rainer M Krug writes:
>> "Charles C. Berry" writes:
>>
>>> On Wed, 17 Jun 2015, William Denton wrote:
>>>
>>>> On 17 June 2015, Xebar Saram wrote:
>>>>
>>>>> I do a lot of modeling work that involves using huge datasets and run
>>>>> process intensive R processes (such as complex mixed models, Gamms etc).
>>>>> in R studio all works well yet when i use the orgmode eval on R code
>>>>> blocks it works well for small simple process but 90% of the time when
>>>>> dealing with complex models and big data (up to 256GB) it will just
>>>>> freeze emacs/ess. sometimes i can C-c or C-g it and other times i need
>>>>> to physically kill emacs.
>>>>
>>>> I've been having the same problem for a while, but wasn't able to
>>>> isolate it any more than large data sets, lack of memory, and heavy
>>>> CPU usage. Sometimes everything hangs and I need to power cycle the
>>>> computer. :(
>>>>
>>>
>>> And you (both) have `ess-eval-visibly' set to nil, right?
>>>
>>> I do statistical genomics, which can be compute intensive. Sometimes
>>> processes need to run for a while, and I get impatient having to wait.
>>>
>>> I wrote (and use) ox-ravel[1] to speed up my write-run-revise cycle in
>>> org-mode.
>>>
>>> Basically, ravel will export Org mode to a format that knitr (and the
>>> like) can run - turning src blocks into `code chunks'. That allows me
>>> to set the cache=TRUE chunk option, etc. I run knitr on the exported
>>> document to initialize objects for long running computations or to
>>> produce a finished report.
>>>
>>> When I start a session, I run knitr in the R session, then all the
>>> cached objects are loaded in and ready to use.
>>>
>>> If I write a src block I know will take a long time to export, I
>>> export from org mode to update the knitr document and re-knit it to
>>> refresh the cache.
>>
>> I have a similar workflow, only that I use a package like
>> approach, i.e. I tangle function definitions in a folder ./R, data into
>> ./data (which makes it possible to share org defined variables with R
>> running outside org) and scripts, i.e. the things which do an analysis,
>> import data, ... i.e. which might take long, into a folder ./scripts/. I
>> then add the usual R package infrastructure files (DESCRIPTION,
>> NAMESPACE, ...).
>>
>> Then I have one file tangled into ./scripts/init.R:
>>
>> #+begin_src R :tangle ./scripts/init.R
>> library(devtools)
>> load_all()
>> #+end_src
>>
>>
>> and one for the analysis:
>>
>> #+begin_src R :tangle ./scripts/myAnalysis.R
>> ## Do some really time intensive and horribly complicated and important
>> ## stuff here
>> save(
>>     fileNames,
>>     bw,
>>     cols,
>>     labels,
>>     fit,
>>     dens,
>>     gof,
>>     gofPerProf,
>>     file = "./cache/results.myAnalysis.rds"
>> )
>> #+end_src
>>
>>
>> Now after tangling, I have my code easily available in a new R session:
>>
>> 1) start R in the directory in which the DESCRIPTION file is,
>> 2) run source("./scripts/init.R")
>>
>> and I have all my functions and data available.
>>
>> To run an analysis, I do
>>
>> 3) source("./scripts/myAnalysis.R")
>>
>> and the results are saved in a file fn
>>
>> To analyse the data further, I can then simply use
>>
>> #+begin_src R :tangle ./scripts/myAnalysis.R
>> fitSing <- attach("./cache/results.myAnalysis.rds")
>> #+end_src
>>
>>
>> so they won't interfere with my environment in R.
>>
>> I can finally remove the attached environment by doing
>>
>> #+begin_src R :tangle ./scripts/myAnalysis.R
>> detach(
>>     name = attr(fitSing, "name"),
>>     character.only = TRUE
>> )
>> #+end_src
>>
>> Through this caching and compartmentalizing, I can easily do some
>> things outside org and some inside, and easily combine all the data.
>>
>> Further advantage: I can actually create the package and send it to
>> somebody for testing and review and it should run out of the box, as in
>> the DESCRIPTION file all dependencies are defined.
>>
>> I am using this approach at the moment for a paper and which will also
>> result in a paper. By executing all the scripts, one will be able to
>> import the raw data, do the analysis and create all graphs used in the
>> paper.
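
For readers who want to try the save/attach/detach cycle quoted above in
isolation, here is a minimal sketch. It is not the workflow's actual code:
the object names (demoFit, demoDens), the cache file name and the use of
tempdir() are assumptions made only for illustration. Note that attach()
reads files written by save() whatever their extension; ".rds" is more
commonly associated with saveRDS()/readRDS(), so the sketch uses ".RData".

```r
## Hypothetical stand-ins for the results of a long-running analysis.
demoFit  <- lm(dist ~ speed, data = cars)
demoDens <- density(cars$speed)

## Assumed cache location; a temporary directory keeps the sketch side-effect free.
cacheFile <- file.path(tempdir(), "results.demo.RData")
save(demoFit, demoDens, file = cacheFile)

## Remove the originals so that only the cached copies remain.
rm(demoFit, demoDens)

## attach() loads the saved objects into a new environment on the search
## path and returns that environment invisibly, with a "name" attribute.
cached  <- attach(cacheFile)
envName <- attr(cached, "name")

## The objects are reachable through the search path, not the workspace.
stopifnot(exists("demoFit"), exists("demoDens"))
print(ls(cached))

## Detach by name once the cached results are no longer needed.
detach(envName, character.only = TRUE)
stopifnot(!exists("demoFit"))

unlink(cacheFile)
```

Because the cached objects live in their own attached environment, they
do not overwrite anything in the global workspace, and detaching removes
them again in one step.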
>>
>> Hope this gives you another idea how one can handle long running
>> analysis in R in org,
>>
>> Cheers,
>>
>> Rainer
>>
>
> That is a cool workflow. I especially like the fact that you end up
> with an R package.

Thanks. Yes - the idea of having a package at the end was one main
reason why I am using this approach.

>
> So, I'll try again. Is there any chance to see a working
> example of this? I'd love to see that.

Let's say I am working on it. I am working on a project which is using
this workflow and when it is finished, the package will be available as
an electronic appendix to the paper. But I will see if I can condense an
example and blog it - I'll let you know when it is done.

Cheers,

Rainer

>
> Thanks,
> Andreas
>
>

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa

Tel :       +33 - (0)9 53 10 27 44
Cell:       +33 - (0)6 85 62 59 98
Fax :       +33 - (0)9 58 10 27 44

Fax (D):    +49 - (0)3 21 21 25 22 44

email:      Rainer@krugs.de

Skype:      RMkrug

PGP: 0x0F52F982