From: John Hendy
Subject: Re: Best way of plotting duration (hms) data with R?
Date: Fri, 8 Mar 2013 22:36:00 -0600
References: <87ip52b6gt.fsf@slate.zedat.fu-berlin.de>
In-Reply-To: <87ip52b6gt.fsf@slate.zedat.fu-berlin.de>
List-Id: "General discussions about Org-mode."
To: Loris Bennett
Cc: emacs-orgmode@gnu.org

On Fri, Mar 8, 2013 at 3:12 AM, Loris Bennett wrote:
> Hi,
>
> I have some data representing durations in an Org file:
>
> | *Run* | *reference* | *test30*    | *test31*    | *test32*    |
> |-------+-------------+-------------+-------------+-------------|
> | Dur 2 | 00h 00' 32" | 00h 00' 44" | 00h 00' 39" | 00h 01' 05" |
> | Dur 3 | 00h 00' 31" | 00h 00' 41" | 00h 00' 45" | 00h 01' 13" |
> | Dur 4 | 00h 05' 46" | 00h 21' 54" | 00h 40' 10" | 00h 55' 02" |
> | Dur 5 | 00h 03' 51" | 00h 13' 37" | 00h 23' 07" | 00h 28' 54" |
> | Dur 7 | 00h 06' 49" | 00h 28' 48" | 00h 35' 23" | 01h 03' 17" |
> | Dur 8 | 00h 06' 47" | 00h 22' 54" | 00h 47' 42" | 01h 02' 08" |
>
> I would like to plot these using R and I'm not sure what approach to
> take.
>
> One way would be to do some Org/Calc magic to convert all the data to,
> say, seconds, and then pass the data to R. My problem with this is
> that I can't work out how to convert from HMS in Calc, let alone how
> this would look in Org.
>
> The other way would be to pass the data directly to R. However, dealing
> with durations in R has always seemed very clumsy to me, everything
> being much more geared towards date-times.

Do you need to use the format =HHh MM' SS"=? If not, I'd ditch the
apostrophe and quotation marks.

I think you should be able to do this in R. I have little experience,
but I did some date/time plotting in R a while back, which led to a
StackOverflow question because I, like you, found it a bit confusing!
[1]

In any case, here's what I did with your data:

- M-x replace-string RET ' RET m RET
- M-x replace-string RET " RET s RET
- M-x org-table-export RET "~/Desktop/table.csv" RET RET (the second
  RET accepts the default csv format)

This gave me times in the following format (using Unix date symbols;
see `man date` for the complete list):

"%Hh %Mm %Ss"

Then I melted your data and only afterwards converted it to POSIXct
(for whatever reason, if I converted first and then melted, the values
got screwed up). Finally, I plotted it just fine.

Here's the code; let me know what you think or if you have any
questions. I wasn't sure how you wanted to plot it, so this was a
guess:

#+begin_src R :session r :results silent
library(ggplot2)
library(reshape2)
library(scales)

# read the data as plain strings; change the names
data <- read.csv("~/Desktop/table.csv", stringsAsFactors = FALSE)
names(data) <- c("Run", "reference", "test30", "test31", "test32")

# melt the data and convert the time strings to POSIXct format;
# parsing in UTC keeps the values in line with date_format(), which
# renders labels in UTC by default in recent versions of scales
data_melted <- melt(data, id.vars = "Run")
data_melted$value <- as.POSIXct(data_melted$value,
                                format = "%Hh %Mm %Ss", tz = "UTC")

# Plot! You can change the `date_format()` string to whatever you prefer.
p <- ggplot(data_melted, aes(x = Run, y = value, colour = variable)) +
  geom_point()
p <- p + scale_y_datetime(labels = date_format("%Hh %Mm %Ss"))
p
#+end_src

If you want to play with the granularity, try something like this
(replacing the above plot command):

#+begin_src R
p <- ggplot(data_melted, aes(x = Run, y = value, colour = variable)) +
  geom_point()
p <- p + scale_y_datetime(breaks = date_breaks("10 min"),
                          minor_breaks = date_breaks("30 sec"),
                          labels = date_format("%Hh %Mm %Ss"))
p
#+end_src

See the =scales= package for the various arguments you can pass to the
date_breaks() command. [2]

Hope that helps a little. The syntax is tricky, but I think R can
probably handle whatever you'd like to toss at it. It just might take
some getting used to.
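
As a rough illustration of that flexibility, here is a minimal sketch
that I have not run against your actual file. It assumes that
strptime-style formats accept the apostrophe and quotation mark as
literal characters, which would make the replace-string steps
optional, and it shows one way to get plain seconds inside R if you
ever want them. The sample values and the hms_to_seconds() helper are
made up for the example; they are not from any package.

```r
# Hypothetical sample values in the original Org table format (HHh MM' SS")
raw <- c("00h 00' 32\"", "00h 05' 46\"", "01h 03' 17\"")

# The format string carries the ' and " as literals, so the data can stay
# as exported; tz = "UTC" avoids shifts from the local time zone
parsed <- as.POSIXct(raw, format = "%Hh %M' %S\"", tz = "UTC")

# Illustrative helper (an assumption, not a library function): extract the
# three numeric fields and weight them as hours, minutes and seconds
hms_to_seconds <- function(x) {
  fields <- regmatches(x, gregexpr("[0-9]+", x))
  vapply(fields, function(f) sum(as.numeric(f) * c(3600, 60, 1)), numeric(1))
}

secs <- hms_to_seconds(raw)
print(format(parsed, "%H:%M:%S"))
print(secs)
```

The seconds would go on an ordinary numeric axis, at the cost of the
date_format() labels used in the plots above.
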
In my opinion, it's easier to learn the syntax than to try to force Org
to do something odd via Calc, or to go through yet another middleman
(Org -> something -> R) to convert to something like seconds. Plus,
Hadley has added the stuff above (ggplot2's scale_y_datetime() and his
scales package) to give you really nice control over the format of the
axis labels, which I think you'd have trouble with if you converted to
pure seconds.

Let me know if you need more help! It's a learning exercise for me,
too :)

Best regards,
John

[1] The question:
    http://stackoverflow.com/questions/10770698/understanding-dates-and-plotting-a-histogram-with-ggplot2-in-r
    You'll note it's still not completely answered, as I never did
    figure out why the two answerers' solutions didn't produce the
    same result!

[2] Skip to the date_breaks() section:
    http://cran.r-project.org/web/packages/scales/scales.pdf

>
> I'd be grateful for any pointers on this.
>
> Cheers
>
> Loris
>
> --
> no sig is good sig