From: Eric Schulte
Subject: Re: org-mode and python pandas
Date: Sun, 30 Jun 2013 17:15:11 -0600
Message-ID: <87bo6nkv0e.fsf@gmail.com>
To: Dov Grobgeld
Cc: emacs-orgmode

Dov Grobgeld writes:

> Has anyone used org-mode with the python pandas package? Pandas is in
> a certain way an alternative to R, but with the (for me) familiar
> syntax of python. See: http://pandas.pydata.org/
>
> Pandas is very much built to be used interactively, and it outputs its
> data in space separated tabular format. E.g.
> in ipython:
>
>   In [1]: import pandas as pd
>   In [2]: import numpy as np
>
>   In [3]: pd.DataFrame(np.random.random((4,3)), columns=['A','B','C'])
>   Out[3]:
>             A         B         C
>   0  0.628365  0.424279  0.619791
>   1  0.799666  0.527572  0.132928
>   2  0.837255  0.138906  0.408233
>   3  0.388080  0.146212  0.575346
>
> Unfortunately this doesn't output as nicely when used from org-mode:
>
>   #+BEGIN_SRC python
>     import pandas as pd
>     import numpy as np
>
>     return pd.DataFrame(np.random.random((4,3)), columns=list('ABC'))
>   #+END_SRC
>
>   #+RESULTS:
>   :           A         B         C
>   : 0  0.827817  0.664009  0.089161
>   : 1  0.170031  0.729214  0.110918
>   : 2  0.575918  0.863924  0.757536
>   : 3  0.682722  0.774445  0.992041
>
> while I would like to have:
>
>   |   |        A |        B |        C |
>   |---+----------+----------+----------|
>   | 0 | 0.827817 | 0.664009 | 0.089161 |
>   | 1 | 0.170031 | 0.729214 | 0.110918 |
>   | 2 | 0.575918 | 0.863924 | 0.757536 |
>   | 3 | 0.682722 | 0.774445 | 0.992041 |
>

What happens if you add ":results table" to your code block? Would that
be sufficient?

>
> The question is how to get this? Here are a few ideas:
>
> 1. Write a general filter in the org-mode elisp that uses heuristics
>    to recognize ascii aligned tables and change these to org-tables.

The default should be to convert multi-line output to tables. The
":results table" option above will force this conversion in case it is
not currently taking place due to the default header arguments in use.

>
> 2. Add to pandas the option of globally influencing the text
>    formatting so that it outputs something more parsable by org-mode.

This sounds promising, if pandas supports CSV output that will be
correctly parsed by Org-mode.

>
> 3. Create a special language "pandas" that recognizes the ascii aligned
>    tables and saves the need to import pandas and np?
> 4. And the obvious approach of writing a python function that writes an
>    org-mode parsable table and always calling it as part of the return.
>
> Which is the preferable approach? Any other ideas?
>
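For option 4, a minimal sketch of such a function might look like the
following. The name to_org_table and its float_fmt parameter are made up
for illustration, and plain tuples stand in for the DataFrame so the
snippet runs without pandas installed. With pandas, the header and rows
could presumably be supplied as [""] + list(df.columns) and
df.itertuples(), since itertuples() yields the index followed by the row
values.

```python
def to_org_table(header, rows, float_fmt="{:.6f}"):
    """Format a header and rows as Org-mode table text (hypothetical helper)."""
    def cell(value):
        # Floats get a fixed precision; everything else goes through str().
        if isinstance(value, float):
            return float_fmt.format(value)
        return str(value)

    # "|-" is a short horizontal rule; Org expands it when the table is realigned.
    lines = ["| " + " | ".join(cell(h) for h in header) + " |", "|-"]
    for row in rows:
        lines.append("| " + " | ".join(cell(v) for v in row) + " |")
    return "\n".join(lines)


# Plain-Python stand-in for a DataFrame, using values from the example above.
header = ["", "A", "B", "C"]
rows = [
    (0, 0.827817, 0.664009, 0.089161),
    (1, 0.170031, 0.729214, 0.110918),
]
print(to_org_table(header, rows))
```

In a code block this would probably need something like ":results output
raw" so that the printed text is inserted as an Org table rather than as
verbatim output; I have not checked that against the default header
arguments.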
I think a header-argument-based approach would be ideal. I'd look at the
value of org-babel-default-header-args:python, and read the portion of
the manual related to the "results" header arguments.

I don't understand multi-line strings in python, but I get the following
behavior from simple shell script blocks.

#+begin_src sh
cat <

> Regards,
> Dov

-- 
Eric Schulte
http://cs.unm.edu/~eschulte