From: Jack Kamm <jackkamm@gmail.com>
To: emacs-orgmode@gnu.org
Cc: Ihor Radchenko <yantar92@posteo.net>, Liu Hui <liuhui1610@gmail.com>
Subject: [PATCH] ob-python results handling for dicts, dataframes, arrays, and plots
Date: Tue, 15 Aug 2023 16:46:59 -0700
Message-ID: <87a5ur6f7w.fsf@gmail.com>

Following up on a discussion from last month [1], I am reviving my
proposal from a couple years ago [2] to improve ob-python results
handling. Since it's a relatively large change, I am sending it to the
list for review before applying the patch.

The patch changes how ob-python handles the following types of
results:

- Dictionaries
- Numpy arrays
- Pandas dataframes and series
- Matplotlib figures

Starting with dicts: these are no longer mangled into a one-row
table. The current behavior (before the patch) looks like this:

#+begin_src python
  return {"a": 1, "b": 2}
#+end_src

#+RESULTS:
| a | : | 1 | b | : | 2 |

But after the patch they appear like so:

#+begin_src python
  return {"a": 1, "b": 2}
#+end_src

#+RESULTS:
: {'a': 1, 'b': 2}
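
As a rough illustration (not code from the patch): the result string is
just Python's str() of the dict, and the Emacs side leaves any result
that starts with "{" untouched. The helper name below is hypothetical,
and the check is mirrored in Python only for clarity. One consequence,
assuming the check stays this simple, is that set results are also left
as-is:

```python
def kept_verbatim(results: str) -> bool:
    """Hypothetical Python mirror of the Emacs-side check.

    A result string whose first character is '{' is returned unchanged
    instead of being parsed into a list or table.
    """
    # Slicing (rather than indexing) keeps this safe for an empty string.
    return results[:1] == "{"


dict_result = str({"a": 1, "b": 2})  # what a value result looks like on disk
set_result = str({1})                # sets also start with '{'
list_result = str([1, 2])            # lists still go through table conversion

for text in (dict_result, set_result, list_result):
    print(text, "->", "verbatim" if kept_verbatim(text) else "parsed")
```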

Next, for numpy arrays and pandas dataframes/series: these are
converted to tables, for example:

#+begin_src python
  import pandas as pd
  import numpy as np

  return pd.DataFrame(np.array([[1,2,3],[4,5,6]]),
                      columns=['a','b','c'])
#+end_src

#+RESULTS:
|   | a | b | c |
|---+---+---+---|
| 0 | 1 | 2 | 3 |
| 1 | 4 | 5 | 6 |

To avoid conversion, you can specify "raw", "verbatim", "scalar", or
"output" in the ":results" header argument.

Finally, for plots: ob-python now supports the ":results graphics"
header argument. The behavior depends on whether output or value
results are requested. For output results, the current figure
(pyplot.gcf) is cleared before evaluation, and the figure is saved
afterwards. For value results, the block is expected to return a
matplotlib Figure, which is then saved. To set the figure size, do it
from within Python.

Here is an example of how to plot:

#+begin_src python :results output graphics file :file boxplot.svg
  import matplotlib.pyplot as plt
  import seaborn as sns
  plt.figure(figsize=(5, 5))
  tips = sns.load_dataset("tips")
  sns.boxplot(x="day", y="tip", data=tips)
#+end_src
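
For output results, the wrapping amounts to string templating around
the block body. Below is a minimal Python sketch of that step. The
template text follows the patch's
org-babel-python--output-graphics-wrapper, but the function name and
the use of repr() to quote the file name are assumptions of this sketch
(the patch interpolates the name between single quotes):

```python
# Template mirroring the wrapper in the patch: clear the current figure,
# run the user's code, then save whatever the current figure is.
OUTPUT_GRAPHICS_WRAPPER = """\
import matplotlib.pyplot
matplotlib.pyplot.gcf().clear()
{body}
matplotlib.pyplot.savefig({path!r})"""


def wrap_for_graphics(body: str, path: str) -> str:
    """Return the source that would be sent to Python (hypothetical helper)."""
    return OUTPUT_GRAPHICS_WRAPPER.format(body=body, path=path)


block_body = "import matplotlib.pyplot as plt\nplt.plot([1, 2, 3])"
wrapped = wrap_for_graphics(block_body, "boxplot.svg")
print(wrapped)
```

Only the string is built here, so the sketch runs without matplotlib
installed.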

Compared to the original version of this patch [2], I tried to
simplify and streamline things as much as possible, since this is a
relatively large and complex change. For example, the handling for
dict objects is much simpler now. There are also other miscellaneous
changes to the code structure, which I hope improve the clarity a
bit.

[1] https://list.orgmode.org/CAOQTW-N9rE7fDRM1APMO8X5LRZmJfn_ZjhT3rvaF4X+s5M_jZw@mail.gmail.com/
[2] https://list.orgmode.org/87eenpfe77.fsf@gmail.com/


[-- Attachment #2: 0001-ob-python-Results-handling-for-dicts-dataframes-arra.patch --]

From 468eeaa69660a18d8b0503e5a68c275301d6e6ae Mon Sep 17 00:00:00 2001
From: Jack Kamm <jackkamm@gmail.com>
Date: Mon, 7 Sep 2020 09:58:30 -0700
Subject: [PATCH] ob-python: Results handling for dicts, dataframes, arrays,
 plots

* lisp/ob-python.el (org-babel-execute:python): Parse graphics-file
from params, and pass it to `org-babel-python-evaluate'.
(org-babel-python-table-or-string): Prevent `org-babel-script-escape'
from mangling dict results.
(org-babel-python--def-format-value): Python code for formatting
value results before returning.
(org-babel-python-wrapper-method): Removed.  Instead use part of the
string directly in `org-babel-python-evaluate-external-process'.
(org-babel-python-pp-wrapper-method): Removed.  Pretty printing is now
handled by `org-babel-python--def-format-value'.
(org-babel-python--output-graphics-wrapper): New constant.  Python
code to save graphical output.
(org-babel-python--exec-tmpfile): Removed.  Instead use the raw string
directly in `org-babel-python-evaluate-session'.
(org-babel-python--def-format-value): New constant.  Python function
to format and save value results to file.  Includes handling for
graphics, dataframes, and arrays.
(org-babel-python-format-session-value): Updated to use
`org-babel-python--def-format-value' for formatting value result.
(org-babel-python-evaluate): New parameter graphics-file.  Pass
graphics-file onto downstream helper functions.
(org-babel-python-evaluate-external-process): New parameter
graphics-file.  Use `org-babel-python--output-graphics-wrapper' for
graphical output.  For value result, use
`org-babel-python--def-format-value'.
(org-babel-python-evaluate-session): New parameter graphics-file.  Use
`org-babel-python--output-graphics-wrapper' for graphical output.
Replace the removed constant `org-babel-python--exec-tmpfile' with the
string directly.  Rename local variable tmp-results-file to
results-file, which may take the value of graphics-file when provided.
(org-babel-python-async-evaluate-session): New parameter
graphics-file.  Use `org-babel-python--output-graphics-wrapper' for
graphical output.  Rename local variable tmp-results-file to
results-file, which may take the value of graphics-file when provided.
---
 etc/ORG-NEWS      |  19 +++++-
 lisp/ob-python.el | 164 ++++++++++++++++++++++++++++------------------
 2 files changed, 119 insertions(+), 64 deletions(-)

diff --git a/etc/ORG-NEWS b/etc/ORG-NEWS
index 11fdf2825..2630554ae 100644
--- a/etc/ORG-NEWS
+++ b/etc/ORG-NEWS
@@ -576,6 +576,21 @@ of all relational operators (~<*~, ~=*~, ~!=*~, etc.) that work like
 the regular, unstarred operators but match a headline only if the
 tested property is actually present.
 
+*** =ob-python.el=: Support for more result types and plotting
+
+=ob-python= now recognizes numpy arrays and pandas dataframes/series,
+and converts them to org-mode tables when appropriate.
+
+In addition, dict results are now returned in appropriate string form,
+instead of being mangled as they were previously.
+
+When the header argument =:results graphics= is set, =ob-python= will
+use matplotlib to save graphics. The behavior depends on whether value
+or output results are used. For value results, the last line should
+return a matplotlib Figure object to plot. For output results, the
+current figure (as returned by =pyplot.gcf()=) is cleared before
+evaluation, and then plotted afterwards.
+
 ** New functions and changes in function arguments
 *** =TYPES= argument in ~org-element-lineage~ can now be a symbol
 
@@ -2041,8 +2056,8 @@ to switch to the new signature.
 *** Python session return values must be top-level expression statements
 
 Python blocks with ~:session :results value~ header arguments now only
-return a value if the last line is a top-level expression statement.
-Also, when a None value is returned, "None" will be printed under
+return a value if the last line is a top-level expression statement;
+otherwise the result is None. Also, None will now show up under
 "#+RESULTS:", as it already did with ~:results value~ for non-session
 blocks.
 
diff --git a/lisp/ob-python.el b/lisp/ob-python.el
index c15d45b96..35a82afc0 100644
--- a/lisp/ob-python.el
+++ b/lisp/ob-python.el
@@ -70,6 +70,8 @@ (defun org-babel-execute:python (body params)
 	      org-babel-python-command))
 	 (session (org-babel-python-initiate-session
 		   (cdr (assq :session params))))
+	 (graphics-file (and (member "graphics" (assq :result-params params))
+			     (org-babel-graphical-output-file params)))
          (result-params (cdr (assq :result-params params)))
          (result-type (cdr (assq :result-type params)))
 	 (return-val (when (eq result-type 'value)
@@ -85,7 +87,7 @@ (defun org-babel-execute:python (body params)
 	     (format (if session "\n%s" "\nreturn %s") return-val))))
          (result (org-babel-python-evaluate
 		  session full-body result-type
-		  result-params preamble async)))
+		  result-params preamble async graphics-file)))
     (org-babel-reassemble-table
      result
      (org-babel-pick-name (cdr (assq :colname-names params))
@@ -142,7 +144,9 @@ (defun org-babel-python-table-or-string (results)
   "Convert RESULTS into an appropriate elisp value.
 If the results look like a list or tuple, then convert them into an
 Emacs-lisp table, otherwise return the results as a string."
-  (let ((res (org-babel-script-escape results)))
+  (let ((res (if (string-equal "{" (substring results 0 1))
+                 results ;don't convert dicts to elisp
+               (org-babel-script-escape results))))
     (if (listp res)
         (mapcar (lambda (el) (if (eq el 'None)
                                  org-babel-python-None-to el))
@@ -218,32 +222,51 @@ (defun org-babel-python-initiate-session (&optional session _params)
 (defvar org-babel-python-eoe-indicator "org_babel_python_eoe"
   "A string to indicate that evaluation has completed.")
 
-(defconst org-babel-python-wrapper-method
-  "
-def main():
-%s
-
-open('%s', 'w').write( str(main()) )")
-(defconst org-babel-python-pp-wrapper-method
-  "
-import pprint
-def main():
+(defconst org-babel-python--output-graphics-wrapper "\
+import matplotlib.pyplot
+matplotlib.pyplot.gcf().clear()
 %s
-
-open('%s', 'w').write( pprint.pformat(main()) )")
-
-(defconst org-babel-python--exec-tmpfile "\
-with open('%s') as __org_babel_python_tmpfile:
-    exec(compile(__org_babel_python_tmpfile.read(), __org_babel_python_tmpfile.name, 'exec'))"
-  "Template for Python session command with output results.
-
-Has a single %s escape, the tempfile containing the source code
-to evaluate.")
+matplotlib.pyplot.savefig('%s')"
+  "Format string for saving Python graphical output.
+Has two %s escapes, for the Python code to be evaluated, and the
+file to save the graphics to.")
+
+(defconst org-babel-python--def-format-value "\
+def __org_babel_python_format_value(result, result_file, result_params):
+    with open(result_file, 'w') as f:
+        if 'graphics' in result_params:
+            result.savefig(result_file)
+        elif 'pp' in result_params:
+            import pprint
+            f.write(pprint.pformat(result))
+        else:
+            if not set(result_params).intersection(\
+['scalar', 'verbatim', 'raw']):
+                try:
+                    import pandas
+                except ImportError:
+                    pass
+                else:
+                    if isinstance(result, pandas.DataFrame):
+                        result = [[''] + list(result.columns), None] + \
+[[i] + list(row) for i, row in result.iterrows()]
+                    elif isinstance(result, pandas.Series):
+                        result = list(result.items())
+                try:
+                    import numpy
+                except ImportError:
+                    pass
+                else:
+                    if isinstance(result, numpy.ndarray):
+                        result = result.tolist()
+            f.write(str(result))"
+  "Python function to format value result and save it to file.")
 
 (defun org-babel-python-format-session-value
     (src-file result-file result-params)
   "Return Python code to evaluate SRC-FILE and write result to RESULT-FILE."
-  (format "\
+  (concat org-babel-python--def-format-value
+	  (format "
 import ast
 with open('%s') as __org_babel_python_tmpfile:
     __org_babel_python_ast = ast.parse(__org_babel_python_tmpfile.read())
@@ -253,30 +276,25 @@ (defun org-babel-python-format-session-value
     exec(compile(__org_babel_python_ast, '<string>', 'exec'))
     __org_babel_python_final = eval(compile(ast.Expression(
         __org_babel_python_final.value), '<string>', 'eval'))
-    with open('%s', 'w') as __org_babel_python_tmpfile:
-        if %s:
-            import pprint
-            __org_babel_python_tmpfile.write(pprint.pformat(__org_babel_python_final))
-        else:
-            __org_babel_python_tmpfile.write(str(__org_babel_python_final))
 else:
     exec(compile(__org_babel_python_ast, '<string>', 'exec'))
-    __org_babel_python_final = None"
-	  (org-babel-process-file-name src-file 'noquote)
-	  (org-babel-process-file-name result-file 'noquote)
-	  (if (member "pp" result-params) "True" "False")))
+    __org_babel_python_final = None
+__org_babel_python_format_value(__org_babel_python_final, '%s', %s)"
+		  (org-babel-process-file-name src-file 'noquote)
+		  (org-babel-process-file-name result-file 'noquote)
+		  (org-babel-python-var-to-python result-params))))
 
 (defun org-babel-python-evaluate
-    (session body &optional result-type result-params preamble async)
+    (session body &optional result-type result-params preamble async graphics-file)
   "Evaluate BODY as Python code."
   (if session
       (if async
 	  (org-babel-python-async-evaluate-session
-	   session body result-type result-params)
+	   session body result-type result-params graphics-file)
 	(org-babel-python-evaluate-session
-	 session body result-type result-params))
+	 session body result-type result-params graphics-file))
     (org-babel-python-evaluate-external-process
-     body result-type result-params preamble)))
+     body result-type result-params preamble graphics-file)))
 
 (defun org-babel-python--shift-right (body &optional count)
   (with-temp-buffer
@@ -292,28 +310,36 @@ (defun org-babel-python--shift-right (body &optional count)
     (buffer-string)))
 
 (defun org-babel-python-evaluate-external-process
-    (body &optional result-type result-params preamble)
+    (body &optional result-type result-params preamble graphics-file)
   "Evaluate BODY in external python process.
 If RESULT-TYPE equals `output' then return standard output as a
-string.  If RESULT-TYPE equals `value' then return the value of the
-last statement in BODY, as elisp."
+string.  If RESULT-TYPE equals `value' then return the value of
+the last statement in BODY, as elisp.  If GRAPHICS-FILE is
+non-nil, then save graphical results to that file instead."
   (let ((raw
          (pcase result-type
            (`output (org-babel-eval org-babel-python-command
 				    (concat preamble (and preamble "\n")
-					    body)))
-           (`value (let ((tmp-file (org-babel-temp-file "python-")))
+                                            (if graphics-file
+                                                (format org-babel-python--output-graphics-wrapper
+                                                        body graphics-file)
+                                              body))))
+           (`value (let ((results-file (or graphics-file
+				           (org-babel-temp-file "python-"))))
 		     (org-babel-eval
 		      org-babel-python-command
 		      (concat
 		       preamble (and preamble "\n")
 		       (format
-			(if (member "pp" result-params)
-			    org-babel-python-pp-wrapper-method
-			  org-babel-python-wrapper-method)
-			(org-babel-python--shift-right body)
-			(org-babel-process-file-name tmp-file 'noquote))))
-		     (org-babel-eval-read-file tmp-file))))))
+			(concat org-babel-python--def-format-value "
+def main():
+%s
+
+__org_babel_python_format_value(main(), '%s', %s)")
+                        (org-babel-python--shift-right body)
+			(org-babel-process-file-name results-file 'noquote)
+			(org-babel-python-var-to-python result-params))))
+		     (org-babel-eval-read-file results-file))))))
     (org-babel-result-cond result-params
       raw
       (org-babel-python-table-or-string (org-trim raw)))))
@@ -347,28 +373,36 @@ (defun org-babel-python-send-string (session body)
       (org-babel-chomp (substring string-buffer 0 (match-beginning 0))))))
 
 (defun org-babel-python-evaluate-session
-    (session body &optional result-type result-params)
+    (session body &optional result-type result-params graphics-file)
   "Pass BODY to the Python process in SESSION.
 If RESULT-TYPE equals `output' then return standard output as a
-string.  If RESULT-TYPE equals `value' then return the value of the
-last statement in BODY, as elisp."
+string.  If RESULT-TYPE equals `value' then return the value of
+the last statement in BODY, as elisp.  If GRAPHICS-FILE is
+non-nil, then save graphical results to that file instead."
   (let* ((tmp-src-file (org-babel-temp-file "python-"))
          (results
 	  (progn
-	    (with-temp-file tmp-src-file (insert body))
+	    (with-temp-file tmp-src-file
+              (insert (if (and graphics-file (eq result-type 'output))
+                          (format org-babel-python--output-graphics-wrapper
+                                  body graphics-file)
+                        body)))
             (pcase result-type
 	      (`output
-	       (let ((body (format org-babel-python--exec-tmpfile
+	       (let ((body (format "\
+with open('%s') as f:
+    exec(compile(f.read(), f.name, 'exec'))"
 				   (org-babel-process-file-name
 				    tmp-src-file 'noquote))))
 		 (org-babel-python-send-string session body)))
               (`value
-               (let* ((tmp-results-file (org-babel-temp-file "python-"))
+               (let* ((results-file (or graphics-file
+					(org-babel-temp-file "python-")))
 		      (body (org-babel-python-format-session-value
-			     tmp-src-file tmp-results-file result-params)))
+			     tmp-src-file results-file result-params)))
 		 (org-babel-python-send-string session body)
 		 (sleep-for 0 10)
-		 (org-babel-eval-read-file tmp-results-file)))))))
+		 (org-babel-eval-read-file results-file)))))))
     (org-babel-result-cond result-params
       results
       (org-babel-python-table-or-string results))))
@@ -392,7 +426,7 @@ (defun org-babel-python-async-value-callback (params tmp-file)
       (org-babel-python-table-or-string results))))
 
 (defun org-babel-python-async-evaluate-session
-    (session body &optional result-type result-params)
+    (session body &optional result-type result-params graphics-file)
   "Asynchronously evaluate BODY in SESSION.
 Returns a placeholder string for insertion, to later be replaced
 by `org-babel-comint-async-filter'."
@@ -406,7 +440,10 @@ (defun org-babel-python-async-evaluate-session
        (with-temp-buffer
          (insert (format org-babel-python-async-indicator "start" uuid))
          (insert "\n")
-         (insert body)
+         (insert (if graphics-file
+                     (format org-babel-python--output-graphics-wrapper
+                             body graphics-file)
+                   body))
          (insert "\n")
          (insert (format org-babel-python-async-indicator "end" uuid))
          (let ((python-shell-buffer-name
@@ -414,17 +451,20 @@ (defun org-babel-python-async-evaluate-session
            (python-shell-send-buffer)))
        uuid))
     (`value
-     (let ((tmp-results-file (org-babel-temp-file "python-"))
+     (let ((results-file (or graphics-file
+			     (org-babel-temp-file "python-")))
            (tmp-src-file (org-babel-temp-file "python-")))
        (with-temp-file tmp-src-file (insert body))
        (with-temp-buffer
-         (insert (org-babel-python-format-session-value tmp-src-file tmp-results-file result-params))
+         (insert (org-babel-python-format-session-value
+                  tmp-src-file results-file result-params))
          (insert "\n")
-         (insert (format org-babel-python-async-indicator "file" tmp-results-file))
+         (unless graphics-file
+           (insert (format org-babel-python-async-indicator "file" results-file)))
          (let ((python-shell-buffer-name
                 (org-babel-python-without-earmuffs session)))
            (python-shell-send-buffer)))
-       tmp-results-file))))
+       results-file))))
 
 (provide 'ob-python)
 
-- 
2.41.0

