emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
* [PATCH] ob-comint,R,python: Options for more robust non-async session output
@ 2024-10-18 21:20 Jack Kamm
  2024-10-19  7:20 ` Ihor Radchenko
  0 siblings, 1 reply; 2+ messages in thread
From: Jack Kamm @ 2024-10-18 21:20 UTC (permalink / raw)
  To: emacs-orgmode; +Cc: yantar92, jeremiejuste

[-- Attachment #1: Type: text/plain, Size: 1645 bytes --]

This is related to my recent thread,

https://list.orgmode.org/87wmi9etku.fsf@gmail.com/T/#m93bd964243638ee358d4375c9ed4f40e066238d4

which addresses problems with prompt filtering of R and python async evaluation in Org 9.7.

Here, I address the non-async case.

First, I introduce options to `org-babel-comint-with-output' to skip the
cleanup of prompts and dangling text.  This allows individual Babel
languages to handle these tasks in a more robust, language-specific
manner, e.g. by sourcing temp files or relying on functionality from
external libraries like python.el or ESS.

In particular, as I discussed in the link above, it is very difficult to
do the prompt cleanup in a way that is robust to false positives.  ob-shell
accomplishes this by changing the prompt to a unique string, but this is
not an option for other languages, and has its own drawbacks (e.g. with
conda).  ob-python takes a different approach, and evaluates code in a
way that avoids leaking prompts, thereby skipping the need to clean them
up.

Previously, ob-python had eschewed using `org-babel-comint-with-output',
and re-implemented its own version without the cleanup steps.  With this
patch, ob-python can go back to using `org-babel-comint-with-output',
thereby reducing code duplication.

Finally, I add a new implementation for R non-async output, that follows
the ob-python approach of avoiding prompt leakage altogether, thereby
skipping the need for a cleanup step. I add a unit test to demonstrate
the improved robustness of this approach: previously, this test would
fail due to a false positive prompt, but it passes with the new
implementation.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-options-to-skip-extra-processing-in-org-babel-co.patch --]
[-- Type: text/x-patch, Size: 11563 bytes --]

From 232f4c60323b13e3d46cb377095a8b2751f7ed74 Mon Sep 17 00:00:00 2001
From: Jack Kamm <jackkamm@gmail.com>
Date: Thu, 17 Oct 2024 17:33:47 -0700
Subject: [PATCH] Add options to skip extra processing in
 org-babel-comint-with-output

This patch adds options to org-babel-comint-with-output to skip prompt
removal and saving of dangling text.  Allowing individual languages to
handle this cleanup can be more robust than relying on the generic
ob-comint implementation.  This allows ob-python to switch back to
using `org-babel-comint-with-output' rather than its own bespoke
reimplementation, reducing code duplication.  Furthermore, this adds a
new implementation of ob-R non-async session output evaluation, that
is similar to the ob-python approach in that it avoids leaking prompts
or interfering with dangling text altogether, rather than relying on
the cleanup from `org-babel-comint-with-output'.  A test is added to
test-ob-R.el to demonstrate the improved robustness of the new
approach; previously, this test would fail due to a false positive
prompt, but now passes.

* lisp/ob-comint.el (org-babel-comint-with-output): Add new arguments
to prevent extra processing for prompt cleanup and saving dangling
output.  Also, search for the end-of-execution sentinel within the
collected output rather than the comint buffer, so that point and
dangling text don't need to be modified.
* lisp/ob-python.el (org-babel-python-send-string): Switch to using
`org-babel-comint-with-output', rather than bespoke reimplementation.
* lisp/ob-R.el (ess-send-string): Declare external function.
(org-babel-R-evaluate-session): New implementation of output
evaluation that avoids leaking prompts, by writing the code block to a
tmp file and then sourcing it.  Also uses `ess-send-string' (rather
than inserting code into the comint buffer) to avoid interfering with
dangling text.
* testing/lisp/test-ob-R.el (test-ob-r/session-output-with->-bol): New
test for robustness against false positive prompts at the beginning of
a line.
---
 lisp/ob-R.el              | 32 ++++++++-----------
 lisp/ob-comint.el         | 65 +++++++++++++++++++++++----------------
 lisp/ob-python.el         | 26 +++++-----------
 testing/lisp/test-ob-R.el | 12 ++++++++
 4 files changed, 69 insertions(+), 66 deletions(-)

diff --git a/lisp/ob-R.el b/lisp/ob-R.el
index 481212202..80c6b0e09 100644
--- a/lisp/ob-R.el
+++ b/lisp/ob-R.el
@@ -42,6 +42,8 @@ (declare-function ess-make-buffer-current "ext:ess-inf" ())
 (declare-function ess-eval-buffer "ext:ess-inf" (vis))
 (declare-function ess-wait-for-process "ext:ess-inf"
 		  (&optional proc sec-prompt wait force-redisplay))
+(declare-function ess-send-string "ext:ess-inf"
+                  (process string &optional visibly message type))
 
 (defvar ess-current-process-name) ; ess-custom.el
 (defvar ess-local-process-name)   ; ess-custom.el
@@ -448,26 +450,16 @@ (defun org-babel-R-evaluate-session
 	  (org-babel-import-elisp-from-file tmp-file '(16)))
 	column-names-p)))
     (output
-     (mapconcat
-      'org-babel-chomp
-      (butlast
-       (delq nil
-	     (mapcar
-	      (lambda (line) (when (> (length line) 0) line))
-	      (mapcar
-	       (lambda (line) ;; cleanup extra prompts left in output
-		 (if (string-match
-		      "^\\([>+.]\\([ ][>.+]\\)*[ ]\\)"
-		      (car (split-string line "\n")))
-		     (substring line (match-end 1))
-		   line))
-	       (with-current-buffer session
-		 (let ((comint-prompt-regexp (concat "^" comint-prompt-regexp)))
-		   (org-babel-comint-with-output (session org-babel-R-eoe-output)
-		     (insert (mapconcat 'org-babel-chomp
-					(list body org-babel-R-eoe-indicator)
-					"\n"))
-		     (inferior-ess-send-input)))))))) "\n"))))
+     (let ((tmp-src-file (org-babel-temp-file "R-")))
+       (with-temp-file tmp-src-file
+         (insert (concat
+                  (org-babel-chomp body) "\n" org-babel-R-eoe-indicator)))
+       (with-current-buffer session
+         (org-babel-comint-with-output (session org-babel-R-eoe-output nil nil t t)
+           (ess-send-string (get-buffer-process (current-buffer))
+                            (format "source('%s', echo=F, print.eval=T)"
+                                    (org-babel-process-file-name
+			             tmp-src-file 'noquote)))))))))
 
 (defun org-babel-R-process-value-result (result column-names-p)
   "R-specific processing of return value.
diff --git a/lisp/ob-comint.el b/lisp/ob-comint.el
index b88ac445a..3fbfbe0ec 100644
--- a/lisp/ob-comint.el
+++ b/lisp/ob-comint.el
@@ -105,11 +105,17 @@ (defmacro org-babel-comint-with-output (meta &rest body)
   "Evaluate BODY in BUFFER and return process output.
 Will wait until EOE-INDICATOR appears in the output, then return
 all process output.  If REMOVE-ECHO and FULL-BODY are present and
-non-nil, then strip echo'd body from the returned output.  META
-should be a list containing the following where the last two
-elements are optional.
-
- (BUFFER EOE-INDICATOR REMOVE-ECHO FULL-BODY)
+non-nil, then strip echo'd body from the returned output.  If
+NO-SAVE-DANGLE is nil, current text at the prompt is removed
+before evaluation, then restored; if non-nil, supress that
+behavior.  If NO-CLEANUP-PROMPT is nil, prompts are detected in
+the output, and the returned value is a list of the output split
+on the prompt positions; if non-nil, suppress that behavior, and
+just return a single string of all the output up to
+EOE-INDICATOR.  META should be a list containing the following
+where the last four elements are optional.
+
+ (BUFFER EOE-INDICATOR REMOVE-ECHO FULL-BODY NO-SAVE-DANGLE NO-CLEANUP-PROMPT)
 
 This macro ensures that the filter is removed in case of an error
 or user `keyboard-quit' during execution of body."
@@ -117,7 +123,9 @@ (defmacro org-babel-comint-with-output (meta &rest body)
   (let ((buffer (nth 0 meta))
 	(eoe-indicator (nth 1 meta))
 	(remove-echo (nth 2 meta))
-	(full-body (nth 3 meta)))
+	(full-body (nth 3 meta))
+        (no-save-dangle (nth 5 meta))
+        (no-cleanup-prompt (nth 4 meta)))
     `(org-babel-comint-in-buffer ,buffer
        (let* ((string-buffer "")
 	      (comint-output-filter-functions
@@ -125,23 +133,22 @@ (defmacro org-babel-comint-with-output (meta &rest body)
                        (setq string-buffer (concat string-buffer text)))
 		     comint-output-filter-functions))
 	      dangling-text)
-	 ;; got located, and save dangling text
-	 (goto-char (process-mark (get-buffer-process (current-buffer))))
-	 (let ((start (point))
-	       (end (point-max)))
-	   (setq dangling-text (buffer-substring start end))
-	   (delete-region start end))
+         (unless ,no-save-dangle
+	   ;; got located, and save dangling text
+	   (goto-char (process-mark (get-buffer-process (current-buffer))))
+	   (let ((start (point))
+	         (end (point-max)))
+	     (setq dangling-text (buffer-substring start end))
+	     (delete-region start end)))
 	 ;; pass FULL-BODY to process
 	 ,@body
 	 ;; wait for end-of-evaluation indicator
          (let ((start-time (current-time)))
-	   (while (progn
-		    (goto-char comint-last-input-end)
-		    (not (save-excursion
-		         (and (re-search-forward
-			       (regexp-quote ,eoe-indicator) nil t)
-			      (re-search-forward
-			       comint-prompt-regexp nil t)))))
+	   (while (not (save-excursion
+		         (and (string-match
+			       (regexp-quote ,eoe-indicator) string-buffer)
+			      (string-match
+			       comint-prompt-regexp string-buffer))))
 	     (accept-process-output
               (get-buffer-process (current-buffer))
               org-babel-comint-fallback-regexp-threshold)
@@ -152,21 +159,25 @@ (defmacro org-babel-comint-with-output (meta &rest body)
 		          (goto-char comint-last-input-end)
 		          (save-excursion
                             (and
-                             (re-search-forward
-			      (regexp-quote ,eoe-indicator) nil t)
-			     (re-search-forward
-                              org-babel-comint-prompt-regexp-fallback nil t)))))
+                             (string-match
+			      (regexp-quote ,eoe-indicator) string-buffer)
+			     (string-match
+                              org-babel-comint-prompt-regexp-fallback string-buffer)))))
                (org-babel-comint--set-fallback-prompt))))
 	 ;; replace cut dangling text
-	 (goto-char (process-mark (get-buffer-process (current-buffer))))
-	 (insert dangling-text)
+         (unless ,no-save-dangle
+	   (goto-char (process-mark (get-buffer-process (current-buffer))))
+	   (insert dangling-text))
 
          ;; remove echo'd FULL-BODY from input
          (and ,remove-echo ,full-body
               (setq string-buffer (org-babel-comint--echo-filter string-buffer ,full-body)))
 
-         ;; Filter out prompts.
-         (org-babel-comint--prompt-filter string-buffer)))))
+         (if ,no-cleanup-prompt
+             (save-match-data
+               (string-match (regexp-quote ,eoe-indicator) string-buffer)
+               (org-babel-chomp (substring string-buffer 0 (match-beginning 0))))
+           (org-babel-comint--prompt-filter string-buffer))))))
 
 (defun org-babel-comint-input-command (buffer cmd)
   "Pass CMD to BUFFER.
diff --git a/lisp/ob-python.el b/lisp/ob-python.el
index 8a3c24f70..82f7e4e8f 100644
--- a/lisp/ob-python.el
+++ b/lisp/ob-python.el
@@ -451,31 +451,19 @@ (defun org-babel-python-evaluate-external-process
 (defun org-babel-python-send-string (session body)
   "Pass BODY to the Python process in SESSION.
 Return output."
-  (with-current-buffer session
-    (let* ((string-buffer "")
-	   (comint-output-filter-functions
-	    (cons (lambda (text) (setq string-buffer
-				       (concat string-buffer text)))
-		  comint-output-filter-functions))
-	   (body (format "\
+  (org-babel-comint-with-output
+      ((org-babel-session-buffer:python session)
+       org-babel-python-eoe-indicator
+       nil nil t t)
+    (python-shell-send-string (format "\
 try:
 %s
 except:
     raise
 finally:
     print('%s')"
-			 (org-babel-python--shift-right body 4)
-			 org-babel-python-eoe-indicator)))
-      (let ((python-shell-buffer-name
-	     (org-babel-python-without-earmuffs session)))
-	(python-shell-send-string body))
-      ;; same as `python-shell-comint-end-of-output-p' in emacs-25.1+
-      (while (not (and (python-shell-comint-end-of-output-p string-buffer)
-                       (string-match
-		        org-babel-python-eoe-indicator
-		        string-buffer)))
-	(accept-process-output (get-buffer-process (current-buffer))))
-      (org-babel-chomp (substring string-buffer 0 (match-beginning 0))))))
+			              (org-babel-python--shift-right body 4)
+			              org-babel-python-eoe-indicator))))
 
 (defun org-babel-python-evaluate-session
     (session body &optional result-type result-params graphics-file)
diff --git a/testing/lisp/test-ob-R.el b/testing/lisp/test-ob-R.el
index 0d291bf54..b8dcaa973 100644
--- a/testing/lisp/test-ob-R.el
+++ b/testing/lisp/test-ob-R.el
@@ -126,6 +126,18 @@ (ert-deftest test-ob-r/output-with-<> ()
 ))))
 
 
+(ert-deftest test-ob-r/session-output-with->-bol ()
+  "make sure prompt-like strings are well formatted, even when at beginning of line."
+    (let (ess-ask-for-ess-directory ess-history-file)
+      (should (string="abc
+def> <ghi"
+  (org-test-with-temp-text "#+begin_src R :results output :session R
+     cat(\"abc
+     def> <ghi\")
+   #+end_src
+"
+    (org-babel-execute-src-block))
+))))
 
 
 ;; (ert-deftest test-ob-r/output-with-error ()
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] ob-comint,R,python: Options for more robust non-async session output
  2024-10-18 21:20 [PATCH] ob-comint,R,python: Options for more robust non-async session output Jack Kamm
@ 2024-10-19  7:20 ` Ihor Radchenko
  0 siblings, 0 replies; 2+ messages in thread
From: Ihor Radchenko @ 2024-10-19  7:20 UTC (permalink / raw)
  To: Jack Kamm; +Cc: emacs-orgmode, jeremiejuste

Jack Kamm <jackkamm@gmail.com> writes:

> This is related to my recent thread,
>
> https://list.orgmode.org/87wmi9etku.fsf@gmail.com/T/#m93bd964243638ee358d4375c9ed4f40e066238d4
>
> which addresses problems with prompt filtering of R and python async evaluation in Org 9.7.
>
> Here, I address the non-async case.
>
> First, I introduce options to `org-babel-comint-with-output' to skip the
> cleanup of prompts and dangling text.  This allows individual Babel
> languages to handle these tasks in a more robust, language-specific
> manner, e.g. by sourcing temp files or relying on functionality from
> external libraries like python.el or ESS.

I have no objections to the idea.
Just one comment on the patch.

> -	 ;; got located, and save dangling text
> -	 (goto-char (process-mark (get-buffer-process (current-buffer))))
> -	 (let ((start (point))
> -	       (end (point-max)))
> -	   (setq dangling-text (buffer-substring start end))
> -	   (delete-region start end))
> +         (unless ,no-save-dangle
> +	   ;; got located, and save dangling text
> +	   (goto-char (process-mark (get-buffer-process (current-buffer))))
> +	   (let ((start (point))
> +	         (end (point-max)))
> +	     (setq dangling-text (buffer-substring start end))
> +	     (delete-region start end)))
>  	 ;; pass FULL-BODY to process
>  	 ,@body

Previously, the point was always at process-mark when BODY is
called. Now, it may not be the case. It may cause problems.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2024-10-19  7:19 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-18 21:20 [PATCH] ob-comint,R,python: Options for more robust non-async session output Jack Kamm
2024-10-19  7:20 ` Ihor Radchenko

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).