---- On Sat, 18 Nov 2023 16:54:39 +0100 Max Nikulin wrote ---

> Hi,
>
> Trying to figure out the origin of the confusion with
> "bash -c bash /path/to/file-containing-the-source-code.sh"
> I have faced an inconsistency with :cmdline treatment in ob-shell.el. I
> expect same results in the following cases:
>
> #+begin_src bash :cmdline 1 2 3
> printf "%s\n" "$1"
> #+end_src
>
> #+RESULTS:
> : 1
>
> #+begin_src bash :cmdline 1 2 3 :shebang #!/bin/bash
> printf "%s\n" "$1"
> #+end_src
>
> #+RESULTS:
> : 1 2 3
>
> Emacs-28, Org is the current git HEAD.

AFAIU, the inconsistency is due to how the characters following :cmdline are interpreted when the subprocess call is made.

Consider the following, when only :cmdline is used:

# Evaluates like:
#
#    bash -c "./sh-script-8GJzdG 1 2 3"
#
#+begin_src bash :cmdline 1 2 3
echo \"$1\"
#+end_src

#+RESULTS:
: 1

# Evaluates like:
#
#    bash -c "./sh-script-8GJzdG \"1 2\" 3"
#
#+begin_src bash :cmdline "1 2" 3
echo \"$1\"
#+end_src

#+RESULTS:
: 1 2

For :cmdline alone, the characters following :cmdline are passed as though each space-separated word were individually quoted. That is, separate arguments are delimited by one or more spaces. The first example is equivalent to the following:

# Evaluates like:
#
#    bash -c "./sh-script-8GJzdG \"1\" \"2\" \"3\""
#
#+begin_src bash :cmdline 1 2 3
echo \"$1\"
#+end_src

#+RESULTS:
: 1

How would you expect :cmdline "1 2 3" to be evaluated?

#+begin_src bash :cmdline "1 2 3"
echo \"$1\"
#+end_src

My expectation would be that it evaluates like:

    bash -c "./sh-script-8GJzdG \"1 2 3\""

It turns out, however, that it's evaluated exactly like :cmdline 1 2 3, or :cmdline "1" "2" "3". The result is "1".

To make the block evaluate as expected requires an extra set of escaped quotes:

# Evaluates like:
#
#    bash -c "./sh-script-8GJzdG \"1 2 3\""
#
#+begin_src bash :cmdline "\"1 2 3\""
echo \"$1\"
#+end_src

#+RESULTS:
: 1 2 3

This, however, appears to be separate from the reported issue[fn:1].
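
For anyone who wants to check the :cmdline-only behaviour outside Org, here is a minimal sketch. It assumes that the call really has the form shown in the "Evaluates like" comments above, that is, the cmdline string is pasted into a 'bash -c' command string. The script name print-first-arg.sh and the helper run_cmdline_only are hypothetical stand-ins, not anything ob-shell defines.

```shell
#!/usr/bin/env bash
# Sketch only: mimic the ':cmdline'-only call form outside Org.
# 'print-first-arg.sh' is a hypothetical stand-in for the temporary
# script that ob-shell writes (e.g. sh-script-8GJzdG).
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
script="$workdir/print-first-arg.sh"

# The script body prints its first positional argument, like the
# blocks in this message. It deliberately has no shebang line.
cat > "$script" <<'EOF'
printf '%s\n' "$1"
EOF
chmod +x "$script"

# Hypothetical helper: paste the cmdline string (the value left after
# Org header parsing) into a 'bash -c' command string. Bash then
# word-splits it and honours any quotes it contains.
run_cmdline_only() {
    bash -c "$script $1"
}

run_cmdline_only '1 2 3'      # prints: 1
run_cmdline_only '"1 2" 3'    # prints: 1 2
run_cmdline_only '"1 2 3"'    # prints: 1 2 3
```

Note that the argument given to run_cmdline_only corresponds to the cmdline variable value, not to the header text, so the last call matches :cmdline "\"1 2 3\"" rather than :cmdline "1 2 3".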

Now, consider :cmdline paired with :shebang, called with the same values as above.

# Evaluates like:
#
#    /tmp/babel-Xd6rGS/sh-script-61jvMa "1 2 3"
#
#+begin_src bash :cmdline 1 2 3 :shebang #!/usr/bin/env bash
echo \"$1\"
#+end_src

#+RESULTS:
: 1 2 3

# Evaluates like:
#
#    /tmp/babel-Xd6rGS/sh-script-61jvMa "\"1 2\" 3"
#
#+begin_src bash :cmdline "1 2" 3 :shebang #!/usr/bin/env bash
echo \"$1\"
#+end_src

#+RESULTS:
: 1 2" 3"

# Evaluates like:
#
#    /tmp/babel-Xd6rGS/sh-script-61jvMa "1 2 3"
#
#+begin_src bash :cmdline "1 2 3" :shebang #!/usr/bin/env bash
echo \"$1\"
#+end_src

#+RESULTS:
: 1 2 3

# Evaluates like:
#
#    /tmp/babel-Xd6rGS/sh-script-61jvMa "\"1 2 3\""
#
#+begin_src bash :cmdline "\"1 2 3\"" :shebang #!/usr/bin/env bash
echo \"$1\"
#+end_src

#+RESULTS:
: 1 2 3""

# Evaluates like:
#
#    /tmp/babel-Xd6rGS/sh-script-61jvMa "\"1\" \"2\" \"3\""
#
#+begin_src bash :cmdline "1" "2" "3" :shebang #!/usr/bin/env bash
echo \"$1\"
#+end_src

#+RESULTS:
: 1" "2" "3""

An immediate observation is that the output results don't format correctly. If you change the results type to "raw", however, you'll see that the Org results match those from a terminal, like xfce4-terminal.

The fact that raw output matches output from the terminal means that the formatting issue is (also) separate from the bug we're trying to fix. That is, the bug we're trying to fix occurs in how the subprocess call is made, not in how the result is formatted.

In ob-shell, the subprocess call is made with 'process-file'. Arguments are determined casewise:

1. shebang+cmdline
2. cmdline

The characters following :cmdline are received by the 'cmdline' argument to 'org-babel-sh-evaluate' as a string.
Both cases put this string into a list for the ARGS of 'process-file':

| header               | 'org-babel-sh-evaluate' | process-file ARGS     |
|                      | cmdline variable value  | shebang+cmdline       |
|----------------------+-------------------------+-----------------------|
| :cmdline 1 2 3       | "1 2 3"                 | ("1 2 3")             |
| :cmdline "1 2" 3     | "\"1 2\" 3"             | ("\"1 2\" 3")         |
| :cmdline "1" "2" "3" | "\"1\" \"2\" \"3\""     | ("\"1\" \"2\" \"3\"") |

| header               | 'org-babel-sh-evaluate' | process-file ARGS     |
|                      | cmdline variable value  | cmdline               |
|----------------------+-------------------------+-----------------------|
| :cmdline 1 2 3       | "1 2 3"                 | ("1 2 3")             |
| :cmdline "1 2" 3     | "\"1 2\" 3"             | ("\"1 2\" 3")         |
| :cmdline "1" "2" "3" | "\"1\" \"2\" \"3\""     | ("\"1\" \"2\" \"3\"") |

Notice that the ARGS passed to 'process-file' are the same for both cases. The problem is that the "block equivalent shell calls" are *not* the same. If we arrange the equivalent shell calls from the blocks given above into a table, we see that the forms are different:

| header               | cmdline variable value | shebang+cmdline call                                   |
|----------------------+------------------------+--------------------------------------------------------|
| :cmdline 1 2 3       | "1 2 3"                | /tmp/babel-Xd6rGS/sh-script-61jvMa "1 2 3"             |
| :cmdline "1 2" 3     | "\"1 2\" 3"            | /tmp/babel-Xd6rGS/sh-script-61jvMa "\"1 2\" 3"         |
| :cmdline "1" "2" "3" | "\"1\" \"2\" \"3\""    | /tmp/babel-Xd6rGS/sh-script-61jvMa "\"1\" \"2\" \"3\"" |

| header               | cmdline variable value | cmdline call                                   |
|----------------------+------------------------+------------------------------------------------|
| :cmdline 1 2 3       | "1 2 3"                | bash -c "./sh-script-8GJzdG 1 2 3"             |
| :cmdline "1 2" 3     | "\"1 2\" 3"            | bash -c "./sh-script-8GJzdG \"1 2\" 3"         |
| :cmdline "1" "2" "3" | "\"1\" \"2\" \"3\""    | bash -c "./sh-script-8GJzdG \"1\" \"2\" \"3\"" |

The reported bug exists because shebang+cmdline interprets the characters following :cmdline as a *single* string.
Without :shebang, a lone :cmdline interprets them as space delimited.

One possible solution is to reformat the 'process-file' ARGS for the shebang+cmdline case so that characters following :cmdline are interpreted as space delimited. This is possible using 'split-string-and-unquote':

    (split-string-and-unquote "1 2 3")              -> ("1" "2" "3")
    (split-string-and-unquote "\"1 2\" 3")          -> ("1 2" "3")
    (split-string-and-unquote "\"1\" \"2\" \"3\"")  -> ("1" "2" "3")

Whether this is a solution, in part, depends on the perennial problem of shell blocks: knowing what's wrong means knowing what's right. The proposed solution assumes we intend to parse the characters following :cmdline as space delimited and grouped by quotes. However, AFAICT, the parsing issue makes this solution ambiguous.

Thoughts?

-- 
Matt Trzcinski
Emacs Org contributor (ob-shell)
Learn more about Org mode at https://orgmode.org
Support Org development at https://liberapay.com/org-mode

[fn:1] AFAICT, it's due to how headers are parsed by 'org-babel-parse-header-arguments' using 'org-babel-read'. The cell "\"1 2 3\"" (corresponding to :cmdline "1 2 3") is reduced through 'string-match' to "1 2 3". The cell "1 2 3" (corresponding to :cmdline 1 2 3), on the other hand, passes through. The result is that :cmdline "1 2 3" and :cmdline 1 2 3 become indistinguishable.

I mention this because it's easy to get confused by this issue which, AFAICT, is independent of the one we're trying to fix. The reported issue appears only to be related to how the result of :cmdline header parsing is passed to the subprocess.