emacs-orgmode@gnu.org archives
* [FR] ob-awk.el specifying a delimeter argument in for output
@ 2023-03-05  9:50 Jeremie Juste
  2023-03-05 12:37 ` Ihor Radchenko
  2023-03-06 10:00 ` Max Nikulin
  0 siblings, 2 replies; 5+ messages in thread
From: Jeremie Juste @ 2023-03-05  9:50 UTC
  To: emacs-orgmode; +Cc: tyler

Hello,

ob-awk has proven very valuable to me lately, so many
thanks for maintaining it.

First of all, I should say that I'm a beginner with awk and I don't
know whether I'm using ob-awk as intended, so I'd be glad of any
suggestions. Let me explain further:


* Default behavior

If I have a CSV file with comma-separated values, I get the output as an
Org table.

;; test.csv
123,0,123


#+begin_src awk :in-file test.csv :cmd-line -F ","
{print $0}
#+end_src

#+RESULTS:
| 123 | 0 | 123 |

* Request 

However, if I have a CSV file with, say, semicolon-delimited values (;),
I don't get an Org table as output:

#+begin_src awk :in-file test1.csv :cmd-line -F ";"
{print $0}
#+end_src

#+RESULTS:
: 123;0;123


In my opinion, this could be fixed if we could read the -F option from
the :cmd-line parameter and pass its delimiter argument (here ;) to the
following function:

modified   lisp/ob-awk.el
@@ -93,7 +93,7 @@ This function is called by `org-babel-execute-src-block'."
 	   results
 	   (let ((tmp (org-babel-temp-file "awk-results-")))
 	     (with-temp-file tmp (insert results))
-	     (org-babel-import-elisp-from-file tmp)))))
+	     (org-babel-import-elisp-from-file tmp ";")))))


Would this be the right way to think about this issue?

Best regards,
Jeremie

PS: Note that we have a similar issue in ob-shell,
where the delimiter is a comma by default.

#+begin_src shell
  echo '192;168;1;200' | awk -F ";"   '{print $0}'     
#+end_src

#+RESULTS:
: 192;168;1;200



#+begin_src shell
  echo '192,168,1,200' | awk -F ","   '{print $0}'     
#+end_src

#+RESULTS:
| 192 | 168 | 1 | 200 |



* Re: [FR] ob-awk.el specifying a delimeter argument in for output
  2023-03-05  9:50 [FR] ob-awk.el specifying a delimeter argument in for output Jeremie Juste
@ 2023-03-05 12:37 ` Ihor Radchenko
  2023-03-06  7:36   ` Jeremie Juste
  2023-03-06 10:00 ` Max Nikulin
  1 sibling, 1 reply; 5+ messages in thread
From: Ihor Radchenko @ 2023-03-05 12:37 UTC
  To: Jeremie Juste; +Cc: emacs-orgmode, tyler

Jeremie Juste <jeremiejuste@gmail.com> writes:

> * Request 
>
> However, if I have a CSV file with, say, semicolon-delimited values (;),
> I don't get an Org table as output:
>
> #+begin_src awk :in-file test1.csv :cmd-line -F ";"
> {print $0}
> #+end_src
>
> #+RESULTS:
> : 123;0;123

> In my opinion, this could be fixed if we could read the -F option from
> the :cmd-line parameter and pass its delimiter argument (here ;) to the
> following function:

Org knows nothing about your output, by default.
You could as well do something like {print $1+-+$2+-+$3}
What should Org do in such case?

Currently, Org tries to guess the type of arbitrary output. If the
output looks like a table, with fields separated by tabs, commas, or
spaces, it converts the output to a table. Otherwise, it is treated as
a string.

I guess we might add an option to tell Org which separator to use when
parsing output when the :results table header argument is provided (see the
16.6 Results of Evaluation section of the Org manual). However, you can
achieve the same now, using the :post header argument, replacing the
separators with something Org can understand.
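
A minimal sketch of that separator replacement for the shell case, assuming the fields themselves never contain semicolons or commas. The helper name `semicolons_to_commas` is hypothetical and used only for illustration; it rewrites the separator before Org sees the output, so the existing comma detection can apply.

```shell
# Hypothetical helper: turn semicolon-separated lines into
# comma-separated ones (assumes no field contains ";" or ",").
semicolons_to_commas() {
  tr ';' ','
}

# The awk step is the one from the original report; only the last
# stage of the pipeline is new.
echo '192;168;1;200' | awk -F ';' '{print $0}' | semicolons_to_commas
# prints 192,168,1,200
```

Doing the replacement inside the block itself sidesteps the question of when :post runs relative to table detection.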

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>



* Re: [FR] ob-awk.el specifying a delimeter argument in for output
  2023-03-05 12:37 ` Ihor Radchenko
@ 2023-03-06  7:36   ` Jeremie Juste
  2023-03-06  7:47     ` Jeremie Juste
  0 siblings, 1 reply; 5+ messages in thread
From: Jeremie Juste @ 2023-03-06  7:36 UTC
  To: Ihor Radchenko; +Cc: emacs-orgmode, tyler


Hello Ihor,


> Org knows nothing about your output, by default.
> You could as well do something like {print $1+-+$2+-+$3}
> What should Org do in such case?
>
> Currently, Org tries to guess the type of arbitrary output. If the
> output looks like a table, with fields separated by tabs, commas, or
> spaces, it converts the output to a table. Otherwise, it is treated as
> a string.

Many thanks for the insights. OK, I guess this is likely to take some
time, but it would be a great feature in my opinion.

>
> I guess we might add an option to tell Org which separator to use when
> parsing output when the :results table header argument is provided (see the
> 16.6 Results of Evaluation section of the Org manual). However, you can
> achieve the same now, using the :post header argument, replacing the
> separators with something Org can understand.

Thanks for the suggestion. I tried using :post to replace semicolons with
commas but was surprised by the output.

# file test.csv
# 123;0;123

#+NAME: specific-delim
#+BEGIN_SRC emacs-lisp :var tbl=""
(replace-regexp-in-string ";" "," tbl)
#+end_src

#+RESULTS: specific-delim



#+begin_src awk :in-file test.csv :cmd-line -F ";" :post specific-delim(*this*)
 {print $0}
#+end_src

#+RESULTS:
: 123,0,123



After a short investigation, I noticed that
org-babel-import-elisp-from-file is the function that decides whether
the result should be a table or not, i.e. before the :post argument is
applied. Have I understood correctly?

If that is the case, then I would have to not just replace the delimiter
but convert the entire result to an Org table.

Best regards,
Jeremie



* Re: [FR] ob-awk.el specifying a delimeter argument in for output
  2023-03-06  7:36   ` Jeremie Juste
@ 2023-03-06  7:47     ` Jeremie Juste
  0 siblings, 0 replies; 5+ messages in thread
From: Jeremie Juste @ 2023-03-06  7:47 UTC
  To: Ihor Radchenko; +Cc: emacs-orgmode, tyler


Hello Ihor,

>> I guess we might add an option to tell Org which separator to use when
>> parsing output when the :results table header argument is provided (see the
>> 16.6 Results of Evaluation section of the Org manual). However, you can
>> achieve the same now, using the :post header argument, replacing the
>> separators with something Org can understand.

Thanks again for the suggestion. After giving your solution more thought, I
could achieve the desired output.

# file test.csv
# 123;0;123

#+NAME: specific-delim
#+BEGIN_SRC emacs-lisp :var tbl="" delim=";"
(format "|%s"  (replace-regexp-in-string delim  "|" tbl))
#+end_src

#+RESULTS: specific-delim
: |



#+begin_src awk :in-file test.csv :cmd-line -F ";" :post specific-delim(*this*,";") :results raw
 {print $0}
#+end_src

#+RESULTS:
| 123 | 0 | 123 |
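
For input with more than one line, the table rows could also be built in awk itself, one row per record. This is only a sketch under the same assumption as above (no field contains the delimiter or a "|"); the function name `to_org_rows` is hypothetical, and the awk program inside it is what would go in the body of an awk block used with :results raw.

```shell
# Hypothetical helper: emit one Org table row per input record.
# Fields are split on ";" and wrapped as "| f1 | f2 | ... |".
to_org_rows() {
  awk -F ';' '{
    row = "|"
    for (i = 1; i <= NF; i++) row = row " " $i " |"
    print row
  }'
}

printf '123;0;123\n4;5;6\n' | to_org_rows
# prints:
# | 123 | 0 | 123 |
# | 4 | 5 | 6 |
```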


Best regards,
Jeremie



* Re: [FR] ob-awk.el specifying a delimeter argument in for output
  2023-03-05  9:50 [FR] ob-awk.el specifying a delimeter argument in for output Jeremie Juste
  2023-03-05 12:37 ` Ihor Radchenko
@ 2023-03-06 10:00 ` Max Nikulin
  1 sibling, 0 replies; 5+ messages in thread
From: Max Nikulin @ 2023-03-06 10:00 UTC
  To: Jeremie Juste, emacs-orgmode

On 05/03/2023 16:50, Jeremie Juste wrote:
> #+begin_src awk :in-file test.csv :cmd-line -F ","
> {print $0}
> #+end_src

Notice that awk has an Output Field Separator (OFS) that is a space by
default and may be set using the -v 'OFS=;' (or --assign) command line
option. The -F option sets the input field separator, the FS variable.

"print $0" just sends the input record to the output literally. If you
modify a field, the record is rebuilt taking OFS into account:

     echo 1,2 | awk -F , '{ $1=$1+10; print; }'

     11 2

See (info "(gawk) Changing Fields")
https://www.gnu.org/software/gawk/manual/gawk.html#Changing-Fields
I have just found this link on Stack Overflow. Even adding $1=$1, which
does not really modify field values, is enough.
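
A minimal sketch of that rebuild, assuming semicolon-separated input and a comma as the desired output separator. The function name `rebuild_with_commas` is hypothetical and only wraps the awk call for illustration.

```shell
# Hypothetical helper: FS is ";" on input, OFS is "," on output.
# The assignment $1 = $1 changes no value but makes awk rebuild $0,
# joining the fields with OFS.
rebuild_with_commas() {
  awk -F ';' -v OFS=',' '{ $1 = $1; print }'
}

printf '123;0;123\n' | rebuild_with_commas
# prints 123,0,123
```

Without the $1 = $1 assignment, the record is printed unchanged as 123;0;123, because OFS is only applied when the record is rebuilt.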

So I expect fewer issues with a more realistic example.

As to ";" as a CSV value separator, it often appears together with "," as
the decimal separator (1234,56) and dd.mm.yyyy date formats. Unfortunately,
handling of localized data formats is not a strong point of Emacs. E.g. a
fixed LC_NUMERIC=C forces "." as the decimal separator, and date parsing
functions are rather limited.
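
One possible workaround is to normalize such values in awk before the output reaches Emacs. The sketch below is an illustration only: the function name `normalize_decimal_commas` is hypothetical, it assumes a field is a number only when it consists of digits, one comma, and digits, and it leaves dates untouched.

```shell
# Hypothetical helper: split on ";", rewrite decimal commas as points,
# and join the fields with tabs so that no comma is left as a separator.
normalize_decimal_commas() {
  awk -F ';' -v OFS='\t' '{
    for (i = 1; i <= NF; i++) {
      # Assumption: only fields like 1234,56 are treated as numbers.
      if ($i ~ /^-?[0-9]+,[0-9]+$/) sub(/,/, ".", $i)
    }
    $1 = $1   # force a rebuild with OFS even if nothing was changed
    print
  }'
}

printf '1234,56;05.03.2023\n' | normalize_decimal_commas
# prints 1234.56 and 05.03.2023 separated by a tab
```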




Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
