emacs-orgmode@gnu.org archives
From: Utkarsh Singh <utkarsh190601@gmail.com>
To: emacs-orgmode@gnu.org, bug-gnu-emacs@gnu.org
Subject: [PATCH] org-table-import: Make it more smarter for interactive use
Date: Mon, 19 Apr 2021 10:13:31 +0530
Message-ID: <87czuq9958.fsf@gmail.com>

Hi,

My previous patch proposed adding support for importing files with
arbitrary names.  Building on that, this patch makes org-table-import
smarter by teaching it to guess more separators (delimiters).

Currently org-table-import 'smartly' guesses only COMMA, TAB and SPACE
as separators; this patch adds support for ';' (SEMICOLON) and
':' (COLON).

Here is an example Org table generated by running =M-x org-table-import=
on /etc/passwd (which uses COLON as its separator), with private
information removed.

| bin                    | x |     1 |     1 |                                | /                     | /usr/bin/nologin   |
| daemon                 | x |     2 |     2 |                                | /                     | /usr/bin/nologin   |
| mail                   | x |     8 |    12 |                                | /var/spool/mail       | /usr/bin/nologin   |
| ftp                    | x |    14 |    11 |                                | /srv/ftp              | /usr/bin/nologin   |
| http                   | x |    33 |    33 |                                | /srv/http             | /usr/bin/nologin   |
| nobody                 | x | 65534 | 65534 | Nobody                         | /                     | /usr/bin/nologin   |
| dbus                   | x |    81 |    81 | System Message Bus             | /                     | /usr/bin/nologin   |
| systemd-journal-remote | x |   981 |   981 | systemd Journal Remote         | /                     | /usr/bin/nologin   |
| systemd-network        | x |   980 |   980 | systemd Network Management     | /                     | /usr/bin/nologin   |
| systemd-oom            | x |   979 |   979 | systemd Userspace OOM Killer   | /                     | /usr/bin/nologin   |
| systemd-resolve        | x |   978 |   978 | systemd Resolver               | /                     | /usr/bin/nologin   |
| systemd-timesync       | x |   977 |   977 | systemd Time Synchronization   | /                     | /usr/bin/nologin   |
| systemd-coredump       | x |   976 |   976 | systemd Core Dumper            | /                     | /usr/bin/nologin   |
| avahi                  | x |   974 |   974 | Avahi mDNS/DNS-SD daemon       | /                     | /usr/bin/nologin   |
| colord                 | x |   973 |   973 | Color management daemon        | /var/lib/colord       | /usr/bin/nologin   |
| rtkit                  | x |   133 |   133 | RealtimeKit                    | /proc                 | /usr/bin/nologin   |
| transmission           | x |   169 |   169 | Transmission BitTorrent Daemon | /var/lib/transmission | /usr/bin/nologin   |
| geoclue                | x |   972 |   972 | Geoinformation service         | /var/lib/geoclue      | /usr/bin/nologin   |
| usbmux                 | x |   140 |   140 | usbmux user                    | /                     | /usr/bin/nologin   |
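
For anyone who wants to try the guessing rule outside of Org, here is a
minimal sketch of the same idea applied to a plain string.  It is not
part of the patch: the name my/guess-separator is hypothetical, it
returns plain strings rather than the '(4) and '(16) prefix values used
in the patch, and it leaves out the SPACE fallback.  A candidate is
accepted only when no line consists solely of non-separator characters.

```elisp
;; Hypothetical helper for illustration only, not part of the patch.
(defun my/guess-separator (text)
  "Return the first of \",\", TAB, \";\" or \":\" found on every line of TEXT.
Return nil when no candidate appears on every non-empty line."
  (with-temp-buffer
    (insert text)
    (let ((candidates '("," "\t" ";" ":"))
          found)
      (while (and candidates (not found))
        (goto-char (point-min))
        ;; A non-empty line made only of non-separator characters
        ;; disqualifies the candidate, mirroring the patch's
        ;; `re-search-forward' tests.
        (unless (re-search-forward
                 (concat "^[^\n" (car candidates) "]+$") nil t)
          (setq found (car candidates)))
        (setq candidates (cdr candidates)))
      found)))

;; Two lines in /etc/passwd style: every line contains a colon.
(my/guess-separator
 "bin:x:1:1::/:/usr/bin/nologin\nmail:x:8:12::/var/spool/mail:/usr/bin/nologin\n")
```

Because candidates are tried in a fixed order, input where both a comma
and a colon appear on every line would be treated as comma-separated.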


diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index ab66859d6a..5ee4af612b 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -846,6 +846,35 @@ org-table-create
       (goto-char pos))
     (org-table-align)))
 
+
+(defun org-table-guess-separator (beg0 end0)
+  "Guess separator for `org-table-convert-region' for region BEG0 to END0.
+
+Preferred separators, in order of precedence:
+comma, TAB, ';', ':' or SPACE
+
+If the region contains a line that lacks the candidate
+separator, discard that separator and try the next one
+in the list."
+  (let ((beg (save-excursion
+	       (goto-char (min beg0 end0))
+	       (beginning-of-line 1)
+	       (point)))
+	(end (save-excursion
+	       (goto-char (max beg0 end0))
+	       (end-of-line 1)
+	       (if (bolp) (backward-char 1) (end-of-line 1))
+	       (point))))
+    (save-excursion
+      (goto-char beg)
+      (cond
+       ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
+       ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
+       ((not (re-search-forward "^[^\n;]+$" end t)) ";")
+       ((not (re-search-forward "^[^\n:]+$" end t)) ":")
+       ((not (re-search-forward "^\\([^'\"][^\n\s][^'\"]\\)+$" end t)) " ")
+       (t nil)))))
+
 ;;;###autoload
 (defun org-table-convert-region (beg0 end0 &optional separator)
   "Convert region to a table.
@@ -862,10 +891,7 @@ org-table-convert-region
 integer  When a number, use that many spaces, or a TAB, as field separator
 regexp   When a regular expression, use it to match the separator
 nil      When nil, the command tries to be smart and figure out the
-         separator in the following way:
-         - when each line contains a TAB, assume TAB-separated material
-         - when each line contains a comma, assume CSV material
-         - else, assume one or more SPACE characters as separator."
+         separator using `org-table-guess-separator'."
   (interactive "r\nP")
   (let* ((beg (min beg0 end0))
 	 (end (max beg0 end0))
@@ -881,14 +907,9 @@ org-table-convert-region
       (goto-char end)
       (if (bolp) (backward-char 1) (end-of-line 1))
       (setq end (point-marker))
-      ;; Get the right field separator
-      (unless separator
-	(goto-char beg)
-	(setq separator
-	      (cond
-	       ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
-	       ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
-	       (t 1))))
+      (if (and (not separator)
+               (not (setq separator (org-table-guess-separator beg end))))
+          (error "Unable to guess a suitable separator"))
       (goto-char beg)
       (if (equal separator '(4))
 	  (while (< (point) end)
@@ -921,12 +942,8 @@ org-table-convert-region
 (defun org-table-import (file separator)
   "Import FILE as a table.
 
-The command tries to be smart and figure out the separator in the
-following way:
-
-- when each line contains a TAB, assume TAB-separated material;
-- when each line contains a comma, assume CSV material;
-- else, assume one or more SPACE characters as separator.
+The command tries to be smart and figure out the separator using
+`org-table-guess-separator'.
 
 When non-nil, SEPARATOR specifies the field separator in the
 lines.  It can have the following values:

-- 
Utkarsh Singh
http://utkarshsingh.xyz

