emacs-orgmode@gnu.org archives
From: tbanelwebmin <tbanelwebmin@free.fr>
To: emacs-orgmode@gnu.org
Subject: Re: [ANN] faster org-table-to-lisp
Date: Thu, 30 Apr 2020 22:28:58 +0200
Message-ID: <0166c38e-e1f2-9cbc-4cf8-1b287600368d@free.fr>
In-Reply-To: <87wo5xcs3p.fsf@nicolasgoaziou.fr>

On 30/04/2020 at 10:09, Nicolas Goaziou wrote:

> Hello,
>
> tbanelwebmin <tbanelwebmin@free.fr> writes:
>
>> Here is an alternative, faster version of org-table-to-lisp. It can be
>> more than 100 times faster.
> Great! Thank you!
>
>> #+BEGIN_SRC elisp
>> (defun org-table-to-lisp-faster (&optional org-table-at-p-done)
>>   "Convert the table at point to a Lisp structure.
>> The structure will be a list.  Each item is either the symbol `hline'
>> for a horizontal separator line, or a list of field values as strings.
>> The table is taken from the buffer at point.
>> When the optional ORG-TABLE-AT-P-DONE parameter is not nil, it is
>> assumed that (org-at-table-p) was already called."
> Since you're changing the signature, I suggest to provide the table
> element instead of ORG-AT-TABLE-P. AFAICT, `org-babel-read-element',
> through `org-babel-read-table', would greatly benefit from this.
>
> Or, to be backward compatible, I suggest
>
>   &optional TEXT TABLE
>
>>   (or org-table-at-p-done (org-at-table-p) (user-error "No table at point"))
>>   (save-excursion
>>     (goto-char (org-table-begin))
>>     (let ((end (org-table-end))
>>           (row)
>>           (table))
> Nitpick:
>
>     (row nil)
>     (table nil)
>
>>       (while (< (point) end)
>>         (setq row nil)
>>         (search-forward "|" end)
>>         (if (looking-at "-")
>>             (progn
>>               (search-forward "\n" end)
> (forward-line)
>
>>               (push 'hline table))
>>           (while (not (search-forward-regexp "\\=\n" end t))
> (unless (eolp)
>   ...)
>
>>             (unless (search-forward-regexp "\\=\\s-*\\([^|]*\\)" end t)
>>               (user-error "Malformed table at char %s" (point)))
> A row may not be properly ended. It doesn't warrant an error. Could you
> make it more tolerant?
>
> Also `search-forward-regexp' -> `re-search-forward', i.e., use the
> original.
>
>>             (let ((b (match-beginning 1))
>>           (e (match-end       1)))
> Nitpick: spurious spaces.
>
>>               (and (search-backward-regexp "[^ \t]" b t)
>>                (forward-char 1))
>   (skip-chars-backward " \t")
>
>> It is faster because it operates directly on the buffer with
>> (search-forward-regexp). Whereas the standard function splits a string
>> extracted from the buffer.
> You are right. I guess the initial implementation didn't have these
> monster tables in mind.
>
>> This function is a drop-in replacement for the standard one. It can
>> benefit to Babel and Gnuplot.
>>
>> Would it make sense to upgrade Org Mode code base?
> Certainly. Could you add an entry in ORG-NEWS, in "Miscellaneous"?
>
> Regards,
>
Thanks, Nicolas, for your helpful suggestions. I've taken them into
account. In particular, using (skip-chars-backward " \t") gave a
small additional speedup and simplified the code.
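
To illustrate the trimming idiom in isolation, here is a minimal sketch. The
helper name `my/trim-trailing-blanks' is hypothetical and not part of Org; it
only shows how (skip-chars-backward " \t" LIMIT) moves point back over
trailing blanks without running a regexp search.

```elisp
;; Hypothetical helper, for illustration only: return STR without its
;; trailing spaces and tabs, using the same idiom as the table parser.
(defun my/trim-trailing-blanks (str)
  (with-temp-buffer
    (insert str)
    (let ((field-start (point-min)))
      (goto-char (point-max))
      ;; Move back over spaces and tabs, but never past FIELD-START.
      (skip-chars-backward " \t" field-start)
      (buffer-substring-no-properties field-start (point)))))

(my/trim-trailing-blanks "value  \t ")
;; → "value"
```

The second argument bounds the backward motion, so a field made only of
blanks yields the empty string instead of running into the previous field.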

I found a way to ensure full backward compatibility. I keep the same
signature. When a table is given as a string parameter, it is inserted
into a temporary buffer, which is then parsed. Overall, the resulting
speed is quite satisfactory.

I also made the function more tolerant of ill-formed tables: a missing
"|" or excess spaces at the end of a row are now accepted gracefully.

Regards
Thierry

#+BEGIN_SRC elisp
(defun org-table-to-lisp (&optional txt)
  "Convert the table at point to a Lisp structure.
The structure will be a list.  Each item is either the symbol `hline'
for a horizontal separator line, or a list of field values as strings.
The table is taken from the parameter TXT, or from the buffer at point."
  (if txt
      (with-temp-buffer
        (insert txt)
        (goto-char (point-min))
        (org-table-to-lisp))
    (unless (org-at-table-p) (user-error "No table at point"))
    (save-excursion
      (goto-char (org-table-begin))
      (let ((end (org-table-end))
            (row nil)
            (table nil))
        (while (< (point) end)
          (setq row nil)
          (search-forward "|" end)
          (if (looking-at "-")
              (progn
                (forward-line)
                (push 'hline table))
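            ;; Also stop at end of buffer, so that a last row without a
            ;; final newline cannot loop forever on empty matches.
            (while (not (or (eobp) (re-search-forward "\\=\\s-*\n" end t)))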
              (unless (re-search-forward "\\=\\s-*\\([^|\n]*\\)\\(|?\\)" end t)
                (user-error "Malformed table at char %s" (point)))
              (goto-char (match-end 1))
              (skip-chars-backward " \t" (match-beginning 1))
              (push
               (buffer-substring-no-properties (match-beginning 1) (point))
               row)
              (goto-char (match-end 2)))
            (push (nreverse row) table)))
        (nreverse table)))))
#+END_SRC
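
The following usage sketch shows both calling modes and the tolerance for
ill-formed rows. It assumes the definition above has been evaluated (or that
the installed Org provides an equivalent `org-table-to-lisp'), and that table
strings end with a newline.

```elisp
(require 'org)

;; String argument: the text is parsed in a temporary buffer.
(org-table-to-lisp "| a | b |\n|---+---|\n| 1 | 2 |\n")
;; → (("a" "b") hline ("1" "2"))

;; Ill-formed rows: the first row lacks its closing "|", the second
;; has trailing spaces after it.
(org-table-to-lisp "| 1 | 2\n| 3 | 4 |   \n")
;; → (("1" "2") ("3" "4"))

;; No argument: the table at point in the current buffer is parsed.
(with-temp-buffer
  (org-mode)
  (insert "| x | y |\n")
  (goto-char (point-min))
  (org-table-to-lisp))
;; → (("x" "y"))
```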


* Version 9.4 (not yet released)
** Miscellaneous
*** Faster org-table-to-lisp

The new implementation can be more than 100 times faster. This improves
the responsiveness of Babel or Gnuplot blocks that handle tables
thousands of rows long.
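
One way to check the speed on a given machine is a rough timing sketch like
the one below. The helper `my/make-table-string' and the 5000-row size are
assumptions chosen for illustration; `benchmark-run' comes from Emacs's
benchmark library, and the measured time will vary with hardware and Org
version.

```elisp
(require 'org)
(require 'benchmark)

;; Hypothetical helper: build an Org table string with ROWS data rows
;; of three fields each, every row ending with a newline.
(defun my/make-table-string (rows)
  (let ((lines nil))
    (dotimes (i rows)
      (push (format "| %d | row-%d | %d |\n" i i (* i i)) lines))
    (apply #'concat (nreverse lines))))

(let* ((txt (my/make-table-string 5000))
       ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
       (timing (benchmark-run 1 (org-table-to-lisp txt))))
  (message "Parsed %d rows in %.3f s"
           (length (org-table-to-lisp txt))
           (car timing)))
```

Running the same form against the old and the new definition gives a direct
comparison for a particular table size.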



