Subject: Re: [ANN] faster org-table-to-lisp
To: emacs-orgmode@gnu.org
From: tbanelwebmin
Date: Thu, 30 Apr 2020 22:28:58 +0200
Message-ID: <0166c38e-e1f2-9cbc-4cf8-1b287600368d@free.fr>
In-Reply-To: <87wo5xcs3p.fsf@nicolasgoaziou.fr>
References: <820681a6-4973-f016-6425-4afb9c9486a7@free.fr> <87wo5xcs3p.fsf@nicolasgoaziou.fr>

On 30/04/2020 at 10:09, Nicolas Goaziou wrote:
> Hello,
>
> tbanelwebmin writes:
>
>> Here is an alternative, faster version of org-table-to-lisp. It can be
>> more than 100 times faster.
> Great! Thank you!
>
>> #+BEGIN_SRC elisp
>> (defun org-table-to-lisp-faster (&optional org-table-at-p-done)
>>   "Convert the table at point to a Lisp structure.
>> The structure will be a list.  Each item is either the symbol `hline'
>> for a horizontal separator line, or a list of field values as strings.
>> The table is taken from the buffer at point.
>> When the optional ORG-TABLE-AT-P-DONE parameter is not nil, it is
>> assumed that (org-at-table-p) was already called."
> Since you're changing the signature, I suggest to provide the table
> element instead of ORG-AT-TABLE-P. AFAICT, `org-babel-read-element',
> through `org-babel-read-table', would greatly benefit from this.
>
> Or, to be backward compatible, I suggest
>
>   &optional TEXT TABLE
>
>>   (or org-table-at-p-done (org-at-table-p) (user-error "No table at point"))
>>   (save-excursion
>>     (goto-char (org-table-begin))
>>     (let ((end (org-table-end))
>>           (row)
>>           (table))
> Nitpick:
>
>   (row nil)
>   (table nil)
>
>>       (while (< (point) end)
>>         (setq row nil)
>>         (search-forward "|" end)
>>         (if (looking-at "-")
>>             (progn
>>               (search-forward "\n" end)
>   (forward-line)
>
>>               (push 'hline table))
>>           (while (not (search-forward-regexp "\\=\n" end t))
>   (unless (eolp)
>     ...)
>
>>             (unless (search-forward-regexp "\\=\\s-*\\([^|]*\\)" end t)
>>               (user-error "Malformed table at char %s" (point)))
> A row may not be properly ended. It doesn't warrant an error. Could you
> make it more tolerant?
>
> Also `search-forward-regexp' -> `re-search-forward', i.e., use the
> original.
>
>>             (let ((b (match-beginning 1))
>>           (e (match-end       1)))
> Nitpick: spurious spaces.
>
>>               (and (search-backward-regexp "[^ \t]" b t)
>>                (forward-char 1))
>   (skip-chars-backward " \t")
>
>> It is faster because it operates directly on the buffer with
>> (search-forward-regexp). Whereas the standard function splits a string
>> extracted from the buffer.
> You are right. I guess the initial implementation didn't have these
> monster tables in mind.
>
>> This function is a drop-in replacement for the standard one. It can
>> benefit to Babel and Gnuplot.
>>
>> Would it make sense to upgrade Org Mode code base?
> Certainly. Could you add an entry in ORG-NEWS, in "Miscellaneous"?
>
> Regards,
>

Thanks, Nicolas, for your nice suggestions. I've taken them into account.
In particular, using (skip-chars-backward " \t") gave a small additional
speedup and simplified the code.

I found a way to ensure full backward compatibility: I keep the same
signature. When a table is given as a string parameter, it is inserted
into a temporary buffer, which is then parsed. Overall, the resulting
speed is quite satisfactory.

I also made the function more tolerant of ill-formed tables: a missing
"|" or excess spaces at the end of a row are now accepted gracefully.

Regards
Thierry

#+BEGIN_SRC elisp
(defun org-table-to-lisp (&optional txt)
  "Convert the table at point to a Lisp structure.
The structure will be a list.  Each item is either the symbol `hline'
for a horizontal separator line, or a list of field values as strings.
The table is taken from the parameter TXT, or from the buffer at point."
  (if txt
      (with-temp-buffer
        (insert txt)
        (goto-char (point-min))
        (org-table-to-lisp))
    (unless (org-at-table-p) (user-error "No table at point"))
    (save-excursion
      (goto-char (org-table-begin))
      (let ((end (org-table-end))
            (row nil)
            (table nil))
        (while (< (point) end)
          (setq row nil)
          (search-forward "|" end)
          (if (looking-at "-")
              ;; Horizontal separator: skip the rest of the line.
              (progn
                (forward-line)
                (push 'hline table))
            ;; Collect fields until the end of the row.  Also stop at the
            ;; end of the buffer, so that a table lacking a final newline
            ;; cannot loop forever on empty matches.
            (while (not (re-search-forward "\\=\\s-*\\(?:\n\\|\\'\\)" end t))
              (unless (re-search-forward "\\=\\s-*\\([^|\n]*\\)\\(|?\\)" end t)
                (user-error "Malformed table at char %s" (point)))
              (goto-char (match-end 1))
              ;; Trim trailing blanks from the field.
              (skip-chars-backward " \t" (match-beginning 1))
              (push (buffer-substring-no-properties (match-beginning 1) (point))
                    row)
              (goto-char (match-end 2)))
            (push (nreverse row) table)))
        (nreverse table)))))
#+END_SRC

* Version 9.4 (not yet released)
** Miscellaneous
*** Faster org-table-to-lisp
The new implementation can be more than 100 times faster. This improves
the responsiveness of Babel or Gnuplot blocks that handle tables
thousands of lines long.
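
As a rough illustration of the backward-compatible signature described
above, the sketch below calls the function both ways: with a table passed
as the TXT string, and with no argument on the table at point. It assumes
an Org installation whose org-table-to-lisp accepts the optional string
argument. The sample table and the names my/sample-table and
my/table-at-point-demo are made up for this example and are not part of
the patch.

#+BEGIN_SRC elisp
(require 'org)
(require 'org-table)

;; Hypothetical sample table, used only for this illustration.
(defvar my/sample-table "| a | b |\n|---+---|\n| 1 | 2 |\n"
  "A small Org table given as a string.")

;; 1. String argument: the table is parsed from TXT and the current
;;    buffer is left untouched.
(org-table-to-lisp my/sample-table)
;; => (("a" "b") hline ("1" "2"))

;; 2. No argument: the table at point is parsed.
(defun my/table-at-point-demo ()
  "Insert the sample table into a scratch Org buffer and parse it at point."
  (with-temp-buffer
    (org-mode)
    (insert my/sample-table)
    (goto-char (point-min))
    (org-table-to-lisp)))

;; Both entry points should agree on the result.
(equal (my/table-at-point-demo)
       (org-table-to-lisp my/sample-table))
;; => t
#+END_SRC

Each row comes back as a list of trimmed field strings, and the separator
line comes back as the symbol hline, as the docstring states.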