emacs-orgmode@gnu.org archives
* [PATCH] org-test: Create a collaborative test set for Org buffer parser
@ 2021-12-11 14:39 Ihor Radchenko
  2021-12-14 16:16 ` Max Nikulin
  0 siblings, 1 reply; 6+ messages in thread
From: Ihor Radchenko @ 2021-12-11 14:39 UTC
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 913 bytes --]

Dear all,

The attached patch is an attempt to create something like a shared repo
for Org element parser tests.

The idea is to move the tests out of Elisp into a set of text files.
That way, anyone interested in developing Org syntax support can use our
tests and potentially contribute more test files to the benefit of Org
mode for Emacs.

The test set is essentially a series of .org files alongside .el files
containing normalised output of `org-element-parse-buffer'. (see the
patch)
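
To illustrate what "normalised" means here, below is a minimal sketch
(not the code from the patch): it keeps only a chosen set of properties
from a property list and sorts them by name, so that the saved output
does not depend on the order in which the parser emits properties.  The
helper name `sketch-keep-properties' is hypothetical.

```emacs-lisp
;; Sketch only: `sketch-keep-properties' is a hypothetical helper,
;; not a function from the patch.
(defun sketch-keep-properties (plist keys)
  "Return a new plist with only KEYS from PLIST, sorted by key name."
  (let (pairs)
    ;; Collect the (KEY . VALUE) pairs we want to keep.
    (while plist
      (when (memq (car plist) keys)
        (push (cons (car plist) (cadr plist)) pairs))
      (setq plist (cddr plist)))
    ;; Sort by key name so the result is independent of input order.
    (setq pairs (sort pairs (lambda (a b)
                              (string< (symbol-name (car a))
                                       (symbol-name (car b))))))
    ;; Flatten the pairs back into a plist.
    (apply #'append
           (mapcar (lambda (pair) (list (car pair) (cdr pair)))
                   pairs))))

(sketch-keep-properties '(:end 46 :parent ignored :begin 24)
                        '(:begin :end))
;; => (:begin 24 :end 46)
```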

Anyone can contribute to the test set by adding new .org files and
generating the canonical parser output with the new
M-x test-org-element-parser-save-expected-result command.

README.org in the repo also serves as a test file 😝.

Any comments or suggestions?
I am particularly looking for thoughts about licensing and possible
distribution of the test set in a separate repository.

Best,
Ihor


[-- Attachment #2: 0001-org-test-Create-a-separate-testset-for-Org-element-p.patch --]
[-- Type: text/x-diff, Size: 14719 bytes --]

From f7bb947517e8793a45864b614f06460d1132539d Mon Sep 17 00:00:00 2001
Message-Id: <f7bb947517e8793a45864b614f06460d1132539d.1639232929.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Sat, 11 Dec 2021 22:24:39 +0800
Subject: [PATCH] org-test: Create a separate testset for Org element parser

* testing/lisp/test-org-element-parser-sources/README.org: Add readme
file describing the test file format and organisation.

* testing/lisp/test-org-element-parser-sources/simple-heading.org: Add
an example test file.

* testing/lisp/test-org-element-parser-sources/README.el:
* testing/lisp/test-org-element-parser-sources/simple-heading.el: Add
normalised expected parser output files.

* testing/lisp/test-org-element-parser.el: New testset integration to
main Org test suite.  The file defines ERT tests for files inside
test-org-element-parser-sources and an interactive function
`test-org-element-parser-save-expected-result' to generate parser
output files.
---
 .../test-org-element-parser-sources/README.el |  42 ++++++
 .../README.org                                |  64 +++++++++
 .../simple-heading.el                         |  11 ++
 .../simple-heading.org                        |   5 +
 testing/lisp/test-org-element-parser.el       | 129 ++++++++++++++++++
 5 files changed, 251 insertions(+)
 create mode 100644 testing/lisp/test-org-element-parser-sources/README.el
 create mode 100644 testing/lisp/test-org-element-parser-sources/README.org
 create mode 100644 testing/lisp/test-org-element-parser-sources/simple-heading.el
 create mode 100644 testing/lisp/test-org-element-parser-sources/simple-heading.org
 create mode 100644 testing/lisp/test-org-element-parser.el

diff --git a/testing/lisp/test-org-element-parser-sources/README.el b/testing/lisp/test-org-element-parser-sources/README.el
new file mode 100644
index 000000000..852df032f
--- /dev/null
+++ b/testing/lisp/test-org-element-parser-sources/README.el
@@ -0,0 +1,42 @@
+(org-data
+ (:begin 1 :contents-begin 2 :contents-end 1306 :end 1306 :post-affiliated 1 :post-blank 0)
+ (section
+  (:begin 2 :contents-begin 2 :contents-end 837 :end 838 :post-affiliated 2 :post-blank 1)
+  (paragraph
+   (:begin 2 :contents-begin 2 :contents-end 51 :end 52 :post-affiliated 2 :post-blank 1)
+   "This is a shared test suite for Org mode syntax.\n")
+  (paragraph
+   (:begin 52 :contents-begin 52 :contents-end 247 :end 248 :post-affiliated 52 :post-blank 1)
+   "The test suite consists of a number of .org example files alongside\nwith the expected parser output.  Each .org file can be parsed as is\nand the result should match the corresponding .el file.  \n")
+  (paragraph
+   (:begin 248 :contents-begin 248 :contents-end 424 :end 425 :post-affiliated 248 :post-blank 1)
+   "The parser results in .el files are Emacs sexps.  Each sexp is an\noutput of "
+   (verbatim
+    (:begin 324 :end 351 :post-blank 1))
+   "stripped from unessential\nproperties.  Each sexp has the following form:\n")
+  (src-block
+   (:begin 425 :end 773 :post-affiliated 425 :post-blank 1))
+  (paragraph
+   (:begin 773 :contents-begin 773 :contents-end 837 :end 837 :post-affiliated 773 :post-blank 0)
+   "The properties of elements can be specified in arbitrary order.\n"))
+ (headline
+  (:archivedp nil :begin 838 :commentedp nil :contents-begin 854 :contents-end 1306 :end 1306 :footnote-section-p nil :level 1 :post-affiliated 838 :post-blank 0 :pre-blank 1 :priority nil :raw-value "Contributing" :tags nil :title
+	      ("Contributing")
+	      :todo-keyword nil :todo-type nil)
+  (section
+   (:begin 854 :contents-begin 854 :contents-end 1306 :end 1306 :post-affiliated 854 :post-blank 0)
+   (paragraph
+    (:begin 854 :contents-begin 854 :contents-end 983 :end 984 :post-affiliated 854 :post-blank 1)
+    "To add new test files to this suite, send a patch to Org mode mailing\nlist, as described in "
+    (link
+     (:begin 946 :contents-begin nil :contents-end nil :end 981 :post-blank 0))
+    ".\n")
+   (paragraph
+    (:begin 984 :contents-begin 984 :contents-end 1306 :end 1306 :post-affiliated 984 :post-blank 0)
+    "The expected parser output can be generated using Emacs and latest\nversion of Org mode.  You need to open an Org file in Emacs, load\n"
+    (verbatim
+     (:begin 1117 :end 1180 :post-blank 0))
+    ", and\nrun "
+    (verbatim
+     (:begin 1190 :end 1240 :post-blank 0))
+    ".  The expected\noutput will be saved alongside with the Org file.\n"))))
diff --git a/testing/lisp/test-org-element-parser-sources/README.org b/testing/lisp/test-org-element-parser-sources/README.org
new file mode 100644
index 000000000..78e33eb36
--- /dev/null
+++ b/testing/lisp/test-org-element-parser-sources/README.org
@@ -0,0 +1,64 @@
+#+TITLE: Shared Org parser testing fileset
+#+AUTHOR: Ihor Radchenko
+#+EMAIL: yantar92 at gmail dot com
+
+This is a shared test suite for Org mode syntax.  It is a part of Org
+mode's own test suite extracted for easier contributions.
+
+The test suite consists of a number of .org example files alongside
+with the expected parser output.  Each .org file can be parsed as is
+and the result should match the corresponding .el file.  
+
+The parser results in .el files are Emacs sexps.  Each sexp is an
+output of =org-element-parse-buffer= stripped from unessential
+properties.  Each sexp has the following form:
+
+#+begin_src emacs-lisp
+(org-data
+ (:property1 value-1 :property2 value-2 ...)
+ (inner-element-1 
+  (:inner-element-property1 value ...)
+  ... <other elements inside inner-element-1>)
+ ...
+ "string element is not a list, but a string"
+ ...
+ (headline
+  (:property1 value1 ... :title (object-inside-title1 (<properties>) ...))
+  ...)
+ ...)
+#+end_src
+
+The properties of elements can be specified in arbitrary order.
+
+The common properties are =:begin=, =:end=, =:contents-begin=, and
+=:contents-end=.  Their values are 1-indexed char positions from the
+beginning of the Org file.
+
+* Contributing
+
+To add new test files to this suite, send a patch to Org mode mailing
+list, as described in https://orgmode.org/contribute.html.
+
+The expected parser output can be generated using Emacs and latest
+version of Org mode.  You need to open an Org file in Emacs, load
+=/path/to/Org/git/repo/testing/lisp/test-org-element-parser.el=, and
+run =M-x test-org-element-parser-save-expected-result=.  The expected
+output will be saved alongside with the Org file.
+
+* License
+
+Org-mode is published under [[https://www.gnu.org/licenses/gpl-3.0.html][the GNU GPLv3 license]] or any later
+version, the same as GNU Emacs.
+
+Org-mode is free software: you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation, either version 3 of the License, or
+(at your option) any later version.
+
+Org mode is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with Org mode.  If not, see <https://www.gnu.org/licenses/>.
diff --git a/testing/lisp/test-org-element-parser-sources/simple-heading.el b/testing/lisp/test-org-element-parser-sources/simple-heading.el
new file mode 100644
index 000000000..6ca7a54f6
--- /dev/null
+++ b/testing/lisp/test-org-element-parser-sources/simple-heading.el
@@ -0,0 +1,11 @@
+(org-data
+ (:begin 1 :contents-begin 3 :contents-end 46 :end 46 :post-affiliated 1 :post-blank 0)
+ (headline
+  (:archivedp nil :begin 3 :commentedp nil :contents-begin 24 :contents-end 46 :end 46 :footnote-section-p nil :level 1 :post-affiliated 3 :post-blank 0 :pre-blank 1 :priority nil :raw-value "this is a heading" :tags nil :title
+	      ("this is a heading")
+	      :todo-keyword nil :todo-type nil)
+  (section
+   (:begin 24 :contents-begin 24 :contents-end 46 :end 46 :post-affiliated 24 :post-blank 0)
+   (paragraph
+    (:begin 24 :contents-begin 24 :contents-end 46 :end 46 :post-affiliated 24 :post-blank 0)
+    "With some text below.\n"))))
diff --git a/testing/lisp/test-org-element-parser-sources/simple-heading.org b/testing/lisp/test-org-element-parser-sources/simple-heading.org
new file mode 100644
index 000000000..b508ecfec
--- /dev/null
+++ b/testing/lisp/test-org-element-parser-sources/simple-heading.org
@@ -0,0 +1,5 @@
+
+
+* this is a heading
+
+With some text below.
diff --git a/testing/lisp/test-org-element-parser.el b/testing/lisp/test-org-element-parser.el
new file mode 100644
index 000000000..d12307d98
--- /dev/null
+++ b/testing/lisp/test-org-element-parser.el
@@ -0,0 +1,129 @@
+;;; test-org-element-parser.el --- Tests for org-element.el parser
+
+;; Copyright (C) 2021  Ihor Radchenko
+
+;; Author: Ihor Radchenko <yantar92 at gmail dot com>
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Code:
+(require 'ert)
+(require 'org-element)
+
+(defvar test-org-element-parser-properties
+  '((:global :begin :end :contents-begin :contents-end :pre-blank :post-blank :post-affiliated)
+    (headline :raw-value :title :level :priority :tags :todo-keyword :todo-type :footnote-section-p :archivedp :commentedp))
+  "List of important properties that should be parsed.")
+
+(defvar test-org-element-parser-source-directory "../lisp/test-org-element-parser-sources/"
+  "Path to directory containing all the test Org files.
+The expected parsed representation is stored alongside in .el files.
+For example, parsed representation of file.org is in file.el.")
+
+(defun test-org-element-parser-generate-syntax-sexp ()
+  "Return SEXP with important parts of parsed representation of current Org buffer."
+  (unless (derived-mode-p 'org-mode) (user-error "Not an Org buffer."))
+  (let ((datum (org-element-parse-buffer 'object))
+	(strip-func (lambda (el)
+		      (let ((type (org-element-type el))
+			    (plist (when (listp el) (nth 1 el)))
+			    prop value tmpalist)
+                        (if (eq type 'plain-text)
+                            (set-text-properties 0 (length el) nil el)
+			  (while plist
+			    (setq prop (car plist))
+			    (setq value (cadr plist))
+			    (when (stringp value) (setq value (substring-no-properties value)))
+			    (setq plist (cddr plist))
+			    (when (or (memq prop (alist-get :global test-org-element-parser-properties))
+				      (memq prop (alist-get type test-org-element-parser-properties)))
+                              (push (cons prop value) tmpalist)))
+                          (setq tmpalist (sort tmpalist (lambda (a b) (string< (symbol-name (car a))
+                                                                          (symbol-name (car b))))))
+			  (setf (nth 1 el)
+                                (apply #'append
+                                       (mapcar (lambda (c) (list (car c) (cdr c)))
+                                               tmpalist))))))))
+    (org-element-map datum (append '(plain-text)  org-element-all-elements org-element-all-objects)
+      strip-func nil nil nil 'with-affiliated)
+    ;; `org-element-map' never maps over `org-data'. Update it separately.
+    (funcall strip-func datum)
+    datum))
+
+(defun test-org-element-parser-save-expected-result (&optional file)
+  "Save reference parsed representation of current Org buffer or FILE.
+The parsed representation will be saved alongside with the buffer file."
+  (interactive)
+  (with-current-buffer (if file
+                           (find-file-noselect file)
+                         (current-buffer))
+    (save-buffer)
+    (let ((datum (test-org-element-parser-generate-syntax-sexp))
+	  (path (buffer-file-name))
+          newpath)
+      (unless (and path (file-exists-p path)) (user-error "Not in a file buffer."))
+      (setq newpath (concat (file-name-sans-extension path) ".el"))
+      (with-temp-file newpath
+        (condition-case err
+            (progn
+	      (pp datum (current-buffer))
+              (message "Parsed representation saved to %s" (expand-file-name newpath)))
+          (error (message "Failed to save parsed representation: \"%S\"" err)))))))
+
+(defun org-test-element-verify (&optional file)
+  "Verify `org-element-parse-buffer' for current Org buffer or FILE.
+This is a function rather than a macro so that FILE is evaluated once."
+  (unless file
+    (save-buffer)
+    (setq file (buffer-file-name)))
+  (unless (and file (file-exists-p file))
+    (user-error "%s does not exist" file))
+  (let ((reference-file (concat (file-name-sans-extension file) ".el")))
+    (unless (file-exists-p reference-file)
+      (user-error "Reference result file %s does not exist" reference-file))
+    (with-temp-buffer
+      (insert-file-contents file)
+      ;; Parse the Org source in a clean buffer.
+      (org-mode)
+      (should
+       (equal (test-org-element-parser-generate-syntax-sexp)
+              ;; Read the expected sexp from the reference file.
+              (with-temp-buffer
+                (insert-file-contents reference-file)
+                (read (current-buffer))))))))
+
+(defmacro test-org-element-parser-files (&rest files)
+  "Run `org-test-element-verify' for each file in FILES."
+  `(progn
+     (unless (and test-org-element-parser-source-directory
+                  (file-exists-p test-org-element-parser-source-directory))
+       (error "%s does not exist." test-org-element-parser-source-directory))
+     (dolist (file '(,@files))
+       (setq file (format "%s%s.org"
+                          (file-name-as-directory test-org-element-parser-source-directory)
+                          (file-name-base file)))
+       (org-test-element-verify file))))
+
+\f
+
+(ert-deftest test-org-element-parser/simple-headlines ()
+  "Basic tests for Org files with headings and plain text paragraphs."
+  (test-org-element-parser-files "simple-heading"))
+
+(ert-deftest test-org-element-parser/README ()
+  "Test README.org in the example file repo."
+  (test-org-element-parser-files "README"))
+
+(provide 'test-org-element-parser)
+;;; test-org-element-parser.el ends here
-- 
2.32.0



* Re: [PATCH] org-test: Create a collaborative test set for Org buffer parser
  2021-12-11 14:39 [PATCH] org-test: Create a collaborative test set for Org buffer parser Ihor Radchenko
@ 2021-12-14 16:16 ` Max Nikulin
  2021-12-14 20:27   ` Tim Cross
  2021-12-15 13:23   ` Ihor Radchenko
  0 siblings, 2 replies; 6+ messages in thread
From: Max Nikulin @ 2021-12-14 16:16 UTC
  To: emacs-orgmode

On 11/12/2021 21:39, Ihor Radchenko wrote:
> 
> The attached patch is an attempt to create something like shared repo
> for Org element parser tests.

The "[PATCH]" prefix in the subject might be a reason why your message
received less attention than it should.

> The test set is essentially a series of .org files alongside .el files
> containing normalised output of `org-element-parse-buffer'. (see the
> patch)

I think the set should contain hundreds of tests to be helpful, so 2
files per test will likely be inconvenient, since most samples will be
short. I suggest grouping test inputs and results into larger files. Such
tests should be augmented with some metadata: keywords (labels, tags).
E.g. besides a heading

- sample: "* Simple Heading"
   keywords: heading

it should be possible to filter related cases with similar markup

- sample: "*Bold* emphasis"
   keywords: emphasis, heading
   description: Although the line starts with a star,
     there is no space after it, so it is not a heading.

- sample: " * Unordered list item"
   keywords: list, heading
   description: Due to leading space it is a list item,
     not a heading.

- sample: "*"
   keywords: text, heading
   description: Not a heading since there is no space after the star.

I omitted test IDs above.
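
A rough sketch of how such grouped cases could be stored and filtered by
keyword, assuming each case is a plist.  The IDs and the `sketch-' names
are made up for illustration and are not an agreed format.

```emacs-lisp
(require 'seq)

;; Hypothetical case list; the :id values are invented for this sketch.
(defvar sketch-cases
  '((:id simple-heading :keywords (heading)
         :input "* Simple Heading")
    (:id bold-at-bol :keywords (emphasis heading)
         :input "*Bold* emphasis")
    (:id list-item :keywords (list heading)
         :input " * Unordered list item")
    (:id lone-star :keywords (text heading)
         :input "*"))
  "Sample test cases, each a plist with :id, :keywords and :input.")

(defun sketch-cases-with-keyword (keyword cases)
  "Return the entries of CASES whose :keywords contain KEYWORD."
  (seq-filter (lambda (entry)
                (memq keyword (plist-get entry :keywords)))
              cases))

(mapcar (lambda (entry) (plist-get entry :id))
        (sketch-cases-with-keyword 'emphasis sketch-cases))
;; => (bold-at-bol)
```

Filtering on `heading' would return all four cases, which is the point
of tagging the look-alike samples with it.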

The versions of Org and of the test set should be included in the
metadata for the whole suite.

Since partial compliance is assumed, the format of test results should
also be declared, to make it possible to publish an overview or comparison.

Are properties like :begin and :end mandatory for reference results of 
parsing? They make structures more verbose and harder to read. Often it 
is enough to compare content and similar properties.

> Any comments or suggestions?
> I am particularly looking for thoughts about licensing and possible
> distribution of the test set in separate repository.

Since these tests are unlikely to become a part of some software, I do not
think that the GPL would be an obstacle for any project. Requiring signed
consent will likely prevent some developers from contributing new cases.




* Re: [PATCH] org-test: Create a collaborative test set for Org buffer parser
  2021-12-14 16:16 ` Max Nikulin
@ 2021-12-14 20:27   ` Tim Cross
  2021-12-15 13:44     ` Ihor Radchenko
  2021-12-15 13:23   ` Ihor Radchenko
  1 sibling, 1 reply; 6+ messages in thread
From: Tim Cross @ 2021-12-14 20:27 UTC
  To: emacs-orgmode


Max Nikulin <manikulin@gmail.com> writes:

> On 11/12/2021 21:39, Ihor Radchenko wrote:
>
> Since these tests will unlikely become a part of some software, I do not think
> that GPL may be an obstacle for any project. Requirement of signed consent will
> likely prevent contributing of new cases from some developers.

I agree. The test org files are input data rather than code and I don't
think they fall under the copyright restrictions (or the code
contribution guidelines). They probably don't even need to be GPL'd -
possibly a CC license would be sufficient.

As to whether they should be part of the org-mode repository or in their
own repository, I'm not sure. It would be convenient to have them in the
org-mode repository as I expect they will become part of the testing
framework and only having to checkout one repository would be useful. On
the other hand, I guess there could be cases where people want to just
checkout these samples to use to validate their own library/parser etc.
I tend towards putting them in the org-mode repository for simplicity.





* Re: [PATCH] org-test: Create a collaborative test set for Org buffer parser
  2021-12-14 16:16 ` Max Nikulin
  2021-12-14 20:27   ` Tim Cross
@ 2021-12-15 13:23   ` Ihor Radchenko
  2021-12-19  8:23     ` Max Nikulin
  1 sibling, 1 reply; 6+ messages in thread
From: Ihor Radchenko @ 2021-12-15 13:23 UTC
  To: Max Nikulin; +Cc: emacs-orgmode

Max Nikulin <manikulin@gmail.com> writes:

> On 11/12/2021 21:39, Ihor Radchenko wrote:
>> 
>> The attached patch is an attempt to create something like shared repo
>> for Org element parser tests.
>
> "[PATCH]" prefix in the subject might be a reason why you message 
> received less attention than it should.

Well. I really wanted to set a technical tone of the discussion in
contrast to my previous email (proposing this exact idea, among others)
that generated little technical feedback and a lot of non-technical.

> I think, the set should contain hundreds of tests to be helpful, thus 2 
> files per test will likely be inconvenient since most of samples will be 
> short. I suggest to group test input and results into large files. Such 
> tests should be augmented by some metadata: keywords (labels, tags). 
> E.g. besides heading
>
> - sample: "* Simple Heading"
>    keywords: heading

I am not sure here. In a way, we already have such a format in
test-org-element.el:

(ert-deftest test-org-element/bold-parser ()
  "Test `bold' parser."
  ;; Standard test.
  (should
   (org-test-with-temp-text "*bold*"
     (org-element-map (org-element-parse-buffer) 'bold #'identity nil t))))

The problem with grouping short tests into a single file is that we put
an extra requirement on the testing code. The code will have to parse
the test files with multiple tests, extract those tests, and then run
the parser. So, in addition to writing the Org parser, a third-party
developer will also have to write a parser for the test format. I find
individual files easier to get started with. Most existing parser libraries
can handle individual files, but not individual pieces of text grouped into
a bigger file using yet another standard convention.

The usual solution to the above problem is a fixed test file format that
can be processed by the testing module of a given language. But is there a
standard format for multiple tests in one file that can be used across
multiple programming languages?

Having said that, I like your idea about adding metadata to the tests.
Probably, we can simply do it in the readme like the following:

* Test group heading
  - file :: file:./test-files/test1.org
    - keywords :: k1, k2
    - description :: description text

Or even allow inline tests via buffer-local after-save-hook:

   - text :: "*Bold* emphasis"
     - keywords :: emphasis, heading
     - description :: Although the line starts with a star, there is no
                      space after it, so it is not a heading.

Upon save, the text:: field will be automatically converted to a pair
of files, test-hash.org and test-hash.el.
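
A minimal sketch of deriving such file names, assuming the hash is a
truncated SHA-1 of the sample text.  The "test-" prefix, the
12-character truncation and the function name are assumptions made for
illustration, not something decided in this thread.

```emacs-lisp
;; Sketch only: naming scheme and function name are hypothetical.
(defun sketch-case-file-names (text)
  "Return (ORG-FILE . EL-FILE) names derived from a hash of TEXT."
  (let ((hash (substring (secure-hash 'sha1 text) 0 12)))
    (cons (format "test-%s.org" hash)
          (format "test-%s.el" hash))))

;; The same text always maps to the same pair of names, so saving an
;; unchanged sample again would overwrite rather than duplicate files.
(sketch-case-file-names "*Bold* emphasis")
```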

> Version of Org and test set should be included into metadata for the 
> whole suite.

If the test set is distributed together with Org, I see no reason to do
it. Otherwise, the test set should simply track bugfix and main branches
of Org and follow Org releases. Does it sound reasonable?

> Since partial compliance is assumed, format of test results should be 
> declared as well to be able to publish overview or comparison.

I provided a short description of the format in the README (see the
original patch). Is it not enough?

> Are properties like :begin and :end mandatory for reference results of 
> parsing? They make structures more verbose and harder to read. Often it 
> is enough to compare content and similar properties.

I am afraid that if we put the contents of every headline/element in place of
:begin :end, the results will be even less readable. The results will
have to dump multiple cumulative instances of the original file.

Or do you have an alternative suggestion about the format of the
reference parser output?

>> Any comments or suggestions?
>> I am particularly looking for thoughts about licensing and possible
>> distribution of the test set in separate repository.
>
> Since these tests will unlikely become a part of some software, I do not 
> think that GPL may be an obstacle for any project. Requirement of signed 
> consent will likely prevent contributing of new cases from some developers.

Actually, there have been talks about including Org mode tests into
Emacs itself (https://yhetil.org/emacs-devel/87o8629h8g.fsf_-_@gmx.de/)

Not all our tests are under GPL, but we may want to change this
situation.

Also, if we decide not to distribute this test set under the GPL, what about
the usual fear of a contributor's employer coming after us with copyright
claims? I do not have enough knowledge to judge this.

Maybe Bastien or others know better. Or we may try to contact GNU legal
team.

Best,
Ihor



* Re: [PATCH] org-test: Create a collaborative test set for Org buffer parser
  2021-12-14 20:27   ` Tim Cross
@ 2021-12-15 13:44     ` Ihor Radchenko
  0 siblings, 0 replies; 6+ messages in thread
From: Ihor Radchenko @ 2021-12-15 13:44 UTC
  To: Tim Cross; +Cc: emacs-orgmode

Tim Cross <theophilusx@gmail.com> writes:

I have nothing to say about the licence question. Replying to the other part:

> ... On
> the other hand, I guess there could be cases where people want to just
> checkout these samples to use to validate their own library/parser etc.
> I tend towards putting them in the org-mode repository for simplicity.

We can also advise using a sparse checkout, as described in
https://unix.stackexchange.com/questions/233327/is-it-possible-to-clone-only-part-of-a-git-project

I just tried to do so for testing/ folder in Org repo. It was fairly
fast.

I guess that sparse clone should also be fine to prepare patches.

Best,
Ihor



* Re: [PATCH] org-test: Create a collaborative test set for Org buffer parser
  2021-12-15 13:23   ` Ihor Radchenko
@ 2021-12-19  8:23     ` Max Nikulin
  0 siblings, 0 replies; 6+ messages in thread
From: Max Nikulin @ 2021-12-19  8:23 UTC
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 7012 bytes --]

Sorry for the delay. I expected that the next step with inline raw AST
fragments would be easier, but I got a working example of parser tests
grouped in a single file first.

Certainly Org files may be the source for parser tests. They are readable
enough to be convenient for developers and, I hope, it would not require
a lot of work to convert them into machine-readable files. However, tests
should be distributed in some format(s) with widely available parsers:
s-expressions, YAML, JSON. I am against JSON for source files, since
working with multiline text will likely be a pain (besides editing of
files, patches and VCS diffs should be taken into account).

Actually, external projects already face one requirement related to
parsing: the expected results are provided as s-expressions.

For simplicity I described a test suite with a couple of cases using
s-expressions (see attachments). The names of the macros and functions are
arbitrary. The interface should be like

     (demo-parse-test-suite "demo-test-suite.el")

The current implementation creates an ERT test for each entry in such a
file, but that may be changed later. I just had an example of a macro for
parametrized tests from earlier experiments.
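
For illustration, one possible way to define an ERT test per entry is
sketched below.  It is not the attached implementation:
`sketch-define-tests' is a hypothetical name, and the suite is assumed
to be a plist with a :cases list whose entries carry :id, :description,
:input and :result.  The test body is built with backquote so that the
sketch does not rely on lexical binding.

```emacs-lisp
(require 'ert)

;; Sketch only: function name and suite layout are assumptions.
(defun sketch-define-tests (suite-name suite parse-fn)
  "Define one ERT test per entry in the :cases list of plist SUITE.
Tests are named SUITE-NAME/ID.  PARSE-FN is a function symbol that is
called with the :input string; its value is compared to :result."
  (dolist (entry (plist-get suite :cases))
    (let ((name (intern (format "%s/%s" suite-name (plist-get entry :id)))))
      (ert-set-test
       name
       (make-ert-test
        :name name
        :documentation (plist-get entry :description)
        ;; Splice the case data into the body with backquote, so no
        ;; closure (and no lexical binding) is needed.
        :body `(lambda ()
                 (should (equal ',(plist-get entry :result)
                                (funcall ',parse-fn
                                         ,(plist-get entry :input))))))))))

;; `upcase' stands in for a real parser function here.
(sketch-define-tests
 "sketch-suite"
 '(:description "Demo"
   :cases ((:id upcase-a :description "Upcase a string"
                :input "a" :result "A")))
 'upcase)
;; The test `sketch-suite/upcase-a' can now be run with M-x ert.
```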

On 15/12/2021 20:23, Ihor Radchenko wrote:
> Max Nikulin writes:
>> On 11/12/2021 21:39, Ihor Radchenko wrote:
>>>
>>> The attached patch is an attempt to create something like shared repo
>>> for Org element parser tests.
>>
>> "[PATCH]" prefix in the subject might be a reason why you message
>> received less attention than it should.

At least Tom and Timothy have their own parsers, so I expected some response
from them. I am unsure whether the posts about Organice were from developers
or from users. A patch is something too internal to the development of
Org Mode specifically as an Emacs major mode. I once asked people to avoid
proposals to integrate everything, meaning mostly one particular person who
was extremely active at the time, but it affected more people than I thought.
Your intention may likewise have a stronger effect than it should.

> I am not sure here. In a way, we already have such a format in
> test-org-element.el:
> 
> (ert-deftest test-org-element/bold-parser ()
>    "Test `bold' parser."
>    ;; Standard test.
>    (should
>     (org-test-with-temp-text "*bold*"
>       (org-element-map (org-element-parse-buffer) 'bold #'identity nil t))))

It is a bit harder to parse than is acceptable for external tools.
However, it may be converted to a more convenient format using Elisp
facilities.

> So, in addition to writing the Org parser, third-party dev
> will also have to write a parser for the test format.

I assume an available parser for some widespread format and some simple
interpretation of the structures specific to these tests.

> Most existing parser libraries can
> handle individual files, but not individual pieces of text grouped into
> bigger file using yet another standard convention.

It should take a few lines of code to put string input into a
temporary file. It is likely possible to implement a file stream
interface for an input string. Finally, it may encourage developers to add a
method that parses strings.
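
As a sketch of the temporary-file route, written in Elisp only because
that is the language of this thread (a third-party parser would do the
equivalent in its own language).  The helper name
`sketch-call-with-input-file' is hypothetical.

```emacs-lisp
;; Sketch only: `sketch-call-with-input-file' is a hypothetical helper.
(defun sketch-call-with-input-file (input fn)
  "Write INPUT to a temporary .org file and call FN with the file name.
The file is deleted afterwards.  Return whatever FN returns."
  (let ((file (make-temp-file "org-case-" nil ".org")))
    (unwind-protect
        (progn
          (with-temp-file file (insert input))
          (funcall fn file))
      (delete-file file))))

;; Example: count the characters a file-based parser would see.
(sketch-call-with-input-file
 "*Bold* emphasis\n"
 (lambda (file)
   (with-temp-buffer
     (insert-file-contents file)
     (buffer-size))))
;; => 16
```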

I found the following test corpus impressive and unmanageable in the 
form of separate files:
https://github.com/tgbugs/laundry/blob/master/laundry/test.rkt

> But is there a
> standard multi-test in one file format that can be used in multiple
> programming languages?

Some testing frameworks have facilities for parametrized tests with
lists of arbitrary structures; others explicitly recommend writing a
loop over such lists.
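
In ERT the loop variant could look roughly like the sketch below.  The
case list, its (ID INPUT FIRST-ELEMENT-TYPE) layout and the `sketch-'
names are assumptions for illustration; only the type of the first
matching element is compared here, which is much weaker than comparing
full parse trees.

```emacs-lisp
(require 'ert)
(require 'org)
(require 'org-element)

(defvar sketch-type-cases
  '((simple-heading "* Simple Heading\n" headline)
    (bold-at-bol "*Bold* emphasis\n" paragraph)
    (list-item " * Unordered list item\n" plain-list))
  "Hypothetical cases: (ID INPUT FIRST-ELEMENT-TYPE).")

(defun sketch-first-element-type (input)
  "Return the type of the first headline, paragraph or list in INPUT."
  (with-temp-buffer
    (insert input)
    (org-mode)
    ;; The final t asks `org-element-map' to stop at the first match.
    (org-element-map (org-element-parse-buffer)
        '(headline paragraph plain-list)
      #'org-element-type nil t)))

(ert-deftest sketch-first-element-types ()
  "Loop over `sketch-type-cases' instead of defining one test per case."
  (dolist (entry sketch-type-cases)
    ;; `ert-info' adds the case ID to the failure report.
    (ert-info ((format "case: %s" (nth 0 entry)))
      (should (eq (nth 2 entry)
                  (sketch-first-element-type (nth 1 entry)))))))
```

The drawback of the loop is that the first failing case aborts the
test, so later cases are not reported until it is fixed.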

> Or even allow inline tests via buffer-local after-save-hook:
> 
>     - text :: "*Bold* emphasis"
>       - keywords :: emphasis, heading
>       - description :: Despite line is started from a star, there is no
>                        space after it, so it is not a heading.
> 
> Upon save, the text:: field will be automatically converted to a
> test-hash.org+test-hash.el files.

Or to a single file with some top-level metadata and a list (maybe
nested lists) of tests. I am unsure whether such files should be stored in git.

>> Version of Org and test set should be included into metadata for the
>> whole suite.
> 
> If the test set is distributed together with Org, I see no reason to do
> it. Otherwise, the test set should simply track bugfix and main branches
> of Org and follow Org releases. Does it sound reasonable?

I have no particular opinion on whether test sources should be maintained in
the same repository as Org or Emacs. If tests are described in Org files,
then the test suites should be distributed in some other way.

>> Since partial compliance is assumed, format of test results should be
>> declared as well to be able to publish overview or comparison.
> 
> I provided a short description of the format in the README (see the
> original patch). Is it not enough?

I mean the results of a *group* of tests, to present a comparison of parsers.
I am unsure whether JUnit XML files are appropriate.

>> Are properties like :begin and :end mandatory for reference results of
>> parsing? They make structures more verbose and harder to read. Often it
>> is enough to compare content and similar properties.
> 
> I afraid that if we put contents of every headline/element in place of
> :begin :end, the results will be even less readable. The results will
> have to dump multiple cumulative instances of the original file.

I had in mind that only leaf elements have contents. It may result in 
some ambiguity however.
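
A less radical option is to drop position-related properties from both
trees before comparison. A minimal sketch follows. The names
`demo-position-properties' and `demo-strip-positions' are hypothetical,
and the list of properties to ignore is my assumption.

```elisp
;; Assumed list of position-related properties to ignore.
(defvar demo-position-properties
  '(:begin :end :contents-begin :contents-end :post-affiliated))

(defun demo-strip-positions (node)
  "Return a copy of parse tree NODE without position properties.
NODE is either a string or a list (TYPE PROPERTIES CHILDREN...)."
  (if (not (consp node))
      node
    (let ((props (cadr node))
          kept)
      ;; Walk the property list pairwise, keeping non-position entries.
      (while props
        (unless (memq (car props) demo-position-properties)
          (push (car props) kept)
          (push (cadr props) kept))
        (setq props (cddr props)))
      (append (list (car node) (nreverse kept))
              (mapcar #'demo-strip-positions (cddr node))))))
```

For the paragraph from the bold text case in the attached suite it gives
(paragraph (:post-blank 0) (bold nil "Bold text")).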

> Actually, there have been talks about including Org mode tests into
> Emacs itself (https://yhetil.org/emacs-devel/87o8629h8g.fsf_-_@gmx.de/)

I have seen earlier proposals to drop the separate repository for Org and
do everything inside the one for Emacs. In my opinion, it would mean
broken compatibility with earlier versions of Emacs, so I do not like
it. On the other hand, I am not an active developer, so my arguments may
be ignored.

P.S.

Output of ERT requires more tuning:

---- >8 ----

Selector: t
Passed:  1
Failed:  1 (1 unexpected)
Skipped: 0
Total:   2/2

Started at:   2021-12-19 14:13:54+0700
Finished.
Finished at:  2021-12-19 14:13:54+0700

F.

F demo-parse-test/demo-test-suite--bold-emphasis-at-beginning-of-line
     Bold text marker at beginning of line should not be confused with heading: no space after star (‘demo-parse-test/demo-test-suite’)
     (ert-test-failed
      ((should
        (funcall #<subr equal>
		 '(org-data ... ...)
		 (apply #'demo-test-parse-input '...)))
       :form
       (funcall #<subr equal>
		(org-data
		 (:begin 1 :contents-begin 1 :contents-end 13 :end 13 :post-affiliated 1 ...)
		 (paragraph
		  (:begin 1 :contents-begin 1 :contents-end 13 :end 13 :post-affiliated 1 ...)
		  (bold ... "Bold text")))
		(org-data
		 (:begin 1 :contents-begin 1 :contents-end 13 :end 13 :post-affiliated 1 ...)
		 (section
		  (:begin 1 :contents-begin 1 :contents-end 13 :end 13 :post-affiliated 1 ...)
		  (paragraph ... ... "
"))))
       :value nil))



[-- Attachment #2: demo-test-suite.el --]
[-- Type: text/x-emacs-lisp, Size: 1420 bytes --]

(
 :description
 "Just a demo of test suite for org-element parser"
 :cases
 (
  (:id simple-heading
       :description "Heading and section with paragraph"
       :input "

* this is a heading

With some text below.
"
       :result (org-data
		(:begin 1 :contents-begin 3 :contents-end 46 :end 46 :post-affiliated 1 :post-blank 0)
		(headline
		  (:archivedp nil :begin 3 :commentedp nil :contents-begin 24 :contents-end 46 :end 46 :footnote-section-p nil :level 1 :post-affiliated 3 :post-blank 0 :pre-blank 1 :priority nil :raw-value "this is a heading" :tags nil :title
			      ("this is a heading")
			      :todo-keyword nil :todo-type nil)
		  (section
		    (:begin 24 :contents-begin 24 :contents-end 46 :end 46 :post-affiliated 24 :post-blank 0)
		    (paragraph
		      (:begin 24 :contents-begin 24 :contents-end 46 :end 46 :post-affiliated 24 :post-blank 0)
		      "With some text below.\n")))))
  (:id bold-emphasis-at-beginning-of-line
       :description "Bold text marker at beginning of line should not be confused with heading: no space after star"
       :input "*Bold text*
"
       :result (org-data
		(:begin 1 :contents-begin 1 :contents-end 13 :end 13 :post-affiliated 1 :post-blank 0)
		    (paragraph
		      (:begin 1 :contents-begin 1 :contents-end 13 :end 13 :post-affiliated 1 :post-blank 0)
		      (bold
			(:begin 1 :contents-begin 2 :contents-end 11 :end 13)
			"Bold text"))))
))

[-- Attachment #3: demo-test.el --]
[-- Type: text/x-emacs-lisp, Size: 2666 bytes --]

;;;; Example of usage:
;;;;      (demo-parse-test-suite "demo-test-suite.el")

(defmacro nm-deftest-parametrized
    (prefix-sym func-predicate &rest doc-cases)
  "Define parametrized tests.

For each SUFFIX, CASE-DOCSTRING, EXPECTATION, ARGS list
call `ert-deftest' with SUITE--SUFFIX name, CASE-DOCSTRING,
and `should' that checks whether EXPECTATION is consistent
with result of FUNCTION applied to ARGS using PREDICATE or `equal'.

\(fn SUITE (FUNCTION [PREDICATE])  \
  [SUITE-DOCSTRING] \
  (SUFFIX CASE-DOCSTRING EXPECTATION ARGS...)...)"
  (declare (debug (&define [&name "test@" symbolp] sexp
			   [&optional stringp] def-body))
	   (indent 2)
	   (doc-string 3))
  (let* ((func (car func-predicate))
	 (predicate (or (cadr func-predicate) (symbol-function 'equal)))
	 (prefix (symbol-name prefix-sym))
	 (maybe-doc (car doc-cases))
	 (cases (if (stringp maybe-doc) (cdr doc-cases) doc-cases))
	 (case-list (mapcar
		     (lambda (case)
		       (format "- `%s--%s'"
			       prefix
			       (symbol-name (car case))))
		     cases))
	 ;; Unfortunately `ert-describe-test' works only in ert mode
	 ;; and links to particular subtests are inactive.
	 (suite-doc (if (stringp maybe-doc)
			(cons maybe-doc case-list)
		      case-list)))
    (append
     `(,#'progn
	;; A function to assign a doc string that is linked from each test.
	(defun ,prefix-sym () ,(mapconcat #'identity suite-doc "\n")))
     (mapcar
      (lambda (case)
	;; Have not managed to express "&rest" using `pcase-let'.
	(apply (lambda (id-sym case-docstring expectation &rest args)
		 (let* ((id (symbol-name id-sym))
			(name (intern (concat prefix "--" id)))
			(docstring (format "%s (`%s')" case-docstring prefix)))
		   `(ert-deftest ,name ()
		      ,docstring
		      (should (funcall ,predicate ,expectation
				     (apply ,func (quote ,args)))))))
	       case))
      cases))))

(defun demo-test-parse-input (text)
  (with-temp-buffer
    (insert text)
    (org-mode)
    (test-org-element-parser-generate-syntax-sexp)))

(defmacro demo-parse-test-suite (file-name)
  (let* ((prefix (file-name-base file-name))
	 (prefix-sym (intern (concat "demo-parse-test/" prefix)))
	 (suite (with-temp-buffer
		  (insert-file-contents file-name)
		  (read (current-buffer))))
	 (case-list (plist-get suite :cases))
	 (suite-description (plist-get suite :description)))
    `(nm-deftest-parametrized
	 ,prefix-sym (#'demo-test-parse-input)
       ,suite-description
       ,@(mapcar (lambda (case)
		  (list (plist-get case :id)
			(or (plist-get case :description) "Warning: no description for this case")
			`(quote ,(plist-get case :result))
			(plist-get case :input)))
		case-list))))

[-- Attachment #4: demo-test-suite.json --]
[-- Type: application/json, Size: 2620 bytes --]

[-- Attachment #5: demo-test-suite.yaml --]
[-- Type: application/x-yaml, Size: 1675 bytes --]


end of thread, other threads:[~2021-12-19  8:24 UTC | newest]

Thread overview: 6+ messages
2021-12-11 14:39 [PATCH] org-test: Create a collaborative test set for Org buffer parser Ihor Radchenko
2021-12-14 16:16 ` Max Nikulin
2021-12-14 20:27   ` Tim Cross
2021-12-15 13:44     ` Ihor Radchenko
2021-12-15 13:23   ` Ihor Radchenko
2021-12-19  8:23     ` Max Nikulin

Code repositories for project(s) associated with this inbox:

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git
