emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: Maxim Nikulin <manikulin@gmail.com>
To: emacs-orgmode@gnu.org
Subject: Re: [Patch] to correctly sort the items with emphasis marks in a list
Date: Mon, 19 Apr 2021 19:34:42 +0700	[thread overview]
Message-ID: <s5jtd3$qs6$1@ciao.gmane.io> (raw)
In-Reply-To: <87tuo2d670.fsf@nicolasgoaziou.fr>

[-- Attachment #1: Type: text/plain, Size: 1989 bytes --]

On 19/04/2021 15:33, Nicolas Goaziou wrote:
> 
> Could you try the following instead?
> 
> --8<---------------cut here---------------start------------->8---
> (defun org-sort-remove-invisible (s)
>    "Remove invisible part of links and emphasis markers from string S."
>    (let ((remove-markers
>           (lambda (m)

Just a curiosity, what is style guide recommendation: let - lambda or 
cl-labels?

>             (concat (match-string 1 m)
>                     (match-string 4 m)
>                     (match-string 5 m)))))
>      (remove-text-properties 0 (length s) org-rm-props s)
>      (replace-regexp-in-string
>       org-verbatim-re remove-markers
>       (replace-regexp-in-string org-emph-re remove-markers (org-link-display-format s) t tt)

Single "t" is at the end, isn't it?

>       t t)))

I think, the patch is an improvement.

There is still a minor shortcoming that ordering of emphasized and 
non-emphasized words is undefined
- /a/
- *a*
- a
Let's leave it aside since it requires multilevel comparison similar to 
collation in locales and more strict definition which part of string is 
compared at each level:
- /a/ :: A
- /a/ :: Z
- a :: A
- a :: Z

In my opinion, a more severe limitation comes from sequential 
regexp-based approach. Consider stripping markers from
1. "a =b *c* d= e"
2. "*b* /i/"
First case could be solved by splitting the input string by verbatim 
regexp at first and by applying emphasis substitution only to 
non-verbatim parts. However I am rather shy to experiment with regexps 
definition to avoid including chars before and after emphasis markers. 
It would be great to get role of particular characters from more 
reliable parser (or at least from text properties affected by 
fontification).

I have some tests for `org-sort-remove-invisible', see the attachment. 
Actually I do not like such style of tests, I strongly prefer to see all 
failures at once, but I have not found if such technique is used 
anywhere in org tests.

[-- Attachment #2: 0001-More-tests-for-org-sort-list.patch --]
[-- Type: text/x-patch, Size: 5026 bytes --]

From 6719386aa5a95394a4cb98ce28b889f545620bd9 Mon Sep 17 00:00:00 2001
From: Max Nikulin <manikulin@gmail.com>
Date: Mon, 19 Apr 2021 19:06:36 +0700
Subject: [PATCH] More tests for `org-sort-list'

* lisp/org.el (org-verbatim-re): Add to the docstring a statement
concerning subgroups meaning.
* testing/lisp/test-org-list.el (test-org-list/sort): Add cases with
text emphasis.  Stress in comments that it is significant whether
locale-specific collation rules ignores spaces.
* testing/lisp/test-org.el (test-org/org-sort-remove-invisible): Add
tests for `org-sort-list' helper.

Add a test with expected failures to make apparent limitation of simple
regexp-based approach in `org-sort-remove-invisible' function.
---
 lisp/org.el                   |  3 ++-
 testing/lisp/test-org-list.el | 34 +++++++++++++++++++++++++++++++++-
 testing/lisp/test-org.el      | 27 +++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index c2738100f..3d4f5553a 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -3685,7 +3685,8 @@ After a match, the match groups contain these elements:
 5  The character after the match, empty at the end of a line")
 
 (defvar org-verbatim-re nil
-  "Regular expression for matching verbatim text.")
+  "Regular expression for matching verbatim text.
+Groups 1-5 have the same meaning as in `org-emph-re' pattern.")
 
 (defvar org-emphasis-regexp-components) ; defined just below
 (defvar org-emphasis-alist) ; defined just below
diff --git a/testing/lisp/test-org-list.el b/testing/lisp/test-org-list.el
index 3689a172f..cc7914c8d 100644
--- a/testing/lisp/test-org-list.el
+++ b/testing/lisp/test-org-list.el
@@ -1225,7 +1225,39 @@ b. Item 2<point>"
        (equal "- b\n- a\n- C\n"
 	      (org-test-with-temp-text "- b\n- C\n- a\n"
 		(org-sort-list t ?A)
-		(buffer-string))))))
+		(buffer-string))))
+      ;; Emphasis in the beginning.
+      (should
+       (equal "- a\n- /z/\n"
+              (org-test-with-temp-text "- /z/\n- a\n"
+                (org-sort-list t ?a)
+                (buffer-string))))
+      (should
+       (equal "- *a*\n- z\n"
+              (org-test-with-temp-text "- z\n- *a*\n"
+                (org-sort-list t ?a)
+                (buffer-string))))
+      ;; Emphasis of second word.
+      ;; Locales with significant spaces (C, es_ES, pl_PL)
+      ;; are more sensitive to problems with sort.
+      (should
+       (equal "- a b\n- a /z/\n"
+              (org-test-with-temp-text "- a /z/\n- a b\n"
+                (org-sort-list t ?a)
+                (buffer-string))))
+      (should
+       (equal "- a *b*\n- a z\n"
+              (org-test-with-temp-text "- a z\n- a *b*\n"
+                (org-sort-list t ?a)
+                (buffer-string))))
+      ;; Space role in sorting.
+      ;; Test would fail for locales with ignored space, e.g. en_US, it works
+      ;; in C and currently rare locales having significant space (es_ES, pl_PL)
+      (should
+       (equal "- Time stamp\n- Timer\n"
+              (org-test-with-temp-text "- Timer\n- Time stamp\n"
+                (org-sort-list t ?a)
+                (buffer-string))))))
   ;; Sort numerically.
   (should
    (equal "- 1\n- 2\n- 10\n"
diff --git a/testing/lisp/test-org.el b/testing/lisp/test-org.el
index f6fb4b3ca..307b30265 100644
--- a/testing/lisp/test-org.el
+++ b/testing/lisp/test-org.el
@@ -8262,4 +8262,31 @@ two
 
 (provide 'test-org)
 
+(ert-deftest test-org/org-sort-remove-invisible ()
+  "Empasis markers are stripped by `org-sort-remove-invisible'."
+  (dolist (case '(("Test *bold* text" . "Test bold text")
+                  ("Test /italic/ text" . "Test italic text")
+                  ("Test _underlined_ text" . "Test underlined text")
+                  ("Test +strike through+ text" . "Test strike through text")
+                  ("Test =verbatim= text" . "Test verbatim text")
+                  ("Test ~code~ text" . "Test code text")
+                  ("*Beginning* of a line" . "Beginning of a line")
+                  ("End =of line=" . "End of line")
+                  ("Braces (*around*)" . "Braces (around)")
+                  ("Multiple *emphasis* options /in the/ same *line*" .
+                   "Multiple emphasis options in the same line")))
+    (let ((test-string (car case))
+          (expectation (cdr case)))
+      (should (equal expectation (org-sort-remove-invisible test-string))))))
+
+(ert-deftest test-org/org-sort-remove-invisible-failures ()
+  "Examples of `org-sort-remove-invisible` failures."
+  :expected-result :failed
+  (dolist (case '(("Lost =in *verbatim* text= fragment" .
+                   "Lost in *verbatim* text fragment")
+                  ("Adjucent emphasis: *Overlapped* /regexps/" .
+                   "Adjucent emphasis: Ovelapped regexps")))
+    (let ((test-string (car case))
+          (expectation (cdr case)))
+      (should (equal expectation (org-sort-remove-invisible test-string))))))
 ;;; test-org.el ends here
-- 
2.25.1


  reply	other threads:[~2021-04-19 12:38 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-02 18:15 [Patch] to correctly sort the items with emphasis marks in a list Juan Manuel Macías
2021-04-09 22:28 ` Nicolas Goaziou
2021-04-10  0:01   ` Juan Manuel Macías
2021-04-10 10:19     ` Nicolas Goaziou
2021-04-10 11:41       ` Juan Manuel Macías
2021-04-13 17:31         ` Maxim Nikulin
2021-04-13 19:08           ` Juan Manuel Macías
2021-04-14 15:42             ` Maxim Nikulin
2021-04-14 15:51               ` Maxim Nikulin
2021-04-14 17:07               ` Juan Manuel Macías
2021-04-14 21:36                 ` Juan Manuel Macías
2021-04-15 14:58                 ` Maxim Nikulin
2021-04-15 18:21                   ` Juan Manuel Macías
2021-04-16 14:59                     ` Maxim Nikulin
2021-04-16 15:30                       ` Maxim Nikulin
2021-04-17 13:27     ` Maxim Nikulin
2021-04-18 17:52       ` Juan Manuel Macías
2021-04-18 21:20         ` Juan Manuel Macías
2021-04-19  8:33           ` Nicolas Goaziou
2021-04-19 12:34             ` Maxim Nikulin [this message]
2021-04-19 16:08               ` Nicolas Goaziou
2021-04-19 17:00                 ` Greg Minshall
2021-04-19 17:17                   ` Tom Gillespie
2021-04-19 18:00                     ` Greg Minshall
2021-04-19 17:36                 ` Maxim Nikulin
2021-04-19 17:50                   ` Nicolas Goaziou
2021-04-20 12:37                     ` Maxim Nikulin
2021-04-20 12:20                 ` Maxim Nikulin
2021-04-20 13:57                   ` Nicolas Goaziou
2021-04-20 15:51                     ` Maxim Nikulin
2021-04-20 20:37                       ` Nicolas Goaziou
2021-04-21 13:10                         ` Maxim Nikulin
2021-04-21 15:45                           ` Juan Manuel Macías
2021-04-24 14:41                             ` Maxim Nikulin
2021-05-20 17:06                           ` [Patch] tests for org-remove-invisible Maxim Nikulin
2021-05-20 18:06                             ` Nicolas Goaziou
2021-09-27 16:53                               ` Max Nikulin
2021-11-25 12:11                           ` [Patch] to correctly sort the items with emphasis marks in a list Ihor Radchenko
2021-11-25 16:59                             ` Max Nikulin
2021-05-15 20:43                     ` Bastien
2021-05-15 22:09                       ` Nicolas Goaziou
2021-05-16  6:04                         ` Bastien
2021-04-28  5:46     ` Bastien
2021-04-28  6:37       ` Nicolas Goaziou
2021-04-28  6:49         ` Bastien
2021-04-28  8:04           ` Bastien
2021-05-15 13:32             ` Bastien
2021-05-15 15:10               ` Maxim Nikulin
2021-05-15 20:44                 ` Bastien
  -- strict thread matches above, loose matches on Subject: below --
2021-04-12 13:50 Juan Manuel Macías
     [not found] <mailman.57.1618243212.17744.emacs-orgmode@gnu.org>
2021-04-12 18:51 ` Ypo
2021-04-12 23:18   ` Juan Manuel Macías
2021-04-12 23:52     ` Samuel Wales
2021-04-13 14:16       ` Juan Manuel Macías

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='s5jtd3$qs6$1@ciao.gmane.io' \
    --to=manikulin@gmail.com \
    --cc=emacs-orgmode@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).