emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: Max Nikulin <manikulin@gmail.com>
To: emacs-orgmode@gnu.org
Cc: Kyle Meyer <kyle@kyleam.com>
Subject: [PATCH] org.el: Fix percent substitutions in `org-open-file'
Date: Sun, 28 Aug 2022 00:20:02 +0700	[thread overview]
Message-ID: <dc30fb1c-f947-2797-5cc9-756094cf815c@gmail.com> (raw)
In-Reply-To: <s37elb$m2v$1@ciao.gmane.io>

[-- Attachment #1: Type: text/plain, Size: 1245 bytes --]

On 21/03/2021 19:36, Maxim Nikulin wrote:
> On 13/02/2021 11:38, Kyle Meyer wrote:
>> +(defun org--open-file-format-spec (format specification)
>> +  (with-temp-buffer
>> +    (insert format)
>> +    (goto-char (point-min))
>> +    (while (search-forward "%" nil t)
>> +      (cond ((eq (char-after) ?%)
>> +             (delete-char 1))
>> +            ((looking-at "[s0-9]")
>> +             (replace-match
>> +              (or (cdr (assoc (match-string 0) specification))
>> +                  (error "Invalid format string"))
>> +              'fixed-case 'literal)
>> +             (delete-region (1- (match-beginning 0)) (match-beginning 
>> 0)))

It's pity that this issue has not fix yet. I have tried to implement 
another helper function. It is hardly usable outside of `org-open-file' 
(perhaps it is still suitable to make mailcap commands), but separate 
function makes the code testable. I hope, the attached patch fixes the 
problems with multiple regexp groups and with percent characters in 
replacement string. Notice that "\%", not "%%" is used to escape percent 
in mailcap, anyway neither variant was supported before.

[-- Attachment #2: 0001-org.el-Fix-percent-substitutions-in-org-open-file.patch --]
[-- Type: text/x-patch, Size: 13173 bytes --]

From 2f2850dcdc0cba8cfc6c0e21fc893c4e664d575c Mon Sep 17 00:00:00 2001
From: Max Nikulin <manikulin@gmail.com>
Date: Sat, 27 Aug 2022 23:51:24 +0700
Subject: [PATCH] org.el: Fix percent substitutions in `org-open-file'

* lisp/org.el (org--open-file-format-command): New function with better
coverage of mailcap RFC 1524 syntax.  Do not replace percent character
in file name or link component, fix substitution of multiple regular
expression groups matched in the link target.
(org-open-file): Use `org--open-file-format-command' instead of inline
code.
* testing/lisp/test-org.el (org-test/org--open-file-format-command):
Tests for `org--open-file-format-command'.

The primary goal of moving code outside of `org-open-file' function is to
make it testable.

It should fix the following issues:
- Maxim Nikulin. greedy substitution in org-open-file.
  Wed, 20 Jan 2021 23:08:35 +0700.
  https://list.orgmode.org/ru9ki4$t5e$1@ciao.gmane.io
- Rodrigo Morales. Org mode links: Open a PDF file at a given page
  and highlight a given string. Tue, 02 Mar 2021 15:07:32 -0500.
  https://list.orgmode.org/87lfb5pbej.fsf@gmail.com
---
 lisp/org.el              |  95 +++++++++++++++++++++-------
 testing/lisp/test-org.el | 132 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 206 insertions(+), 21 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 858123e67..e94ec02d9 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -8012,6 +8012,78 @@ opened in Emacs."
    (when add-auto-mode
      (mapcar (lambda (x) (cons (car x) 'emacs)) auto-mode-alist))))
 
+(defun org--open-file-format-command
+    (mailcap-command file link match-data)
+  "Format MAILCAP-COMMAND to launch viewer for FILE.
+
+Command may be an entry from the `org-file-apps' list or from
+mailcap file, for details see RFC 1524 A User Agent Configuration
+Mechanism For Multimedia Mail Format Information
+<https://tools.ietf.org/html/rfc1524>, man mailcap(5), and Info
+node `(emacs-mime) mailcap'.  Only a part of mailcap specification
+is supported.
+
+The following substitutions are interpolated in the MAILCAP-COMMAND
+string:
+- \"%s\" to FILE name passed through `convert-standard-file-name',
+  so it must be absolute path,
+- \"%1\" to \"%9\" groups from MATCH-DATA found in the LINK string
+  by the regular expression in the `org-file-apps' entry key
+  (performed by caller).
+
+Use backslash \"\\\" to quote percent \"%\" or any other character
+including backslash itself.
+
+In addition, each argument is passed through `shell-quote-argument',
+so quotes around substitutions should not be used.  For compliance
+with mailcap files single or double quotes around substitutions
+are stripped.
+
+For example, to specify particular location withing a PDF file,
+`org-file-apps' list may have the following entries (order is important):
+
+    ;; Page and search string,
+    ;; e.g. <file:///usr/share/doc/bash/bashref.pdf::34::order of redirections>.
+    (\"\\\\.pdf::\\\\([0-9]+\\\\)::\\\\(.+\\\\)\\\\\\='\"
+        . \"okular --page %1 --find %2 %s\")
+    ;; Internal anchor and search string,
+    ;; e.g. <file:///usr/share/doc/bash/bashref.pdf::Redirections::allocate a file>.
+    (\"\\\\.pdf::\\\\(.+\\\\)::\\\\(.+\\\\)\\\\\\='\"
+        . \"okular --find %2 file://%s\\\\\\\\#%1\")
+    ;; Page number, e.g. <file:///usr/share/doc/bash/bashref.pdf::34>.
+    (\"\\\\.pdf::\\\\([0-9]+\\\\)\\\\\\='\" . \"okular --page %1 %s\")
+    ;; Internal reference, e.g. <file:///usr/share/doc/bash/bashref.pdf::Redirections>.
+    (\"\\\\.pdf::\\\\(.+\\\\)\\\\\\='\" . \"okular file://%s\\\\\\\\#%1\")
+    ;; No location within the file, optionally followed by \"::\",
+    ;; e.g. <file:///usr/share/doc/bash/bashref.pdf>.
+    (\"\\\\.pdf\\\\(?:::\\\\)?\\\\\\='\" . \"okular %s\")
+
+Side note: perhaps you would prefer to read the same bash manual used in the
+example as Info node `(bash) Redirections' (Org link: <info:bash#Redirections>)."
+  (let ((spec (list (cons ?s  (convert-standard-filename file))))
+        (ngroups (min 9 (- (/ (length match-data) 2) 1))))
+    (when (> ngroups 0)
+      (set-match-data match-data)
+      (dolist (i (number-sequence 1 ngroups))
+        (push (cons (+ ?0 i) (match-string-no-properties i link)) spec)))
+    (replace-regexp-in-string
+     "\\\\\\(.\\)\\|\\(['\"]\\)?%\\(?:\\(.\\)\\(\\2\\)?\\)?"
+     (lambda (fmt)
+       (let ((backslash (match-string-no-properties 1 fmt)))
+         (or backslash
+             (let* ((quot (match-string 2 fmt))
+                    (percent (alist-get (string-to-char
+                                         (or (match-string 3 fmt) ""))
+                                        spec))
+                    (value (and percent (shell-quote-argument percent))))
+               (unless value
+                 (error "Invalid format `%s'" fmt))
+               ;; Remove quotes around the file name - we use `shell-quote-argument'.
+               (if (and quot (string-equal quot (match-string 4 fmt)))
+                   value
+                 (concat quot value))))))
+     mailcap-command nil 'literal)))
+
 ;;;###autoload
 (defun org-open-file (path &optional in-emacs line search)
   "Open the file at PATH.
@@ -8109,27 +8181,8 @@ If the file does not exist, throw an error."
 	       (not org-open-non-existing-files))
       (user-error "No such file: %s" file))
     (cond
-     ((and (stringp cmd) (not (string-match "^\\s-*$" cmd)))
-      ;; Remove quotes around the file name - we'll use shell-quote-argument.
-      (while (string-match "['\"]%s['\"]" cmd)
-	(setq cmd (replace-match "%s" t t cmd)))
-      (setq cmd (replace-regexp-in-string
-		 "%s"
-		 (shell-quote-argument (convert-standard-filename file))
-		 cmd
-		 nil t))
-
-      ;; Replace "%1", "%2" etc. in command with group matches from regex
-      (save-match-data
-	(let ((match-index 1)
-	      (number-of-groups (- (/ (length link-match-data) 2) 1)))
-	  (set-match-data link-match-data)
-	  (while (<= match-index number-of-groups)
-	    (let ((regex (concat "%" (number-to-string match-index)))
-		  (replace-with (match-string match-index dlink)))
-	      (while (string-match regex cmd)
-		(setq cmd (replace-match replace-with t t cmd))))
-	    (setq match-index (+ match-index 1)))))
+     ((org-string-nw-p cmd)
+      (setq cmd (org--open-file-format-command cmd file dlink link-match-data))
 
       (save-window-excursion
 	(message "Running %s...done" cmd)
diff --git a/testing/lisp/test-org.el b/testing/lisp/test-org.el
index b14cbeb26..afae24aca 100644
--- a/testing/lisp/test-org.el
+++ b/testing/lisp/test-org.el
@@ -8421,6 +8421,138 @@ two
      (call-interactively #'org-paste-subtree)
      (buffer-string)))))
 
+(ert-deftest test-org/org--open-file-format-command ()
+  "Test `org--open-file-format-command' helper for `org-open-file'."
+  (let ((system-type 'gnu/linux)) ; Fix behavior of `shell-quote-argument'.
+    ;; No additional groups in `org-file-apps' key.
+    (let ((file "/file.pdf")
+          (pattern "\\.pdf\\'"))
+      (should
+       (equal "simple /file.pdf"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "simple %s" file file (match-data)))))
+      (should
+       (equal "single-quotes /file.pdf"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "single-quotes '%s'" file file (match-data)))))
+      (should
+       (equal "double-quotes /file.pdf"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "double-quotes \"%s\"" file file (match-data)))))
+      (should
+       (equal "no subst"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "no subst" file file (match-data)))))
+      (should
+       (equal "% literal percent 100% %s"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "\\% literal percent 100\\% \\%s" file file (match-data)))))
+      (should
+       (equal "escape \"/file.pdf\" \\ more"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    ;; Second quote is not escaped.
+                    "escape \\\"%s\" \\\\ more" file file (match-data)))))
+      (should
+       (equal "/file.pdf file at start"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "%s file at start" file file (match-data))))))
+    ;; Anchors within target file.
+    (let ((file "/page-search.pdf")
+          (link "/page-search.pdf::10::some words")
+          (pattern "\\.pdf::\\([0-9]+\\)::\\(.*\\)\\'"))
+      (should
+       (equal "zathura --page 10 --find some\\ words /page-search.pdf"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "zathura --page '%1' --find %2 \"%s\"" file link (match-data)))))
+      ;; Unused %2.
+      (should
+       (equal "firefox file:///page-search.pdf\\#page=10"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "firefox file://%s\\\\#page=%1" file link (match-data)))))
+      (should
+       (equal "adjucent-subst /page-search.pdfsome\\ words10some\\ words"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "adjucent-subst %s%2'%1'%2" file link (match-data))))))
+    ;; No more than 9 substitutions are supported.
+    (let ((file "/many.pdf")
+          (link "/many.pdf::one:2:3:4:5:6:7:8:9:a:b:c")
+          (pattern (concat "\\.pdf:"
+                           (mapconcat (lambda (_) ":\\([^:]+\\)")
+                                      (number-sequence 1 12)
+                                      "")
+                           "\\'")))
+      (should
+       (equal "overflow /many.pdf::one:2:3:4:5:6:7:8:9:one0:one1:one2"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "overflow %s::%1:%2:%3:%4:%5:%6:%7:%8:%9:%10:%11:%12"
+                    file link (match-data))))))
+    ;; Percent character in link fields does not cause any problem.
+    (let ((file "/file-%2.pdf")
+          (link "/file-%2.pdf::anchor-%3::search %1")
+          (pattern "\\.pdf::\\([^:]+\\)::\\(.+\\)\\'"))
+      (should
+       (equal "percents --find search\\ \\%1 file:///file-\\%2.pdf\\#anchor-\\%3"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "percents --find %2 file://%s\\\\#%1"
+                    file link (match-data))))))
+    ;; Errors.
+    (let ((file "/error.pdf")
+          (pattern "\\.pdf\\'"))
+      (let* ((err (should-error
+                   (and (string-match pattern file)
+                        (org--open-file-format-command
+                         "trailing-percent %s %" file file (match-data)))
+                   :type 'error))
+             (err-text (cadr err)))
+        (should-not (unless (and (stringp err-text)
+                                 (string-match-p "\\`Invalid format .*%" err-text))
+                      err)))
+      ;; Mailcap escape for "%" is "\%", not "%%".
+      (let* ((err (should-error
+                   (and (string-match pattern file)
+                        (org--open-file-format-command
+                         "percent-percent %s%%" file file (match-data)))
+                   :type 'error))
+             (err-text (cadr err)))
+        (should-not (unless (and (stringp err-text)
+                                 (string-match-p "\\`Invalid format .*%%" err-text))
+                      err)))
+      ;; Mailcap allows "%t" for MIME type, but Org has no such information.
+      (let* ((err (should-error
+                   (and (string-match pattern file)
+                        (org--open-file-format-command
+                         "percent-t-unsupported --type '%t' %s" file file (match-data)))
+                   :type 'error))
+             (err-text (cadr err)))
+        (should-not (unless (and (stringp err-text)
+                                 (string-match-p "\\`Invalid format .*%t" err-text))
+                      err))))
+    ;; Optional regular expression groups have no point in `org-file-apps' patterns.
+    (let* ((file "/error.pdf")
+           (link "/error.pdf::1")
+           (pattern "\\.pdf::\\([^:]+\\)\\(?:::\\(.+\\)\\)?\\'")
+           (err (should-error
+                 (and (string-match pattern link)
+                      (org--open-file-format-command
+                       "no-such-match --search %2 %s" file link (match-data)))
+                 :type 'error))
+           (err-text (cadr err)))
+      (should-not (unless (and (stringp err-text)
+                               (string-match-p "\\`Invalid format.*%2" err-text))
+                    err)))))
+
 (provide 'test-org)
 
 ;;; test-org.el ends here
-- 
2.25.1


  reply	other threads:[~2022-08-27 17:21 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-20 16:08 greedy substitution in org-open-file Maxim Nikulin
2021-02-12  7:16 ` Kyle Meyer
2021-02-12 16:46   ` Maxim Nikulin
2021-02-13  4:38     ` Kyle Meyer
2021-02-15 17:04       ` Maxim Nikulin
2021-03-03 12:47       ` Maxim Nikulin
2021-03-21 12:36       ` Maxim Nikulin
2022-08-27 17:20         ` Max Nikulin [this message]
2022-09-02 12:08           ` [PATCH] org.el: Fix percent substitutions in `org-open-file' Ihor Radchenko
2022-09-02 15:41             ` Max Nikulin
2022-09-03  8:26               ` Ihor Radchenko
2022-09-04 12:16                 ` [PATCH v2] " Max Nikulin
2022-09-05  5:46                   ` Ihor Radchenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dc30fb1c-f947-2797-5cc9-756094cf815c@gmail.com \
    --to=manikulin@gmail.com \
    --cc=emacs-orgmode@gnu.org \
    --cc=kyle@kyleam.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).