emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: Max Nikulin <manikulin@gmail.com>
To: emacs-orgmode@gnu.org
Subject: Re: [PATCH v2] org.el: Fix percent substitutions in `org-open-file'
Date: Sun, 4 Sep 2022 19:16:47 +0700	[thread overview]
Message-ID: <tf24vg$43p$1@ciao.gmane.io> (raw)
In-Reply-To: <87o7vw983m.fsf@localhost>

[-- Attachment #1: Type: text/plain, Size: 2269 bytes --]

On 03/09/2022 15:26, Ihor Radchenko wrote:
> Max Nikulin writes:
>> The only problems
>> is that adding entries programmatically breaks easy customization
>> interface for `org-file-apps'. Currently there is the same issue with
>> `org-link-parameters' that is a defcustom variables with entries added
>> by various org extensions.
> 
> We may alter the :type specifier in `org-file-apps' to something like
> 
> :type `(repeat
>           (choice
>             ,org-file-apps-presets
>             ...))
> 
> `org-file-apps-presets' will then contain :type specs for some common
> file types and the apps. The variable can be easily populated by
> third-party packages as well.

I mean another problem. A user adds (require 'some-package) that defines 
new link type. Later the user invokes easy customization interface to 
adjust a link unrelated to some-package. At this step
links from some-package are added to user's init file. Ideally it should 
not happen and after removing of (require 'some-package) there should be 
no link types from the package in the init file at all, including 
customization section.

 From my point of view entries added to `org-link-parameters', 
`org-file-apps', etc. by packages should not affect entries managed 
through customization.

>>        I added the example with hope to better explain the purpose of this
>>        function.
> 
> I do not think that it is needed in the `org--open-file-format-command'
> docstring. If you wish, it would be better to add examples of the
> arguments and return values of `org--open-file-format-command' instead
> of showing `org-file-apps' examples.

I have removed most parts of the example. In addition I fixed some 
issues with references in the docstring (man mailcap is not recognized 
as a link for some reason yet). I changed the function to consider 
trailing backslash as a format error.

See the following message for my new idea concerning providing 
configuration for PDF viewers. Single elisp function should be more 
robust than 6 shell commands and it is better to add code than instructions.

Max Nikulin. Re: Org mode links: Open a PDF file at a given page and 
highlight a given string. Sat, 3 Sep 2022 20:00:47 +0700. 
https://list.orgmode.org/tevj61$17d8$1@ciao.gmane.io

[-- Attachment #2: v2-0001-org.el-Fix-percent-substitutions-in-org-open-file.patch --]
[-- Type: text/x-patch, Size: 15057 bytes --]

From 5c7a38b1d61c44ee4414f01c94abc841bd203191 Mon Sep 17 00:00:00 2001
From: Max Nikulin <manikulin@gmail.com>
Date: Sat, 27 Aug 2022 23:51:24 +0700
Subject: [PATCH v2] org.el: Fix percent substitutions in `org-open-file'

* lisp/org.el (org--open-file-format-command): New function with better
coverage of mailcap RFC 1524 syntax.  Do not replace percent character
in file name or link component, fix substitution of multiple regular
expression groups matched in the link target.
(org-open-file): Use `org--open-file-format-command' instead of inline
code.
* testing/lisp/test-org.el (org-test/org--open-file-format-command):
Tests for `org--open-file-format-command'.

The primary goal of moving code outside of `org-open-file' function is to
make it testable.

It should fix the following issues:
- Maxim Nikulin. greedy substitution in org-open-file.
  Wed, 20 Jan 2021 23:08:35 +0700.
  https://list.orgmode.org/ru9ki4$t5e$1@ciao.gmane.io
- Rodrigo Morales. Org mode links: Open a PDF file at a given page
  and highlight a given string. Tue, 02 Mar 2021 15:07:32 -0500.
  https://list.orgmode.org/87lfb5pbej.fsf@gmail.com
---
 lisp/org.el              | 103 ++++++++++++++++++++-----
 testing/lisp/test-org.el | 160 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 242 insertions(+), 21 deletions(-)

diff --git a/lisp/org.el b/lisp/org.el
index 6fafed8f1..9c440bd67 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -1299,6 +1299,17 @@ Possible values for the file identifier are:
 \"evince -p %1 %s\")
                      to open [[file:document.pdf::5]] with evince at page 5.
 
+                 Likely you will need more entries: without page
+                 number, with search pattern, with
+                 cross-reference anchor, some combination of
+                 options.  Consider simple pattern here and a
+                 lisp function to determine command line
+                 arguments instead.  Passing argument list to
+                 `call-process' or `make-process' directly allows
+                 to avoid treating some character in peculiar
+                 file names as shell specialls causing executing
+                 part of file name as a subcommand.
+
  `directory'   Matches a directory
  `remote'      Matches a remote file, accessible through tramp or efs.
                Remote files most likely should be visited through Emacs
@@ -8015,6 +8026,75 @@ opened in Emacs."
    (when add-auto-mode
      (mapcar (lambda (x) (cons (car x) 'emacs)) auto-mode-alist))))
 
+(defun org--open-file-format-command
+    (mailcap-command file link match-data)
+  "Format MAILCAP-COMMAND to launch viewer for the FILE.
+
+MAILCAP-COMMAND may be an entry from the `org-file-apps' list or
+viewer field from mailcap file loaded to `mailcap-mime-data'. See
+\"RFC 1524. A User Agent Configuration Mechanism For Multimedia
+Mail Format Information\" (URL
+`https://www.rfc-editor.org/rfc/rfc1524.html') for details, man
+page `mailcap(5)' for brief summary, and Info node `(emacs-mime)
+mailcap' for specific related to Emacs.  Only a part of mailcap
+specification is supported.
+
+The following substitutions are interpolated in the
+MAILCAP-COMMAND string:
+
+- \"%s\" to FILE name passed through
+  `convert-standard-filename', so it must be absolute path,
+
+- \"%1\" to \"%9\" groups from MATCH-DATA found in the LINK
+  string by the regular expression in the key part of the `org-file-apps' entry.
+  (performed by caller).  Not recommended, consider a lisp
+  function instead of a shell command.  For example the following
+  link in an Org file
+
+       <file:///usr/share/doc/bash/bashref.pdf::#Redirections::allocate a file>
+
+   may be handled by an `org-file-apps' entry like
+
+       (\"\\\\.pdf\\\\(?:\\\\.[gx]z\\\\|\\\\.bz2\\\\)?::\\\\(#[^:]+\\\\)::\\\\(.+\\\\)\\\\\\='\"
+        . \"okular --find %2 %s%1\")
+
+Use backslash \"\\\" to quote percent \"%\" or any other character
+including backslash itself.
+
+In addition, each argument is passed through
+`shell-quote-argument', so quotes around substitutions should not
+be used.  For compliance with mailcap files shipped e.g. in
+Debian GNU/Linux, single or double quotes around substitutions
+are stripped. It deviates from mailcap specification that
+requires file name to be safe for shell and for the application."
+  (let ((spec (list (cons ?s  (convert-standard-filename file))))
+        (ngroups (min 9 (- (/ (length match-data) 2) 1))))
+    (when (> ngroups 0)
+      (set-match-data match-data)
+      (dolist (i (number-sequence 1 ngroups))
+        (push (cons (+ ?0 i) (match-string-no-properties i link)) spec)))
+    (replace-regexp-in-string
+     (rx (or (and "\\" (or (group anything) string-end))
+             (and (optional (group (any "'\"")))
+                  "%"
+                  (or (group anything) string-end)
+                  (optional (group (backref 2))))))
+     (lambda (fmt)
+       (let* ((backslash (match-string-no-properties 1 fmt))
+              (key (match-string 3 fmt))
+              (value (and key (alist-get (string-to-char key) spec))))
+         (cond
+          (backslash)
+          (value (let ((quot (match-string 2 fmt))
+                       (subst (shell-quote-argument value)))
+                   ;; Remove quotes around the file name - we use
+                   ;; `shell-quote-argument'.
+                   (if (match-string 4 fmt)
+                       subst
+                     (concat quot subst))))
+          (t (error "Invalid format `%s'" fmt)))))
+     mailcap-command nil 'literal)))
+
 ;;;###autoload
 (defun org-open-file (path &optional in-emacs line search)
   "Open the file at PATH.
@@ -8112,27 +8192,8 @@ If the file does not exist, throw an error."
 	       (not org-open-non-existing-files))
       (user-error "No such file: %s" file))
     (cond
-     ((and (stringp cmd) (not (string-match "^\\s-*$" cmd)))
-      ;; Remove quotes around the file name - we'll use shell-quote-argument.
-      (while (string-match "['\"]%s['\"]" cmd)
-	(setq cmd (replace-match "%s" t t cmd)))
-      (setq cmd (replace-regexp-in-string
-		 "%s"
-		 (shell-quote-argument (convert-standard-filename file))
-		 cmd
-		 nil t))
-
-      ;; Replace "%1", "%2" etc. in command with group matches from regex
-      (save-match-data
-	(let ((match-index 1)
-	      (number-of-groups (- (/ (length link-match-data) 2) 1)))
-	  (set-match-data link-match-data)
-	  (while (<= match-index number-of-groups)
-	    (let ((regex (concat "%" (number-to-string match-index)))
-		  (replace-with (match-string match-index dlink)))
-	      (while (string-match regex cmd)
-		(setq cmd (replace-match replace-with t t cmd))))
-	    (setq match-index (+ match-index 1)))))
+     ((org-string-nw-p cmd)
+      (setq cmd (org--open-file-format-command cmd file dlink link-match-data))
 
       (save-window-excursion
 	(message "Running %s...done" cmd)
diff --git a/testing/lisp/test-org.el b/testing/lisp/test-org.el
index b14cbeb26..004e89732 100644
--- a/testing/lisp/test-org.el
+++ b/testing/lisp/test-org.el
@@ -8421,6 +8421,166 @@ two
      (call-interactively #'org-paste-subtree)
      (buffer-string)))))
 
+(ert-deftest test-org/org--open-file-format-command ()
+  "Test `org--open-file-format-command' helper for `org-open-file'."
+  (let ((system-type 'gnu/linux)) ; Fix behavior of `shell-quote-argument'.
+    ;; No additional groups in `org-file-apps' key.
+    (let ((file "/file.pdf")
+          (pattern "\\.pdf\\'"))
+      (should
+       (equal "simple /file.pdf"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "simple %s" file file (match-data)))))
+      (should
+       (equal "single-quotes /file.pdf"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "single-quotes '%s'" file file (match-data)))))
+      (should
+       (equal "double-quotes /file.pdf"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "double-quotes \"%s\"" file file (match-data)))))
+      (should
+       (equal "quotes 'mismatch \"/file.pdf'"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "quotes 'mismatch \"%s'" file file (match-data)))))
+      (should
+       (equal "no subst"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "no subst" file file (match-data)))))
+      (should
+       (equal "% literal percent 100% %s"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "\\% literal percent 100\\% \\%s" file file (match-data)))))
+      (should
+       (equal "escape \"/file.pdf\" \\ more"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    ;; Second quote is not escaped.
+                    "escape \\\"%s\" \\\\ more" file file (match-data)))))
+      (should
+       (equal "/file.pdf file at start"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "%s file at start" file file (match-data)))))
+      (should
+       (equal "backslash-newline\n/file.pdf"
+              (and (string-match pattern file)
+                   (org--open-file-format-command
+                    "backslash-newline\\\n%s" file file (match-data))))))
+    ;; Anchors within target file.
+    (let ((file "/page-search.pdf")
+          (link "/page-search.pdf::10::some words")
+          (pattern "\\.pdf::\\([0-9]+\\)::\\(.*\\)\\'"))
+      (should
+       (equal "zathura --page 10 --find some\\ words /page-search.pdf"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "zathura --page '%1' --find %2 \"%s\"" file link (match-data)))))
+      ;; Unused %2.
+      (should
+       (equal "firefox file:///page-search.pdf\\#page=10"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "firefox file://%s\\\\#page=%1" file link (match-data)))))
+      (should
+       (equal "adjucent-subst /page-search.pdfsome\\ words10some\\ words"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "adjucent-subst %s%2'%1'%2" file link (match-data))))))
+    ;; No more than 9 substitutions are supported.
+    (let ((file "/many.pdf")
+          (link "/many.pdf::one:2:3:4:5:6:7:8:9:a:b:c")
+          (pattern (concat "\\.pdf:"
+                           (mapconcat (lambda (_) ":\\([^:]+\\)")
+                                      (number-sequence 1 12)
+                                      "")
+                           "\\'")))
+      (should
+       (equal "overflow /many.pdf::one:2:3:4:5:6:7:8:9:one0:one1:one2"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "overflow %s::%1:%2:%3:%4:%5:%6:%7:%8:%9:%10:%11:%12"
+                    file link (match-data))))))
+    ;; Percent character in link fields does not cause any problem.
+    (let ((file "/file-%2.pdf")
+          (link "/file-%2.pdf::anchor-%3::search %1")
+          (pattern "\\.pdf::\\([^:]+\\)::\\(.+\\)\\'"))
+      (should
+       (equal "percents --find search\\ \\%1 file:///file-\\%2.pdf\\#anchor-\\%3"
+              (and (string-match pattern link)
+                   (org--open-file-format-command
+                    "percents --find %2 file://%s\\\\#%1"
+                    file link (match-data))))))
+    ;; Errors.
+    (let ((file "/error.pdf")
+          (pattern "\\.pdf\\'"))
+      (let* ((err (should-error
+                   (and (string-match pattern file)
+                        (org--open-file-format-command
+                         "trailing-percent %s %" file file (match-data)))
+                   :type 'error))
+             (err-text (cadr err)))
+        (should-not (unless (and (stringp err-text)
+                                 (string-match-p "\\`Invalid format .*%" err-text))
+                      err)))
+      (let* ((err (should-error
+                   (and (string-match pattern file)
+                        (org--open-file-format-command
+                         "trailing-backslash %s \\" file file (match-data)))
+                   :type 'error))
+             (err-text (cadr err)))
+        (should-not (unless (and (stringp err-text)
+                                 (string-match-p "\\`Invalid format .*\\\\" err-text))
+                      err)))
+      (let* ((err (should-error
+                   (and (string-match pattern file)
+                        (org--open-file-format-command
+                         "percent-newline %\n%s" file file (match-data)))
+                   :type 'error))
+             (err-text (cadr err)))
+        (should-not (unless (and (stringp err-text)
+                                 (string-match-p "\\`Invalid format .*%\n" err-text))
+                      err)))
+      ;; Mailcap escape for "%" is "\%", not "%%".
+      (let* ((err (should-error
+                   (and (string-match pattern file)
+                        (org--open-file-format-command
+                         "percent-percent %s%%" file file (match-data)))
+                   :type 'error))
+             (err-text (cadr err)))
+        (should-not (unless (and (stringp err-text)
+                                 (string-match-p "\\`Invalid format .*%%" err-text))
+                      err)))
+      ;; Mailcap allows "%t" for MIME type, but Org has no such information.
+      (let* ((err (should-error
+                   (and (string-match pattern file)
+                        (org--open-file-format-command
+                         "percent-t-unsupported --type '%t' %s" file file (match-data)))
+                   :type 'error))
+             (err-text (cadr err)))
+        (should-not (unless (and (stringp err-text)
+                                 (string-match-p "\\`Invalid format .*%t" err-text))
+                      err))))
+    ;; Optional regular expression groups have no point in `org-file-apps' patterns.
+    (let* ((file "/error.pdf")
+           (link "/error.pdf::1")
+           (pattern "\\.pdf::\\([^:]+\\)\\(?:::\\(.+\\)\\)?\\'")
+           (err (should-error
+                 (and (string-match pattern link)
+                      (org--open-file-format-command
+                       "no-such-match --search %2 %s" file link (match-data)))
+                 :type 'error))
+           (err-text (cadr err)))
+      (should-not (unless (and (stringp err-text)
+                               (string-match-p "\\`Invalid format.*%2" err-text))
+                    err)))))
+
 (provide 'test-org)
 
 ;;; test-org.el ends here
-- 
2.25.1


  reply	other threads:[~2022-09-04 12:33 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-20 16:08 greedy substitution in org-open-file Maxim Nikulin
2021-02-12  7:16 ` Kyle Meyer
2021-02-12 16:46   ` Maxim Nikulin
2021-02-13  4:38     ` Kyle Meyer
2021-02-15 17:04       ` Maxim Nikulin
2021-03-03 12:47       ` Maxim Nikulin
2021-03-21 12:36       ` Maxim Nikulin
2022-08-27 17:20         ` [PATCH] org.el: Fix percent substitutions in `org-open-file' Max Nikulin
2022-09-02 12:08           ` Ihor Radchenko
2022-09-02 15:41             ` Max Nikulin
2022-09-03  8:26               ` Ihor Radchenko
2022-09-04 12:16                 ` Max Nikulin [this message]
2022-09-05  5:46                   ` [PATCH v2] " Ihor Radchenko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='tf24vg$43p$1@ciao.gmane.io' \
    --to=manikulin@gmail.com \
    --cc=emacs-orgmode@gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).