From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mp2 ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by ms11 with LMTPS id GJZ1BAarKmD9OAAA0tVLHw (envelope-from ) for ; Mon, 15 Feb 2021 17:10:30 +0000 Received: from aspmx1.migadu.com ([2001:41d0:8:6d80::]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) by mp2 with LMTPS id 8L4tAAarKmDwYQAAB5/wlQ (envelope-from ) for ; Mon, 15 Feb 2021 17:10:30 +0000 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by aspmx1.migadu.com (Postfix) with ESMTPS id 82DBF14379 for ; Mon, 15 Feb 2021 18:10:29 +0100 (CET) Received: from localhost ([::1]:58390 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lBhOG-0000qW-Ln for larch@yhetil.org; Mon, 15 Feb 2021 12:10:28 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:59236) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lBhJD-0003zx-8G for emacs-orgmode@gnu.org; Mon, 15 Feb 2021 12:05:16 -0500 Received: from ciao.gmane.io ([116.202.254.214]:52026) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lBhJ5-0006D6-II for emacs-orgmode@gnu.org; Mon, 15 Feb 2021 12:05:14 -0500 Received: from list by ciao.gmane.io with local (Exim 4.92) (envelope-from ) id 1lBhJ2-00049O-26 for emacs-orgmode@gnu.org; Mon, 15 Feb 2021 18:05:04 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: emacs-orgmode@gnu.org From: Maxim Nikulin Subject: Re: greedy substitution in org-open-file Date: Tue, 16 Feb 2021 00:04:57 +0700 Message-ID: References: <874kih92nb.fsf@kyleam.com> <87mtw8fupl.fsf@kyleam.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 In-Reply-To: <87mtw8fupl.fsf@kyleam.com> Content-Language: en-US Received-SPF: pass client-ip=116.202.254.214; envelope-from=geo-emacs-orgmode@m.gmane-mx.org; helo=ciao.gmane.io X-Spam_score_int: 28 X-Spam_score: 2.8 X-Spam_bar: ++ X-Spam_report: (2.8 / 5.0 requ) BAYES_00=-1.9, DKIM_ADSP_CUSTOM_MED=0.001, FORGED_GMAIL_RCVD=1, FORGED_MUA_MOZILLA=2.309, FREEMAIL_FORGED_FROMDOMAIN=0.249, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.249, NICE_REPLY_A=-0.001, NML_ADSP_CUSTOM_MED=0.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: emacs-orgmode@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "General discussions about Org-mode." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-orgmode-bounces+larch=yhetil.org@gnu.org Sender: "Emacs-orgmode" X-Migadu-Flow: FLOW_IN X-Migadu-Spam-Score: -1.76 Authentication-Results: aspmx1.migadu.com; dkim=none; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=gmail.com (policy=none); spf=pass (aspmx1.migadu.com: domain of emacs-orgmode-bounces@gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=emacs-orgmode-bounces@gnu.org X-Migadu-Queue-Id: 82DBF14379 X-Spam-Score: -1.76 X-Migadu-Scanner: scn1.migadu.com X-TUID: 1GsNZu7kvZzo On 13/02/2021 11:38, Kyle Meyer wrote: > > All right, here's > a format-spec-inspired fix. At the very least it needs doc updates and > a comment or two. Thank you. I am hardly familiar with elisp so it would be difficult for me to express the same. My comments are mostly a matter of taste. Sorry, I have not tried to run the code yet. > diff --git a/lisp/org.el b/lisp/org.el > index 5b1443c4e..e8f60fd83 100644 > --- a/lisp/org.el > +++ b/lisp/org.el > @@ -8644,6 +8644,23 @@ (defun org--file-apps-regexp-alist (list &optional add-auto-mode) > (when add-auto-mode > (mapcar (lambda (x) (cons (car x) 'emacs)) auto-mode-alist)))) > > +(defun org--open-file-format-spec (format specification) > + (with-temp-buffer > + (insert format) > + (goto-char (point-min)) > + (while (search-forward "%" nil t) > + (cond ((eq (char-after) ?%) > + (delete-char 1)) > + ((looking-at "[s0-9]") Personally I do not see a reason to limit substitutions just to just "%s". I would consider "[a-zA-Z0-9]". > + (replace-match > + (or (cdr (assoc (match-string 0) specification)) > + (error "Invalid format string")) I think, substitution character in the error message could be useful during debugging, something like (format "Invalid format character %%%s" (match-string 0)). > + 'fixed-case 'literal) > + (delete-region (1- (match-beginning 0)) (match-beginning 0))) > + (t > + (error "Invalid format string")))) > + (buffer-string))) > + > ;;;###autoload > (defun org-open-file (path &optional in-emacs line search) > "Open the file at PATH. > @@ -8745,24 +8762,20 @@ (defun org-open-file (path &optional in-emacs line search) > ;; Remove quotes around the file name - we'll use shell-quote-argument. > (while (string-match "['\"]%s['\"]" cmd) > (setq cmd (replace-match "%s" t t cmd))) > - (setq cmd (replace-regexp-in-string > - "%s" > - (shell-quote-argument (convert-standard-filename file)) > - cmd > - nil t)) > - > - ;; Replace "%1", "%2" etc. in command with group matches from regex > - (save-match-data > - (let ((match-index 1) > - (number-of-groups (- (/ (length link-match-data) 2) 1))) > - (set-match-data link-match-data) > - (while (<= match-index number-of-groups) > - (let ((regex (concat "%" (number-to-string match-index))) > - (replace-with (match-string match-index dlink))) > - (while (string-match regex cmd) > - (setq cmd (replace-match replace-with t t cmd)))) > - (setq match-index (+ match-index 1))))) > - > + (setq cmd > + (org--open-file-format-spec > + cmd > + (cons > + (cons "s" (shell-quote-argument > + (convert-standard-filename file))) > + (let ((ngroups (- (/ (length link-match-data) 2) 1))) > + (and (> ngroups 0) > + (progn > + (set-match-data link-match-data) > + (mapcar (lambda (n) > + (cons (number-to-string n) > + (match-string-no-properties n dlink))) Should not be numbered substitutions passed through shell-quote-argument as well? Besides just page numbers the link could be named internal anchor where more characters are allowed. It is the reason why in general I prefer bare exec to shell commands. I am unsure concerning optional matches as "\\(?:\\.pdf\\)\\(?:::\\([0-9]+\\)\\)?\\'" Maybe they should not be used at all in favor of separate entries with and without page number. Maybe nil should coalesce to empty string "". Certainly I am against "nil" string for a missed group. By the way, in the original format-spec, cdr is applied after the check whether assoc value is nil. > + (number-sequence 1 ngroups)))))))) > (save-window-excursion > (message "Running %s...done" cmd) > (start-process-shell-command cmd nil cmd)