emacs-orgmode@gnu.org archives
From: Tom Gillespie <tgbugs@gmail.com>
To: emacs-orgmode <emacs-orgmode@gnu.org>
Subject: Re: Bug: inconsistent escaping of coderef regexp
Date: Sun, 4 Apr 2021 22:22:32 -0700
Message-ID: <CA+G3_PM841Lb_va1z2PWVFX=ZCeSpjKFmLkgrcN8uWZfzn4uag@mail.gmail.com>
In-Reply-To: <877dlhod94.fsf@nicolasgoaziou.fr>

[-- Attachment #1: Type: text/plain, Size: 1259 bytes --]

Hi Nicolas,
   I've attached a patch with a first-pass implementation that I think
resolves most of the issues. It probably needs a few tests to go along
with it, but I think it is the simplest way forward. I tried to make the
changes without disrupting the org-babel info structure, but that comes
at the cost of having to pull out :coderef-prefix in a number of separate
contexts. Best,

> If possible, I'd like not to conflate current issue with switches
> deprecation, which needs to be discussed separately.

We can decouple them, so not an issue. The attached patch implements
the header arg equivalents of -r and -l without making any changes to the
existing switch behavior.
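
To make the intended precedence concrete, here is a minimal sketch in
Emacs Lisp of how a header argument could take priority over the
corresponding switch. The my/ helper names, the simplified switch
parsing, and the sample values are assumptions for illustration only;
this is not the code in the attached patch.

```elisp
;; Hypothetical helpers.  PARAMS is an alist of header arguments and
;; SWITCHES is the raw switch string of the block.

(defun my/effective-coderef-format (params switches)
  "Return the coderef format for a block.
Prefer :coderef-format from PARAMS, then a -l switch in SWITCHES,
then \"(ref:%s)\", the default value of `org-coderef-label-format'."
  (or (cdr (assq :coderef-format params))
      (and (string-match "-l +\"\\([^\"\n]+\\)\"" switches)
           (match-string 1 switches))
      "(ref:%s)"))

(defun my/strip-coderefs-p (params switches)
  "Return non-nil when coderefs should be removed while tangling.
True for \":coderef-tangle no\" in PARAMS or a -r switch in SWITCHES."
  (or (equal (cdr (assq :coderef-tangle params)) "no")
      (and (string-match-p "-r" switches) t)))

;; The header argument wins over the -l switch.
(my/effective-coderef-format '((:coderef-format . "[[%s]]")) "-l \"<<%s>>\"")

;; The -r switch keeps working when no header argument is given.
(my/strip-coderefs-p nil "-n -r")
```

Checking the header argument first and falling back to the switch is
what lets the two mechanisms coexist without touching switch handling.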

> What do you mean by "it is impossible for the user to specify their own
> coderef regexp that can be used in both cases"? In particular, what is
> a coderef regexp in this context? I know about coderef format, but
> I don't think users are supposed to provide a regexp here.

I did a first-pass implementation and realized that allowing users to
specify coderef-regexp is a bad idea. The attached patch fixes the
divergent behavior of org-babel-tangle-single-block and provides a
standard way to specify a :coderef-prefix regexp so that empty
comments can be stripped.
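
As a rough illustration of why a prefix regexp helps, the sketch below
builds a coderef regexp modeled on org-src-coderef-regexp with the
extra rx-prefix argument, written as a standalone function so it can be
evaluated without Org loaded. The names my/coderef-regexp and
my/strip-coderef, the sample line, and the "[ \t]*;+[ \t]*" prefix are
assumptions chosen for the example.

```elisp
(defun my/coderef-regexp (fmt &optional label rx-prefix)
  "Return a regexp matching a coderef written with format string FMT.
LABEL, when non-nil, restricts the match to that label.  RX-PREFIX,
when non-nil, replaces the default leading-whitespace pattern."
  (format "\\(%s\\(%s\\)[ \t]*\\)$"
          (or rx-prefix "[ \t]*")
          ;; Substitute the label pattern for %s in the quoted format.
          (replace-regexp-in-string
           "%s"
           (if label
               (regexp-quote label)
             "\\([-a-zA-Z0-9_][-a-zA-Z0-9_ ]*\\)")
           (regexp-quote fmt)
           nil t)))

(defun my/strip-coderef (line fmt &optional rx-prefix)
  "Remove a trailing coderef from LINE, including what RX-PREFIX matches."
  (replace-regexp-in-string
   (my/coderef-regexp fmt nil rx-prefix) "" line nil nil 1))

;; Default prefix: only the whitespace before the coderef is removed,
;; so the comment starter is left behind.
(my/strip-coderef "(+ 1 2) ; (ref:sum)" "(ref:%s)")

;; Comment-aware prefix: the comment starter is removed as well.
(my/strip-coderef "(+ 1 2) ; (ref:sum)" "(ref:%s)" "[ \t]*;+[ \t]*")
```

With the default prefix the first call leaves "(+ 1 2) ;" behind, which
is the empty-comment problem; with the comment-aware prefix the second
call returns "(+ 1 2)".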

[-- Attachment #2: 0001-improve-org-src-coderef-regexp-and-regularize-usage.patch --]
[-- Type: text/x-patch, Size: 13860 bytes --]

From e017fe3f4fb36da2c8560a9999e526b8bdfd42dc Mon Sep 17 00:00:00 2001
From: Tom Gillespie <tgbugs@gmail.com>
Date: Sun, 4 Apr 2021 21:40:32 -0700
Subject: [PATCH] improve org-src-coderef-regexp and regularize usage

* lisp/ob-core.el
org-babel-common-header-args-w-values: new :coderef- header args
org-babel-safe-header-args: include the new :coderef- header args
(org-babel-get-src-block-info): calculate params before info in let* so
that they can be used to set the coderef-format field (nth 6 info)
(org-babel--expand-body): use coderef-prefix to correctly strip
coderefs when expanding

* lisp/ob-tangle.el (org-babel-tangle-single-block): Regularize
behavior when removing coderefs during tangling. This fixes an issue
where trailing whitespace would be retained when coderefs were removed
for tangling. Make the header argument ":coderef-tangle no" work the
same way that the -r switch currently works.

* lisp/ol.el (org-link-search): use org babel info to match the
coderef format for each block

* lisp/org-src.el (org-src-coderef-regexp): Now takes an additional
argument rx-prefix that can be used to customize the text preceding
the coderef that should be removed during tangling. This is most
useful for removing comments and trailing whitespace.

* lisp/ox.el (org-export-resolve-coderef)
and (org-export-unravel-code): use org babel info to
correctly match the coderef format for each block.

This commit adds support for three new src block header arguments:
:coderef-format, :coderef-prefix, and :coderef-tangle. :coderef-format
has the same behavior as the org src switch -l, and :coderef-tangle
has the same behavior as the org src switch -r. :coderef-prefix provides
new functionality and makes it possible to set the regexp for text
leading up to the coderef. In particular, this can be used to strip
comments, which are required if authoring an org file that works with
older versions of org.
---
 lisp/ob-core.el   | 43 +++++++++++++++++++++++++------------------
 lisp/ob-tangle.el | 18 +++++++++++-------
 lisp/ol.el        | 17 +++++++++++------
 lisp/org-src.el   |  5 +++--
 lisp/ox.el        | 15 +++++++++++----
 5 files changed, 61 insertions(+), 37 deletions(-)

diff --git a/lisp/ob-core.el b/lisp/ob-core.el
index 2e78ac3e6..feb6f2235 100644
--- a/lisp/ob-core.el
+++ b/lisp/ob-core.el
@@ -76,7 +76,7 @@
 (declare-function org-previous-block "org" (arg &optional block-regexp))
 (declare-function org-show-context "org" (&optional key))
 (declare-function org-src-coderef-format "org-src" (&optional element))
-(declare-function org-src-coderef-regexp "org-src" (fmt &optional label))
+(declare-function org-src-coderef-regexp "org-src" (fmt &optional label rx-prefix))
 (declare-function org-src-get-lang-mode "org-src" (lang))
 (declare-function org-table-align "org-table" ())
 (declare-function org-table-convert-region "org-table" (beg0 end0 &optional separator))
@@ -392,6 +392,9 @@ then run `org-babel-switch-to-session'."
 (defconst org-babel-common-header-args-w-values
   '((cache	. ((no yes)))
     (cmdline	. :any)
+    (coderef-format . :any)
+    (coderef-prefix . :any)
+    (coderef-tangle . ((nil yes no)))
     (colnames	. ((nil no yes)))
     (comments	. ((no link yes org both noweb)))
     (dir	. :any)
@@ -434,7 +437,8 @@ Note that individual languages may define their own language
 specific header arguments as well.")
 (defconst org-babel-safe-header-args
-  '(:cache :colnames :comments :exports :epilogue :hlines :noeval
+  '(:cache :coderef-format :coderef-prefix :coderef-tangle
+           :colnames :comments :exports :epilogue :hlines :noeval
 	   :noweb :noweb-ref :noweb-sep :padline :prologue :rownames
 	   :sep :session :tangle :wrap
 	   (:eval . ("never" "query"))
@@ -607,29 +611,31 @@ a list with the following pattern:
 	     (lang-headers (intern
 			    (concat "org-babel-default-header-args:" lang)))
 	     (name (org-element-property :name datum))
+             (params (apply #'org-babel-merge-params
+                            (if inline org-babel-default-inline-header-args
+                              org-babel-default-header-args)
+                            (and (boundp lang-headers) (eval lang-headers t))
+                            (append
+                             ;; If DATUM is provided, make sure we get node
+                             ;; properties applicable to its location within
+                             ;; the document.
+                             (org-with-point-at (org-element-property :begin datum)
+                               (org-babel-params-from-properties lang light))
+                             (mapcar (lambda (h)
+                                       (org-babel-parse-header-arguments h light))
+                                     (cons (org-element-property :parameters datum)
+				           (org-element-property :header datum))))))
 	       (org-babel--normalize-body datum)
-	       (apply #'org-babel-merge-params
-		      (if inline org-babel-default-inline-header-args
-			org-babel-default-header-args)
-		      (and (boundp lang-headers) (eval lang-headers t))
-		      (append
-		       ;; If DATUM is provided, make sure we get node
-		       ;; properties applicable to its location within
-		       ;; the document.
-		       (org-with-point-at (org-element-property :begin datum)
-			 (org-babel-params-from-properties lang light))
-		       (mapcar (lambda (h)
-				 (org-babel-parse-header-arguments h light))
-			       (cons (org-element-property :parameters datum)
-				     (org-element-property :header datum)))))
+               params
 	       (or (org-element-property :switches datum) "")
 	       (org-element-property (if inline :begin :post-affiliated)
-	       (and (not inline) (org-src-coderef-format datum)))))
+	       (and (not inline) (or (cdr (assq :coderef-format params))
+                                     (org-src-coderef-format datum))))))
 	(unless light
 	  (setf (nth 2 info) (org-babel-process-params (nth 2 info))))
 	(setf (nth 2 info) (org-babel-generate-file-param name (nth 2 info)))
@@ -638,13 +644,14 @@ a list with the following pattern:
 (defun org-babel--expand-body (info)
   "Expand noweb references in body and remove any coderefs."
   (let ((coderef (nth 6 info))
+        (coderef-prefix (cdr (assq :coderef-prefix (nth 2 info))))
 	 (if (org-babel-noweb-p (nth 2 info) :eval)
 	     (org-babel-expand-noweb-references info)
 	   (nth 1 info))))
     (if (not coderef) expand
-       (org-src-coderef-regexp coderef) "" expand nil nil 1))))
+       (org-src-coderef-regexp coderef nil coderef-prefix) "" expand nil nil 1))))
 (defun org-babel--file-desc (params result)
   "Retrieve file description."
diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index aa0373ab8..755a404b8 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -414,10 +414,14 @@ non-nil, return the full association list to be used by
 	 (src-lang (nth 0 info))
 	 (params (nth 2 info))
 	 (extra (nth 3 info))
-	 (cref-fmt (or (and (string-match "-l \"\\(.+\\)\"" extra)
-			    (match-string 1 extra))
-		       org-coderef-label-format))
-	 (link (let ((l (org-no-properties (org-store-link nil))))
+	 (coderef (nth 6 info))
+         (asdf (message "%S" info))
+         (cref-regexp (org-src-coderef-regexp
+                       (or coderef
+                           org-coderef-label-format)
+                       nil
+                       (cdr (assq :coderef-prefix params))))
+         (link (let ((l (org-no-properties (org-store-link nil))))
                  (and (string-match org-link-bracket-re l)
                       (match-string 1 l))))
@@ -443,10 +447,10 @@ non-nil, return the full association list to be used by
 		       body params (and (fboundp assignments-cmd)
 					(funcall assignments-cmd params))))))
-	      (when (string-match "-r" extra)
+	      (when (or (string= (cdr (assq :coderef-tangle params)) "no")
+                        (string-match "-r" extra))
 		(goto-char (point-min))
-		(while (re-search-forward
-			(replace-regexp-in-string "%s" ".+" cref-fmt) nil t)
+		(while (re-search-forward cref-regexp nil t)
 		  (replace-match "")))
 	      (run-hooks 'org-babel-tangle-body-hook)
diff --git a/lisp/ol.el b/lisp/ol.el
index b8bd7d234..e2e37ba6d 100644
--- a/lisp/ol.el
+++ b/lisp/ol.el
@@ -44,6 +44,7 @@
 (declare-function calendar-cursor-to-date "calendar" (&optional error event))
 (declare-function dired-get-filename "dired" (&optional localp no-error-if-not-filep))
 (declare-function org-at-heading-p "org" (&optional _))
+(declare-function org-babel-get-src-block-info "ob-core" (&optional light datum))
 (declare-function org-back-to-heading "org" (&optional invisible-ok))
 (declare-function org-before-first-heading-p "org" ())
 (declare-function org-do-occur "org" (regexp &optional cleanup))
@@ -71,7 +72,7 @@
 (declare-function org-run-like-in-org-mode "org" (cmd))
 (declare-function org-show-context "org" (&optional key))
 (declare-function org-src-coderef-format "org-src" (&optional element))
-(declare-function org-src-coderef-regexp "org-src" (fmt &optional label))
+(declare-function org-src-coderef-regexp "org-src" (fmt &optional label rx-prefix))
 (declare-function org-src-edit-buffer-p "org-src" (&optional buffer))
 (declare-function org-src-source-buffer "org-src" ())
 (declare-function org-src-source-type "org-src" ())
@@ -1145,10 +1146,12 @@ of matched result, which is either `dedicated' or `fuzzy'."
 	    (let ((element (org-element-at-point)))
 	      (when (and (memq (org-element-type element)
 			       '(example-block src-block))
-			 (org-match-line
-			  (concat ".*?" (org-src-coderef-regexp
-					 (org-src-coderef-format element)
-					 coderef))))
+                         (let ((info (org-babel-get-src-block-info nil element)))
+			   (org-match-line
+			    (concat ".*?" (org-src-coderef-regexp
+                                           (nth 6 info)
+					   coderef
+                                           (cdr (assq :coderef-prefix (nth 2 info))))))))
 		(setq type 'dedicated)
 		(goto-char (match-beginning 2))
 		(throw :coderef-match nil))))
@@ -1523,7 +1526,9 @@ non-nil."
 	   ;; A code reference exists.  Use it.
-	      (re-search-forward (org-src-coderef-regexp coderef-format)
+	      (re-search-forward (org-src-coderef-regexp coderef-format
+                                                         nil
+                                                         rx-prefix)
 	    (setq link (funcall format-link (match-string-no-properties 3))))
diff --git a/lisp/org-src.el b/lisp/org-src.el
index 20acee4e6..b0119ddbc 100644
--- a/lisp/org-src.el
+++ b/lisp/org-src.el
@@ -868,7 +868,7 @@ to the remote source block."
    ((org-element-property :label-fmt (org-element-at-point)))
    (t org-coderef-label-format)))
-(defun org-src-coderef-regexp (fmt &optional label)
+(defun org-src-coderef-regexp (fmt &optional label rx-prefix)
   "Return regexp matching a coderef format string FMT.
 When optional argument LABEL is non-nil, match coderef for that
@@ -879,7 +879,8 @@ white spaces.  Match group 2 contains the same string without any
 surrounding space.  Match group 3 contains the label.
 A coderef format regexp can only match at the end of a line."
-  (format "\\([ \t]*\\(%s\\)[ \t]*\\)$"
+  (format "\\(%s\\(%s\\)[ \t]*\\)$"
+          (or rx-prefix "[ \t]*")
 	   (if label (regexp-quote label) "\\([-a-zA-Z0-9_][-a-zA-Z0-9_ ]*\\)")
diff --git a/lisp/ox.el b/lisp/ox.el
index f705bc83a..d8f31990a 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -78,8 +78,9 @@
 (require 'org-macro)
 (require 'tabulated-list)
+(declare-function org-babel-get-src-block-info "ob-core" (&optional light datum))
 (declare-function org-src-coderef-format "org-src" (&optional element))
-(declare-function org-src-coderef-regexp "org-src" (fmt &optional label))
+(declare-function org-src-coderef-regexp "org-src" (fmt &optional label rx-prefix))
 (declare-function org-publish "ox-publish" (project &optional force async))
 (declare-function org-publish-all "ox-publish" (&optional force async))
 (declare-function org-publish-current-file "ox-publish" (&optional force async))
@@ -4213,9 +4214,12 @@ error if no block contains REF."
 	(lambda (el)
 	    (insert (org-trim (org-element-property :value el)))
-	    (let* ((label-fmt (or (org-element-property :label-fmt el)
+	    (let* ((ob-info (org-babel-get-src-block-info nil el))
+                   (label-fmt (or (nth 6 ob-info)
-		   (ref-re (org-src-coderef-regexp label-fmt ref)))
+		   (ref-re (org-src-coderef-regexp label-fmt
+                                                   ref
+                                                   (cdr (assq :coderef-prefix (nth 2 ob-info))))))
 	      ;; Element containing REF is found.  Resolve it to
 	      ;; either a label or a line number, as needed.
 	      (when (re-search-backward ref-re nil t)
@@ -4627,7 +4631,10 @@ reference on that line (string)."
 		  (org-remove-indentation value))))
 	 ;; Build a regexp matching a loc with a reference.
-	 (ref-re (org-src-coderef-regexp (org-src-coderef-format element))))
+         (ob-info (org-babel-get-src-block-info nil element))
+	 (ref-re (org-src-coderef-regexp (nth 6 ob-info)
+                                         nil
+                                         (cdr (assq :coderef-prefix (nth 2 ob-info))))))
     ;; Return value.
      ;; Code with references removed.

Thread overview:
2021-01-04 20:33 Bug: inconsistent escaping of coderef regexp Tom Gillespie
2021-04-01 15:45 ` Nicolas Goaziou
2021-04-01 16:09   ` Timothy
2021-04-04 22:01   ` Tom Gillespie
2021-04-04 23:12     ` Nicolas Goaziou
2021-04-05  5:22       ` Tom Gillespie [this message]
2021-04-05  7:42         ` Tom Gillespie
2021-04-07 17:58         ` Nicolas Goaziou
2021-04-07 19:44           ` Tom Gillespie
2021-04-09 22:19             ` Nicolas Goaziou
