emacs-orgmode@gnu.org archives
From: Kaushal Modi <kaushal.modi@gmail.com>
To: Nicolas Goaziou <mail@nicolasgoaziou.fr>
Cc: emacs-org list <emacs-orgmode@gnu.org>
Subject: Re: Allow #+SETUPFILE to point to an URL for the org file
Date: Thu, 25 May 2017 15:15:47 +0000
Message-ID: <CAFyQvY3WuEvBjMHWW4KQKOjD=Ky_0E9zo+BJYo1jbz-z_pyyTg@mail.gmail.com>
In-Reply-To: <CAFyQvY3TJ+uP6kST=cX=wJO2j7E1WwkFqE-Cfa8H137ejJsPzA@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 3672 bytes --]

I have attached an updated and rebased patch with most of your suggestions
implemented.

Comments below.

On Thu, May 25, 2017 at 7:43 AM Kaushal Modi <kaushal.modi@gmail.com> wrote:

> On Thu, May 25, 2017, 6:15 AM Nicolas Goaziou <mail@nicolasgoaziou.fr>
> wrote:
>
>> Interactive functions do not have double-dashes in their names. However,
>> I have concerns about this interactive status. Given that the function
>> is not properly documented in the manual, there is little chance it will
>> be actually used. And if it isn't, it could return surprising results.
>>
>> Another idea would be to replace NOCACHE with CLEAR-CACHE. When this is
>> non-nil, the cache is reset at the beginning of the function. The point
>> is to reset the cache the first time the function is called, but not on
>> recursive calls, which ensures any file is retrieved only once.
>
>
> Here is my use case for org-file-clear-cache:
>
> Let's say I have a file where I have a SETUPFILE retrieved from a URL. Now
> the upstream version changes but my cache is still on the older version. So
> I need to clear the hash. The org-file-clear-cache simply does that.
>
> With the function being interactive, I just do
>
> - M-x org-file-clear-cache
> - C-c C-e h h (or whatever I am exporting to)
>
> If you suggest a node where I should put that in the manual, I can add
> that to my updated patch. I'll also add more explanation to the doc-string
> of that function.
>
> Now, if the CLEAR-CACHE argument is added to org-file-clear-cache, how do
> we control the cache clearing interactively from outside?
>
> Also, how do we implement the resetting of the cache only the first time
> the function is called? Wouldn't that need an extra alist defvar to record
> the state of whether the function is already called specifically for that
> file? I think that would unnecessarily complicate the logic.
>
> Another idea is that we have a defcustom like org-file-never-cache. When
> non-nil, that will always do a fresh URL download. This will be or'ed with
> the NOCACHE inside org-file-contents. This, though, makes it a bit
> inconvenient for the user to use the latest upstream version when they
> need... They might need to set org-file-never-cache to t momentarily,
> probably via Local Variables, before an export.
>
>> Of course the cache doesn't survive to multiple exports, but at least it is
>> transparent to the user.
>>
>
> Sorry, I didn't follow that. Did you mean that the cache doesn't survive
> between emacs sessions? Because the cache will actually survive between
> multiple exports.
>

This issue is still open. I have just added a bit more info to the
docstring of org-file-clear-cache.
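
To make the two ways of forcing a fresh download concrete, here is a
minimal offline sketch of the caching behaviour discussed above. It is
not the code from the patch: `my/cache', `my/fetch-count',
`my/cached-fetch' and `my/clear-cache' are hypothetical names, and the
download is simulated with `format' so the example needs no network.

```emacs-lisp
;; Sketch only.  Every `my/' name is hypothetical and not part of Org.
(defvar my/cache (make-hash-table :test #'equal)
  "Map a URL string to the contents fetched for it.")

(defvar my/fetch-count 0
  "Number of simulated downloads performed so far.")

(defun my/cached-fetch (url &optional nocache)
  "Return contents for URL, reusing the cache unless NOCACHE is non-nil.
The download is simulated so that the example runs offline."
  (or (and (not nocache) (gethash url my/cache))
      ;; `puthash' returns the stored value, so it is also the result.
      (puthash url
               (format "contents #%d of %s"
                       (setq my/fetch-count (1+ my/fetch-count))
                       url)
               my/cache)))

(defun my/clear-cache ()
  "Forget every cached URL so that the next fetch downloads again."
  (interactive)
  (clrhash my/cache))
```

Calling `my/cached-fetch' twice with the same URL performs one
simulated download. A third call downloads again only after
`my/clear-cache' has run or when NOCACHE is non-nil, which mirrors the
"M-x org-file-clear-cache, then export" workflow.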

I grepped org.texi but found no reference to org-file-contents. So maybe
we need to add a section for that, and there I can explain
org-file-clear-cache in more detail. What would be a good node for that?


>>           (if (re-search-forward "HTTP.*\\s-+200\\s-OK" nil t)
>>               ;; URL retrieved correctly.  Move point to after the
>>               ;; url-retrieve header, update the cache `org--file-cache'
>>               ;; and return contents.
>>               (progn
>>                 (search-forward "\n\n" nil 'move)
>>
>
I have integrated most of your refactored version except for this portion.
The search above produces a false match if the "HTTP..." status string also
appears in the body of FILE. I have retained my version of just that part,
where the search is restricted to the url-retrieve header. It also fails
faster, because only the header is searched rather than the whole file.
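
The difference can be reproduced on plain strings. This is a rough
sketch rather than the code in the attached patch:
`my/status-regexp' simply repeats the regexp quoted above, and
`my/http-response-ok-p' is a hypothetical helper that assumes the
response is a string whose header and body are separated by the first
blank line ("\n\n").

```emacs-lisp
;; Sketch only.  Both `my/' names are hypothetical and not part of Org.
(defconst my/status-regexp "HTTP.*\\s-+200\\s-OK"
  "The status regexp quoted in this thread.")

(defun my/http-response-ok-p (response)
  "Return t if the header part of RESPONSE reports a 200 status.
RESPONSE is a string: header lines, a blank line, then the body.
Only the text before the first blank line is searched."
  (let* ((header-end (or (string-match "\n\n" response)
                         (length response)))
         (header (substring response 0 header-end)))
    (and (string-match-p my/status-regexp header) t)))

;; A failed request whose body happens to mention the success string.
(defconst my/sample-404
  "HTTP/1.1 404 Not Found\n\nSee HTTP/1.1 200 OK in the docs"
  "Example response used to show the false match.")
```

With this sample, `(string-match-p my/status-regexp my/sample-404)'
matches inside the body, whereas `(my/http-response-ok-p my/sample-404)'
returns nil because the header alone is inspected.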

> --

Kaushal Modi

[-- Attachment #1.2: Type: text/html, Size: 5410 bytes --]

[-- Attachment #2: 0001-Allow-org-file-contents-to-fetch-file-contents-from-.patch --]
[-- Type: application/octet-stream, Size: 9116 bytes --]

From b9877ae43bb8e83e1cecfdf0dab5065574c8dc85 Mon Sep 17 00:00:00 2001
From: Kaushal Modi <kaushal.modi@gmail.com>
Date: Thu, 25 May 2017 11:08:20 -0400
Subject: [PATCH] Allow org-file-contents to fetch file contents from a URL

* lisp/org.el (org--file-cache): New internal variable to store
downloaded files' cache.

* lisp/org.el (org-file-clear-cache): New interactive function to
clear the above file cache.

* lisp/org.el (org-file-url-p): New function to test if the input
argument is a URL.

* lisp/org.el (org-file-contents): Allow the FILE argument to be a
URL.  If the URL contents are already cached, return the cache
contents, else download the file and return contents of that.  The
file is automatically cached each time it is downloaded.  Add a new
optional argument NOCACHE.  If this is non-nil, the URL is always
downloaded afresh.  Use `org--file-cache' and `org-file-url-p'.

* lisp/ox.el (org-export--list-bound-variables)
(org-export--prepare-file-contents):
* lisp/org-macro.el (org-macro--collect-macros) : Adapt to the
possibility that the input to `org-file-contents' can be a URL too.
---
 etc/ORG-NEWS      |  5 ++++
 lisp/org-macro.el | 22 ++++++++++-------
 lisp/org.el       | 72 +++++++++++++++++++++++++++++++++++++++++++++++--------
 lisp/ox.el        | 38 +++++++++++++++++------------
 4 files changed, 104 insertions(+), 33 deletions(-)

diff --git a/etc/ORG-NEWS b/etc/ORG-NEWS
index 044f167ce..6e24d408f 100644
--- a/etc/ORG-NEWS
+++ b/etc/ORG-NEWS
@@ -234,6 +234,11 @@ which causes refile targets to be prefixed with the buffer’s
 name. This is particularly useful when used in conjunction with
 ~uniquify.el~.
 
+*** ~org-file-contents~ now allows the FILE argument to be a URL.
+This allows ~#+SETUPFILE:~ to accept a URL instead of a local file
+path.  A new optional argument ~NOCACHE~ is added to
+~org-file-contents~.
+
 ** Removed functions
 
 *** Org Timeline
diff --git a/lisp/org-macro.el b/lisp/org-macro.el
index f5ddb92e4..9f6e0ebaf 100644
--- a/lisp/org-macro.el
+++ b/lisp/org-macro.el
@@ -59,7 +59,8 @@
 		  (&optional granularity visible-only))
 (declare-function org-element-property "org-element" (property element))
 (declare-function org-element-type "org-element" (element))
-(declare-function org-file-contents "org" (file &optional noerror))
+(declare-function org-file-url-p "org" (file))
+(declare-function org-file-contents "org" (file &optional noerror nocache))
 (declare-function org-mode "org" ())
 (declare-function vc-backend "vc-hooks" (f))
 (declare-function vc-call "vc-hooks" (fun file &rest args) t)
@@ -105,16 +106,21 @@ Return an alist containing all macro templates found."
 				 (if old-cell (setcdr old-cell template)
 				   (push (cons name template) templates))))
 			   ;; Enter setup file.
-			   (let ((file (expand-file-name
-					(org-unbracket-string "\"" "\"" val))))
-			     (unless (member file files)
+			   (let* ((uri (org-unbracket-string "\"" "\"" (org-trim val)))
+				  (uri-is-url (org-file-url-p uri))
+				  (uri (if uri-is-url
+					   uri
+					 (expand-file-name uri))))
+			     ;; Avoid circular dependencies.
+			     (unless (member uri files)
 			       (with-temp-buffer
-				 (setq default-directory
-				       (file-name-directory file))
+				 (unless uri-is-url
+				   (setq default-directory
+					 (file-name-directory uri)))
 				 (org-mode)
-				 (insert (org-file-contents file 'noerror))
+				 (insert (org-file-contents uri 'noerror))
 				 (setq templates
-				       (funcall collect-macros (cons file files)
+				       (funcall collect-macros (cons uri files)
 						templates)))))))))))
 		templates))))
     (funcall collect-macros nil nil)))
diff --git a/lisp/org.el b/lisp/org.el
index 946d8af8c..ebd1f4792 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5273,17 +5273,69 @@ a string, summarizing TAGS, as a list of strings."
 	   (setq current-group (list tag))))
 	(_ nil)))))
 
-(defun org-file-contents (file &optional noerror)
-  "Return the contents of FILE, as a string."
-  (if (and file (file-readable-p file))
+(defvar org--file-cache (make-hash-table :test #'equal)
+  "Hash table to store contents of files referenced via a URL.
+This is the cache of file URLs read using `org-file-contents'.")
+
+(defun org-file-clear-cache ()
+  "Clear the file cache stored in `org--file-cache'.
+
+By default, if the FILE argument of `org-file-contents' is a URL, the
+URL download will be skipped if it was already downloaded and cached
+in `org--file-cache'.  If you need to force-download the URL again,
+call this function to clear the cache first."
+  (interactive)
+  (clrhash org--file-cache))
+
+(defun org-file-url-p (file)
+  "Non-nil if FILE is a URL."
+  (require 'ffap)
+  (string-match-p ffap-url-regexp file))
+
+(defun org-file-contents (file &optional noerror nocache)
+  "Return the contents of FILE, as a string.
+
+FILE can be a file name or URL.
+
+If FILE is a URL, download the contents.  If the URL contents are
+already cached in the `org--file-cache' hash table, the download step
+is skipped.
+
+If NOERROR is non-nil, ignore the error when unable to read the FILE
+from file or URL.
+
+If NOCACHE is non-nil, do a fresh fetch of FILE even if cached version
+is available.  This option applies only if FILE is a URL."
+  (let* ((is-url (org-file-url-p file))
+         (cache (and is-url
+                     (not nocache)
+                     (gethash file org--file-cache))))
+    (cond
+     (cache)
+     (is-url
+      (with-current-buffer (url-retrieve-synchronously file)
+        (goto-char (point-min))
+        ;; Move point to after the url-retrieve header
+        (search-forward "\n\n" nil 'move)
+	;; Search for the success code only in the url-retrieve header
+        (if (string-match-p "HTTP.*\\s-+200\\s-OK"
+			    (buffer-substring-no-properties
+                             (point-min) (point)))
+	    ;; Update the cache `org--file-cache' and return contents
+            (puthash file
+		     (buffer-substring-no-properties (point) (point-max))
+		     org--file-cache)
+	  (funcall (if noerror #'message #'user-error)
+                   "Unable to fetch file from %S" file))))
+     (t
       (with-temp-buffer
-	(insert-file-contents file)
-	(buffer-string))
-    (funcall (if noerror 'message 'error)
-	     "Cannot read file \"%s\"%s"
-	     file
-	     (let ((from (buffer-file-name (buffer-base-buffer))))
-	       (if from (concat " (referenced in file \"" from "\")") "")))))
+        (condition-case err
+	    (progn
+	      (insert-file-contents file)
+	      (buffer-string))
+	  (file-error
+           (funcall (if noerror #'message #'user-error)
+		    (error-message-string err)))))))))
 
 (defun org-extract-log-state-settings (x)
   "Extract the log state setting from a TODO keyword string.
diff --git a/lisp/ox.el b/lisp/ox.el
index ac8d8ce68..7d1012974 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -1499,17 +1499,20 @@ Assume buffer is in Org mode.  Narrowing, if any, is ignored."
 			 (cond
 			  ;; Options in `org-export-special-keywords'.
 			  ((equal key "SETUPFILE")
-			   (let ((file
-				  (expand-file-name
-				   (org-unbracket-string "\"" "\"" (org-trim val)))))
+			   (let* ((uri (org-unbracket-string "\"" "\"" (org-trim val)))
+				  (uri-is-url (org-file-url-p uri))
+				  (uri (if uri-is-url
+					   uri
+					 (expand-file-name uri))))
 			     ;; Avoid circular dependencies.
-			     (unless (member file files)
+			     (unless (member uri files)
 			       (with-temp-buffer
-				 (setq default-directory
-				   (file-name-directory file))
-				 (insert (org-file-contents file 'noerror))
+				 (unless uri-is-url
+				   (setq default-directory
+					 (file-name-directory uri)))
+				 (insert (org-file-contents uri 'noerror))
 				 (let ((org-inhibit-startup t)) (org-mode))
-				 (funcall get-options (cons file files))))))
+				 (funcall get-options (cons uri files))))))
 			  ((equal key "OPTIONS")
 			   (setq plist
 				 (org-combine-plists
@@ -1647,17 +1650,22 @@ an alist where associations are (VARIABLE-NAME VALUE)."
 				      "BIND")
 			       (push (read (format "(%s)" val)) alist)
 			     ;; Enter setup file.
-			     (let ((file (expand-file-name
-					  (org-unbracket-string "\"" "\"" val))))
-			       (unless (member file files)
+			     (let* ((uri (org-unbracket-string "\"" "\"" val))
+				    (uri-is-url (org-file-url-p uri))
+				    (uri (if uri-is-url
+					     uri
+					   (expand-file-name uri))))
+			       ;; Avoid circular dependencies.
+			       (unless (member uri files)
 				 (with-temp-buffer
-				   (setq default-directory
-					 (file-name-directory file))
+				   (unless uri-is-url
+				     (setq default-directory
+					   (file-name-directory uri)))
 				   (let ((org-inhibit-startup t)) (org-mode))
-				   (insert (org-file-contents file 'noerror))
+				   (insert (org-file-contents uri 'noerror))
 				   (setq alist
 					 (funcall collect-bind
-						  (cons file files)
+						  (cons uri files)
 						  alist))))))))))
 		   alist)))))
       ;; Return value in appropriate order of appearance.
-- 
2.13.0



Thread overview: 25+ messages
2016-12-03 17:23 Allow #+SETUPFILE to point to an URL for the org file Kaushal Modi
2016-12-08 11:51 ` Kaushal Modi
2016-12-08 14:22   ` John Kitchin
2016-12-08 14:31   ` Nicolas Goaziou
2016-12-08 14:44     ` Kaushal Modi
2016-12-08 21:48       ` Nicolas Goaziou
2016-12-08 22:07         ` Kaushal Modi
2016-12-08 22:40           ` Nicolas Goaziou
2017-03-13 17:37             ` Kaushal Modi
2017-03-30  7:43               ` Nicolas Goaziou
2017-05-23 19:07                 ` Kaushal Modi
2017-05-25 10:13                   ` Nicolas Goaziou
2017-05-25 10:18                     ` Nicolas Goaziou
2017-05-25 11:43                     ` Kaushal Modi
2017-05-25 15:15                       ` Kaushal Modi [this message]
2017-05-26  7:47                         ` Nicolas Goaziou
2017-05-26 20:24                           ` Kaushal Modi
2017-05-28  7:35                             ` Nicolas Goaziou
2017-05-28 10:04                               ` Kaushal Modi
2017-06-09 16:59                               ` Kaushal Modi
2017-06-12 19:32                                 ` Kaushal Modi
2017-06-13 12:43                                   ` Nicolas Goaziou
2017-06-13 15:45                                     ` Kaushal Modi
2017-06-13 21:32                                       ` Nicolas Goaziou
2017-06-13 21:42                                         ` Kaushal Modi
