emacs-orgmode@gnu.org archives
 help / color / mirror / code / Atom feed
From: Kaushal Modi <kaushal.modi@gmail.com>
To: Nicolas Goaziou <mail@nicolasgoaziou.fr>
Cc: emacs-org list <emacs-orgmode@gnu.org>
Subject: Re: Allow #+SETUPFILE to point to an URL for the org file
Date: Tue, 23 May 2017 19:07:16 +0000	[thread overview]
Message-ID: <CAFyQvY0VfUzcnR4+TAXn3fSti_==O2d6suXhn6grqf9CwztLew@mail.gmail.com> (raw)
In-Reply-To: <874lybp5ua.fsf@nicolasgoaziou.fr>


[-- Attachment #1.1: Type: text/plain, Size: 5003 bytes --]

Thanks for the detailed review.

I have attached a patch following your recommendation. Please review it.

Here's a MWE to show the use of this new feature:

=====
#+SETUPFILE:
https://cdn.rawgit.com/kaushalmodi/.emacs.d/master/misc/org-setupfile.org
#+TITLE: Heading{{{NEWLINE}}}Sub-heading
=====

(Note that NEWLINE is not an inbuilt macro.)

On Thu, Mar 30, 2017 at 5:06 AM Nicolas Goaziou <mail@nicolasgoaziou.fr>
wrote:

> org-setupfile-ht -> org--setupfile-cache
>
> if it is meant to be inserted in "org.el" proper, or
> `org-setupfile--cache' if you want to create a new "org-setupfile.el"
> library.
>

I ended up updating org-file-contents. So now that is org--file-cache.


> Nitpick : 'equal -> #'equal
>

Done.


> Hash table to store SETUPFILE contents.
>

Done.


> `org--setupfile-clear-cache' or `org-setupfile--clear-cache' depending
> on the location of the function.
>

This is an interactive, a user-facing function; do we want to add
double-dashes in that name? The function is now called org-file-clear-cache.


> file path -> file name
>

Done.


> > If SETUPFILE is a file path, use `org-file-contents' to get the file
> > contents.
>
> Then, we might want to generalize `org-file-contents' instead (i.e., let
> `org-file-contents' handle remote locations). WDYT?
>

That was my first thought, but was leery on modifying the
org-file-contents. I have now done that.


> Mind the double spaces at the end of sentences.
>

Done.


> They are not equivalent, but could `org-file-remote-p', or
> `file-remote-p' be used instead?
>

I tried (org-file-remote-p "http://foo.bar") but it returned nil. Looks
like both org-file-remote-p and file-remote-p will not work for URLs.


> >          (cache (when (and is-url (not nocache))
> >                   (gethash setupfile org-setupfile-ht)))
>
>   (cache (and is-url (not nocache) (gethash setupfile org-setupfile-ht)))
>

Of course :)


> >               (re-search-forward "\n\n") ; 2 consecutive new-line chars
> `re-search-forward' -> `search-forward'
>

Correct, regexp is not needed in that case.


> >                 (funcall (if noerror 'message 'error)
>   (if noerror #'message #'error)
>

Done.


>
> >                          "Unable to fetch SETUPFILE from `%s'"
>
>   `%s' -> %S
>

setupfile is always a string so I used %s. If setupfile is not a string
(may be nil?), then the very first string-match-p will through an error. Is
there a specific reason for using %S? I did not use %S because I did not
want the double-quotes to be printed around the string in the echo area.


> > setupfile)))))
> >       (setq contents (org-file-contents setupfile noerror)))
>
> I think it is clearer if wrapped like this:
>
>   (contents
>    (cond (cache)
>          (is-url
>           (let (url-retrieve-header)
>             ...))
>          (t (org-file-contents setupfile noerror))))
>

That indeed is pretty sweet. I have made that change.


> >     (when contents
> >       (save-excursion
> >         (insert contents)))))
>
> This may not be necessary at this point if we merge `org-file-contents'
> with the above.
>

Correct. The attached patch has everything integrated in org-file-contents.


> > Question:
> >
> > - All the places where the content of SETUPFILE is inserted in a temp
> > buffer, it is assumed that the file is retrieved from disk and not from
> URL.
> >
> > Example in ox.el:
> >
> >  ((equal key "SETUPFILE")
> >   (let ((file
> >  (expand-file-name
> >   (org-unbracket-string "\"" "\"" (org-trim val)))))
> >     ;; Avoid circular dependencies.
> >     (unless (member file files)
> >       (with-temp-buffer
> > (setq default-directory
> >   (file-name-directory file))
> > (insert (org-file-contents file 'noerror))
> > (let ((org-inhibit-startup t)) (org-mode))
> > (funcall get-options (cons file files))))))
> >
> >
> > Note the use of expand-file-name, (member file files), default-directory,
> > (funcall get-options (cons file files)).
>
> (member file files), (cons file files) and `expand-file-name' are used
> to avoid reading over and over the same setup file. In particular, they
> prevent circular dependencies.
>
> You can ignore `expand-file-name' and replace `file' with `uri', i.e.,
> it is straightforward to extend the code to remote file names.
>
> `default-directory' is slightly more tricky, as it is used to properly
> read recursively setup files with relative file names. I think our best
> bet is to to check if current file name is local or remote, and ignore
> `default-directory' setting in the latter case.
>

Do we need to update the code using org-file-contents in these places:

lisp/org-capture.el
692:      (setq txt (org-file-contents file))

lisp/ox-man.el
519:              (setq code-block  (org-file-contents out-file))
764:        (setq code-block  (org-file-contents out-file))

contrib/lisp/ox-groff.el
1084:              (setq code-block  (org-file-contents out-file))
1521:          (setq code-block  (org-file-contents out-file))
-- 

Kaushal Modi

[-- Attachment #1.2: Type: text/html, Size: 8551 bytes --]

[-- Attachment #2: 0001-Allow-org-file-contents-to-fetch-file-contents-from-.patch --]
[-- Type: application/octet-stream, Size: 9479 bytes --]

From 93fdbc96527b63b3ceb9c25708958b9accd66c06 Mon Sep 17 00:00:00 2001
From: Kaushal Modi <kaushal.modi@gmail.com>
Date: Tue, 23 May 2017 13:40:20 -0400
Subject: [PATCH] Allow org-file-contents to fetch file contents from a URL

* lisp/org.el (org--file-cache): New internal variable to store
downloaded files' cache.

* lisp/org.el (org-file-clear-cache): New interactive function to
clear the above file cache.

* lisp/org.el (org-file-url-p): New function to test if the input
argument is a URL.

* lisp/org.el (org-file-contents): Allow the FILE argument to be a
URL.  If the URL contents are already cached, return the cache
contents, else download the file and return contents of that.  The
file is automatically cached each time it is downloaded.  Add a new
optional argument NOCACHE.  If this is non-nil, the URL is always
downloaded afresh.  Use `org--file-cache' and `org-file-url-p'.

* lisp/ox.el (org-export--list-bound-variables)
(org-export--prepare-file-contents):
* lisp/org-macro.el (org-macro--collect-macros) : Adapt to the
possibility that the input to `org-file-contents' can be a URL too.
---
 etc/ORG-NEWS      |  5 ++++
 lisp/org-macro.el | 22 ++++++++++------
 lisp/org.el       | 77 +++++++++++++++++++++++++++++++++++++++++++++++--------
 lisp/ox.el        | 38 ++++++++++++++++-----------
 4 files changed, 108 insertions(+), 34 deletions(-)

diff --git a/etc/ORG-NEWS b/etc/ORG-NEWS
index 23e8a1db7..4ae4f57eb 100644
--- a/etc/ORG-NEWS
+++ b/etc/ORG-NEWS
@@ -231,6 +231,11 @@ which causes refile targets to be prefixed with the buffer’s
 name. This is particularly useful when used in conjunction with
 ~uniquify.el~.
 
+*** ~org-file-contents~ now allows the FILE argument to be a URL.
+This allows ~#+SETUPFILE:~ to accept a URL instead of a local file
+path.  A new optional argument ~NOCACHE~ is added to
+~org-file-contents~.
+
 ** Removed functions
 
 *** Org Timeline
diff --git a/lisp/org-macro.el b/lisp/org-macro.el
index 71e917b71..590ee57bd 100644
--- a/lisp/org-macro.el
+++ b/lisp/org-macro.el
@@ -56,7 +56,8 @@
 		  (&optional granularity visible-only))
 (declare-function org-element-property "org-element" (property element))
 (declare-function org-element-type "org-element" (element))
-(declare-function org-file-contents "org" (file &optional noerror))
+(declare-function org-file-url-p "org" (file))
+(declare-function org-file-contents "org" (file &optional noerror nocache))
 (declare-function org-mode "org" ())
 (declare-function vc-backend "vc-hooks" (f))
 (declare-function vc-call "vc-hooks" (fun file &rest args) t)
@@ -102,16 +103,21 @@ Return an alist containing all macro templates found."
 				 (if old-cell (setcdr old-cell template)
 				   (push (cons name template) templates))))
 			   ;; Enter setup file.
-			   (let ((file (expand-file-name
-					(org-unbracket-string "\"" "\"" val))))
-			     (unless (member file files)
+			   (let* ((uri (org-unbracket-string "\"" "\"" (org-trim val)))
+				  (uri-is-url (org-file-url-p uri))
+				  (uri (if uri-is-url
+					   uri
+					 (expand-file-name uri))))
+			     ;; Avoid circular dependencies.
+			     (unless (member uri files)
 			       (with-temp-buffer
-				 (setq default-directory
-				       (file-name-directory file))
+				 (unless uri-is-url
+				   (setq default-directory
+					 (file-name-directory uri)))
 				 (org-mode)
-				 (insert (org-file-contents file 'noerror))
+				 (insert (org-file-contents uri 'noerror))
 				 (setq templates
-				       (funcall collect-macros (cons file files)
+				       (funcall collect-macros (cons uri files)
 						templates)))))))))))
 		templates))))
     (funcall collect-macros nil nil)))
diff --git a/lisp/org.el b/lisp/org.el
index 6a15e801c..8dcfd1676 100644
--- a/lisp/org.el
+++ b/lisp/org.el
@@ -5273,17 +5273,72 @@ a string, summarizing TAGS, as a list of strings."
 	   (setq current-group (list tag))))
 	(_ nil)))))
 
-(defun org-file-contents (file &optional noerror)
-  "Return the contents of FILE, as a string."
-  (if (and file (file-readable-p file))
-      (with-temp-buffer
-	(insert-file-contents file)
-	(buffer-string))
-    (funcall (if noerror 'message 'error)
-	     "Cannot read file \"%s\"%s"
-	     file
-	     (let ((from (buffer-file-name (buffer-base-buffer))))
-	       (if from (concat " (referenced in file \"" from "\")") "")))))
+(defvar org--file-cache (make-hash-table :test #'equal)
+  "Hash table to store contents of files referenced via a URL.
+This is the cache of file URLs read using `org-file-contents'.")
+
+(defun org-file-clear-cache ()
+  "Clear the file cache stored in `org--file-cache'."
+  (interactive)
+  (clrhash org--file-cache))
+
+(defun org-file-url-p (file)
+  "Test whether FILE is a URL.
+Return non-nil if it is."
+  (require 'ffap)
+  (string-match-p ffap-url-regexp file))
+
+(defun org-file-contents (file &optional noerror nocache)
+  "Return the contents of FILE, as a string.
+
+FILE can be a file name or URL.
+
+If FILE is a URL, download the contents.  If the URL contents are
+already cached in the `org--file-cache' hash table, the download step
+is skipped.
+
+If NOERROR is non-nil, ignore the error when unable to read the FILE
+from file or URL.
+
+If NOCACHE is non-nil, do a fresh fetch of FILE even if cached version
+is available.  This option applies only if FILE is a URL."
+  (let* ((is-url (org-file-url-p file))
+         (cache (and is-url
+                     (not nocache)
+                     (gethash file org--file-cache)))
+         err)
+    (prog1
+        (cond
+         (cache)           ;Set contents to cache if cache is non-nil
+         (is-url
+          (let (url-retrieve-header
+		cont)
+            (with-current-buffer (url-retrieve-synchronously file)
+              (goto-char (point-min))
+              ;; Take point to after the url-retrieve header
+              (search-forward "\n\n") ;2 consecutive new-line chars
+              (setq url-retrieve-header (buffer-substring-no-properties
+                                         (point-min) (point)))
+              (message url-retrieve-header) ;Dump the URL retrieve header to *Messages*
+              (if (string-match-p "HTTP.*\\s-+200\\s-OK" url-retrieve-header) ;URL retrieved correctly
+                  (progn
+                    (setq cont (buffer-substring-no-properties (point) (point-max)))
+                    ;; Update the cache `org--file-cache'
+                    (puthash file cont org--file-cache))
+                (setq err t)))
+	    cont))
+         (t (if (and file (file-readable-p file))
+                (with-temp-buffer
+	          (insert-file-contents file)
+	          (buffer-string))
+              (setq err t))))
+      (when err
+        (let ((msg (if is-url
+                       "Unable to fetch file from"
+                     "Cannot read file"))
+              (from (buffer-file-name (buffer-base-buffer))))
+          (funcall (if noerror #'message #'user-error)
+                   (format "%s `%s' (referenced in file `%s')" msg file from)))))))
 
 (defun org-extract-log-state-settings (x)
   "Extract the log state setting from a TODO keyword string.
diff --git a/lisp/ox.el b/lisp/ox.el
index ac8d8ce68..7d1012974 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -1499,17 +1499,20 @@ Assume buffer is in Org mode.  Narrowing, if any, is ignored."
 			 (cond
 			  ;; Options in `org-export-special-keywords'.
 			  ((equal key "SETUPFILE")
-			   (let ((file
-				  (expand-file-name
-				   (org-unbracket-string "\"" "\"" (org-trim val)))))
+			   (let* ((uri (org-unbracket-string "\"" "\"" (org-trim val)))
+				  (uri-is-url (org-file-url-p uri))
+				  (uri (if uri-is-url
+					   uri
+					 (expand-file-name uri))))
 			     ;; Avoid circular dependencies.
-			     (unless (member file files)
+			     (unless (member uri files)
 			       (with-temp-buffer
-				 (setq default-directory
-				   (file-name-directory file))
-				 (insert (org-file-contents file 'noerror))
+				 (unless uri-is-url
+				   (setq default-directory
+					 (file-name-directory uri)))
+				 (insert (org-file-contents uri 'noerror))
 				 (let ((org-inhibit-startup t)) (org-mode))
-				 (funcall get-options (cons file files))))))
+				 (funcall get-options (cons uri files))))))
 			  ((equal key "OPTIONS")
 			   (setq plist
 				 (org-combine-plists
@@ -1647,17 +1650,22 @@ an alist where associations are (VARIABLE-NAME VALUE)."
 				      "BIND")
 			       (push (read (format "(%s)" val)) alist)
 			     ;; Enter setup file.
-			     (let ((file (expand-file-name
-					  (org-unbracket-string "\"" "\"" val))))
-			       (unless (member file files)
+			     (let* ((uri (org-unbracket-string "\"" "\"" val))
+				    (uri-is-url (org-file-url-p uri))
+				    (uri (if uri-is-url
+					     uri
+					   (expand-file-name uri))))
+			       ;; Avoid circular dependencies.
+			       (unless (member uri files)
 				 (with-temp-buffer
-				   (setq default-directory
-					 (file-name-directory file))
+				   (unless uri-is-url
+				     (setq default-directory
+					   (file-name-directory uri)))
 				   (let ((org-inhibit-startup t)) (org-mode))
-				   (insert (org-file-contents file 'noerror))
+				   (insert (org-file-contents uri 'noerror))
 				   (setq alist
 					 (funcall collect-bind
-						  (cons file files)
+						  (cons uri files)
 						  alist))))))))))
 		   alist)))))
       ;; Return value in appropriate order of appearance.
-- 
2.13.0


  reply	other threads:[~2017-05-23 19:07 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-03 17:23 Allow #+SETUPFILE to point to an URL for the org file Kaushal Modi
2016-12-08 11:51 ` Kaushal Modi
2016-12-08 14:22   ` John Kitchin
2016-12-08 14:31   ` Nicolas Goaziou
2016-12-08 14:44     ` Kaushal Modi
2016-12-08 21:48       ` Nicolas Goaziou
2016-12-08 22:07         ` Kaushal Modi
2016-12-08 22:40           ` Nicolas Goaziou
2017-03-13 17:37             ` Kaushal Modi
2017-03-30  7:43               ` Nicolas Goaziou
2017-05-23 19:07                 ` Kaushal Modi [this message]
2017-05-25 10:13                   ` Nicolas Goaziou
2017-05-25 10:18                     ` Nicolas Goaziou
2017-05-25 11:43                     ` Kaushal Modi
2017-05-25 15:15                       ` Kaushal Modi
2017-05-26  7:47                         ` Nicolas Goaziou
2017-05-26 20:24                           ` Kaushal Modi
2017-05-28  7:35                             ` Nicolas Goaziou
2017-05-28 10:04                               ` Kaushal Modi
2017-06-09 16:59                               ` Kaushal Modi
2017-06-12 19:32                                 ` Kaushal Modi
2017-06-13 12:43                                   ` Nicolas Goaziou
2017-06-13 15:45                                     ` Kaushal Modi
2017-06-13 21:32                                       ` Nicolas Goaziou
2017-06-13 21:42                                         ` Kaushal Modi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.orgmode.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFyQvY0VfUzcnR4+TAXn3fSti_==O2d6suXhn6grqf9CwztLew@mail.gmail.com' \
    --to=kaushal.modi@gmail.com \
    --cc=emacs-orgmode@gnu.org \
    --cc=mail@nicolasgoaziou.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).