emacs-orgmode@gnu.org archives
* org-feed: support Atom
@ 2009-04-15 13:00 Magnus Henoch
  2009-04-15 13:26 ` Carsten Dominik
  2009-04-15 14:42 ` Carsten Dominik
From: Magnus Henoch @ 2009-04-15 13:00 UTC
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 778 bytes --]

I hacked org-feed to make it support different parsers, and wrote a
simple Atom parser.  I'm not sure how strong my git-fu is, but I'll try
sending my patches here :)  Proposed ChangeLog entry:

2009-04-15  Magnus Henoch  <magnus.henoch@gmail.com>

	* org-feed.el (org-feed-alist): Add :parse-feed and :parse-entry
	options.  Mention org-feed-parse-atom-feed and
	org-feed-parse-atom-entry in the docstring.
	(org-feed-parse-rss-feed): Renamed from org-feed-parse-feed.
	(org-feed-parse-rss-entry): Renamed from org-feed-parse-entry.
	(org-feed-update): Use sha1 instead of org-sha1-string.  Use
	:parse-feed and :parse-entry.
	(org-feed-parse-atom-feed, org-feed-parse-atom-entry): New
	functions.
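
For illustration, a feed entry that selects the Atom parsers might look
like the sketch below.  The feed name, URL, file and headline are
placeholders rather than anything taken from the patches, and the
explicit (require 'xml) is an assumption: the Atom parsers call xml.el
functions, and the patches do not load that library themselves.

```elisp
;; Sketch only: name, URL, file and headline are made-up placeholders.
(require 'org-feed)
(require 'xml) ; assumed necessary; the Atom parsers use xml.el functions

(setq org-feed-alist
      '(("Example Atom feed"              ; feed name (placeholder)
         "http://example.org/feed.atom"   ; feed URL (placeholder)
         "~/org/feeds.org"                ; Org file (placeholder)
         "Example entries"                ; headline (placeholder)
         ;; Options added by this series; without them the RSS
         ;; parsers remain the default.
         :parse-feed org-feed-parse-atom-feed
         :parse-entry org-feed-parse-atom-entry)))
```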

I have assigned copyright for Emacs; would that be good enough for
Orgmode?

Magnus


[-- Attachment #2: 0001--org-feed.el-org-feed-parse-rss-feed-Renamed-fr.patch --]
[-- Type: text/x-patch, Size: 2217 bytes --]

From 8041c2f0a5379ccc08d3d3414cba54fb3786a94d Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 11:19:19 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-parse-rss-feed): Renamed from
 	org-feed-parse-feed.
 	(org-feed-parse-rss-entry): Renamed from org-feed-parse-entry.
 	(org-feed-update): Update calls.

---
 lisp/org-feed.el |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index fca71f7..3ece0f9 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -288,7 +288,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 	(error "Cannot get feed %s" name))
       (when retrieve-only
 	(throw 'exit feed-buffer))
-      (setq entries (org-feed-parse-feed feed-buffer))
+      (setq entries (org-feed-parse-rss-feed feed-buffer))
       (ignore-errors (kill-buffer feed-buffer))
       (save-excursion
 	(save-window-excursion
@@ -313,8 +313,8 @@ it can be a list structured like an entry in `org-feed-alist'."
 		  (push e changed))))
 
 	  ;; Parse the relevant entries fully
-	  (setq new     (mapcar 'org-feed-parse-entry new)
-		changed (mapcar 'org-feed-parse-entry changed))
+	  (setq new     (mapcar 'org-feed-parse-rss-entry new)
+		changed (mapcar 'org-feed-parse-rss-entry changed))
 
 	  ;; Run the filter
 	  (when filter
@@ -540,8 +540,8 @@ If that property is already present, nothing changes."
    ((functionp org-feed-retrieve-method)
     (funcall org-feed-retrieve-method url))))
 
-(defun org-feed-parse-feed (buffer)
-  "Parse BUFFER for RS feed entries.
+(defun org-feed-parse-rss-feed (buffer)
+  "Parse BUFFER for RSS feed entries.
 Returns a list of entries, with each entry a property list,
 containing the properties `:guid' and `:item-full-text'."
   (let (entries beg end item guid entry)
@@ -561,7 +561,7 @@ containing the properties `:guid' and `:item-full-text'."
 	(goto-char end))
       (nreverse entries))))
 
-(defun org-feed-parse-entry (entry)
+(defun org-feed-parse-rss-entry (entry)
   "Parse the `:item-full-text' field for xml tags and create new properties."
   (with-temp-buffer
     (insert (plist-get entry :item-full-text))
-- 
1.6.0.2


[-- Attachment #3: 0002--org-feed.el-org-feed-alist-Add-parse-feed-and.patch --]
[-- Type: text/x-patch, Size: 3432 bytes --]

From 9f0061ff7825b18e7fbfcebf849be7cd192af340 Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 11:35:01 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-alist): Add :parse-feed and :parse-entry
 	options.
 	(org-feed-update): Use them.

---
 lisp/org-feed.el |   28 ++++++++++++++++++++++++----
 1 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 3ece0f9..55b14bc 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -159,7 +159,17 @@ Here are the keyword-value pair allows in `org-feed-alist'.
      This function gets passed a list of all entries that have been
      handled before, but are now still in the feed and have *changed*
      since last handled (as evidenced by a different sha1 hash).
-     When the handler is called, point will be at the feed headline."
+     When the handler is called, point will be at the feed headline.
+
+:parse-feed function
+     This function gets passed a buffer, and should return a list of entries,
+     each being a property list containing the `:guid' and `:item-full-text'
+     keys.  The default is `org-feed-parse-rss-feed'.
+
+:parse-entry function
+     This function gets passed an entry as returned by the parse-feed
+     function, and should return the entry with interesting properties added.
+     The default is `org-feed-parse-rss-entry'."
   :group 'org-feed
   :type '(repeat
 	  (list :value ("" "http://" "" "")
@@ -184,6 +194,12 @@ Here are the keyword-value pair allows in `org-feed-alist'.
 		    (list :inline t :tag "Changed items"
 			  (const :changed-handler)
 			  (symbol :tag "Handler Function"))
+                    (list :inline t :tag "Parse Feed"
+                          (const :parse-feed)
+                          (symbol :tag "Parse Feed Function"))
+                    (list :inline t :tag "Parse Entry"
+                          (const :parse-entry)
+                          (symbol :tag "Parse Entry Function"))
 		    )))))
 
 (defcustom org-feed-drawer "FEEDSTATUS"
@@ -281,6 +297,10 @@ it can be a list structured like an entry in `org-feed-alist'."
 			org-feed-default-template))
 	  (drawer (or (nth 1 (memq :drawer feed))
 		      org-feed-drawer))
+          (parse-feed (or (nth 1 (memq :parse-feed feed))
+                          'org-feed-parse-rss-feed))
+          (parse-entry (or (nth 1 (memq :parse-entry feed))
+                           'org-feed-parse-rss-entry))
 	  feed-buffer inbox-pos new-formatted
 	  entries old-status status new changed guid-alist e guid olds)
       (setq feed-buffer (org-feed-get-feed url))
@@ -288,7 +308,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 	(error "Cannot get feed %s" name))
       (when retrieve-only
 	(throw 'exit feed-buffer))
-      (setq entries (org-feed-parse-rss-feed feed-buffer))
+      (setq entries (funcall parse-feed feed-buffer))
       (ignore-errors (kill-buffer feed-buffer))
       (save-excursion
 	(save-window-excursion
@@ -313,8 +333,8 @@ it can be a list structured like an entry in `org-feed-alist'."
 		  (push e changed))))
 
 	  ;; Parse the relevant entries fully
-	  (setq new     (mapcar 'org-feed-parse-rss-entry new)
-		changed (mapcar 'org-feed-parse-rss-entry changed))
+	  (setq new     (mapcar parse-entry new)
+		changed (mapcar parse-entry changed))
 
 	  ;; Run the filter
 	  (when filter
-- 
1.6.0.2


[-- Attachment #4: 0003--org-feed.el-org-feed-update-Use-sha1-instead-o.patch --]
[-- Type: text/x-patch, Size: 1185 bytes --]

From 269c6b3b0ec84d2c5478a7ac9cb0e49cfd2ca486 Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 11:37:15 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-update): Use sha1 instead of
 	org-sha1-string.

---
 lisp/org-feed.el |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 55b14bc..9c64bff 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -327,7 +327,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 		(push e new)
 	      (setq olds (nth 2 (assoc (plist-get e :guid) old-status)))
 	      (if (and olds
-		       (not (string= (org-sha1-string
+		       (not (string= (sha1
 				      (plist-get e :item-full-text))
 				     olds)))
 		  (push e changed))))
@@ -361,7 +361,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 			 ;; or if they were handled previously
 			 (if (assoc guid guid-alist) t (plist-get e :handled))
 			 ;; A hash, to detect changes
-			 (org-sha1-string (plist-get e :item-full-text))))
+			 (sha1 (plist-get e :item-full-text))))
 		 entries))
 
 	  ;; Handle new items in the feed
-- 
1.6.0.2


[-- Attachment #5: 0004--org-feed.el-org-feed-parse-atom-feed.patch --]
[-- Type: text/x-patch, Size: 2945 bytes --]

From ebb9b9094ae7cbf091d91540fee65cfe8522b869 Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 13:11:30 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-parse-atom-feed)
 	(org-feed-parse-atom-entry): New functions.

---
 lisp/org-feed.el |   50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 9c64bff..72d1a7a 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -596,6 +596,56 @@ containing the properties `:guid' and `:item-full-text'."
       (setq entry (plist-put entry :guid-permalink t))))
   entry)
 
+(defun org-feed-parse-atom-feed (buffer)
+  "Parse BUFFER for Atom feed entries.
+Returns a list of entries, with each entry a property list,
+containing the properties `:guid' and `:item-full-text'.
+
+The `:item-full-text' property actually contains the sexp
+formatted as a string, not the original XML data."
+  (with-current-buffer buffer
+    (widen)
+    (goto-char (point-min))
+    ;; Skip HTTP headers
+    (search-forward "\n\n")
+    (delete-region (point-min) (point))
+    (let ((feed (car (xml-parse-region (point-min) (point-max)))))
+      (mapcar
+       (lambda (entry)
+         (list
+          :guid (car (xml-node-children (car (xml-get-children entry 'id))))
+          :item-full-text (prin1-to-string entry)))
+       (xml-get-children feed 'entry)))))
+
+(defun org-feed-parse-atom-entry (entry)
+  "Parse the `:item-full-text' as a sexp and create new properties."
+  (let ((xml (car (read-from-string (plist-get entry :item-full-text)))))
+    ;; Get first <link href='foo'/>.
+    (setq entry (plist-put entry :link
+                           (xml-get-attribute
+                            (car (xml-get-children xml 'link))
+                            'href)))
+    ;; Add <title/> as :title.
+    (setq entry (plist-put entry :title
+                           (car (xml-node-children
+                                 (car (xml-get-children xml 'title))))))
+    (let* ((content (car (xml-get-children xml 'content)))
+           (type (xml-get-attribute-or-nil content 'type)))
+      (when content
+        (cond
+         ((or (null type) (string= type "text"))
+          ;; We like plain text; Atom treats a missing type as "text".
+          (setq entry (plist-put entry :description (car (xml-node-children content)))))
+         ((string= type "html")
+          ;; TODO: convert HTML to Org markup.
+          (setq entry (plist-put entry :description (car (xml-node-children content)))))
+         ((string= type "xhtml")
+          ;; TODO: convert XHTML to Org markup.
+          (setq entry (plist-put entry :description (prin1-to-string (xml-node-children content)))))
+         (t
+          (setq entry (plist-put entry :description (format "Unknown '%s' content." type)))))))
+    entry))
+
 (provide 'org-feed)
 
 ;; arch-tag: 0929b557-9bc4-47f4-9633-30a12dbb5ae2
-- 
1.6.0.2


[-- Attachment #6: 0005--org-feed.el-org-feed-alist-Mention-org-feed-pa.patch --]
[-- Type: text/x-patch, Size: 1341 bytes --]

From ee9522707cf7d40bb18fe4cf63407e522361d46b Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 13:13:09 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-alist): Mention org-feed-parse-atom-feed
 	and org-feed-parse-atom-entry in the docstring.

---
 lisp/org-feed.el |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 72d1a7a..13fee41 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -164,12 +164,14 @@ Here are the keyword-value pair allows in `org-feed-alist'.
 :parse-feed function
      This function gets passed a buffer, and should return a list of entries,
      each being a property list containing the `:guid' and `:item-full-text'
-     keys.  The default is `org-feed-parse-rss-feed'.
+     keys.  The default is `org-feed-parse-rss-feed'; `org-feed-parse-atom-feed'
+     is an alternative.
 
 :parse-entry function
      This function gets passed an entry as returned by the parse-feed
      function, and should return the entry with interesting properties added.
-     The default is `org-feed-parse-rss-entry'."
+     The default is `org-feed-parse-rss-entry'; `org-feed-parse-atom-entry'
+     is an alternative."
   :group 'org-feed
   :type '(repeat
 	  (list :value ("" "http://" "" "")
-- 
1.6.0.2


