From mboxrd@z Thu Jan 1 00:00:00 1970 From: Magnus Henoch Subject: org-feed: support Atom Date: Wed, 15 Apr 2009 14:00:25 +0100 Message-ID: <84ocuy3wd2.fsf@linux-b2a3.site> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" List-Id: "General discussions about Org-mode." To: emacs-orgmode@gnu.org --=-=-= I hacked org-feed to make it support different parsers, and wrote a simple Atom parser. 
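For illustration, an `org-feed-alist` entry that selects the Atom parsers might look something like the sketch below. The feed name, URL, file and headline are placeholders made up for this example; only the `:parse-feed` and `:parse-entry` keys and the two function names come from the patches.

```elisp
;; Hypothetical configuration: the name, URL, file and headline are
;; placeholders.  Only the two parser options are taken from the patches.
(setq org-feed-alist
      '(("Example Atom feed"              ; feed name (placeholder)
         "http://example.org/feed.atom"   ; feed URL (placeholder)
         "~/org/feeds.org"                ; Org file that receives the entries
         "Example entries"                ; headline under which entries go
         :parse-feed org-feed-parse-atom-feed
         :parse-entry org-feed-parse-atom-entry)))
```

Feeds that leave out both keys should keep using the RSS parsers, since those remain the defaults.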
Not sure how strong my git-fu is, but I'll try sending my patches here :) Proposed ChangeLog entry: 2009-04-15 Magnus Henoch * org-feed.el (org-feed-alist): Add :parse-feed and :parse-entry options. Mention org-feed-parse-atom-feed and org-feed-parse-atom-entry in the docstring. (org-feed-parse-rss-feed): Renamed from org-feed-parse-feed. (org-feed-parse-rss-entry): Renamed from org-feed-parse-entry. (org-feed-update): Use sha1 instead of org-sha1-string. Use :parse-feed and :parse-entry. (org-feed-parse-atom-feed, org-feed-parse-atom-entry): New functions. I have assigned copyright for Emacs; would that be good enough for Orgmode? Magnus --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0001--org-feed.el-org-feed-parse-rss-feed-Renamed-fr.patch >From 8041c2f0a5379ccc08d3d3414cba54fb3786a94d Mon Sep 17 00:00:00 2001 From: Magnus Henoch Date: Wed, 15 Apr 2009 11:19:19 +0100 Subject: [PATCH] * org-feed.el (org-feed-parse-rss-feed): Renamed from org-feed-parse-feed. (org-feed-parse-rss-entry): Renamed from org-feed-parse-entry. (org-feed-update): Update calls. --- lisp/org-feed.el | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/lisp/org-feed.el b/lisp/org-feed.el index fca71f7..3ece0f9 100644 --- a/lisp/org-feed.el +++ b/lisp/org-feed.el @@ -288,7 +288,7 @@ it can be a list structured like an entry in `org-feed-alist'." (error "Cannot get feed %s" name)) (when retrieve-only (throw 'exit feed-buffer)) - (setq entries (org-feed-parse-feed feed-buffer)) + (setq entries (org-feed-parse-rss-feed feed-buffer)) (ignore-errors (kill-buffer feed-buffer)) (save-excursion (save-window-excursion @@ -313,8 +313,8 @@ it can be a list structured like an entry in `org-feed-alist'." 
(push e changed)))) ;; Parse the relevant entries fully - (setq new (mapcar 'org-feed-parse-entry new) - changed (mapcar 'org-feed-parse-entry changed)) + (setq new (mapcar 'org-feed-parse-rss-entry new) + changed (mapcar 'org-feed-parse-rss-entry changed)) ;; Run the filter (when filter @@ -540,8 +540,8 @@ If that property is already present, nothing changes." ((functionp org-feed-retrieve-method) (funcall org-feed-retrieve-method url)))) -(defun org-feed-parse-feed (buffer) - "Parse BUFFER for RS feed entries. +(defun org-feed-parse-rss-feed (buffer) + "Parse BUFFER for RSS feed entries. Returns a list of entries, with each entry a property list, containing the properties `:guid' and `:item-full-text'." (let (entries beg end item guid entry) @@ -561,7 +561,7 @@ containing the properties `:guid' and `:item-full-text'." (goto-char end)) (nreverse entries)))) -(defun org-feed-parse-entry (entry) +(defun org-feed-parse-rss-entry (entry) "Parse the `:item-full-text' field for xml tags and create new properties." (with-temp-buffer (insert (plist-get entry :item-full-text)) -- 1.6.0.2 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0002--org-feed.el-org-feed-alist-Add-parse-feed-and.patch >From 9f0061ff7825b18e7fbfcebf849be7cd192af340 Mon Sep 17 00:00:00 2001 From: Magnus Henoch Date: Wed, 15 Apr 2009 11:35:01 +0100 Subject: [PATCH] * org-feed.el (org-feed-alist): Add :parse-feed and :parse-entry options. (org-feed-update): Use them. --- lisp/org-feed.el | 28 ++++++++++++++++++++++++---- 1 files changed, 24 insertions(+), 4 deletions(-) diff --git a/lisp/org-feed.el b/lisp/org-feed.el index 3ece0f9..55b14bc 100644 --- a/lisp/org-feed.el +++ b/lisp/org-feed.el @@ -159,7 +159,17 @@ Here are the keyword-value pair allows in `org-feed-alist'. This function gets passed a list of all entries that have been handled before, but are now still in the feed and have *changed* since last handled (as evidenced by a different sha1 hash). 
- When the handler is called, point will be at the feed headline." + When the handler is called, point will be at the feed headline. + +:parse-feed function + This function gets passed a buffer, and should return a list of entries, + each being a property list containing the `:guid' and `:item-full-text' + keys. The default is `org-feed-parse-rss-feed'. + +:parse-entry function + This function gets passed an entry as returned by the parse-feed + function, and should return the entry with interesting properties added. + The default is `org-feed-parse-rss-entry'." :group 'org-feed :type '(repeat (list :value ("" "http://" "" "") @@ -184,6 +194,12 @@ Here are the keyword-value pair allows in `org-feed-alist'. (list :inline t :tag "Changed items" (const :changed-handler) (symbol :tag "Handler Function")) + (list :inline t :tag "Parse Feed" + (const :parse-feed) + (symbol :tag "Parse Feed Function")) + (list :inline t :tag "Parse Entry" + (const :parse-entry) + (symbol :tag "Parse Entry Function")) ))))) (defcustom org-feed-drawer "FEEDSTATUS" @@ -281,6 +297,10 @@ it can be a list structured like an entry in `org-feed-alist'." org-feed-default-template)) (drawer (or (nth 1 (memq :drawer feed)) org-feed-drawer)) + (parse-feed (or (nth 1 (memq :parse-feed feed)) + 'org-feed-parse-rss-feed)) + (parse-entry (or (nth 1 (memq :parse-entry feed)) + 'org-feed-parse-rss-entry)) feed-buffer inbox-pos new-formatted entries old-status status new changed guid-alist e guid olds) (setq feed-buffer (org-feed-get-feed url)) @@ -288,7 +308,7 @@ it can be a list structured like an entry in `org-feed-alist'." (error "Cannot get feed %s" name)) (when retrieve-only (throw 'exit feed-buffer)) - (setq entries (org-feed-parse-rss-feed feed-buffer)) + (setq entries (funcall parse-feed feed-buffer)) (ignore-errors (kill-buffer feed-buffer)) (save-excursion (save-window-excursion @@ -313,8 +333,8 @@ it can be a list structured like an entry in `org-feed-alist'." 
(push e changed)))) ;; Parse the relevant entries fully - (setq new (mapcar 'org-feed-parse-rss-entry new) - changed (mapcar 'org-feed-parse-rss-entry changed)) + (setq new (mapcar parse-entry new) + changed (mapcar parse-entry changed)) ;; Run the filter (when filter -- 1.6.0.2 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0003--org-feed.el-org-feed-update-Use-sha1-instead-o.patch >From 269c6b3b0ec84d2c5478a7ac9cb0e49cfd2ca486 Mon Sep 17 00:00:00 2001 From: Magnus Henoch Date: Wed, 15 Apr 2009 11:37:15 +0100 Subject: [PATCH] * org-feed.el (org-feed-update): Use sha1 instead of org-sha1-string. --- lisp/org-feed.el | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/lisp/org-feed.el b/lisp/org-feed.el index 55b14bc..9c64bff 100644 --- a/lisp/org-feed.el +++ b/lisp/org-feed.el @@ -327,7 +327,7 @@ it can be a list structured like an entry in `org-feed-alist'." (push e new) (setq olds (nth 2 (assoc (plist-get e :guid) old-status))) (if (and olds - (not (string= (org-sha1-string + (not (string= (sha1 (plist-get e :item-full-text)) olds))) (push e changed)))) @@ -361,7 +361,7 @@ it can be a list structured like an entry in `org-feed-alist'." ;; or if they were handled previously (if (assoc guid guid-alist) t (plist-get e :handled)) ;; A hash, to detect changes - (org-sha1-string (plist-get e :item-full-text)))) + (sha1 (plist-get e :item-full-text)))) entries)) ;; Handle new items in the feed -- 1.6.0.2 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0004--org-feed.el-org-feed-parse-atom-feed.patch >From ebb9b9094ae7cbf091d91540fee65cfe8522b869 Mon Sep 17 00:00:00 2001 From: Magnus Henoch Date: Wed, 15 Apr 2009 13:11:30 +0100 Subject: [PATCH] * org-feed.el (org-feed-parse-atom-feed) (org-feed-parse-atom-entry): New functions. 
--- lisp/org-feed.el | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 50 insertions(+), 0 deletions(-) diff --git a/lisp/org-feed.el b/lisp/org-feed.el index 9c64bff..72d1a7a 100644 --- a/lisp/org-feed.el +++ b/lisp/org-feed.el @@ -596,6 +596,56 @@ containing the properties `:guid' and `:item-full-text'." (setq entry (plist-put entry :guid-permalink t)))) entry) +(defun org-feed-parse-atom-feed (buffer) + "Parse BUFFER for Atom feed entries. +Returns a list of entries, with each entry a property list, +containing the properties `:guid' and `:item-full-text'. + +The `:item-full-text' property actually contains the sexp +formatted as a string, not the original XML data." + (with-current-buffer buffer + (widen) + (goto-char (point-min)) + ;; Skip HTTP headers + (search-forward "\n\n") + (delete-region (point-min) (point)) + (let ((feed (car (xml-parse-region (point-min) (point-max))))) + (mapcar + (lambda (entry) + (list + :guid (car (xml-node-children (car (xml-get-children entry 'id)))) + :item-full-text (prin1-to-string entry))) + (xml-get-children feed 'entry))))) + +(defun org-feed-parse-atom-entry (entry) + "Parse the `:item-full-text' as a sexp and create new properties." + (let ((xml (car (read-from-string (plist-get entry :item-full-text))))) + ;; Get href of first <link>. + (setq entry (plist-put entry :link + (xml-get-attribute + (car (xml-get-children xml 'link)) + 'href))) + ;; Add <title> as :title. + (setq entry (plist-put entry :title + (car (xml-node-children + (car (xml-get-children xml 'title)))))) + (let* ((content (car (xml-get-children xml 'content))) + (type (xml-get-attribute-or-nil content 'type))) + (when content + (cond + ((string= type "text") + ;; We like plain text. + (setq entry (plist-put entry :description (car (xml-node-children content))))) + ((string= type "html") + ;; TODO: convert HTML to Org markup. 
+ (setq entry (plist-put entry :description (car (xml-node-children content))))) + ((string= type "xhtml") + ;; TODO: convert XHTML to Org markup. + (setq entry (plist-put entry :description (prin1-to-string (xml-node-children content))))) + (t + (setq entry (plist-put entry :description (format "Unknown '%s' content." type))))))) + entry)) + (provide 'org-feed) ;; arch-tag: 0929b557-9bc4-47f4-9633-30a12dbb5ae2 -- 1.6.0.2 --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0005--org-feed.el-org-feed-alist-Mention-org-feed-pa.patch >From ee9522707cf7d40bb18fe4cf63407e522361d46b Mon Sep 17 00:00:00 2001 From: Magnus Henoch <magnus.henoch@gmail.com> Date: Wed, 15 Apr 2009 13:13:09 +0100 Subject: [PATCH] * org-feed.el (org-feed-alist): Mention org-feed-parse-atom-feed and org-feed-parse-atom-entry in the docstring. --- lisp/org-feed.el | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) diff --git a/lisp/org-feed.el b/lisp/org-feed.el index 72d1a7a..13fee41 100644 --- a/lisp/org-feed.el +++ b/lisp/org-feed.el @@ -164,12 +164,14 @@ Here are the keyword-value pair allows in `org-feed-alist'. :parse-feed function This function gets passed a buffer, and should return a list of entries, each being a property list containing the `:guid' and `:item-full-text' - keys. The default is `org-feed-parse-rss-feed'. + keys. The default is `org-feed-parse-rss-feed'; `org-feed-parse-atom-feed' + is an alternative. :parse-entry function This function gets passed an entry as returned by the parse-feed function, and should return the entry with interesting properties added. - The default is `org-feed-parse-rss-entry'." + The default is `org-feed-parse-rss-entry'; `org-feed-parse-atom-entry' + is an alternative." 
:group 'org-feed :type '(repeat (list :value ("" "http://" "" "") -- 1.6.0.2 --=-=-=--