emacs-orgmode@gnu.org archives
* org-feed: support Atom
@ 2009-04-15 13:00 Magnus Henoch
  2009-04-15 13:26 ` Carsten Dominik
  2009-04-15 14:42 ` Carsten Dominik
  0 siblings, 2 replies; 5+ messages in thread
From: Magnus Henoch @ 2009-04-15 13:00 UTC (permalink / raw)
  To: emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 778 bytes --]

I hacked org-feed to make it support different parsers, and wrote a
simple Atom parser.  I'm not sure how strong my git-fu is, but I'll try
sending my patches here :)  Proposed ChangeLog entry:

2009-04-15  Magnus Henoch  <magnus.henoch@gmail.com>

	* org-feed.el (org-feed-alist): Add :parse-feed and :parse-entry
	options.  Mention org-feed-parse-atom-feed and
	org-feed-parse-atom-entry in the docstring.
	(org-feed-parse-rss-feed): Renamed from org-feed-parse-feed.
	(org-feed-parse-rss-entry): Renamed from org-feed-parse-entry.
	(org-feed-update): Use sha1 instead of org-sha1-string.  Use
	:parse-feed and :parse-entry.
	(org-feed-parse-atom-feed, org-feed-parse-atom-entry): New
	functions.
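
Judging from the patches attached to this message, the two new options
split parsing into two stages: the :parse-feed function turns the fetched
buffer into property lists holding `:guid' and `:item-full-text', and the
:parse-entry function later expands one such list with further properties.
The following is a minimal, self-contained sketch of that two-stage shape,
not the code from the patch.  The `my/' function names and the sample XML
are hypothetical, and the sketch reads from a string rather than from a
retrieved URL buffer.

```elisp
(require 'xml)

;; Stage 1 (hypothetical helper): collect one plist per <entry>.
;; The full entry is stored as a printed sexp, as the Atom patch does.
(defun my/atom-entries-from-string (atom-xml)
  "Return a list of plists with `:guid' and `:item-full-text' for ATOM-XML."
  (with-temp-buffer
    (insert atom-xml)
    (let ((feed (car (xml-parse-region (point-min) (point-max)))))
      (mapcar
       (lambda (entry)
         (list :guid (car (xml-node-children
                           (car (xml-get-children entry 'id))))
               :item-full-text (prin1-to-string entry)))
       (xml-get-children feed 'entry)))))

;; Stage 2 (hypothetical helper): read the sexp back and add `:title'.
(defun my/atom-entry-add-title (entry)
  "Return ENTRY with a `:title' property taken from its <title> element."
  (let ((xml (car (read-from-string (plist-get entry :item-full-text)))))
    (plist-put entry :title
               (car (xml-node-children
                     (car (xml-get-children xml 'title)))))))

;; Usage with a made-up, one-entry feed.
(my/atom-entry-add-title
 (car (my/atom-entries-from-string
       "<feed><entry><id>urn:example:1</id><title>Hello</title></entry></feed>")))
```

Keeping the first stage cheap matters because, as the patched
org-feed-update shows, only new or changed entries are passed on to the
second stage.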

I have assigned copyright for Emacs; would that be good enough for
Orgmode?

Magnus


[-- Attachment #2: 0001--org-feed.el-org-feed-parse-rss-feed-Renamed-fr.patch --]
[-- Type: text/x-patch, Size: 2217 bytes --]

From 8041c2f0a5379ccc08d3d3414cba54fb3786a94d Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 11:19:19 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-parse-rss-feed): Renamed from
 	org-feed-parse-feed.
 	(org-feed-parse-rss-entry): Renamed from org-feed-parse-entry.
 	(org-feed-update): Update calls.

---
 lisp/org-feed.el |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index fca71f7..3ece0f9 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -288,7 +288,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 	(error "Cannot get feed %s" name))
       (when retrieve-only
 	(throw 'exit feed-buffer))
-      (setq entries (org-feed-parse-feed feed-buffer))
+      (setq entries (org-feed-parse-rss-feed feed-buffer))
       (ignore-errors (kill-buffer feed-buffer))
       (save-excursion
 	(save-window-excursion
@@ -313,8 +313,8 @@ it can be a list structured like an entry in `org-feed-alist'."
 		  (push e changed))))
 
 	  ;; Parse the relevant entries fully
-	  (setq new     (mapcar 'org-feed-parse-entry new)
-		changed (mapcar 'org-feed-parse-entry changed))
+	  (setq new     (mapcar 'org-feed-parse-rss-entry new)
+		changed (mapcar 'org-feed-parse-rss-entry changed))
 
 	  ;; Run the filter
 	  (when filter
@@ -540,8 +540,8 @@ If that property is already present, nothing changes."
    ((functionp org-feed-retrieve-method)
     (funcall org-feed-retrieve-method url))))
 
-(defun org-feed-parse-feed (buffer)
-  "Parse BUFFER for RS feed entries.
+(defun org-feed-parse-rss-feed (buffer)
+  "Parse BUFFER for RSS feed entries.
 Returns a list of entries, with each entry a property list,
 containing the properties `:guid' and `:item-full-text'."
   (let (entries beg end item guid entry)
@@ -561,7 +561,7 @@ containing the properties `:guid' and `:item-full-text'."
 	(goto-char end))
       (nreverse entries))))
 
-(defun org-feed-parse-entry (entry)
+(defun org-feed-parse-rss-entry (entry)
   "Parse the `:item-full-text' field for xml tags and create new properties."
   (with-temp-buffer
     (insert (plist-get entry :item-full-text))
-- 
1.6.0.2


[-- Attachment #3: 0002--org-feed.el-org-feed-alist-Add-parse-feed-and.patch --]
[-- Type: text/x-patch, Size: 3432 bytes --]

From 9f0061ff7825b18e7fbfcebf849be7cd192af340 Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 11:35:01 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-alist): Add :parse-feed and :parse-entry
 	options.
 	(org-feed-update): Use them.

---
 lisp/org-feed.el |   28 ++++++++++++++++++++++++----
 1 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 3ece0f9..55b14bc 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -159,7 +159,17 @@ Here are the keyword-value pair allows in `org-feed-alist'.
      This function gets passed a list of all entries that have been
      handled before, but are now still in the feed and have *changed*
      since last handled (as evidenced by a different sha1 hash).
-     When the handler is called, point will be at the feed headline."
+     When the handler is called, point will be at the feed headline.
+
+:parse-feed function
+     This function gets passed a buffer, and should return a list of entries,
+     each being a property list containing the `:guid' and `:item-full-text'
+     keys.  The default is `org-feed-parse-rss-feed'.
+
+:parse-entry function
+     This function gets passed an entry as returned by the parse-feed
+     function, and should return the entry with interesting properties added.
+     The default is `org-feed-parse-rss-entry'."
   :group 'org-feed
   :type '(repeat
 	  (list :value ("" "http://" "" "")
@@ -184,6 +194,12 @@ Here are the keyword-value pair allows in `org-feed-alist'.
 		    (list :inline t :tag "Changed items"
 			  (const :changed-handler)
 			  (symbol :tag "Handler Function"))
+                    (list :inline t :tag "Parse Feed"
+                          (const :parse-feed)
+                          (symbol :tag "Parse Feed Function"))
+                    (list :inline t :tag "Parse Entry"
+                          (const :parse-entry)
+                          (symbol :tag "Parse Entry Function"))
 		    )))))
 
 (defcustom org-feed-drawer "FEEDSTATUS"
@@ -281,6 +297,10 @@ it can be a list structured like an entry in `org-feed-alist'."
 			org-feed-default-template))
 	  (drawer (or (nth 1 (memq :drawer feed))
 		      org-feed-drawer))
+          (parse-feed (or (nth 1 (memq :parse-feed feed))
+                          'org-feed-parse-rss-feed))
+          (parse-entry (or (nth 1 (memq :parse-entry feed))
+                           'org-feed-parse-rss-entry))
 	  feed-buffer inbox-pos new-formatted
 	  entries old-status status new changed guid-alist e guid olds)
       (setq feed-buffer (org-feed-get-feed url))
@@ -288,7 +308,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 	(error "Cannot get feed %s" name))
       (when retrieve-only
 	(throw 'exit feed-buffer))
-      (setq entries (org-feed-parse-rss-feed feed-buffer))
+      (setq entries (funcall parse-feed feed-buffer))
       (ignore-errors (kill-buffer feed-buffer))
       (save-excursion
 	(save-window-excursion
@@ -313,8 +333,8 @@ it can be a list structured like an entry in `org-feed-alist'."
 		  (push e changed))))
 
 	  ;; Parse the relevant entries fully
-	  (setq new     (mapcar 'org-feed-parse-rss-entry new)
-		changed (mapcar 'org-feed-parse-rss-entry changed))
+	  (setq new     (mapcar parse-entry new)
+		changed (mapcar parse-entry changed))
 
 	  ;; Run the filter
 	  (when filter
-- 
1.6.0.2


[-- Attachment #4: 0003--org-feed.el-org-feed-update-Use-sha1-instead-o.patch --]
[-- Type: text/x-patch, Size: 1185 bytes --]

From 269c6b3b0ec84d2c5478a7ac9cb0e49cfd2ca486 Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 11:37:15 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-update): Use sha1 instead of
 	org-sha1-string.

---
 lisp/org-feed.el |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 55b14bc..9c64bff 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -327,7 +327,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 		(push e new)
 	      (setq olds (nth 2 (assoc (plist-get e :guid) old-status)))
 	      (if (and olds
-		       (not (string= (org-sha1-string
+		       (not (string= (sha1
 				      (plist-get e :item-full-text))
 				     olds)))
 		  (push e changed))))
@@ -361,7 +361,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 			 ;; or if they were handled previously
 			 (if (assoc guid guid-alist) t (plist-get e :handled))
 			 ;; A hash, to detect changes
-			 (org-sha1-string (plist-get e :item-full-text))))
+			 (sha1 (plist-get e :item-full-text))))
 		 entries))
 
 	  ;; Handle new items in the feed
-- 
1.6.0.2


[-- Attachment #5: 0004--org-feed.el-org-feed-parse-atom-feed.patch --]
[-- Type: text/x-patch, Size: 2945 bytes --]

From ebb9b9094ae7cbf091d91540fee65cfe8522b869 Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 13:11:30 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-parse-atom-feed)
 	(org-feed-parse-atom-entry): New functions.

---
 lisp/org-feed.el |   50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 9c64bff..72d1a7a 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -596,6 +596,56 @@ containing the properties `:guid' and `:item-full-text'."
       (setq entry (plist-put entry :guid-permalink t))))
   entry)
 
+(defun org-feed-parse-atom-feed (buffer)
+  "Parse BUFFER for Atom feed entries.
+Returns a list of entries, with each entry a property list,
+containing the properties `:guid' and `:item-full-text'.
+
+The `:item-full-text' property actually contains the sexp
+formatted as a string, not the original XML data."
+  (with-current-buffer buffer
+    (widen)
+    (goto-char (point-min))
+    ;; Skip HTTP headers
+    (search-forward "\n\n")
+    (delete-region (point-min) (point))
+    (let ((feed (car (xml-parse-region (point-min) (point-max)))))
+      (mapcar
+       (lambda (entry)
+         (list
+          :guid (car (xml-node-children (car (xml-get-children entry 'id))))
+          :item-full-text (prin1-to-string entry)))
+       (xml-get-children feed 'entry)))))
+
+(defun org-feed-parse-atom-entry (entry)
+  "Parse the `:item-full-text' as a sexp and create new properties."
+  (let ((xml (car (read-from-string (plist-get entry :item-full-text)))))
+    ;; Get first <link href='foo'/>.
+    (setq entry (plist-put entry :link
+                           (xml-get-attribute
+                            (car (xml-get-children xml 'link))
+                            'href)))
+    ;; Add <title/> as :title.
+    (setq entry (plist-put entry :title
+                           (car (xml-node-children
+                                 (car (xml-get-children xml 'title))))))
+    (let* ((content (car (xml-get-children xml 'content)))
+           (type (xml-get-attribute-or-nil content 'type)))
+      (when content
+        (cond
+         ((string= type "text")
+          ;; We like plain text.
+          (setq entry (plist-put entry :description (car (xml-node-children content)))))
+         ((string= type "html")
+          ;; TODO: convert HTML to Org markup.
+          (setq entry (plist-put entry :description (car (xml-node-children content)))))
+         ((string= type "xhtml")
+          ;; TODO: convert XHTML to Org markup.
+          (setq entry (plist-put entry :description (prin1-to-string (xml-node-children content)))))
+         (t
+          (setq entry (plist-put entry :description (format "Unknown '%s' content." type)))))))
+    entry))
+
 (provide 'org-feed)
 
 ;; arch-tag: 0929b557-9bc4-47f4-9633-30a12dbb5ae2
-- 
1.6.0.2


[-- Attachment #6: 0005--org-feed.el-org-feed-alist-Mention-org-feed-pa.patch --]
[-- Type: text/x-patch, Size: 1341 bytes --]

From ee9522707cf7d40bb18fe4cf63407e522361d46b Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 13:13:09 +0100
Subject: [PATCH] 	* org-feed.el (org-feed-alist): Mention org-feed-parse-atom-feed
 	and org-feed-parse-atom-entry in the docstring.

---
 lisp/org-feed.el |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 72d1a7a..13fee41 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -164,12 +164,14 @@ Here are the keyword-value pair allows in `org-feed-alist'.
 :parse-feed function
      This function gets passed a buffer, and should return a list of entries,
      each being a property list containing the `:guid' and `:item-full-text'
-     keys.  The default is `org-feed-parse-rss-feed'.
+     keys.  The default is `org-feed-parse-rss-feed'; `org-feed-parse-atom-feed'
+     is an alternative.
 
 :parse-entry function
      This function gets passed an entry as returned by the parse-feed
      function, and should return the entry with interesting properties added.
-     The default is `org-feed-parse-rss-entry'."
+     The default is `org-feed-parse-rss-entry'; `org-feed-parse-atom-entry'
+     is an alternative."
   :group 'org-feed
   :type '(repeat
 	  (list :value ("" "http://" "" "")
-- 
1.6.0.2


[-- Attachment #7: Type: text/plain, Size: 204 bytes --]

_______________________________________________
Emacs-orgmode mailing list
Remember: use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode


* Re: org-feed: support Atom
  2009-04-15 13:00 org-feed: support Atom Magnus Henoch
@ 2009-04-15 13:26 ` Carsten Dominik
  2009-04-15 14:43   ` Magnus Henoch
  2009-04-15 14:42 ` Carsten Dominik
  1 sibling, 1 reply; 5+ messages in thread
From: Carsten Dominik @ 2009-04-15 13:26 UTC (permalink / raw)
  To: Magnus Henoch; +Cc: emacs-orgmode

Hi Magnus,

On Apr 15, 2009, at 3:00 PM, Magnus Henoch wrote:

> I hacked org-feed to make it support different parsers, and wrote a
> simple Atom parser.


This sounds very good!

However, it does not mean anything to me.  :-)

Web-dumb as I am, I have no clue what
"Atom" means.  And since I would like to understand
the changes, would you mind explaining what this is
useful for, and maybe show an example?

> I have assigned copyright for Emacs; would that be good enough for
> Orgmode?

If it says "future changes" and is for Emacs, yes, this will be
sufficient.

- Carsten


* Re: org-feed: support Atom
  2009-04-15 13:00 org-feed: support Atom Magnus Henoch
  2009-04-15 13:26 ` Carsten Dominik
@ 2009-04-15 14:42 ` Carsten Dominik
  1 sibling, 0 replies; 5+ messages in thread
From: Carsten Dominik @ 2009-04-15 14:42 UTC (permalink / raw)
  To: Magnus Henoch; +Cc: emacs-orgmode

Actually, the patches look really good, I have already
applied them, thank you very much.

If I can ask a little more of your time, it would be great
if you'd update

http://orgmode.org/worg/org-contrib/org-feed.php

(i.e. the corresponding .org file in Worg).

And, if you like, feel free to update my RSS parser to use
xml-parse-region; it seems to me that you are doing a much better job
with the parsing than I am.

Thanks for your contribution!

- Carsten

P.S.  If you don't mind, I would like a copy of your
       copyright assignment, for my records.


* Re: org-feed: support Atom
  2009-04-15 13:26 ` Carsten Dominik
@ 2009-04-15 14:43   ` Magnus Henoch
  2009-04-15 14:58     ` Carsten Dominik
  0 siblings, 1 reply; 5+ messages in thread
From: Magnus Henoch @ 2009-04-15 14:43 UTC (permalink / raw)
  To: emacs-orgmode

Carsten Dominik <carsten.dominik@gmail.com> writes:

> Hi Magnus,
>
> On Apr 15, 2009, at 3:00 PM, Magnus Henoch wrote:
>
>> I hacked org-feed to make it support different parsers, and wrote a
>> simple Atom parser.
>
>
> This sounds very good!
>
> However, it does not mean anything to me.  :-)
>
> Web-dumb as I am, I have no clue what
> "Atom" means.  And since I would like to understand
> the changes, would you mind explaining what this is
> useful for, and maybe show an example?

Sure.  The great thing about standards is that there are so many of them
to choose from, and feeds are no exception.  The current version of
org-feed.el supports RSS - I'm not sure exactly which version;
http://en.wikipedia.org/wiki/Rss#Variants lists six versions of RSS with
various levels of intercompatibility.  The Atom format is an attempt to
clear up this mess by starting from scratch, and some web applications
have only Atom feeds.

One example of an Atom feed can be found at
https://bugzilla.mozilla.org/.  Go to "Bugs Filed Today", get the feed
by clicking the "radio wave" icon in the Firefox location bar, and use
that URL for org-feed-alist.  You will need my patches for that to work,
and "Parse Feed" and "Parse Entry" (the :parse-feed and :parse-entry
options) need to be set to org-feed-parse-atom-feed and
org-feed-parse-atom-entry.  Then, you have a low-maintenance list of
Mozilla bugs in an Org-mode page.
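
A configuration along these lines might look as follows.  This is only a
sketch: the feed name, the URL, the Org file, and the headline are
placeholders (the real feed URL has to be copied from the browser as
described above), and it assumes the patches from this thread are applied
so that the two Atom functions exist.

```elisp
;; Sketch only: every string below is a placeholder, not a real feed.
(setq org-feed-alist
      '(("Mozilla bugs"                    ; feed name (hypothetical)
         "https://example.org/bugs.atom"   ; stand-in for the copied feed URL
         "~/org/feeds.org"                 ; Org file to file into (hypothetical)
         "Mozilla bugs filed today"        ; headline in that file (hypothetical)
         ;; Use the Atom parsers instead of the RSS defaults.
         :parse-feed org-feed-parse-atom-feed
         :parse-entry org-feed-parse-atom-entry)))
```

Without the two keywords, the patched org-feed-update falls back to
org-feed-parse-rss-feed and org-feed-parse-rss-entry.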

>> I have assigned copyright for Emacs; would that be good enough for
>> Orgmode?
>
> If it says "future changes" and is for Emacs, yes, this will be
> sufficient.

It does, yes.

Magnus


* Re: Re: org-feed: support Atom
  2009-04-15 14:43   ` Magnus Henoch
@ 2009-04-15 14:58     ` Carsten Dominik
  0 siblings, 0 replies; 5+ messages in thread
From: Carsten Dominik @ 2009-04-15 14:58 UTC (permalink / raw)
  To: Magnus Henoch; +Cc: emacs-orgmode

Hi Magnus,

thanks!

- Carsten
