From: Magnus Henoch <magnus.henoch@gmail.com>
To: emacs-orgmode@gnu.org
Subject: org-feed: support Atom
Date: Wed, 15 Apr 2009 14:00:25 +0100
Message-ID: <84ocuy3wd2.fsf@linux-b2a3.site>
[-- Attachment #1: Type: text/plain, Size: 778 bytes --]
I hacked org-feed to make it support different parsers, and wrote a
simple Atom parser. I'm not sure how strong my git-fu is, but I'll try
sending my patches here :) Proposed ChangeLog entry:
2009-04-15 Magnus Henoch <magnus.henoch@gmail.com>
* org-feed.el (org-feed-alist): Add :parse-feed and :parse-entry
options. Mention org-feed-parse-atom-feed and
org-feed-parse-atom-entry in the docstring.
(org-feed-parse-rss-feed): Renamed from org-feed-parse-feed.
(org-feed-parse-rss-entry): Renamed from org-feed-parse-entry.
(org-feed-update): Use sha1 instead of org-sha1-string. Use
:parse-feed and :parse-entry.
(org-feed-parse-atom-feed, org-feed-parse-atom-entry): New
functions.
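To illustrate the new options, a feed entry that selects the Atom parsers
might look like the sketch below. The feed name, URL, file and headline
are made-up placeholders; only the :parse-feed and :parse-entry keywords
and the two parser function names come from these patches.

```emacs-lisp
;; A minimal sketch, assuming the patches in this message are applied.
;; Entry layout: (NAME URL FILE HEADLINE [KEYWORD VALUE]...)
(setq org-feed-alist
      '(("Example Atom feed"              ; hypothetical feed name
         "http://example.org/atom.xml"    ; hypothetical feed URL
         "~/org/feeds.org"                ; hypothetical inbox file
         "Example entries"                ; hypothetical inbox headline
         ;; Without these two options the RSS parsers remain the default.
         :parse-feed org-feed-parse-atom-feed
         :parse-entry org-feed-parse-atom-entry)))
```

Feeds that omit both keywords should behave as before, since
org-feed-update falls back to org-feed-parse-rss-feed and
org-feed-parse-rss-entry.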
I have assigned copyright for Emacs; would that be good enough for
Orgmode?
Magnus
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001--org-feed.el-org-feed-parse-rss-feed-Renamed-fr.patch --]
[-- Type: text/x-patch, Size: 2217 bytes --]
From 8041c2f0a5379ccc08d3d3414cba54fb3786a94d Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 11:19:19 +0100
Subject: [PATCH] * org-feed.el (org-feed-parse-rss-feed): Renamed from
org-feed-parse-feed.
(org-feed-parse-rss-entry): Renamed from org-feed-parse-entry.
(org-feed-update): Update calls.
---
lisp/org-feed.el | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index fca71f7..3ece0f9 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -288,7 +288,7 @@ it can be a list structured like an entry in `org-feed-alist'."
(error "Cannot get feed %s" name))
(when retrieve-only
(throw 'exit feed-buffer))
- (setq entries (org-feed-parse-feed feed-buffer))
+ (setq entries (org-feed-parse-rss-feed feed-buffer))
(ignore-errors (kill-buffer feed-buffer))
(save-excursion
(save-window-excursion
@@ -313,8 +313,8 @@ it can be a list structured like an entry in `org-feed-alist'."
(push e changed))))
;; Parse the relevant entries fully
- (setq new (mapcar 'org-feed-parse-entry new)
- changed (mapcar 'org-feed-parse-entry changed))
+ (setq new (mapcar 'org-feed-parse-rss-entry new)
+ changed (mapcar 'org-feed-parse-rss-entry changed))
;; Run the filter
(when filter
@@ -540,8 +540,8 @@ If that property is already present, nothing changes."
((functionp org-feed-retrieve-method)
(funcall org-feed-retrieve-method url))))
-(defun org-feed-parse-feed (buffer)
- "Parse BUFFER for RS feed entries.
+(defun org-feed-parse-rss-feed (buffer)
+ "Parse BUFFER for RSS feed entries.
Returns a list of entries, with each entry a property list,
containing the properties `:guid' and `:item-full-text'."
(let (entries beg end item guid entry)
@@ -561,7 +561,7 @@ containing the properties `:guid' and `:item-full-text'."
(goto-char end))
(nreverse entries))))
-(defun org-feed-parse-entry (entry)
+(defun org-feed-parse-rss-entry (entry)
"Parse the `:item-full-text' field for xml tags and create new properties."
(with-temp-buffer
(insert (plist-get entry :item-full-text))
--
1.6.0.2
[-- Attachment #3: 0002--org-feed.el-org-feed-alist-Add-parse-feed-and.patch --]
[-- Type: text/x-patch, Size: 3432 bytes --]
From 9f0061ff7825b18e7fbfcebf849be7cd192af340 Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 11:35:01 +0100
Subject: [PATCH] * org-feed.el (org-feed-alist): Add :parse-feed and :parse-entry
options.
(org-feed-update): Use them.
---
lisp/org-feed.el | 28 ++++++++++++++++++++++++----
1 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 3ece0f9..55b14bc 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -159,7 +159,17 @@ Here are the keyword-value pair allows in `org-feed-alist'.
This function gets passed a list of all entries that have been
handled before, but are now still in the feed and have *changed*
since last handled (as evidenced by a different sha1 hash).
- When the handler is called, point will be at the feed headline."
+ When the handler is called, point will be at the feed headline.
+
+:parse-feed function
+ This function gets passed a buffer, and should return a list of entries,
+ each being a property list containing the `:guid' and `:item-full-text'
+ keys. The default is `org-feed-parse-rss-feed'.
+
+:parse-entry function
+ This function gets passed an entry as returned by the parse-feed
+ function, and should return the entry with interesting properties added.
+ The default is `org-feed-parse-rss-entry'."
:group 'org-feed
:type '(repeat
(list :value ("" "http://" "" "")
@@ -184,6 +194,12 @@ Here are the keyword-value pair allows in `org-feed-alist'.
(list :inline t :tag "Changed items"
(const :changed-handler)
(symbol :tag "Handler Function"))
+ (list :inline t :tag "Parse Feed"
+ (const :parse-feed)
+ (symbol :tag "Parse Feed Function"))
+ (list :inline t :tag "Parse Entry"
+ (const :parse-entry)
+ (symbol :tag "Parse Entry Function"))
)))))
(defcustom org-feed-drawer "FEEDSTATUS"
@@ -281,6 +297,10 @@ it can be a list structured like an entry in `org-feed-alist'."
org-feed-default-template))
(drawer (or (nth 1 (memq :drawer feed))
org-feed-drawer))
+ (parse-feed (or (nth 1 (memq :parse-feed feed))
+ 'org-feed-parse-rss-feed))
+ (parse-entry (or (nth 1 (memq :parse-entry feed))
+ 'org-feed-parse-rss-entry))
feed-buffer inbox-pos new-formatted
entries old-status status new changed guid-alist e guid olds)
(setq feed-buffer (org-feed-get-feed url))
@@ -288,7 +308,7 @@ it can be a list structured like an entry in `org-feed-alist'."
(error "Cannot get feed %s" name))
(when retrieve-only
(throw 'exit feed-buffer))
- (setq entries (org-feed-parse-rss-feed feed-buffer))
+ (setq entries (funcall parse-feed feed-buffer))
(ignore-errors (kill-buffer feed-buffer))
(save-excursion
(save-window-excursion
@@ -313,8 +333,8 @@ it can be a list structured like an entry in `org-feed-alist'."
(push e changed))))
;; Parse the relevant entries fully
- (setq new (mapcar 'org-feed-parse-rss-entry new)
- changed (mapcar 'org-feed-parse-rss-entry changed))
+ (setq new (mapcar parse-entry new)
+ changed (mapcar parse-entry changed))
;; Run the filter
(when filter
--
1.6.0.2
[-- Attachment #4: 0003--org-feed.el-org-feed-update-Use-sha1-instead-o.patch --]
[-- Type: text/x-patch, Size: 1185 bytes --]
From 269c6b3b0ec84d2c5478a7ac9cb0e49cfd2ca486 Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 11:37:15 +0100
Subject: [PATCH] * org-feed.el (org-feed-update): Use sha1 instead of
org-sha1-string.
---
lisp/org-feed.el | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 55b14bc..9c64bff 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -327,7 +327,7 @@ it can be a list structured like an entry in `org-feed-alist'."
(push e new)
(setq olds (nth 2 (assoc (plist-get e :guid) old-status)))
(if (and olds
- (not (string= (org-sha1-string
+ (not (string= (sha1
(plist-get e :item-full-text))
olds)))
(push e changed))))
@@ -361,7 +361,7 @@ it can be a list structured like an entry in `org-feed-alist'."
;; or if they were handled previously
(if (assoc guid guid-alist) t (plist-get e :handled))
;; A hash, to detect changes
- (org-sha1-string (plist-get e :item-full-text))))
+ (sha1 (plist-get e :item-full-text))))
entries))
;; Handle new items in the feed
--
1.6.0.2
[-- Attachment #5: 0004--org-feed.el-org-feed-parse-atom-feed.patch --]
[-- Type: text/x-patch, Size: 2945 bytes --]
From ebb9b9094ae7cbf091d91540fee65cfe8522b869 Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 13:11:30 +0100
Subject: [PATCH] * org-feed.el (org-feed-parse-atom-feed)
(org-feed-parse-atom-entry): New functions.
---
lisp/org-feed.el | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 50 insertions(+), 0 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 9c64bff..72d1a7a 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -596,6 +596,56 @@ containing the properties `:guid' and `:item-full-text'."
(setq entry (plist-put entry :guid-permalink t))))
entry)
+(defun org-feed-parse-atom-feed (buffer)
+ "Parse BUFFER for Atom feed entries.
+Returns a list of entries, with each entry a property list,
+containing the properties `:guid' and `:item-full-text'.
+
+The `:item-full-text' property actually contains the sexp
+formatted as a string, not the original XML data."
+ (with-current-buffer buffer
+ (widen)
+ (goto-char (point-min))
+ ;; Skip HTTP headers
+ (search-forward "\n\n")
+ (delete-region (point-min) (point))
+ (let ((feed (car (xml-parse-region (point-min) (point-max)))))
+ (mapcar
+ (lambda (entry)
+ (list
+ :guid (car (xml-node-children (car (xml-get-children entry 'id))))
+ :item-full-text (prin1-to-string entry)))
+ (xml-get-children feed 'entry)))))
+
+(defun org-feed-parse-atom-entry (entry)
+ "Parse the `:item-full-text' as a sexp and create new properties."
+ (let ((xml (car (read-from-string (plist-get entry :item-full-text)))))
+ ;; Get first <link href='foo'/>.
+ (setq entry (plist-put entry :link
+ (xml-get-attribute
+ (car (xml-get-children xml 'link))
+ 'href)))
+ ;; Add <title/> as :title.
+ (setq entry (plist-put entry :title
+ (car (xml-node-children
+ (car (xml-get-children xml 'title))))))
+ (let* ((content (car (xml-get-children xml 'content)))
+ (type (xml-get-attribute-or-nil content 'type)))
+ (when content
+ (cond
+ ((string= type "text")
+ ;; We like plain text.
+ (setq entry (plist-put entry :description (car (xml-node-children content)))))
+ ((string= type "html")
+ ;; TODO: convert HTML to Org markup.
+ (setq entry (plist-put entry :description (car (xml-node-children content)))))
+ ((string= type "xhtml")
+ ;; TODO: convert XHTML to Org markup.
+ (setq entry (plist-put entry :description (prin1-to-string (xml-node-children content)))))
+ (t
+ (setq entry (plist-put entry :description (format "Unknown '%s' content." type)))))))
+ entry))
+
(provide 'org-feed)
;; arch-tag: 0929b557-9bc4-47f4-9633-30a12dbb5ae2
--
1.6.0.2
[-- Attachment #6: 0005--org-feed.el-org-feed-alist-Mention-org-feed-pa.patch --]
[-- Type: text/x-patch, Size: 1341 bytes --]
From ee9522707cf7d40bb18fe4cf63407e522361d46b Mon Sep 17 00:00:00 2001
From: Magnus Henoch <magnus.henoch@gmail.com>
Date: Wed, 15 Apr 2009 13:13:09 +0100
Subject: [PATCH] * org-feed.el (org-feed-alist): Mention org-feed-parse-atom-feed
and org-feed-parse-atom-entry in the docstring.
---
lisp/org-feed.el | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index 72d1a7a..13fee41 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -164,12 +164,14 @@ Here are the keyword-value pair allows in `org-feed-alist'.
:parse-feed function
This function gets passed a buffer, and should return a list of entries,
each being a property list containing the `:guid' and `:item-full-text'
- keys. The default is `org-feed-parse-rss-feed'.
+ keys. The default is `org-feed-parse-rss-feed'; `org-feed-parse-atom-feed'
+ is an alternative.
:parse-entry function
This function gets passed an entry as returned by the parse-feed
function, and should return the entry with interesting properties added.
- The default is `org-feed-parse-rss-entry'."
+ The default is `org-feed-parse-rss-entry'; `org-feed-parse-atom-entry'
+ is an alternative."
:group 'org-feed
:type '(repeat
(list :value ("" "http://" "" "")
--
1.6.0.2