emacs-orgmode@gnu.org archives
* [PATCH] fix org-feed when retrieve-method is curl or wget
@ 2009-06-30 15:59 Christopher League
  2009-06-30 17:40 ` Carsten Dominik
From: Christopher League @ 2009-06-30 15:59 UTC
  To: Emacs-orgmode

[-- Attachment #1: Type: text/plain, Size: 1202 bytes --]

Hi Carsten and everyone. I love using org-feed to gather items from
various collection points (delicious, starred items in Google Reader,
dial2do, etc.) into org-mode.

I tried switching the org-feed-retrieve-method to curl or wget, and  
encountered some bugs. The fixes were simple, and the full details are  
in the attached git patch.

The problem I was originally trying to solve, however, was that
delicious.com would sometimes return a "500 server error" when fetched
with url.el, and I'm not sure why. It returns RSS content anyway, with
the error message as an <item>. Org-feed doesn't check the HTTP
response status, so it processes the error as if it were a legitimate
item (which means that the next time the error occurs, it is silent).
Now that I have curl working, the 500 doesn't seem to happen anymore.
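
A minimal sketch of the kind of status check described as missing above,
assuming the buffer still begins with the raw HTTP status line (as it does
for a buffer returned by url-retrieve-synchronously). The name
my/http-status-in-buffer is hypothetical and is not part of org-feed.

```emacs-lisp
;; Hypothetical helper, not part of org-feed.
(defun my/http-status-in-buffer (buffer)
  "Return the HTTP status code at the top of BUFFER as an integer.
Return nil when BUFFER does not start with an HTTP status line."
  (with-current-buffer buffer
    (save-excursion
      (save-restriction
        (widen)
        (goto-char (point-min))
        ;; A status line looks like "HTTP/1.1 500 Internal Server Error".
        (when (looking-at "HTTP/[0-9.]+ +\\([0-9]+\\)")
          (string-to-number (match-string 1)))))))

;; Usage on a hand-made response, so no network access is needed:
(with-temp-buffer
  (insert "HTTP/1.1 500 Internal Server Error\n"
          "Content-Type: text/xml\n\n<rss/>")
  (my/http-status-in-buffer (current-buffer)))   ; => 500
```

A caller could refuse to parse the feed whenever this returns a number
outside the 2xx range. With curl or wget no status line reaches the buffer,
so the function would return nil there and a different check would be needed.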

My hypotheses so far: maybe it has something to do with the
user-agent, or with mangling of special characters in the URL. The
delicious URL contains the '&' argument separator, and when the error
message comes back, it appears with something like '&amp;amp;' in it,
as if it had been escaped twice. I haven't traced it further to
determine whether the fault lies with url.el or with delicious.com.
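
To illustrate what "escaped twice" would look like, here is a small sketch.
It only reproduces the symptom. It does not show where the double escaping
actually happens, and the helper name and the sample query string are made
up for the example.

```emacs-lisp
;; Hypothetical helper for illustration only.
(defun my/escape-ampersands (s)
  "Return S with every `&' replaced by `&amp;'."
  ;; The two trailing t arguments keep case fixed and make the
  ;; replacement literal.
  (replace-regexp-in-string "&" "&amp;" s t t))

(my/escape-ampersands "tag=a&count=5")
;; => "tag=a&amp;count=5"

(my/escape-ampersands (my/escape-ampersands "tag=a&count=5"))
;; => "tag=a&amp;amp;count=5"
```

The second call escapes the '&' that the first call introduced, which gives
the '&amp;amp;' pattern seen in the error message.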

Best wishes

[-- Attachment #2: 0001-fix-org-feed-when-retrieve-method-is-curl-or-wget.patch --]
[-- Type: application/octet-stream, Size: 3150 bytes --]

From 0fe1cac25d1d189881a136028e3b2fca5ec6f377 Mon Sep 17 00:00:00 2001
From: Christopher League <league@contrapunctus.net>
Date: Tue, 30 Jun 2009 11:34:50 -0400
Subject: [PATCH] fix org-feed when retrieve-method is curl or wget

- test on line 312 failed because these methods returned a string instead
  of a buffer

- requesting 'wget actually executed "curl", with bad parameters

- curl needs --silent, so that progress messages don't interrupt content

- atom parser had code to skip HTTP headers, but these are present only
  when using url-retrieve-synchronously; caused errors with curl/wget.
  Instead, remove HTTP headers right after feed buffer is populated.
 lisp/org-feed.el |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index edf49d1..5e14e00 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -309,7 +309,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 	  feed-buffer inbox-pos new-formatted
 	  entries old-status status new changed guid-alist e guid olds)
       (setq feed-buffer (org-feed-get-feed url))
-      (unless (and feed-buffer (bufferp feed-buffer))
+      (unless (and feed-buffer (bufferp (get-buffer feed-buffer)))
 	(error "Cannot get feed %s" name))
       (when retrieve-only
 	(throw 'exit feed-buffer))
@@ -549,18 +549,28 @@ If that property is already present, nothing changes."
 	       (org-split-string s "\n")
 	       (concat "\n" (make-string n ?\ )))))
+(defun org-feed-skip-http-headers (buffer)
+  "Remove HTTP headers from BUFFER, and return it.
+Assumes headers are indeed present!"
+  (with-current-buffer buffer
+    (widen)
+    (goto-char (point-min))
+    (search-forward "\n\n")
+    (delete-region (point-min) (point))
+    buffer))
 (defun org-feed-get-feed (url)
   "Get the RSS feed file at URL and return the buffer."
    ((eq org-feed-retrieve-method 'url-retrieve-synchronously)
-    (url-retrieve-synchronously url))
+    (org-feed-skip-http-headers (url-retrieve-synchronously url)))
    ((eq org-feed-retrieve-method 'curl)
     (ignore-errors (kill-buffer org-feed-buffer))
-    (call-process "curl" nil org-feed-buffer nil url)
+    (call-process "curl" nil org-feed-buffer nil "--silent" url)
    ((eq org-feed-retrieve-method 'wget)
     (ignore-errors (kill-buffer org-feed-buffer))
-    (call-process "curl" nil org-feed-buffer nil "-q" "-O" "-" url)
+    (call-process "wget" nil org-feed-buffer nil "-q" "-O" "-" url)
    ((functionp org-feed-retrieve-method)
     (funcall org-feed-retrieve-method url))))
@@ -610,10 +620,6 @@ The `:item-full-text' property actually contains the sexp
 formatted as a string, not the original XML data."
   (with-current-buffer buffer
-    (goto-char (point-min))
-    ;; Skip HTTP headers
-    (search-forward "\n\n")
-    (delete-region (point-min) (point))
     (let ((feed (car (xml-parse-region (point-min) (point-max)))))
        (lambda (entry)

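Two points from the patch can be tried in isolation. First, the curl and
wget branches hand back a buffer name (a string), which is why the check
goes through get-buffer. Second, org-feed-skip-http-headers assumes headers
are present. The sketch below shows a more defensive variant that leaves a
buffer alone when it does not start with a status line. Both function names
are hypothetical and this is not what the patch installs.

```emacs-lisp
;; Hypothetical helpers, not part of org-feed or of the patch above.
(defun my/feed-buffer-live-p (buffer-or-name)
  "Return non-nil if BUFFER-OR-NAME is a buffer or names an existing one."
  (and buffer-or-name (bufferp (get-buffer buffer-or-name))))

(defun my/feed-strip-http-headers (buffer)
  "Delete a leading HTTP header block from BUFFER, if there is one.
Return BUFFER.  Leave BUFFER untouched when it does not begin with an
HTTP status line, as is the case for curl or wget output."
  (with-current-buffer buffer
    (save-restriction
      (widen)
      (goto-char (point-min))
      (when (and (looking-at "HTTP/[0-9.]+ +[0-9]+")
                 ;; Headers end at the first empty line (LF or CRLF).
                 (re-search-forward "\r?\n\r?\n" nil t))
        (delete-region (point-min) (point))))
    buffer))

;; Usage on hand-made content, so no network access is needed:
(with-temp-buffer
  (insert "HTTP/1.1 200 OK\nContent-Type: text/xml\n\n<rss/>")
  (my/feed-strip-http-headers (current-buffer))
  (buffer-string))   ; => "<rss/>"
```

A plain (bufferp "some-name") is always nil, even when a buffer with that
name exists. That matches the failure the first bullet of the commit message
describes.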