emacs-orgmode@gnu.org archives
* [PATCH] fix org-feed when retrieve-method is curl or wget
From: Christopher League @ 2009-06-30 15:59 UTC
  To: Emacs-orgmode

Hi Carsten and everyone. I love using org-feed to gather various
collection points (delicious, starred items in Google Reader, dial2do,
etc.) into org-mode.

I tried switching the org-feed-retrieve-method to curl or wget, and  
encountered some bugs. The fixes were simple, and the full details are  
in the attached git patch.

The problem I was trying to solve, however, was that delicious.com
would sometimes return a "500 server error" with url.el, and I'm not
sure why.  It returns RSS content anyway, with the error message as an
<item>. Org-feed doesn't check the HTTP response status, so it
processes the error as if it were a legitimate item (which means that
the next time the error occurs, it goes unnoticed).  Now that I got
curl working, the 500 doesn't seem to happen anymore.

My hypotheses so far: maybe it has something to do with the
user-agent, or with mangling special characters in the URL.  The
delicious URL contains the '&' argument separator, and when the error
message comes back, it appears with something like '&amp;amp;' in it,
as if it were escaped twice.  I haven't traced further to determine
whether the fault lies with url.el or with delicious.com.
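
The double-replacement symptom is easy to reproduce in isolation. The
sketch below is only an illustration of the effect, not a diagnosis of
where it happens. The helper name my-escape-ampersands and the query
string are made up for the example.

```elisp
;; Hypothetical helper for illustration; not taken from url.el or org-feed.el.
(defun my-escape-ampersands (string)
  "Return STRING with every `&' replaced by `&amp;'."
  ;; FIXEDCASE and LITERAL are both t, so the replacement is inserted verbatim.
  (replace-regexp-in-string "&" "&amp;" string t t))

;; Escaping once gives the usual XML form:
(my-escape-ampersands "a=1&b=2")
;; => "a=1&amp;b=2"

;; Escaping the already-escaped string produces the doubled form:
(my-escape-ampersands (my-escape-ampersands "a=1&b=2"))
;; => "a=1&amp;amp;b=2"
```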

Best wishes
Chris


[-- Attachment #2: 0001-fix-org-feed-when-retrieve-method-is-curl-or-wget.patch --]
[-- Type: application/octet-stream, Size: 3150 bytes --]

From 0fe1cac25d1d189881a136028e3b2fca5ec6f377 Mon Sep 17 00:00:00 2001
From: Christopher League <league@contrapunctus.net>
Date: Tue, 30 Jun 2009 11:34:50 -0400
Subject: [PATCH] fix org-feed when retrieve-method is curl or wget

- test on line 312 failed because these methods returned a string instead
  of a buffer

- requesting 'wget actually executed "curl", with bad parameters

- curl needs --silent, so that progress messages don't interrupt content

- atom parser had code to skip HTTP headers, but these are present only
  when using url-retrieve-synchronously; caused errors with curl/wget.
  Instead, remove HTTP headers right after feed buffer is populated.
---
 lisp/org-feed.el |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index edf49d1..5e14e00 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -309,7 +309,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 	  feed-buffer inbox-pos new-formatted
 	  entries old-status status new changed guid-alist e guid olds)
       (setq feed-buffer (org-feed-get-feed url))
-      (unless (and feed-buffer (bufferp feed-buffer))
+      (unless (and feed-buffer (bufferp (get-buffer feed-buffer)))
 	(error "Cannot get feed %s" name))
       (when retrieve-only
 	(throw 'exit feed-buffer))
@@ -549,18 +549,28 @@ If that property is already present, nothing changes."
 	       (org-split-string s "\n")
 	       (concat "\n" (make-string n ?\ )))))
 
+(defun org-feed-skip-http-headers (buffer)
+  "Remove HTTP headers from BUFFER, and return it.
+Assumes headers are indeed present!"
+  (with-current-buffer buffer
+    (widen)
+    (goto-char (point-min))
+    (search-forward "\n\n")
+    (delete-region (point-min) (point))
+    buffer))
+
 (defun org-feed-get-feed (url)
   "Get the RSS feed file at URL and return the buffer."
   (cond
    ((eq org-feed-retrieve-method 'url-retrieve-synchronously)
-    (url-retrieve-synchronously url))
+    (org-feed-skip-http-headers (url-retrieve-synchronously url)))
    ((eq org-feed-retrieve-method 'curl)
     (ignore-errors (kill-buffer org-feed-buffer))
-    (call-process "curl" nil org-feed-buffer nil url)
+    (call-process "curl" nil org-feed-buffer nil "--silent" url)
     org-feed-buffer)
    ((eq org-feed-retrieve-method 'wget)
     (ignore-errors (kill-buffer org-feed-buffer))
-    (call-process "curl" nil org-feed-buffer nil "-q" "-O" "-" url)
+    (call-process "wget" nil org-feed-buffer nil "-q" "-O" "-" url)
     org-feed-buffer)
    ((functionp org-feed-retrieve-method)
     (funcall org-feed-retrieve-method url))))
@@ -610,10 +620,6 @@ The `:item-full-text' property actually contains the sexp
 formatted as a string, not the original XML data."
   (with-current-buffer buffer
     (widen)
-    (goto-char (point-min))
-    ;; Skip HTTP headers
-    (search-forward "\n\n")
-    (delete-region (point-min) (point))
     (let ((feed (car (xml-parse-region (point-min) (point-max)))))
       (mapcar
        (lambda (entry)
-- 
1.6.3.3
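
The docstring of org-feed-skip-http-headers in the patch above notes
that it assumes headers are present. As a rough sketch of a more
defensive variant, the function below deletes the header block only
when the buffer actually begins with an HTTP status line. The name
my-feed-skip-http-headers-maybe is hypothetical, and this is not the
code that was applied.

```elisp
;; Hypothetical variant for illustration; not part of the applied patch.
(defun my-feed-skip-http-headers-maybe (buffer)
  "Remove leading HTTP headers from BUFFER if present, and return BUFFER.
Leave BUFFER untouched when it does not start with an HTTP status line,
as is the case for output produced by curl or wget."
  (with-current-buffer buffer
    (save-restriction
      (widen)
      (goto-char (point-min))
      ;; Only strip when a status line is followed by a blank line.
      (when (and (looking-at "HTTP/[0-9.]+ ")
                 (re-search-forward "\r?\n\r?\n" nil t))
        (delete-region (point-min) (point))))
    buffer))
```

Because it is a no-op on header-free content, a function like this
could be applied regardless of the retrieve method.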


_______________________________________________
Emacs-orgmode mailing list
Remember: use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode


* Re: [PATCH] fix org-feed when retrieve-method is curl or wget
From: Carsten Dominik @ 2009-06-30 17:40 UTC
  To: Christopher League; +Cc: Emacs-orgmode

Applied, thanks.

- Carsten

