emacs-orgmode@gnu.org archives
From: Christopher League <league@contrapunctus.net>
To: Emacs-orgmode@gnu.org
Subject: [PATCH] fix org-feed when retrieve-method is curl or wget
Date: Tue, 30 Jun 2009 11:59:20 -0400
Message-ID: <BABA1954-FF22-4C2A-98D9-1522450AEC8D@contrapunctus.net>

[-- Attachment #1: Type: text/plain, Size: 1202 bytes --]

Hi Carsten and everyone. I love using org-feed to gather items from
various collection points (Delicious, Google Reader starred items,
dial2do, etc.) into org-mode.

I tried switching the org-feed-retrieve-method to curl or wget, and  
encountered some bugs. The fixes were simple, and the full details are  
in the attached git patch.

The problem I was trying to solve, however, was that delicious.com  
would return a "500 server error" sometimes with url.el, and I'm not  
sure why.  It returns RSS content anyway, with the error message as an  
<item>. Org-feed doesn't notice the HTTP response status, and  
processes the error as if it were a legit item (which means that next  
time the error occurs, it is silent).  Now that I got curl working,  
the 500 doesn't seem to happen anymore.
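
As a rough sketch of how the response status could be checked (this is
not part of the attached patch, and `example-feed-http-status' is a
hypothetical name), something like the following could read the status
line that url-retrieve-synchronously leaves at the top of its buffer,
assuming the headers have not been stripped yet:

```elisp
;; Hypothetical helper, not part of org-feed.el.
;; Assumes BUFFER still starts with a raw status line such as
;; "HTTP/1.1 500 Internal Server Error".
(defun example-feed-http-status (buffer)
  "Return the HTTP status code at the top of BUFFER as an integer.
Return nil if BUFFER does not start with an HTTP status line."
  (with-current-buffer buffer
    (save-excursion
      (goto-char (point-min))
      (when (looking-at "HTTP/[0-9.]+ \\([0-9]+\\)")
        (string-to-number (match-string 1))))))
```

A caller could then refuse to process the feed when the result is 400 or
higher, instead of treating the error body as a new item.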

My hypotheses so far: maybe it has something to do with the
user-agent, or with mangling of special characters in the URL.  The
delicious URL contains the '&' argument separator, and when the error
message comes back, it appears with something like '&amp;amp;' in it,
as if it had been escaped twice.  I haven't traced it further to
determine whether the fault lies with url.el or with delicious.com.
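
To illustrate the double-escaping hypothesis only (it says nothing about
where the escaping actually happens), here is a minimal sketch. The
function name and the query string are made up for the example:

```elisp
;; Hypothetical helper for illustration; neither url.el nor org-feed
;; is known to contain this code.
(defun example-escape-ampersands (s)
  "Return S with every `&' replaced by `&amp;'."
  (replace-regexp-in-string "&" "&amp;" s t t))

;; Escaping once gives the normal XML form:
(example-escape-ampersands "a=1&b=2")
;; => "a=1&amp;b=2"

;; Escaping the already-escaped string gives the doubled form:
(example-escape-ampersands (example-escape-ampersands "a=1&b=2"))
;; => "a=1&amp;amp;b=2"
```

Seeing '&amp;amp;' in the returned item is therefore consistent with the
URL being escaped once before the request and once more when the server
echoes it into the RSS.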

Best wishes

[-- Attachment #2: 0001-fix-org-feed-when-retrieve-method-is-curl-or-wget.patch --]
[-- Type: application/octet-stream, Size: 3150 bytes --]

From 0fe1cac25d1d189881a136028e3b2fca5ec6f377 Mon Sep 17 00:00:00 2001
From: Christopher League <league@contrapunctus.net>
Date: Tue, 30 Jun 2009 11:34:50 -0400
Subject: [PATCH] fix org-feed when retrieve-method is curl or wget

- test on line 312 failed because these methods returned a string instead
  of a buffer

- requesting 'wget actually executed "curl", with bad parameters

- curl needs --silent, so that progress messages don't interrupt content

- atom parser had code to skip HTTP headers, but these are present only
  when using url-retrieve-synchronously; caused errors with curl/wget.
  Instead, remove HTTP headers right after feed buffer is populated.
---
 lisp/org-feed.el |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index edf49d1..5e14e00 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -309,7 +309,7 @@ it can be a list structured like an entry in `org-feed-alist'."
 	  feed-buffer inbox-pos new-formatted
 	  entries old-status status new changed guid-alist e guid olds)
       (setq feed-buffer (org-feed-get-feed url))
-      (unless (and feed-buffer (bufferp feed-buffer))
+      (unless (and feed-buffer (bufferp (get-buffer feed-buffer)))
 	(error "Cannot get feed %s" name))
       (when retrieve-only
 	(throw 'exit feed-buffer))
@@ -549,18 +549,28 @@ If that property is already present, nothing changes."
 	       (org-split-string s "\n")
 	       (concat "\n" (make-string n ?\ )))))
 
+(defun org-feed-skip-http-headers (buffer)
+  "Remove HTTP headers from BUFFER, and return it.
+Assumes headers are indeed present!"
+  (with-current-buffer buffer
+    (widen)
+    (goto-char (point-min))
+    (search-forward "\n\n")
+    (delete-region (point-min) (point))
+    buffer))
+
 (defun org-feed-get-feed (url)
   "Get the RSS feed file at URL and return the buffer."
   (cond
    ((eq org-feed-retrieve-method 'url-retrieve-synchronously)
-    (url-retrieve-synchronously url))
+    (org-feed-skip-http-headers (url-retrieve-synchronously url)))
    ((eq org-feed-retrieve-method 'curl)
     (ignore-errors (kill-buffer org-feed-buffer))
-    (call-process "curl" nil org-feed-buffer nil url)
+    (call-process "curl" nil org-feed-buffer nil "--silent" url)
     org-feed-buffer)
    ((eq org-feed-retrieve-method 'wget)
     (ignore-errors (kill-buffer org-feed-buffer))
-    (call-process "curl" nil org-feed-buffer nil "-q" "-O" "-" url)
+    (call-process "wget" nil org-feed-buffer nil "-q" "-O" "-" url)
     org-feed-buffer)
    ((functionp org-feed-retrieve-method)
     (funcall org-feed-retrieve-method url))))
@@ -610,10 +620,6 @@ The `:item-full-text' property actually contains the sexp
 formatted as a string, not the original XML data."
   (with-current-buffer buffer
-    (goto-char (point-min))
-    ;; Skip HTTP headers
-    (search-forward "\n\n")
-    (delete-region (point-min) (point))
     (let ((feed (car (xml-parse-region (point-min) (point-max)))))
        (lambda (entry)


Thread overview: 2+ messages
2009-06-30 15:59 Christopher League [this message]
2009-06-30 17:40 ` [PATCH] fix org-feed when retrieve-method is curl or wget Carsten Dominik