emacs-orgmode@gnu.org archives
* [PATCH] Trouble updating some RSS feeds with org-feed
@ 2015-09-23 12:46 Hiroshi Saito
From: Hiroshi Saito @ 2015-09-23 12:46 UTC
  To: emacs-orgmode

Hi all,

I noticed that org-feed cannot properly handle RSS feeds whose items do not
contain a <guid> element. The value of the <guid> element is used as the key of
an association list that tracks entry statuses. When no <guid> element is
found, the key becomes `nil'. After the first update, no new entries are added,
because the key of every new entry (`nil') is already in the association list.
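
The collision can be seen in isolation. This is only a sketch: it assumes the
duplicate check behaves like a plain `assoc' lookup on the saved status, and
`my/feed-entry-known-p' is a hypothetical helper, not part of org-feed.
----------------------------------------------------------------------
;; Hypothetical helper, for illustration only.
(defun my/feed-entry-known-p (guid status)
  "Return non-nil if GUID is already a key in the alist STATUS."
  (and (assoc guid status) t))

;; Saved status after the first update of a <guid>-less feed.
(let ((status '((nil t "4e939ac25cb5b7c825c0894c364a220d5a98a7bf"))))
  ;; A new entry without <guid> also has the key nil, so it looks known.
  (my/feed-entry-known-p nil status)
  ;; A key taken from <link> (example URL) would not collide.
  (my/feed-entry-known-p "https://example.com/a" status))
----------------------------------------------------------------------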


Here is an example .emacs:
----------------------------------------------------------------------
(setq org-feed-alist
      '(("Hacker News"
         "https://news.ycombinator.com/rss"
         "~/feed.org" "Hacker News"
         )))
----------------------------------------------------------------------

After running `org-feed-update-all', the keys of the feed status in ~/feed.org
are all `nil', like this:
----------------------------------------------------------------------
:FEEDSTATUS:
((nil t "4e939ac25cb5b7c825c0894c364a220d5a98a7bf")
 (nil t "2eac7fd17ae277ba6ad6fd658da663bdf2a28586")
 (nil t "4939903fe5796ea1b5132209c5ab983e0558b5fd")
 ...
:END:
----------------------------------------------------------------------

After that, `org-feed-update-all' no longer adds new entries, for the reason
described above.


It is possible to work around this issue with the `:parse-feed' option, but I
think it would be reasonable for org-feed itself to handle RSS feeds without
<guid>.
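
For reference, such a workaround might look like the following. It is a sketch
only: `my/org-feed-parse-rss-link-fallback' is a made-up name, and the code
assumes that the function given to `:parse-feed' receives the feed buffer and
returns a list of property lists with `:guid' and `:item-full-text', as
`org-feed-parse-rss-feed' does.
----------------------------------------------------------------------
(require 'org-feed)

;; Hypothetical wrapper, not part of org-feed.
(defun my/org-feed-parse-rss-link-fallback (buffer)
  "Parse BUFFER like `org-feed-parse-rss-feed', with <link> as fallback guid."
  (mapcar
   (lambda (entry)
     (let ((item (plist-get entry :item-full-text)))
       ;; Only touch entries whose :guid is nil and that contain a <link>.
       (if (and (null (plist-get entry :guid))
                item
                (string-match "<link\\>.*?>\\(.*?\\)</link>" item))
           (plist-put entry :guid (match-string-no-properties 1 item))
         entry)))
   (org-feed-parse-rss-feed buffer)))

(setq org-feed-alist
      '(("Hacker News"
         "https://news.ycombinator.com/rss"
         "~/feed.org" "Hacker News"
         :parse-feed my/org-feed-parse-rss-link-fallback)))
----------------------------------------------------------------------
Entries that already carry a <guid> pass through unchanged, so the wrapper
should only affect feeds that omit it.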

So I wrote a small patch that uses the value of <link> as the key if <guid> is
missing. It is simple and reasonable, since <guid> and <link> are largely
consistent with each other, although <link> is also optional. Another option
would be to use a hash of <title> or <description>, but I feel that is
excessive.
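
For comparison, the hash-based alternative could be sketched as below.
`my/feed-item-title-key' is a hypothetical helper, and hashing <title> with
SHA-1 is an assumption made for illustration; the attached patch does not do
this.
----------------------------------------------------------------------
;; Hypothetical helper, for illustration only.
(defun my/feed-item-title-key (item)
  "Return a SHA-1 hex digest of the <title> text in ITEM, or nil if absent."
  (when (string-match "<title\\>.*?>\\(.*?\\)</title>" item)
    (secure-hash 'sha1 (match-string-no-properties 1 item))))
----------------------------------------------------------------------
Two items with the same title would get the same key, which is one more reason
to prefer <link> when it is available.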

--
Sincerely,
Hiroshi Saito

[-- Attachment #2: 0001-org-feed.el-Use-a-value-of-link-as-guid-if-guid-is-m.patch --]
[-- Type: application/octet-stream, Size: 1361 bytes --]

From 8ffae59ce301ba77e470bf3ff415b97aef6e4e0a Mon Sep 17 00:00:00 2001
From: Hiroshi Saito <saidie@saidie.info>
Date: Wed, 23 Sep 2015 16:58:09 +0900
Subject: [PATCH] org-feed.el: Use a value of <link> as guid if <guid> is
 missing

* lisp/org-feed.el (org-feed-parse-rss-feed): Set a value of <link>
element to `:guid' property of an entry if <guid> element is missing.

If a RSS feed does not provide <guid> to entries, `:guid' property of an
entry is always `nil'. In such a case, new feed entries are no longer
added because the property is used to detect duplication.

TINYCHANGE
---
 lisp/org-feed.el | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lisp/org-feed.el b/lisp/org-feed.el
index e511be0..7be803e 100644
--- a/lisp/org-feed.el
+++ b/lisp/org-feed.el
@@ -615,8 +615,10 @@ containing the properties `:guid' and `:item-full-text'."
 		       (match-beginning 0)))
 	(setq item (buffer-substring beg end)
 	      guid (if (string-match "<guid\\>.*?>\\(.*?\\)</guid>" item)
+		       (org-match-string-no-properties 1 item))
+	      link (if (string-match "<link\\>.*?>\\(.*?\\)</link>" item)
 		       (org-match-string-no-properties 1 item)))
-	(setq entry (list :guid guid :item-full-text item))
+	(setq entry (list :guid (or guid link) :item-full-text item))
 	(push entry entries)
 	(widen)
 	(goto-char end))
-- 
2.5.3

