zengun


2008 05 05

hey feed reader, I read that in my browser

I got tempted by desktop feed readers tonight, and since I’m using a Mac both at home and at work, figured I would give NetNewsWire a try (at last).
This led me to pinpoint one thing that I hate about feed readers: they don’t always find out when you have actually read an item.

Let’s demonstrate that with two use cases:

  • While testing NetNewsWire (by reading built-in feeds), I double-clicked on an article’s title and got taken to the website itself in an embedded browser view. So far so good.
    In that browser view, I clicked through to the home page and read some more articles on the website. Then I went back to the feed reader view, where the only article marked as read was the one I had double-clicked.
    I had to mark the other articles I had seen as read by hand.

  • Earlier, using Bloglines Beta (which sucks ten times less than regular Bloglines), I was reading a feed and clicked on a link, which brought me to an article on a blog whose feed I’m already following. When I came back to Bloglines, the blog’s article that I had just clicked through was still marked as unread since I hadn’t read it from Bloglines itself.

This seems wrong to me. In an age where every other tidbit of information has an RSS item version, getting a correct overview of what you have read and what’s left to read in your daily feed reading session still requires consuming it all in the same feed reader, using its interface and none other.

Clearly there should be a way to let the feed reader know that we just read some item in our feeds list.

What can be done about it? Cooperation between our browser and our feed reader (using a browser plugin specific to the feed reader).
First, let’s see what we know (and what we need) when we visit some blog:

  • the URL itself if we visit an article
  • the URLs of all permalinks on the page, marked with rel="bookmark"
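
As a rough illustration of what a browser plugin would have to collect, here is a minimal Python sketch using only the standard library. The names BookmarkCollector and candidate_urls are made up for this example, and a real plugin would read the browser's DOM rather than re-parse the HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BookmarkCollector(HTMLParser):
    """Collects the href of every <a> whose rel attribute contains "bookmark"."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.permalinks = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel is a space-separated list of tokens, e.g. rel="bookmark nofollow"
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "bookmark" in rel_tokens and href:
            # Resolve relative permalinks against the page's own URL
            self.permalinks.append(urljoin(self.page_url, href))


def candidate_urls(page_url, html):
    """Return the visited URL followed by every permalink found on the page."""
    collector = BookmarkCollector(page_url)
    collector.feed(html)
    return [page_url] + collector.permalinks
```

Links without rel="bookmark" are ignored, so navigation and blogroll links never make it into the list.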

That’s all we need, really.
Those URLs can be fed (no pun intended) to the feed reader by our browser of choice (with some REST service for web-based feed readers), or directly used by our feed reader if we’re browsing in its embedded browser.
To avoid sending every other URL, the browser needs a way to know which feeds you are subscribed to (online readers provide an OPML resource, desktop readers may use other means), so that it only sends URLs whose domain name matches the domain name of a subscribed feed’s <link> value.
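
The filtering step could look something like the following Python sketch. It assumes the OPML outlines carry an htmlUrl attribute (the usual place for a feed's <link> value in subscription lists, though it is optional); subscribed_hosts and urls_worth_sending are hypothetical names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def subscribed_hosts(opml_text):
    """Return the set of host names found in the htmlUrl of each OPML outline."""
    root = ET.fromstring(opml_text)
    hosts = set()
    for outline in root.iter("outline"):
        html_url = outline.get("htmlUrl")
        if html_url:
            host = urlsplit(html_url).hostname  # lowercased, without the port
            if host:
                hosts.add(host)
    return hosts


def urls_worth_sending(urls, hosts):
    """Keep only the URLs whose host belongs to a subscribed feed."""
    return [url for url in urls if urlsplit(url).hostname in hosts]
```

Anything that does not match a subscribed host stays in the browser, which is also what keeps the privacy exposure small.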

The feed reader can then check if it knows of any item whose URL matches (for the sake of simplicity, this can be limited to exact matches), and mark it as read.
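
On the feed reader's side, exact matching boils down to a dictionary lookup. A minimal Python sketch, assuming items are kept in a dict keyed by their link and carry a "read" flag (both the storage layout and the name mark_read are assumptions made for illustration):

```python
def mark_read(items_by_url, visited_urls):
    """Mark items whose link exactly matches a visited URL; return the URLs newly marked."""
    newly_marked = []
    for url in visited_urls:
        item = items_by_url.get(url)
        if item is not None and not item["read"]:
            item["read"] = True
            newly_marked.append(url)
    return newly_marked
```

URLs that match no known item are simply ignored, and items already marked as read are left alone.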

Side notes related to web-based feed readers:

  • In order to avoid hitting the servers too hard, requests could go through a proxy that either silently drops duplicates from the same authenticated user in a given timeframe, or tells the browser to stop bugging it with an appropriate HTTP return code.
  • Privacy matters may arise, if the browser starts sending every other URL because of a faulty OPML file.
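
To make the proxy idea a bit more concrete, here is a small Python sketch of the duplicate check. The class name DuplicateFilter, the five-minute window and the status codes (202 for accepted, 429 for "stop bugging me") are all assumptions; any other suitable return code would do.

```python
import time


class DuplicateFilter:
    """Remembers (user, url) pairs and rejects repeats inside a time window."""

    def __init__(self, window_seconds=300, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock  # injectable so the filter can be tested without waiting
        self.last_seen = {}

    def status_for(self, user, url):
        """Return the HTTP status the proxy would answer for this notification."""
        now = self.clock()
        key = (user, url)
        previous = self.last_seen.get(key)
        if previous is not None and now - previous < self.window:
            return 429  # duplicate within the window: tell the browser to back off
        self.last_seen[key] = now
        return 202  # accepted and forwarded to the feed reader
```

A proxy that prefers to drop duplicates silently would return the same success code in both branches and just skip the forwarding.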

While writing this, I’m realising that this solution breaks with item link redirections like FeedBurner’s…

I still think it might be an idea worth exploring, if that point of failure could be overcome. (Perhaps a HEAD request that resolves each new item’s final URL when the feed is refreshed?)
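
For what it's worth, the HEAD request idea could be sketched in Python as below. final_url and index_by_final_url are hypothetical names, and the sketch relies on urllib following redirects on its own (depending on the Python version, the redirected hops may be issued as GET rather than HEAD).

```python
import urllib.request


def final_url(item_link, opener=urllib.request.urlopen, timeout=10):
    """Send a HEAD request and return the URL reached after any redirects."""
    request = urllib.request.Request(item_link, method="HEAD")
    with opener(request, timeout=timeout) as response:
        return response.url


def index_by_final_url(items, resolve=final_url):
    """Build a lookup of feed items keyed by the URL a browser would end up on."""
    index = {}
    for item in items:
        try:
            key = resolve(item["link"])
        except OSError:
            # Network errors (URLError is an OSError): fall back to the raw link
            key = item["link"]
        index[key] = item
    return index
```

Doing this once per new item at refresh time keeps the later matching an exact comparison, at the price of one extra request per item.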

What do you think?