Archive for June, 2008

How to detect the RSS feed for a blog

Ever wondered how to automatically figure out the RSS feed for a blog?

Generally speaking, it’s a simple task: download the HTML for the given blog and use a fancy regular expression to find the associated RSS feed. In PHP, it looks something like this:

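$bloghtml = file_get_contents($blogurl);
// Capture the href of the first <link> tag whose type is application/rss+xml.
// Note: this pattern expects the type attribute to appear before href.
preg_match('/<link[^>]*type\s*=\s*["\']?application\/rss\+xml["\']?[^>]*href\s*=\s*["\']?([^\'" >]+)/i', $bloghtml, $match);
$rssurl = isset($match[1]) ? $match[1] : null;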
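
A regular expression like the one above depends on attribute order and quoting. As an alternative, here is a minimal sketch that uses PHP’s DOM extension to look at each link tag instead. The function name find_rss_link is made up for illustration, and this is a swapped-in technique, not the approach the original snippet uses.

```php
<?php
// Hypothetical helper: returns the href of the first RSS <link> in $html, or null.
function find_rss_link(string $html): ?string
{
    if (trim($html) === '') {
        return null; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('link') as $link) {
        $type = strtolower(trim($link->getAttribute('type')));
        $href = trim($link->getAttribute('href'));
        // Attribute order and quote style no longer matter here.
        if ($type === 'application/rss+xml' && $href !== '') {
            return $href;
        }
    }
    return null;
}
```

The returned href may be relative (for example /feed.xml), so it may still need to be resolved against the blog’s URL before use.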

The main problem with this approach is that some blogs take a long time to load, and your application ends up waiting on them. On top of that, it’s frustrating to have to download and process an entire page of HTML just to extract one URL.
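
If you do stay with the download-and-parse approach, one way to soften both problems is to set a read timeout and cap how much of the page is fetched. The sketch below is only an illustration: the helper name fetch_head_of_page and the default limits (64 KB, 5 seconds) are assumptions, and it relies on the feed link appearing near the top of the page, which is typical but not guaranteed.

```php
<?php
// Hypothetical helper: fetch at most $maxBytes of a page, or return null on failure.
function fetch_head_of_page(string $url, int $maxBytes = 65536, float $timeout = 5.0): ?string
{
    // The 'timeout' option is the HTTP wrapper's read timeout, in seconds.
    $context = stream_context_create([
        'http' => ['timeout' => $timeout],
    ]);
    // Offset 0 and length $maxBytes: stop reading once the cap is reached.
    $html = @file_get_contents($url, false, $context, 0, $maxBytes);
    return $html === false ? null : $html;
}
```

The truncated HTML can then be searched for the feed link just like the full page.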

Recently Google came out with a better solution in the form of their AJAX Feed API. Using their API, detecting feeds is now easier, faster and more reliable:

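// Note: the lookup endpoint URL is missing from this snippet; it belongs inside the quotes.
$lookup_url = "".urlencode($blogurl);
// curl() is not a PHP built-in; it stands for a helper that fetches the URL and returns the response body.
$result = curl($lookup_url);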
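
The snippet leaves $result as the raw response. Here is a rough sketch of pulling the feed URL out of it, assuming the service answers with JSON shaped like {"responseData": {"url": "..."}, "responseStatus": 200}. That shape is an assumption on my part rather than something shown above, so check it against the API documentation. The function name extract_feed_url is also made up.

```php
<?php
// Hypothetical helper: returns the feed URL from a lookup response, or null.
// Assumes a JSON body with "responseStatus" and "responseData.url" fields.
function extract_feed_url(string $json): ?string
{
    $data = json_decode($json, true);
    // Reject anything that is not valid JSON or not a success response.
    if (!is_array($data) || ($data['responseStatus'] ?? null) !== 200) {
        return null;
    }
    $url = $data['responseData']['url'] ?? null;
    return (is_string($url) && $url !== '') ? $url : null;
}
```

With that in place, $rssurl = extract_feed_url($result); would give either the feed URL or null when no feed was found.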

I’ve been using this API for about a month now and have really appreciated the improvements. If you need to detect feeds, give it a try. I think you’ll like it.