How to detect the RSS feed for a blog

Ever wondered how to automatically figure out the RSS feed for a blog?

Generally speaking, it’s a simple task — just download the HTML for the given blog and use a fancy regular expression to find the associated RSS feed. In PHP, it looks something like this:

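$bloghtml = file_get_contents($blogurl);
// Capture the href of a <link> tag whose type attribute is application/rss+xml.
preg_match('/<link.*type\s*=\s*["\']*application\/rss\+xml["\']*.*href\s*=\s*["\']?([^\'" >]+)[\'" >]/i', $bloghtml, $match);
// $match[1] is only set when the pattern matched.
$rssurl = isset($match[1]) ? $match[1] : null;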

The main problem with this approach is that some blogs take a long time to load — and that often translates to your application being slow as well. On top of that, it’s frustrating to have to download and process an entire page of HTML just to extract one URL.
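
One way to soften both problems is to cap how long you wait and how much you read, since feed hints normally sit in the page head. The sketch below is only an illustration: the helper name fetch_head, the 5 second timeout and the 64 KB limit are assumptions, not values from this post.

```php
<?php
// Hypothetical helper: read at most $maxBytes of a page and give up
// after $seconds. Returns null when the fetch fails.
function fetch_head($url, $seconds = 5, $maxBytes = 65536)
{
    $context = stream_context_create(array(
        'http' => array(
            'timeout' => $seconds, // read timeout in seconds (assumed value)
        ),
    ));
    // Offset 0 and a length cap: stop reading once $maxBytes have arrived.
    $html = @file_get_contents($url, false, $context, 0, $maxBytes);
    return $html === false ? null : $html;
}
```

You would then run the regular expression on the result of fetch_head($blogurl) instead of on the full page, and treat null as "no feed found".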

Recently Google came out with a better solution in the form of their AJAX Feed API. Using their API, detecting feeds is now easier, faster and more reliable:

$lookup_url = "http://ajax.googleapis.com/ajax/services/feed/lookup?v=1.0&q=".urlencode($blogurl);
// Fetch the lookup result; the service answers with a JSON document.
$result = file_get_contents($lookup_url);
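
To get from the raw response to a feed URL you still need to decode the JSON. Here is a minimal sketch, assuming the body is shaped like {"responseData": {"url": "..."}}. Both that shape and the helper name extract_feed_url are assumptions on my part, so check them against the real response before relying on this.

```php
<?php
// Hypothetical helper: pull the feed URL out of a lookup response body.
// Assumes JSON shaped like {"responseData": {"url": "..."}}.
function extract_feed_url($json)
{
    $data = json_decode($json, true);
    // Covers invalid JSON as well as a null or missing responseData.
    if (!is_array($data) || !isset($data['responseData']['url'])) {
        return null;
    }
    return $data['responseData']['url'];
}
```

Calling extract_feed_url($result) returns the URL string, or null when the service found no feed or the body could not be decoded.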

I’ve been using this API for about a month now and have really appreciated the improvements. If you need to detect feeds, give it a try. I think you’ll like it.

  • Just noticed that my regexp is order-sensitive (it will break if you put the href attribute in front of the type attribute). Make sure you fix that if you decide to copy and paste.

  • Just what I needed!

    Many thanks!

  • I can't get it to work… any help?

  • Frank

    The Google API is the fastest option; however, it sometimes fails to detect the feed URL, which makes it quite unreliable. In my tests, it failed to get the feed URL for one WordPress blog (it got many others, though)!

  • Bodhisattva Builder

    The PHP way is better, as Google only reads the RSS meta hints. With PHP you can also conditionally query known feed URLs that the website may not provide meta hints for.

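    // Fetch the page once; @ suppresses the warning if the request fails.
    $html = @file_get_contents($url);
    if($html){
        // The regex pattern is missing from the source; group 1 should capture the feed URL.
        preg_match_all('//', $html, $matches);
        if(isset($matches[1][0])){
            $this->feed->url = $matches[1][0];
        }elseif(@file_get_contents($this->feed->url.'/feed')){
            // Fall back to a well-known feed path when the page has no meta hint.
            $this->feed->url = $this->feed->url.'/feed';
        }
    }

    $this->feed->save();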
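
The order-sensitivity mentioned in the comments goes away if you parse the HTML instead of matching it with a regular expression. Below is a rough sketch using PHP's DOM extension. The helper name find_feed_links is hypothetical, and this is one possible approach rather than the method used in the post.

```php
<?php
// Hypothetical helper: return the href of every <link> tag whose type is
// application/rss+xml, regardless of the order of its attributes.
function find_feed_links($html)
{
    $feeds = array();
    if ($html === '') {
        return $feeds; // loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; @ silences the parser warnings.
    if (!@$doc->loadHTML($html)) {
        return $feeds;
    }
    foreach ($doc->getElementsByTagName('link') as $link) {
        $type = strtolower(trim($link->getAttribute('type')));
        $href = trim($link->getAttribute('href'));
        if ($type === 'application/rss+xml' && $href !== '') {
            $feeds[] = $href;
        }
    }
    return $feeds;
}
```

If the returned array is empty, you can still fall back to probing a known path such as /feed, as suggested above.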