The protocols powering the real-time web

Posted on May 25, 2009


In the past few weeks there has been a lot of discussion around the rise of the real-time web, including posts from TechCrunch, GigaOm, ReadWriteWeb and Scoble. A lot of the talk has been around Twitter, Facebook, FriendFeed, OneRiot and of course Google. You don’t have to be a genius to figure out that real-time is the future of the web. I believe there is a huge need for the tech community to develop new protocols that will power this fundamental shift in how web apps work.

The problem is that our existing protocols are request-driven instead of event-driven. The web we know and love wasn’t built with real-time in mind.

Tim O’Reilly sent a tweet from OSCON08 that really captures the essence of the polling problem:

On monday friendfeed polled flickr nearly 3 million times for 45000 users, only 6K of whom were logged in. Architectural mismatch. #oscon08
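
To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It takes the tweet's rounded figures at face value and assumes (my assumption, not something the tweet states) that the polls covered a single 24-hour day and were spread evenly across users.

```python
def polling_stats(polls: int, users: int, logged_in: int):
    """Rough per-user polling figures from the numbers in the tweet.

    Assumes one 24-hour day with polls spread evenly across all users;
    both are simplifications made only for illustration.
    """
    polls_per_user = polls / users                    # polls per user per day
    minutes_between_polls = 24 * 60 / polls_per_user  # average polling interval
    logged_in_share = logged_in / users               # fraction of users online
    return polls_per_user, minutes_between_polls, logged_in_share


per_user, interval, share = polling_stats(3_000_000, 45_000, 6_000)
print(f"{per_user:.1f} polls per user per day")    # 66.7
print(f"one poll every {interval:.1f} minutes")    # 21.6
print(f"{share:.1%} of users logged in")           # 13.3%
```

In other words, every account was checked roughly every 20 minutes around the clock, even though only about one user in eight was logged in at all.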

At EventVue we have a dedicated server that does little more than poll for new blog posts from attendees. We have a few tricks to reduce the pain, but we’re still polling thousands of blogs every hour even though 99% of them haven’t added any fresh content since the last time we checked. With blog posts, people are used to having a small delay before they show up in Google Reader or other services. We’re not so forgiving when it takes 30 minutes for a tweet to show up in a client application, even though getting real-time data from Twitter using polling is virtually impossible.

So what is the solution?

Some people have said that XMPP holds the answer, but how many developers do you know who have set up an XMPP server before? Right. Me neither. XMPP may be a viable transport method, but I think we’d be better off using something that is simpler and more familiar to developers.

Another prominent response to the polling problem is the Simple Update Protocol (SUP), which was proposed by Paul Buchheit from FriendFeed. SUP is certainly an improvement over our current protocols, but what frustrates me is that it only reduces polling instead of eliminating it altogether. It may make sense for FriendFeed, but it’s not something I would add to my blog.

My favorite approach is PubSubHubbub, which was proposed by Brad Fitzpatrick and Brett Slatkin from Google. PubSubHubbub might have a horrible name, but the protocol is exactly what we need to fix our polling problems. It’s lightweight, simple to understand and built on top of basic HTTP.

PubSubHubbub is a simple extension to Atom that uses webhook callbacks to deliver practically instant notifications between servers when a feed is updated. The protocol is decentralized and free. Anyone can run a hub. Anyone can be a publisher or a subscriber. I like that it eliminates polling altogether and is incredibly simple to implement. I took a stab at writing a PHP client library and was able to take it from protocol spec to code in less than two hours.
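
To give a feel for how little is involved, here is a minimal sketch of the three messages in Python. It is not my PHP library, just an illustration. The `hub.*` parameter names are as I understand them from the spec, so check them against the current draft; the function names and the example.com URLs are made up for this example.

```python
from urllib.parse import urlencode, parse_qs


def build_publish_ping(topic_url: str) -> bytes:
    """Form-encoded body a publisher POSTs to its hub when the feed changes."""
    return urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")


def build_subscribe_request(topic_url: str, callback_url: str) -> bytes:
    """Form-encoded body a subscriber POSTs to the hub to register a callback."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.callback": callback_url,  # the webhook the hub will call later
    }).encode("ascii")


def answer_verification(query_string: str, expected_topic: str):
    """Decide how the callback answers the hub's verification GET.

    Returns (status, body). Echoing hub.challenge confirms that this
    subscriber really asked for the topic; anything else is refused.
    """
    params = parse_qs(query_string)
    topic = params.get("hub.topic", [""])[0]
    challenge = params.get("hub.challenge", [""])[0]
    if topic == expected_topic and challenge:
        return 200, challenge
    return 404, ""


# Hypothetical feed and callback URLs, for illustration only.
feed = "http://example.com/feed"
print(build_publish_ping(feed))
print(build_subscribe_request(feed, "http://subscriber.example.com/callback"))
```

Either body can be sent with an ordinary HTTP POST, for example by passing it as `data` to `urllib.request.urlopen`. After that, the hub pushes new entries to the callback URL and nobody has to poll.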

If you’re interested, you can check out my PubSubHubbub PHP library and download and install the PubSubHubbub WordPress plugin I wrote as well.

It’s worth mentioning the role that Gnip plays in all of this. Gnip has been leading the charge against the evils of polling. I’ve been a big fan of their service and have written before about how they helped EventVue. But at the end of the day, the winning technology shouldn’t be in the hands of one company. It should be open and distributed. Open protocols don’t eliminate the need for Gnip. Trusted hubs like Gnip will play an important role in handling the flow of data between publishers and subscribers. Companies will pay good money to off-load that work, and Gnip is already at the center of that opportunity. I’d love to see Gnip embrace the open protocols that are being developed and lead the drive for adoption of PubSubHubbub in particular.

I’m excited about PubSubHubbub for a few reasons. First, it opens the door for a whole new range of real-time applications that simply aren’t possible today. It’s also a chance for me to contribute to solving a really big problem and an opportunity for me to get in on the ground floor of something I believe is going to be huge. I wasn’t able to contribute to the design of HTTP or sit in on the conversations that led to the development of RSS. But one day I’m going to be able to brag that Online Aspect was the very first blog on the web to support PubSubHubbub. And for a geek like me, that’s pretty cool.