Coding for the unexpected

You could write a piece of code, run it a million times, wait ten years, run it again and get exactly the same results.  At least that’s what I used to believe.

One of the things I love about computers is that they are boringly consistent.  Given the same input, a computer will return the same output EVERY SINGLE TIME.  Likewise, code doesn’t change.  I could save a file on my computer and if it weren’t for hardware failure it would remain the same byte-for-byte until the end of time.  Code doesn’t change. It’s just a bunch of mathematical statements bound together by rules of logic that are burned into a tiny computer chip.  Or in geek terminology, code is immutable.

In theory this sounds great.  The problem is it doesn’t mesh with the everyday reality of my life.  My code breaks all the time without me changing a thing.

I remember getting an email from someone complaining that they couldn’t log in to an application I had written.  The strange thing about this was that I hadn’t touched that code in years.  The servers were being managed and had been pretty reliable.  How could my application break if no one had broken it?  I SSH’d into the server and quickly realized my server logs had gotten so large that there wasn’t any room left on the hard drive for new session files to be created.  I deleted the log files and changed my server settings to stop it from happening again.
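The exact server settings aren't recorded here, but the general idea of capping log growth can be sketched at the application level.  The example below is a minimal Python sketch, not the fix described above: it uses the standard library's RotatingFileHandler, and the function name, logger name, and size limits are all hypothetical choices for illustration.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(path, name="app", max_bytes=1_000_000, backups=3):
    """Return a logger whose files can never grow without bound.

    All names and limits here are illustrative assumptions. Call this once
    per logger name; calling it again would attach a second handler.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # When the current file is about to reach max_bytes it is renamed to
    # path.1 (older files shift to .2, .3, ...) and the oldest is deleted.
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger("app.log")
    log.info("user logged in")
```

With these settings the logs occupy at most roughly (backups + 1) * max_bytes on disk, so a forgotten log file can't quietly crowd out something like session storage.  On a real server the same effect is often achieved outside the application, for example with a tool like logrotate.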

Since then I’ve had countless experiences where code broke unexpectedly.  The culprits vary.  Sometimes it’s hardware failure.  Sometimes an unchecked log file.  More often, it’s the result of user input that I didn’t anticipate or an integration with an external service that fails.

You would think that we would be getting better at anticipating and preventing these sorts of issues.  But from what I can tell, they are happening MORE OFTEN these days, not less.  On one hand, we’re getting smarter.  We’ve learned from our mistakes about truncating server logs and baking in automatic fail-over for hardware issues.  But there’s a bigger trend happening on the web right now that is throwing some huge variables into the equation.  Very few applications stand alone anymore.  Every application now has a million integrations with Twitter, Facebook, Flickr, YouTube… you name it!  And guess what?  Every one of those services adds another link to the chain, giving us more uncertainty and more points of failure to try and anticipate.
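To make one of those points of failure concrete, here is a minimal Python sketch of treating an external service as something that is allowed to fail.  It is only one possible approach, under stated assumptions: the function names, the feed URL, and the fallback value are hypothetical, and it uses only the standard library.

```python
import http.client
import json
import logging
import urllib.request

log = logging.getLogger(__name__)


def fetch_json(url, timeout=3.0):
    """Fetch and decode JSON, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def load_feed(url, fetch=fetch_json, fallback=None):
    """Return the remote data, or `fallback` if the service misbehaves.

    `fetch` is injectable so the failure path can be exercised without
    a network connection.
    """
    try:
        return fetch(url)
    except (OSError, ValueError, http.client.HTTPException) as exc:
        # OSError covers URLError, HTTPError, timeouts and refused
        # connections; ValueError covers malformed JSON or a bad URL.
        log.warning("external service failed (%s): %r", url, exc)
        return fallback


if __name__ == "__main__":
    # Hypothetical URL; the page still renders if the service is down.
    items = load_feed("https://example.invalid/feed.json", fallback=[])
    print(items)
```

The point of the design is that the timeout and the fallback are decided up front, so an outage at the other end degrades one feature instead of taking the whole page down with it.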

The integrated web is here to stay.  As developers, we need to figure out how we’re going to deal with this new layer of uncertainty in our applications.