As reported by TechCrunch, Google's Custom Search service was down for over 12 hours last week. Notice it took Google over 12 hours to even respond to the complaints! Then it took about 2 more hours to resolve the issue.
Let's skip over the lack of transparency from Google during the event, except to say that it's pretty sad that it took so long to at least admit to the problem. In their defense, they claim it affected a small number of clients. And Google is generally open about their problems, so we'll give them a pass on this one.
Why does it matter to you?
Unlike downtime at GMail and Google Reader, a SaaS service like Custom Search being down is a big deal. Why? Because if you were using Custom Search, to your visitors it looks like YOU are down. Imagine being a customer of Smug Mug, visiting their help page and ending up with a really slow or broken search. Would you blame Google or Smug Mug? Sure, many customers would probably blame themselves, but just the possibility that your perceived uptime and user experience are dependent on a third party (that you have no control over or insight into) should give you pause. How are you supposed to even know that these services are down? And what if it was something more critical to your business, like your ad network or your payment processing system?
Is SaaS doomed?
In spite of these dangers, the benefits of using SaaS solutions are very strong. Why bother building and hosting something outside your core competency when a service out there does it for you? You can read about the benefits of SaaS here, here, here, here, and here. I doubt I have to convince you of that. So the question is how you can continue to reap the benefits of SaaS while minimizing your exposure to problems you can't control. Is there a solution?
The key to a successful SaaS implementation is having real-time access to the uptime and performance of the SaaS solutions your business relies on. If you knew right away that Google's Custom Search solution was down, at the very least you could react and put up a friendly message for your visitors ("Don't blame us, it's Google's fault!"). Even better, you'd have a fail-over plan in place to switch to another solution. Same thing if this was an ad network or a payment system that went down. You would have some control over your users' experience, and would no longer have to pray that all of your solution providers are up 100% of the time (good luck!). Without this knowledge, you're either assuming these services never go down, or you don't realize that your visitors have no idea that the issues aren't your fault.
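To make the idea concrete, here is a minimal sketch in Python of what "know right away, then fail over" could look like. Everything in it is an assumption for illustration: the provider names, the health-check URLs, and the latency threshold are hypothetical, and a real setup would run the check on a schedule and alert someone rather than probing on every page view.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable, NamedTuple, Optional, Sequence, Tuple


class ProbeResult(NamedTuple):
    up: bool           # did the service answer with a 2xx/3xx status?
    latency_s: float   # how long the check took, in seconds


def http_probe(url: str, timeout_s: float = 3.0) -> ProbeResult:
    """Fetch `url` once and report whether it answered, and how quickly."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            up = 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Timeouts, DNS failures, HTTP errors and bad URLs all count as "down".
        up = False
    return ProbeResult(up, time.monotonic() - start)


def pick_provider(
    providers: Sequence[Tuple[str, str]],
    probe: Callable[[str], ProbeResult] = http_probe,
    max_latency_s: float = 2.0,
) -> Optional[str]:
    """Return the first provider that is up and fast enough, or None.

    `providers` is an ordered list of (name, health_check_url) pairs,
    with the preferred provider first. The names and URLs are hypothetical.
    """
    for name, url in providers:
        result = probe(url)
        if result.up and result.latency_s <= max_latency_s:
            return name
    return None
```

If `pick_provider` returns `None`, that is your cue to show the friendly "search is temporarily unavailable" message instead of a broken widget. Passing the probe in as a parameter is a deliberate choice in this sketch: it lets you swap in data shared by the provider, or a fake during testing, without touching the fail-over logic.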
The company I work for recently launched a solution that deals with this very need. It's all about working together with your SaaS providers, sharing performance and uptime data, and being able to see the same data your providers are seeing. As with most problems, it often boils down to opening up the communication lines.
As more businesses come to rely on SaaS solutions, the more exposure these businesses will have to this kind of "perceived" downtime. The naive solution is to expect 100% uptime. The real solution is to know when that downtime does occur, and to have a plan of action.
Sunday, September 21, 2008
A post by Steve Rubel of Micro Persuasion titled Radical Transparency: Three Lessons Apple Can Learn from Google:
"Google isn't exactly known as the most transparent company in the world, but they're light years ahead of Apple - a company that in some ways they share a kinship with when it comes to their reputation for innovation. Apple (or for that matter any big company) can learn a lot about radical transparency, customer service and PR from Google, even though they're hardly perfect here."He goes on to review the various places that Google and Apple make public their bugs and known issues. What's missing here obviously is any mention of transparency in uptime and performance. But to fill in the gaps, as we've seen previously, Google does a much better job here as well.
Posted by Lenny Rachitsky at 5:08 PM