Sunday, November 23, 2008

Transparency case study, courtesy of ylastic

ylastic (a company that provides tools to help manage AWS services) kept their users in the loop during an outage by communicating status updates over Twitter.


You can find the entire set of updates on ylastic's Twitter page.

I keep coming back to the same question: do your users know where to go during a downtime event? ylastic has their web site, their blog, their forums, and their Twitter feed. As a user, how do I know where to look when I'm having a problem and want to know what's going on with the service (which is generally an emergency)? As the company, how do I keep users from clogging my support inbox in spite of my efforts to get status updates out to the world? In this case it looks like the only place that had any information was the Twitter feed. If users weren't aware it existed, both sides would be out of luck.

What every SaaS provider needs is a clear, central place that users can easily find and that provides real-time updates on downtime or performance events. It's great that you're willing to communicate during the event, but if no one can find those updates, what's the point? Don't get me started on falling trees.
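To make the "central place" idea a little more concrete, here is a minimal sketch in Python. It is not ylastic's implementation or anyone's real API; the names StatusUpdate, StatusPage, subscribe, and publish are all hypothetical, and the channels are plain callables standing in for whatever would actually post to Twitter, a blog, or email. The point is only that the status page holds the one canonical log, and every other channel receives a copy of what was published there.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Optional


@dataclass
class StatusUpdate:
    """One entry in the status log (hypothetical shape)."""
    message: str
    state: str  # e.g. "investigating", "identified", "resolved"
    posted_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class StatusPage:
    """Single source of truth: updates are stored here first,
    then copied to every subscribed channel."""

    def __init__(self) -> None:
        self.updates: List[StatusUpdate] = []
        self.channels: List[Callable[[StatusUpdate], None]] = []

    def subscribe(self, channel: Callable[[StatusUpdate], None]) -> None:
        # A channel is any callable that accepts a StatusUpdate.
        self.channels.append(channel)

    def publish(self, message: str, state: str) -> StatusUpdate:
        update = StatusUpdate(message, state)
        self.updates.append(update)   # the canonical record
        for send in self.channels:    # fan out to the other channels
            send(update)
        return update

    def current(self) -> Optional[StatusUpdate]:
        # The latest update, or None if nothing has been posted yet.
        return self.updates[-1] if self.updates else None


if __name__ == "__main__":
    page = StatusPage()
    # Stand-in channel: a real one would call a messaging or blog service.
    page.subscribe(lambda u: print(f"[{u.state}] {u.message}"))
    page.publish("We are looking into a service outage.", "investigating")
```

With this shape, the team writes each update once, and users who know only the status page still see everything that went out elsewhere.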

On another note, kudos to ylastic for their transparency on the following fronts:
  • Providing insight into their product roadmap. This is very much what SaaS providers must do to build trust with their users (which is critical to the success of any online hosted application).
  • Building an upcoming iPhone app that, among other things, gives you the AWS Service Health status on the go.
  • Simply giving status updates on Twitter.

6 comments:

  1. Thank you for mentioning us on your blog. We are a small start-up from Atlanta that provides a single interface (web or mobile) for managing your AWS environment. We keep everyone in the loop by using twitter, email and our blog for communicating updates and other items of interest to our users. When we had the recent outage we used twitter extensively and emails to the customers to let them know that there was a problem and that we were working on it. Just saying there is a problem and not following up leaves a bad taste. So we decided to let everyone know exactly what was happening. We did not test our EBS snapshots as religiously as we should have. It would be a cop-out to blame AWS, as AWS doesn't test our snapshots, we do. Lesson learned :-( It's amazing how supportive users/customers are when you keep them in the loop...

  2. Thanks for the insight, especially that you guys were emailing your customers during the event in addition to the Twitter updates. I would still recommend an easy-to-find status page that carries the type of information you provided over email and Twitter. Use the broadcast power of the Internet to its potential. That way you don't have to spend time emailing, twittering, and updating your blog, and can spend more time fixing the problem. Ideally the status page would update all the communication platforms automatically for you.

    Obviously this isn't a priority for you guys right now in this growth stage, but definitely something to think about if downtime happens again. Like you said, just keeping your users in the loop is the most important thing.

  3. That's a nice idea and would be ideal. We are planning to have a simple status page, and when we publish an update to it, automatically farm it out to all the channels - Twitter, blog, etc.

  4. Sounds like a good plan. Would love to see it when it's live!

  5. That's pretty cool. I've been using an uptime monitoring service called http://www.internetuptimemonitor.com. They send emails and SMS text messages, but not Twitter updates.

  6. Good discussion of the same event from a different angle: http://blog.programmableweb.com/2008/11/26/the-cloud-does-not-auto-validate-your-work/
