Showing posts with label performance. Show all posts

Friday, April 30, 2010

A proposal for a new community focused on web performance

I've been really impressed with the StackExchange platform (http://www.stackexchange.com, made by the same people who run stackoverflow.com), and I feel that it could be an extremely effective platform to host a community focused on web performance. They built the platform from scratch in order to address the innate flaws of regular threaded discussion boards (e.g. Yahoo forums, Google Groups, phpBB, vBulletin, etc.). More importantly, the platform walks the line between incentivizing quick answers (for immediate feedback) and keeping answers from becoming obsolete over time.

My hope is that this site becomes an evolving source of definitive answers on web performance best practices, tips, tool tricks, book recommendations, data exchange, etc.

The process to make this a reality is:
1. Submit a proposal for peer review
2. If there is enough support (votes), it moves on to the next stage.
3. People that would like to participate in the community (and help manage it) sign up
4. The details of the community get ironed out (moderators, name, tags, etc.)
5. It goes public

I've gone ahead and submitted the initial proposal (step 1):
http://meta.stackexchange.com/questions/5821/proposal-for-stackexchange-site-focused-on-web-site-performance

I'm just here to get the initial ball rolling, but from here on out it's going to be all about the greater community. This next stage, where everyone votes on the proposals, is going to make or break the concept. It's already received a good number of votes, but it's going to take a lot more support to push it forward. If you think this has legs, and can see the value, vote it up!

Friday, April 9, 2010

Your site's performance now affects your Google search ranking

Today Google officially followed up on a promise they made last year:
"Speeding up websites is important — not just to site owners, but to all Internet users. Faster sites create happy users and we've seen in our internal studies that when a site responds slowly, visitors spend less time there. But faster sites don't just improve user experience; recent data shows that improving site speed also reduces operating costs. Like us, our users place a lot of value in speed — that's why we've decided to take site speed into account in our search rankings. We use a variety of sources to determine the speed of a site relative to other sites."
If the ROI of page performance wasn't clear enough, we now have a big new reason to focus on optimizing performance. The big question is what Google considers "slow", and how search rankings are affected (e.g. are you boosted up if you are really fast, or are you pushed down if you are really slow, or both?). When are you done optimizing? Google has a big opportunity to set the bar, and give sites a clear target. Without that, the impact of this move may not be as beneficial to the speed of the web as they hope.

What we know
  1. Site speed is taken "into account" in search rankings.
  2. "While site speed is a new signal, it doesn't carry as much weight as the relevance of a page".
  3. "Signal for site speed only applies for visitors searching in English on Google.com at this point".
  4. Google is tracking site speed using both the Googlebot crawler and the Google Toolbar passive performance stats.
  5. You can see what performance Google is recording for your site (only from Google Toolbar data) in the Webmaster Tools, under "Labs".
  6. In the "Performance overview" graph, Google considers a load time over 1.5 seconds "slow".
  7. Google is taking speed very seriously. The faster the web gets, the better for them.
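To make item 6 concrete, here is a minimal sketch of how you might flag your own pages against that 1.5 second line. The threshold comes from the Webmaster Tools graph; the function name, the page paths, and the sample timings are made up for illustration, and this says nothing about how Google actually factors speed into rankings.

```python
# Threshold taken from the "Performance overview" graph in Webmaster Tools.
SLOW_THRESHOLD_SECONDS = 1.5


def label_load_times(load_times):
    """Map each page to "slow" or "fast" using the 1.5 s threshold.

    load_times: dict of page path -> measured load time in seconds
    (hypothetical measurements, e.g. from your own monitoring).
    A page is "slow" only when it is strictly over the threshold.
    """
    return {
        page: "slow" if seconds > SLOW_THRESHOLD_SECONDS else "fast"
        for page, seconds in load_times.items()
    }


# Hypothetical sample data, not real measurements.
sample = {"/": 0.9, "/search": 1.5, "/checkout": 3.2}
labels = label_load_times(sample)
```

With the sample data above, only "/checkout" ends up labeled "slow".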
What we don't know
  1. What "slow" means, and at what point you are penalized (or rewarded).
  2. How much weight is given to the Googlebot stats versus the Google Toolbar stats.
  3. What Google considers "done" when a page loads (e.g. Load event, DOMComplete event, HTML download, above-the-fold load, etc.). Does Googlebot load images/objects, and if so does it use a realistic browser engine?
  4. How much historical data it looks at to determine your site speed, and how often it updates that data.
  5. Whether there will be any transparency into the penalties/rewards.
What I think
  1. Site performance is only going to be a factor when your site is extremely slow.
  2. Extremely slow sites will be pushed down in the rankings, but fast sites probably won't see a rise in the rankings.
  3. "Slow" is probably a high number, something like 10-20 seconds, and plays a bigger role in the final rankings as the speed gets slower. Regular sites won't be affected, even if they are subjectively slow.
  4. This is probably just the beginning, and we should expect tweaking of these metrics as we become more comfortable with them. We'll probably be seeing new metrics along the same lines in the coming years (e.g. geographical performance, Time-to-Interact versus onLoad, consistency versus average, reliability, etc.).

Tuesday, February 16, 2010

The Tao of Web Performance and Uptime

Who cares about fast web pages. Who cares about uptime. I mean really.

Does it truly matter to you whether a page loads a couple seconds faster? Are we wasting our lives keeping servers up 99.999%? Are we making an impact on the world in a meaningful way? Does it actually matter in the scheme of things?

I think the answer is yes. It does matter. It matters a lot. But it isn't for the reasons you think.

What would most people answer when asked "Why are you spending your life making web pages faster and keeping servers up?" Here are my guesses:

It's my job
I'll get fired. I'll let my peers down. I'll hurt the company. My boss will be mad at me if I don't do what I'm told. All good reasons (unless you've read Seth Godin's new book). But do these reasons make you honestly care about web page performance? Does it make you happy to spend your precious lifetime keeping servers running all hours of the night? The real question is not whether performance and uptime matter. The question that you should be asking is: Do performance and uptime matter to you, as a human being? If your answer to this relies on this being your job, on the security and comfort that caring about it provides, then the answer is no. You won't find fulfillment in your work if your motivation is being a good cog.

It's my company
Being the person making money off of the cogs (as they improve page performance and keep the system stable) changes the equation. No doubt, keeping your servers up is critical to the success of your online business (usually). Furthermore, the ROI of page performance is fairly conclusive. Clearly, uptime and performance lead to more revenue (or at least less lost revenue). But what do you actually care about in this equation? How quickly the pages load, or how much money you're making as a result of those faster pages? You don't want to spend your time optimizing pages all day. You want to get done with it as quickly as possible and get back to doing business. You certainly care about page performance and uptime, but only as a tool, a necessary evil, that helps you optimize your true passion (whatever that may be).

It's fun
You look at tuning performance or building highly reliable systems as a puzzle. You enjoy the work because you are good at it, or you want to accomplish something no one else has in the past. Your motivation is either the thrill of the problem or personal brand building in your group/company/industry. You enjoy performance and uptime for the opportunity that it offers, and the feeling of accomplishment that it brings when you improve performance by 23% or keep the system up during a marketing blitz. This explanation gets close to being a good reason, but it lacks something. It's selfish. It focuses on you. It doesn't give you a purpose, or impact the world in a meaningful way. Fun will take you so far, but at some point you'll wonder "what's the point?" and move on to the next challenge.

It makes people happy
This is it. This is why it matters. This is why it is worth spending your time making web pages faster and keeping servers up. To put it simply, it makes people happier. To quote Matt Mullenweg, founder of WordPress:
"That's why [performance] is important and why we should be obsessed and not be discouraged when it doesn't change the funnel. My theory here is when an interface is faster, you feel good. And ultimately what that comes down to is you feel in control. The web app isn't controlling me, I'm controlling it. Ultimately that feeling of control translates to happiness in everyone. In order to increase the happiness in the world, we all have to keep working on this."
How can we quantify this? We have data showing that a slower google.com and bing.com result in fewer searches, and more importantly that user satisfaction goes down with each additional performance decrease. AOL shows us that page views drop off as page load times increase. Optimizations to Google Maps increased user interaction with the site significantly. The faster the site, the more you want to use it. Let's delve into more evidence...

Flow
If you haven't yet come across the concept of flow:
"Flow is the mental state of operation in which the person is fully immersed in what he or she is doing by a feeling of energized focus, full involvement, and success in the process of the activity. Proposed by Mihály Csíkszentmihályi, the positive psychology concept has been widely referenced across a variety of fields.

According to Csíkszentmihályi, flow is completely focused motivation. It is a single-minded immersion and represents perhaps the ultimate in harnessing the emotions in the service of performing and learning. In flow the emotions are not just contained and channeled, but positive, energized, and aligned with the task at hand. To be caught in the ennui of depression or the agitation of anxiety is to be barred from flow. The hallmark of flow is a feeling of spontaneous joy, even rapture, while performing a task."
How do performance and uptime relate to flow? Researchers asked this very question and found some unsurprising results:
"Hoffman, Novak, and Yung found that the speed of interaction had a “direct positive influence on flow” on feelings of challenge and arousal (which directly influence flow), and on importance. Skill, control, and time distortion also had a direct influence on flow.

The researchers then applied their model to consumer behavior on the web. They tested web applications (chat, newsgroups, and so on) and web shopping, asking subjects to specify which features were most important when shopping on the web.

They found that speed had the greatest effect on the amount of time spent online and on frequency of visits for web applications. For repeat visits, the most important factors were skill/control, length of time on the web, importance, and speed.

So to make your site compelling enough to return to, make sure that it offers a perceived level of control by matching challenges to user skills, important content, and fast response times."
When asked about the importance of speed on flow, Csíkszentmihályi offers:
"If you mean the speed at which the program loads, the screens change, the commands are carried out—then indeed speed should correlate with flow. If you are playing a fantasy game, for instance, and it takes time to move from one level to the next, then the interruption allows you to get distracted, to lose the concentration on the alternate reality. You have time to think: “Why am I wasting time on this? Shouldn’t I be taking the dog for a walk, or studying?”— and the game is over, psychologically speaking."
Clearly, speed plays a key role in attaining flow. If you believe (as I do) that flow is a good thing, and brings on happiness, then giving your visitors the chance to enter a flow state is a worthwhile pursuit.

Usability
From the guru of web usability, Jakob Nielsen:
"Every web usability study I have conducted since 1994 has shown the same thing: Users beg us to speed up the page downloads. In the beginning my reaction was along the lines of "Let's just give them better design, and they will be happy to wait for it." I have since become a reformed sinner believing that fast response times are the most important design criterion for web pages; even my skull isn't thick enough to withstand consistent user pleas year after year."
Users are begging us to speed up page load times!

User Psychology
A good number of studies have further connected slow web pages (and unreliable web applications) with frustration and higher blood pressure.

  • "slow response time generated higher ratings of frustration and impatience"
  • "Frustration occurs at an interruption or inhibition of the goal-attainment process, where a barrier or conflict is put in the path of an individual"
  • "Slow websites inhibit users from reaching their goals, causing frustration"
  • "It was found that in the context of human–computer interactions while browsing a Web site, flow experience was characterized by time distortion, enjoyment, and telepresence."
A study done by Forrester Consulting (on behalf of Akamai):
  • "finds that website performance has a direct impact on revenues, profits and satisfaction."
  • "The findings indicate that website performance is second only to security in user expectations"

Drive
Daniel Pink's new book Drive argues that "the biggest motivator at work is making progress" (link). Anything that gets in the way of you making progress makes you less happy. As the web becomes a bigger part of where work is done (be it SaaS, the cloud, or Twitter), the more important the speed and reliability of those web sites becomes. Progress, motivation, and happiness will be increasingly tied to the performance and stability of the web.

Still not convinced?
Let's look at the flip side. Slow and unreliable sites make people very upset:

Downtime creates pain and frustration. Slow web applications piss people off. Ironically, the more popular your site, and the more useful it is to your users, the more unhappiness you can cause.

This unhappiness does not end at your firewall. I don't have to tell you how stressful downtime is internally. Your peers have to work nights, your boss has to explain what happened to their boss. If you are dogfooding your applications, internal productivity is affected by both downtime and bad performance. Simply put, downtime and bad web performance create a lot of unhappy people.

Where does this leave us?
Let's ask the same question we asked earlier:
"Does it truly matter to you whether a page loads a couple seconds faster? Are we wasting our lives keeping servers up 99.999%? Are we making an impact on the world in a meaningful way? Does it actually matter in the scheme of things?"
Hopefully I've convinced you that there is a strong link between performance/stability and happy users. That's all well and good, but does this matter in the scheme of things? Two stats stand out to me:
  1. The number of Internet users worldwide: 1,733,993,741
  2. The amount of time spent online per week: 13 hours
If we as an industry can impact the happiness of almost 2 billion people by making the web a little bit faster, or a little more stable, I say that this does indeed impact the world in a meaningful way. By plugging away at our little problems, our minor tweaks, our tools and tricks, we are helping our users, and the world at large, become a little happier.

"Happiness is the meaning and the purpose of life, the whole aim and end of human existence" -- Aristotle

Saturday, November 15, 2008

Comparing Amazon Web Services, Salesforce, and Zoho's online health dashboards

Now that there are three major SaaS players offering online service health dashboards, and one from Google on its way, I thought it would be a useful exercise to compare the offerings from Amazon Web Services, Salesforce, and Zoho. This will hopefully be helpful for anyone planning to launch their own health dashboard, and to the general online community in making sense of what is important to understand about these dashboards.

Disclaimer: If I have mistakenly misrepresented anything, or if I missed any information, PLEASE let me know in the comments below.

What providers are we looking at today?
What is the URL of each status page (and are they easy to remember in times of need)?
What are these status pages called?
  • Amazon Web Services: "AWS Service Health Dashboard"
  • Salesforce: "Trust.salesforce.com - System Status" (Note: salesforce.com goes beyond simply providing system status by also providing security notices, both under their "Trust.salesforce.com" brand)
  • Zoho: "Zoho Service Health Status"
What services' health are reported on?
  • Amazon Web Services: All four core services (EC2, S3, SQS, SimpleDB), plus Mechanical Turk and FlexPay. They also break out the two S3 datacenter locations (EU and US), the two ends of a Mechanical Turk transaction (Requester and Worker), plus the EC2 API.
  • Salesforce: Only the core salesforce.com services across 12 individual systems (based on geographic location and purpose).
  • Zoho: All 23 Zoho services are covered, plus their mobile site and their single sign-on system.
What health information is provided?
  • Amazon Web Services: Current status, plus about 30 days of historical status. Status is determined to be one of "Service is operating normally", "Performance Issues", or "Service disruption". "Information messages" are occasionally provided.
  • Salesforce: Current status, plus exactly 30 days of historical status. Status is determined to be either "Instances available", "Performance Issues", "Service disruption", or "Status not available". "Informational messages" are also provided on occasion.
  • Zoho: Current status and the response time for the past hour, in addition to historical uptime for the past week. Also provided are two graphs representing uptime and response time for the past seven days. If that wasn't enough, current uptime and response time from six geographical locations are also given.
Where does the uptime and performance data come from?
  • Amazon Web Services: No clue.
  • Salesforce: No clue.
  • Zoho: Their own "Site 24x7" monitoring service.
What is considered downtime and what is considered a performance issue?
  • Amazon Web Services: No clue.
  • Salesforce: No clue.
  • Zoho: No clue.
Are real time updates provided during downtime events? Is it easy to find?
  • Amazon Web Services: Yes, but unclear how consistently and how easy it is to find that information.
  • Salesforce: Yes, right underneath the current status.
  • Zoho: Does not appear so, but if the issue is big enough they may update customers through their blog.
Is information provided on past downtime events?
  • Amazon Web Services: Yes. Mousing over a past performance or downtime event brings up a chronological log of events that took place, from detection to resolution. In addition, major downtime events are explained.
  • Salesforce: Yes. Clicking on any past event brings up a window giving the time of the event, a detailed description of the problem, and a root cause analysis.
  • Zoho: No. Unless they are described in the blog.
Is there a way to easily report problems users are having?
  • Amazon Web Services: Yes, clicking the "Report an Issue" link.
  • Salesforce: No, other than using the standard support channels.
  • Zoho: No, other than using the standard support channels.
How can you get notified of problems (without watching this page 24/7)?
  • Amazon Web Services: Ability to subscribe to RSS feeds for change in status of each service.
  • Salesforce: No.
  • Zoho: No.
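If a provider publishes an RSS feed for status changes, as Amazon does, you can poll it instead of watching the page. Here is a rough sketch using only the Python standard library. The feed XML below is a made-up sample in plain RSS 2.0 shape, not Amazon's actual feed, the function name is my own, and I'm assuming the newest entry comes first in the feed.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample; a real feed would be fetched over HTTP.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Service Status</title>
    <item>
      <title>Service is operating normally</title>
      <pubDate>Sat, 15 Nov 2008 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Performance Issues</title>
      <pubDate>Sat, 15 Nov 2008 09:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def latest_status(feed_xml):
    """Return (title, pubDate) of the first item in the feed, or None.

    Assumes the feed lists its newest entry first.
    """
    root = ET.fromstring(feed_xml)
    item = root.find("./channel/item")
    if item is None:
        return None
    return item.findtext("title"), item.findtext("pubDate")


status = latest_status(SAMPLE_FEED)
```

From there it is a small step to compare the latest title against the last one you saw and send yourself an alert when it changes.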
Conclusions: The best practices for online service health dashboards are still being formed, and it's clear that each service provider has approached the need for transparency differently. Amazon Web Services provides a simple and easy to understand overview of the health of each service, but provides little insight into who is impacted and what specific functionality is down. Salesforce provides clear insight into which customers may be affected by an event, but offers little insight into specific functionality that may be down or slow. Zoho provides the most data by far for each service they provide, but does not have a system in place to communicate details about specific downtime events beyond the company blog. Amazon and Salesforce offer no insight into how they collect the health information, and all three give no information on what is meant by downtime or performance problems.

Closing questions for each provider:
  • Amazon Web Services: What does "EC2 API" actually mean? Which API is this referring to and why not cover the APIs for the other services?
  • Salesforce: Does each server status cover every application level and API on that server? Can you offer more insight into specific services?
  • Zoho: Do you expect to add details about current and past downtime events to the health dashboard? What do you expect your customers to do when they see a red light? If you answer "Email Support", you don't get the power of this status page.
  • To all: How is the health actually monitored (especially for the GUI focused Salesforce and Zoho services)? Working at a (the best) web monitoring company, I know how hard it is to monitor complex web applications.
Notable mentions: The following services also offer a health dashboard page, but to keep the comparison from getting overly complex I decided to leave them out. If anyone would like me to review these, or any other service that I missed, I'd be more than happy to. Just leave a note in the comments.