A fellow geek asked me the other day, “What’s your take on load testing and profiling?” Incidentally, this question comes up once in a while, so I figured I’d explain my position here.
Profiling
Let’s define profiling first:
A profiler is a performance analysis tool that measures the behavior of a program as it executes, particularly the frequency and duration of function calls.
I consider profilers to be absolutely essential tools, yet they are grossly underutilized. We tend to overlook that a piece of code may be fully tested (pick your favorite xDD fad) and still contain a bottleneck. Bottlenecks are relatively rare, but where they do occur they hemorrhage resources like there’s no tomorrow. Just about the only way to find one is with a profiler, which breaks down the duration of each call to help you zero in on underperforming code.
The two most established .NET profilers are JetBrains’ dotTrace and Red Gate’s ANTS.
Both are excellent tools from two well-known vendors. My only gripe is that profilers tend to be quite expensive. They are too pricey for individuals, although it seems JetBrains added a personal license. As for businesses, you have to explain long and hard why you need a profiler, and the explanation will probably fall on the deaf ears of somebody in the purchasing department. Ask me how I know.
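To make the "break down the duration of each call" part concrete, here is a minimal sketch using Python's built-in cProfile module rather than a .NET tool. The functions slow_sum, fast_sum, and handle_request are made up for illustration; the point is only that the report separates the expensive call from the cheap one.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Hypothetical bottleneck: builds a throwaway list on every iteration.
    total = 0
    for i in range(n):
        total += sum([i] * 50)
    return total


def fast_sum(n):
    # Same result as slow_sum, without the throwaway lists.
    return sum(i * 50 for i in range(n))


def handle_request():
    # Stand-in for a piece of application code that passes its tests
    # but hides one expensive call.
    return slow_sum(2000) + fast_sum(2000)


def profile_report(func):
    # Run func under the profiler and return the per-function breakdown
    # as text, sorted by cumulative time.
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(10)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(handle_request))
```

Both functions return the same value, so a unit test would not tell them apart. The profile report is what shows where the time actually goes.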
Load Testing
This is a completely different beast.
Load testing is the process of putting demand on a system or device and measuring its response. […]
When the load placed on the system is raised beyond normal usage patterns, in order to test the system’s response at unusually high or peak loads, it is known as stress testing. The load is usually so great that error conditions are the expected result, although no clear boundary exists when an activity ceases to be a load test and becomes a stress test.
The idea of load testing is to simulate heavy traffic to see how the app performs under stress. Usually, you “teach” the tool to jump from one page to another by simulating a user’s interaction with the app. The load test records the path and plays it back.
My beef with this approach is that you’re leading your tool down a happy path. Lucky you if you find problems there. Those are easy to fix. Unfortunately, this is not how it happens in real life.
In real life the interaction doesn’t happen at the same pace. It doesn’t follow the happy path. Users will come up with a combination of steps you couldn’t even imagine. It’s impossible to teach a load testing tool what we don’t know!
Load testing has meaning only if you can answer this seemingly simple question: what does success look like to you? Success in terms of requests/second, bandwidth, number of concurrent sessions (not users!), etc. I’ve never been given a straight answer.
I’ve sat in meetings where numbers were pulled out of thin air. Inevitably, this is where discussions would end. If you don’t have a meaningful success metric, you’re shooting in the dark. You can’t even define what to load test or how.
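For what it's worth, here is a rough sketch of what writing the success metric down before the test could look like. The SuccessCriteria fields and every threshold in the usage example are hypothetical, which is exactly the problem: somebody has to supply real ones.

```python
import math
from dataclasses import dataclass


@dataclass
class SuccessCriteria:
    # Hypothetical targets; real values must come from the business.
    min_requests_per_second: float
    max_p95_latency_ms: float
    max_error_rate: float


def percentile(values, pct):
    # Nearest-rank percentile: the smallest value with at least
    # pct percent of the samples at or below it.
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate(latencies_ms, errors, duration_s, criteria):
    # latencies_ms holds one entry per successful request;
    # errors counts the failed ones.
    total = len(latencies_ms) + errors
    results = {
        "requests_per_second": total / duration_s,
        "p95_latency_ms": percentile(latencies_ms, 95),
        "error_rate": errors / total,
    }
    passed = (
        results["requests_per_second"] >= criteria.min_requests_per_second
        and results["p95_latency_ms"] <= criteria.max_p95_latency_ms
        and results["error_rate"] <= criteria.max_error_rate
    )
    return passed, results


if __name__ == "__main__":
    criteria = SuccessCriteria(
        min_requests_per_second=5,
        max_p95_latency_ms=200,
        max_error_rate=0.01,
    )
    # Made-up measurements standing in for a load generator's output.
    latencies = [100] * 95 + [900] * 5
    print(evaluate(latencies, errors=0, duration_s=10, criteria=criteria))
```

With the criteria written down, the test has a pass/fail answer instead of a set of charts to stare at.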
Michael Nygard, in Release It! (my favorite technical read of 2008) says this:
Most load tests deliver results after the test is done. Since the data come from the load generators rather than inside the systems under test, it is a “black-box” test.
On the subject of simulating traffic:
Load testing is both an art and a science. It is impossible to duplicate real production traffic, so you use traffic analysis, experience, and intuition to achieve as close a simulation of reality as possible.
Neither are load tests effective at detecting longevity bugs, he argues:
The major dangers to your system’s longevity are memory leaks and data growth. Both kinds of sludge will kill your system in production. Both are rarely caught during testing. […]
These sorts of bugs usually aren’t caught by load testing either. A load test runs for a specified period of time and then quits.
He goes on to suggest a separate environment set up solely to run longevity tests. “Don’t hit the system hard; just keep driving requests all the time.” That’s an interesting change of perspective. Instead, what I usually see is, “Let’s load test this thing and then we’ll look at, um, charts.”
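Here is a toy sketch of that "just keep driving requests" idea. It is my own illustration, not anything from the book: it assumes the system under test is an in-process callable (send_request is a placeholder) and uses Python's tracemalloc to sample memory, whereas a real longevity environment would drive a deployed system for days and watch its memory and data growth from the outside.

```python
import time
import tracemalloc


def drive_steadily(send_request, requests_per_second, duration_s, sample_every=10):
    # Send requests at a gentle, fixed pace and record traced memory
    # every sample_every requests so slow growth becomes visible.
    interval = 1.0 / requests_per_second
    samples = []
    sent = 0
    tracemalloc.start()
    try:
        deadline = time.monotonic() + duration_s
        next_send = time.monotonic()
        while time.monotonic() < deadline:
            send_request()
            sent += 1
            if sent % sample_every == 0:
                current, _peak = tracemalloc.get_traced_memory()
                samples.append(current)
            # Schedule against the clock so pacing does not drift.
            next_send += interval
            delay = next_send - time.monotonic()
            if delay > 0:
                time.sleep(delay)
    finally:
        tracemalloc.stop()
    return sent, samples


if __name__ == "__main__":
    leaked = []

    def leaky_request():
        # Hypothetical handler that never releases what it allocates.
        leaked.append(bytearray(100_000))

    sent, samples = drive_steadily(leaky_request, requests_per_second=100, duration_s=1)
    print(sent, samples[0], samples[-1])
```

The load is deliberately low. What matters is that the samples keep climbing over time, which a load test that runs for a fixed period and then quits would never show.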
Apples vs. Oranges
As you can see, this is really an apples-to-oranges comparison: two different activities aimed at gaining different insights.
Profiling, to me, is very quantifiable. I can see where I need to improve code. Load testing, on the other hand, is a black box.
My hope is that we can encourage more developers to use and more companies to buy profilers (and that they would become more affordable). A professional developer should have a good profiler in his or her toolbox.