Modeling the Real World for Load Testing Web Sites
Introduction
If you want to get an accurate idea of how your Web site is going to perform in the real world, it pays to create a load profile that closely models conditions your site will experience. This article addresses nine elements that can affect Web load.
Requesting your Web site’s home page 100 times per minute is not going to give you a very accurate idea of how your Web site is actually going to perform in the real world. This article seeks to provide an overview of some of the load parameters that you should consider when designing the test load to be used for performance testing a Web site. The more accurate the load profile can be made, the closer the performance tests will be to modeling the real world conditions that your Web site will ultimately have to survive. This, in turn, will lead to more reliable test results.
1. User Activities
Many transactions occur more frequently than others, and should therefore comprise a larger proportion of the test data and scripts used for performance testing.
In order to simulate the real workload that a Web site is likely to experience, it’s important that the test data and scripts are a realistic representation of the types of transactions that the Web site can be expected to handle. If your organization typically only sells one product for every fifty visitors, it would be unrealistic to weigh two test scripts evenly (one mimicking a browser customer, the other an actual buyer). A better approach would be to use a 50:1 weight (fifty browsers for every buyer).
2. Think Times
The time that it takes for a client (or virtual client) to respond to a Web site has a significant impact on the number of clients that the site can support. People, like computers, have a wide range of different processing levels and abilities. Different users require various amounts of time to think about the information that they have received. Some users race from one Web page to another, barely pausing long enough to comprehend what they’ve seen; others need some time to contemplate what they’ve read before moving on to the next page. The length of this pause is called think time. Think time is generally considered to be the length of time from when a client receives the last data packet for the Web page currently being viewed to the moment that the client requests a new page.
In theory, a Web site that can support a maximum of 100 "10-second aggressive" clients should be able to support 300 "30-second casual" clients because both types of clients result in 600 transactions per minute. Unfortunately, this theory only holds true for the most rudimentary of Web sites. Web sites that are more interactive require resources for both active and dormant clients, meaning that the 300 casual clients are likely to require more resources than the 100 aggressive clients.
When recording or writing test scripts for performance testing, you should consider how much time each of the various types of clients might spend thinking about each page. From this data, you can create a reasonable distribution of timings and, possibly, a randomizing algorithm. Web logs can be a good source for estimating think times for Web pages that have already been hosted in production.
3. Site Abandonment
Basic test scripts/tools assume that a client will wait until a Web page has been completely downloaded before requesting the subsequent page. Unfortunately in real life, some users may be able to select their next page before the previous page has been completely downloaded (e.g., they may know where they want to go as soon as the Navigation bar appears).
Alternatively, some users will surf off to another Web site if they have to wait too long (i.e., terminate the test script before reaching the end of the script). The percentage of users that abandoned a Web site will vary from site to site depending on how vital the content is to the user (e.g., a user may only be able to get their bank balance from their bank’s Web site, but could get a stock quote from numerous competitors). One thing is certain; the longer a user has to wait, the more likely it is that they will abandon the Web site. Therefore, in order to model the real world, test scripts should include a randomized event that will terminate a test script execution for a particular client if the client is forced to wait too long for the page to download. The longer the delay, the more likely that the client will be terminated.
4. Usage Patterns
Unlike typical mainframe or client-server applications, Web sites often experience large swings in usage depending on the type of visitors that come to the site. U.S. retail customers, for example, typically use a Web site in the evenings (7:00 p.m. EST to 11:00 p.m. PST). Business customers typically use a Web site during regular working hours (9:00 a.m. EST to 4:00 p.m. PST). The functionality of a Web site can also have a significant impact on usage patterns. U.S. stock quotes, for example, are typically requested during market trading hours (9:30 a.m. EST to 4:00 p.m. EST).
When attempting to model the real world, you should conduct some research to determine peak usage ramp-up and ramp-down, peak usage duration, and whether any other load profile parameters vary by time of day, the day of the week, or another time increment. Once researched, schedule tests that will run over the real Internet at appropriate times of the day/week.
5. Client Platforms
Different client-side products (e.g., Browsers and O/Ss) will cause slightly different HTTP traffic to be sent to the Web server. More important, if the Web site has been designed to serve up different content based on the client-side software being used (a technique commonly refereed to as browser sniffing), then the Web site will have to perform different operations with correspondingly different workloads.
Some browsers allow users to change certain client-side network settings (threads, version of HTTP, and buffer sizes) that affect the way the browser communicates and thus the corresponding workload that a browser puts on a Web server. While few users ever change their default settings, because different browsers/versions have different defaults, a more accurate test load would vary these values.
6. Client Preferences
Most browsers also allow users to change client-side preferences; but again, few users actually change their default settings. However, different products/versions of a browser may have different default settings. For example, a browser with cookies disabled will reduce the amount of network traffic due to the cookies not being sent back and forth between the Web site and the browser; however, it might increase the resource requirements of the application server as it struggles to maintain a session with the user without the convenience of the cookie.
If encryption is going to be used to send and receive secure Web pages, the strength (or key size) of the encryption used to transfer the data will be dependent upon a negotiation that takes place between the Web server and the browser. Stronger encryptions utilize more network bandwidth and increase the processing requirements of the CPUs that perform the encrypting and deciphering (typically the Web server). Therefore, users with low settings (e.g., 40-bit keys) will put less of a strain upon a Web server than users with high settings (e.g. 128-bit keys).
By indicating that they do not want graphics or applets downloaded, Web site visitors will not only speed up the delivery of a Web page that contains these files, but will also consume a smaller portion of the Web site’s bandwidth and fewer Web server connections. If present in significant numbers, these clients can have a noticeable effect upon the performance of the Web site.
7. Client Internet Access Speeds
The transmission speed or bandwidth that your Web application will use can have a significant impact on the overall design, implementation, and testing of your Web site. In the early days of the Web (circa mid-1990s), 14.4 Kbps was the most common (e.g., standard) communications speed available. Hence, 14.4 Kbps became the lowest common denominator for Internet access. When 28.8 Kbps modems were introduced, however, they offered a significant performance improvement over 14.4 Kbps modems and quickly surpassed 14.4 Kbps modems in popularity. When 56.6 Kbps modems were introduced, the performance improvement wasn’t as significant. Consequently, 28.8 Kbps are still in use and unlike the 14.4 Kbps (which has nearly vanished) still comprise a significant (although decreasing) proportion of the Internet population. Many companies therefore use the 28.8 Kbps transmission speed when specifying the performance requirements of their Web site.
8. Background Noise
Unless the production servers and network are going to be dedicated to supporting the Web site, you should ensure that the servers and network in the system test environment are loaded with appropriate background tasks. When designing a load to test the performance of a Web site or application, consider what additional activities need to be added to a test environment to accurately reflect the performance degradation caused by "background noise." Background noise is created by other applications running that will also be running on the production servers once the application under test moves into production, and other network traffic that will consume network bandwidth and possibly increase the collision rate of the data packets being transmitted over the LAN and/or WAN.
9. User Geographic Locations
Due to network topologies, response times for Web sites vary around the country and around the world. Internet response times can vary from city to city depending on the time of day, the geographic distance between the client and the host Web site, and the local network capacities at the client-side. Remote performance testing can become particularly important if mirror sites are to be strategically positioned in order to improve response times for distant locations.
But how can you effectively test the performance of your Web site from locations that are thousands of miles away? Possible solutions include
* using the services of a third-party company that specializes in testing a Web site from different locations around the world
* utilizing the different physical locations (branches) that your organization may already possess, coordinating the execution of your test plan with coworkers at the offices
* using a modem to dial ISP telephone numbers in different cities and factoring out the additional time for the cross-country modem connection
* buy an "around the world" airplane ticket for one or more of the Web site’s testers
Getting the Right Mix
When developing the load profile that will be used for performance testing, try to take into account as many of the previously mentioned parameters as possible. While a single parameter may only affect the test results by a few percent, the accumulation of several parameters may add up and have a significant impact of the test results.