Re: [OpenSTA-users] Performance testing with Open STA
From: Bernie V. <Ber...@iP...> - 2007-03-08 17:48:22
> <Chris writes>
>
> Hi there,
>
> Thanks for your response, so does this mean that running the test for 10
> users simultaneously should not cause time out errors? The developers
> think it may be because it is an unrealistic test, i.e. in the real
> world, I don't think we will have 10 users running the same test and
> performing the same action at the same time for an hour. I think it is
> probably best to ramp the test up such that 1 user is added every 30
> seconds with a 10 second delay until I get to the maximum number of
> users. Do you think this is a more realistic test?
>
> Although running the same test now with 10 simultaneous users, even
> though it displays timeout errors, it still creates records in the
> database.

Chris,

We've certainly left the realm of OpenSTA-related questions and moved into a discussion of performance testing. It's a slow day, so I'll bite.

There are a few major areas of performance testing. Different people use different terminology, so you'll have to put up with mine, understanding that it might not jibe completely with what others say. Still, it's the goal of the testing that is important, not what you call it.

If your goal is to do CAPACITY PLANNING, then you should create a "realistic" workload: a mix of the most popular transactions, plus those deemed critical, presented to the server(s) under test in a realistic fashion. This is easy to say, and I've seen 3-day seminars and countless books dedicated to how to do this "correctly". For the most part it boils down to picking a manageable (in terms of time to develop vs. budget, goals, etc.) set of transactions to emulate, determining the % probability of executing each transaction, the overall arrival rate, and the "success criteria" for the transactions (i.e. response time limits, throughput goals, etc.). Collectively, I'll refer to these attributes as the "workload definition".

One way to implement a given workload definition is to create a master script which is assigned to each VU, have it generate random numbers, and then call other scripts (that model the workload transactions) based on a table of probabilities (a rough sketch follows below). The scripts should be modeled with think times consistent with the way your users will interact with the system. This varies greatly from one app to another and, unless you are mining logs from an application already in use, is somewhat subjective. The best advice I can give is to be conservative, but not so much so that the sum of all your conservative decisions is pathological.

Once you have a workload with pacing (think times) you are comfortable with, increase the number of users and monitor how response times, server resource utilization (CPU, I/O rate, network, and memory), and throughput (number of tasks completed system-wide) vary with the increased load. You might set up your test so you ramp up to a specific number of users, let them run for a while, and repeat as necessary. This way, you capture the behavior of the system at various steady states. The length of time to allow a particular number of users to run varies with a number of factors, including how different the transactions are from one another in terms of resource utilization and response time. If you can't get repeatable results, your steady-state interval might be too small. I've seen intervals as small as 10 minutes work, and other workloads that require an interval of hours to be useful.
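To make that concrete, here's roughly what I mean, sketched in Python-style pseudocode rather than OpenSTA's SCL (the transaction names, probabilities, think times, and the run_transaction() helper are all made up for illustration -- your workload definition supplies the real numbers):

    import random
    import time

    # Workload definition: (transaction name, probability, think time in seconds).
    # Probabilities should sum to 1.0.
    WORKLOAD = [
        ("browse_catalog", 0.60, 8),
        ("search",         0.25, 5),
        ("place_order",    0.15, 15),
    ]

    def run_transaction(name):
        # Placeholder for the recorded transaction script (HTTP calls, timers, etc.)
        print("running", name)

    def virtual_user(duration_s):
        # One VU: repeatedly pick a transaction by probability, run it, then "think".
        end = time.time() + duration_s
        while time.time() < end:
            r = random.random()
            cumulative = 0.0
            for name, prob, think in WORKLOAD:
                cumulative += prob
                if r <= cumulative:
                    run_transaction(name)
                    time.sleep(think)
                    break

The important part is that the transaction mix and the think times come from your workload definition, not from whatever happens to be convenient to script.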
That's a rough outline of one approach to capacity planning, which in summary is an attempt to load up the system with VUs in a way that a VU is indistinguishable from a "real user". Again, much easier said than done. Pick the wrong workload, and your results might be worthless. The end game here is to increase load until response times become excessive (whatever that means to you, but it needs to be defined; again, there's tons of material to read about this), at which point you have found a limit to system capacity. This limit will be due to either a hardware or software bottleneck. Now, if you are on a tuning expedition, then analyze the performance metrics captured, do some tuning, code optimization, or add some hardware resources, and repeat as necessary until you either meet throughput goals, find the limits of the architecture, or run out of time (which happens more often than most performance engineers would like).

The same scripts can be used for SOAK TESTING, where you load up the system at close to its maximum capacity and let it run for hours, days, etc. This is a great way to spot stability problems that only occur after the system has been running a long time (memory leaks are a good example of things you will find).

Run a long test and start failing components (servers, routers, etc.) to see how response times are affected and how long the system takes to return to a steady state, and you are on your way towards FAILOVER TESTING. You can find reams of material to read about failover testing and high availability as well.

If your goal is to determine where or how the system will fail, then you are doing STRESS TESTING. One way to do this is to comment out the think times and increase VUs until something (hopefully not your emulator!) breaks (see the sketch in the P.S. below). This is just one form of stress testing, a valuable aspect of performance testing, but not the same as capacity planning. How the VUs compare to "real users" may be irrelevant, as you are trying to determine how the system behaves when pushed past its limits.

So I guess only you can answer your question. Decide what your goals are (capacity planning, stability testing, failover testing, or stress testing) and then see whether your script and test behavior are aligned with the goal(s).

-Bernie
www.iPerformax.com
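P.S. The stress-test variant of the same idea, sketched the same way: drop the think times and keep adding VUs until something gives. As before, this is illustrative pseudocode, not OpenSTA -- the run_transaction() body, the error_rate() monitor, and the numbers are placeholders you would replace with your own measurements and thresholds.

    import threading
    import time

    def run_transaction(name):
        pass                             # placeholder transaction body

    def error_rate():
        return 0.0                       # placeholder: fraction of recent requests that failed

    def virtual_user(stop_event):
        while not stop_event.is_set():
            run_transaction("place_order")   # no sleep() -- think times removed

    def stress_ramp(step=5, interval_s=30, max_vus=200):
        stop = threading.Event()
        vus = []
        while len(vus) < max_vus and error_rate() < 0.05:   # stop at 5% errors
            for _ in range(step):
                t = threading.Thread(target=virtual_user, args=(stop,), daemon=True)
                t.start()
                vus.append(t)
            time.sleep(interval_s)       # let the new load settle before adding more
        stop.set()
        return len(vus)                  # rough idea of where the system gave out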