
3
Application of Statistical Science to Testing and Evaluating Software Intensive Systems
Jesse H. Poore, University of Tennessee; and Carmen J. Trammell, CTI PET Systems, Inc.
1. Introduction
Any large, complex, expensive process with myriad ways to do most activities, as is the case with software development, can have its cost-benefit profile dramatically improved by the use of statistical science. Statistics provides a structure for collecting data and transforming it into information that can improve decision making under uncertainty. The term "statistical testing" as typically used in the software engineering literature has the narrow reference to randomly generated test cases. The term should be understood, however, as the comprehensive application of statistical science, including operations research methods, to solving the problems posed by industrial software testing. Statistical testing enables efficient collection of empirical data that will remove uncertainty about the behavior of the software intensive system and support economic decisions regarding further testing, deployment, maintenance and evolution. The operational *usage model* is a formalism that enables the application of many statistical principles to software testing and forms the basis for efficient testing in support of decision making.
The software testing problem is complex because of the astronomical number of scenarios of use and states of use. The domain of testing is large and complex beyond human intuition. Because the software testing problem is so complex, statistical principles should be used to guide testing strategy. In general, the concept of "testing in quality" is costly and ineffectual; software quality is achieved in the requirements, architecture, specification, design and coding activities. Although not within the scope of this essay, verification or reading techniques (Basili et al., 1996) are of critical importance to achieving quality software and may efficiently and effectively displace some testing. The problem of doing just enough testing to
remove uncertainty regarding critical performance issues, and to support a decision that the system is of requisite quality for its mission, environment or market, is a problem amenable to solution by statistical science. The question is not whether to test, but when to test, what to test and how much to test.
Testing can be justified at many different stages in the life cycle of a software intensive system. There is, for example, testing at various stages in development, testing of reusable components, testing associated with product enhancements or repairs, testing of a product ported from one hardware system to another, and "customer testing" (i.e., field experience). Service in the field is the very best "testing" information, because it is real use, often extensive, and free except for the cost of data collection. The usage model can be the framework, the common denominator, for combining test and usage experience across different software engineering methods and life cycle phases so that maximum use can be made of all available testing and field-use information.
A statistical principle of fundamental importance is that a population to be studied must first be characterized, and that characterization must include the infrequent and exceptional as well as the common and typical. It must be possible to represent all questions of interest and all decisions to be made in terms of this characterization. All experimental design methods require such a characterization and representation, in one form or another, at a suitable level of abstraction. When applied to software testing, the population is the set of all possible scenarios of use with each accurately represented as to frequency of occurrence.
One such method of characterization and representation is the operational usage model. The states-of-use of the system and the allowable transitions among those states are identified, and the probability of making each allowable transition is determined. These models are then represented in the form of one or more highly structured Markov chains (a type of statistical model, see e.g. Kemeny and Snell, 1960), and the result is called a usage model (Whittaker and Poore, 1993; Whittaker and Thomason, 1994).
From a statistical point of view, all of the topics in this paper follow sound problem solving principles and are direct applications of well-established theory and methodology. From a software testing point of view, the applications of statistical science discussed below are not in
widespread use, nor is the full process presented here in routine use. Many methods and segments of the process are used in isolated pockets of industry, on both experimental and routine bases. This paper is a composite of what is in hand and within reasonable reach in the application of statistical science to software testing.
Statistical testing based on usage models can be applied to large and complex systems because the modeling can be done at various levels of abstraction and because the models effectively allow analysis and simulation of use of the application rather than the application itself.
Many of the methods that follow are well within the capability of most test organizations, with a modest amount of training and tool support. Some of the ideas are more advanced and would require the services of a statistician the first few times they are used, or until packaged in specialized tool support. Some of the advanced methods would require a resident analyst. However, the methods lend themselves to small and simple beginnings with big payoff, and to systematic advancement in small steps with continued good return on the investment.
2. Failures in the Field
Failures in the field, and the cost (social as well as monetary) of failures in the field, are the motivation behind statistical testing. The collection, classification and analysis of field failure reports on software products has been standard practice for decades for many organizations and is now routine for most software systems and software intensive systems regardless of the maturity of the organization. Field data is analyzed for a variety of reasons, among them the ability to budget support for the next release, to compare with past performance, to compare with competitive systems, and to improve the development process.
Field failure data is unassailable as evidence of need for process improvement. The operational front line is the source of the most compelling statistics. The opportunities to compel process changes move upstream from the field, through system testing, code development, specification writing, and into stating requirements. Historically, the further one moves
upstream, the more difficult it has been to effect a statistically based impact on the software development process that is designed to reduce failures in the field. Progress has been made, however, in applying statistical science to prevention of field failures: given an operational usage model, it is possible to have a statistically reasoned and economically beneficial impact on all aspects of the software life cycle.
3. Understanding the Software Intensive System and Its Use
A usage model characterizes the population of usage scenarios for a software intensive system. Usage models are constructed from specifications, user guides, or even existing systems. The "user" might be a human, a hardware device, another software system, or some combination. More than one model might be constructed for a single system if there is more than one environment of interest.
For example, a cruise control system for trucks has both human and hardware users since it exchanges information with both. The usage model would be based on the states of use of the system—system off, system on and accelerating, system on and coasting, etc.—and the allowable transitions among the states. The model could be constructed without regard to whether the supplier will be Delco or Bosch. It is irrelevant that one uses a processor made by Motorola and the other by Siemens, that they have very different internal states, or that one is programmed in C and the other in Ada. It is conceivable that the system would be tested in two environments of use—operation as a single vehicle and operation in a convoy with vehicle-to-vehicle communication.
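As a sketch, the structural phase of such a usage model can begin as nothing more than the states of use and their allowable transitions, together with a check that the structure is well formed. The state names below paraphrase the cruise control example and carry no probabilities yet; the dictionary representation is an illustrative assumption, not a notation from the chapter.

```python
# Structural phase only: states of use and allowable transitions for a
# hypothetical truck cruise control usage model. Nothing here depends on
# the supplier's processor or implementation language.
cruise_control_structure = {
    "System Off":                 ["System On and Coasting"],
    "System On and Coasting":     ["System On and Accelerating", "System Off"],
    "System On and Accelerating": ["System On and Coasting", "System Off"],
}

def undefined_targets(structure):
    """Return transitions whose target is not a defined state of use."""
    states = set(structure)
    return [(s, t) for s, targets in structure.items()
            for t in targets if t not in states]

# A well-formed structure has no dangling transitions.
assert undefined_targets(cruise_control_structure) == []
```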
First Principles
When a population is too large for exhaustive study, as is usually the case for all possible uses of a software system, a statistically correct sample must be drawn as a basis for inferences
about the population. Figure 1 shows the parallel between a classical statistical design and statistical software testing. Under a statistical protocol, the environment of use can be modeled, and statistically valid statements can be made about a number of matters, including the expected operational performance of the software based on its test performance.
Statistical Testing Process
Statistical testing can be initiated at any point in the life cycle of a system, and all of the work products developed along the way become valuable assets that may be used throughout the life of the system. The statistical testing process involves the six steps depicted in Figure 2.
Building Usage Models
An operational usage model is a formal statistical representation of all possible uses of a system. A usage model may be represented in the familiar form of a state transition graph, where the nodes represent states of system use and the arcs represent possible transitions between states (see Figure 3). If the graph has any loops or cycles (as is usually the case), then there is an infinite number of finite sequences through the model, thus an infinite population of usage scenarios. In such graphical form, usage models are easily understood by customers and users, who may participate in model development and validation. As a statistical formalism, a usage model lends itself to statistical analysis that yields quantitative information about system properties.
The basic task in model building (Walton, Poore, and Trammell, 1995) is to identify the states-of-use of the system and the possible transitions among states-of-use. Every possible scenario of use, at the chosen level of abstraction, must be represented by the model. Thus, every possible scenario of use is represented in the analysis, traceable on the model, and potentially generated from the model as a test case.
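Because every scenario of use is potentially generable from the model, test cases can be drawn by a probability-weighted random walk from an invocation state to a termination state. A minimal sketch, assuming the model is held as a dictionary mapping each state to its exit arcs; the states and probabilities here are illustrative, not the chapter's example model.

```python
import random

# Hypothetical usage model: each state maps to (next_state, probability)
# pairs. These states and probabilities are illustrative only.
USAGE_MODEL = {
    "Invocation":  [("Main Menu", 1.0)],
    "Main Menu":   [("Edit", 0.6), ("Print", 0.3), ("Termination", 0.1)],
    "Edit":        [("Main Menu", 0.8), ("Termination", 0.2)],
    "Print":       [("Main Menu", 1.0)],
    "Termination": [],
}

def generate_test_case(model, start="Invocation", end="Termination", rng=random):
    """Walk the Markov chain from start to end, returning one usage scenario."""
    sequence = [start]
    state = start
    while state != end:
        next_states, weights = zip(*model[state])
        state = rng.choices(next_states, weights=weights)[0]
        sequence.append(state)
    return sequence

test_case = generate_test_case(USAGE_MODEL)
print(" -> ".join(test_case))
```

Each generated sequence is one randomly drawn scenario of use, traceable state by state on the model.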
There are both informal and formal methods of discovering the states and transitions. Informal methods such as those associated with "use cases" in object-oriented methods may be used. A formal process has been developed (Prowell, 1996; Prowell and Poore, 1998) that drives the discovery of usage states and transitions. The process is based on the systematic enumeration of sequences of inputs and leads to a complete, consistent, correct and traceable usage specification.
Usage models are finite state, discrete parameter, time homogeneous, recurrent Markov chains. Inherent in this type of model is the property that the states have no memory; some transitions in an application naturally do not depend on history, whereas others must be made independent of history by state-splitting, making the states sufficiently detailed to reflect the relevant history. This leads to growth in the number of states, which must be managed. A usage model is developed in two phases—a structural phase and a statistical phase. The structural phase concerns possible use, and the statistical phase, expected use. The structure of a model is defined by a set of states and an associated set of directed arcs that define state transitions. When represented as a stochastic matrix, the 0 entries represent the absence of arcs (impossible transitions), the 1 entries represent certain transitions, and all other cells hold transition probabilities 0 < x < 1 (see Table 1). This is the structure of the usage model.
The statistical phase is the determination of the transition probabilities, i.e., the x's in the structure. There are two basic approaches to this phase, one based on direct assignment of probabilities and the other on deriving the values by analytical methods.
Models should be designed in a standard form consisting of connected sub-models, each with a single entry and a single exit. States and arcs can be expanded like macros. Sub-models of canonical form can be collapsed to states or arcs. This permits model validation, specification analysis, test planning, and test case generation to occur on various levels of abstraction. The structure of the usage models should be reviewed with the specification writers, real or prospective users, the developers, and the testers. Users and specification writers are essential to represent the application domain and the workflow of the application. Developers get an early opportunity to see how the system will be used, and look ahead to implementation strategies that take account of use and workflow. Testers, who are often the model builders, get an early
opportunity to define and automate the test environment.
Software Architecture
The architecture of the software intensive system is an important source of information in building usage models. If the model reflects the architecture of the system, then it will be easier to evolve the usage model as the system evolves. The architecture can be used to directly identify how models should be constructed and how testing should proceed.
A product line based on a common set of objects used through a graphical user interface might have a model for each object as well as a model for the user interface. Each object could be certified independently and the object interactions as permitted by the user interface would be certified with the interface. A new feature might be added later by developing a new object and a modification to the interface; this would require a new model for the new object and an updating of the model for the interface. Importance sampling might be used to emphasize testing of the changed aspects of the interface.
Protocols and other standards established by the architecture can also be factors in usage model development. For example, a usage model for the SCSI protocol has been developed and used in constructing models of several systems that use the SCSI protocol.
Assigning Transition Probabilities
Transition probabilities among states in a usage model come from historical or projected usage data for the application. Because transition probabilities represent classes of users, environments of use, or special usage situations, there may be several sets of probabilities for a single model structure. Moreover, as the system progresses through the life cycle the probability set may change several times, based on maturation of system use and availability of more information.

When extensive field data for similar or predecessor systems exists, a probability value may be known for every arc of the model (i.e., for every nonzero cell of the stochastic matrix of transition probabilities, as in column 4 of Table 2). For new systems, one might stipulate expected practice based on user interviews, user guides, and training programs. This is a reasonable starting point, but should be open to revision as new information becomes available.
When complete information about system usage is not available, it is advisable to take an analytical approach to generating the transition probabilities, as will be presented in section 5. In order to establish defensible plans, it is important that the model builder not overstate what is known about usage or guess at values.
In the absence of compelling information to the contrary, the mathematically neutral position is to assign uniform probabilities to transitions in the usage model. Table 2 column 3 represents a model based on Figure 3 with uniform transition probabilities across the exit arcs of each state.
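The uniform assignment can be produced mechanically from the structure alone: each exit arc of a state receives probability 1/(number of exit arcs). The sketch below applies this to a small fragment of the structure shown in Table 2; holding the structure as a dictionary is an illustrative assumption.

```python
# Fragment of the Table 2 structure: states and their exit arcs.
structure = {
    "Initialize System":  ["Reconfigure System", "Mode-1 Setup", "Mode-2 Setup"],
    "Mode-1 Setup":       ["Mode-1 Standard Data", "Mode-1 Special Data"],
    "Reconfigure System": ["Initialize System"],
}

def uniform_probabilities(structure):
    """Spread probability uniformly across each state's exit arcs."""
    return {state: {target: 1.0 / len(targets) for target in targets}
            for state, targets in structure.items()}

probs = uniform_probabilities(structure)
# Initialize System gets 1/3 on each of its three arcs, matching the
# uniform-probabilities column of Table 2 for that state.
```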
4. Model Validation with Customer and User
A usage model is a readily understandable representation of the system specification that may be reviewed with the customer and users. The following statistics are assured to be available by the mathematical structure of the models and are routinely calculated.
Long run probability. This is the long-run occupancy rate of each state, or the usage profile as a percentage of time spent in each state. These are additive, and sums over certain states might be easier to check for reasonableness than the individual values (for example, the model in Figure 3).
Probability of occurrence in a single sequence. This is the probability of occurrence of each state in a random use of the software.
Expected number of occurrences in a single sequence. This is the expected number of times each state will appear in a single random use or test case.
Expected number of transitions until the first occurrence. For each state, this is the expected number of randomly generated transitions (events of use) before the state will first occur, given that the sequence begins with Invocation. This will show the impracticality of visiting some states in random testing without partitioning and stratification.
Expected sequence length. This is the expected number of state transitions in a random use of the system and may be considered the average length of a use case or test case. (Using this value and transitions until first occurrence, one may estimate the number of test cases until first occurrence.)
These statistics should be reviewed for reasonableness in terms of what is known or believed about the application domain and the environment of use. Given the model, these statistics are derived without further assumptions, and if they do not correspond with reality then the model must be changed. These and other statistics describe the behavior that can be expected in the "long run," i.e., in ongoing field use of the software. It may be impractical for enough testing to be done for all aspects of the process to exhibit long run effects; exceptions can be addressed through special testing situations (as discussed below).
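These statistics follow from standard Markov chain computations on the transition matrix. As a sketch, the long run probabilities are the stationary distribution pi satisfying pi P = pi, and the reciprocal of the Invocation state's long run probability is its mean recurrence time, i.e., roughly the expected sequence length. The three-state chain below is hypothetical, and power iteration is used only for simplicity; production tools would solve the linear system directly.

```python
# Hypothetical recurrent usage chain: Invocation -> Work -> Termination,
# with Termination recurring to Invocation. Not the chapter's example.
P = [
    [0.0, 1.0, 0.0],   # Invocation always enters Work
    [0.0, 0.5, 0.5],   # Work repeats or moves to Termination
    [1.0, 0.0, 0.0],   # Termination recurs to Invocation
]

def stationary_distribution(P, iterations=5000):
    """Approximate pi with pi = pi P by repeated multiplication."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

pi = stationary_distribution(P)   # long-run occupancy of each state
mean_recurrence = 1.0 / pi[0]     # expected transitions per use of the system
```

For this chain pi works out to (0.25, 0.5, 0.25), so a random use averages four transitions.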
Operational Profiles
Operational profiles (Lyu, 1995) describe field use. Testing based on an operational profile ensures that the most frequently used features will be tested most thoroughly. When testing schedules and budgets are tightly constrained, profile-based testing yields the highest practical reliability; if failures are seen they would be the high frequency failures and consequent engineering changes would be those yielding the greatest increase in reliability. (Note that critical but infrequently used features must receive special attention.)
One approach to statistical testing is to estimate the operational profiles first and then create random test cases based on them. The usage model approach is to first build a model of system use (describe the stochastic process) based on many decisions as to states of use,
allowable transitions and the probability of those transitions, and then calculate the operational profile as the long run behavior of the stochastic process so described.
Operational Realism
A usage model can be designed to simulate any operational condition of interest, such as normal use, nonroutine use, hazardous use, or malicious use. Analytical results are studied during model validation, and surprises are not uncommon. Parts of systems thought to be unimportant might get surprisingly heavy use while parts that consume a large amount of the development budget might see little use. Since a usage model is based on the software specification rather than the code, the model can be constructed early in the life cycle to inform the development process as well as for testing and certification of the code.
Source Entropy
Entropy is defined for a probability distribution or stochastic source (Ash, 1965) as the quantification of uncertainty. The greater the entropy, the more uncertain the outcome or behavior. As new information is incorporated into the source, the behavior of the source generally becomes more predictable, and less uncertain. One interpretation of entropy is the minimum average number of "yes or no" questions required to determine the result of one outcome or observation of the random event or process (Ash, 1965).
Each state of a usage model has a probability distribution across its exit arcs to describe the transitions to other states, which appears as a row of the transition matrix. State entropy gives a measure of the uncertainty in the transition from that state.
Source entropy is by definition the probability-weighted average of the state entropies. Source entropy is an important reference value because the greater the source entropy, the greater the number of sequences (test cases) that it would be necessary to generate from the usage model,
on average, to obtain a sample that is representative of usage as defined by the model.
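State and source entropy are direct computations once the transition matrix and its long run probabilities are in hand. A sketch on a hypothetical three-state chain (Invocation, Work, Termination), where pi is that chain's stationary distribution:

```python
from math import log2

# Hypothetical chain, used only to illustrate the computation.
P  = [[0.0, 1.0, 0.0],
      [0.0, 0.5, 0.5],
      [1.0, 0.0, 0.0]]
pi = [0.25, 0.5, 0.25]   # stationary distribution of this chain

def state_entropy(row):
    """Uncertainty (bits) in the transition out of one state."""
    return -sum(p * log2(p) for p in row if p > 0.0)

# Source entropy: the pi-weighted average of the state entropies.
H = sum(p_i * state_entropy(row) for p_i, row in zip(pi, P))
print(H)   # 0.5 bits: only the Work state has an uncertain exit
```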
Specification Complexity
Some systems are untestable in any meaningful sense. Some systems have such a large number of significant paths of use and such high cost of testing per path, that there is not sufficient time and budget to perform an adequate amount of testing by any criteria, even with the leverage of statistical sampling. This situation can be recognized early enough through usage modeling to be substantially mitigated.
A usage model represents the capability of the system in an environment of use. All usage steps are probability weighted. Any model with a loop or cycle (other than the one from Termination to Invocation) has an infinite number of paths; however, only a finite number have a probability of occurring that is large enough to consider them. The complexity of a model can be viewed as the number of statistically typical paths (to be thought of as "paths worth considering"). Note that this concept of complexity has nothing to do with the technical challenge posed by the requirements, nor with the intricacies of the ultimate software implementation. It is simply a measure of how many ways the system might be used (how broadly the probability mass is spread over sequences) and, therefore, a measure of the size of the testing problem.
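The complexity figures reported with Tables 3 and 4 are consistent with the typical-set estimate from information theory: roughly 2^(H x L) statistically typical sequences, where H is the source entropy in bits and L the expected sequence length. Treating the index as H x L is our reading of those figures, not a formula stated explicitly in the text; under that assumption, the Table 4 values are reproduced as follows.

```python
# Typical-set estimate of the number of "paths worth considering".
# The formula index = H * L is an assumption consistent with the
# figures reported for Tables 3 and 4. Values below are Table 4's.
H = 0.9169                       # source entropy, bits per transition
L = 40.0                         # expected sequence length
index = H * L                    # specification complexity index
typical_sequences = 2.0 ** index
print(round(index, 2))           # 36.68, matching the Table 4 summary
```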
Complexity analysis can be used to assess the extent to which modification of the specification (and usage model) would reduce the size of the testing problem. By excluding states and arcs from the model, such what-if calculations can be made. For example, mode-less display systems that allow the user to switch from any task to any other task are far more expensive to test than modal displays that restrict tasks to categories. It is possible, also, to compare the differences in complexity associated with different environments of use (represented by different sets of transition probabilities, as in Tables 3 and 4). Complexity analysis can be used to assess the impact on testing of changes in the requirements and system implementation. Because the usage model is based on the specification, the model can be developed, validated,

Figure 3. Example usage model structure, directed graph format.

TABLE 1
Example of a Usage Model Structure, Transition Matrix Format

Rows are from-states and columns are to-states; 0 marks an impossible transition, 1 a certain transition, and x a transition probability with 0 < x < 1.

From\To    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
   1       0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   2       0  0  x  x  x  0  0  0  0  0  0  0  0  0  0  0  0
   3       0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   4       0  0  0  0  0  x  0  0  x  0  0  0  0  0  0  0  0
   5       0  0  0  0  0  0  0  0  0  x  x  0  0  0  0  0  0
   6       0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0
   7       0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0
   8       0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
   9       0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0
  10       0  0  0  0  0  0  0  0  0  0  x  0  0  0  x  0  0
  11       0  0  0  0  x  0  0  0  0  x  0  x  x  x  x  x  0
  12       0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0
  13       0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0
  14       0  0  0  0  0  0  0  0  0  0  x  0  0  0  0  x  0
  15       0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0
  16       0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1
  17       1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

TABLE 2
Example Usage Models, One Structure, Two Matrices of Transition Probabilities

From-State                     To-State                       Uniform Probabilities   Specific Environment
(1) Invocation                 (2) Initialize System          1                       1
(2) Initialize System          (3) Reconfigure System         1/3                     1/12
(2) Initialize System          (4) Mode-1 Setup               1/3                     8/12
(2) Initialize System          (5) Mode-2 Setup               1/3                     3/12
(3) Reconfigure System         (2) Initialize System          1                       1
(4) Mode-1 Setup               (6) Mode-1 Standard Data       1/2                     3/4
(4) Mode-1 Setup               (9) Mode-1 Special Data        1/2                     1/4
(5) Mode-2 Setup               (10) Mode-2 High Rate Data     1/2                     1/12
(5) Mode-2 Setup               (11) Mode-2 Service Decision   1/2                     11/12
(6) Mode-1 Standard Data       (7) Mode-1 Data Analysis       1                       1
(7) Mode-1 Data Analysis       (8) Mode-1 Checkpoint          1                       1
(8) Mode-1 Checkpoint          (2) Initialize System          1                       1
(9) Mode-1 Special Data        (7) Mode-1 Data Analysis       1                       1
(10) Mode-2 High Rate Data     (11) Mode-2 Service Decision   1/2                     3/4
(10) Mode-2 High Rate Data     (15) Mode-2 Transform Data     1/2                     1/4
(11) Mode-2 Service Decision   (5) Mode-2 Setup               1/7                     2/16
(11) Mode-2 Service Decision   (10) Mode-2 High Rate Data     1/7                     1/16
(11) Mode-2 Service Decision   (12) Mode-2 Standard Data      1/7                     4/16
(11) Mode-2 Service Decision   (13) Mode-2 Data Analysis      1/7                     4/16
(11) Mode-2 Service Decision   (14) Mode-2 Operator Call      1/7                     1/16
(11) Mode-2 Service Decision   (15) Mode-2 Transform Data     1/7                     3/16
(11) Mode-2 Service Decision   (16) Mode-2 Checkpoint         1/7                     1/16
(12) Mode-2 Standard Data      (11) Mode-2 Service Decision   1                       1
(13) Mode-2 Data Analysis      (11) Mode-2 Service Decision   1                       1
(14) Mode-2 Operator Call      (11) Mode-2 Service Decision   1/2                     1/2
(14) Mode-2 Operator Call      (16) Mode-2 Checkpoint         1/2                     1/2
(15) Mode-2 Transform Data     (11) Mode-2 Service Decision   1                       1
(16) Mode-2 Checkpoint         (17) Termination               1                       1
(17) Termination               (1) Invocation                 1                       1

TABLE 3
Usage Statistics for the Model with Uniform Probabilities on the Exit Arcs

State   Long Run      Prob. of Occurrence    Expected Occurrences    Expected Transitions   Expected Sequences
        Probability   in a Single Sequence   in a Single Sequence    Until Occurrence       Until Occurrence
  1     0.0449        1.0000                 1.00                    22.25                  1
  2     0.1348        1.0000                 3.00                    1.00                   1
  3     0.0449        0.5000                 1.00                    22.25                  1
  4     0.0449        0.5000                 1.00                    19.25                  1
  5     0.0749        1.0000                 1.67                    9.00                   1
  6     0.0225        0.3333                 0.50                    42.50                  2
  7     0.0449        0.5000                 1.00                    21.25                  1
  8     0.0449        0.5000                 1.00                    22.25                  1
  9     0.0225        0.3333                 0.50                    42.50                  2
 10     0.0674        0.7500                 1.50                    16.67                  1
 11     0.2097        1.0000                 4.67                    10.75                  1
 12     0.0300        0.4000                 0.67                    43.12                  2
 13     0.0300        0.4000                 0.67                    43.12                  2
 14     0.0300        0.5000                 0.67                    36.75                  2
 15     0.0637        0.6538                 1.42                    21.53                  1
 16     0.0449        1.0000                 1.00                    20.25                  1
 17     0.0449        1.0000                 1.00                    21.25                  1

Number of arcs is 28.
Expected sequence length is approximately 23 states.
Expected number of sequences to cover the least likely state is approximately 3.
Expected number of sequences to cover the least likely arc is approximately 7.
The log base 2 source entropy is approximately 1.0197.
The specification complexity index is approximately 22.71 (or 6,861,000 sequences).

TABLE 4
Usage Statistics of Model for Specific Environment

State   Long Run      Prob. of Occurrence    Expected Occurrences    Expected Transitions   Expected Sequences
        Probability   in a Single Sequence   in a Single Sequence    Until Occurrence       Until Occurrence
  1     0.0250        1.0000                 1.00                    40.08                  1
  2     0.0998        1.0000                 4.00                    1.00                   1
  3     0.0083        0.2499                 0.33                    120.28                 3
  4     0.0665        0.7273                 2.67                    12.03                  1
  5     0.0582        1.0000                 2.33                    16.00                  1
  6     0.0499        0.6667                 2.00                    18.04                  1
  7     0.0665        0.7273                 2.67                    14.03                  1
  8     0.0665        0.7273                 2.67                    15.03                  1
  9     0.0166        0.4000                 0.67                    58.11                  2
 10     0.0215        0.4843                 0.86                    58.52                  2
 11     0.2662        1.0000                 10.67                   17.10                  1
 12     0.0665        0.7273                 2.67                    31.13                  1
 13     0.0665        0.7273                 2.67                    31.13                  1
 14     0.0166        0.5000                 0.67                    66.67                  2
 15     0.0553        0.6935                 2.22                    33.82                  1
 16     0.0250        1.0000                 1.00                    38.08                  1
 17     0.0250        1.0000                 1.00                    39.08                  1

Number of arcs is 28.
Expected sequence length is approximately 40 states.
Expected number of sequences to cover the least likely state is approximately 4.
Expected number of sequences to cover the least likely arc is approximately 16.
The log base 2 source entropy is approximately 0.9169.
The specification complexity index is approximately 36.68 (or 1.1 × 10^11 sequences).

References

Alam, M.S., et al. 1997. Assessing software reliability performance under highly critical but infrequent event occurrences. In review.

Ash, R. 1965. Information Theory. New York: Wiley.

Basili, V., et al. 1996. The empirical investigation of perspective-based reading. Empirical Software Engineering 1:133-164.

Cohen, D.M. 1997. The AETG system: An approach to testing based on combinatorial design. IEEE Transactions on Software Engineering 23(7):437-444.

Dalal, S., and C. Mallows. 1988. When should one stop testing software? Journal of the American Statistical Association 83(403).

Ehrlich, W., et al. 1997. Application of accelerated testing methods for software reliability assessment. In review.

Ekroot, L., and T.M. Cover. 1993. The entropy of Markov trajectories. IEEE Transactions on Information Theory 39(4):1418-1421.

Gibbons, A.M. 1985. Algorithmic Graph Theory. Cambridge, U.K.: Cambridge University Press.

Gill, P.E., W. Murray, and M.H. Wright. 1981. Practical Optimization. New York: Academic Press.

Gutjahr, W.J. 1997. Importance sampling of test cases in Markovian software usage models. Probability in the Engineering and Informational Sciences 11:19-36.

Kemeny, J.G., and J.L. Snell. 1960. Finite Markov Chains. Princeton, N.J.: D. Van Nostrand.

Kullback, S. 1958. Information Theory and Statistics. New York: John Wiley and Sons.

Lyu, M.R. 1995. Handbook of Software Reliability Engineering. New York: McGraw-Hill.

Miller, K., et al. 1992. Estimating the probability of failure when testing reveals no failures. IEEE Transactions on Software Engineering, January.

Nair, V.N., D. James, W. Ehrlich, and J. Zevallos. 1998. A statistical assessment of some software testing strategies and applications of experimental design techniques. Statistica Sinica 8:165-184.

Oshana, R. 1997. Software testing with statistical usage based models. Embedded Systems Programming 10(1):40-55.

Parnas, D. 1990. An evaluation of safety critical software. Communications of the ACM 33(6):636-648.

Poore, J.H., and C.J. Trammell. 1996. Cleanroom Software Engineering: A Reader. Oxford, United Kingdom: Blackwell Publishers.

Prowell, S.J. 1996. Sequence-Based Software Specification. Ph.D. dissertation, University of Tennessee, Knoxville, Tenn.

Prowell, S.J., and J.H. Poore. 1998. Sequence-based specification of deterministic systems. Software—Practice & Experience 28(3):329-344.

Sherer, S.A. 1996. Statistical software testing using economic exposure assessments. Software Engineering Journal, September:293-297.

Walton, G.H., J.H. Poore, and C.J. Trammell. 1995. Statistical testing of software based on a usage model. Software—Practice & Experience 25(1):97-108.

Walton, G.H. 1995. Generating Transition Probabilities for Markov Chain Usage Models. Ph.D. dissertation, University of Tennessee, Knoxville, Tenn.

Whittaker, J.A., and J.H. Poore. 1993. Markov analysis of software specifications. ACM Transactions on Software Engineering and Methodology 2(1):93-106.

Whittaker, J.A., and M.G. Thomason. 1994. A Markov chain model for statistical software testing. IEEE Transactions on Software Engineering 20(10):812-824.