
The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.


CHAPTER 3

Statistical Concepts

An understanding of the underlying concepts of sampling and statistical accuracy is fundamental to an understanding of such issues as the size of the sample to be used and the accuracy of the resulting findings. This chapter is primarily intended for readers who are not familiar with these concepts or those who are interested in a review of the basic statistical principles.

Overview of Basic Concepts

Distribution

In any set of data, each item in the dataset has a particular value. The distribution of the data in the dataset refers to the proportion of the items that take each of the possible values in the dataset. With discrete data (e.g., the number of people in a travel party), each possible value (1, 2, 3, etc.) will occur for some proportion of the total number of items in the dataset. For continuous data (e.g., the time taken to drive to the airport), there is effectively an unlimited (or at least very large) number of possible values. Therefore, the distribution is defined in terms of a functional relationship, typically plotted as a graph or expressed as an equation. The relationship can be used to determine the proportion of values within a given range.

Average

The average value of a set of data (also referred to as the mean of the distribution) is defined as the sum of the values of each item in the dataset divided by the number of items. This corresponds to the common usage of the term "average." It is usual to refer to the average of a set of data and the mean of a distribution, although the concepts are identical.

Variance

The variance of a dataset or a distribution measures the spread of the values about the average or mean value. It can be thought of as the average of the squared difference between each value and the mean of the distribution. The differences are squared so that larger differences have greater importance than smaller differences and so that negative and positive differences do not offset each other.
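
As an illustration only, the three definitions above can be sketched in a few lines of Python. The travel-party sizes below are made-up values, and the function names are hypothetical rather than taken from the guidebook. The variance here follows the wording above (the average of the squared differences, dividing by the number of items); other conventions exist and would give a slightly different result.

```python
from collections import Counter

# Hypothetical travel-party sizes (discrete data); illustrative values only.
party_sizes = [1, 2, 2, 1, 3, 2, 4, 1]


def distribution(values):
    """Proportion of items in the dataset that take each possible value."""
    counts = Counter(values)
    n = len(values)
    return {value: count / n for value, count in sorted(counts.items())}


def average(values):
    """Sum of the values divided by the number of items."""
    return sum(values) / len(values)


def variance(values):
    """Average of the squared differences between each value and the mean."""
    m = average(values)
    return sum((v - m) ** 2 for v in values) / len(values)


proportions = distribution(party_sizes)
mean_size = average(party_sizes)
spread = variance(party_sizes)

print(proportions)
print(mean_size, spread)
```

For this made-up sample, parties of one and two people each account for 37.5 percent of the items, the average party size is 2.0, and the variance is 1.0.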

Standard Deviation

The standard deviation of a dataset or distribution is defined as the square root of the variance. This expresses the spread of the values around the average or mean of the dataset or distribution in the same units as the data.

More details can be found in textbooks on general statistics, such as those listed at the end of this guidebook.
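
A minimal sketch of the point about units, assuming a small made-up set of drive times in minutes (the data and the function names are hypothetical): the variance comes out in squared minutes, while the standard deviation is back in minutes.

```python
import math

# Hypothetical drive times to the airport, in minutes; illustrative values only.
drive_times_min = [35, 45, 35, 45]


def variance(values):
    """Average of the squared differences from the mean (divides by n)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def standard_deviation(values):
    """Square root of the variance, in the same units as the data."""
    return math.sqrt(variance(values))


var_min_sq = variance(drive_times_min)       # units: minutes squared
sd_min = standard_deviation(drive_times_min)  # units: minutes

print(var_min_sq, sd_min)
```

Here the mean is 40 minutes and every value lies 5 minutes from it, so the variance is 25 squared minutes and the standard deviation is 5 minutes.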