CHAPTER 3
Statistical Concepts
An understanding of the underlying concepts of sampling and statistical accuracy is fundamental
to understanding such issues as the size of the sample to be used and the accuracy
of the resulting findings. This chapter is primarily intended for readers who are not familiar with
these concepts or who are interested in a review of the basic statistical principles.
Overview of Basic Concepts
Distribution In any set of data, each item in the dataset has a particular value. The distribution of the
data in the dataset refers to the proportion of the items that take each of the possible
values in the dataset. With discrete data (e.g., the number of people in a travel party), each
possible value (1, 2, 3, etc.) will occur for some proportion of the total number of items in
the dataset. For continuous data (e.g., the time taken to drive to the airport), there are
effectively an unlimited (or at least very large) number of possible values. Therefore, the
distribution is defined in terms of a functional relationship, typically plotted as a graph or
expressed as an equation. The relationship can be used to determine the proportion of
values within a given range.
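As a minimal sketch of these ideas, the Python snippet below tabulates the proportion of items taking each value in a discrete dataset and computes the proportion of values falling within a range for continuous data. The party sizes and drive times are invented for illustration, and the function names `empirical_distribution` and `proportion_in_range` are hypothetical rather than taken from this guidebook. For the continuous case the sketch simply counts observed values instead of fitting a functional relationship.

```python
from collections import Counter


def empirical_distribution(values):
    """Return a dict mapping each distinct value to its share of the items."""
    counts = Counter(values)
    total = len(values)
    return {value: counts[value] / total for value in sorted(counts)}


def proportion_in_range(values, low, high):
    """Return the share of items whose value lies in the interval [low, high]."""
    inside = sum(1 for v in values if low <= v <= high)
    return inside / len(values)


# Hypothetical data, invented for illustration only.
party_sizes = [1, 2, 2, 1, 3, 2, 1, 4, 2, 1]  # discrete values
drive_minutes = [22.5, 35.0, 41.2, 18.7, 55.3, 30.1, 47.8, 26.4]  # continuous values

party_distribution = empirical_distribution(party_sizes)
share_30_to_45 = proportion_in_range(drive_minutes, 30.0, 45.0)

print(party_distribution)  # proportion of parties of each size
print(share_30_to_45)      # proportion of drive times between 30 and 45 minutes
```

The proportions in `party_distribution` sum to 1, since every item takes exactly one of the possible values.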
Average The average value of a set of data (also referred to as the mean of the distribution) is
defined as the sum of the values of each item in the dataset divided by the number of
items. This corresponds to the common usage of the term "average." By convention, one
speaks of the average of a set of data and the mean of a distribution, although the concepts
are identical.
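The definition of the average can be sketched in a few lines of Python. The data values are invented for illustration, and the function name `average` is a hypothetical choice; the standard library's `statistics.mean` is used only as a cross-check.

```python
import statistics


def average(values):
    """Sum of the values divided by the number of items."""
    if not values:
        raise ValueError("average of an empty dataset is undefined")
    return sum(values) / len(values)


# Hypothetical travel-party sizes, invented for illustration only.
party_sizes = [1, 2, 2, 1, 3, 2, 1, 4, 2, 1]

avg = average(party_sizes)                 # 19 items-worth of people over 10 parties
library_avg = statistics.mean(party_sizes)  # cross-check against the standard library

print(avg)
```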
Variance The variance of a dataset or a distribution measures the spread of the values about the average
or mean value. It can be thought of as the average of the squared differences between each
value and the mean of the distribution. The differences are squared so that larger differences
have greater importance than smaller differences and negative and positive differences do not
offset each other.
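A minimal Python sketch of the variance as described here, that is, the average of the squared differences from the mean. The data values and the function name `population_variance` are illustrative assumptions. Dividing by the number of items follows the definition in the text; dividing by one less than the number of items (the sample variance) is a common alternative that this passage does not discuss, shown here only for comparison.

```python
import statistics


def population_variance(values):
    """Average of the squared differences between each value and the mean."""
    mean = sum(values) / len(values)
    squared_differences = [(v - mean) ** 2 for v in values]
    return sum(squared_differences) / len(values)


# Hypothetical data, invented for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

variance = population_variance(values)              # divides by n
library_variance = statistics.pvariance(values)     # same definition, standard library
sample_variance = statistics.variance(values)       # divides by n - 1 (assumed alternative)

print(variance)
print(sample_variance)
```

Because each difference is squared, the value 9 (four units from the mean of 5) contributes 16 to the sum, while the values 4 (one unit away) contribute only 1 each.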
Standard Deviation The standard deviation of a dataset or distribution is defined as the square root of the
variance. This expresses the spread of the values around the average or mean of the dataset or
distribution in the same units as the data.
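As a brief sketch, the standard deviation can be computed in Python by taking the square root of the variance. The data values (imagined here as minutes) and the function name `population_std_dev` are illustrative assumptions; the variance is computed by dividing by the number of items, and `statistics.pstdev` serves only as a cross-check.

```python
import math
import statistics


def population_std_dev(values):
    """Square root of the average squared difference from the mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance)


# Hypothetical data in minutes, invented for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

std_dev = population_std_dev(values)          # in minutes, like the data
library_std_dev = statistics.pstdev(values)   # cross-check against the standard library

print(std_dev)
```

If the data are in minutes, the variance is in squared minutes while the standard deviation is again in minutes, which is why it is usually the easier of the two to interpret.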
More details can be found in textbooks on general statistics, such as those listed at the end of this guidebook.