Guidebook for Conducting Airport User Surveys
3.1 Concepts of Census and Sample Surveys
In general, a survey will collect information from a sample of individuals from the target
population (see Section 1.3 for a discussion of survey terminology). In some cases, it may be
appropriate to survey the entire population, in which case the survey is termed a census survey.
A census survey is generally appropriate for collecting information on small populations
when a very high level of accuracy is required and when there are no significant constraints due
to budget, survey resources, or the time period when individuals are available to be surveyed. A
census survey might be appropriate, for example, for a survey of tenants at the airport, but not
for a survey of air passengers.
For a sample survey, a sample of respondents is selected from the target population in such a
way that the characteristics of the population can be inferred from the corresponding
characteristics of the sample. The way this is done and the implications for the accuracy of the
resulting estimates of the characteristics of the population are discussed in more detail below.
3.2 Statistical Accuracy and Confidence Intervals
The characteristics of interest of the population being surveyed, such as the mode of travel to
the airport, will vary across the members of the population. Aggregate measures of the
population, such as the proportion of air passengers accessing the airport by taxi, can be
estimated from the corresponding values for the sample.
However, when drawing a sample from a population, the distribution of the characteristics of
interest across the members of the sample will generally be different from the corresponding
distribution across the population, and thus measures of this distribution, such as the average
value, will also be different. This difference between the sample average and the population mean
is referred to as the error of the estimate.
With very small samples relative to the size of the population, it is unlikely that the
distribution of the characteristics across the sample will correspond exactly to the distribution
across the population as a whole, since the opportunity for the sample to include the full range
of values that exist in the population is limited by the small sample size. As the size of the
sample increases, it becomes more likely that the distribution of any given characteristic will
correspond to that of the population.
The degree to which the distribution of a given characteristic in a sample of a given size
corresponds to the distribution of the characteristic in the population as a whole depends on how
variable the characteristic is in the population. In statistical terminology, this variability is termed
the variance of the characteristic. In the extreme case in which every member of the population
has the same value for a given characteristic, a sample of only one respondent would provide a
completely accurate estimate of that value. At the other extreme, if every member of the
population has a different value for a given characteristic, a sample of the entire population
(a 100% sample) would be required in order to include every possible value of the characteristic
occurring in the population.
If a sample of a given size is drawn randomly from a population multiple times, a slightly
different distribution would be expected of any given characteristic in each sample, except in
the special case where every member of the population had the same value of the characteristic.
The greater the variance of the characteristic in the population, the more variation there would
be in the distribution of the characteristic across the different samples. Therefore, the average
value of a given characteristic in a sample of a given size, although it is a specific value for any
particular sample, will vary across the different samples. This results in the following
fundamental point:
The average value of any given characteristic in a sample drawn randomly from a given population has
an expected variance that depends on the variance of the characteristic in the population, as well as the
size of the sample relative to the size of the population.
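The point above can be checked numerically. The following is a minimal sketch, not part of the guidebook, that assumes simple random sampling without replacement and uses a hypothetical population in which each passenger is coded 1 (taxi) or 0 (other mode). The function names and the population are illustrative assumptions. It compares the variance of the sample average observed over many repeated samples with the standard textbook expression for sampling without replacement, (sigma^2 / n) x (N - n) / (N - 1).

```python
import random
import statistics


def empirical_variance_of_sample_mean(population, sample_size, draws=2000, seed=0):
    """Variance of the sample average observed over repeated random samples.

    Each sample is drawn without replacement (simple random sampling).
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(draws)
    ]
    return statistics.pvariance(means)


def theoretical_variance_of_sample_mean(population, sample_size):
    """Textbook variance of the sample average for sampling without replacement.

    Uses (sigma^2 / n) * (N - n) / (N - 1), where sigma^2 is the population
    variance, n the sample size, and N the population size.
    """
    population_size = len(population)
    sigma_squared = statistics.pvariance(population)
    correction = (population_size - sample_size) / (population_size - 1)
    return (sigma_squared / sample_size) * correction


if __name__ == "__main__":
    # Hypothetical population: 1,000 passengers, 200 of whom use a taxi.
    passengers = [1] * 200 + [0] * 800
    for n in (25, 50, 200, 1000):
        print(n, theoretical_variance_of_sample_mean(passengers, n))
```

Under these assumptions the variance of the sample average shrinks as the sample grows and falls to zero for a 100% sample, and it is zero for any sample size when every member of the population has the same value.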
Because of this variance, there will be an expected error between the average value of a
particular characteristic given by a single sample and the true average (or mean) value of the
characteristic in the population. Because the value of the population mean is not known (the
survey is being performed to estimate this), the actual error is not known. However, the expected
distribution of the error can be determined from statistical principles and an estimate made of
the likely range of the error. [2]
The standard deviation of the estimated average value of any particular characteristic
determined from a sample is termed the standard error of the estimate (SEE) and is a measure of
the accuracy of an estimate.
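As an illustration, the standard error can be estimated from the sample itself. The sketch below is an assumption-laden example rather than the guidebook's procedure: it assumes a simple random sample from a population large enough that no finite population correction is needed, and the function names are hypothetical.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Estimated standard error of a sample average.

    Computed as the sample standard deviation divided by the square root
    of the sample size.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    return statistics.stdev(values) / math.sqrt(n)


def standard_error_of_proportion(count_in_subgroup, sample_size):
    """Estimated standard error of a sample proportion, sqrt(p * (1 - p) / n)."""
    if sample_size <= 0:
        raise ValueError("sample size must be positive")
    p = count_in_subgroup / sample_size
    return math.sqrt(p * (1 - p) / sample_size)
```

For example, if 50 of 100 respondents reported using a taxi, `standard_error_of_proportion(50, 100)` gives a standard error of about 0.05, that is, 5 percentage points.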
For large enough samples, the error will approximate a Normal distribution with an expected
(mean) error of zero, as illustrated in Figure 3-1. [3] This figure shows the probability of a sample
giving an error of any particular size, measured in terms of the number of standard deviations
from the mean value, with the area under the curve between any two values representing the
probability of the actual value lying in that range of values. As the range gets larger, measured in
terms of the number of standard deviations from the mean, the probability of the actual value
lying in the range approaches 100%. The greater the variance of the sample estimate of the
mean (i.e., the larger the standard deviation of the sample estimate), the fewer standard
deviations from the mean an error of any particular size will be.
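The area-under-the-curve reasoning can be sketched with the standard Normal distribution in Python's standard library. This is an illustrative example that assumes the error is exactly Normal with mean zero. The function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def prob_within_sd(k):
    """Probability that the error lies within plus or minus k standard deviations."""
    return STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k)


def prob_abs_error_within(max_error, standard_error):
    """Probability that the absolute error is no greater than max_error.

    The error is first expressed as a number of standard deviations, so a
    larger standard error puts the same absolute error fewer standard
    deviations from the mean and lowers the probability.
    """
    return prob_within_sd(max_error / standard_error)
```

Under this assumption, about 68% of the probability lies within one standard deviation and about 95% within 1.96 standard deviations. Doubling the standard error while holding the absolute error fixed reduces the probability returned by `prob_abs_error_within`.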
Figure 3-1 also illustrates an important related aspect of sample error. As the standard
deviation of the sample estimate of the mean increases, so the range of values covered by any given
number of standard deviations also increases. Because an error of any particular absolute value
(in the units of the variable) will be fewer standard deviations from the mean, this reduces the
probability of getting an error no greater than that value, and hence increases the corresponding
probability of getting an error greater than that particular value. Thus as the variance (and hence
the standard deviation) of the sample estimate increases, so the probability of getting an error
greater than any particular value also increases.
[Figure 3-1. Example of the probability distribution of the expected error. Vertical axis:
probability density. Horizontal axis: error (standard deviations), from -3 to 3. The curve carries
a 95% label.]
Note: Probability of error being between two values is given by the area under
the probability density curve between those values.
[2] The theory underlying this statement can be found in any statistics textbook.
[3] Under the Central Limit Theorem, the probability distribution of the sample average will
approach the Normal distribution as the sample size approaches infinity.
This leads to the second fundamental aspect of sampling accuracy:
Although the actual error of a sample estimate of the mean value of any characteristic of the
population is unknown, the probability of this error being less than any given value can be
estimated.
This second aspect has the important implication that any estimate of the expected error has two
attributes: the magnitude of the error being considered and the probability that the actual error
is less than this value (referred to as the confidence level). Because the error distribution is
symmetrical, as shown in Figure 3-1, it is common to express the expected error range, sometimes
termed the margin of error, as plus or minus a specified amount (for a continuous variable) or
number of percentage points [4] (for a categorical variable, where the value is one of a defined
list of values). For example, the results of an opinion poll might be reported as being accurate to
within plus or minus 3% with 95% confidence. In this case, the probability of the estimate being
within a margin of error of plus or minus 3 percentage points is 0.95, or 95%, and the results
could also be described as having a 95% confidence interval of plus or minus 3%, where the term
confidence interval refers to the margin of error for a specified confidence level.
Expressing the Accuracy of Variables Expressed as a Percentage
For categorical variables, results are often expressed as a percentage of the sample, for example,
the percentage of air passengers who use transit to access the airport. In such cases, expressing
the accuracy of the estimate of the proportion of the sample in a given subgroup as a percentage
can have two different meanings:
· A percentage of the sample size (e.g., ±5% of the sample), or
· A percentage of the subgroup mean (e.g., ±5% of the proportion in the subgroup).
The first meaning is often referred to as "percentage points" to distinguish it from the latter.
The two are very different. For example, if it is estimated that 10% of passengers take transit,
then an accuracy of ±5 percentage points at the 95% confidence level corresponds to the interval
from 5% to 15%. This range corresponds to plus or minus 50% of the proportion of transit
passengers in the survey (i.e., 5/10 = 50%). However, an accuracy of ±5% of the estimated
proportion of passengers taking transit corresponds to the interval from 9.5% to 10.5%,
equivalent to an accuracy of ±0.5 percentage points.
To avoid confusion, care must be taken when expressing the accuracy of variables expressed as a
percentage. When interpreting such values, care must be taken to be clear whether the accuracy is
a percentage of the entire sample or of the subgroup in question. It may be helpful to make a
distinction between percentage and percentage points in discussing accuracy. Failure to be clear
in this distinction when reporting survey results can result in a situation where the reader
cannot determine which way to interpret the stated accuracy.
As shown in Figure 3-1, as the confidence level increases (i.e., there is a greater probability
that the actual error lies within the interval being considered), the size of the associated error
range also increases. For a given confidence level, the size of the corresponding error range
depends only on the variance (and hence the standard deviation) of the expected error. As
illustrated by Figure 3-1, a given confidence interval spans a fixed number of standard deviations
either side of the mean. For a 95% confidence interval, this range is plus or minus about 2
(strictly 1.96) standard deviations. For a 90% confidence interval, this range is plus or minus
about 1.65 (strictly 1.645) standard deviations. Thus if the variance of the estimated mean of
some characteristic in a given sample is 0.0004 (i.e., the standard deviation is 0.02 or 2%), for a
confidence interval of 95%, the margin of error would be plus or minus 4% (2 times the standard
deviation). For a confidence interval of only 90%, the margin of error would be plus or minus 3.3%
(1.65 times the standard deviation).
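The arithmetic in the preceding example can be reproduced with a short sketch. It assumes a Normally distributed error and uses the inverse of the standard Normal cumulative distribution to obtain the multiplier for a given confidence level. The function names are illustrative and do not come from the guidebook.

```python
from statistics import NormalDist


def z_multiplier(confidence_level):
    """Number of standard deviations either side of the mean for a two-sided interval.

    For a confidence level of 0.95 this is about 1.96, and for 0.90 about 1.645.
    """
    if not 0 < confidence_level < 1:
        raise ValueError("confidence level must be between 0 and 1")
    return NormalDist().inv_cdf(0.5 + confidence_level / 2)


def margin_of_error(standard_error, confidence_level=0.95):
    """Margin of error as the multiplier times the standard error of the estimate."""
    return z_multiplier(confidence_level) * standard_error
```

With a standard deviation of 0.02, `margin_of_error(0.02, 0.95)` is about 0.039 and `margin_of_error(0.02, 0.90)` is about 0.033, which round to the 4% and 3.3% quoted in the text when the multipliers are taken as 2 and 1.65.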
[4] The term "percentage points" is used to refer to the absolute change in a variable that is
expressed as a percentage. For example, a range of 50% ± 5 percentage points is equivalent to the
range 45% to 55%.
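The difference between percentage points and a percentage of the estimate can be made concrete with a small sketch. The two helper functions below are hypothetical names introduced only for illustration. Both take the estimate in percent.

```python
def interval_in_percentage_points(estimate_pct, points):
    """Interval for an accuracy stated in percentage points (absolute change)."""
    return (estimate_pct - points, estimate_pct + points)


def interval_in_relative_percent(estimate_pct, relative_pct):
    """Interval for an accuracy stated as a percentage of the estimate itself."""
    half_width = estimate_pct * relative_pct / 100
    return (estimate_pct - half_width, estimate_pct + half_width)
```

For an estimate of 10%, an accuracy of ±5 percentage points gives the interval 5% to 15%, whereas an accuracy of ±5% of the estimate gives 9.5% to 10.5%.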