36 Guidebook for Conducting Airport User Surveys
With non-probability sampling, it is not possible to calculate the sample size required to
achieve a given level of accuracy or to make generalizations about the population. These types of
surveys are usually of limited value in ascertaining properties of the population but can be
useful for obtaining ideas and user feedback.
3.4 Sample Size
A critical issue in planning any survey is determining the appropriate sample size, which is
influenced by such considerations as the following:
· Survey purpose.
· Analysis of subgroups of interest.
· Required precision of the survey results.
· Credibility of results among decision makers and data users.
· Available resources (including budget, personnel, and equipment).
The survey purpose influences the required sample size in three ways: (1) by determining
the key characteristics of the air travel party and the precision to which they need to be known,
(2) by establishing the level of disaggregation to which the results need to be expressed, and
(3) by identifying the value to be gained from improved precision. For any desired degree of
precision in the survey results, the need to consider subgroups of interest (such as air passengers
with ground origins in a particular area or visitors on business trips) will increase the required
sample size of the overall survey in order to ensure a large enough number of respondents in the
subgroup(s) of interest.
The required precision and the credibility of results determine the width of the confidence interval
and the acceptable margin of error. Larger samples are required to reduce the margin of error
and/or to increase the confidence level for a given margin of error.
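The trade-off can be illustrated with a minimal sketch. It assumes the standard large-population Normal approximation n = z^2 p(1 - p) / e^2, where e is the margin of error expressed as a proportion; the function name and the example values are illustrative assumptions, not taken from the guidebook.

```python
from statistics import NormalDist

def approx_sample_size(margin, confidence=0.95, p=0.5):
    """Large-population sample size for a proportion (Normal approximation).

    margin     -- half-width of the confidence interval, as a proportion (0.05 = 5 points)
    confidence -- two-sided confidence level, e.g. 0.95
    p          -- assumed proportion in the category of interest
    """
    # Two-sided z-value, e.g. about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z ** 2 * p * (1 - p) / margin ** 2

for conf in (0.90, 0.95, 0.99):
    for margin in (0.05, 0.03):
        n = approx_sample_size(margin, conf)
        print(f"confidence {conf:.0%}, margin ±{margin:.0%}: n ≈ {n:.0f}")
```

Tightening the margin or raising the confidence level both increase n, which is the behavior described above.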
Although the target sample size for a survey should ideally be determined by the purpose and
objectives of the survey and the uses to which the results will be put, in reality the financial
resources available to fund the survey often constrain the sample size, particularly where
budgets have been established before the detailed planning of the survey has begun. Time constraints
can also influence the sample size if information is required on short notice.
Other factors affecting the required sample size include the following:
· The proportion of the population with the attributes being measured. An airport seeking
passengers' opinions of the retail concessions, for example, must design a survey that takes into
account the fact that only a small proportion of passengers will have actually visited the retail
concessions. The passengers visiting the retail concessions are a subgroup of all passengers,
and so the required sample size is found in a similar way to that described previously for a
subgroup.
· The variability of attributes being measured. If the variability is high, a larger sample size will
be required.
· The sample design used. For example, a good stratified sample can permit a smaller sample
size than a random sample for a given level of accuracy, while cluster sampling will usually
necessitate a larger sample size (as discussed in Sections 3.3.3 and 3.3.4).
In the following discussion, sample size refers to the number of completed responses; the
number of people approached to participate in the survey may be significantly higher
depending on the rates of refusal and incomplete responses. These refusals and incomplete responses
will generally take some time to survey and process, which could be significant in some surveys
Statistical Concepts 37
(e.g., where follow-up phone calls are made) and should be allowed for in determining resource
requirements. To estimate the total number of individuals to approach, divide the desired
sample size of completed surveys by an estimate of the completed survey response rate, expressed as
a proportion. For example, if a sample size of 1,000 is required and the response rate is 70%, then
1,429 [= 1,000/0.7] individuals would need to be approached.
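The calculation above can be sketched as follows. The function name is a hypothetical choice, and rounding up to a whole person is an assumption consistent with the 1,429 figure in the text.

```python
import math

def individuals_to_approach(completed_target, response_rate):
    """Number of people to approach to obtain `completed_target` completed surveys.

    response_rate is the expected completed-survey response rate as a proportion in (0, 1].
    """
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be a proportion in (0, 1]")
    # Round to 9 decimals first so floating-point noise (e.g. 700 / 0.7) does not
    # push an exact answer up by one, then round up to a whole person.
    return math.ceil(round(completed_target / response_rate, 9))

print(individuals_to_approach(1000, 0.7))  # 1429, matching the example in the text
```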
3.4.1 Sample Size with Random Sampling
Calculation of the sample size required to obtain a specified accuracy differs depending on
whether the required accuracy is for a question with categorical or numerical responses (question
types are discussed in Section 4.3.2). [8] With a categorical response, the respondent must
choose from a limited number of defined responses. For example, for a question on mode of
travel to the airport, categories could be private vehicle, rented vehicle, taxi/limousine, train, bus,
airplane, or walk/bicycle. Determination of the sample size for each type of question using
random sampling is considered in the following paragraphs.
Categorical Response Questions
When using categorical response questions and random sampling, the sample size required
to give a specified level of accuracy is a function of the population size and the proportion of
the population in the category of interest (e.g., proportion using a private vehicle as their
mode of travel to the airport). This proportion is unknown and should be estimated in the survey
planning stage from experience, previous surveys, or values from other airports. The largest
sample size required occurs when half of the population has the characteristic of interest.
Table 3-2 provides approximate 95% confidence intervals for a range of population and sample
sizes and two values of the proportion of the population in the category of interest (50%
and 20%). [9] The largest sample size is required for a proportion of 50%. Thus, for surveys with
many questions with a range of mean proportions, it is appropriate to use the sample size based
on the 50% proportion as this will provide at least the required accuracy for all cases. Table 3-3
gives the required sample size using random sampling, based on the 50% proportion, for various
confidence intervals and a range of population sizes. Alternatively, the sample size for
an accuracy of ±a percentage points [10] can be calculated for a 95% confidence level using the
following expression: [11]
$$ n = \frac{1.96^2 \, p (1 - p)}{(a/100)^2 + 1.96^2 \, p (1 - p) / N} $$
where n is the sample size,
N is the population size,
a is the half-width of the confidence interval in percentage points (i.e., the accuracy is ±a), and
p is the estimated proportion of the population in the category of interest.
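A minimal sketch of this expression follows. The function and parameter names are illustrative assumptions; the result is returned unrounded so that the rounding convention is left to the user.

```python
def sample_size_proportion(a_pts, p=0.5, population=None, z=1.96):
    """Required completed responses for an accuracy of ±a_pts percentage points.

    a_pts      -- half-width of the confidence interval, in percentage points
    p          -- estimated proportion in the category of interest (0.5 is the worst case)
    population -- population size N; None means "treat as very large"
    z          -- standard Normal value for the confidence level (1.96 for 95%)
    """
    numerator = z ** 2 * p * (1 - p)
    denominator = (a_pts / 100) ** 2
    if population is not None:
        denominator += numerator / population  # finite-population term
    return numerator / denominator

for N in (100, 1_000, 500_000):
    print(N, round(sample_size_proportion(5, population=N)))
```

For an accuracy of ±5 percentage points and p = 50%, this prints 79, 278, and 384, which agree with the corresponding entries in Table 3-3.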
[8] With a categorical response, the variance can be expressed in terms of the proportion of the population in the category of
interest. Thus an initial estimate of this proportion, rather than the variance, is required.
[9] For proportions (p) greater than 0.5, the required sample size is the same as for the proportion 1 - p. For example, for a
proportion p = 0.75, the required sample size is the same as for p = 0.25.
[10] If accuracy is expressed as a percentage of the mean, say b%, then the percentage points, a = b · mean (with the mean
expressed as a proportion). For example, a confidence interval width of ±25% of the mean with a mean of 40% corresponds to a
confidence interval width of a = 25 · 0.4 = 10 percentage points.
[11] For other confidence levels, replace 1.96 with the appropriate value from the standard Normal distribution for the
confidence level required.
Table 3-2. Approximate 95% confidence intervals for a categorical variable
for a range of population and sample sizes.
                                                 95% Confidence Interval for Proportion of Population in Category
Population         Sample   Proportion of        Mean ±a Percentage     Lower    Upper
Size               Size     Population in        Points, where a =      Limit    Limit
                            Category
100                80       50%                  4.9 pts                45%      55%
100                80       20%                  3.9 pts                16%      24%
100                60       50%                  8.0 pts                42%      58%
100                60       20%                  6.4 pts                14%      26%
100                40       50%                  12.0 pts               38%      62%
100                40       20%                  9.6 pts                10%      30%
50,000 or higher   1,000    50%                  3.1 pts                47%      53%
50,000 or higher   1,000    20%                  2.5 pts                18%      22%
50,000 or higher   400      50%                  4.9 pts                45%      55%
50,000 or higher   400      20%                  3.9 pts                16%      24%
50,000 or higher   100      50%                  9.8 pts                40%      60%
50,000 or higher   100      20%                  7.8 pts                12%      28%
Note: SEE estimated using binomial distribution and sampling without replacement, Normal approximation used to
determine confidence intervals. For further information, refer to statistical textbooks listed at the end of this
guidebook.
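The entries in Table 3-2 can be approximated with the sketch below. It assumes simple random sampling without replacement, a Normal approximation, and a finite-population factor of (1 - n/N); this is an assumption about how the table was produced, not a statement of the authors' exact method.

```python
import math

def ci_half_width_pts(n, population, p, z=1.96):
    """Approximate half-width (percentage points) of the CI for a proportion.

    Assumes simple random sampling without replacement and a Normal
    approximation, with the finite-population factor (1 - n/N).
    """
    standard_error = math.sqrt(p * (1 - p) / n * (1 - n / population))
    return 100 * z * standard_error

for N, n in ((100, 80), (100, 40), (50_000, 400)):
    for p in (0.5, 0.2):
        a = ci_half_width_pts(n, N, p)
        print(f"N={N}, n={n}, p={p:.0%}: ±{a:.1f} pts "
              f"({100 * p - a:.0f}% to {100 * p + a:.0f}%)")
```

Under these assumptions the computed half-widths round to the values shown in the table for the rows checked here.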
For large populations of over 50,000, the required sample size for a 95% confidence level is
given approximately by:
$$ n = \frac{40{,}000 \, p (1 - p)}{a^2} $$
Thus if the proportion of some characteristic of the population is 5% (p = 0.05), say, and the
desired accuracy of the estimate of this proportion is ±1 percentage point at a 95% confidence
level, the required sample size to achieve this accuracy is 1,900. It should be noted that an error
of 1 percentage point on an estimated proportion of only 5% is an error of ±20% of the
estimated proportion. If it is desired to reduce this error to only 5% of the estimated proportion
(±0.25 percentage points), the required sample size would increase to about 30,000.
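The two figures in this example can be checked with a short sketch of the large-population approximation. The function name is an illustrative assumption.

```python
def large_population_sample_size(a_pts, p):
    """Approximate 95% sample size for large populations: 40,000 p (1 - p) / a^2.

    a_pts is the accuracy in percentage points; p is the estimated proportion.
    The constant 40,000 rounds 1.96^2 * 100^2 (about 38,416) upward.
    """
    return 40_000 * p * (1 - p) / a_pts ** 2

print(round(large_population_sample_size(1, 0.05)))     # 1900
print(round(large_population_sample_size(0.25, 0.05)))  # 30400, "about 30,000" in the text
```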
Table 3-3. Required sample size using random sampling for various sized
confidence intervals and a range of population sizes.*
Population Sample Size for 95% Confidence Interval: Sample Mean ±a Percentage Points, where a =
Size 1 pt 2 pts 3 pts 4 pts 5 pts 6 pts 7 pts 8 pts 9 pts 10 pts
100 99 96 91 86 79 73 66 60 54 49
200 196 185 168 150 132 114 99 86 74 65
500 475 414 340 273 217 174 141 115 96 81
1,000 906 706 516 375 278 211 164 130 106 88
2,000 1,655 1,091 696 462 322 235 179 140 112 92
5,000 3,288 1,622 879 536 357 253 189 146 116 94
10,000 4,899 1,936 964 566 370 260 192 148 117 95
20,000 6,488 2,144 1,013 583 377 263 194 149 118 96
50,000 8,057 2,291 1,045 593 381 265 195 150 118 96
100,000 8,762 2,345 1,056 597 383 266 196 150 118 96
200,000 9,164 2,373 1,061 598 383 266 196 150 118 96
500,000 9,423 2,390 1,065 600 384 267 196 150 119 96
* Sample sizes where proportion of population in the category of interest is 50%.
If a subgroup composes S percent of the population and the estimate of the proportion of
some characteristic of the subgroup is required to the same accuracy as the estimate of the
proportion for the population as a whole, the sample will need to be larger by a factor of 100/S. Thus,
to achieve the same accuracy for a subgroup that composes 20% of the population, the sample
would need to be five times larger (100/20 = 5). If this level of accuracy is required for multiple
subgroups, the required total sample size is given by the largest of the estimated total sample sizes
calculated using the factor for each subgroup (100/S_i, where S_i is the percentage of the
population in subgroup i).
For very small populations, such as with airport tenant surveys, a high
proportion of the population must be sampled to obtain estimates within 0.05 (i.e., 5 percentage
points) of the proportion for the total population, but actual numbers of surveys required are
small. For example, a sample size of 79 is required from a population of 100 to achieve an
accuracy of 5 percentage points. For large populations, a sample size approaching 400 is required to
achieve a similar level of accuracy using random sampling.
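The subgroup rule (a factor of 100/S, with the largest requirement governing) can be sketched as follows. The subgroup labels and shares are hypothetical, and the sketch assumes random sampling so that each subgroup appears in the sample in proportion to its population share.

```python
def total_sample_for_subgroups(base_sample_size, subgroup_shares_pct):
    """Total sample needed so every subgroup reaches `base_sample_size` responses.

    subgroup_shares_pct maps a subgroup label to its share S of the population,
    in percent. Each subgroup scales the sample by 100/S; the largest
    requirement governs.
    """
    totals = {name: base_sample_size * 100 / share
              for name, share in subgroup_shares_pct.items()}
    governing = max(totals, key=totals.get)
    return totals[governing], governing

# Hypothetical subgroups; 384 is the Table 3-3 sample size for ±5 points and a large population.
total, governing = total_sample_for_subgroups(
    384, {"business visitors": 20, "residents": 45})
print(f"{total:.0f} responses, driven by {governing}")
```

The smallest subgroup always governs, so the total is driven by the rarest group for which full accuracy is required.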
Numerical Response Questions
For questions with a numerical response, such as the number of travelers in a group,
expenditures at the concessions, or time spent at the airport, the sample size required for a specified
level of accuracy is dependent on the variability (as measured by the standard deviation) in the
numerical response. With random sampling, the population mean and standard deviation are
estimated by the average and standard deviation (weighted if appropriate) of the responses in
the sample. The required sample size for an accuracy of ±w can be calculated for a 95%
confidence level using the following expression: [12]
$$ n = \frac{1.96^2 \, s^2}{w^2} $$
where s is the standard deviation of the responses in the sample.
The SEE can be found approximately by dividing the standard deviation of the sample values
by the square root of the sample size of completed responses. [13] The standard deviation of the
variable of interest is unknown during the survey planning stage and an initial estimate is required to
calculate the required sample size. This initial estimate could be obtained from previous surveys
at the airport or from other airports, or estimated from knowledge of the typical range in values.
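The sample size expression for numerical responses, together with the approximate SEE, can be sketched as follows. The function names and planning values are illustrative assumptions; the optional population argument applies the small-population form given in the footnote to the expression.

```python
import math

def sample_size_mean(s, w, population=None, z=1.96):
    """Required sample size to estimate a mean to within ±w.

    s          -- initial estimate of the standard deviation
    w          -- desired half-width of the confidence interval (same units as s)
    population -- population size N; if given, applies the small-population form
    """
    numerator = z ** 2 * s ** 2
    denominator = w ** 2
    if population is not None:
        denominator += numerator / population
    return numerator / denominator

def standard_error(s, n):
    """Approximate standard error of the estimate: s / sqrt(n)."""
    return s / math.sqrt(n)

# Illustrative planning values (assumed): s = 60 minutes, target accuracy ±16 minutes.
n = sample_size_mean(60, 16)
print(math.ceil(n), round(standard_error(60, 400), 2))
```

Rounding up with `math.ceil` is a conservative choice made here; the source does not state a rounding convention.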
Examples of the mean, standard deviation, SEE and accuracy of estimate (95% confidence
interval), and required sample sizes for accuracy to within 10% of the mean for selected air
passenger characteristics from some airport surveys are given in Table 3-4.
As can be seen by these examples, the accuracy and required sample sizes vary greatly
depending on the variable of interest. Expenditures at the airport vary greatly as many people do not
spend any money and some spend a lot. Thus large sample sizes are required to produce
estimates to within 10% of their expected value. In contrast, variability in the time passengers spend
at the airport is much less, and small samples would give a similar accuracy (in percentage terms).
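The contrast between a high-variability and a low-variability measure can be illustrated with the sketch below, using two (mean, standard deviation) pairs listed in Table 3-4. The function names are illustrative assumptions.

```python
import math

# (mean, standard deviation) pairs as listed in Table 3-4.
variables = {
    "Expenditure, Airport 2 ($)": (8.00, 19.00),
    "Time at airport, domestic (min)": (106, 43),
}

def relative_accuracy(mean, s, n, z=1.96):
    """95% CI half-width as a fraction of the mean for a sample of size n."""
    return z * s / math.sqrt(n) / mean

def n_for_relative_accuracy(mean, s, fraction, z=1.96):
    """Sample size for a CI half-width equal to `fraction` of the mean."""
    return (z * s / (fraction * mean)) ** 2

for name, (mean, s) in variables.items():
    print(f"{name}: ±{relative_accuracy(mean, s, 400):.0%} at n = 400, "
          f"n ≈ {n_for_relative_accuracy(mean, s, 0.10):.0f} for ±10%")
```

The computed sample sizes fall within a response or two of the table's figures; the small differences presumably reflect rounding of the published means and standard deviations.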
3.4.2 Sample Sizes with Stratified and Cluster Sampling
The methods for determining the sample sizes with stratified and cluster sampling are more
complex and are outlined in the following paragraphs with details provided in Appendix B.
[12] For other confidence levels, replace 1.96 with the appropriate z-value from the standard Normal distribution for the
confidence level required. For small population sizes, use the expression $n = 1.96^2 s^2 / [w^2 + 1.96^2 s^2 / N]$, where N is the
population size.
[13] A more accurate estimate is given by dividing by the square root of the sample size less one. This can become important for
small samples.
Table 3-4. Examples of 95% confidence intervals and sample sizes for
selected air passenger characteristics from some recent airport surveys.
Variable                          Mean     Standard     95% Confidence Interval*    Sample Size for Confidence
                                           Deviation    with Sample Size = 400      Interval ±10% of Mean
Number in travel group            1.4      1.6          ±0.16 or ±11%               503
Expenditure at all concessions
  Airport 1                       $6.20    $8.53        ±$0.84 or ±14%              728
  Airport 2                       $8.00    $19.00       ±$1.86 or ±23%              2,168
Time at airport (min)
  Large international             160      60           ±6 min or ±4%               55
  Domestic                        106      43           ±4 min or ±4%               64
* Confidence interval expressed as difference from sample mean, also given as a percentage of the sample
mean.
Note: Sample mean will be approximately normally distributed for large sample sizes according to the
Central Limit Theorem, even for variables such as expenditure at concessions that are not normally
distributed.
Source: Airport surveys conducted by Jacobs Consultancy in the United States and Canada.
Stratified Sampling
The objective of stratified sampling is to reduce the sample size required to achieve a
desired level of accuracy. It applies where the population can be divided into strata such that
the characteristic of interest differs between the strata and varies less within each stratum. For example,
if the characteristic of interest is the duration of air passenger trips (because trip duration
affects the likely use of parking at the airport), the duration values are likely to differ
considerably between international trips, long-haul domestic trips, and short-haul domestic trips.
Because the variance of the air trip duration within each of these three strata will be much smaller
than the variance for the population as a whole, it may be possible to estimate the average trip
duration for all air passengers to the desired level of accuracy with fewer total responses divided
between the three strata than by randomly sampling the entire population.
To achieve a similar level of accuracy in the results for each stratum, it will be necessary to use
non-proportional stratified sampling, with the sample size in each stratum inversely proportional
to the variance of the characteristic within that stratum. Because the actual variance in the
characteristic for each stratum will not be known until the survey has been performed, it will be
necessary to make an initial assumption of the differences in the variance across the strata in order to
determine the proportion of the survey responses to assign to each stratum. These assumptions
can be based on the results of prior surveys or of surveys performed at similar airports.
If $\sigma_{X_i}$ is the standard deviation of characteristic X in stratum i, then for a confidence interval
for the sample mean of X across the population of 2w (i.e., ±w) at a 95% confidence level, w is
given by:
$$ w = 1.96 \sqrt{\sum_i W_i^2 \, \sigma_{X_i}^2 \left(1 - \frac{n_i}{N_i}\right) \frac{1}{n_i}} $$
where $W_i$ is the proportion of the total population in stratum i,
$N_i$ is the population in stratum i, and
$n_i$ is the sample size in stratum i.
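The confidence interval half-width for a stratified sample can be sketched as follows. The sketch assumes the expression above reads w = 1.96 sqrt(sum over i of W_i^2 sigma_i^2 (1 - n_i/N_i) / n_i), which is the usual form for the variance of a stratified sample mean; the strata values are hypothetical and chosen only for illustration.

```python
import math

def stratified_half_width(strata, z=1.96):
    """Half-width w of the CI for a stratified sample mean.

    strata is a list of dicts with keys:
      N     -- population in the stratum
      n     -- sample size in the stratum
      sigma -- standard deviation of the characteristic in the stratum
    The stratum weight W_i is taken as N_i divided by the total population.
    """
    total_population = sum(st["N"] for st in strata)
    variance = 0.0
    for st in strata:
        weight = st["N"] / total_population
        fpc = 1 - st["n"] / st["N"]  # finite-population correction
        variance += weight ** 2 * st["sigma"] ** 2 * fpc / st["n"]
    return z * math.sqrt(variance)

# Hypothetical strata for trip duration in days (all numbers are assumptions).
strata = [
    {"N": 20_000, "n": 150, "sigma": 2.0},   # short-haul domestic
    {"N": 16_000, "n": 150, "sigma": 4.0},   # long-haul domestic
    {"N": 12_000, "n": 150, "sigma": 9.0},   # international
]
print(round(stratified_half_width(strata), 3))
```

With a single stratum the function reduces to the random-sampling case, which provides a simple check on the implementation.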
A given confidence interval can be obtained for varying combinations of $n_i$. However, if $n_i$ is
selected to be inversely proportional to the variance of X within each stratum, i.e., $n_i = k / \sigma_{X_i}^2$,
then $n_i$ can be replaced by $k / \sigma_{X_i}^2$ in the above equation, which can then be solved for k and hence
$n_i$ calculated for each stratum. The expression for calculating the value of k and the sample sizes
for each stratum is provided in Appendix B.
The total sample size is obtained by summing ni across all the strata.
Cluster Sampling
Calculating an appropriate sample size with cluster sampling is considerably more
complicated than with random or stratified sampling, because the composition and size of the clusters
affect the variance of the resulting estimates of the population characteristics.
The accuracy of a cluster sample depends on both the variance of the characteristic of
interest within each cluster and the variance between clusters. If the variation in the sample mean
between clusters is fairly small (i.e., the clusters are fairly homogeneous and have similar means)
but the variance of the characteristic within each cluster is fairly large, then the cluster sample
will give a similar accuracy to a random sample of the same overall sample size. One can think
of this situation as a series of small random samples of the population as a whole. Conversely, if
the variance between clusters is fairly high, then the overall variance of the population sample
mean of the characteristic will be larger than for a random sample and in consequence a cluster
sample will require a larger overall sample size to achieve the same level of accuracy.
3.4.3 Comparison of Sampling Methods
An example of sample size calculations for different sampling methods is given in Appendix B.
The example provides some insight into the efficiencies of each sampling method and is
summarized in this section. In the example, a survey of passengers is to be undertaken to obtain
information on airport access trips. A critical question to be answered may be: What is the percentage
of departing passengers dropped off at the terminal curb? Random and stratified sampling of
passengers (with stratification by flight sector, e.g., short-haul domestic, long-haul domestic, and
international, and by day of the week) and one- and two-stage cluster sampling (with both
random and stratified sampling of flights by sector) are examined. The flight schedule for the
survey period includes 610 flights per week, and the number of originating passengers per week is
estimated at 48,300. Some 42% of passengers are on short-haul domestic flights, 34% on
long-haul domestic flights, and 24% on international flights. From past experience, initial estimates
of the percentages of passengers dropped off at the curb are 40% of short-haul domestic
passengers, 60% of long-haul domestic passengers, and 90% of international passengers. In the
example, the percentage of passengers to be dropped off at the curb is quite strongly related to the
flight sector, but fairly weakly related to the day of the week.
Table 3-5 summarizes the required sample sizes for an accuracy of ±2, 3, and 4 percentage
points for a 95% confidence level using various sampling strategies. The following observations
were made from this example:
· Using random sampling, the required sample size approximately doubles as the accuracy
improves from ±4 to ±3 and doubles again from ±3 to ±2 percentage points.
· Stratified sampling by flight sector (which has a strong relationship with the variable of interest) reduces
the sample size required by 15%, but stratified sampling by day of the week (which has a weak
relationship with the variable of interest) has a negligible effect on the required sample size.
· Cluster sampling with random sampling of flights and surveying of all passengers on those
flights was found to be very inefficient, increasing the sample size required by a factor of 9 or
more compared to random sampling.
· Cluster sampling with stratified sampling of flights by sector greatly improves the efficiency
of cluster sampling.
  - With all of the passengers on the selected flights surveyed, the sample size required is
    reduced to approximately 3 times that of random sampling.
Table 3-5. Sample sizes in example survey for an accuracy of 2, 3 and 4 percentage
points for a 95% confidence level using various sampling strategies.
                             Mean ±a Percentage Points, a =
Method       Unit Sampled    2 pts     3 pts     4 pts     Comment
Random       Passengers      2,218     1,012     574       Random sampling of passengers (pax)
Stratified   Passengers      1,879     853       484       Stratified by sector of flight
Stratified   Passengers      2,215     1,011     574       Stratified by day of the week
Cluster 1    Flights         252       146       92        Random sampling of flights with all
             Passengers      19,953    11,560    7,285     pax on each flight sampled
Cluster 2    Flights         83        41        24        Stratified sampling of flights by sector
             Passengers      6,560     3,280     1,910     with all pax on each flight sampled
Cluster 3    Flights         117       47        26        Stratified sampling of flights by sector
             Passengers      4,615     1,860     1,040     with 50% pax on each flight sampled
  - With a random sample of 50% of passengers on each flight surveyed, the sample size
    required is reduced by 30% to 2.1 times that required using random sampling. However,
    with only 50% of passengers surveyed on each flight, the number of flights surveyed
    increases.
  - Several other percentages of passengers to survey on each flight were examined, and both
    the 30% and 75% levels resulted in larger passenger sample sizes. The optimal balance
    between the number of flights and the proportion of passengers on those flights to survey
    depends on the variation in responses between and within flights, and on the relative costs
    of surveying passengers and flights, which vary from survey to survey.
The results of this example reflect the assumptions regarding variation used in the example
and will vary in other situations. Refer to Appendix B for more information on the example and
the calculation of the sample sizes.
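The relative efficiencies quoted in the observations above can be recomputed from the passenger sample sizes in Table 3-5, as in the sketch below. The dictionary keys are shorthand labels introduced here for illustration.

```python
# Passenger sample sizes from Table 3-5 for accuracies of ±2, ±3 and ±4 points.
passengers = {
    "random": (2218, 1012, 574),
    "stratified by sector": (1879, 853, 484),
    "stratified by day of week": (2215, 1011, 574),
    "cluster 1: random flights, all pax": (19953, 11560, 7285),
    "cluster 2: stratified flights, all pax": (6560, 3280, 1910),
    "cluster 3: stratified flights, 50% pax": (4615, 1860, 1040),
}

def ratios_to_random(table, baseline="random"):
    """Each method's sample size divided by the baseline's, per accuracy level."""
    base = table[baseline]
    return {method: tuple(n / b for n, b in zip(sizes, base))
            for method, sizes in table.items()}

for method, ratios in ratios_to_random(passengers).items():
    print(f"{method:40s}" + "  ".join(f"{r:5.2f}" for r in ratios))
```

At ±2 percentage points the ratios are about 0.85 for stratification by sector, about 9 for the first cluster design, about 3 for the second, and about 2.1 for the third, in line with the observations listed above.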
In comparing the required sample sizes for different sampling methods, it should be borne in
mind that true random sampling of air passengers is almost impossible to achieve, as discussed
in Chapter 5.
3.4.4 Determining Desired Accuracy
While the mathematics of calculating required sample size is generally fairly
straightforward, deciding on the appropriate desired level of accuracy is anything but, because it
depends on the consequences of being wrong. Although it is common in statistical analysis
to use a target accuracy of ±5% at a 95% confidence level, this is an entirely arbitrary choice
and is typically not achievable or not accurate enough for many issues addressed by air
passenger surveys.
Consider the case where the characteristic of interest accounts for only a small proportion of
respondents, say air passengers using transit to access the airport, which from past surveys is
estimated to be approximately 5%. The proportion using transit is to be estimated for a subgroup
that composes 20% of the population (e.g., air passengers from a particular part of the region).
If the required accuracy for the estimated proportion of this subgroup is ±5% of the estimated
proportion (i.e., ±0.25 percentage points) at a 95% confidence level, a random sample survey
would require a sample size of about 150,000 responses, a level of effort that is totally impractical. Even
accepting an accuracy of ±20% of the estimated proportion (i.e., ±1 percentage point) at the same
confidence level, the required sample size would still be 9,500, which is potentially achievable but
significantly larger than most air passenger surveys.
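The arithmetic behind these two figures can be sketched by combining the large-population approximation n = 40,000 p (1 - p) / a^2 with the subgroup factor 100/S, both given in Section 3.4.1. The function name is an illustrative assumption.

```python
def required_sample(p, relative_error, subgroup_share_pct=100.0):
    """Random-sample size for a relative accuracy on a proportion, at 95% confidence.

    p                  -- estimated proportion (0.05 for 5%)
    relative_error     -- accuracy as a fraction of p (0.20 for ±20%)
    subgroup_share_pct -- share of the population in the subgroup, in percent
    """
    a_pts = 100 * p * relative_error                 # accuracy in percentage points
    n_subgroup = 40_000 * p * (1 - p) / a_pts ** 2   # responses needed in the subgroup
    return n_subgroup * 100 / subgroup_share_pct     # scale up by the factor 100/S

print(round(required_sample(0.05, 0.05, 20)))  # 152000, "150,000" in the text
print(round(required_sample(0.05, 0.20, 20)))  # 9500
```

Halving the relative error quadruples the required sample, which is why tight relative accuracy on a rare characteristic within a small subgroup quickly becomes impractical.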