Statistical Concepts 31
In practice, in addition to the sample size and variation in the attributes being estimated, the
accuracy of the estimated population attributes also depends on the sampling method, the level
of non-response, and the characteristics of the non-respondents. Unfortunately, the characteristics
of the non-respondents are generally unknown and must be estimated in some way. In the
absence of any information about the characteristics of the non-respondents (the usual case),
they are generally assumed to be the same as the characteristics of the respondents.
The appropriate level of confidence to be used in expressing the margin of error in the results
of a sample depends on the costs associated with making an error. The higher the costs of an error,
the greater the confidence that is required that the true value is within the confidence interval.
The width of the confidence interval can be reduced by increasing the sample size or, possibly,
improving the sample design. Generally 95% confidence intervals are used for most purposes,
but 99% confidence intervals may be used for some critical variables, while in other cases 90%
confidence intervals may be adequate.
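As a rough illustration of how the confidence level and the sample size affect the margin of error, the following Python sketch uses the normal approximation for an estimated proportion. It assumes a simple random sample with no design effect or non-response adjustment. The function name and the example proportion of 0.5 are illustrative assumptions, not taken from the guidebook, whose own calculations are given in Appendix B.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence):
    """Half-width of a normal-approximation confidence interval for a proportion.

    Assumes a simple random sample (an illustrative simplification).
    """
    # Two-sided critical value, e.g. roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


# Compare the three confidence levels mentioned in the text at two sample sizes.
for confidence in (0.90, 0.95, 0.99):
    for n in (400, 1600):
        moe = margin_of_error(0.5, n, confidence)
        print(f"confidence={confidence:.0%}  n={n}:  +/- {moe:.3f}")
```

Under these assumptions, quadrupling the sample size halves the margin of error, while moving from 95% to 99% confidence widens the interval for the same sample.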
Accuracy is discussed further in Section 3.4. Information on the calculation of the SEE using
the different sampling methods is provided in Appendix B.
3.3 Sampling Methods
For a sample survey, the sample of respondents should be selected from the population in
such a way that the probability of any individual respondent being selected can be estimated.
This method allows generalizations to be made about the entire population from the characteristics
of the sample and estimates to be made of the likely accuracy of the estimated characteristics
of the population based on the size of the sample. The most straightforward approach to
obtaining a representative sample from a population is to select the members of the sample randomly
from among the members of the population. However, in practice this is often difficult
to achieve, particularly in an airport environment. Furthermore, it has the disadvantage that the
sample will include relatively few members of particular subgroups of the population that compose
only a small proportion of the total population. To address these concerns, other common
sampling methods may be used. These methods are summarized in Table 3-1 and discussed in
more detail in the following subsections.
The choice of the appropriate sampling method is partly a question of how best to achieve the
desired accuracy of the survey results and partly a consequence of the practicalities of performing
the survey. For example, it is common to perform air passenger surveys in departure lounges,
because passengers are more willing to be interviewed or fill out a survey form when they are no
longer anxious about whether they will make their flight and they are sitting down. However,
this locale constrains the sample to those passengers on a set of flights, and does not provide a
truly random sample of all passengers using the airport. On the other hand, a mail-back survey
sent to the home address of airport employees can sample employees randomly from a list of all
employees at the airport.
The sampling method selected will depend on the type of survey, data collection method, and
characteristics of the population. Random and sequential sampling are the simplest methods to
implement but require large sample sizes to obtain an adequate number of responses from small
subgroups of the population, while stratified and cluster sampling can be used, with a limited
budget, to improve the accuracy of survey results for different subgroups. Often multi-stage
sampling is appropriate. For example, flights may be treated as clusters and selected using
stratified sampling, with passengers on the selected flights then chosen using sequential sampling.
32 Guidebook for Conducting Airport User Surveys
Table 3-1. Summary of sampling methods.
Type | Method | Comment
Random | Respondents are selected randomly from the target population. | Often difficult to do in airport environment. Need to use some randomizing technique (e.g., use of random number tables). Selection by interviewers can lead to biases if they are not well trained.
Sequential (Systematic) | Every nth individual is selected when potential respondents are arranged in some order. First respondent should be selected randomly from among the first n individuals. | Good practical technique for airport surveys. Sample characteristics will be equivalent to a random sample if the order of potential respondents is not related to the variables of interest.
Stratified | Respondents are grouped into homogeneous groups (e.g., different categories of employee). Sampling occurs within each group separately. | Used to obtain a more representative sample of different groups, particularly if the groups vary in size, or to obtain a specific accuracy in estimates for each group.
Cluster | Respondents are sampled from naturally occurring groups (e.g., flights). A sample of flights is selected, then all or a sample of passengers on those flights are selected. | Suitable for large surveys where a wide range of flights can be sampled. Can use stratified sampling of flights to obtain a more representative sample.
Non-probability | Respondents are selected on the basis of some criterion that does not allow the probability of sampling any given member of the population to be determined. | May be useful for gathering information on the range of possible responses, where the frequency with which those responses occur in a defined population is not required.
A controlled sample is one in which the sampling approach is designed so that the composition of
the sample corresponds to the underlying distribution of the population characteristics. This
objective is generally satisfied with a truly random sample, provided the sample size is large
enough. In practice, achieving a truly random sample with airport user surveys is often difficult,
as discussed in Section 3.3.1. In the case of other sampling methods, it is necessary to adjust the
sampling rate or define the strata or clusters so that, where the characteristics of the population
vary across different subgroups or time periods, those subgroups or time periods are represented
in the sample in proportion to their occurrence in the population.
A controlled sample is thus an attribute of a particular sample design rather than a different
type of sampling method. If the sample is not controlled so that its composition reflects
the underlying distribution of the population characteristics, then the survey results need to
be weighted to properly reflect the characteristics of the population. A cluster sample in which
sampled flights are chosen to reflect the proportions of flights in markets that are believed to
have different passenger characteristics (e.g., international, domestic short-haul, domestic
long-haul, etc.) and passengers are sampled for each flight in proportion to the passengers on
the flight would represent a controlled sample. Thus a self-completed air passenger survey in
which flights are selected in proportion to the number of flights in broadly defined markets and
survey forms are given to every adult passenger on those sampled flights would qualify as a
controlled sample.
Because variation in the characteristics of the population over time or across different
subgroups will not in general be known until the survey results are obtained, designing a controlled
sample means making assumptions about subsets of the population with different characteristics
and ensuring that each subset is sampled in proportion to its occurrence in the population.
If it turns out that two subsets of the population that were expected to have different
characteristics in fact have similar characteristics, the results for the two subsets can be combined.
However, if two subsets that in fact have different characteristics are assumed to be similar and are
not sampled in proportion to their occurrence in the population, the results will be biased.
Weighting of the subgroups will be required to remove this bias.
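The effect of such weighting can be sketched in a few lines of Python. The example below is hypothetical: the two markets, their population sizes, the number of responses, and the share of business travelers in each are invented for illustration, and the weighting shown (each subgroup weighted by its share of the population) is one common approach rather than the guidebook's prescribed procedure.

```python
def pooled_sample_share(groups):
    """Share computed directly from the pooled sample, ignoring how groups were sampled."""
    total_responses = sum(g["responses"] for g in groups)
    return sum(g["responses"] * g["share"] for g in groups) / total_responses


def weighted_share(groups):
    """Share with each group weighted by its proportion of the population."""
    total_population = sum(g["population"] for g in groups)
    return sum(g["population"] / total_population * g["share"] for g in groups)


# Hypothetical survey: international passengers are 20% of the population
# but make up half of the responses, so the pooled sample is unbalanced.
groups = [
    {"market": "domestic", "population": 8000, "responses": 500, "share": 0.40},
    {"market": "international", "population": 2000, "responses": 500, "share": 0.20},
]

print(f"pooled sample share: {pooled_sample_share(groups):.2f}")
print(f"weighted share:      {weighted_share(groups):.2f}")
```

In this invented case the pooled sample understates the business share because the subgroup with the lower share is over-represented; weighting restores each subgroup to its population proportion.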
Analysis of the results of previous surveys at the airport in question or of surveys conducted
at other airports with similar traffic patterns can help identify subsets of the population that are
likely to have different characteristics. The design of the controlled sample then attempts to
ensure that those subsets are sampled in proportion to their occurrence in the population.
3.3.1 Random Sampling
With random sampling, each individual must have an equal (or at least known) chance of being
selected. An example of random sampling would be a tenant survey where a list of all airport
tenants is assembled and a table of random numbers is used to select individual tenants from the list.
The sampling approach will generally ensure that no individual can be in the sample more than
once. For air passenger surveys, the sample size is typically so small relative to the population and
the methodology is such that there is very little likelihood of surveying the same person twice
(see footnote 5). For most other airport user surveys, the methodology precludes sampling the same
respondent twice.
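The tenant example can be sketched in Python, with a pseudo-random number generator standing in for the table of random numbers. The tenant names, the sample size of 15, and the fixed seed are assumptions made for illustration only.

```python
import random


def draw_simple_random_sample(frame, sample_size, seed=None):
    """Select sample_size distinct members of the sampling frame.

    random.Random.sample draws without replacement, so no member
    can appear in the sample more than once.
    """
    rng = random.Random(seed)
    return rng.sample(frame, sample_size)


# Hypothetical sampling frame: a complete list of 120 airport tenants.
tenants = [f"Tenant {i:03d}" for i in range(1, 121)]
selected_tenants = draw_simple_random_sample(tenants, 15, seed=42)

# Every tenant has the same selection probability, n / N.
print(f"selection probability: {15 / len(tenants):.3f}")
print(selected_tenants)
```

Fixing the seed makes the draw reproducible, which can be useful for documenting how the sample was selected.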
Obtaining a truly random sample is often difficult, particularly for airport surveys. For example,
identifying each member of the population to include in the sampling process, then applying
a method for randomly selecting them can be difficult, if not impossible, in an airport departure
lounge. There are also problems associated with having surveyors select passengers to survey; this
introduces a human element and invariably leads to biases. To avoid this, random numbers or
sequential sampling, discussed in the following subsection, should be used for selecting
individuals to survey.
Interviews at groundside locations such as curb areas and parking lots, where the next available
passenger is surveyed once an interview has been completed, are equivalent to random sample
surveys as long as the ratio of interviews to passengers is fairly constant. However, such an approach
will clearly change the sampling rate as the passenger flow changes. During periods of very low flow,
every passenger might be interviewed, while during periods of high flow only a small proportion
of passengers would be interviewed, and this should be taken into account in analyzing the results.
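One way to take the changing sampling rate into account is to give each interview an expansion factor equal to the number of passengers it represents in its time period. The Python sketch below assumes that passenger counts per period are available (for example from a separate count), which the text does not state; the period names, counts, and shares are hypothetical.

```python
def expansion_factors(periods):
    """Number of passengers represented by each completed interview, per period."""
    return {p["period"]: p["passengers"] / p["interviews"] for p in periods}


def flow_weighted_share(periods):
    """Share estimate with each period weighted by its passenger flow."""
    total_passengers = sum(p["passengers"] for p in periods)
    return sum(p["passengers"] * p["share"] for p in periods) / total_passengers


# Hypothetical curbside survey: every passenger is interviewed in the quiet
# period, but only one in ten during the peak.
periods = [
    {"period": "early morning", "passengers": 40, "interviews": 40, "share": 0.50},
    {"period": "midday peak", "passengers": 600, "interviews": 60, "share": 0.30},
]

print(expansion_factors(periods))
print(f"flow-weighted share: {flow_weighted_share(periods):.4f}")
```

With these invented figures, pooling the 100 interviews without weighting would give a share of 0.38, whereas weighting by passenger flow gives a lower value because the peak period dominates the population.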
3.3.2 Sequential Sampling
Sequential sampling is generally a good form of sampling for use in airport surveys. With
sequential sampling, also referred to as systematic sampling, the population is arranged in some
logical order and every nth individual is selected, starting with a randomly selected individual
from the first n individuals. An example of sequential sampling is to survey every fourth passenger
in a check-in queue. Sequential sampling is usually easier to apply than random sampling
and will yield a random sample if the order of individuals in the list is essentially random with
respect to the characteristics being measured in the survey (see footnote 6). For example, there is
no reason to think that the order in which people sit in a departure lounge has any systematic
relationship to the characteristics being measured (such as their trip purpose or how they got to
the airport), and therefore selecting every nth person is in effect a random sample. Of course,
depending on the layout of the lounge, early arriving passengers or those with difficulty walking
may sit closer to the boarding point, while later arriving passengers may have to use seats further
away. However, as long as all passengers in the lounge are included in the sampling strategy, where
they sit will not affect their chance of being sampled.
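The selection rule described above (a random start among the first n individuals, then every nth individual) can be sketched in Python as follows. The queue of 40 passengers and the interval of 4 are illustrative assumptions.

```python
import random


def systematic_sample(ordered_population, interval, seed=None):
    """Select every interval-th individual after a random start.

    The start position is drawn at random from the first `interval`
    individuals, so each individual has a 1-in-interval chance of selection.
    """
    rng = random.Random(seed)
    start = rng.randrange(interval)  # 0-based position within the first n individuals
    return ordered_population[start::interval]


# Hypothetical check-in queue of 40 passengers, surveying every fourth one.
queue = [f"passenger {i}" for i in range(1, 41)]
selected_passengers = systematic_sample(queue, 4, seed=7)
print(selected_passengers)
```

Because the start is random, the procedure does not depend on an interviewer choosing whom to approach, which is the main practical advantage noted in the text.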
5 If an individual gets surveyed twice on two different trips, that is not the same thing as surveying the same traveler twice
on the same trip. The former should be valid as the sample being drawn is really of passenger trips, not of passengers, and
a single passenger may make more than one trip during the survey period.
6 Serious biases can occur if a characteristic of interest occurs in a cyclic order in the population list and the length of each cycle
corresponds to the sampling fraction, but this phenomenon would be rare in airport surveys.
Where the population list is ordered by a relevant characteristic, the use of sequential
sampling will often result in a sample with a more representative range of characteristics than using
random sampling. For example, in selecting flights to survey, if all flights during the survey
period are listed in order of flight stage length, the resulting sample would likely better reflect
passenger characteristics such as destination city or region than a random sample, as sequential
sampling ensures a more even spread of flights by stage length and thus over destinations and
regions. With random sampling, some subgroups of the population (flights with a particular
stage length in the above example) may be missed completely and others may be over-sampled.
One common application of sequential sampling in air passenger surveys is to list flights by
departure time (and destination to resolve flights with the same departure time) and select every
nth flight to survey. A variation on this approach is to list the number of seats on each flight and
calculate the cumulative total number of seats for each flight (the total number of seats on previous
flights on the list plus the number on the current flight). Flights are then selected by identifying the
flight that corresponds to every mth seat on the cumulative list. This ensures that the probability
of a given flight being sampled is proportional to the size of the aircraft, which approximates a
random sample of air passengers if the same number of passengers is interviewed for each flight
(see footnote 7).
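The cumulative-seat variation can be sketched in Python as below. The flight identifiers, seat counts, and seat interval are hypothetical, and the random start between 1 and m is an assumption carried over from the description of sequential sampling in Section 3.3.2; the text itself does not say how the first seat is chosen.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def select_flights_by_seats(flights, seat_interval, start=None, seed=None):
    """Select the flight containing every seat_interval-th seat on the cumulative list.

    flights: list of (flight_id, seats) in listing order (e.g. by departure time).
    start:   seat number of the first selection; drawn at random from
             1..seat_interval when not supplied (an assumption of this sketch).
    """
    cumulative = list(accumulate(seats for _, seats in flights))
    if start is None:
        start = random.Random(seed).randint(1, seat_interval)
    selected = []
    target = start
    while target <= cumulative[-1]:
        # First flight whose cumulative seat total reaches the target seat.
        index = bisect_left(cumulative, target)
        selected.append(flights[index][0])
        target += seat_interval
    return selected


# Hypothetical departure list: (flight, seats).
flights = [("F1", 150), ("F2", 50), ("F3", 180), ("F4", 120), ("F5", 100)]
print(select_flights_by_seats(flights, seat_interval=200, seed=1))
```

In this sketch a flight's chance of selection is its seat count divided by the seat interval. A flight with more seats than the interval would be selected more than once, so the interval should be chosen with the largest aircraft in mind.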
3.3.3 Stratified Sampling
In stratified sampling, the population is divided into mutually exclusive groups (strata) and
individuals within each group are randomly sampled. Groups should be selected so that they are
homogeneous with respect to the variables being studied (there is low variation within the
groups), but so that the variation in the relevant variables is large between groups. For example,
in a survey to determine passenger spending at airport concessions, passengers taking short-haul
domestic flights are likely to spend much less than passengers taking long-haul international
flights. The variation in spending among short-haul domestic passengers and among long-haul
international passengers is likely to be less than the variation in spending between the two
groups. If the criterion for stratification is highly correlated to the variable being studied, such
as in this example, the gain in accuracy can be significant. Examples of stratified sampling
include dividing flights into groups (such as international and domestic short and long haul,
or by region) and dividing passengers to be sampled into groups based on day of the week,
time period during the day, and airport terminal used.
The variable used for stratifying the population must be known for all individuals in the
population. Once the survey population has been stratified into groups, simple random or
sequential sampling is used to select individuals from each group.
With proportional stratified sampling, the same proportion of individuals (the sampling fraction)
is surveyed in each group. This form of sampling is often used to assure a more representative
sample than simple random or sequential sampling.
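A proportional allocation can be sketched in Python as follows. The strata, their sizes, and the total sample of 500 are hypothetical, and the rounding shown is a simplification: in general the rounded allocations may need a small adjustment to sum exactly to the intended total.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Allocate the sample so every stratum has the same sampling fraction."""
    total_population = sum(stratum_sizes.values())
    return {
        stratum: round(total_sample * size / total_population)
        for stratum, size in stratum_sizes.items()
    }


# Hypothetical numbers of passengers in each stratum during the survey period.
stratum_sizes = {
    "domestic short-haul": 6000,
    "domestic long-haul": 3000,
    "international": 1000,
}
allocation = proportional_allocation(stratum_sizes, total_sample=500)

for stratum, n_h in allocation.items():
    fraction = n_h / stratum_sizes[stratum]
    print(f"{stratum}: {n_h} interviews, sampling fraction {fraction:.3f}")
```

Individuals within each stratum would then be chosen by simple random or sequential sampling, as described above.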
In non-proportional stratified sampling, different sampling fractions are used to improve the
accuracy of estimates for a given overall sample size. Situations where non-proportional
sampling is desirable include the following:
· Where the variation in the variables being studied differs greatly between groups. The
non-homogeneous groups (with a high variation in the variables of interest) should have a larger
sample than the homogeneous groups. For example, consider a survey conducted to determine
the average number of check-in bags per passenger where a stratified sample is to be drawn with
flights grouped into long- and short-haul domestic and international flights. If it is known that
the variation in the number of check-in bags is greater for passengers on long-haul international
flights than short-haul domestic flights, the sampling fraction would be higher for the long-haul
international flights.
· Where comparisons of distinct subgroups of the population are required, for example
comparisons between domestic and international passengers.
· Where the cost of collecting the data differs greatly between groups. Here, overall accuracy for
a given cost can be improved by having a lower sampling fraction for the groups with high
data collection costs. However, while this may lead to a higher overall accuracy for the pooled
data, when the characteristics of subgroups need to be considered, as is almost always the case
in airport surveys, it can lead to very different accuracy for the various subgroups. Thus the
approach of reducing the sampling fraction for groups with higher data collection costs is not
generally recommended for airport surveys.
7 This method assumes that the load factor (the ratio of passengers to seats) does not vary significantly across flights. Where this
is not the case, and some classes of flight have a higher average load factor than others, an adjustment to the number of
passengers interviewed on each flight may be required to approximate a random sample.
Expanding the sample results of non-proportional stratified sampling to determine estimates
for the population is not as straightforward as with proportional stratified sampling, and is
discussed in Appendix B. If non-proportional stratified sampling is appropriate, it is suggested that
the planning team either become knowledgeable on the subject (refer to the Bibliography for
appropriate guidance) or consider using external expertise.
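As a hedged illustration of the check-in bag example, the Python sketch below uses Neyman allocation, a standard textbook rule that assigns interviews in proportion to each stratum's size multiplied by its standard deviation. The guidebook does not name this rule, and the method in Appendix B may differ. The stratum sizes and standard deviations are invented, and in practice the standard deviations would have to be estimated from earlier surveys.

```python
def neyman_allocation(strata, total_sample):
    """Allocate interviews in proportion to N_h * S_h for each stratum h."""
    weights = {
        name: s["population"] * s["std_dev"] for name, s in strata.items()
    }
    total_weight = sum(weights.values())
    return {
        name: round(total_sample * w / total_weight)
        for name, w in weights.items()
    }


# Hypothetical strata: passengers and assumed standard deviation of checked bags.
strata = {
    "short-haul domestic": {"population": 8000, "std_dev": 0.5},
    "long-haul international": {"population": 2000, "std_dev": 1.5},
}
bag_allocation = neyman_allocation(strata, total_sample=700)

for name, n_h in bag_allocation.items():
    population = strata[name]["population"]
    # The expansion factor is the number of passengers each interview represents.
    print(
        f"{name}: {n_h} interviews, "
        f"sampling fraction {n_h / population:.3f}, "
        f"expansion factor {population / n_h:.2f}"
    )
```

Because the sampling fractions differ between strata, each stratum's results must be expanded by its own factor before they are combined, which is why the expansion step is less straightforward than with proportional sampling.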
3.3.4 Cluster Sampling
With cluster sampling, the population is distributed in a large number of naturally occurring
groups, for example passengers on flights. The groups, or clusters, are themselves sampled, so not
all clusters are included in the sample. This is the primary difference from stratified sampling, where
individuals are sampled from every group. In the simplest form, all individuals within a cluster
are sampled. When the individuals within each cluster are similar to one another (the clusters are
internally homogeneous), it is more efficient to sample only a fraction of the individuals within a
cluster, and to sample more clusters. Cluster sampling is used to make sampling easier and less
costly by limiting the survey to well-defined groups, such as passengers on specific flights, and
works well when the characteristics of interest have low variability between clusters and high
variability within clusters. For example, although the household income of passengers on a given
flight will span a wide range, the average household income of passengers on different flights will
show much less variability.
The accuracy of estimates made using cluster sampling will almost always be lower than if a
random sample is used with the same sample size, because the selected clusters may not be fully
representative of the target population as a whole. The accuracy can be significantly lower if
variability between clusters is high and/or a small number of clusters are selected. It is important
that the consequences of the design of the cluster sample (often referred to as the design effect) are
incorporated into the analysis when evaluating the accuracy of estimates and required sample sizes.
Details of how to calculate sample sizes and confidence intervals for cluster samples are included
in Appendix B.
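To give a sense of scale, the Python sketch below uses a widely cited approximation for the design effect of a cluster sample with equal-sized clusters, deff = 1 + (m - 1) * rho, where m is the number of respondents per cluster and rho is the intraclass correlation. This formula is an assumption brought in for illustration; the guidebook's own method is in Appendix B and may differ. The number of flights, respondents per flight, and value of rho are hypothetical.

```python
from math import sqrt


def design_effect(respondents_per_cluster, intraclass_corr):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1 + (respondents_per_cluster - 1) * intraclass_corr


def effective_sample_size(sample_size, deff):
    """Size of a simple random sample that would give the same accuracy."""
    return sample_size / deff


# Hypothetical survey: 20 flights with 30 respondents each, rho assumed to be 0.05.
flights_sampled = 20
respondents_per_flight = 30
rho = 0.05

deff = design_effect(respondents_per_flight, rho)
n_total = flights_sampled * respondents_per_flight
n_effective = effective_sample_size(n_total, deff)

print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_effective:.0f} of {n_total} interviews")
print(f"margin of error inflated by a factor of {sqrt(deff):.2f}")
```

Under this approximation, surveying more flights with fewer respondents on each reduces the design effect for the same total number of interviews, which matches the text's point about sampling more clusters.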
A common example of cluster sampling in airport surveys is the use of individual flights as
clusters, with the flights to be surveyed being selected using random, sequential or stratified
sampling. Then either all passengers on each selected flight or a sample of passengers on those
flights are surveyed.
3.3.5 Non-Probability Sampling
Non-probability (or uncontrolled) sampling is where the probability of an individual's selection
cannot be determined. Examples of non-probability sampling include the following:
· Surveys of passengers who ask for help at an airport information booth, where no record is
kept of the number of passengers seeking help at the booth.
· Voluntary Web-based surveys where all visitors to the site are invited to complete a survey.