The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.

OCR for page 31
Statistical Concepts

In practice, in addition to the sample size and variation in the attributes being estimated, the accuracy of the estimated population attributes also depends on the sampling method, the level of non-response, and the characteristics of the non-respondents. Unfortunately, the characteristics of the non-respondents are generally unknown and must be estimated in some way. In the absence of any information about the characteristics of the non-respondents (the usual case), they are generally assumed to be the same as the characteristics of the respondents.

The appropriate level of confidence to be used in expressing the margin of error in the results of a sample depends on the costs associated with making an error. The higher the costs of an error, the greater the confidence that is required that the true value is within the confidence interval. The width of the confidence interval can be reduced by increasing the sample size or, possibly, improving the sample design. Generally 95% confidence intervals are used for most purposes, but 99% confidence intervals may be used for some critical variables, while in other cases 90% confidence intervals may be adequate.

Accuracy is discussed further in Section 3.4. Information on the calculation of the SEE using the different sampling methods is provided in Appendix B.

3.3 Sampling Methods

For a sample survey, the sample of respondents should be selected from the population in such a way that the probability of any individual respondent being selected can be estimated. This method allows generalizations to be made about the entire population from the characteristics of the sample and estimates to be made of the likely accuracy of the estimated characteristics of the population based on the size of the sample. The most straightforward approach to obtaining a representative sample from a population is to select the members of the sample randomly from among the members of the population.
However, in practice this is often difficult to achieve, particularly in an airport environment. Furthermore, it has the disadvantage that the sample will include relatively few members of particular subgroups of the population that compose only a small proportion of the total population. To address these concerns, other common sampling methods may be used. These methods are summarized in Table 3-1 and discussed in more detail in the following subsections.

The choice of the appropriate sampling method is partly a question of how best to achieve the desired accuracy of the survey results and partly a consequence of the practicalities of performing the survey. For example, it is common to perform air passenger surveys in departure lounges, because passengers are more willing to be interviewed or fill out a survey form when they are no longer anxious about whether they will make their flight and they are sitting down. However, this locale constrains the sample to those passengers on a set of flights, and does not provide a truly random sample of all passengers using the airport. On the other hand, a mail-back survey sent to the home address of airport employees can sample employees randomly from a list of all employees at the airport.

The sampling method selected will depend on the type of survey, data collection method, and characteristics of the population. Random and sequential sampling are the simplest methods to implement but require large sample sizes to obtain an adequate number of responses from small subgroups of the population, while stratified and cluster sampling can be used, with a limited budget, to improve the accuracy of survey results for different subgroups. Often multi-stage sampling is appropriate. For example, cluster sampling may be used with flights selected using stratified sampling, and passengers on those flights are selected using sequential sampling.
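The multi-stage example above can be sketched in code. The following Python fragment is a minimal illustration only, not a procedure taken from this guidebook: the market names, flight identifiers, passenger lists, and function names are all hypothetical, and it assumes that an ordered list of passengers is available for each selected flight. Flights are first selected randomly within each market stratum; every nth passenger on each selected flight is then chosen, starting from a random point among the first n.

```python
import random

def select_flights_stratified(flights_by_market, flights_per_market, rng):
    """Stage 1: randomly select flights within each market stratum."""
    selected = []
    for market, flights in sorted(flights_by_market.items()):
        k = min(flights_per_market, len(flights))
        selected.extend(rng.sample(flights, k))
    return selected

def select_passengers_sequential(passengers, n, rng):
    """Stage 2: take every nth passenger, starting at a random point in the first n."""
    start = rng.randrange(n)
    return passengers[start::n]

def multi_stage_sample(flights_by_market, passengers_by_flight,
                       flights_per_market, n, seed=None):
    """Return a dict mapping each selected flight to its sampled passengers."""
    rng = random.Random(seed)
    flights = select_flights_stratified(flights_by_market, flights_per_market, rng)
    return {
        flight: select_passengers_sequential(passengers_by_flight[flight], n, rng)
        for flight in flights
    }

# Hypothetical data: three markets, four flights each, 40 passengers per flight.
flights_by_market = {
    "international": ["I1", "I2", "I3", "I4"],
    "domestic short-haul": ["S1", "S2", "S3", "S4"],
    "domestic long-haul": ["L1", "L2", "L3", "L4"],
}
passengers_by_flight = {
    flight: [f"{flight}-pax{i:02d}" for i in range(40)]
    for flights in flights_by_market.values()
    for flight in flights
}

sample = multi_stage_sample(flights_by_market, passengers_by_flight,
                            flights_per_market=2, n=4, seed=1)
for flight, passengers in sample.items():
    print(flight, len(passengers))
```

Selecting the same number of flights from each market is proportional here only because each hypothetical market contains the same number of flights. Where the markets differ in size, either the number of flights selected per market would need to vary or the results would need to be weighted.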
Guidebook for Conducting Airport User Surveys

Table 3-1. Summary of sampling methods.

Type: Random
Method: Respondents are selected randomly from the target population.
Comment: Often difficult to do in airport environment. Need to use some randomizing technique (e.g., use of random number tables). Selection by interviewers can lead to biases if not well trained.

Type: Sequential (Systematic)
Method: Every nth individual is selected when potential respondents are arranged in some order. First respondent should be selected randomly from among the first n individuals.
Comment: Good practical technique for airport surveys. Sample characteristics will be equivalent to a random sample if the order of potential respondents is not related to the variables of interest.

Type: Stratified
Method: Respondents are grouped into homogeneous groups (e.g., different categories of employee). Sampling occurs within each group separately.
Comment: Used to obtain a more representative sample of different groups, particularly if the groups vary in size, or to obtain a specific accuracy in estimates for each group.

Type: Cluster
Method: Respondents are sampled from naturally occurring groups (e.g., flights). A sample of flights are selected, then all or a sample of passengers on those flights are selected.
Comment: Suitable for large surveys where a wide range of flights can be sampled. Can use stratified sampling of flights to obtain a more representative sample.

Type: Non-probability
Method: Respondents are selected on the basis of some criterion that does not allow the probability of sampling any given member of the population to be determined.
Comment: May be useful for gathering information on the range of possible responses, where the frequency with which those responses occur in a defined population is not required.

A controlled sample attempts to design the sampling approach so that the composition of the sample corresponds to the underlying distribution of the population characteristics. This objective is generally satisfied with a truly random sample, provided the sample size is large enough. In practice achieving a truly random sample with airport user surveys is often difficult, as discussed in Section 3.3.1.
In the case of other sampling methods, it is necessary to adjust the sampling rate or define the strata or clusters so that, where the characteristics of the population vary across different subgroups or time periods, those subgroups or time periods are represented in the sample in proportion to their occurrence in the population.

A controlled sample is thus an attribute of a particular sample design rather than a different type of sampling method. If the sample is not controlled so that its composition reflects the underlying distribution of the population characteristics, then the survey results need to be weighted to properly reflect the characteristics of the population.

A cluster sample in which sampled flights are chosen to reflect the proportions of flights in markets that are believed to have different passenger characteristics (e.g., international, domestic short-haul, domestic long-haul, etc.) and passengers are sampled for each flight in proportion to the passengers on the flight would represent a controlled sample. Thus a self-completed air passenger survey in which flights are selected in proportion to the number of flights in broadly defined markets and survey forms are given to every adult passenger on those sampled flights would qualify as a controlled sample.

Because variation in the characteristics of the population over time or across different subgroups will not in general be known until the survey results are obtained, designing a controlled sample means making assumptions about subsets of the population with different characteristics and ensuring that each subset is sampled in proportion to its occurrence in the population. If it turns out that two subsets of the population that were expected to have different characteristics in fact have similar characteristics, the results for the two subsets can be combined. However, if two subsets that in fact have different characteristics are assumed to be similar and are not sampled in proportion to their occurrence in the population, the results will be biased. Weighting of the subgroups will be required to remove this bias.
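The effect of weighting can be illustrated with a short calculation. The Python sketch below uses one common approach, in which each subgroup's weight is its share of the population divided by its share of the sample; it is offered as an illustration under stated assumptions and is not necessarily the expansion procedure used elsewhere in this guidebook. The subgroup names, passenger counts, and responses (check-in bags per passenger) are invented for the example.

```python
def subgroup_weights(population_counts, sample_counts):
    """Weight for each subgroup = population share / sample share."""
    population_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    return {
        group: (population_counts[group] / population_total)
        / (sample_counts[group] / sample_total)
        for group in sample_counts
    }

def weighted_mean(responses, weights):
    """Weighted mean of a list of (subgroup, value) responses."""
    weighted_sum = sum(weights[group] * value for group, value in responses)
    total_weight = sum(weights[group] for group, _ in responses)
    return weighted_sum / total_weight

# Hypothetical population: 80% domestic, 20% international passengers.
population_counts = {"domestic": 8000, "international": 2000}

# Hypothetical sample: 100 responses from each subgroup, so international
# passengers are over-represented relative to the population.
responses = [("domestic", 1.0)] * 100 + [("international", 2.0)] * 100
sample_counts = {"domestic": 100, "international": 100}

weights = subgroup_weights(population_counts, sample_counts)
unweighted = sum(value for _, value in responses) / len(responses)
weighted = weighted_mean(responses, weights)
print(round(unweighted, 2), round(weighted, 2))
```

With these invented figures the unweighted mean is 1.5 bags per passenger, while the weighted mean is 1.2, which matches the population value implied by the assumed shares (0.8 x 1.0 + 0.2 x 2.0).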

Analysis of the results of previous surveys at the airport in question or of surveys conducted at other airports with similar traffic patterns can help identify subsets of the population that are likely to have different characteristics. The design of the controlled sample then attempts to ensure that those subsets are sampled in proportion to their occurrence in the population.

3.3.1 Random Sampling

With random sampling, each individual must have an equal (or at least known) chance of being selected. An example of random sampling would be a tenant survey where a list of all airport tenants is assembled and a table of random numbers is used to select individual tenants from the list.

The sampling approach will generally ensure that no individual can be in the sample more than once. For air passenger surveys, the sample size is typically so small relative to the population and the methodology is such that there is very little likelihood of surveying the same person twice.[5] For most other airport user surveys, the methodology precludes sampling the same respondent twice.

Obtaining a truly random sample is often difficult, particularly for airport surveys. For example, identifying each member of the population to include in the sampling process, then applying a method for randomly selecting them can be difficult, if not impossible, in an airport departure lounge. There are also problems associated with having surveyors select passengers to survey; this introduces a human element and invariably leads to biases. To avoid this, random numbers or sequential sampling, discussed in the following subsection, should be used for selecting individuals to survey.

Interviews at groundside locations such as curb areas and parking lots, where the next available passenger is surveyed once an interview has been completed, are equivalent to random sample surveys as long as the ratio of interviews to passengers is fairly constant.
However, such an approach will clearly change the sampling rate as the passenger flow changes. During periods of very low flow, every passenger might be interviewed, while during periods of high flow only a small proportion of passengers would be interviewed, and this should be taken into account in analyzing the results.

3.3.2 Sequential Sampling

Sequential sampling is generally a good form of sampling for use in airport surveys. With sequential sampling, also referred to as systematic sampling, the population is arranged in some logical order and every nth individual is selected, starting with a randomly selected individual from the first n individuals. An example of sequential sampling is to survey every fourth passenger in a check-in queue.

Sequential sampling is usually easier to apply than random sampling and will yield a random sample if the order of individuals in the list is essentially random with respect to the characteristics being measured in the survey.[6] For example, there is no reason to think that the order in which people sit in a departure lounge has any systematic relationship to the characteristics being measured (such as their trip purpose or how they got to the airport), and therefore selecting every nth person is in effect a random sample. Of course, depending on the layout of the lounge, early arriving passengers or those with difficulty walking may sit closer to the boarding point, while later arriving passengers may have to use seats further away. However, as long as all passengers in the lounge are included in the sampling strategy, where they sit will not affect their chance of being sampled.

[5] If an individual gets surveyed twice on two different trips, that is not the same thing as surveying the same traveler twice on the same trip. The former should be valid as the sample being drawn is really of passenger trips, not of passengers, and a single passenger may make more than one trip during the survey period.

[6] Serious biases can occur if a characteristic of interest occurs in a cyclic order in the population list and the length of each cycle corresponds to the sampling fraction, but this phenomenon would be rare in airport surveys.
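The selection rule for sequential sampling, and the cyclic-order problem described in footnote 6, can be illustrated as follows. This Python sketch is illustrative only: the function name and the two artificial population lists are assumptions made for the example. In both lists 25 percent of individuals have the characteristic of interest (coded 1), but in the second list the characteristic recurs in a cycle whose length equals the sampling interval.

```python
import random

def sequential_sample(population, n, rng):
    """Select every nth individual, starting at a random point in the first n."""
    start = rng.randrange(n)
    return population[start::n]

rng = random.Random(0)

# List 1: individuals with the characteristic (1) appear in random order.
random_order = [1] * 250 + [0] * 750
rng.shuffle(random_order)

# List 2: every fourth individual has the characteristic, so the cycle
# length (4) is the same as the sampling interval used below.
cyclic_order = [1, 0, 0, 0] * 250

sample_random = sequential_sample(random_order, 4, rng)
sample_cyclic = sequential_sample(cyclic_order, 4, rng)

share_random = sum(sample_random) / len(sample_random)
share_cyclic = sum(sample_cyclic) / len(sample_cyclic)
print(share_random, share_cyclic)
```

For the randomly ordered list the sampled share should be close to the true 25 percent. For the cyclic list the sampled share is either 0 or 100 percent, depending on the random starting point, which is the bias the footnote warns about.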

Where the population list is ordered by a relevant characteristic, the use of sequential sampling will often result in a sample with a more representative range of characteristics than using random sampling. For example, in selecting flights to survey, if all flights during the survey period are listed in order of flight stage length, the resulting sample would likely better reflect passenger characteristics such as destination city or region than a random sample, as sequential sampling ensures a more even spread of flights by stage length and thus over destinations and regions. With random sampling, some subgroups of the population (flights with a particular stage length in the above example) may be missed completely and others may be over-sampled.

One common application of sequential sampling in air passenger surveys is to list flights by departure time (and destination to resolve flights with the same departure time) and select every nth flight to survey. A variation on this approach is to list the number of seats on each flight and calculate the cumulative total number of seats for each flight (the total number of seats on previous flights on the list plus the number on the current flight). Flights are then selected by identifying the flight that corresponds to every mth seat on the cumulative list. This ensures that the probability of a given flight being sampled is proportional to the size of the aircraft, which approximates a random sample of air passengers if the same number of passengers is interviewed for each flight.[7]

3.3.3 Stratified Sampling

In stratified sampling, the population is divided into mutually exclusive groups (strata) and individuals within each group are randomly sampled. Groups should be selected so that they are homogeneous with respect to the variables being studied (there is low variation within the groups), but so that the variation in the relevant variables is large between groups.
For example, in a survey to determine passenger spending at airport concessions, passengers taking short-haul domestic flights are likely to spend much less than passengers taking long-haul international flights. The variation in spending among short-haul domestic passengers and among long-haul international passengers is likely to be less than the variation in spending between the two groups. If the criterion for stratification is highly correlated to the variable being studied, such as in this example, the gain in accuracy can be significant.

Examples of stratified sampling include dividing flights into groups (such as international and domestic short and long haul, or by region) and dividing passengers to be sampled into groups based on day of the week, time period during the day, and airport terminal used.

The variable used for stratifying the population must be known for all individuals in the population. Once the survey population has been stratified into groups, simple random or sequential sampling is used to select individuals from each group.

With proportional stratified sampling, the proportions of individuals surveyed in each group are equal. This form of sampling is often used to assure a more representative sample than simple random or sequential sampling.

In non-proportional stratified sampling, different sampling fractions are used to improve the accuracy of estimates for a given overall sample size. Situations where non-proportional sampling is desirable include the following:

- Where the variation in the variables being studied differs greatly between groups. The non-homogeneous groups (with a high variation in the variables of interest) should have a larger sample than the homogeneous groups. For example, consider a survey conducted to determine the average number of check-in bags per passenger where a stratified sample is to be drawn with flights grouped into long- and short-haul domestic and international flights. If it is known that the variation in the number of check-in bags is greater for passengers on long-haul international flights than short-haul domestic flights, the sampling fraction would be higher for the long-haul international flights.
- Where comparisons of distinct subgroups of the population are required, for example comparisons between domestic and international passengers.
- Where the cost of collecting the data differs greatly between groups. Here, overall accuracy for a given cost can be improved by having a lower sampling fraction for the groups with high data collection costs. However, while this may lead to a higher overall accuracy for the pooled data, when the characteristics of subgroups need to be considered, as is almost always the case in airport surveys, it can lead to very different accuracy for the various subgroups. Thus the approach of reducing the sampling fraction for groups with higher data collection costs is not generally recommended for airport surveys.

Expanding the sample results of non-proportional stratified sampling to determine estimates for the population is not as straightforward as with proportional stratified sampling, and is discussed in Appendix B. If non-proportional stratified sampling is appropriate, it is suggested that the planning team either become knowledgeable on the subject (refer to the Bibliography for appropriate guidance) or consider using external expertise.

[7] This method assumes that the load factor (the ratio of passengers to seats) does not vary significantly across flights. Where this is not the case, and some classes of flight have a higher average load factor than others, an adjustment to the number of passengers interviewed on each flight may be required to approximate a random sample.

3.3.4 Cluster Sampling

With cluster sampling, the population is distributed in a large number of naturally occurring groups, for example passengers on flights. The groups, or clusters, are sampled, thus not all clusters are included in the sample. This is the primary difference from stratified sampling where individuals are sampled from every group. In the simplest form, all individuals within a cluster are sampled.
When clusters are homogeneous, it is more efficient to sample only a fraction of the individuals within a cluster, and to sample more clusters. Cluster sampling is used to make sampling easier and less costly by limiting the survey to well-defined groups, such as passengers on specific flights, and works well when the characteristics of interest have low variability between clusters and high variability within clusters. For example, although the household income of passengers on a given flight will span a wide range, the average household income of passengers on different flights will show much less variability.

The accuracy of estimates made using cluster sampling will almost always be lower than if a random sample is used with the same sample size, because the selected clusters may not be fully representative of the target population as a whole, and can be significantly lower if variability between clusters is high and/or a small number of clusters are selected. It is important that the consequences of the design of the cluster sample (often referred to as the design effect) are incorporated into the analysis when evaluating the accuracy of estimates and required sample sizes. Details of how to calculate sample sizes and confidence intervals for cluster samples are included in Appendix B.

A common example of cluster sampling in airport surveys is the use of individual flights as clusters, with the flights to be surveyed being selected using random, sequential or stratified sampling. Then either all passengers on each selected flight or a sample of passengers on those flights are surveyed.

3.3.5 Non-Probability Sampling

Non-probability (or uncontrolled) sampling is where the probability of an individual's selection cannot be determined. Examples of non-probability sampling include the following:

- Surveys of passengers who ask for help at an airport information booth, where no record is kept of the number of passengers seeking help at the booth.
- Voluntary Web-based surveys where all visitors to the site are invited to complete a survey.