National Academies Press: OpenBook

Guidebook for Conducting Airport User Surveys (2009)

Chapter: Appendix B - Sample Sizes, Sample Estimates, and Confidence Intervals

Suggested Citation:"Appendix B - Sample Sizes, Sample Estimates, and Confidence Intervals." National Academies of Sciences, Engineering, and Medicine. 2009. Guidebook for Conducting Airport User Surveys. Washington, DC: The National Academies Press. doi: 10.17226/14333.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Appendix B: Sample Sizes, Sample Estimates, and Confidence Intervals

This appendix outlines how to determine the sample size so that the confidence interval of a sample estimate falls within a specified range and, when analyzing the results, how to estimate values and confidence intervals for characteristics of the population. Methods for determining sample sizes and confidence intervals are provided for the four types of probability sampling: random, sequential, stratified, and cluster sampling. The description is only a brief summary of the methods as they apply to airport surveys, and the reader is encouraged to refer to the statistical texts listed in the bibliography for a more complete description. Examples are then provided of applying these methods to determine sample sizes for small populations, such as employee or tenant surveys, and for passenger surveys using each of the possible sampling strategies.

Throughout this section, it is assumed that the sample size will be large enough that the sample means are approximately Normally distributed, per the Central Limit Theorem. A sample size of at least 30 is generally considered to provide a reasonably good approximation to the Normal distribution. Most airport user surveys will have a much larger sample size than this.

Random Sampling

Suppose we are interested in a characteristic of the population, which we denote by $X$. For example, $X$ could be the variable gender, where:

$$X = \begin{cases} 0 & \text{if the passenger is male} \\ 1 & \text{if the passenger is female} \end{cases}$$

Let the mean value of this characteristic be $\mu$. In the example, $\mu$ is the proportion of passengers who are female.

With random sampling, the sample mean is an unbiased estimator of the population mean, $\mu$. If the sample size is $n$, the sample mean is:

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{1}$$

where
$\bar{X}$ is the sample mean
$x_i$ is the value of $X$ for the $i$th individual in the sample, $i = 1, \ldots, n$.

The 95% confidence interval for $\mu$ (the range that contains the true value of $\mu$ with a probability of 0.95) is given by:

$$\bar{X} \pm 1.96\,\sigma_{\bar{X}}$$

where $\sigma_{\bar{X}}$ is the standard deviation of the sample mean. For random sampling without replacement, it can be shown that:

$$\sigma_{\bar{X}} = \sigma_X \sqrt{\frac{1 - n/N}{n}}$$

where
$\sigma_X$ is the standard deviation of the variable $X$
$N$ is the number of individuals in the population.

Hence the 95% confidence interval for $\mu$ is given by:

$$\bar{X} \pm 1.96\,\sigma_X \sqrt{\frac{1 - n/N}{n}} \tag{2}$$

For very large populations, $(1 - n/N)$ is approximately one, and this reduces to:

$$\bar{X} \pm 1.96\,\frac{\sigma_X}{\sqrt{n}}$$

The constant 1.96 applies to a 95% confidence interval. For a 99% confidence interval, the constant should be 2.58, and for a 90% confidence interval, use 1.65.

If the sample size is to be chosen so that the sample mean is within $w$ of the population mean, $\mu$, with a probability of 0.95, then the 95% confidence interval is given by:

$$\bar{X} \pm w$$

(note that the width of the confidence interval is $2w$). Thus,

$$w = 1.96\,\sigma_X \sqrt{\frac{1 - n/N}{n}}$$

This can be solved to give the sample size, $n$:

$$n = \frac{1.96^2\,\sigma_X^2}{w^2 + 1.96^2\,\sigma_X^2 / N} \tag{3}$$

In the survey planning stage, when trying to determine the sample size, the standard deviation of the variable $X$, $\sigma_X$, will be unknown and must be estimated from data from previous surveys, other similar airports, or knowledge of the airport.

For categorical variables, the standard deviation is related to the expected proportion of individuals in a category:

$$\sigma_X = \sqrt{p(1 - p)}$$

$$w = 1.96 \sqrt{\frac{p(1 - p)(1 - n/N)}{n}}$$

where $p$ is the proportion of the population in the category of interest.

For the example above, where the categorical variable is the respondent's gender, assume that we are interested in estimating the proportion of female passengers. Thus, $p$ is the proportion of female passengers. If we initially estimate this proportion to be 0.5, then for a sample size of $n = 400$ and a population size of $N = 50{,}000$, the half-width of the 95% confidence interval, $w$, is 0.049 (that is, the interval is the estimate ± 0.049). The sample size can be chosen to obtain the required accuracy using Equation 3.

If $X$ represents the mode of transport to the airport, and the category "taxi or limousine" has an estimated proportion of 0.20, then the half-width of the 95% confidence interval, $w$, is 0.039 for a sample size of $n = 400$ and a population size of $N = 50{,}000$.

For a non-categorical variable, such as time in the terminal before departure or expenditure at the concessions, the standard deviation is determined from the variation of the variable. Again, this will be unknown at the planning stage of the survey. If an estimate of the standard deviation, $\sigma_X$, can be obtained from past surveys, other similar airports, or knowledge of the airport, the required sample size for a given confidence interval width can be determined in a similar way using Equation 2.

Once the survey has been completed, the standard deviation of $X$, $\sigma_X$, can be estimated from the sample values of $X$ for use in determining the accuracy of the estimated population mean and confidence intervals. The sample standard deviation is the square root of the sample variance, $s_X^2$, which is given by:

$$s_X^2 = \frac{\sum_{i} \left(x_i - \bar{X}\right)^2}{n - 1}$$

where $\bar{X}$ is the sample average.

Sequential Sampling

Sequential sampling is equivalent to random sampling if the order of the sample with respect to the characteristics of interest is essentially random. In this case, sample sizes and estimates of standard deviation and confidence intervals are determined in the same way as outlined above for random sampling.

Sequential sampling can result in a more representative sample, and thus narrower confidence intervals, by ensuring a more even spread of sampled individuals over the population (provided the cases where serious biases can occur are avoided [31]). Calculating the confidence intervals and required sample sizes is difficult and depends on the relationships between the ordering variable and the characteristics of the population of interest. In airport surveys, it can usually be assumed that the confidence interval is no wider than that which would occur with random sampling, and the expressions given for random sampling above can be used.

[31] Serious biases can occur if a characteristic of interest occurs in a cyclic order in the population list and the length of each cycle corresponds to the sampling fraction, but this would be rare in airport surveys.

Stratified Sampling

With stratified sampling, the population is divided into groups, referred to as strata, and each stratum is sampled separately using random or sequential sampling. Assume there are $n_s$ strata and the proportion of the population in the $i$th stratum is $W_i$.

The sample mean for the $i$th stratum is:

$$\bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij} \tag{4}$$

and the sample mean for the population is estimated by:

$$\bar{X} = \sum_{i=1}^{n_s} W_i \bar{X}_i \tag{5}$$

where
$\bar{X}_i$ is the sample mean of stratum $i$
$x_{ij}$ is the value of characteristic $X$ for individual $j$ in stratum $i$
$n_i$ is the number of individuals sampled in stratum $i$
$W_i$ is the proportion of the population in stratum $i$, $N_i / N$.

The standard deviation of the sample mean, $\bar{X}$, is given by:

$$\sigma_{\bar{X}} = \sqrt{\sum_{i} W_i^2\,\sigma_{\bar{X}_i}^2}$$

where $\sigma_{\bar{X}_i}$ is the standard deviation of the mean for members of stratum $i$, $\bar{X}_i$, and is given by:

$$\sigma_{\bar{X}_i} = \sqrt{\sigma_{X_i}^2\,\frac{1 - n_i/N_i}{n_i}}$$

where
$\sigma_{X_i}$ is the standard deviation of $X$ for members of stratum $i$
$n_i$ and $N_i$ are the sample and population sizes for stratum $i$, respectively.

Hence the standard deviation of the sample mean can be expressed as:

$$\sigma_{\bar{X}} = \sqrt{\sum_{i} W_i^2\,\sigma_{X_i}^2\,\frac{1 - n_i/N_i}{n_i}}$$

The accuracy of the sample estimate is improved if the variation within each stratum (measured by $\sigma_{X_i}$) is less than the variation over the whole population.

As above, the half-width, $w$, of the 95% confidence interval (total width $2w$), such that the sample mean is within $w$ of the population mean, $\mu$, is given by:

$$w = 1.96\,\sigma_{\bar{X}} \tag{6}$$

With proportional stratified sampling, the sampling fraction is the same for each stratum, so the proportion of the sample drawn from each stratum equals the proportion of the population in that stratum, $W_i$. Thus,

$$\frac{n_i}{N_i} = \frac{n}{N} \quad \text{and} \quad W_i = \frac{n_i}{n} = \frac{N_i}{N}$$

where $n$ and $N$ are the total sample and population sizes, respectively.

In this case, the overall sample mean given by Equation 5 reduces to that for a random sample, Equation 1, and

$$\sigma_{\bar{X}} = \sqrt{\frac{1 - n/N}{n} \sum_{i} W_i\,\sigma_{X_i}^2} \tag{7}$$

To determine the required sample size, estimates of $\sigma_{X_i}$ are needed in the survey planning stage. These must be estimated from the results of previous surveys, other similar airports, or knowledge of the airport. Once approximate estimates of $\sigma_{X_i}$ have been obtained, the total sample size, $n$, for a confidence interval of width $2w$ is found using the relationship (from Equations 6 and 7):

$$w = 1.96 \sqrt{\frac{1 - n/N}{n} \sum_{i} W_i\,\sigma_{X_i}^2}$$

which can be solved to give the sample size, $n$, as:

$$n = \frac{1.96^2 \sum_{i} W_i\,\sigma_{X_i}^2}{w^2 + 1.96^2 \sum_{i} W_i\,\sigma_{X_i}^2 / N} \tag{8}$$

If separate estimates for each stratum cannot be obtained in the planning stage, an approximate estimate of the standard deviation of $X$ over the total population could be used to provide a conservative estimate of the sample size, $n$.

For proportional stratified samples, the sample size for each stratum, $n_i$, is then found by:

$$n_i = n W_i$$
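The step from Equations 6 and 7 to Equation 8 is stated without working. One way to see it is sketched below, where $S$ is shorthand introduced here (it is not a symbol used in the guidebook) for the weighted within-stratum variance.

```latex
\begin{align*}
S &= \sum_i W_i\,\sigma_{X_i}^2 \\
w^2 &= 1.96^2 \left(\frac{1}{n} - \frac{1}{N}\right) S
  && \text{(square Equations 6 and 7; } \tfrac{1 - n/N}{n} = \tfrac{1}{n} - \tfrac{1}{N}\text{)} \\
\frac{1}{n} &= \frac{w^2}{1.96^2\,S} + \frac{1}{N} \\
n &= \frac{1.96^2\,S}{w^2 + 1.96^2\,S / N}
  && \text{(Equation 8)}
\end{align*}
```

Replacing $S$ with $\sigma_X^2$ gives the same algebra for random sampling, which is why Equation 3 and Equation 8 have the same form.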

For determining confidence intervals of the final estimates using the survey data, the standard deviation of $X$ in each stratum, $\sigma_{X_i}$, can be estimated by the sample standard deviation, $s_{X_i}$, determined from the sample values in each stratum. The 95% confidence interval is then:

$$\bar{X} \pm 1.96 \sqrt{\frac{1 - n/N}{n} \sum_{i} W_i\,s_{X_i}^2} \tag{9}$$

With non-proportional stratified sampling, the sampling fractions differ for each stratum. The sample mean for each stratum and the overall sample mean can be found using Equations 4 and 5, respectively. The sample sizes, $n_i$, are chosen so that a confidence interval of width $2w$ is given by the relationship:

$$w = 1.96 \sqrt{\sum_{i} W_i^2\,\sigma_{X_i}^2\,\frac{1 - n_i/N_i}{n_i}} \tag{10}$$

In this case, many different combinations of $n_i$ could be chosen to produce a confidence interval width of $2w$. The choice of $n_i$ depends on the reason for choosing non-proportional stratified sampling.

To achieve a similar level of accuracy in the results for each stratum, it will be necessary to use non-proportional stratified sampling, with the sample size in each stratum inversely proportional to the variance of the characteristic within that stratum. Thus, $n_i = k / \sigma_{X_i}^2$, where $k$ is a constant. Substituting this into Equation 10 and solving for $k$ gives:

$$k = \frac{\sum_{i} W_i^2 \left(\sigma_{X_i}^2\right)^2}{\dfrac{w^2}{1.96^2} + \sum_{i} \dfrac{W_i^2\,\sigma_{X_i}^2}{N_i}}$$

The sample size for each stratum is then found using the relationship:

$$n_i = k / \sigma_{X_i}^2$$

Because the actual variance of the characteristic in the survey responses for each stratum, $\sigma_{X_i}^2$, will not be known until the survey has been performed, it will be necessary to make an initial assumption about the differences in variance across the strata in order to determine the proportion of the survey responses to assign to each stratum. These assumptions can be based on the results of prior surveys or of surveys performed at similar airports.

As before, confidence intervals for the final estimates can be determined using the survey data. The standard deviation of $X$ in each stratum, $\sigma_{X_i}$, can be estimated by the sample standard deviation for that stratum, $s_{X_i}$, determined from the sample values. The 95% confidence interval is then:

$$\bar{X} \pm 1.96 \sqrt{\sum_{i} W_i^2\,s_{X_i}^2\,\frac{1 - n_i/N_i}{n_i}} \tag{11}$$

Cluster Sampling

As described in Section 3.2 of the guidebook, with cluster sampling the population is divided into clusters (or groups), and clusters rather than individual members of the population are sampled. In single-stage cluster sampling, all individuals in the cluster are sampled, so a complete picture of the population within the sampled clusters is obtained. For example, in a survey of departing passengers, flights could be used as clusters; a sample of flights would then be chosen and all passengers on those flights would be surveyed. If two-stage cluster sampling is used, individuals within each cluster are also sampled.

Let
$N$ be the number of clusters in the population
$n$ be the number of clusters sampled
$M_k$ be the number of individuals in the $k$th cluster in the population
$m_k$ be the number of individuals in the $k$th cluster included in the sample
$x_{kj}$ be the value of $X$ for the $j$th individual in the $k$th cluster.

The average number of individuals per cluster is the total population size divided by the number of clusters:

$$\bar{M} = \frac{1}{N} \sum_{k=1}^{N} M_k \tag{12}$$

The sample mean for the $k$th cluster is:

$$\bar{X}_k = \frac{1}{m_k} \sum_{j=1}^{m_k} x_{kj} \tag{13}$$

The population mean is estimated by:

$$\bar{X} = \frac{1}{n} \sum_{k=1}^{n} \frac{M_k}{\bar{M}}\,\bar{X}_k \tag{14}$$

When the numbers of individuals in each cluster are equal, $M_k = \bar{M}$, and the mean for the population reduces to simply the average of the cluster means, $\bar{X}_k$.

Where the clusters to be sampled are drawn randomly from the population of all clusters, and a sample of individuals is drawn randomly from each cluster, the variance of the sample mean, $\bar{X}$, includes two components of variation, the between-cluster and within-cluster components, and for a categorical variable is estimated by:

$$\sigma_{\bar{X}}^2 = \underbrace{\left(1 - \frac{n}{N}\right) \frac{\sigma_c^2}{n}}_{\text{between cluster component}} + \underbrace{\frac{n}{N} \sum_{k=1}^{n} \frac{\left(1 - m_k/M_k\right) p_k (1 - p_k)}{n^2 \left(m_k - 1\right)}}_{\text{within cluster component}} \tag{15}$$

where
$p$ is the proportion of the total population in the category of interest
$p_k$ is the proportion of individuals in cluster $k$ in the category of interest
$\sigma_c^2$ is the variance of the cluster means around the population mean and can be estimated by:

$$\sigma_c^2 = \frac{\sum_{k=1}^{n} \left(\bar{X}_k - \bar{X}\right)^2}{n - 1}$$

If all individuals in each selected cluster are sampled, $m_k = M_k$, and the variance is given by the "between cluster component" term only.

Clusters could be selected using stratified sampling to reduce the variance between clusters within a given stratum and thus improve the accuracy of the estimate. For example, where the clusters are flights, flights could be stratified into groups such as domestic and international, and short- and long-haul. Assume that the clusters are stratified into $n_s$ strata and that in each stratum $n_i$ clusters are sampled. The population mean is estimated by:

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n_s} \sum_{k=1}^{n_i} \frac{M_{ik}}{\bar{M}}\,\bar{X}_{ik} \tag{16}$$

where
$n_i$ is the number of clusters sampled in the $i$th stratum
$M_{ik}$ is the number of individuals in the $k$th cluster in the $i$th stratum
$\bar{X}_{ik}$ is the mean of individuals in the $k$th cluster in the $i$th stratum
$n$ is the total number of clusters sampled over all strata.
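For the unstratified case, Equations 14 and 15 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names are invented here, the two flights are hypothetical data, and the variable is categorical so each cluster mean is the proportion $p_k$.

```python
from dataclasses import dataclass


@dataclass
class SampledCluster:
    M: int    # individuals in the cluster, M_k
    m: int    # individuals sampled from the cluster, m_k
    p: float  # sample proportion in the category of interest, p_k


def cluster_estimate(clusters, N, M_bar):
    """Return (mean, variance) following Equations 14 and 15.

    N is the number of clusters in the population and M_bar the average
    cluster size (Equation 12). Needs at least two sampled clusters,
    each with m >= 2.
    """
    n = len(clusters)
    # Equation 14: size-weighted average of the cluster means
    mean = sum(c.M / M_bar * c.p for c in clusters) / n
    # Variance of the cluster means around the estimated mean
    var_c = sum((c.p - mean) ** 2 for c in clusters) / (n - 1)
    between = (1 - n / N) * var_c / n
    within = (n / N) * sum(
        (1 - c.m / c.M) * c.p * (1 - c.p) / (n ** 2 * (c.m - 1))
        for c in clusters
    )
    return mean, between + within


# Hypothetical flights, for illustration only: every passenger surveyed
flights = [SampledCluster(M=100, m=100, p=0.4), SampledCluster(M=100, m=100, p=0.6)]
mean, variance = cluster_estimate(flights, N=10, M_bar=100)
print(mean, variance)
```

Because every passenger on the two hypothetical flights is surveyed, the within-cluster term vanishes and only the between-cluster component contributes, as the text notes for $m_k = M_k$.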

With proportional stratified sampling of clusters, the between-cluster variance component of Equation 15 becomes:

$$\mathrm{Var}\left(\bar{X}_c\right) = \left(1 - \frac{n}{N}\right) \frac{1}{n} \sum_{i=1}^{n_s} W_i\,\sigma_{ci}^2 \tag{17}$$

where
$n$ is the number of clusters sampled (over all strata)
$N$ is the number of clusters in the population (over all strata)
$\sigma_{ci}^2$ is the variance of cluster means for clusters in stratum $i$
$W_i$ is the proportion of clusters sampled that are in stratum $i$ ($= n_i / n = N_i / N$ for proportional stratified sampling).

Assuming that the clusters within each stratum have the same size, $M_i$, the same sample size, $m_i$, and the same proportion of individuals in the category of interest, $p_i$, the within-cluster variance component becomes:

$$\begin{aligned}
\mathrm{Var}\left(\bar{X}_c\right) &= \frac{n}{N} \sum_{i=1}^{n_s} \sum_{k=1}^{n_i} \frac{\left(1 - m_i/M_i\right) p_i (1 - p_i)}{n^2 \left(m_i - 1\right)} \\
&= \frac{n}{N} \sum_{i=1}^{n_s} n_i\,\frac{\left(1 - m_i/M_i\right) p_i (1 - p_i)}{n^2 \left(m_i - 1\right)} \\
&= \frac{1}{N} (1 - f) \sum_{i=1}^{n_s} W_i\,\frac{p_i (1 - p_i)}{m_i - 1}
\end{aligned} \tag{18}$$

where $f$ is the fraction of passengers sampled and is assumed to be constant in all clusters.

The calculation of the accuracy of estimates, required sample sizes, and confidence intervals is complex, and the reader is referred to Levy and Lemeshow, Sampling of Populations, and Cochran, Sampling Techniques, listed in the bibliography. Comprehensive statistical software programs exist that would be useful for analyzing data from cluster sampling.

It should be noted, however, that cluster sampling is less efficient than random, sequential, and stratified sampling, and larger sample sizes will be required to obtain the same levels of accuracy. Using the expressions applicable to random sampling will underestimate the true standard errors of estimated population characteristics and the associated confidence intervals. Preferably, clusters should be chosen so that variation in the characteristics of interest is small between clusters but large within clusters.

In airport surveys, cluster sampling is commonly used for sampling flights in departing passenger surveys. For many characteristics, such as trip duration, airfares, trip purpose, time at airport, spending in airport concessions, and of course destination and sector, passengers on the same flight will be more likely to have similar values than passengers in general. This homogeneity of characteristics within a flight significantly reduces the efficiency of cluster sampling for analyzing these characteristics.

There are relatively few air party characteristics that have a similar distribution across different flights. For characteristics such as household size, the variation across passengers on one flight is likely to be fairly similar to that across all passengers. In this case, use of cluster sampling should not greatly reduce the sampling efficiency.

It can be shown that the variance of the estimated mean of the characteristic of interest for cluster sampling can be expressed, approximately, as a function of the variance for random sampling:

$$\sigma_{\bar{X}C}^2 = \sigma_{\bar{X}R}^2 \left[1 + \rho \left(m_{av} - 1\right)\right] \tag{19}$$

where
$\sigma_{\bar{X}C}^2$ is the variance using cluster sampling
$\sigma_{\bar{X}R}^2$ is the variance using random sampling

$\rho$ represents the population intra-class correlation
$m_{av}$ is the mean number of cases sampled per cluster.

The variance using cluster sampling will be greater than that using random sampling unless either $m_{av} = 1$ or $\rho \le 0$. The case $m_{av} = 1$ corresponds to the special case where each cluster consists of a single case and is equivalent to random sampling. The intra-class correlation, $\rho$, is a measure of homogeneity; if individuals in a cluster are more homogeneous than the population as a whole, $\rho$ will be greater than zero.

The ratio of the variances, $\sigma_{\bar{X}C}^2 / \sigma_{\bar{X}R}^2$, is often referred to as the design effect, DE, and is given by:

$$DE = 1 + \rho \left(m_{av} - 1\right) \tag{20}$$

The effective sample size is given by $m_{av}\,n / DE$.

Examples of Calculation of Sample Sizes

Small Population Size: Using Random or Sequential Sampling for Categorical Variables

In this example, the sample size is required for a survey of a relatively small population, such as an employee or tenant survey, using random sampling. The critical characteristics of the population being determined are categorical variables, e.g., the percentage of employees accessing the airport by private vehicle. The sample sizes depend on the expected proportion in the category and the level of accuracy desired.

Sample sizes were determined using Equation 3 for a range of population sizes; for two values of the expected proportion in the category, 50% and 20%; and for three levels of accuracy: ±5 percentage points, ±3 percentage points, and ±2 percentage points (the last for the expected proportion of 20% only). As discussed in Section 3.2 of the guidebook, an error of ±5 percentage points represents a percentage error in the category proportion of 10% for an expected proportion of 50% and 25% for an expected proportion of 20%. As is evident in Table B-1, large sample sizes are required if an accuracy of 10% is required for categories with low proportions of the population.

Table B-1. Sample sizes required for categorical variable with expected proportions of 50% and 20% and varying levels of accuracy.

Sample size, n, required for each expected proportion in category and accuracy (equivalent percentage of the proportion in parentheses):

| Total Number in Population, N | 50%, ±5 Percentage Points (±10% of Proportion) | 50%, ±3 Percentage Points (±6% of Proportion) | 20%, ±5 Percentage Points (±25% of Proportion) | 20%, ±3 Percentage Points (±15% of Proportion) | 20%, ±2 Percentage Points (±10% of Proportion) |
| 50 | 44 | 48 | 42 | 47 | 48 |
| 100 | 80 | 92 | 71 | 87 | 94 |
| 200 | 132 | 168 | 110 | 155 | 177 |
| 500 | 218 | 340 | 165 | 290 | 377 |
| 1,000 | 280 | 515 | 200 | 405 | 605 |
| 5,000 | 360 | 880 | 235 | 600 | 1,175 |

Note: Assumes random sampling without replacement.
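The entries in Table B-1 follow from Equation 3 with $\sigma_X^2 = p(1 - p)$. The Python sketch below (function names are illustrative, not from the guidebook) evaluates Equation 3 and the half-width expression from the Random Sampling section. It prints unrounded values; the published table appears to round these, in some rows up to a convenient figure, so small differences are to be expected.

```python
import math

Z_95 = 1.96  # constant for a 95% confidence interval, as in the text


def half_width(p, n, N, z=Z_95):
    """Half-width w of the confidence interval for a proportion p."""
    return z * math.sqrt(p * (1 - p) * (1 - n / N) / n)


def sample_size(p, w, N, z=Z_95):
    """Equation 3 with sigma_X^2 = p(1 - p); returns the unrounded n."""
    variance = p * (1 - p)
    return z ** 2 * variance / (w ** 2 + z ** 2 * variance / N)


# Columns of Table B-1: (expected proportion, accuracy as a proportion)
cases = [(0.5, 0.05), (0.5, 0.03), (0.2, 0.05), (0.2, 0.03), (0.2, 0.02)]
for N in (50, 100, 200, 500, 1000, 5000):
    print(N, [round(sample_size(p, w, N), 1) for p, w in cases])

# Worked examples from the Random Sampling section (n = 400, N = 50,000)
print(round(half_width(0.5, 400, 50_000), 3), round(half_width(0.2, 400, 50_000), 3))
```

Keeping the rounding out of `sample_size` leaves the choice of rounding rule (nearest, up, or up to a convenient number) to the survey planner.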

Passenger Survey: Using Random, Stratified, and Cluster Sampling

A survey of air passengers is to be undertaken to obtain information on airport access trips. A critical question to be answered is (say): What is the percentage of passengers dropped off at the curb outside departures check-in? It was decided to determine the sample sizes for each of the different sampling types and choose the most cost-effective method.

It is known that the percentage dropped off at the curb varies greatly by characteristics such as trip purpose, flight sector (e.g., short-haul domestic, long-haul domestic, international), day of the week, and time of day. The trip purpose distribution of passengers is not known at the sampling stage and so could not be used. In addition to random sampling, stratified sampling with passengers stratified by flight sector or day of the week, and cluster sampling with flights stratified by flight sector, were examined.

The survey is planned to be conducted during a two-week period. The flight schedule is obtained from the Official Airline Guide, and the numbers of flights per sector by day of the week and the estimated average number of originating passengers per flight (estimated using average load factors and percentages of connecting passengers) are as in Table B-2.

Table B-2. Assumed number of flights and average number of originating passengers per flight by market sector and day of week.

| Day of Week | Short-Haul Domestic | Long-Haul Domestic | International | Total |
| Monday | 60 | 20 | 8 | 88 |
| Tuesday | 60 | 20 | 8 | 88 |
| Wednesday | 60 | 20 | 8 | 88 |
| Thursday | 60 | 20 | 9 | 89 |
| Friday | 60 | 20 | 12 | 92 |
| Saturday | 50 | 16 | 12 | 78 |
| Sunday | 55 | 20 | 12 | 87 |
| Total | 405 | 136 | 69 | 610 |
| Avg. Originating Passengers/Flight | 50 | 120 | 170 | 79.2 |

To determine the sample size required, it is necessary to have at least approximate estimates of the mean and standard deviation of the variable of interest, in this case the percentage of passengers dropped off at the curb. From knowledge of passengers using the airport, the percentage of passengers dropped off at the curb was estimated as shown in Table B-3.

Table B-3. Estimated percentage of passengers dropped off at the curb.

| Sector of Flight | Passengers Dropped Off at Curb: Mean, p | Passengers Dropped Off at Curb: Standard Deviation | Total Pass. |
| Short-Haul Domestic | 40% | 0.490 | 20,250 |
| Long-Haul Domestic | 60% | 0.490 | 16,320 |
| International | 90% | 0.300 | 11,730 |
| Overall | 58.9% | 0.492 | 48,300 |
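The "Overall" row of Table B-3 can be checked directly from the sector rows. The short Python sketch below assumes the categorical-variable relationship $\sigma = \sqrt{p(1 - p)}$ given earlier in the appendix; the variable names are illustrative.

```python
import math

# Sector inputs from Table B-3: (proportion dropped at curb, originating passengers)
sectors = {
    "Short-Haul Domestic": (0.40, 20_250),
    "Long-Haul Domestic": (0.60, 16_320),
    "International": (0.90, 11_730),
}

total = sum(passengers for _, passengers in sectors.values())
# Overall mean: average of the sector means weighted by passenger numbers
p_overall = sum(p * passengers for p, passengers in sectors.values()) / total
sd_overall = math.sqrt(p_overall * (1 - p_overall))

for name, (p, _) in sectors.items():
    print(f"{name}: SD = {math.sqrt(p * (1 - p)):.3f}")
print(f"Overall: p = {p_overall:.3f}, SD = {sd_overall:.3f}, N = {total:,}")
```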

The overall mean percentage of 58.9% is a weighted average of the means for each sector, with weights being the numbers of passengers in each sector.

Since the variable of interest is a categorical variable, the standard deviation (SD) is given by $\sqrt{p(1 - p)}$, where $p$ is the proportion of the population in the category of interest (i.e., the percentage dropped at the curb).

Random Sampling of Passengers

A random sample of originating passengers could be surveyed as they exit the security line. In this case, the sample size is determined for a given width of confidence interval using Equation 3, where:

Total population: $N = 48{,}300$
Mean proportion using curb: $\bar{X} = p = 0.59$
SD for individual passengers: $\sigma_X = 0.492$

The sample size, $n$, was found using Equation 3 for three widths of the 95% confidence interval (C.I.), ±2%, ±3%, and ±4%, as shown in Table B-4.

Table B-4. Sample sizes for random sampling of passengers for 95% confidence interval widths 2%, 3%, and 4%.

| 95% C.I. (Mean ± w): ± w | 95% C.I.: w as % of mean | Sample n |
| 2.00% | 3.40% | 2,218 |
| 3.00% | 5.10% | 1,012 |
| 4.00% | 6.79% | 574 |

A simple approximation, given in Section 3.4.1 of the guidebook [32], could also have been used:

$$n = \frac{40{,}000\,p(1 - p)}{(100w)^2}$$

Using this equation, the estimated sample sizes for the ±2%, ±3%, and ±4% cases are 2,411, 1,074, and 606. The approximation leads to slightly higher estimates of the required sample sizes.

[32] The denominator in the equation in Section 3.4.1 is $a^2$, where the width of the 95% confidence interval is ± $a$ and $a$ is expressed in percentage points. Since $w$ above is not expressed in percentage points, $w = a/100$.

If, for example, it was decided that the narrowest confidence interval is appropriate, i.e., the mean estimate should be accurate to within ±2%, a sample size of 2,218 is required. This corresponds to a sampling fraction of 4.6% for a population of 48,300, and if using sequential sampling every 21st passenger passing through security should be surveyed.

Stratified Sampling of Passengers: Stratified by Sector

Consider the case where stratified sampling is used to select passengers to be surveyed and passengers are stratified by the sector of the flight. Assume that at this airport, passengers on the different sectors use different security screening checkpoints, thus allowing passengers on each flight sector to be sampled separately.

We consider here the simple case where proportional stratified sampling is used. Thus the proportion of the sample size in each flight sector is equal to the proportion of the total passengers in each flight sector, $W_i = n_i / n = N_i / N$.

The sample size is determined using Equation 8. Table B-5 shows the calculation of the summation over the three flight sectors. The standard deviation for each sector is found using the relationship applicable for categorical variables: $\sigma_{X_i} = \sqrt{p_i\,(1 - p_i)}$, where $p_i$ is the proportion of the population in the category of interest for flights in sector i (i.e., the proportion dropped at the curb).

| Sector of Flight | Enplaned Pass. Total, $N_i$ | Proportion of Total, $W_i$ | Est. Avg. Proportion at Curb, $p_i$ | SD, $\sigma_{X_i}$ | $W_i \sigma_{X_i}^2$ |
|---|---|---|---|---|---|
| Short-Haul Domestic | 20,250 | 0.4193 | 0.40 | 0.49 | 0.10062 |
| Long-Haul Domestic | 16,320 | 0.3379 | 0.60 | 0.49 | 0.08109 |
| International | 11,730 | 0.2429 | 0.90 | 0.30 | 0.02186 |
| Total | 48,300 | 1.0000 | 0.59 | | 0.20357 |

Table B-5. Calculation of standard deviation of sample mean for stratified sampling of passengers by market sector.

Substituting 0.20357 from Table B-5 for $\sum_i W_i \sigma_{X_i}^2$ in Equation 8 for three C.I. widths of ±2%, ±3%, and ±4% gives the sample sizes, n, in Table B-6.

| 95% C.I. ± w | w as % of mean | Sample n |
|---|---|---|
| 2.00% | 3.40% | 1,879 |
| 3.00% | 5.09% | 854 |
| 4.00% | 6.79% | 484 |

Table B-6. Sample sizes for stratified sampling of passengers by sector of flight for 95% confidence interval widths 2%, 3%, and 4%.

The sample sizes for each flight sector are then found, based on the proportion of passengers in each sector, $W_i$, to be as shown in Table B-7.

| Sector of Flight | Sample size, ±2% | Sample size, ±3% | Sample size, ±4% |
|---|---|---|---|
| Short-Haul Domestic | 788 | 358 | 203 |
| Long-Haul Domestic | 635 | 288 | 163 |
| International | 456 | 207 | 118 |
| Total | 1,879 | 853 | 484 |

Table B-7. Sample sizes by sector for stratified sampling of passengers by sector of flight for 95% confidence interval widths 2%, 3%, and 4%.

Comparing the total sample size with that found with random sampling, we find that stratification by flight sector has reduced the required sample size for the ±2% case from 2,218 to 1,879, a reduction of 15%. Note that the size of the reduction depends strongly on the variation in the mean responses across the different strata.
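The stratified calculation in Tables B-5 to B-7 can be sketched as follows. Equation 8 is not restated in this part of the appendix, so the code assumes the proportional-allocation form with a finite population correction, $n = n_0 / (1 + n_0/N)$ with $n_0 = 1.96^2 \sum_i W_i \sigma_{X_i}^2 / w^2$, which reproduces the tabulated totals. The names `stratified_sample_size` and `allocate` are illustrative, and splitting the unrounded total in proportion to $W_i$ is likewise an assumption.

```python
# sector: (enplaned passengers N_i, estimated proportion dropped at curb p_i)
SECTORS = {
    "Short-Haul Domestic": (20_250, 0.40),
    "Long-Haul Domestic": (16_320, 0.60),
    "International": (11_730, 0.90),
}
Z = 1.96
N = sum(n_i for n_i, _ in SECTORS.values())  # 48,300 passengers

# Sum over strata of W_i * sigma_i^2, where sigma_i^2 = p_i * (1 - p_i)
weighted_var = sum(n_i / N * p * (1 - p) for n_i, p in SECTORS.values())


def stratified_sample_size(w):
    """Assumed form of Equation 8: proportional allocation with FPC."""
    n0 = Z**2 * weighted_var / w**2
    return n0 / (1 + n0 / N)


def allocate(w):
    """Split the unrounded total across sectors in proportion to W_i."""
    n = stratified_sample_size(w)
    return {name: round(n * n_i / N) for name, (n_i, _) in SECTORS.items()}


totals = {w: round(stratified_sample_size(w)) for w in (0.02, 0.03, 0.04)}
print(f"sum of W_i * sigma_i^2 = {weighted_var:.5f}")
print(totals)
print(allocate(0.02))
```

Because each sector's share is rounded separately, the sector values need not add up exactly to the rounded total, which is why Table B-7 shows 853 where Table B-6 shows 854.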

Stratified Sampling of Passengers: Stratified by Day of Week

Now consider the case where the passengers are stratified by the day of the week. This form of stratification is easy to implement during the conduct of the survey, and numbers of passengers are known, at least approximately, at the sample design stage.

Again consider the simple case where proportional stratified sampling is used. Thus the proportion of the sample size on each day of the week is equal to the proportion of the total passengers on each day of the week. It was assumed that the proportion of people using the curb on each day of the week was entirely explained by the sector of their flight. Thus, the average percentage of passengers using the curb was estimated for each day by the weighted average of the percentages for each flight sector, with weights equal to the numbers of passengers on that day in each sector.

The sample size is determined using Equation 8. Table B-8 shows the calculation of the summation over the days of the week. The standard deviation for each day is found using the relationship applicable for categorical variables: $\sigma_{X_i} = \sqrt{p_i\,(1 - p_i)}$, where $p_i$ is the proportion of the population in the category of interest (i.e., the proportion dropped at the curb) on day i.

| Day | Originating Pass., Short-Haul Domestic | Originating Pass., Long-Haul Domestic | Originating Pass., International | Total, $N_i$ | Proportion of Total, $W_i$ | Est. Avg. Proportion at Curb, $p_i$ | SD, $\sigma_{X_i}$ | $W_i \sigma_{X_i}^2$ |
|---|---|---|---|---|---|---|---|---|
| Monday | 3,000 | 2,400 | 1,360 | 6,760 | 0.1400 | 0.572 | 0.495 | 0.03427 |
| Tuesday | 3,000 | 2,400 | 1,360 | 6,760 | 0.1400 | 0.572 | 0.495 | 0.03427 |
| Wednesday | 3,000 | 2,400 | 1,360 | 6,760 | 0.1400 | 0.572 | 0.495 | 0.03427 |
| Thursday | 3,000 | 2,400 | 1,530 | 6,930 | 0.1435 | 0.580 | 0.494 | 0.03496 |
| Friday | 3,000 | 2,400 | 2,040 | 7,440 | 0.1540 | 0.602 | 0.490 | 0.03692 |
| Saturday | 2,500 | 1,920 | 2,040 | 6,460 | 0.1337 | 0.617 | 0.486 | 0.03160 |
| Sunday | 2,750 | 2,400 | 2,040 | 7,190 | 0.1489 | 0.609 | 0.488 | 0.03546 |
| Total | 20,250 | 16,320 | 11,730 | 48,300 | 1.0000 | 0.589 | | 0.24175 |

Table B-8. Calculation of standard deviation of sample mean for stratified sampling of passengers by day of week.

Substituting 0.24175 from Table B-8 for $\sum_i W_i \sigma_{X_i}^2$ in Equation 8 for three C.I. widths of ±2%, ±3%, and ±4% gives the sample sizes, n, in Table B-9.

| 95% C.I. ± w | w as % of mean | Sample n |
|---|---|---|
| 2.000% | 3.40% | 2,215 |
| 3.000% | 5.09% | 1,010 |
| 4.000% | 6.79% | 574 |

Table B-9. Sample sizes for stratified sampling of passengers by day of week for 95% confidence interval widths 2%, 3%, and 4%.

The sample sizes for each day of the week are then found, based on the proportion of passengers on each day of the week, $W_i$, to be as shown in Table B-10.
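The day-of-week calculation in Table B-8 can be reproduced from the daily passenger counts by sector. The sketch below derives each day's curb proportion as the passenger-weighted average of the sector proportions (40%, 60%, and 90%), then applies the same assumed form of Equation 8 (proportional allocation with a finite population correction, not restated here). Variable and function names are illustrative.

```python
CURB_SHARE = (0.40, 0.60, 0.90)  # short-haul, long-haul, international

# Originating passengers per day: (short-haul, long-haul, international)
DAILY_PASSENGERS = {
    "Monday": (3_000, 2_400, 1_360),
    "Tuesday": (3_000, 2_400, 1_360),
    "Wednesday": (3_000, 2_400, 1_360),
    "Thursday": (3_000, 2_400, 1_530),
    "Friday": (3_000, 2_400, 2_040),
    "Saturday": (2_500, 1_920, 2_040),
    "Sunday": (2_750, 2_400, 2_040),
}
Z = 1.96
N = sum(sum(counts) for counts in DAILY_PASSENGERS.values())  # 48,300


def curb_proportion(counts):
    """Passenger-weighted average of the sector curb proportions for one day."""
    return sum(c * s for c, s in zip(counts, CURB_SHARE)) / sum(counts)


# Sum over days of W_i * p_i * (1 - p_i)
weighted_var = sum(
    sum(counts) / N * curb_proportion(counts) * (1 - curb_proportion(counts))
    for counts in DAILY_PASSENGERS.values()
)


def sample_size(variance, w):
    """Assumed Equation 3 / Equation 8 form with finite population correction."""
    n0 = Z**2 * variance / w**2
    return n0 / (1 + n0 / N)


stratified = {w: round(sample_size(weighted_var, w)) for w in (0.02, 0.03, 0.04)}
print(f"sum of W_i * sigma_i^2 = {weighted_var:.5f}")
print(stratified)
```

The weighted variance (about 0.2417) is almost the same as the unstratified variance p(1 - p) for p = 0.589 (about 0.2421), which is why stratifying by day changes the required sample so little in this example.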

| Day of Week | Sample size, ±2% | Sample size, ±3% | Sample size, ±4% |
|---|---|---|---|
| Monday | 310 | 141 | 80 |
| Tuesday | 310 | 142 | 80 |
| Wednesday | 310 | 142 | 80 |
| Thursday | 318 | 145 | 82 |
| Friday | 341 | 156 | 89 |
| Saturday | 296 | 135 | 77 |
| Sunday | 330 | 150 | 86 |
| Total | 2,215 | 1,011 | 574 |

Table B-10. Sample sizes for each day for stratified sampling of passengers by day of week for 95% confidence interval widths 2%, 3%, and 4%.

Comparing the total sample size with that found with random sampling, we find that stratification by day of the week has reduced the required sample size for the ±2% case from 2,218 to 2,215, a reduction of only 0.1%. Thus, in this case stratification makes almost no difference to the required sample size. This is due to the low variability in the average percentage of passengers at the curb, $p_i$, over the various days of the week. Note that the size of the reduction depends strongly on the variation in the mean responses across the different strata. This varies depending on the variable of interest and in some cases could vary greatly over the days of the week, making stratification by day of the week worthwhile.

Cluster Sampling of Flights: Additional Assumptions

A very common form of sampling for passenger surveys is to select a sample of flights to survey and to sample either all, or a portion, of passengers on those flights. This is a form of cluster sampling where each flight represents a cluster.

Using the same example as above, the pertinent characteristics required to estimate the sample size are given in Table B-11.

| Quantity | Symbol | Short-Haul Domestic | Long-Haul Domestic | International | Total: Symbol | Total: Value |
|---|---|---|---|---|---|---|
| No. of Departing Flights | $N_i$ | 405 | 136 | 69 | $N$ | 610 |
| Avg. Originating Pass./Flight | $M_i/N_i$ | 50 | 120 | 170 | $M/N$ | 79.2 |
| Total Originating Passengers | $M_i$ | 20,250 | 16,320 | 11,730 | $M$ | 48,300 |
| Proportion of Flights in Sector | $W_i = N_i/N$ | 0.6639 | 0.2230 | 0.1131 | | 1.0000 |
| % Dropped at Curb | $\bar{X}_i = p_i$ | 40% | 60% | 90% | $\bar{X} = \bar{p}$ | 58.9% |
| Difference from Overall Avg. | $\bar{X}_i - \bar{X}$ | -19% | 1% | 31% | | |
| SD in % Between Flights | $\sigma_{ci}$ | 10% | 10% | 10% | $\sigma_c$ | 21.1% |

Table B-11. Assumed characteristics for flights in three market sectors for illustrative examples of cluster sampling.

Cluster sampling is very dependent on one parameter not relevant to the passenger sampling considered above: the variation, over the range of flights, in the mean value for each flight of the characteristic of interest (i.e., percentage of passengers dropped at the curb), $\sigma_{ci}$. An estimate of

this variation, expressed in terms of the standard deviation, is given in Table B-11. It is estimated from previous surveys and knowledge of passengers at the airport that the standard deviation is 10% around the mean value for each sector (see footnote 33). Thus, for short-haul flights the mean value of the percentage using the curb for each flight would be expected to be between 20.4% and 59.6% for 95% of flights [= 40% ± 1.96 × 10%]. The standard deviation over all flights includes both the variation between flights within each sector and the variation between sectors and is given by:

$$\sigma_c = \sqrt{\sum_i W_i \left[ \sigma_{ci}^2 + \left( \bar{X}_i - \bar{X} \right)^2 \right]}$$

where $W_i$ is the proportion of flights in sector i ($= N_i / N$).

Cluster Sampling with Random Sampling of Flights

If a random sample of flights is selected and all passengers on each of the selected flights are surveyed, the sample size is determined for a given width of confidence interval using Equation 3, where

Total population (flights): $N = 610$
Mean proportion using curb: $\bar{X} = p = 0.589$
SD for individual flight: $\sigma_X = \sigma_c = 0.211$

The number of flights to be sampled, n, was found using Equation 3 for three widths of the 95% confidence interval (±2%, ±3%, and ±4%), as shown in Table B-12. Since all passengers on each flight are sampled, the number of passengers sampled is the number of flights sampled multiplied by the average number of passengers per flight ($\bar{M} = M / N$).

| 95% C.I. ± w | w as % of mean | Sample n (flights) | Sample Pass. |
|---|---|---|---|
| 2.00% | 3.40% | 252 | 19,953 |
| 3.00% | 5.10% | 146 | 11,560 |
| 4.00% | 6.79% | 92 | 7,285 |

Table B-12. Sample sizes for cluster sampling with random sampling of flights for 95% confidence interval widths 2%, 3%, and 4%.

Comparing the total passenger sample size with that found with random sampling of passengers, we find that cluster sampling by flight has increased the required sample size greatly: for ±2% accuracy, from 2,218 to 19,953. This is due to the high variation in the characteristic of interest between flights. In other cases, the additional sample size with clustering may be much less. For example, if the mean percentage was 50% for each sector (instead of 40%, 60%, and 90%), the sample size for ±2% accuracy would be 83 flights or 6,572 passengers. The increase in the sample size depends strongly on the variation in the mean responses for a flight across the different flights, and the above example may not be typical.

33 Note that if there were no difference between sectors, so that the mean percentage of passengers dropped off at the curb was 58.9% for all flights, the standard deviation in the percentage between flights, $\sigma_{ci}$, would equal the standard deviation of the mean for each flight. Then $\sigma_{ci}^2 = p_i (1 - p_i) / (M_i / N_i)$, where $M_i / N_i$ is the average number of passengers on each flight in sector i. Thus the values of $\sigma_{ci}$ for the short-haul, long-haul, and international sectors would be 7.0%, 4.5%, and 3.8%, respectively, and $\sigma_c$ would be 6.2%.
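The overall between-flight standard deviation $\sigma_c$ and the flight counts in Table B-12 can be checked with the sketch below. It assumes that Equation 3 (not restated here) is the finite-population-corrected form $n = n_0 / (1 + n_0/N)$ with $n_0 = (1.96\,\sigma_c / w)^2$, that the number of flights is rounded up to the next integer, and that the overall mean is the passenger-weighted 58.9% shown in Table B-11. All names are illustrative.

```python
import math

# sector: (departing flights N_i, avg passengers per flight,
#          mean curb proportion p_i, SD between flights sigma_ci)
SECTORS = {
    "Short-Haul Domestic": (405, 50, 0.40, 0.10),
    "Long-Haul Domestic": (136, 120, 0.60, 0.10),
    "International": (69, 170, 0.90, 0.10),
}
Z = 1.96
N = sum(flights for flights, *_ in SECTORS.values())            # 610 flights
M = sum(flights * pax for flights, pax, *_ in SECTORS.values())  # 48,300 passengers

# Passenger-weighted overall mean (58.9% in Table B-11)
p_bar = sum(fl * pax * p for fl, pax, p, _ in SECTORS.values()) / M

# Within-sector variance plus between-sector variance, weighted by share of flights
var_c = sum(fl / N * (sd**2 + (p - p_bar) ** 2) for fl, _, p, sd in SECTORS.values())
sigma_c = math.sqrt(var_c)


def flights_needed(w):
    """Assumed Equation 3 with FPC over flights, rounded up to a whole flight."""
    n0 = (Z * sigma_c / w) ** 2
    return math.ceil(n0 / (1 + n0 / N))


results = {w: flights_needed(w) for w in (0.02, 0.03, 0.04)}
passengers = {w: round(n * M / N) for w, n in results.items()}
print(f"sigma_c = {sigma_c:.3f}")
print(results, passengers)
```

Most of $\sigma_c$ comes from the differences between sector means rather than from the 10% spread within each sector, which is what makes stratifying the flights by sector attractive.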

Cluster Sampling with Stratified Sampling of Flights and All Passengers on Selected Flights Surveyed

Now consider the case where flights to be surveyed are determined using stratified sampling and all passengers on each of the selected flights are surveyed. The flight sample size is determined for a given width of confidence interval using Equation 8, where the units sampled in each stratum are clusters rather than individuals. Since flights are being sampled, rather than passengers, the standard deviation $\sigma_{X_i}$ in Equation 8 is the standard deviation of the average percentage of passengers using the curb for each flight, $\sigma_{ci}$, as shown in Table B-13.

| Sector of Flight | Departing Flights Total, $N_i$ | $W_i = N_i/N$ | Est. Avg. Proportion at Curb, $p_i$ | SD, $\sigma_{ci}$ | $W_i \sigma_{ci}^2$ |
|---|---|---|---|---|---|
| Short-Haul Domestic | 405 | 0.6639 | 0.40 | 0.10 | 0.00664 |
| Long-Haul Domestic | 136 | 0.2230 | 0.60 | 0.10 | 0.00223 |
| International | 69 | 0.1131 | 0.90 | 0.10 | 0.00113 |
| Total | 610 | 1.0000 | 0.589 | | 0.01000 |

Table B-13. Calculation of standard deviation of sample mean for cluster sampling with flights stratified by sector and all passengers on selected flights surveyed.

Substituting 0.01000 from Table B-13 for $\sum_i W_i \sigma_{ci}^2$ in Equation 8, the number of flights to be sampled, n, was found for three widths of the 95% confidence interval (±2%, ±3%, and ±4%), as shown in Table B-14.

| 95% C.I. ± w | w as % of mean | Sample n (flights) |
|---|---|---|
| 2.00% | 3.40% | 83 |
| 3.00% | 5.09% | 40 |
| 4.00% | 6.79% | 24 |

Table B-14. Sample number of flights for cluster sampling with flights stratified by sector and all passengers on selected flights surveyed for 95% confidence interval widths 2%, 3%, and 4%.

The numbers of flights in each sector and estimated number of passengers (based on average numbers of passengers per flight in that sector) are as shown in Table B-15.

| Sector of Flight | ±2%: Flights | ±2%: Pass. | ±3%: Flights | ±3%: Pass. | ±4%: Flights | ±4%: Pass. |
|---|---|---|---|---|---|---|
| Short-Haul Domestic | 55 | 2,750 | 27 | 1,350 | 16 | 800 |
| Long-Haul Domestic | 19 | 2,280 | 9 | 1,080 | 5 | 600 |
| International | 9 | 1,530 | 5 | 850 | 3 | 510 |
| Total* | 83 | 6,560 | 41 | 3,280 | 24 | 1,910 |

* Total may be higher than in the previous table, as the number of flights must be an integer.

Table B-15. Sample sizes by sector for cluster sampling with flights stratified by sector and all passengers on selected flights surveyed for 95% confidence interval widths 2%, 3%, and 4%.
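Tables B-13 to B-15 can be reproduced with the following sketch. It assumes the same form of Equation 8 as for stratified passenger sampling (proportional allocation with a finite population correction, not restated here), applied to flights instead of passengers, with the total rounded up to a whole flight. Allocating the total to sectors by rounding n·W_i is an assumption that happens to match Table B-15. All names are illustrative.

```python
import math

# sector: (departing flights N_i, avg passengers per flight, SD between flights sigma_ci)
SECTORS = {
    "Short-Haul Domestic": (405, 50, 0.10),
    "Long-Haul Domestic": (136, 120, 0.10),
    "International": (69, 170, 0.10),
}
Z = 1.96
N = sum(flights for flights, _, _ in SECTORS.values())  # 610 flights

# Sum over sectors of W_i * sigma_ci^2 (0.01000 in Table B-13)
weighted_var = sum(fl / N * sd**2 for fl, _, sd in SECTORS.values())


def total_flights(w):
    """Assumed Equation 8 applied to flights, rounded up to a whole flight."""
    n0 = Z**2 * weighted_var / w**2
    return math.ceil(n0 / (1 + n0 / N))


def allocation(w):
    """Flights and passengers per sector; every passenger on a flight is surveyed."""
    n = total_flights(w)
    result = {}
    for name, (flights, pax, _) in SECTORS.items():
        sampled = round(n * flights / N)  # assumed rounding rule
        result[name] = (sampled, sampled * pax)
    return result


for w in (0.02, 0.03, 0.04):
    alloc = allocation(w)
    flights_total = sum(f for f, _ in alloc.values())
    pax_total = sum(p for _, p in alloc.values())
    print(f"+/-{w:.0%}: n = {total_flights(w)}, allocated {flights_total} flights, {pax_total} passengers")
```

For the ±3% case the rounded sector allocations add up to 41 flights rather than 40, which corresponds to the note under Table B-15.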

The stratification of flights by sector results in a large reduction in the numbers of flights and passengers to be surveyed. In this example, much of the variation in the variable of interest is explained by the flight sector, which results in a large reduction in sample size compared to random sampling of flights. By sampling the flights by sector, the likelihood of selecting a sample with close to the actual proportions of passengers in each sector is much greater than when randomly sampling flights. Again note that the results here reflect the assumptions regarding variation considered in this example and will vary in other situations.

Cluster Sampling with Stratified Sampling of Flights and a Sample of Passengers on Selected Flights

Now consider the case where flights to be surveyed are determined using stratified sampling and a sample of passengers on each of the selected flights is surveyed. Assume initially that 50% of passengers on the selected flights are surveyed. The variance of the estimate is greater than with 100% sampling of each flight, as it includes both the variation between flights (as before) and the variation due to sampling of passengers on individual flights. It is calculated from Equations 17 and 18 as follows:

$$\sigma_{\bar{X}}^2 = \left(1 - \frac{n}{N}\right) \frac{1}{n} \sum_{i=1}^{n_s} \frac{N_i}{N}\, \sigma_{ci}^2 + \sum_{i=1}^{n_s} \frac{1}{N_i}\,(1 - f)\, \frac{p_i\,(1 - p_i)}{m_i - 1}$$

where
$\sigma_{ci}$ is the standard deviation of the mean percentage using the curb across flights in sector i
$N_i$ is the number of flights in sector i (N is total over all sectors)
$n_i$ is the number of flights sampled in sector i (n is total over all sectors)
$M_i$ is the average number of passengers on a flight in sector i (in Table B-11 this quantity was written $M_i/N_i$)
$m_i$ is the average number of passengers sampled on a flight in sector i ($= f M_i$)
$p_i$ is the probability of a passenger on a flight in sector i being dropped off at the curb
$f$ is the proportion of passengers sampled on a flight ($= m_i / M_i$, assumed the same for all flights)
$n_s$, the upper limit of the summations, is the number of sectors (three in this example).

The flight sample size is determined for a given confidence interval $\bar{X} \pm w$ by solving the following relationship for n:

$$w = 1.96\, \sigma_{\bar{X}}$$

where $\sigma_{\bar{X}}^2$ is given by the equation above.

The summations over the sectors for calculating $\sigma_{\bar{X}}^2$ are determined for a given n value as shown in Table B-16 (n = 117 used in table). The number of flights to be sampled, n, was found by setting an approximate value initially and determining the width, w, then adjusting the value of n until the appropriate value of w was obtained. In the table, $\sigma_{\bar{X}}^2$ is evaluated for n = 117 and the resulting value of w is 0.0200 or 2.00%.

| Sector of Flight | Departing Flights Total, $N_i$ | $W_i = N_i/N$ | Est. Avg. Proportion at Curb, $p_i$ | Pass. on Each Flight, $M_i$ | Sampled on Each Flight, % ($f$) | Sampled on Each Flight, # ($m_i$) | SD Between Flights, $\sigma_{ci}$ | Between Cluster: $[(1 - n/N)/n]\,(N_i/N)\,\sigma_{ci}^2$ | Within Cluster: $(1/N_i)(1 - f)\,p_i(1 - p_i)/(m_i - 1)$ | Between + Within Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Short-Haul Dom. | 405 | 0.6639 | 0.40 | 50 | 50% | 25 | 0.100 | 0.0000459 | 0.0000123 | 0.0000582 |
| Long-Haul Dom. | 136 | 0.2230 | 0.60 | 120 | 50% | 60 | 0.100 | 0.0000154 | 0.0000150 | 0.0000304 |
| International | 69 | 0.1131 | 0.90 | 170 | 50% | 85 | 0.100 | 0.0000078 | 0.0000078 | 0.0000156 |
| Total | 610 | 1.0000 | 0.59 | | | | | | | $\sigma_{\bar{X}}^2 = 0.0001041$ |

$w = 1.96\, \sigma_{\bar{X}} = 0.0200$

Table B-16. Calculation of standard deviation of sample mean for cluster sampling with flights stratified by sector and a 50% sample of passengers on selected flights (calculation of $\sigma_{\bar{X}}^2$ for n = 117).

Sample sizes for three widths of the 95% confidence interval (±2%, ±3%, and ±4%) were found to be as shown in Table B-17.
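The trial-and-error procedure described above can be sketched in a few lines. The code evaluates the between-cluster and within-cluster terms using the column formulas of Table B-16, then increases n until the half-width w, rounded to four decimal places as in the table, no longer exceeds the target. The rounding criterion is an assumption chosen to mirror the table (which reports w = 0.0200 at n = 117); function names are illustrative.

```python
import math

# Per sector: (departing flights N_i, passengers per flight M_i,
#              curb proportion p_i, SD between flights sigma_ci)
SECTORS = [
    (405, 50, 0.40, 0.10),   # short-haul domestic
    (136, 120, 0.60, 0.10),  # long-haul domestic
    (69, 170, 0.90, 0.10),   # international
]
N = sum(n_i for n_i, *_ in SECTORS)  # 610 departing flights
Z = 1.96


def variance_of_mean(n, f):
    """Between-cluster plus within-cluster variance, as laid out in Table B-16."""
    between = (1 - n / N) / n * sum(n_i / N * sd**2 for n_i, _, _, sd in SECTORS)
    within = sum(
        (1 / n_i) * (1 - f) * p * (1 - p) / (f * m - 1)  # m_i = f * M_i
        for n_i, m, p, _ in SECTORS
    )
    return between + within


def half_width(n, f):
    return Z * math.sqrt(variance_of_mean(n, f))


def flights_needed(target_w, f, decimals=4):
    """Smallest n whose half-width, rounded to `decimals`, does not exceed the target."""
    n = 1
    while n < N and round(half_width(n, f), decimals) > target_w:
        n += 1
    return n


print(f"variance at n = 117: {variance_of_mean(117, 0.5):.7f}")
print({w: flights_needed(w, 0.5) for w in (0.02, 0.03, 0.04)})
```

Computed without rounding, the half-width at n = 117 is marginally above 0.02 (about 0.020002), so a strict comparison would ask for one more flight.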

| 95% C.I. ± w | w as % of mean | Sample n (flights) |
|---|---|---|
| 2.00% | 3.40% | 117 |
| 3.00% | 5.09% | 47 |
| 4.00% | 6.79% | 26 |

Table B-17. Sample number of flights for cluster sampling with flights stratified by sector and a 50% sample of passengers on selected flights surveyed for 95% confidence interval widths 2%, 3%, and 4%.

The numbers of flights in each sector and estimated number of passengers (based on average numbers of passengers per flight in that sector) are as shown in Table B-18.

| Sector of Flight | ±2%: Flights | ±2%: Pass. | ±3%: Flights | ±3%: Pass. | ±4%: Flights | ±4%: Pass. |
|---|---|---|---|---|---|---|
| Short-Haul Domestic | 78 | 1,950 | 31 | 775 | 17 | 425 |
| Long-Haul Domestic | 26 | 1,560 | 11 | 660 | 6 | 360 |
| International | 13 | 1,105 | 5 | 425 | 3 | 255 |
| Total | 117 | 4,615 | 47 | 1,860 | 26 | 1,040 |

Table B-18. Sample sizes by sector for cluster sampling with flights stratified by sector and a 50% sample of passengers on selected flights surveyed for 95% confidence interval widths 2%, 3%, and 4%.

The surveying of only a 50% sample of passengers on each flight resulted in an increase in the number of flights to be surveyed from 83 to 117 for the ±2% accuracy case. However, since only 50% of passengers on these flights are to be surveyed, the total number of passengers decreased from 6,560 to 4,615. For surveys conducted in the departure lounge, it is almost impossible to survey all passengers on a flight, for reasons given in Chapter 5 of the guidebook. In practice it may be possible to obtain complete responses from 50% of passengers, in which case the number of flights to be surveyed based on the 50% passenger sample should be used.

It is evident that the total number of passengers that need to be surveyed can be reduced by reducing the percentage of passengers sampled on each flight, but the number of flights surveyed increases. Several other cases were examined using this example:
• If 75% of passengers on each of the selected flights were to be surveyed, a sample of 92 flights and 5,580 passengers would be required.
• If 30% of passengers on each of the selected flights were to be surveyed, a sample of 268 flights and 6,360 passengers would be required.

The optimal balance for a particular survey will depend on the variation in responses between and within flights, and on the relative costs of surveying passengers and flights, which vary from survey to survey. Another important consideration with interview surveys in gate lounges, discussed in Chapter 5 of the guidebook, is the limitation on the number of interviews that each interviewer can complete in the time window between when passengers start to arrive in the gate lounge and the start of flight boarding. As a practical matter, this limits the number of passengers who can be surveyed on a given flight.

Again note that the results here reflect the assumptions regarding variation considered in this example and will vary in other situations.
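The trade-off between the share of passengers sampled per flight and the number of flights can be explored directly. In the variance expression as laid out in Table B-16, the within-flight term does not depend on n, so the relationship w = 1.96 σ can be inverted for n in closed form instead of by trial and error. The sketch below does this under that reading of the formula; `flights_for_fraction` is an illustrative name, and rounding up to a whole flight is an assumption.

```python
import math

# Per sector: (departing flights N_i, passengers per flight M_i,
#              curb proportion p_i, SD between flights sigma_ci)
SECTORS = [
    (405, 50, 0.40, 0.10),   # short-haul domestic
    (136, 120, 0.60, 0.10),  # long-haul domestic
    (69, 170, 0.90, 0.10),   # international
]
N = sum(n_i for n_i, *_ in SECTORS)  # 610 departing flights
Z = 1.96


def flights_for_fraction(f, w=0.02):
    """Smallest whole n with 1.96 * sigma <= w when a fraction f of passengers is sampled."""
    between_coeff = sum(n_i / N * sd**2 for n_i, _, _, sd in SECTORS)
    within = sum(
        (1 - f) * p * (1 - p) / (f * m - 1) / n_i for n_i, m, p, _ in SECTORS
    )
    target = (w / Z) ** 2
    if target <= within:
        raise ValueError("target width cannot be met at this sampling fraction")
    # Solve (1/n - 1/N) * between_coeff + within = target for n
    inv_n = (target - within) / between_coeff + 1 / N
    return math.ceil(1 / inv_n)


for f in (1.00, 0.75, 0.50, 0.30):
    print(f"f = {f:.0%}: {flights_for_fraction(f)} flights")
```

With these inputs the strict inversion gives 83, 92, and 268 flights for 100%, 75%, and 30% sampling, in line with the figures quoted above. For 50% it gives 118 rather than the tabulated 117, because the unrounded half-width at n = 117 is about 0.020002, which rounds to 0.0200.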


TRB’s Airport Cooperative Research Program (ACRP) Report 26: Guidebook for Conducting Airport User Surveys explores the basic concepts of survey sampling and the steps involved in planning and implementing a survey. The guidebook also examines the different types of airport user surveys, and includes guidance on how to design a survey and analyze its results.
