

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




OCR for page 36
Guidebook for Conducting Airport User Surveys

With non-probability sampling, it is not possible to calculate the sample size required to achieve a given level of accuracy or to make generalizations about the population. These types of surveys are usually of limited value in ascertaining properties of the population but can be useful for obtaining ideas and user feedback.

3.4 Sample Size

A critical issue in planning any survey is determining the appropriate sample size, which is influenced by such considerations as the following:

- Survey purpose.
- Analysis of subgroups of interest.
- Required precision of the survey results.
- Credibility of results among decision makers and data users.
- Available resources (including budget, personnel, and equipment).

The survey purpose influences the required sample size in three ways: (1) by determining the key characteristics of the air travel party and the precision to which they need to be known, (2) by establishing the level of disaggregation to which the results need to be expressed, and (3) by identifying the value to be gained from improved precision. For any desired degree of precision in the survey results, the need to consider subgroups of interest (such as air passengers with ground origins in a particular area or visitors on business trips) will increase the required sample size of the overall survey in order to ensure a large enough number of respondents in the subgroup(s) of interest.

The required precision and the credibility of results influence the size of the confidence interval and the acceptable margin of error. Larger samples are required to reduce the margin of error and/or increase the confidence level for a given margin of error.
Although the target sample size for a survey should ideally be determined by the purpose and objectives of the survey and the uses to which the results will be put, in reality the financial resources available to fund the survey often constrain the sample size, particularly where budgets have been established before the detailed planning of the survey has begun. Time constraints can also influence the sample size if information is required on short notice.

Other factors affecting the required sample size include the following:

- The proportion of the population with the attributes being measured. An airport seeking passengers' opinions of the retail concessions, for example, must design a survey that takes into account the fact that only a small proportion of passengers will have actually visited the retail concessions. The passengers visiting the retail concessions are a subgroup of all passengers, and so the required sample size is found in a similar way to that described previously for a subgroup.
- The variability of attributes being measured. If the variability is high, a larger sample size will be required.
- The sample design used. For example, a good stratified sample can permit a smaller sample size than a random sample for a given level of accuracy, while cluster sampling will usually necessitate a larger sample size (as discussed in Sections 3.3.3 and 3.3.4).

In the following discussion, sample size refers to the number of completed responses; the number of people approached to participate in the survey may be significantly higher depending on the rates of refusal and incomplete responses. These refusals and incomplete responses will generally take some time to survey and process, which could be significant in some surveys (e.g., where follow-up phone calls are made) and should be allowed for in determining resource requirements. To estimate the total number of individuals to approach, divide the desired sample size of completed surveys by an estimate of the completed survey response rate, expressed as a proportion. For example, if a sample size of 1,000 is required and the response rate is 70%, then 1,429 [= 1,000/0.7] individuals would need to be approached.

Statistical Concepts

3.4.1 Sample Size with Random Sampling

Calculation of the sample size required to obtain a specified accuracy differs depending on whether the required accuracy is for a question with categorical or numerical responses (question types are discussed in Section 4.3.2).[8] With a categorical response, the respondent must choose from a limited number of defined responses. For example, for a question on mode of travel to the airport, categories could be private vehicle, rented vehicle, taxi/limousine, train, bus, airplane, or walk/bicycle. Determination of the sample size for each type of question using random sampling is considered in the following paragraphs.

Categorical Response Questions

When using categorical response questions and random sampling, the sample size required to give a specified level of accuracy is a function of the population size and the proportion of the population in the category of interest (e.g., proportion using a private vehicle as their mode of travel to the airport). This proportion is unknown and should be estimated in the survey planning stage from experience, previous surveys, or values from other airports. The largest sample size required occurs when half of the population has the characteristic of interest.

Table 3-2 provides approximate 95% confidence intervals for a range of population and sample sizes and two values of the proportion of the population in the category of interest (50% and 20%).[9] The largest sample size is required for a proportion of 50%.
Thus, for surveys with many questions with a range of mean proportions, it is appropriate to use the sample size based on the 50% proportion, as this will provide at least the required accuracy for all cases. Table 3-3 gives the required sample size using random sampling, based on the 50% proportion, for various confidence intervals and a range of population sizes.

Alternatively, the sample size for an accuracy of ±a percentage points[10] can be calculated for a 95% confidence level using the following expression:[11]

$$ n = \frac{1.96^2 \, p(1 - p)}{(a/100)^2 + \dfrac{1.96^2 \, p(1 - p)}{N}} $$

where n is the sample size, N is the population size, ±a is the width of the confidence interval in percentage points, and p is the estimated proportion of the population in the category of interest.

[8] With a categorical response, the variance can be expressed in terms of the proportion of the population in the category of interest. Thus an initial estimate of this proportion, rather than the variance, is required.
[9] For proportions (p) greater than 0.5, the required sample size is the same as for the proportion 1 − p. For example, for a proportion p = 0.75, the required sample size is the same as for p = 0.25.
[10] If accuracy is expressed as a percentage of the mean, say ±b%, then the percentage points, a = b × mean. For example, a confidence interval width of ±25% of the mean with a mean of 40% corresponds to a confidence interval width of a = 25 × 0.4 = 10 percentage points.
[11] For other confidence levels, replace 1.96 with the appropriate value from the standard Normal distribution for the confidence level required.
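
The sample size expression for categorical questions, together with the response-rate adjustment described at the start of Section 3.4, can be sketched in Python as follows. This is an illustrative sketch rather than part of the guidebook: the function names `sample_size_proportion` and `approaches_needed` are hypothetical, and rounding the result to the nearest whole response is an assumption.

```python
import math

Z_95 = 1.96  # standard Normal value for a 95% confidence level


def sample_size_proportion(population, a_points, p=0.5, z=Z_95):
    """Completed responses needed for an accuracy of +/- a_points.

    population is N, a_points is a (percentage points), and p is the
    estimated proportion in the category of interest. The result is
    returned unrounded.
    """
    variance_term = z ** 2 * p * (1 - p)
    return variance_term / ((a_points / 100) ** 2 + variance_term / population)


def approaches_needed(completed, response_rate):
    """Individuals to approach, with response_rate given as a proportion."""
    return math.ceil(completed / response_rate)


n = sample_size_proportion(population=1000, a_points=5)
print(round(n))                      # 278
print(approaches_needed(1000, 0.7))  # 1429
```

Defaulting `p` to 0.5 reflects the point made in the text that the 50% proportion gives the largest, and therefore safest, sample size.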

Table 3-2. Approximate 95% confidence intervals for a categorical variable for a range of population and sample sizes.

| Population Size | Sample Size | Proportion of Population in Category | 95% Confidence Interval: Sample Mean ± a Percentage Points, where a = | Lower Limit | Upper Limit |
|---|---|---|---|---|---|
| 100 | 80 | 50% | 4.9 pts | 45% | 55% |
| | | 20% | 3.9 pts | 16% | 24% |
| | 60 | 50% | 8.0 pts | 42% | 58% |
| | | 20% | 6.4 pts | 14% | 26% |
| | 40 | 50% | 12.0 pts | 38% | 62% |
| | | 20% | 9.6 pts | 10% | 30% |
| 50,000 or higher | 1,000 | 50% | 3.1 pts | 47% | 53% |
| | | 20% | 2.5 pts | 18% | 22% |
| | 400 | 50% | 4.9 pts | 45% | 55% |
| | | 20% | 3.9 pts | 16% | 24% |
| | 100 | 50% | 9.8 pts | 40% | 60% |
| | | 20% | 7.8 pts | 12% | 28% |

Note: SEE estimated using binomial distribution and sampling without replacement; Normal approximation used to determine confidence intervals. For further information, refer to statistical textbooks listed at the end of this guidebook.

For large populations of over 50,000, the required sample size for a 95% confidence level is given approximately by:

$$ n = \frac{40{,}000 \, p(1 - p)}{a^2} $$

Thus if the proportion of some characteristic of the population is 5% (p = 0.05), say, and the desired accuracy of the estimate of this proportion is ±1 percentage point at a 95% confidence level, the required sample size to achieve this accuracy is 1,900. It should be noted that an error of 1 percentage point on an estimated proportion of only 5% is an error of 20% of the estimated proportion. If it is desired to reduce this error to only 5% of the estimated proportion (0.25 percentage points), the required sample size would increase to about 30,000.

Table 3-3. Required sample size using random sampling for various sized confidence intervals and a range of population sizes.*

Sample size for a 95% confidence interval of sample mean ± a percentage points, where a =

| Population Size | 1 pt | 2 pts | 3 pts | 4 pts | 5 pts | 6 pts | 7 pts | 8 pts | 9 pts | 10 pts |
|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 99 | 96 | 91 | 86 | 79 | 73 | 66 | 60 | 54 | 49 |
| 200 | 196 | 185 | 168 | 150 | 132 | 114 | 99 | 86 | 74 | 65 |
| 500 | 475 | 414 | 340 | 273 | 217 | 174 | 141 | 115 | 96 | 81 |
| 1,000 | 906 | 706 | 516 | 375 | 278 | 211 | 164 | 130 | 106 | 88 |
| 2,000 | 1,655 | 1,091 | 696 | 462 | 322 | 235 | 179 | 140 | 112 | 92 |
| 5,000 | 3,288 | 1,622 | 879 | 536 | 357 | 253 | 189 | 146 | 116 | 94 |
| 10,000 | 4,899 | 1,936 | 964 | 566 | 370 | 260 | 192 | 148 | 117 | 95 |
| 20,000 | 6,488 | 2,144 | 1,013 | 583 | 377 | 263 | 194 | 149 | 118 | 96 |
| 50,000 | 8,057 | 2,291 | 1,045 | 593 | 381 | 265 | 195 | 150 | 118 | 96 |
| 100,000 | 8,762 | 2,345 | 1,056 | 597 | 383 | 266 | 196 | 150 | 118 | 96 |
| 200,000 | 9,164 | 2,373 | 1,061 | 598 | 383 | 266 | 196 | 150 | 118 | 96 |
| 500,000 | 9,423 | 2,390 | 1,065 | 600 | 384 | 267 | 196 | 150 | 119 | 96 |

* Sample sizes where proportion of population in the category of interest is 50%.
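
The confidence interval widths in Table 3-2 and the large-population approximation can be checked with a short Python sketch. The function names are hypothetical, and the standard error used here, sqrt(p(1 − p)/n × (1 − n/N)), is an assumption about how the table note's "binomial distribution and sampling without replacement" was applied. Under that assumption the sketch agrees with the tabulated widths to one decimal place.

```python
import math


def ci_half_width_points(population, sample, p, z=1.96):
    """Half-width of the 95% confidence interval, in percentage points.

    Assumes a binomial variance with a finite population correction
    (1 - n/N) for sampling without replacement.
    """
    fpc = 1 - sample / population
    return 100 * z * math.sqrt(p * (1 - p) / sample * fpc)


def large_population_sample_size(p, a_points):
    """Approximate sample size for large populations: 40,000 p (1 - p) / a^2."""
    return 40_000 * p * (1 - p) / a_points ** 2


print(round(ci_half_width_points(100, 80, 0.5), 1))      # 4.9
print(round(ci_half_width_points(100, 40, 0.2), 1))      # 9.6
print(round(large_population_sample_size(0.05, 1)))      # 1900
print(round(large_population_sample_size(0.05, 0.25)))   # 30400
```

The constant 40,000 equals (2 × 100)², which is what results if 1.96 is rounded up to 2; the last line gives 30,400, which the text rounds to "about 30,000".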

If a subgroup composes S percent of the population and the estimate of the proportion of some characteristic of the subgroup is required to the same accuracy as the estimate of the proportion for the population as a whole, the sample will need to be larger by a factor of 100/S. Thus, to achieve the same accuracy for a subgroup that composes 20% of the population, the sample would need to be five times larger (100/20 = 5). If this level of accuracy is required for multiple subgroups, the required total sample size is given by the largest of the estimated total sample sizes calculated using the factor for each subgroup (100/S_i, where S_i is the percentage of the population in subgroup i).

For very small populations such as with airport tenant surveys, a high proportion of the population must be sampled to obtain estimates within ±0.05 (i.e., 5 percentage points) of the proportion for the total population, but actual numbers of surveys required are small. For example, a sample size of 79 is required from a population of 100 to achieve an accuracy of ±5 percentage points. For large populations, a sample size approaching 400 is required to achieve a similar level of accuracy using random sampling.

Numerical Response Questions

For questions with a numerical response, such as the number of travelers in a group, expenditures at the concessions, or time spent at the airport, the sample size required for a specified level of accuracy is dependent on the variability (as measured by the standard deviation) in the numerical response. With random sampling, the population mean and standard deviation are estimated by the average and standard deviation (weighted if appropriate) of the responses in the sample. The required sample size for an accuracy of ±w can be calculated for a 95% confidence level using the following expression:[12]

$$ n = \frac{1.96^2 \, s^2}{w^2} $$

where s is the standard deviation of the responses in the sample.
The SEE can be found approximately by dividing the standard deviation of the sample values by the square root of the sample size of completed responses.[13] The standard deviation of the variable of interest is unknown during the survey planning stage and an initial estimate is required to calculate the required sample size. This initial estimate could be obtained from previous surveys at the airport or from other airports, or estimated from knowledge of the typical range in values.

Examples of the mean, standard deviation, SEE and accuracy of estimate (95% confidence interval), and required sample sizes for accuracy to within ±10% of the mean for selected air passenger characteristics from some airport surveys are given in Table 3-4.

As can be seen by these examples, the accuracy and required sample sizes vary greatly depending on the variable of interest. Expenditures at the airport vary greatly as many people do not spend any money and some spend a lot. Thus large sample sizes are required to produce estimates to within 10% of their expected value. In contrast, variability in the time passengers spend at the airport is much less, and small samples would give a similar accuracy (in percentage terms).

3.4.2 Sample Sizes with Stratified and Cluster Sampling

The methods for determining the sample sizes with stratified and cluster sampling are more complex and are outlined in the following paragraphs with details provided in Appendix B.

[12] For other confidence levels, replace 1.96 with the appropriate z-value from the standard Normal distribution for the confidence level required. For small population sizes, use the expression n = 1.96² s² / [w² + 1.96² s²/N], where N is the population size.
[13] A more accurate estimate is given by dividing by the square root of the sample size less one. This can become important for small samples.
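
The expression for numerical response questions, the small-population variant in footnote 12, and the approximate SEE can be sketched in Python as shown below. The function names are hypothetical and the sketch is illustrative only. It uses the Airport 1 concession expenditure values reported in Table 3-4 (mean $6.20, standard deviation $8.53) and rounds the required sample size up to the next whole response, which is an assumption.

```python
import math


def sample_size_mean(s, w, population=None, z=1.96):
    """Responses needed to estimate a mean to within +/- w.

    s is the estimated standard deviation. If population (N) is given,
    the small-population form from footnote 12 is used.
    """
    base = z ** 2 * s ** 2
    if population is None:
        return base / w ** 2
    return base / (w ** 2 + base / population)


def half_width_mean(s, n, z=1.96):
    """95% confidence interval half-width, using SEE = s / sqrt(n)."""
    return z * s / math.sqrt(n)


mean, s = 6.20, 8.53  # Airport 1 expenditure at all concessions, Table 3-4
print(round(half_width_mean(s, 400), 2))             # 0.84
print(math.ceil(sample_size_mean(s, 0.10 * mean)))   # 728
```

Other rows of Table 3-4 may differ from this calculation by a response or two, presumably because the published means and standard deviations are rounded.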

Table 3-4. Examples of 95% confidence intervals and sample sizes for selected air passenger characteristics from some recent airport surveys.

| Variable | Mean | Standard Deviation | 95% Confidence Interval* (with Sample Size = 400) | Sample Size for Confidence Interval ±10% of Mean |
|---|---|---|---|---|
| Number in travel group | 1.4 | 1.6 | ±0.16 or ±11% | 503 |
| Expenditure at all concessions: Airport 1 | $6.20 | $8.53 | ±$0.84 or ±14% | 728 |
| Expenditure at all concessions: Airport 2 | $8.00 | $19.00 | ±$1.86 or ±23% | 2,168 |
| Time at airport (min): Large international | 160 | 60 | ±6 min or ±4% | 55 |
| Time at airport (min): Domestic | 106 | 43 | ±4 min or ±4% | 64 |

* Confidence interval expressed as difference from sample mean, also given as a percentage of the sample mean.
Note: Sample mean will be approximately normally distributed for large sample sizes according to the Central Limit Theorem, even for variables such as expenditure at concessions that are not normally distributed.
Source: Airport surveys conducted by Jacobs Consultancy in the United States and Canada.

Stratified Sampling

The objective of stratified sampling is to reduce the size of the sample required to achieve a desired level of accuracy in situations where it is possible to define population strata within which the characteristic of interest varies less than it does across the population as a whole. For example, if the characteristic of interest is the duration of air passengers' trips (because trip duration affects the likely use of parking at the airport), the duration values are likely to differ considerably between international trips, long-haul domestic trips, and short-haul domestic trips. Because the variance of the air trip duration within each of these three strata will be much smaller than the variance for the population as a whole, it may be possible to estimate the average trip duration for all air passengers to the desired level of accuracy with fewer total responses divided between the three strata than by randomly sampling the entire population.
To achieve a similar level of accuracy in the results for each stratum, it will be necessary to use non-proportional stratified sampling, with the sample size in each stratum inversely proportional to the variance of the characteristic within that stratum. Because the actual variance in the characteristic for each stratum will not be known until the survey has been performed, it will be necessary to make an initial assumption of the differences in the variance across the strata in order to determine the proportion of the survey responses to assign to each stratum. These assumptions can be based on the results of prior surveys or of surveys performed at similar airports.

If $\sigma_{X_i}$ is the standard deviation of characteristic X in stratum i, then for a confidence interval for the sample mean of X across the population of 2w (i.e., ±w) at a 95% confidence level, w is given by:

$$ w = 1.96 \sqrt{\sum_i \frac{W_i^2 \, \sigma_{X_i}^2 \, (1 - n_i/N_i)}{n_i}} $$

where
$W_i$ is the proportion of the total population in stratum i,
$N_i$ is the population in stratum i, and
$n_i$ is the sample size in stratum i.

A given confidence interval can be obtained for varying combinations of $n_i$. However, if $n_i$ is selected to be inversely proportional to the variance of X within each stratum, i.e., $n_i = k/\sigma_{X_i}^2$, then $n_i$ can be replaced by $k/\sigma_{X_i}^2$ in the above equation, which can then be solved for k and hence $n_i$ calculated for each stratum. The expression for calculating the value of k and the sample sizes for each stratum is provided in Appendix B. The total sample size is obtained by summing $n_i$ across all the strata.

Cluster Sampling

Calculating an appropriate sample size with cluster sampling is considerably more complicated than with random or stratified sampling, because the composition and size of the clusters affect the variance of the resulting estimates of the population characteristics.

The accuracy of a cluster sample depends on both the variance of the characteristic of interest within each cluster and the variance between clusters. If the variation in the sample mean between clusters is fairly small (i.e., the clusters are fairly homogeneous and have similar means) but the variance of the characteristic within each cluster is fairly large, then the cluster sample will give a similar accuracy to a random sample of the same overall sample size. One can think of this situation as a series of small random samples of the population as a whole. Conversely, if the variance between clusters is fairly high, then the overall variance of the population sample mean of the characteristic will be larger than for a random sample and in consequence a cluster sample will require a larger overall sample size to achieve the same level of accuracy.

3.4.3 Comparison of Sampling Methods

An example of sample size calculations for different sampling methods is given in Appendix B. The example provides some insight into the efficiencies of each sampling method and is summarized in this section. In the example, a survey of passengers is to be undertaken to obtain information on airport access trips. A critical question to be answered may be: What is the percentage of departing passengers dropped off at the terminal curb?
Random and stratified sampling of passengers (with stratification by flight sector, e.g., short-haul domestic, long-haul domestic, international, and by day of the week) and one- and two-stage cluster sampling (with both random and stratified sampling of flights by sector) are examined. The flight schedule for the survey period includes 610 flights per week, and the number of originating passengers per week is estimated at 48,300. Some 42% of passengers are on short-haul domestic flights, 34% on long-haul domestic flights, and 24% on international flights. From past experience, initial estimates of the percentages of passengers dropped off at the curb are 40% of short-haul domestic passengers, 60% of long-haul domestic passengers, and 90% of international passengers. In the example, the percentage of passengers to be dropped off at the curb is quite strongly related to the flight sector, but fairly weakly related to the day of the week.

Table 3-5 summarizes the required sample sizes for an accuracy of ±2, ±3, and ±4 percentage points for a 95% confidence level using various sampling strategies. The following observations were made from this example:

- Using random sampling, the required sample size approximately doubles as the accuracy improves from ±4 to ±3 and doubles again from ±3 to ±2 percentage points.
- Stratified sampling by flights (which has a strong relationship with the variable of interest) reduces the sample size required by 15%, but stratified sampling by day of the week (which has a weak relationship with the variable of interest) has a negligible effect on the required sample size.
- Cluster sampling with random sampling of flights and surveying of all passengers on those flights was found to be very inefficient, increasing the sample size required by a factor of 9 or more compared to random sampling.
- Cluster sampling with stratified sampling of flights by sector greatly improves the efficiency of cluster sampling. With all of the passengers on the selected flights surveyed, the sample size required is reduced to approximately 3 times that of random sampling.

Table 3-5. Sample sizes in example survey for an accuracy of ±2, ±3 and ±4 percentage points for a 95% confidence level using various sampling strategies.

Sample sizes are for sample mean ± a percentage points, where a = 2, 3, or 4.

| Method | Unit Sampled | 2 pts | 3 pts | 4 pts | Comment |
|---|---|---|---|---|---|
| Random | Passengers | 2,218 | 1,012 | 574 | Random sampling of passengers (pax) |
| Stratified | Passengers | 1,879 | 853 | 484 | Stratified by sector of flight |
| Stratified | Passengers | 2,215 | 1,011 | 574 | Stratified by day of the week |
| Cluster 1 | Flights | 252 | 146 | 92 | Random sampling of flights with all pax on each flight sampled |
| | Passengers | 19,953 | 11,560 | 7,285 | |
| Cluster 2 | Flights | 83 | 41 | 24 | Stratified sampling of flights by sector with all pax on each flight sampled |
| | Passengers | 6,560 | 3,280 | 1,910 | |
| Cluster 3 | Flights | 117 | 47 | 26 | Stratified sampling of flights by sector with 50% pax on each flight sampled |
| | Passengers | 4,615 | 1,860 | 1,040 | |

With a random sample of 50% of passengers on each flight surveyed, the sample size required is reduced by 30% to 2.1 times that required using random sampling. However, with only 50% of passengers surveyed on each flight, the number of flights surveyed increases. Several other percentages of passengers to survey on each flight were examined, and both the 30% and 75% levels resulted in larger passenger sample sizes. The optimal balance between the number of flights and the proportion of passengers on those flights to survey depends on the variation in responses between and within flights, and on the relative costs of surveying passengers and flights, which vary from survey to survey.

The results of this example reflect the assumptions regarding variation used in the example and will vary in other situations. Refer to Appendix B for more information on the example and the calculation of the sample sizes.

In comparing the required sample sizes for different sampling methods, it should be borne in mind that true random sampling of air passengers is almost impossible to achieve, as discussed in Chapter 5.
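
The random and sector-stratified rows of Table 3-5 can be approximated from the figures quoted in the example. The Python sketch below is not the Appendix B calculation, which is not reproduced in this section. It assumes proportional allocation across sectors, in which case the stratified expression for w in Section 3.4.2 reduces to the random-sampling formula with p(1 − p) replaced by the share-weighted average of the within-sector variances. The names `SECTORS` and `required_n` are hypothetical.

```python
N = 48_300  # estimated originating passengers per week

# sector: (share of passengers W_i, estimated proportion dropped off p_i)
SECTORS = {
    "short-haul domestic": (0.42, 0.40),
    "long-haul domestic": (0.34, 0.60),
    "international": (0.24, 0.90),
}


def required_n(variance, a_points, population=N, z=1.96):
    """Sample size for +/- a_points given the variance of a single response."""
    term = z ** 2 * variance
    return term / ((a_points / 100) ** 2 + term / population)


# Simple random sampling: variance from the overall proportion dropped off.
p_all = sum(w * p for w, p in SECTORS.values())
var_random = p_all * (1 - p_all)

# Stratified by sector with proportional allocation (an assumption):
# share-weighted average of the within-sector variances p_i (1 - p_i).
var_strat = sum(w * p * (1 - p) for w, p in SECTORS.values())

for a in (2, 3, 4):
    print(a, round(required_n(var_random, a)), round(required_n(var_strat, a)))
print(f"reduction from stratifying by sector: {1 - var_strat / var_random:.0%}")
```

Under these assumptions the results fall within about 1% of the tabulated passenger sample sizes, and the reduction from stratifying by sector comes out close to the 15% noted in the observations. The small differences suggest that the Appendix B method differs in detail.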
3.4.4 Determining Desired Accuracy

While the mathematics of calculating required sample size is generally fairly straightforward, deciding on the appropriate desired level of accuracy is anything but, because it depends on the consequences of being wrong. Although it is common in statistical analysis to use a target accuracy of ±5% at a 95% confidence level, this is an entirely arbitrary choice and is typically not achievable or not accurate enough for many issues addressed by air passenger surveys.

Consider the case where the characteristic of interest accounts for only a small proportion of respondents, say air passengers using transit to access the airport, which from past surveys is estimated to be approximately 5%. The proportion using transit is to be estimated for a subgroup that composes 20% of the population (e.g., air passengers from a particular part of the region). If the required accuracy for the estimated proportion of this subgroup is ±5% of the estimated proportion (i.e., ±0.25 percentage points) at a 95% confidence level, a random sample survey would require a sample size of 150,000 responses, a level of effort that is totally impractical. Even accepting an accuracy of ±20% of the estimated proportion (i.e., ±1 percentage point) at the same confidence level, the required sample size would still be 9,500, which is potentially achievable but significantly larger than most air passenger surveys.
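
The arithmetic behind these two figures combines the large-population approximation from Section 3.4.1 with the subgroup factor 100/S. A minimal Python sketch, with hypothetical function names, is:

```python
def large_population_n(p, a_points):
    """Large-population approximation at 95% confidence: 40,000 p (1 - p) / a^2."""
    return 40_000 * p * (1 - p) / a_points ** 2


def total_n_for_subgroups(p, a_points, subgroup_percents):
    """Total sample size so that each subgroup reaches +/- a_points.

    Scales the base sample size by 100/S_i for each subgroup and takes
    the largest result, as described in Section 3.4.1.
    """
    base = large_population_n(p, a_points)
    return max(base * 100 / s for s in subgroup_percents)


print(round(total_n_for_subgroups(0.05, 1.0, [20])))   # 9500
print(round(total_n_for_subgroups(0.05, 0.25, [20])))  # 152000
```

The second result, 152,000, is the unrounded counterpart of the 150,000 responses cited in the text.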