The data collection process in any survey operation has a high impact on data quality. Along with the questionnaire and sample design, the data collection and capture processes can be major sources of measurement error—defined as the difference between the value of the variable provided by the respondent and the true, but unknown, value of that variable. Measurement error in the data collection phase of the survey may arise through distortions introduced via the mode of data collection, the effect of the interviewers and their behavior on respondents’ answers to questions, the effect of respondent interpretation of the questionnaire items, and the motivation of the respondent to provide high-quality answers (Federal Committee on Statistical Methodology, 2001).
In dealing with measurement error, as with other sources of error, statistical agencies are expected to design and administer their data collection methods in a manner that achieves the best balance between maximizing data quality and controlling measurement error, while minimizing respondent burden and cost (U.S. Office of Management and Budget, 2006a). In this chapter, several aspects of the survey operation that can contribute to maximizing data quality and controlling measurement error are addressed. We discuss the arrangements for collection of the data and assess the impacts of the various modes of collection, both current and potential. Finally, we discuss the need to capture and preserve information gleaned in the collection process and make recommendations for improving both
metadata (data about individual data items and questions) and paradata (data about the data collection process, whether from the respondent’s or interviewer’s perspective).
The conditions for success in minimizing measurement error are established in the arrangements made for data collection and capture. The administrative arrangements should be documented, stable, and subject to continuous examination and improvement. The National Agricultural Statistics Service (NASS) has chosen to manage data collection by capitalizing on the strong foundation of a long-term relationship with cooperating state agriculture departments and, through that connection, securing the interviewer staff. The cooperative agreement the agency has with the National Association of State Departments of Agriculture (NASDA) has been the mechanism used for field data collection for the Agricultural Resource Management Survey (ARMS) and its predecessors, the Census of Agriculture, and all of its surveys since 1978 (National Association of State Departments of Agriculture, 2007). Before this cooperative agreement, federal employees conducted the interviews for the surveys that preceded ARMS and for other NASS surveys.
The NASDA mission is to represent the state departments of agriculture in the development, implementation, and communication of sound public policy and programs that support and promote the American agricultural industry, while protecting consumers and the environment. The cooperative agreement with NASS is one of three cooperative agreement programs that support that goal for NASDA.
The cooperative agreement with NASS is big business for NASDA. In 2007, it was funded at approximately $27 million. In turn, the NASDA cooperative agreement is the largest cooperative agreement in the U.S. Department of Agriculture (USDA).
Under the guidance of the NASDA national office in Washington, 3,400 part-time (not more than 1,500 hours per year) interviewers are managed through a network of 46 NASDA state field offices—some of which represent multiple states. A total of 43 of the field offices have responsibility for conducting ARMS data collection (those in Alaska, Hawaii, and Puerto Rico do not).
The field staff of 1,400 interviewers and the office/telephone interviewer staff of about 2,100 are deployed and managed by about 520 NASDA supervisors, who are largely recruited from the interviewer pool. The NASS
role with the NASDA field operations reflects an arm’s-length relationship, much like other contractual government data collections. NASS provides guidance to the NASDA supervisory staff and has responsibility for training, but it does not have the authority to hire, discipline, or dismiss individual interviewers or supervisors.
This arrangement has not been opened to competition. When the panel questioned why NASS has not opened the data collection contract to bids from organizations other than NASDA, the response was that the arrangement is very cost-effective and that things have worked well to this point, so NASS is reluctant to change them. This question reportedly arises periodically and is addressed at the management level of USDA.
The state departments of agriculture themselves play a similar supporting role. State offices variously provide staff exclusive to NASS or other inputs (space, funds for printing costs, etc.) through the cooperative agreement. NASDA employees can also conduct telephone data collection, edit interviews, and transcribe paper questionnaires into the Blaise computer-assisted interviewing instruments from the state offices. The only difference in the role of federal and state employees working on NASS projects is where their paychecks originate. (Cooperative programs in NASS and other statistical agencies enable relationships that go beyond the usual hands-off one of a contractor with a government agency. The state employees enjoy a special relationship that extends to management of various survey functions.) There are about 1,100 NASS employees in headquarters and in colocation with state offices. The states have an additional 160 state employees devoted to statistical functions.
Among other things, the cooperative agreement states that the data provided by USDA/NASS are the official state data. By taking advantage of the economies of scale that NASS has in managing ARMS, the states are able to get higher quality data under the agreement than they would be able to assemble on their own.
In this decentralized survey operation, NASS imposes quality measures and monitors the survey process to maintain the quality of the ultimate data from ARMS. The quality control methods used include recruiting and training, sample case control procedures, monitoring of interviewers, and data review. Monitoring is necessarily limited for in-person interviews, but there is a quality control plan to monitor telephone interviews on a sample basis.
NASDA supervisors, who report directly to the NASDA office in Washington, are a critical part of the administration of data collection and a key element in the quality control process. Under the cooperative agreement, supervisors hire and fire interviewers, and they review a sample of the interviewers’ work through recontacts. If a review of a case yields important inconsistencies or other data quality problems, the case is supposed to be returned to the field.
Role of Interviewers and the Interview Process
The interviewer is a critical link in the quality chain. Several types of errors can occur in the dynamic setting of the interview through the interaction among the engagement and innate cognitive abilities of the respondent, the wording of questions, the diligence of the interviewer in following directions, and the tone of voice and personal mannerisms of the interviewer. All are part of a complex interaction that characterizes the interview session, and all play a role in creating what is commonly referred to as the “interviewer effect.” In ARMS, for which many interviewers have substantial workloads, individual interviewers can have a large effect on the variance of estimates.
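The variance inflation produced by interviewer effects is conventionally summarized by the intraclass correlation among responses collected by the same interviewer and the associated design effect. The sketch below, using entirely hypothetical data rather than any ARMS figures, estimates both quantities from a one-way ANOVA on responses grouped by interviewer; the function name and values are illustrative only.

```python
# Sketch: quantifying the interviewer effect via the intraclass correlation
# (rho) and the resulting design effect, deff = 1 + (m_bar - 1) * rho.
# All data and interviewer identifiers below are hypothetical.

def interviewer_design_effect(workloads):
    """workloads: dict mapping interviewer id -> list of responses to one item."""
    groups = list(workloads.values())
    k = len(groups)                      # number of interviewers
    n = sum(len(g) for g in groups)      # total interviews
    grand_mean = sum(sum(g) for g in groups) / n

    # One-way ANOVA components: between- and within-interviewer mean squares
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)

    m_bar = n / k                        # average workload per interviewer
    # ANOVA estimator of the intraclass correlation, truncated at zero
    rho = (ms_between - ms_within) / (ms_between + (m_bar - 1) * ms_within)
    rho = max(rho, 0.0)
    return rho, 1 + (m_bar - 1) * rho

# Hypothetical reported values (e.g., acres) grouped by interviewer
rho, deff = interviewer_design_effect({
    "A": [100, 110, 105, 98],
    "B": [130, 125, 135, 128],
    "C": [90, 95, 88, 92],
})
print(f"rho = {rho:.3f}, design effect = {deff:.2f}")
```

With equal workloads of m interviews each, the design effect is 1 + (m - 1) * rho, so even a small intraclass correlation is magnified when workloads are large, which is why substantial ARMS workloads make interviewer effects worth monitoring.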
According to NASS, the interviewers recruited by NASDA supervisors are primarily from rural areas. They typically have an agricultural background and in fact often come from a farm household. The NASDA interviewers are a somewhat diverse group. Two-thirds of them are women. Although data on race and ethnic group were not available to the panel, we understand that blacks are not well represented. The gender and race/ethnic group of the interviewers are important factors to consider when identifying the effect of the interviewer on the willingness of the respondents to participate in the survey and the reliability of their responses. Although experience indicates that gender does not seem to have a systematic effect on most types of collection of factual information, the evidence on the effects of the race and ethnic group of interviewers is less clear. NASS reports that it is trying to persuade NASDA to increase black as well as American Indian representation. This is a matter of some urgency for USDA, even though the full effect of interviewer race and ethnic group on response is not clearly understood. The potential effect of interviewers and survey methods on the quality of data on small and minority farms has been in the spotlight since two landmark court cases in the late 1990s (Pigford v. Glickman and Brewington v. Glickman) heightened interest in the economic status of minority farmers.
Like all statistical agencies, NASS should be cognizant of the potential for interviewer effects. The agency should document interviewer assignments to individual interviews as a part of its normal data assembly, and it should use that information to deepen understanding of those effects and to develop means of controlling them.
Because many rural areas are relatively sparsely settled, it is not uncommon for interviewers and respondents to know each other. In many other
government surveys, this situation would be systematically avoided. There are concerns that respondents may be inhibited about speaking forthrightly about personal information in front of someone known to them, or that interviewers may not strictly observe the survey protocol in such situations. In ARMS, the issue is taken to the supervisor and the case reassigned only when interviewers feel uncomfortable interviewing acquaintances or respondents express discomfort with being interviewed. In some cases, the usual dynamics of acquaintance may inhibit respondents and interviewers from expressing their objection to the arrangements for the interview. However, NASS contends that it helps the response rate if the interviewer and respondent know one another. This is a topic that deserves further investigation.
Not enough is known about the interview effect on the quality of the data, especially in light of some of the unusual aspects of the interviewer-interviewee interaction in ARMS. A fuller understanding of the interviewer effect would require collection of additional data about the interview, as well as a scientific analysis of those data. This is the kind of methodology research topic that could be addressed by a dedicated staff of research social scientists.
Recommendation 5.1: ARMS should use automated means to collect paradata on interviewer assignments to cases, the relationship between the interviewer and the sample farm respondent (i.e., whether they know each other), demographic characteristics of the interviewer, and the characteristics of the sample farms for nonrespondents that are coordinated with information obtained for respondents, either through the interview or interviewer observation. These paradata could be used to determine the need for additional research on the impact of the relationship between the interviewer and the respondent on the quality of answers. This data collection can best be facilitated using computer-assisted technologies.
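A paradata record of the kind the recommendation describes can be quite simple. The sketch below is a hypothetical schema, not an existing NASS structure; all field names are illustrative. The key design point is that the same record is written for respondents and nonrespondents, so that their farm characteristics can be compared directly.

```python
# Sketch of a per-case paradata record in the spirit of Recommendation 5.1.
# The schema and field names are hypothetical illustrations, not NASS's.
from dataclasses import dataclass, field, asdict
from typing import Optional

@dataclass
class InterviewParadata:
    case_id: str
    interviewer_id: str
    interviewer_gender: Optional[str] = None
    interviewer_race_ethnicity: Optional[str] = None
    knows_respondent: Optional[bool] = None   # prior acquaintance, if any
    mode: str = "in_person"                   # in_person, telephone, mail, web
    responded: bool = False
    # Recorded for respondents and nonrespondents alike, via interview or
    # interviewer observation, so the two groups stay comparable.
    farm_characteristics: dict = field(default_factory=dict)

# Example: a completed case and a refusal share the same structure.
completed = InterviewParadata("C-001", "I-42", knows_respondent=True,
                              responded=True,
                              farm_characteristics={"acres": 480})
refusal = InterviewParadata("C-002", "I-42", responded=False,
                            farm_characteristics={"acres": 1200})
print(asdict(completed)["knows_respondent"])
```

Capturing these fields automatically inside a computer-assisted instrument, rather than on paper, is what makes the subsequent research on interviewer-respondent relationships feasible.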
As for other surveys, training for ARMS interviewers covers techniques to gain respondent cooperation, questionnaire administration, and general record keeping. ARMS training also addresses conceptual issues, such as biosecurity, and interview skills, such as cultural awareness. Moreover, because the interviewers are given a chance to edit the paper questionnaires, they receive an unusual amount of training in the appropriate technical skills.
When using paper questionnaires, ARMS interviewers have flexibility in moving around the questionnaire (navigating) and varying the question presentation. It is evident from the feedback interviewers give and from observations that interviewers often move back and forth throughout the instrument and provide additional help to respondents in ways that are not usually allowed in conventional structured interviewing.
The assistance that interviewers provide extends to helping respondents with estimating a figure. To respond to many ARMS questions for which easily retrievable data are unlikely to be at hand, the respondent will often need to engage in mental arithmetic or perform a paper-and-pencil calculation to provide an estimate. For example, responding to a question on the number of acres of soybeans may take some time to calculate, because many farmers have numerous fields of soybeans of varying acreages. Similarly, the conventional emphasis on adhering to the standardized question wording is not a top priority for interviewers. These deviations appear to be the result of several factors.
First, because many of the terms in ARMS questions are either technically complex or differ conceptually from the ways that some respondents think about them, interviewers are trained to provide definitions or explanations if a respondent seems unfamiliar with or confused by a particular financial or agricultural term. However, some respondents are given definitions and others are not, so the question stimulus differs among respondents, departing from a key requirement of orthodox standardized interviewing. Second, because interviewers are under pressure to complete interviews and respondents are sometimes impatient with the length of the ARMS questionnaire, the interviewers have an incentive to take actions that will tend to minimize the burden on the respondent. In particular, if an answer can be inferred from previous answers, the interviewers are likely to enter the response without even asking the question. Third, ARMS interviewers also depart from the script to improvise and ask additional questions to understand the respondent’s situation. Such probing is often built into surveys, but it appears that much activity of this kind in ARMS is left to the initiative of the interviewer. Fourth, in the course of an interview, interviewers may learn that some items are best recalled by respondents in a different order than that specified by the questionnaire and some respondents may remember items or change their minds about earlier answers as a result of later explanations or the shift in context that can occur after exposure to additional questions.
These departures from standardized interviewing practices in ARMS may have exactly the intended consequence of ensuring that respondents understand the questions, interviewers understand the respondents’ circumstances, interviews are as brief as possible, and the data are as accurate as possible. But then again they may not, and virtually nothing systematic is known about the effects of such practices in ARMS.
The potential behavior of interviewers in dealing with item nonresponse is also a concern. It is believed that interviewers sometimes work intensively with respondents to obtain answers. Such effort is encouraged, but interviewers are instructed not to fill in responses that the respondent cannot or will not answer. Nonetheless, there appears to be evidence that
in some cases the interviewers assign values to what would otherwise be missing data, based on their information and their beliefs about how farms and farmers operate. There is a concern that such assignments are not documented. Moreover, it is entirely possible that interviewers might assign responses that are different from and inferior to what would have been assigned using systematic procedures used for other missing data. The current official procedure is that, when NASDA interviewers have any issues of missing data for which they think they have special knowledge, they should contact the state office for follow-up.
As indicated above, the interview process in ARMS is quite unorthodox for the federal statistical community, for which standardized, structured interviewing is the norm. For example, ARMS interviewers clarify survey concepts by providing definitions and examples when respondents seem to need this help, and they sometimes record answers in the order that respondents provide them, even when this is not the order in which the questions appear on the form. Standardized interviewing (e.g., Fowler and Mangione, 1990) rests on the assumption that all respondents are presented exactly the same questions (i.e., words) and all questions are presented in exactly the same order. The idea is that if the stimulus given to respondents is exactly the same, then differences between their answers can more easily be attributed to actual differences in the respondents’ circumstances than if the question stimulus varies between respondents. Thus, providing definitions and examples, doing so with improvised wording, and providing this additional information to only some respondents conflict with the basic tenets of standardization. Similarly, recording answers that respondents provide “out of order,” without reading (or rereading) all questions in the order they appear in the form, is also not standardized because the context (the immediately preceding questions) differs for different respondents.
This departure from strict standardization does not necessarily compromise data quality and may actually be appropriate for collecting ARMS data. But to our knowledge there is no evidence that directly bears on the impact of nonstandardized interviewing in ARMS. Stanley did study the interviewer-respondent interaction in the NASS quarterly agricultural survey and observed considerable departure from standardized interviewing, primarily to avoid violating conversational norms, such as being redundant (Stanley, 1996). For example, interviewers failed to read the introduction to questions 72 percent of the time that they should have been read, presumably because the design of the questionnaire called for the same introductions to be read identically numerous times. These interviewers also did not read the entire question 22 percent of the time, and they changed the wording of questions in a way that altered their meaning 19 percent of the time. Stanley argues that there were sound conversational reasons for doing this and that imposing stricter adherence to standardization might degrade
the accuracy of responses. However, the study did not produce data that might bear on response accuracy.
There is a body of evidence collected by Conrad and Schober (Conrad et al., in press; Schober et al., 2004; Conrad and Schober, 2000; Schober and Conrad, 1997) that directly compares data quality (primarily response accuracy) in strictly standardized interviews to data quality in more flexible or “conversational” interviews. Under the latter technique, interviewers can choose their words to make sure respondents understand the questions as intended. Although these studies do not examine any items from ARMS (or agricultural surveys for that matter), the results may be instructive here. Across these studies, Conrad and Schober have found that allowing interviewers to clarify questions (primarily from federal surveys), using words of their choosing, improved respondents’ comprehension and response accuracy, particularly when the circumstances on which they were reporting were ambiguous. For example, when respondents were asked how many people lived in a household that included one child away at college, they were far more accurate in those interviews in which interviewers could explain that a child living away at school is not counted in this survey; strictly standardized interviewers would not be able to provide this clarification unless they did so for all respondents, whether or not respondents asked for it. Conrad and Schober found no evidence that conversational interviews misled respondents or in other ways biased answers. The cost of this improved data quality was longer interviews due to the time required to provide clarification. The current approach to collecting data in ARMS has some of the character of Conrad and Schober’s conversational interviewing and so may improve respondents’ comprehension. However, its effects in ARMS itself have not been directly measured.
For this reason, the ongoing research and evaluation program we recommend should systematically explore the ways that different interviewing practices may affect ARMS data. A related area of inquiry, which should be on the agenda of an ARMS research and evaluation program, concerns the origins of the information that respondents use when answering ARMS questions. While some information is currently collected about how respondents obtain this information, this could be studied much more systematically and in much greater depth. It seems likely that respondents simply know the answers to some questions, calculate the answers to others based on what they know (e.g., “Each shipment fills the back of my truck so the annual amount must be in the neighborhood of 60 tons”), and refer to records and financial reports for others. All of these approaches may contribute different types of error in the measurement process and deserve investigation. Review of reporting practices and the interaction between the interviewer and the respondent should be a part of an overall evaluation of the interviewing techniques used in ARMS.
As discussed in Chapter 4, respondents are expected to consult various documents, such as tax returns and USDA program documents, during the interview. Reference to documents is typically taken as a positive sign for data quality when there is conceptual agreement between the documents and the questions asked in the interview. More troubling is the report that interviewers and respondents frequently need to make calculations during the interview in order to provide an answer to some questions. Such actions tend to increase respondent burden, and, unless interviewers or respondents are experts in accounting, there may be a serious compromise in the quality of the data recorded.
Recommendation 5.2: NASS should systematically explore the consequences of interviewer departures from standardization in the interview. To facilitate this, NASS should collect paradata on the frequency with which interviewers follow the order of the questionnaire, read questions as worded, provide clarification, and similar indications of departures from standardized procedures.
Role of the Reinterview in Data Accuracy
The panel considers the control and measurement of data accuracy to be major issues for the ARMS data, especially for the cost-of-production and farm income figures. Some control and measurement methods are employed, and others, found useful in other settings and in prior incarnations of ARMS, are not.
As mentioned above, under the cooperative agreement, supervisors recontact a sample of respondents in the process of conducting a case review. NASS reports that the quality control recontacts are randomly selected for each interviewer and supervisory interviewer. Additional recontacts are made if problems are suspected or uncovered. Quality control recontacts are made by the field office survey statistician and by supervisory enumerators.
At the conclusion of the survey operation, as part of post-survey activities, the state statistical offices attempt to capture conceptual and reporting issues in the formal Survey Evaluations (Form E-2) that are forwarded to the national office of NASS. While important to ensure interviewer quality and provide insights on reporting problems, the current quality control procedure is no substitute for a program of systematic validation (reinterview) studies. There have been no reinterview studies conducted for ARMS in recent years.
A formal reinterview study is an important method for estimating and reducing nonsampling errors in surveys (Biemer and Forsman, 1992). It can, like the current recontact program in ARMS, evaluate the fieldwork
by detecting and discouraging cheating and interviewer errors. It can also play a more comprehensive role in ferreting out nonsampling error by identifying content errors and by estimating two components of a survey error model: (a) simple response variance, the variability in the survey estimate over conceptual repetitions of the survey, and (b) response bias, the response errors that would be consistent over repetitions of the survey. Content errors include definitional problems, misinterpretation of questions and survey concepts, and reporting errors.
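Both quantities can be estimated from matched original/reinterview pairs with standard reinterview formulas (see Biemer and Forsman, 1992). The sketch below uses hypothetical data: the simple response variance estimator is half the mean squared difference between the two trials, and the net difference serves as a rough bias indicator when the reinterview is treated as the more accurate measurement.

```python
# Sketch: estimating simple response variance and a net-difference bias
# indicator from original-interview vs. reinterview response pairs.
# The data below are hypothetical illustrations.

def reinterview_measures(pairs):
    """pairs: list of (original, reinterview) responses for one item."""
    n = len(pairs)
    diffs = [o - r for o, r in pairs]
    # Simple response variance: half the mean squared difference between trials
    srv = sum(d * d for d in diffs) / (2 * n)
    # Net difference: an indicator of response bias when the reinterview
    # is treated as the preferred ("truth") measurement
    net_diff = sum(diffs) / n
    return srv, net_diff

# Hypothetical (original, reinterview) reports, e.g., head of cattle
pairs = [(100, 100), (250, 240), (80, 95), (60, 60), (130, 120)]
srv, bias = reinterview_measures(pairs)
print(f"simple response variance = {srv:.1f}, net difference = {bias:.1f}")
```

In practice the reinterview sample must be drawn so that each pair is an independent repetition of the measurement; a dependent reinterview (in which the respondent sees the earlier answer) measures something weaker, as noted later in this section.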
NASS has a rich history as a pioneer in implementing recurring reinterview studies in the agricultural surveys (Hanuschak et al., 1991). As early as 1975, formal reinterview studies were conducted to probe how well reporting unit concepts were understood by the respondent. (Probing questions in this early study found that approximately 30 percent of respondents incorrectly reported total acres operated, and 20 to 30 percent incorrectly reported specific livestock inventories [Bosecker and Kelly, 1975].) From the late 1980s to the early 1990s, several important reinterview studies were conducted to measure response bias in surveys that included quarterly grain stocks, the June agricultural survey, on-farm grain stocks, hogs, and cattle on feed. These studies not only identified conceptual difficulties that could be remedied by changes in questions or in interviewer instructions and training, but also had the more practical application of informing the board estimation process with measures of response bias. Although expensive in terms of statistician and interviewer time and additional burden on the part of respondents, these formal reinterviews were considered a success and formed the basis for a panoply of recommendations for a more robust reinterview program in the future.
Particularly because of the complex nature of the ARMS questions and the unorthodox approach taken to interviewing, the panel is concerned about the lack of systematic and continuous collection of information about response variance and response bias in the survey. If it is judged that the resource cost and response burden of an ongoing, formal reinterview study of a sample of respondents are too large, there may be other techniques that will yield useful estimates of nonsampling error in the survey. One simple technique would be to record the recontact interviews so that adherence to the structured interview protocol can be verified. An alternative would be to conduct dependent interviews (i.e., providing the respondent his or her answers from the previous interview and checking for changes) rather than full reinterviews.
Recommendation 5.3: NASS should use available analytic tools, for example, cognitive interviews, interviewer debriefing, recording and coding of interviews, and reinterviews, to investigate the quality of survey responses.
The ARMS program has not fully exploited the available technologies for data collection. Considerable progress has been made in enhancing the back office operations involved with data capture, editing, and processing, and there have been far-sighted projects to test data collection through the use of personal digital assistants (PDAs) and global positioning system (GPS) devices. However, the development of and conversion to an automated field data collection mode lags behind the state of technology in data collection.
One collection mode that stands out as especially promising for further development is computer-assisted interviewing (CAI). There may be institutional issues that explain the lack of progress by NASS here. USDA has held that there is no technology currently available that can efficiently collect data on farm chemical use, production practices, cost-of-production information, and detailed cost and income statistics (National Agricultural Statistics Service, 2005).
A factor that complicates technological advances in ARMS is the wide variety of collection modes that are currently in play. In Phase I, data are now collected in four modes—primarily through telephone interviewing (with about a 75 percent telephone response), with the remainder split among mail response, experimental web collection, and personal interview. (Mail response has been low enough in some states that ARMS does not attempt to use this method there.)
ARMS has recently begun preparations for expanding reporting via the Internet. It is expected that a web-based instrument will be offered for the 2007 ARMS Phase I screening. One concern about a shift to this mode of data collection is that it may also induce changes in the ways that respondents answer questions. This planned change reinforces the importance of research to understand how respondents answer questions in the ARMS interviews.
There are currently no plans to develop a web-based instrument for ARMS Phase II or for the fruit and nut, vegetable, or postharvest chemical use surveys, since much of the data collected requires the identification of a specific farm field that is planted to a specific commodity, and this field identification reportedly cannot easily be made on the Internet. Also, the detailed chemical application data are often copied from farm records by the interviewer during the interview. Plans are that Phase II will continue to be collected solely through personal interviews. However, other computer-based technologies such as PDAs and GPS devices might be relevant for this phase. These technologies are discussed later in this chapter.
Research will commence on the development of a mail instrument for the full ARMS economic version (ARMS Phase III cost and returns), which to date has been collected through face-to-face interviews. This research
will build on the already existing mail instrument that currently covers the core (version 5) Phase III questionnaire, in anticipation of coordinating data collection with the 2007 Census of Agriculture to be conducted in early 2008 (National Agricultural Statistics Service, 2005). In that regard, development is currently under way for a web-based version of the ARMS economic phase (Phase III) core questionnaire, which was first mailed to respondents in 2004. However, as with the proposed change to web-based data collection for Phase I, it is important that ARMS have in place a research program to identify and control any adverse effects of this change in interview mode.
Computer-Assisted Personal Interviewing
Until these research and development projects bear fruit, it is expected that the vast majority of the data in Phases II and III will continue to be collected through face-to-face interviews. That being the case, there should be serious consideration of automating the face-to-face interview process by using computer-assisted personal interviewing (CAPI). The basic idea of CAPI is quite simple—instead of having interviewers navigate through paper-and-pencil entry on their own, with CAPI a computer controls the logical flow of the interviews, presents appropriate versions of the questions to be read, and offers a place for the direct recording of the answers to the questions.
The possible use of electronic data collection for the agency’s personal interview surveys has been tested, evaluated, and discussed at length since the first CAPI test with the 1989 September agricultural survey (Eklund, 1993). This first experiment was followed by a test of collection of the farm cost and returns survey in February 1991. The conclusions from these studies were very favorable to the adoption of CAPI:
•	Interviewers can learn and use CAPI effectively, even for the most difficult surveys.
•	The data quality showed potential improvement, particularly through ensuring that interviewers ask every question and enter the answers into the proper cells.
•	Respondent reaction was mostly indifferent, but more positive than negative.
•	Interviewer reaction was often initially apprehensive but turned enthusiastic as training progressed. The positive reaction, however, may not hold for all interviewers.
•	CAPI costs compared favorably with the paper-and-pencil method.
Now, over a decade after those pioneering tests of CAPI, NASS has not yet developed a formal business case analysis for the use of CAPI with ARMS. The agency acknowledges the potential of CAPI but does not expect to move to that mode of interviewing until 2009 at the earliest.
Cost is always an inhibiting factor in adopting new technologies in statistical agencies and, indeed, the primary constraint in moving to CAPI is cost, according to NASS. One major element of cost is the purchase of necessary data entry equipment. Although the inflation-adjusted price of laptop computers continues to decline, purchase of sufficient machines for the entire ARMS field staff would still require a substantial sum of money. Particularly in light of the relatively short life span of a computer, purchase could be justified only if the machines could be used for a number of surveys over the effective life of the machine. NASS reported that its principal surveys that use personal interviewing take place only once a year and that such surveys are too few in number to make CAPI economical. In addition, the agency cited pressure to reduce personal interviewing even more, in favor of increased telephone and self-administered mail and web data collection. It is not clear how such arguments apply to leased computers. Some other organizations routinely use leased laptop computers to support CAPI.
With CAPI there is no data entry operation after collection and typically no opportunity for “data grooming” (i.e., manually revising handwritten information after the interview) by the interviewer. Thus, two time-consuming procedures could be eliminated. However, straightforward implementation of the ARMS questionnaire in CAPI might not be faster than the current paper-and-pencil approach. Indeed, it is possible that it could be slower, if interviewers were still obliged to navigate back and forth through the instrument in order to record information appropriately.
Although the loss of the possibility of post-interview data review could be costly in terms of data quality, the logic-based structure of CAPI makes it possible to introduce systematic quality control checks during the interview and to resolve flagged problems while the respondent is present. A debriefing form filled out by the interviewer for each completed case, as is done with the Survey of Consumer Finances, could provide an opportunity for critical comments that interviewers were unable to record during the course of the interview.
For obtaining panel data, as the Economic Research Service (ERS) has indicated it would like to accomplish in the ARDIS initiative (see Chapter 2), dependent interviewing is a good way to detect and eliminate spurious data changes. With CAPI, such comparisons are straightforward. However, it should be noted that such an approach can produce mixed results, as in the Current Population Survey experience. When the CPS converted from paper to computerized data collection, dependent interviewing was
introduced. This drastically reduced reported change in occupation from what were clearly spuriously high levels (Polivka and Rothgeb, 1993). However, the extremely low levels of change after dependent interviewing was introduced may lie below true levels of change, reflecting respondents’ recognition that when asked if a change has occurred, reporting no change will lead to the shortest interview because there will be no follow-up questions about the new job.
Computerization of data collection makes it possible to do things that are either very difficult or impossible with a paper-and-pencil interview. The presentation of multimedia information is a straightforward matter with CAPI; for example, one might present the respondent with images of crops, pesticides, aerial photographs of fields, etc. When the Survey of Consumer Finances moved from paper to the CAPI system, the “don’t know” responses for dollar amounts were virtually eliminated because interviewers were told to automatically probe the respondents according to a prescribed protocol (as noted earlier, the decline in “don’t know” responses was accompanied by no significant change in the proportion of refused answers). Computerization also makes it far easier to collect paradata in a form that facilitates its use for methodological and substantive research, as we discuss later in the chapter. This would allow, among other things, the identification of items that are difficult for respondents to answer. Similarly, items that are given inadequate thought can be identified because their response times are too brief.
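The latency-based screening described above can be sketched in a few lines. This is a hypothetical illustration, not an actual NASS procedure: the question identifiers, timing values, and thresholds are invented, and in practice thresholds would be set empirically per item.

```python
from statistics import median

# Hypothetical per-item response latencies in seconds, keyed by question ID,
# of the kind that could be derived from CAPI timestamp paradata.
latencies = {
    "Q12_acres_planted": [4.1, 3.8, 5.0, 4.4],
    "Q37_fuel_expense": [48.0, 61.5, 55.2, 70.1],
    "Q44_household_income": [1.2, 0.9, 1.4, 1.1],
}

def flag_items(latencies, too_brief=2.0, too_slow=45.0):
    """Flag items whose median latency suggests inadequate thought
    (too brief) or respondent difficulty (too slow)."""
    flags = {}
    for item, times in latencies.items():
        m = median(times)
        if m < too_brief:
            flags[item] = "possibly-rushed"
        elif m > too_slow:
            flags[item] = "possibly-difficult"
    return flags

print(flag_items(latencies))
# -> {'Q37_fuel_expense': 'possibly-difficult', 'Q44_household_income': 'possibly-rushed'}
```

Even this crude screen would let questionnaire designers sort items into those warranting redesign (difficult) and those warranting validation checks (rushed).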
In very complex interviews, programming costs for CAPI may be substantial and considerable time may be required to debug the instrument. In a repeated survey in which revisions are gradual, costs are far lower in waves after the first one. Effectively, the cost of the initial programming would be amortized over many survey administrations. Similarly, other costs of transition, such as the programming necessary to extract the data in a form that would be comparable to existing processing systems, would largely be borne once.
In light of the now extensive experience with CAPI in other surveys, concerns about respondents’ reactions to an interviewer who is entering information into a computer strike us as overblown. Farmers, like most people in American society, have come to accept—whether grudgingly or with open arms—the ubiquity of computers in everyday interactions. If there is reason to believe that farmers are a special case—and we do not think they are—then this warrants special study. We have no reason to think that introducing a computer into the interview would deter participation in 2007 or beyond.
The one place in which the move to CAPI in the 1990s clearly reduced cost was eliminating back-end tasks like data entry (e.g., Baker et al., 1995). Otherwise the cost was largely a wash. With ARMS, CAPI would eliminate both the data entry operation and data grooming by interviewers, tasks whose time requirements add to salary costs. Moreover, the reduced price and increased power of laptop computing since the early days of CAPI would almost surely reduce the cost of this transition in ARMS compared with the cost in the early days of the technology.
Web-Based Data Collection
Data collection via the Internet offers potentially large cost savings. Beyond the initial cost of programming the instrument, the marginal costs should be minimal. If such an approach is successful, it would be possible to increase the sample size substantially with minimal cost consequences.
However, any change of mode requires careful thought. The experience of ARMS with self-administered interviews is largely concentrated in the short version of the Phase III interview. A systematic investigation of possible mode effects in that questionnaire version should be a high priority, and it should certainly take place before considering more intensive web-based data collection. Possible adverse perceptions about the privacy of data entered via the Internet should also be studied.
As with similar concerns about CAPI, the concern that farmers will not use the Internet because they lack the computer sophistication to do so also strikes us as unfounded and demeaning. The Internet is part of modern life. In fact, it may be more important in rural than in suburban and urban regions because it connects people with the rest of the world. Although at present the farm population is somewhat less likely to have access to a broadband connection, the difference is apt to shrink in the near future.
Integration of CAPI and Web-Based Collection
Can developing a CAPI questionnaire reduce the costs of developing a web-based questionnaire? Because CAPI involves an interviewer but web-based collection is self-administered, it is not easy to directly use the CAPI instrument on the Internet without some modification. But in survey programming languages in which questionnaire “routing” (a specification of the logical path between objects in the interview) is created separately from question text, as is the case with Blaise and MR interview, it is likely that there would be substantial savings in web-based development costs and time by adapting a CAPI questionnaire. With careful planning, a simple change in the display format, which may be accomplished through templates and stylesheets, may be sufficient in many cases to render such an adaptation relatively simple. Even with less sophisticated computer languages, the question logic of both CAPI and web-based questionnaires is likely to be similar, as are the user interface decisions (e.g., check boxes
for check-all-that-apply questions, radio buttons for mutually exclusive response options). Moreover, there is much overlap between the set of skills required to do both kinds of programming, so the same programmers can almost certainly do both.
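The idea of defining question text, response options, and routing once, then rendering them differently for an interviewer-administered CAPI screen and a self-administered web page, can be illustrated with a toy sketch. The question, its identifier, and the routing targets here are all invented for illustration; real systems such as Blaise encode this far more richly.

```python
# One question specification drives two presentations: routing and text are
# held separately from display, as the chapter describes. All names invented.
QUESTION = {
    "id": "crop_insurance",
    "text": "Did this operation purchase crop insurance in 2006?",
    "options": ["Yes", "No"],
    "route": {"Yes": "insurance_premium", "No": "next_section"},
}

def render_capi(q):
    # Interviewer-facing: question read aloud, numbered keys for entry.
    lines = [f"READ: {q['text']}"]
    lines += [f"  [{i + 1}] {opt}" for i, opt in enumerate(q["options"])]
    return "\n".join(lines)

def render_web(q):
    # Respondent-facing: radio buttons for mutually exclusive options.
    items = "\n".join(
        f'<label><input type="radio" name="{q["id"]}" value="{opt}">{opt}</label>'
        for opt in q["options"]
    )
    return f"<p>{q['text']}</p>\n{items}"

def next_item(q, answer):
    # The same routing table serves both modes.
    return q["route"][answer]

print(next_item(QUESTION, "Yes"))  # -> insurance_premium
```

The routing function is identical for both modes; only the render function changes, which is the source of the development savings discussed above.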
As noted earlier, the ARMS data collection process as currently structured appears to require considerable flexibility on the part of the interviewer—for example, the ability to navigate between questions in unanticipated order. If such nonlinearity cannot be eliminated or substantially reduced by additional questionnaire research, CAPI implementation might not be able to depend on a standardized approach to instrument development. Nonetheless, such systems have been developed and used successfully in other survey programs for instrument testing, and with creativity they could be adapted to provide an electronic index that facilitates data collection in the field. It is likely that NASS would have to hire or train a small staff of dedicated interview programmers or engage a firm skilled in electronic questionnaire development. Despite the possible difficulties and additional effort required to computerize ARMS, we think computerization is clearly worthwhile.
Recommendation 5.4: NASS should move to computer-assisted interview and possibly web-based data collection, after research and testing to determine possible effects of the collection mode on the data. CAPI and web-based data collection will provide opportunities to increase timeliness, improve data quality, reduce cost, and obtain important paradata.
Electronic Devices in Data Collection
NASS has recently experimented with using electronic devices in personal interviews, such as for locating sample points with GPS devices in Washington State and collecting cotton yield objective survey data in North Carolina. The resulting research reports, which can be obtained from the NASS website, are summarized below (National Agricultural Statistics Service, 2007b).
In 2004, the NASS Washington field office and the Research and Development Division combined efforts to study the practicality of using handheld GPS receivers to augment the ARMS Phase II survey data (Gerling, 2005). Washington field interviewers were supplied with Garmin GPS-72 receivers to obtain latitude and longitude coordinates of the sampled fields, rather than using county highway maps and the DLG Map software, as had been done previously.
In general, the field interviewers had few problems using the GPS receivers. Of the 211 positive usable reports, 22 operations (10.4 percent) had fields that could not be accessed by the field interviewer because the operator refused permission to approach the fields or the weather conditions made the fields inaccessible. These fields were recorded on county maps and later transferred to the DLG Map software to obtain the latitude and longitude coordinates. On average, interviewers spent 20 minutes and drove 9 miles to obtain the coordinates of each field with the GPS receivers.
The use of GPS devices in data collection has particular promise for modernizing and enriching the Phase II data collection, in which a specific field is the sample unit. Locating the centroid of the field with a GPS device would add considerably to the data value of the information. This could be done at a surprisingly reasonable cost. The study estimated the total data collection cost of implementing GPS receivers in all states for the 2006 ARMS Phase II sample at $127,264. The first year’s annual cost would be in the range of $50,000, with subsequent years’ costs affected by inflation.
PDAs are another promising technology. In another experiment, the North Carolina field office used PDAs to collect data for the 2004 cotton objective yield survey Form B data (Neas et al., 2006). The office developed a user-friendly data collection instrument to collect Form B data on a PDA and securely transmit them. Results showed that field interviewers could successfully collect and transmit the data via a PDA, providing the data more quickly, eliminating mailing costs, and improving the overall quality of the data, since edit checks were incorporated into the data collection instrument. However, a cost-benefit analysis showed that PDAs would need to be used in more field data collections and administrative activities to be considered cost-effective.
Finally, we note that a GPS device can be integrated into a PDA. This would facilitate collecting geographic information in the context of the PDA-driven data collection. This variant, wedding the promise of GPS for the Phase II collection with the advantages of portability offered by the PDA, should be tested as well.
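As a concrete, if simplified, illustration of the centroid idea: for a small field, averaging the latitude and longitude of GPS fixes taken around the field boundary gives a serviceable approximation of its center point. The coordinates below are invented; a production system would use proper geodetic computation for large or irregular fields.

```python
# Sketch only: estimating a field "centroid" from GPS boundary fixes by
# averaging vertex coordinates (adequate for small fields). Data invented.
def approx_centroid(points):
    """Average latitude/longitude (in decimal degrees) of boundary fixes."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return (sum(lats) / len(lats), sum(lons) / len(lons))

# Four corner fixes of a hypothetical rectangular field in Washington State.
field_boundary = [
    (46.601, -120.505),
    (46.603, -120.505),
    (46.603, -120.501),
    (46.601, -120.501),
]

lat, lon = approx_centroid(field_boundary)
print(round(lat, 3), round(lon, 3))  # -> 46.602 -120.503
```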
Electronic Data Interchange
Electronic data interchange (EDI) would allow the direct uploading of a farming operation’s financial records to a USDA database. This approach should be seriously explored as an alternative to conventional modes of self-reporting for Phase III data collection. This will require ethnographic research to understand the current practices of farmers so that a system can be designed to match respondents’ record-keeping practices.
The record-keeping practice surveys discussed in Chapter 4 can provide information on the extent to which respondents maintain their records electronically. EDI may be an attractive option for some respondents, particularly those who would rather not sit through a lengthy interview. If so,
this could increase response rates for this subset of respondents and reduce interviewing costs.
The experience of the Current Economic Statistics program at the Bureau of Labor Statistics, which includes substantial research and evaluation, provides an excellent starting point. There are, of course, many differences between those data collections (which request just a few numerical entries) and ARMS (which requests many items in many formats).
There is growing use of standardized electronic bookkeeping and report preparation packages in the farming sector, and, as such systems are further promulgated, EDI has the potential to collect some common data items seamlessly, often much more quickly and potentially more frequently than is now possible. There are, however, several potential roadblocks. Any effort to proceed with EDI would need to be sensitive to respondents’ feeling that providing actual records directly to a government agency may compromise their privacy more than reporting to an interviewer on a question-by-question basis, in which the respondent and the interviewer are in control of the flow of information. Some respondents may require the continuing assurances or persuasion of an interviewer to maintain their cooperation. For those who want a more efficient way of sharing their data, EDI ought to be an option.
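The core translation an EDI pipeline would need, from a farm’s bookkeeping export to survey items, can be sketched as follows. Everything here is hypothetical: the export format, category labels, and item codes are invented for illustration and correspond to no actual bookkeeping package or ARMS variable list.

```python
import csv
import io

# Invented example of a bookkeeping export (CSV text) from a farm package.
EXPORT = """category,amount
Fuel,5200
Seed,8900
Fertilizer,3100
"""

# Hypothetical crosswalk from bookkeeping categories to survey item codes.
ITEM_MAP = {
    "Fuel": "fuel_expense",
    "Seed": "seed_expense",
    "Fertilizer": "fertilizer_expense",
}

def to_survey_items(csv_text):
    """Translate an exported expense ledger into survey-item form."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {ITEM_MAP[row["category"]]: int(row["amount"]) for row in reader}

print(to_survey_items(EXPORT))
# -> {'fuel_expense': 5200, 'seed_expense': 8900, 'fertilizer_expense': 3100}
```

The hard part in practice is not the code but the crosswalk: the ethnographic research recommended above is what would establish how farmers actually categorize their records.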
DATA CAPTURE, EDITING, AND PROCESSING
ARMS employs a multilayered process of data capture, editing, and processing. Interviewers perform an initial review of their interviews with the goal of correcting errors; a systematic review of the data occurs in the field offices; keyed data at data entry points is carefully monitored; NASS data review happens simultaneously with the field office review; and an outlier board with representation from both NASS and ERS reviews outliers.
Supporting this multilayered system are automated tools, both off-the-shelf and internally developed. PRISM, an interactive data review system developed by NASS, allows for interactive review of error listings from computerized batch edits and previously submitted data corrections. NASS also uses the Feith system to review scanned images of keyed questionnaires and the NASS-developed IDAS tool to review data at both micro and macro levels.
These procedures appear to be fully in keeping with standard practices for data capture, editing, and processing, and the high degree of sophisticated process automation appears to guard against the generation of errors in these processes. A defect of the process is that information about changes to the data is either not systematically retained or not retained in a way that can support methodological research.
METADATA AND PARADATA
We have identified several points in the process of data collection in which errors could be generated—points that, at a minimum, should be transparent for full understanding of the meaning of the data. We have suggested that one way that NASS and ERS can further assess quality and assist data users to evaluate the quality of survey information is through capturing and providing supplementary information, known as metadata and paradata.
Survey metadata provide context that can help in the interpretation and use of individual data items and statistical aggregations of them. The most basic form of such information is the text of the questions, including response options, that elicited the data, and any system of codes needed to understand the meaning of open responses. Among many other types of information, the question text provides the time period for which the respondent has reported activity. This can be critical in making sense of answers.
Another class of metadata consists of indicators that reflect the quality of the individual pieces of information collected, such as information on the original status of each item (whether it was reported fully by the respondent or was missing in a particular way), actions taken that altered the original item in terms of content or position in a set of data as a result of editing or any form of data processing, comparisons of values to parallel values from other sources, particular evidence from initial question testing and design that may bear on the content and reliability of the questions asked, among others. If answers to a particular question undergo substantial amounts of editing, this suggests that respondents may be consistently misunderstanding the question, or when the interview is administered by interviewers, that the interviewers may be making systematic errors. If an interviewer is involved, some measure of interviewer characteristics can be useful context. For example, if experienced interviewers are collecting consistently more “no” responses when a “yes” response would lead to an additional set of questions, this could suggest that veteran interviewers are subtly biasing answers to lighten their burden (and that of the respondents).
Survey paradata include information about the processes that generated the final individual data records, which also can be taken to include metadata as a subset. The wider categories of paradata can include aspects of individual interviewers’ speech, such as whether they read the question exactly as intended and whether they probed for more information; indications of respondent effort or uncertainty, such as response latency or changes to initial answers; indications of the use of auxiliary information by respondents, such as administrative records; case history information on all attempts to interview each respondent; an indication of the mode of data collection; information on imputation; information about interviewer training and support; cognitive evaluations of survey questions; computer routines for data processing and imputation; and other systematic processes affecting the final data.
The collection and recording of metadata and paradata vary in difficulty and cost depending on the mode of data collection. In particular, computerization greatly lessens the monetary cost of collecting some types of paradata, particularly records associated with the management of cases and the traces of screen navigation recorded as mouse clicks and other key entries during an interview; up-front programming is typically the only such nonnegligible monetary cost. Such interface actions comprise the vast majority of a respondent’s outward behaviors during a web-based questionnaire or an interviewer’s behaviors in a computer-assisted interview. Similarly, interviewer and respondent speech can be easily captured digitally and then linked to the associated answer, when respondents can be persuaded to give permission for such limited recording.
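The paradata capture just described amounts to little more than an event log written as a side effect of the interview software. A minimal sketch, with invented event names and no connection to any actual NASS system, might look like this:

```python
import time

# Minimal sketch of interview paradata capture: each screen action is
# appended to a timestamped event log, from which per-item timings and
# navigation patterns can later be reconstructed. Names are illustrative.
class ParadataLog:
    def __init__(self):
        self.events = []

    def record(self, item_id, action, value=None):
        self.events.append({
            "t": time.time(),
            "item": item_id,
            "action": action,
            "value": value,
        })

    def item_duration(self, item_id):
        """Seconds between the first and last events on an item's screen."""
        times = [e["t"] for e in self.events if e["item"] == item_id]
        return max(times) - min(times) if times else 0.0

log = ParadataLog()
log.record("Q12", "enter")
log.record("Q12", "answer", value=160)
log.record("Q12", "leave")
print(len(log.events))  # -> 3
```

Because the log is a by-product of normal operation, its marginal cost after the initial programming is essentially zero, which is the economic point made above.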
The capture of metadata may also be facilitated by computerization, because it is a simple matter to merge conditional question wording or interviewer information (already entered for a data collection session) with the answers. In addition, the main data and their original state are known with certainty, without the intervention of coding and subsequent entry processes that may introduce additional error.
Although a paper-based system of data collection may be made to yield some of the same information as more fully electronic systems, the necessary data linkages are often more difficult, and such linkages allow the possibilities of new sources of error. In such a system, most acts of creating metadata or paradata are inherently costly and thus obvious candidates for omission in a world of continuing cost control.
Without at least basic metadata and paradata, it becomes difficult to find a well-founded empirical basis for evaluation and improvement of a survey under actual field conditions. Without such information, there is only the informal (but clearly very important) knowledge embedded in the actors in the data collection process. Analysis of systemic information may often be difficult or inconclusive, but it is the best hope for informing analysts about the quality of the data and the data collection process and for plotting future improvements.
Collection and preservation of metadata and paradata in the ARMS program appears minimal and unsystematic.4 A key example is the fact that ARMS is not set up in a way that allows for preserving a record of the original data. Editing and imputation of various sorts occur in ARMS at different levels of data processing, and the information about the outcomes is not systematically stored apart from the data. To gain a sense of
4. Several components of the publicly available ARMS documentation may be considered to provide metadata (http://www.ers.usda.gov/Data/ARMS/GlobalDocumentation.htm). The ARMS data page (http://www.ers.usda.gov/Data/ARMS/) provides a complete variable listing (http://www.ers.usda.gov/data/arms/Variables.htm). Also, the ARMS tailored report query tool has help buttons with definitions (http://www.ers.usda.gov/data/arms/app/Farm.aspx).
the reliability of an observation, it is important that data users be given a clear sense of how much the original data have been manipulated; at a minimum, a strange-looking data value might be more credible if it had an associated data flag indicating that it had been reviewed with the respondent or had been reconciled with other variables in the record. Although users may very often want to benefit from the expertise of people who process a large technical database like ARMS, they should have the opportunity to make their own decisions about how to treat data.
Data users have a need for tracking and understanding the impact of imputations for missing data. The relatively simple conditional mean imputation practices used for much of ARMS will generate data that are not appropriate for sophisticated multivariate analysis; for such work, users would need to perform their own imputations or use techniques for analyzing partially observed data. Within the metadata framework, ERS can provide signals to users regarding what values were reported and what values were imputed.
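The signaling of reported versus imputed values described above requires only that a source flag travel with each value. A hedged sketch, with invented variable names and flag codes (not actual ARMS conventions):

```python
# Each value carries a source flag so users can separate reported entries
# from imputed ones and decide for themselves how to treat the data.
record = {
    "fuel_expense":   {"value": 5200, "source": "reported"},
    "fertilizer_exp": {"value": 3100, "source": "imputed:conditional-mean"},
    "seed_expense":   {"value": 8900, "source": "reported"},
}

def imputed_items(rec):
    """Return the names of items whose values were imputed."""
    return [k for k, v in rec.items() if v["source"].startswith("imputed")]

print(imputed_items(record))  # -> ['fertilizer_exp']
```

With such flags in hand, an analyst doing multivariate work could drop or re-impute the flagged values rather than unknowingly treating conditional-mean imputations as observed data.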
Another critical area in which such knowledge is not recorded in a usable form in ARMS is the history of management of individual survey cases. Systematic collection and organization of such data on attempted contacts with respondents, together with relevant data on interviewer, respondent, and neighborhood characteristics, are particularly important for use in understanding and potentially improving the methods of case administration as well as in understanding nonresponse and detecting nonresponse bias. In light of the relatively high nonresponse rate in ARMS, making such data available should have a high priority.
Highlighting these two issues in no way should be taken to diminish the importance of collecting and organizing other metadata and paradata. In particular, efforts should begin to collect systematic information on interviewers, to document the processes underlying questionnaire design, to document changes in interviewing practices, to note the types of records respondents use, to record any special efforts or incentives used to gain the cooperation of the respondent, and to capture other such factors.
As ARMS moves toward computerization—a step we advocate in this report and that seems inevitable in the long run—it makes sense to build the capabilities for capturing and organizing metadata and paradata as an integral component of ARMS data collection, processing, and products. The need for metadata and paradata makes the transition to digital data collection much more urgent.
Recommendation 5.5: NASS and ERS should develop a program to define metadata and paradata for ARMS so that both can be used to identify measurement errors, facilitate analysis of data, and provide a basis for improvements to ARMS as part of the broader research and development program the panel recommends.