The survey interview context adds another formidable challenge to constructing a quality computerized instrument. The skip sequences through an instrument may be thought of as prescribed flows, or forward motion through the questionnaire. But to truly accommodate the survey experience, a CAPI instrument must also have the capacity for unprescribed flows. If a respondent suddenly remembers income of $5,000 rather than the $1,000 reported earlier in the interview, there must be a facility to backtrack in the instrument, fill in a new answer, and proceed with the interview, either by returning to the previous spot in the instrument or by computing a new path based on the changed answer. Likewise, the instrument must be able to handle breaks at any time: cases in which the respondent opts not to answer some questions or to terminate the interview either temporarily or permanently.
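The distinction between prescribed and unprescribed flows can be made concrete with a small sketch. The Python below is illustrative only and is not drawn from any actual CAPI package: the three items, the income threshold of 2,000, and the function names `compute_path` and `revise_answer` are all hypothetical. It shows one of the two options described above, namely recomputing the path after an earlier answer is changed, and it assumes that answers to items no longer on the path should be discarded.

```python
from typing import Callable, Dict, List, Optional, Tuple

Answers = Dict[str, int]

# Hypothetical three-item instrument. Each routing function takes the answers
# recorded so far and returns the name of the next item, or None at the end.
ROUTING: Dict[str, Callable[[Answers], Optional[str]]] = {
    "income": lambda a: "assistance" if a["income"] < 2000 else "savings",
    "assistance": lambda a: "savings",
    "savings": lambda a: None,
}
FIRST_ITEM = "income"


def compute_path(answers: Answers) -> List[str]:
    """Follow the prescribed flow as far as the recorded answers allow."""
    path: List[str] = []
    item: Optional[str] = FIRST_ITEM
    while item is not None:
        path.append(item)
        if item not in answers:
            break  # the interview is paused here, awaiting an answer
        item = ROUTING[item](answers)
    return path


def revise_answer(answers: Answers, item: str, value: int) -> Tuple[Answers, List[str]]:
    """Backtrack to an earlier item, change its answer, and recompute the path.

    Assumption: answers to items that fall off the new path are dropped.
    """
    updated = dict(answers)
    updated[item] = value
    new_path = compute_path(updated)
    kept = {name: ans for name, ans in updated.items() if name in new_path}
    return kept, new_path


# The respondent first reports $1,000, then corrects it to $5,000.
before = {"income": 1000, "assistance": 1}
path_before = compute_path(before)
after, path_after = revise_answer(before, "income", 5000)
```

Here `path_before` passes through the hypothetical `assistance` item, while `path_after` skips it and the stale `assistance` answer is removed. A real instrument would also have to decide whether to preserve such off-path answers for later review, which this sketch does not attempt.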
CURRENT PRACTICE IN DOCUMENTATION AND TESTING
Two major challenges in conducting a survey through computer-assisted methods are the documentation and testing of computerized survey instruments. Documentation and testing are different in intent but strongly related in key ways. The strongest common link between the two may be that both are often thought of as end-of-the-line processes, things done when an instrument is complete and ready to be fielded (and done to the extent that remaining resources permit). However, documentation and testing are crucial parts of the questionnaire development process and fundamental to the quality and usability of the resulting data.
Documentation
Conversion from paper-and-pencil interviewing to computer-assisted methods effectively did away with a basic form of survey documentation, namely, a paper questionnaire that can be leafed through to see questions exactly as they would be worded when administered to respondents. A “free good” of paper-and-pencil survey development, the paper questionnaire itself is a weak form of documentation. Its usefulness in developing an understanding of the logical flows through sets of questions is limited, and it is not a guide or codebook for the data resulting from the survey. Still, paper has the major advantage of being tangible and therefore approachable to human reviewers in a way that software code to implement a survey is not. There is also a sense in which the scope of surveys (the number of conditional question sets that respondents of a certain type would get routed through) was kept in check by the medium of the paper questionnaire. Lest the document grow too massive and intimidate interviewer and respondent alike, survey instruments developed in the age of paper had a tendency to be somewhat shorter and simpler.
The relative unapproachability of computer code, the enabling of customized paths through a questionnaire, and the sheer magnitude of the extant questionnaires being converted by federal statistical agencies for CAPI implementation all combine to make the documentation of CAPI instruments a major problem. As difficult as it is to make sense of a paper questionnaire containing thousands of items, interpreting a computerized version to try to determine what exact questions are being asked and in what order is vastly harder. Moreover, for federal surveys, some manner of documentation that permits a gauge (of whatever accuracy) of respondent burden is a legal requirement, given the U.S. Office of Management and Budget’s (OMB) statutory role in approving all government data collections.
Before a computerized survey instrument can be coded, specifications must be constructed so that programmers know what it is they are supposed to implement. These specifications, if kept and maintained as a living document over the course of the survey design process, could be a useful piece of documentation. In practice, as related at the workshop, current survey specifications are often in a perpetual state of development and difficult to keep current; for a federal survey, in particular, major changes in a survey may be called for on the fly in legislation or other agency interactions, further complicating the ability to maintain specifications. The Census Bureau reports a recent move toward using database management systems to develop and track specifications, although electronic specifications have proven to be as difficult to keep in synchronization as scattered paper specifications.
Documentation that outlines the inner logic of a questionnaire could also be a critical tool in the design process of a questionnaire. End users and OMB could benefit from documentation that suggests how specific survey items map to specific data needs and output data locations. Likewise, it could help survey designers map survey specifications to their implementations in the code and allow them to detect coding errors that may make it impossible for respondents to reach certain parts of a questionnaire. A subtle form of documentation that must also be considered in the computer-assisted interviewing arena is the archiving of computerized instruments, not only for potential reuse but also as a record of how specific surveys are conducted in an ever-changing computing environment.
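One of the checks mentioned above, detecting parts of a questionnaire that no respondent can reach, can be sketched as a graph search. The Python below is a minimal illustration under stated assumptions: the instrument's routing has already been extracted into a mapping from each item to the items that may follow it, and the item names `Q1` through `Q5` and the function `unreachable_items` are hypothetical. It checks only structural reachability; it does not evaluate the routing conditions themselves, so an item guarded by a condition that can never be true would go undetected.

```python
from collections import deque
from typing import Dict, List

# Hypothetical routing map: each item lists the items that may follow it.
FLOW: Dict[str, List[str]] = {
    "Q1": ["Q2", "Q3"],
    "Q2": ["Q4"],
    "Q3": ["Q4"],
    "Q4": [],
    "Q5": ["Q4"],  # nothing routes to Q5, e.g. because of a coding error
}


def unreachable_items(flow: Dict[str, List[str]], start: str) -> List[str]:
    """Return items that cannot be reached from `start` by following the flow."""
    # Collect every item, including any that appear only as a target.
    all_items = set(flow)
    for targets in flow.values():
        all_items.update(targets)

    # Breadth-first search from the first item of the questionnaire.
    seen = {start}
    queue = deque([start])
    while queue:
        item = queue.popleft()
        for nxt in flow.get(item, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(all_items - seen)


orphans = unreachable_items(FLOW, "Q1")
```

In this example `orphans` contains only `Q5`. Comparing such a list against the written specification is one way a designer might spot a mismatch between specification and code.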
Existing Systems: CASES/IDOC and Blaise/TADEQ
Generating documentation automatically from the electronic questionnaire itself has long been a prized goal, and the computer-assisted survey community has made significant inroads toward it. With sponsorship from the Census Bureau, the Computer Survey Methodology group at the University of California at Berkeley developed companion software for CASES, the DOS-based instrument authoring language that was a major force in early CAPI implementations but that has declined in use due to the lack of a Windows version. This companion software processes an instrument to produce an instrument document (IDOC), an automatically generated set of linked HTML pages that allow a user to browse through the logic of a questionnaire, identifying questions or decision points that flow directly into or out of a particular item in the questionnaire.
The emerging dominant survey authoring language, Blaise, also has a companion software suite for automated documentation under development. Sponsored by a consortium of European statistical agencies, the TADEQ Project has developed prototype software that can process a questionnaire and produce a flowchart-style overview of a questionnaire’s logic, as well as some descriptive statistics. The eventual hope is for TADEQ to be independent of the software platform used to write the questionnaire—if an electronic questionnaire could be ported into the XML markup language, it could be processed by TADEQ—but initial development appears to have focused on its coordination with Blaise.
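To give a sense of what a flowchart-style overview with descriptive statistics might involve, the following Python sketch renders a routing map as Graphviz DOT text and computes a few summary counts. It is not TADEQ's method or output format, which the text above does not specify; the routing map, the function names `to_dot`, `count_paths`, and `summary`, and the choice of statistics are all assumptions made for illustration. The path count further assumes that the flow contains no loops.

```python
from functools import lru_cache
from typing import Dict, List

# Hypothetical routing map: each item lists the items that may follow it.
FLOW: Dict[str, List[str]] = {
    "Q1": ["Q2", "Q3"],
    "Q2": ["Q4"],
    "Q3": ["Q4"],
    "Q4": [],
}


def to_dot(flow: Dict[str, List[str]], name: str = "questionnaire") -> str:
    """Render the flow as Graphviz DOT text, one edge per possible transition."""
    lines = [f"digraph {name} {{"]
    for item, targets in flow.items():
        for target in targets:
            lines.append(f'    "{item}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


def count_paths(flow: Dict[str, List[str]], start: str) -> int:
    """Count distinct routes from `start` to any terminal item (acyclic flows only)."""

    @lru_cache(maxsize=None)
    def paths_from(item: str) -> int:
        targets = flow.get(item, [])
        if not targets:
            return 1  # a terminal item ends exactly one route
        return sum(paths_from(t) for t in targets)

    return paths_from(start)


def summary(flow: Dict[str, List[str]], start: str) -> Dict[str, int]:
    """A few descriptive statistics about the instrument's structure."""
    return {
        "items": len(flow),
        "branch_points": sum(1 for t in flow.values() if len(t) > 1),
        "distinct_paths": count_paths(flow, start),
    }


dot_text = to_dot(FLOW)
stats = summary(FLOW, "Q1")
```

The DOT text can be passed to a Graphviz renderer to draw the flowchart. The number of distinct paths grows quickly with the number of branch points, which gives a rough measure of how complicated an instrument is.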
These two automated documentation initiatives are good first steps in addressing the global documentation problem in electronic surveys. Both suffer from some inherent practical limitations: IDOC from its applicability only to CASES-coded instruments and from the lack of an overall map to what can be a massive number of linked HTML pages, TADEQ from its perceived difficulty in processing very large and complicated instruments. More fundamentally, both systems are essentially post-processors of coded instruments; hence, the extent to which they can contribute to up-front guidance on questionnaire development, as a diagnostic tool during survey design, is not clear. Both also suffer from the reality that automated documentation can go only so far in conveying meaning and context to a human reader; it can suggest the functional flow from item to item but, on its own, it may not explain why those items are related to each other. Contextual tags and explanatory text in survey questionnaires require human input during coding (which often is not done, given time and resource limitations).