3
The Vaccine Safety Datalink Data Sharing Program

DESIGN AND IMPLEMENTATION TO DATE OF THE VACCINE SAFETY DATALINK DATA SHARING PROGRAM

Development of the VSD Data Sharing Program

Until August 2002, Vaccine Safety Datalink (VSD) research was limited to researchers from the National Immunization Program (NIP) and the managed care organizations (MCOs) participating in the VSD. The team of VSD researchers set research priorities, determined which studies to undertake, and planned how studies would be monitored.1 External researchers could in principle pursue a collaborative research project with any of the VSD researchers at the NIP or the MCOs, but no process had been established to allow use of VSD data outside such a collaborative relationship, and there appear to have been no proposals for broader participation.

Development of the VSD data sharing program began in August 2000, and the program was formally established on August 30, 2002 (CDC, 2004d). The program was developed in an ad hoc way with input from the Department of Health and Human Services, Congress, and the MCOs participating in the VSD because of heightened interest in public access to VSD data (Wharton, 2004). No additional funding was provided to the Centers for Disease Control and Prevention (CDC) to develop such a program. It resembles no other existing data sharing program known to the committee.

1 Personal communication, F. DeStefano, NIP, February 10, 2005.







Vaccine Safety Research, Data Access, and Public Trust

Proposals Submitted to the VSD Data Sharing Program

As of October 2004, the NIP had received proposals requesting the use of VSD data from a very small number of researchers. In September 2002, the NIP received proposals from one group of researchers for 13 new vaccine safety studies and 11 reanalyses (CDC, 2004d). Those proposals were revised and slightly modified. The first group of researchers visited the National Center for Health Statistics (NCHS) Research Data Center (RDC) in October 2003 and January 2004 (Geier and Geier, 2004) to analyze the VSD data for which their access was approved.

In August 2003, the NIP received a proposal from another researcher for a reanalysis of a published VSD study of the association between measles-mumps-rubella and varicella vaccines and type 1 diabetes (CDC, 2004f). The researcher’s proposal was complete, but at the time of this writing the researcher had not pursued the next steps in the process.

Challenges in Implementing the VSD Data Sharing Program

CDC experienced several challenges in implementing the VSD data sharing program. At the time of the announcement of the data sharing program, CDC did not have a formal data sharing policy to provide a standard or guide for the VSD program (Wharton, 2004). Congressional interest in the status of the VSD data sharing program brought increased scrutiny and time pressures to the development process. Analytic data files from some previously published VSD studies had not been archived in a standard manner, so it was difficult to respond expeditiously to requests to reanalyze published VSD studies. The scope of the data sharing program also had to satisfy the dual objectives of providing access to VSD data and ensuring the privacy of the personal medical information in the VSD (Wharton, 2004). NIP resources and personnel were challenged by those events and competing demands and by the adversarial environment that soon emerged.

Summary of VSD Data Sharing Program Guidelines

Four successive versions of the VSD data sharing program guidelines for independent external researchers have been released publicly. Each version of the guidelines was intended to provide greater clarification about program requirements and expectations than the version before it.

In August 2002, the first version of Guidelines for Data Sharing Proposals from External Researchers: Vaccine Safety Datalink (VSD) Project was released (CDC, 2002). The guidelines outlined the process for submitting proposals to the NIP, the suggested proposal elements, and the process for requesting Institutional Review Board (IRB) approval from each of the MCOs whose data would be examined.

In October 2003, CDC released the second version of the guidelines, Guidelines for Data Sharing Program for External Researchers: Access to CDC’s Vaccine Safety Datalink Data (CDC, 2003a). The second version provided additional details about the process that independent external researchers were to use to request access to VSD data, clarified the difference in access between two categories of VSD data (new vaccine safety studies and reanalyses of published VSD studies), described the provisions governing use of the RDC at NCHS to access VSD data, and laid out requirements for the publication of research based on VSD data.

The third version of the VSD data sharing program guidelines was provided by NCHS after the programmatic responsibility for the data sharing program was transferred from the NIP to NCHS in March 2004 (CDC, 2004d). NCHS decided not to create separate guidelines for access to VSD data but rather used its general Procedures for Use of the RDC (CDC, 2004b) document and the accompanying RDC General Description document (CDC, 2004c) to serve as the interim guidelines for the VSD data sharing program until those documents could be updated.

On November 18, 2004, NCHS published a Federal Register notice and request for comments on Procedures and Costs for Use of the Research Data Center (CDC, 2004a). Although this document outlines procedures that apply to all datasets available through the RDC, it also constitutes the fourth version of the VSD data sharing program guidelines because it includes project-specific requirements for VSD data in an appendix to the main document (CDC, 2004a).
The Federal Register notice includes the information that was contained in the two documents that constituted the third version of the VSD data sharing program guidelines and additional information on the RDC and VSD-specific requirements. NCHS requested public comment on this document. The original deadline for public comment was December 9, 2004 (CDC, 2004a); this deadline was extended to March 1, 2005 (CDC, 2004g). (The Federal Register notices can be found in Appendix G.) The new RDC procedures provide additional explanation of the expectations for guest researchers at the RDC and costs for use of the RDC (CDC, 2004a). The new VSD-specific guidelines have additional requirements for information that is to be included in proposals, compared with earlier versions of the guidelines for the VSD data sharing program (CDC, 2004a).

THE VACCINE SAFETY DATALINK DATA SHARING PROGRAM’S ABILITY TO SHARE DATA

The VSD data sharing program does not meet the traditional definition of data sharing, because of the limitations of the data available through the program and the differing levels of access to VSD data that depend on the type of researcher requesting access. Because of the contract provisions that govern the VSD data sharing program, independent external researchers are unable to gain access to data from the year 2001 or later for new studies (such as investigation of a new hypothesis or use of novel methods to investigate a previously studied hypothesis), whereas researchers affiliated with the NIP or with one of the VSD MCOs have access to these data for all types of studies. Even for new studies conducted by independent external researchers with data from before 2001, the available data are generally less than ideal in that only data from the annual VSD extracts are provided to these researchers; researchers affiliated with the NIP or the MCOs can use chart review or other means to improve the quality of the data used for a particular study.

Data sharing through the VSD data sharing program is also impeded by the requirement, for all reanalyses, to obtain IRB approval from all MCOs whose data are included in the final dataset. If one MCO’s IRB does not provide approval, a reanalysis of the full set of study data cannot be done, because the researcher would be analyzing only part of the data used in the original study.

Confidentiality concerns alone may not sufficiently justify those limitations. Any independent external researcher seeking use of VSD data is required to access the data at the NCHS RDC. The RDC is designed to make breaches of confidentiality nearly impossible. The data access restrictions in place at the RDC are sound and extensive, and they reduce the possibility of breaches of confidentiality regardless of the extent or type of data being accessed or the intentions of external researchers.

In preparing its advice to the NIP and NCHS, the committee recognizes the current limitations of the VSD data sharing program.
If the NIP and NCHS want to allow access to VSD data in the true spirit of a data sharing program, the committee’s advice and recommendations will help the program to meet scientific standards of data sharing. If the current limitations of the program are not overcome, the NIP should characterize the program as a limited data access program rather than a data sharing program. The committee finds that overcoming the limitations may require renegotiation of the VSD contract. A true VSD data sharing program would need to include the following three elements: access to the core VSD data for exploratory analyses; access to studies that involve chart review, and so on, to consider alternative explanations; and new collaborative studies with the NIP and the MCOs to pursue new hypotheses. If the intention is to allow true data sharing, researchers should be allowed use of all available years of data for new studies and not be limited to final datasets for reanalyses.

The VSD is a public resource that is designed to inform important public health policy decisions. Because the VSD has the potential to influence policy, the public demands and deserves access to the data used to influence those decisions and transparency in the processes that permit or restrict access. If the VSD indeed is intended to be used as a foundation of policy decisions, there is a public need to share data fairly and to be as transparent as possible while protecting the confidentiality of individually identifiable information in the VSD.

The committee uses the term VSD data sharing program throughout this report for the sake of consistency and ease of reference. Despite the limitations of the sharing function of the VSD data sharing program, the term is now well established.

CURRENT STANDARDS OF PRACTICE OF SIMILAR DATA SHARING PROGRAMS

Benefits and Costs of Sharing Data

Sharing of VSD data or any other type of data has both benefits and costs. Some benefits and costs may be unique to the sharing of particular datasets, but the often-cited benefits of data sharing include (Fienberg, 1994; NRC and the Committee on National Statistics, 1985):

Reinforcement of open scientific inquiry;
Verification, refutation, or refinement of original results;
Promotion of new research through existing data;
Encouragement of the appropriate use of empirical data in policy formulation and evaluation;
Improvement of methods for data collection and measurement;
Development of theoretical knowledge and knowledge of analytic techniques;
Encouragement of multiple perspectives;
Provision of resources for training in research;
Protection against faulty data;
Greater application of scientific research in decision-making;
Reduction of the expense of duplicative data collection and the concomitant burden on human subjects; and
Respect for the desire of respondents to contribute to societal knowledge.

In the case of VSD data, the committee finds that, especially in the context of government-funded research, increased data sharing also promotes greater transparency in the derivation of research results, which enhances public trust in the use of the VSD to make accurate assessments of vaccine safety.

Data sharing also has costs. Those often cited are related to (Fienberg, 1994; NRC and the Committee on National Statistics, 1985):

Elimination of technical obstacles to sharing data;
Need for extensive technical and substantive documentation of datasets;
Monetary and time costs to original researcher for preparing data for sharing;
Monetary and time costs to subsequent analysts for developing a base of knowledge about the data;
Response to errors by others;
Response to unwarranted criticisms based on poor analyses by others;
Loss of original researchers’ exclusive right to future discoveries; and
Breaches of confidentiality.

The constraints that limit access to VSD data and can be considered costs of the VSD data sharing program include protection of proprietary information, protection of detailed medical information, and protection of intellectual property rights of researchers.

The benefits of, costs of, and risks posed by data sharing are important in examining the VSD data sharing program so that any proposed changes in the program can be understood properly. Expanding or limiting access to VSD data will lead to nontrivial shifts in the balance of costs and benefits. Some of the committee’s recommendations promote expanding access to VSD data, and some create constraints on access to VSD data. The NIP and NCHS will have to consider the costs and benefits of the different recommendations to determine which ones to implement.

Approaches to Data Access

Data can be shared in a number of ways: public-use data files (with data elements that are limited or altered to prevent identification of individuals), restricted data use agreements and licensing agreements, and access to restricted data at a data enclave.
Many data sources allow access to be granted through multiple means, depending on the sensitivity of the data needed for a particular study.

Public-use data files are available to anyone who would like to use the data. Data providers normally provide a data dictionary and background information on the design features of the data source. Carefully constructed public-use data files present the lowest risk of disclosure of confidential information.

Restricted data use agreements and licensing agreements may allow researchers access to data that are somewhat broader than what is available in public-use data files. To help to ensure that confidential data remain protected, the data owners or data stewards require the researcher to sign a restricted data use or licensing agreement. The agreement specifies the penalties for violating provisions of the agreement.

Access to restricted data at a data enclave allows researchers to use data that are very sensitive or might allow easy identification of individuals whose information is included in the database. To access data at a data enclave, researchers submit a proposal outlining their proposed study and describing why confidential or sensitive data not available in other ways are needed. Researchers conduct their analyses at the data enclave, and all output must be reviewed for the risk of disclosure before it can leave the data enclave.

Review of Similar Data Sharing Programs

To assess how the VSD data sharing program compares with current standards of practice for data sharing in the scientific community, the committee reviewed extensive information about different data sharing policies, different types of data sharing activities, and legal and regulatory provisions governing confidentiality of data. The committee found that the provisions that are in place for data sharing activities reflect increasing concerns about confidentiality and thus increasing restrictions on access to data: public-use data files, restricted data use agreements, and access to confidential data files in a controlled environment (through a data enclave). The committee’s recommendations on the specific provisions of the VSD data sharing program reflect its review of current standards of practice for data sharing.

Because the VSD is a unique database, with unique conditions governing its creation and use, no single data sharing program is a perfect model for comparison with it, but the committee identified four data-enclave approaches to data sharing that have operations similar to the VSD data sharing program and have similar concerns, including the need for confidentiality. The committee reviewed those four data sharing programs to assess current standards of practice of programs similar to that of the VSD, although they do not contain the same type of data as the VSD. All those programs allow access to restricted data through a data-enclave approach and have written rules, limitations, application processes, and review processes. Subsets of data, often called public-use data files, are also available from some of those data sources with virtually no restrictions. The data enclaves are for follow-up and access to individually identifiable data in a form that protects confidentiality. Those programs are the Medical Expenditure Panel Survey (MEPS) of the Agency for Healthcare Research and Quality (AHRQ), the program of the Census Bureau RDCs, the Health and Retirement Study (HRS) at the University of Michigan, and the California Health Interview Survey (CHIS). The committee finds that the unique circumstances surrounding the creation, maintenance, and use of the VSD will require some specific adaptations of standard data sharing procedures and practices.

Description of Data Sharing Provisions for Different Data Enclaves

Table 1 (page 42) summarizes the specific provisions that govern access to restricted data at the data enclaves the committee reviewed in depth.

Medical Expenditure Panel Survey

The MEPS collects detailed data on specific health services used in the United States and allows linkage of different data files (AHRQ, 2004c). Some MEPS data files are available as public-use files, but others can be accessed only in the secure environment of the Center for Financing, Access, and Cost Trends Data Center (CFACT-DC). There is an application process to obtain approval for use of the center and a fee for each use (AHRQ, 2004b). Review of the application includes an evaluation of the feasibility of the researchers’ proposal, a review of whether the analysis can be done without breaching confidentiality, a determination of the compatibility of the project with the AHRQ mission, and an assessment of the availability of resources within the CFACT-DC for whatever work may be needed to respond to the request (AHRQ, 2004b). Staff resources to provide assistance at the CFACT-DC are limited, so extensive programming support must be contracted ahead of time for a fee.
The manager at the CFACT-DC coordinates the review of each proposal. Once a proposal is approved, the researchers can access the files by going to the CFACT-DC at AHRQ (AHRQ, 2004a). That usually occurs about 4-6 weeks after approval. Researchers can access only the variables that were identified and approved in their proposal (AHRQ, 2004d). All materials must be reviewed by AHRQ staff before they can be removed from the data center.

Census Bureau Research Data Centers

The Census Bureau RDCs allow researchers to carry out research with confidential census records. The external research program is supported by the Center for Economic Studies (CES), and its data records are available to “sophisticated” users in a controlled and secure environment (Census Bureau, 2004b). Researchers must register as potential users with CES through the Census Bureau Web site, and their proposal can then be submitted through the same Web site (Census Bureau, 2004a). Each proposal must identify a specific dataset to be analyzed and show that the research can be conducted successfully with the proposed methods and the data available. The proposal must show the need for and importance of using confidential data, and researchers who will have access to confidential data must obtain “special sworn status” from the Census Bureau. In this case, special sworn status requires passing a security clearance and signing a statement agreeing to preserve the confidentiality of the data (Census Bureau, 2004a). Researchers can use confidential data only for the purpose for which the data are supplied or pursuant to the objectives of Title 13, which authorizes the Census Bureau to collect such data, and all analyses must be done at the RDC.

All proposals need to gain approval from both the RDC and the Census Bureau and must demonstrate a benefit to the bureau’s programs (Census Bureau, 2004b). The RDC administrator reviews preliminary proposals and may suggest ways to improve and refine them. The administrator must approve a preliminary proposal before researchers can submit the final proposal. Researchers should expect a minimum of a 6-month lapse between submitting their final proposal and the commencement of research with confidential data (Census Bureau, 2004a). All data to be taken out of the RDC must go through a disclosure review; no confidential data can be taken out of the center. Researchers must undergo a security check before leaving the RDC (Census Bureau, 2004b).

Health and Retirement Study

The HRS at the University of Michigan includes information that is made available to external researchers only under strict conditions (HRS, 2004a). The datasets from the HRS are cleaned and processed to make use easier and are supplemented with information files provided by users (HRS, 2004a). The HRS Web site lists what data will be made available for study and analysis (HRS, 2004c).

To use HRS data, researchers must submit a research proposal package. The researchers must identify a dataset of interest and state why the unrestricted data will not be adequate for the research purpose. They must also submit a restricted data protection plan to the HRS. Reviewers consider the risk of disclosure of restricted information on the basis of the users’ description of expected analysis, the scientific and technical feasibility of the project, the availability of data files being requested, and whether the proposed project is in accordance with

TABLE 1 Comparison of Data Sharing Programs That Use Data Enclaves

GENERAL INFORMATION ON THE DATA SHARING PROGRAMS

Type of Data

Medical Expenditure Panel Survey (MEPS): MEPS is the third (and most recent) in a series of national probability surveys conducted by AHRQ on the financing and use of medical care in the United States. MEPS collects data on the specific health services that Americans use, how frequently they use them, the costs of the services, how they are paid, and the cost, scope, and breadth of private health insurance held by and available to the U.S. population.

Census Research Data Centers (RDCs): Census data include microdata and data that cannot be released publicly, because they contain detailed information on geographic location and other characteristics about the firms or households that could be used to determine their identities.

Health and Retirement Study (HRS): The University of Michigan HRS surveys more than 22,000 Americans over the age of 50 every 2 years. The survey collects data on respondents’ physical and mental health, insurance coverage, financial status, family support systems, labor-market status, and retirement planning. Registered users can download HRS public data products free. Restricted-release files contain sensitive information that can be made available only under specified conditions.

California Health Interview Survey (CHIS): CHIS is a telephone survey of adults, adolescents, and children from all parts of the state of California. The survey is conducted every 2 years. Some of the data collected are prepared for public release as free public-use files. The files are designed to minimize the risk of respondent identification yet preserve the broadest range of descriptive demographic data. Restricted-use files at CHIS available at the DAC contain detailed geographic identifiers and full demographic descriptions for the survey respondents from the 2001 survey. The files also include responses to sensitive questions that are excluded from the public-use data files.

Vaccine Safety Datalink (VSD) (based on 2004 Federal Register Notice): The VSD is a large linked database that was developed in 1991 by the collaborative efforts of CDC and several private MCOs. The VSD currently includes data from administrative records for more than 7 million members of eight MCOs. In the VSD, vaccination records, patient characteristics, and health outcomes are linked, allowing the VSD to serve as a unique and potentially powerful resource for the continuing evaluation of vaccine safety.

ELEMENTS OF STUDY PROPOSAL

Identification of Specific Variables to Be Studied

Medical Expenditure Panel Survey (MEPS):
▶ Researchers must list the data files to which they would like access.
▶ Researchers will have access only to the variables identified in their approved proposals.

Census Research Data Centers (RDCs):
▶ Researchers must identify the specific dataset to be analyzed.

Health and Retirement Study (HRS):
▶ A specific dataset must be chosen from the list of restricted-use datasets.
▶ Researchers must state why the unrestricted data would not be adequate for their research purpose.

California Health Interview Survey (CHIS):
▶ Researchers must request variables using the DAC variable lists.

Vaccine Safety Datalink (VSD):
▶ Researchers must provide a list detailing data requested: data system, files, years, and variables.
▶ Only variables needed to conduct the proposed analyses will be included in the analytic file.

TABLE 2 VSD Research Options for Independent External Researchers

Audit
  Pre-1/01/01 data: Possible; no collaboration required
  Post-1/01/01 data: Possible; no collaboration required

Broader reanalysis
  Pre-1/01/01 data: Possible; no collaboration required
  Post-1/01/01 data: Possible; no collaboration required

Corroboration study
  Pre-1/01/01 data: Possible; collaboration recommended to improve quality
  Post-1/01/01 data: Not possible without collaboration

Investigation of a new hypothesis
  Pre-1/01/01 data: Possible; collaboration recommended to improve quality
  Post-1/01/01 data: Not possible without collaboration

types of studies that can be done with VSD data to conceptualize the full range of studies that independent external researchers may wish to conduct with the data: an audit, a broader reanalysis, a corroboration study, and an investigation of a new hypothesis.

It also will be necessary for independent external researchers to be explicit in their proposals about the purpose of their proposed analysis: if an audit or application of an alternative statistical method is intended, a final dataset may suffice; for a broader reanalysis, access to an intermediate or extended dataset, rather than the final dataset, will almost surely be necessary; and if a corroboration study or investigation of a new hypothesis is proposed, independent external researchers will need access to source data, which may require the assistance of a VSD collaborator.

When collaboration is sought, the external researchers should contact a facilitator for collaboration at each MCO (see Recommendation 3.3) to pursue a collaborative research relationship with a researcher at the MCO or contact the lead staff person for the VSD data sharing program to pursue a collaborative research relationship with a NIP-affiliated researcher. That will aid access to the data, reduce the likelihood of concerns about confidentiality, and help to facilitate direct, knowledgeable reanalyses of VSD data.

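The options in Table 2 and the dataset levels named in the text can be read as a simple lookup. The following is only an illustrative sketch of that reading, not part of the VSD guidelines; the names `StudyType`, `RESEARCH_OPTIONS`, `DATASET_NEEDED`, and `research_option` are hypothetical and were chosen for this example.

```python
from enum import Enum


class StudyType(Enum):
    """The four study types named in the text (identifiers are illustrative)."""
    AUDIT = "audit"
    BROADER_REANALYSIS = "broader reanalysis"
    CORROBORATION = "corroboration study"
    NEW_HYPOTHESIS = "investigation of a new hypothesis"


NO_COLLAB = "Possible; no collaboration required"
COLLAB_RECOMMENDED = "Possible; collaboration recommended to improve quality"
COLLAB_REQUIRED = "Not possible without collaboration"

# Transcription of Table 2: study type -> (pre-1/01/01 data, post-1/01/01 data).
RESEARCH_OPTIONS = {
    StudyType.AUDIT: (NO_COLLAB, NO_COLLAB),
    StudyType.BROADER_REANALYSIS: (NO_COLLAB, NO_COLLAB),
    StudyType.CORROBORATION: (COLLAB_RECOMMENDED, COLLAB_REQUIRED),
    StudyType.NEW_HYPOTHESIS: (COLLAB_RECOMMENDED, COLLAB_REQUIRED),
}

# Dataset level the text says each purpose is likely to need.
DATASET_NEEDED = {
    StudyType.AUDIT: "final dataset",
    StudyType.BROADER_REANALYSIS: "intermediate or extended dataset",
    StudyType.CORROBORATION: "source data",
    StudyType.NEW_HYPOTHESIS: "source data",
}


def research_option(study: StudyType, uses_post_cutoff_data: bool) -> str:
    """Return the Table 2 entry for a study type and data period."""
    pre, post = RESEARCH_OPTIONS[study]
    return post if uses_post_cutoff_data else pre


if __name__ == "__main__":
    for study in StudyType:
        print(study.value, "|", research_option(study, True), "|", DATASET_NEEDED[study])
```

Keeping the table as data rather than as branching logic means a change in the contract provisions would require editing only the dictionary entries.
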
SPECIFIC COMPONENTS OF THE VACCINE SAFETY DATALINK DATA SHARING PROGRAM GUIDELINES

The committee considered modifications of the VSD data sharing program guidelines needed to facilitate use of VSD data by external researchers, to ensure appropriate utilization of the data, and to protect the confidentiality of the data. The committee has used the framework of the process that independent external researchers must follow to use VSD data through the data sharing program in formulating its recommendations on specific aspects of the guidelines.

Review of Proposals

Required Proposal Elements

All four versions of the VSD data sharing program guidelines contain information about the elements that are required or suggested in proposals for accessing VSD data (CDC, 2002, 2003a, 2004a,b). The latest version of the guidelines (CDC, 2004a) provides much more detail about the required proposal elements than previous versions. The committee finds that all the information currently required in proposals is reasonable and necessary. The committee encourages the NCHS to maintain the list of required proposal elements in future revisions of the guidelines and to consider further specifying the required information for “proposed analytic strategies” (CDC, 2004a).

Evaluation Criteria

The criteria that will be used to evaluate VSD data sharing proposals are not clear. In the August 2002, October 2003, and November 2004 (CDC, 2002, 2003a, 2004a) versions of the guidelines, no specific evaluation criteria are provided for VSD proposals. The August 2002 guidelines do not mention how proposals will be evaluated (CDC, 2002). The October 2003 guidelines simply state that the NIP will determine whether external researchers’ proposals are complete and whether the requested variables are available (CDC, 2003a). In March 2004, the programmatic responsibility for the VSD data sharing program was transferred from the NIP to NCHS (CDC, 2004d).
The August 2004 version of NCHS’s procedures for use of its RDC states that four criteria will be used to evaluate proposals: scientific and technical feasibility of the project, availability of resources at the RDC, risk of disclosure of restricted information, and whether the proposed project is in accordance with the NCHS mission (CDC, 2004b). In the November 2004 guidelines, NCHS states that those criteria will be used for proposals that request use of NCHS data. For VSD proposals, “RDC staff will notify the external researcher whether his/her proposal is complete and whether the requested variables are available” (CDC, 2004a); no criteria for evaluation of VSD proposals are provided. The NIP and NCHS should develop, with public input, the criteria that will be used to evaluate proposals.

Recommendation 3.6: The committee recommends that there be specific evaluation criteria for VSD proposals and that interested persons have an opportunity to comment on the draft evaluation criteria before they are finalized; the evaluation criteria should be identified clearly in the VSD data sharing guidelines.

Technical Feasibility of Proposals

To ensure appropriate utilization of VSD data, the committee agrees that it is reasonable and appropriate to evaluate the technical feasibility of a proposed study. Determining whether a study can be carried out successfully with the VSD data that are available to external researchers is important for ensuring that the resources of the NIP, NCHS, the MCOs, and the researchers are not spent on studies that have no possibility of answering a proposed question. The committee emphasizes, however, that technical feasibility should be determined on the basis of stated objective criteria. The criteria that should define technical feasibility include these:

The requested data are available in the database.
Enough individuals are represented in the database with the exposures and outcomes of interest to study the proposed hypothesis.
The proposed statistical tests are possible with the available data.

The criteria are not meant to exclude novel hypotheses or novel methods. If a proposed VSD study is technically feasible with the available VSD data, even if the hypotheses or methods are considered atypical, access to the data should be approved. The technical feasibility of a proposal should be determined by an independent review committee rather than by VSD program staff.
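The three feasibility criteria listed above lend themselves to a simple checklist. The following Python sketch is illustrative only: the `Proposal` fields, the `feasibility_findings` function, and the `min_subjects` threshold are hypothetical names introduced here, and the committee does not specify a minimum number of subjects or any software for the review.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    """Hypothetical summary of a proposal, for illustration only."""
    requested_variables: set[str]   # variables the researchers ask for
    eligible_subjects: int          # people with the exposure and outcome of interest
    tests_supported: bool           # reviewers' judgment that the proposed tests can be run


def feasibility_findings(proposal, available_variables, min_subjects):
    """Return the unmet criteria; an empty list means technically feasible.

    min_subjects is an assumed, study-specific threshold chosen by reviewers.
    """
    findings = []
    # Criterion 1: the requested data are available in the database.
    missing = sorted(proposal.requested_variables - set(available_variables))
    if missing:
        findings.append("requested data not available: " + ", ".join(missing))
    # Criterion 2: enough individuals with the exposures and outcomes of interest.
    if proposal.eligible_subjects < min_subjects:
        findings.append(
            f"only {proposal.eligible_subjects} eligible subjects; "
            f"at least {min_subjects} needed"
        )
    # Criterion 3: the proposed statistical tests are possible with the data.
    if not proposal.tests_supported:
        findings.append(
            "proposed statistical tests are not possible with the available data"
        )
    return findings
```

Returning the list of unmet criteria, rather than a bare yes or no, matches the committee's point that weaknesses found during review should be described to the applicants.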
Recommendation 3.7: The committee recommends that the technical feasibility of a proposed VSD study be the primary evaluation criterion in the review of proposals submitted to the VSD data sharing program.

If study of a hypothesis is determined not to be technically feasible with the available VSD data, the committee believes that it is reasonable and appropriate for an independent review committee to deny the proposal or return it for revision. Weaknesses found during the review process should be brought to the attention of the researchers. When a proposal is denied or returned for revision, the technical feasibility determinations or other determinations that prompted the decision should be clearly and adequately described to the applicants. However, it should remain the responsibility of the researchers, not the NIP or NCHS, to ensure that the proposal is revised to meet the requirements.

Recommended Competencies

It is reasonable to expect that external researchers who wish to use VSD data have specific competencies, such as the ability to use SAS or an equivalent statistical analysis package, experience in using claims data, adequate knowledge of epidemiologic methods, and the ability to select and interpret statistical tests. The committee believes that the lack of such competencies should not in itself be a reason to deny or require revisions to a VSD proposal. If the competencies that will be helpful in analyzing VSD data appropriately are delineated, it may help external researchers (who may include consumers interested in conducting research) to gain an understanding of competencies that they may want to develop or acquire through additional consultations or collaborations in a team approach. That may save much time, effort, and frustration during the limited time that researchers have for access to the VSD data at the NCHS RDC. A list of recommended competencies should be used to assist external researchers in preparing to use VSD data at the RDC and should not be used to discourage external researchers from submitting research proposals for the VSD data sharing program.

Recommendation 3.8: To assist independent external researchers who want to use VSD data through the data sharing program, the committee recommends that the NIP and NCHS add to the VSD data sharing program guidelines a list of recommended competencies for VSD data analysis.
Technical Assistance

Not all external researchers may want to pursue a collaborative research project with a NIP-affiliated or MCO-affiliated researcher who previously has analyzed VSD data. However, external researchers who want to conduct a VSD study independently should not expect to receive extensive technical assistance (such as advice and guidance on appropriate statistical tests, on confounders that should be considered, or on statistical analysis programs) from the NIP or NCHS in developing their proposal or using the data at the NCHS RDC. NIP and NCHS employees already have major tasks in developing and using the VSD data sharing program and should not be diverted from other required duties by requests from external researchers for extensive technical assistance.

Submission of Proposals to Institutional Review Boards at the Managed Care Organizations

The VSD could not exist without the voluntary participation of the MCOs whose members’ data constitute it. By law, the MCOs are responsible for ensuring the confidentiality of their members’ health information. Confidentiality protections must not be jeopardized; a single breach of confidentiality, no matter how minor, could undermine the contractual arrangements between the MCOs and the NIP and lead to the termination of cooperation and the loss of a unique resource of potentially great national value. Protecting the confidentiality of the information requires that procedures for use of the VSD be clearly stated and explained in the VSD data sharing guidelines.

Institutional Review Board Application Process for VSD Proposals

The VSD data sharing program guidelines require that independent external researchers receive approval from the IRB at each MCO whose data will be accessed. That requirement can mean that researchers must submit applications to up to nine IRBs (one of the MCO sites requires application to two IRBs) (CDC, 2004d). Each IRB has its own application formats, rules, procedures, and timelines for reviewing VSD proposals. Approvals for data access are for 1 year at a time; because IRBs work on different schedules, the first approval may expire before the last is granted. Although previous users of the VSD data sharing program believed that the process was too burdensome (Geier and Geier, 2004), review by each participating institution is a standard element of multisite studies. IRB review processes generally take months rather than weeks in part because of the frequent need for repeated revision and clarification of proposals, and some IRBs charge fees (McNay et al., 2002).
The MCO IRB approval process took about 6 months for the only group of researchers who accessed VSD data through the data sharing program (Geier and Geier, 2004), but the process included multiple revisions and clarifications; the researchers also were charged $1,500 by one of the IRBs involved in the program. IRB review is expensive in personnel time. The committee concludes that the time and costs of IRB review (among institutions that charge for IRB review) that were experienced by the previous users of the VSD (Geier and Geier, 2004) are fair and within the normal range for IRB review of various types of research proposals, given the nature of these proposals.
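The scheduling problem noted above, in which approvals last 1 year but are granted on different dates, can be made concrete with a little date arithmetic. The Python sketch below is a hypothetical illustration and not part of the guidelines: it assumes that each approval expires 365 days after it is granted, and the function name `joint_access_window` is introduced here for the example.

```python
from datetime import date, timedelta

# Assumption: a "1 year" approval is approximated as 365 days.
APPROVAL_DAYS = 365


def joint_access_window(approval_dates):
    """Return (start, end) of the period in which every IRB approval is valid.

    The window opens when the last approval is granted and closes when the
    first approval expires. Returns None if the first approval expires
    before the last one is granted.
    """
    start = max(approval_dates)
    end = min(approval_dates) + timedelta(days=APPROVAL_DAYS)
    return (start, end) if start < end else None


# Made-up grant dates from three IRBs, for illustration only.
example = [date(2004, 1, 15), date(2004, 3, 1), date(2004, 7, 1)]
window = joint_access_window(example)
```

With these made-up dates, the researchers could work under all three approvals only from the last grant date until the first approval lapses, roughly half of the nominal year.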

Burdens Caused by Multiple Institutional Review Board Applications

Even with the reasonable and customary requirements of IRBs, efforts could be made to make the IRB application process less burdensome. Independent external researchers should not be unduly hindered or delayed in accessing VSD data, so it is important that the IRB review process move quickly, without jeopardizing the careful review of provisions for protecting confidentiality. Because independent external researchers are required to gain IRB approval from multiple MCOs for any VSD study, the committee believes that in the spirit of public access and transparency, unnecessary hurdles imposed on those wishing to use the VSD data sharing program should be minimized.

Use of an IRB authorization agreement could be one way to streamline the IRB application process for independent external researchers. An IRB authorization agreement allows an institution to rely on another institution’s IRB for review and continuing oversight of its human subjects research (HHS, 2002). IRB authorization agreements can be used for all human subjects research at an institution or can be limited to specific research protocols (HHS, 2002). The IRB that conducts the review reports its findings and actions to appropriate officials at the other institution. The institution that delegated its IRB review is responsible for ensuring compliance with the IRB’s determinations, even though it is relying on the IRB of the other institution. For research proposals submitted through the VSD data sharing program, use of IRB authorization agreements could streamline the IRB approval process for independent external researchers by reducing the number of MCO IRB applications that must be submitted.
Recommendation 3.9: To facilitate use of the VSD data sharing program, the committee recommends that the NIP work with the VSD-participating MCOs to determine the feasibility of using IRB authorization agreements for VSD research proposals.

Burdens Caused by Institutional Review Board Requirements for Reanalyses

The committee finds that the requirement for IRB approval from each MCO whose data would be examined in a reanalysis of a previously published study could be burdensome and may inhibit reanalyses. To do a reanalysis, independent external researchers are required to seek IRB approval from each MCO whose data were included in the final dataset. If one IRB denies the application, the researchers cannot conduct a true reanalysis. That potentially reduces the value of the data sharing program because there is no recourse if one IRB chooses to deny or limit access to final datasets from studies that have already been published. That a study is a reanalysis means that the MCOs’ IRBs approved a previous analysis of the final dataset. The committee understands that there is a new disclosure risk in allowing different researchers access to such a dataset, but the previous approval for analysis of the dataset and rigorous confidentiality provisions in place at the RDC argue at least for expedited review of IRB applications for reanalyses.

Recommendation 3.10: The committee recommends that the NIP work with the MCOs participating in the VSD and America’s Health Insurance Plans (the VSD contractor) to evaluate the feasibility of streamlining the IRB review process for audits or broader reanalyses in accordance with appropriate regulations.

Use of the NCHS Research Data Center

Confidentiality Protections at the RDC

When independent external researchers access MCO data, NCHS takes extensive measures to ensure the confidentiality of individually identifiable information. When MCO-affiliated or NIP-affiliated researchers use VSD data, their employers have provisions (for example, the possibility of termination of employment) that can help to ensure that confidentiality is not violated. When independent external researchers are granted access to data, the primary way to ensure confidentiality is to protect it at the time of data analysis. For the VSD, the confidentiality protections operate through the restrictions in place at the NCHS RDC. When independent external researchers want to use VSD data for a particular study, NCHS must prepare data files that contain only the data required by the approved proposal. Researchers must work within the physical confines of the RDC, and no electronic or hard copies of data files or documents may leave the RDC without passing disclosure limitation review (CDC, 2004a).
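One customary element of disclosure limitation review at the RDC, described in this section, is that table cells with fewer than five observations are blocked before a table leaves the RDC. The Python sketch below shows that single rule under stated assumptions: the function name `suppress_small_cells` and the use of `None` as the mask are choices made here for illustration, zero counts are treated as falling below the threshold, and the other restrictions the guidelines describe (for example, on geographic variables and case listings) are not modeled.

```python
# Threshold taken from the rule described in the text: cells with fewer
# than five observations are blocked.
MIN_CELL_COUNT = 5


def suppress_small_cells(table, threshold=MIN_CELL_COUNT, marker=None):
    """Return a copy of a table of counts with small cells masked.

    table is a list of rows, each a list of integer counts. Any count
    below the threshold is replaced by marker; other counts are kept.
    """
    return [
        [count if count >= threshold else marker for count in row]
        for row in table
    ]


# Made-up counts, for illustration only.
raw_table = [[12, 3], [5, 0]]
released_table = suppress_small_cells(raw_table)
```

The function returns a new table and leaves the input untouched, so the unmasked counts never need to be modified in place.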
Restrictions go beyond personal identifiers and include unique or unusual combinations of elements that might apply to few people (for example, inpatient admission of a 56-year-old man to a specific MCO on a specific date could well be used to identify a particular person). Therefore, table cells with fewer than five observations are customarily blocked by NCHS before a table leaves the RDC (CDC, 2003c), as are tables with geographic variables in any dimension, models with geographic variables as outcome variables, or case listings (CDC, 2004a).

Fair Application of Confidentiality Provisions

The committee is concerned that the restrictions placed on independent external researchers, compared with NIP-affiliated or MCO-affiliated VSD researchers, are not applied equitably. For example, independent external researchers using the VSD data sharing program are not permitted to see table cells that contain fewer than five observations (CDC, 2004a). However, internal VSD researchers have published papers in which the cells in some tables contain fewer than five observations (Verstraeten et al., 2003a). It is understandable that some additional restrictions on data access need to be in place for researchers not affiliated with one of the parties to the VSD contract, but equitable application of the confidentiality restrictions on all researchers will help to ensure public trust in the VSD data sharing program. The committee concludes that some of those concerns can be addressed by use of an independent review committee for oversight of research protocols (see Recommendation 5.3).

Enforcement of Confidentiality Provisions

Violations of the confidentiality provisions of the RDC are subject to federal law and are punishable under Title 18 of the United States Code, section 1001, by a fine of up to $10,000 or imprisonment for up to 5 years (CDC, 2004a). To help to ensure that the confidentiality of individually identifiable VSD data is not jeopardized, data requesters should be informed clearly about the penalties and about the strict sanctions for any violations of confidentiality. An understanding that there will be strict enforcement of confidentiality requirements may help to ensure that researchers take their responsibility to safeguard confidentiality very seriously. The general rules for use of the NCHS RDC require that researchers sign the Agreement Regarding Conditions of Access to Confidential Data in the Research Data Center of the National Center for Health Statistics (can be found in Appendix G) (CDC, 2004a).
This agreement states that:

“Deliberate violation of any of these conditions may result in cancellation of the data access agreement, and the researcher may be escorted from the premises by the duly authorized Federal protection service on duty at NCHS. The researcher may also be barred from any future use of the RDC upon review and determination by the Director of NCHS that this is necessary to protect the integrity and confidentiality of the RDC.”

On the basis of the information that is included for the project-specific requirements for the VSD (CDC, 2004a), it is not clear whether that provision applies to use of VSD data at the RDC or only to use of NCHS data at the RDC.

Recommendation 3.11: Because the confidentiality concerns are integral to the continuation of the VSD, the committee recommends that NCHS in conjunction with the MCOs develop policies and procedures to address confidentiality violations of VSD data and that they be clearly described in the VSD data sharing program guidelines and the agreements that external researchers must sign before using the RDC.

Costs for Use of the RDC

Most researchers in any field of health or medicine must obtain grant or contract funding to pay the costs of doing research, including the cost of access to data. As with other databases, the VSD data holder (NCHS) incurs costs (primarily personnel costs) when it allows independent external researchers access to the VSD database. Submission of a proposal generates personnel costs to NCHS for its review and follow-up. The committee believes that the costs to external researchers for use of the VSD are reasonable, compared with the costs for using other data enclaves. NCHS should not be expected to cover the costs of these activities out of its current funding, which is allocated for other activities. It is reasonable to expect independent external researchers who want to use VSD data to acquire or provide funding to support their research.

Recommendation 3.12: The committee concludes that it is reasonable to expect researchers who request access to VSD data to have their own funding and it therefore recommends that RDC costs not be waived for independent external researchers.

Reporting Objectives, Methods, and Results

Sharing information about any studies or analyses done with VSD data can have many benefits. First, providing information about the objectives, methods, and results of studies promotes transparency, and greater transparency can enhance public trust (McComas, 2004a,b). Sharing details about the methods used for particular studies can assist other researchers who are interested in pursuing a reanalysis or a corroboration of a study with a different database.
In the next chapter, the committee describes why it is important to share as much programmatic information as possible about current and completed VSD studies. The committee discusses how the NIP should share information about any studies conducted by NIP-affiliated or MCO-affiliated researchers. Likewise, the committee finds that the need for transparency should also apply to independent external researchers who use VSD data. If NIP-affiliated and MCO-affiliated researchers will be asked to share their research protocols, fairness and transparency require that external researchers do the same.

As stewards of the data, the NIP and NCHS should know how VSD data are being used. One way is to ensure that the NIP and NCHS have standardized information on each study conducted through the VSD data sharing program by requiring independent external researchers to provide specific information on any studies conducted through the program. Requiring standardized, specific information from users of the data sharing program and standardized research protocols from internal VSD researchers will leave all VSD researchers subject to similar information-sharing requirements.

Recommendation 3.13: The committee recommends that, as a condition of accessing VSD data, all independent external researchers who use the VSD data sharing program be required to submit a report to the NIP (with a copy to NCHS) within a reasonable time (to be determined by the NIP) on the status of their study, the type of study conducted (an audit, a broader reanalysis, a corroboration study, or an investigation of a new hypothesis), the results obtained, and their planned further activities. The reports should be made public by the NIP and should be easily accessible.

Correspondingly, when researchers are preparing a public release of findings from data that were accessed through the VSD data sharing program (for example, in a presentation at a conference or meeting or in a journal article), the committee finds it reasonable to expect that the data steward (the NIP and NCHS) will be notified of the release of the findings within a reasonable time. When the findings are released, there may be questions about the benefits and limitations of the database, the study population, and the analyses that are appropriate given the structure of the database. The data steward may be called on to explain how the findings presented in the publication or presentation support or contradict other findings derived from the same database.
Being able to explain how the findings compare with other findings derived from the database can help to give the public the appropriate context for understanding the new findings. It is therefore appropriate for the data steward to have an opportunity to prepare for such questions. However, the committee believes that a requirement for advance notification of the release of findings should not be used as an opportunity to censor findings or to release similar findings in advance of the researchers’ planned release; any evidence to the contrary should be reviewed by an independent review committee.

Recommendation 3.14: The committee recommends that, as a condition of accessing VSD data, all independent external researchers who use the VSD data sharing program be required to submit to the NIP (with a copy to NCHS) a copy of a manuscript intended for publication at least 30 days before submission to a journal or other print or electronic media. Copies of presentations to be delivered at conferences or meetings that are open to the public or that have media coverage should also be submitted to the NIP and NCHS at least 15 days before presentation. Failure to comply with either of those reporting requirements could be grounds for NCHS to deny future access to VSD data through the data sharing program.

Creation of a Basic Analytic File

The VSD is a complex database, and generally only sophisticated users will be able to master use of its data files. The NIP and NCHS may want to explore the creation of a basic analytic file that could be used to answer many questions of interest to external researchers. Such a data file would not replace all specific data files that might be requested by independent external researchers for particular studies, but it could serve as a useful resource for many researchers to develop and refine hypotheses and to begin understanding how to use the VSD files in the RDC. The committee recognizes that the creation of such a data file, with full protections of confidentiality, would require considerable time and effort. The cost of such a resource should be assessed and made publicly available so that Congress and other stakeholders would have the information necessary to make an informed decision regarding a possible investment of public funds. The creation of such a data file should be pursued only if additional funds are made available for the purpose. The availability of a cleaned, standard analytic file to a variety of researchers could help to foster appropriate use of the dataset by external researchers and to reduce the burden on both NCHS and researchers who want to analyze VSD data.