E
Comparison of Data Sharing Programs That Use Data Enclaves
|
Medical Expenditure Panel Survey (MEPS) |
Census Research Data Centers (RDCs) |
General Information on the Data Sharing Programs |
||
Type of Data |
MEPS is the third (and most recent) in a series of national probability surveys conducted by the Agency for Healthcare Research and Quality (AHRQ) on the financing and use of medical care in the United States. MEPS collects data on the specific health services that Americans use; how frequently they use them; the costs of the services; how they are paid for; and the cost, scope, and breadth of private health insurance held by and available to the U.S. population. |
Census data include microdata and data that cannot be released publicly because they contain detailed information on geographic location and other characteristics about the firms or households that could be used to determine their identities. |
Health and Retirement Study (HRS) |
California Health Interview Survey (CHIS) |
Vaccine Safety Datalink (VSD) |
The University of Michigan HRS surveys more than 22,000 Americans over the age of 50 every two years. The survey collects data on Americans’ physical and mental health, insurance coverage, financial status, family support systems, labor-market status, and retirement planning. Registered users can download HRS public data products free. Restricted-release files contain sensitive information that can be made available only under specified conditions. |
CHIS is a telephone survey of adults, adolescents, and children from all parts of the state of California. The survey is conducted every two years. Some of the data collected are prepared for public release as free public-use files. The files are designed to minimize the risk of respondent identification yet preserve the broadest range of descriptive demographic data. Restricted-use files at CHIS are available at the Data Access Center (DAC) and contain detailed geographic identifiers and full demographic descriptions for the survey respondents from the 2001 survey. The files also include responses to sensitive questions that are excluded from the public-use data files. |
The VSD is a large linked database that was developed in 1991 through the collaborative efforts of CDC and several private managed care organizations (MCOs). The VSD currently includes data from administrative records for more than 7 million members of eight MCOs. In the VSD, vaccination records, patient characteristics, and health outcomes are linked, allowing the VSD to serve as a unique and potentially powerful resource for the continuing evaluation of vaccine safety. |
|
Medical Expenditure Panel Survey (MEPS) |
Census Research Data Centers (RDCs) |
Elements of Study Proposal |
||
Identification of Specific Variables to Be Studied |
• Researchers must list the data files to which they would like access. • Researchers will have access only to the variables identified in their approved proposals. |
• Researchers must identify the specific dataset to be analyzed. |
Confidentiality Protection Measures in Study Design |
• The proposed study must be done without compromising the confidentiality of respondents. • Researchers must read and comply with the CFACT-DC User Guide. |
• Researchers must obtain Special Sworn Status. • Researchers can use confidential data only for the purpose for which the data are supplied. |
Feasibility of Study and Data-Resource Assessment |
Decision Criteria: • Can the research be conducted with the available data? |
• The proposal must show that the research can be conducted successfully with the proposed method and available data. • The proposal should show the need for and importance of using confidential data. |
Health and Retirement Study (HRS) |
California Health Interview Survey (CHIS) |
Vaccine Safety Datalink (VSD) |
• A specific dataset must be chosen from the list of restricted-use datasets. • Researchers must state why the unrestricted data would not be adequate for their research purpose. |
• Researchers must request variables using the DAC variable lists. |
• Researchers must provide a list detailing data requested: data system, files, years, and variables. • Only variables needed to conduct the proposed analyses will be included in the analytic file. |
• Researchers must submit a Restricted Data Protection Plan to HRS. • Risk of disclosure of restricted information is considered based on the users’ description of expected analysis and results. • The confidentiality agreement restricting disclosure and use of data from the Michigan Center on the Demography of Aging Data Enclave must be read and signed by the researchers. • All users will be periodically audited by HRS to ensure that all conditions of the restricted data agreement are being met. Various data from 1992–2004 are available. |
Decision Criteria: • Is there a risk of disclosure of confidential information? • Does the project propose to merge user-supplied data with CHIS data? • What additional risks of disclosure are associated with the merged dataset? • Researchers must sign a Nondisclosure Affidavit and Data Access Confidentiality Agreement before starting their work. |
• All users must sign an affidavit of confidentiality promising not to attempt to identify respondents. • Researchers can use confidential data only for the purpose for which the data are supplied. |
• Scientific and technical feasibility of the project, including availability of data files being requested, is considered. |
Decision Criteria: • Is sample size sufficient? • Are CHIS data appropriate for answering the research questions proposed? • Are the variables requested related to the proposed analyses? |
• No criteria specified for review of VSD proposals. • No publicly available data. |
|
Medical Expenditure Panel Survey (MEPS) |
Census Research Data Centers (RDCs) |
Consistency with Mission of the Organization |
• The proposed study must be in accordance with the mission of AHRQ (this is specified in its authorizing legislation). |
• All projects must provide a benefit to Census Bureau programs. The benefit requirement is an explicit proposal criterion and is required by law (Title 13, Sec. 23, U.S.C.). |
IRB Approval |
(Information not available.) |
• The need for IRB approval is based on where the confidential data come from, and the researchers must follow the rules and regulations of that agency. |
Other Guidelines |
||
Costs and Fees |
• At the data center: To cover technical assistance, simple file construction, and up to 2 hours of programming support, there is a $150.00 fee. Additional programming support can cost $80.00 an hour. |
• $3,125/month for a full-time seat • A project that requires a 40% level of access (about 2 days/week) for a period of 1 year would cost $15,000. • Additional fees may be charged to projects that use datasets outside the core or that impose other special costs on CES, the Census Bureau, or the RDC. |
Who Reviews Researchers’ Application and Proposal? |
• The manager at the CFACT-DC coordinates the review of each proposal. |
• Both the RDC and the Census Bureau must approve the proposal. • The RDC administrator reviews the preliminary proposal and suggests ways to improve or refine it. The RDC administrator must approve the preliminary proposal before the researchers can submit the final proposal. |
Health and Retirement Study (HRS) |
California Health Interview Survey (CHIS) |
Vaccine Safety Datalink (VSD) |
• Proposed project must be in accordance with the mission of the MiCDA. |
• Study must be compatible with the purpose of CHIS. |
• Not specified for VSD data. |
• Researchers must be affiliated with an institution with an NIH-certified human subjects review process. • A signed form from the researchers’ institution certifying that human subjects review was completed is required. |
• A copy of the approval or exemption from the home institution’s IRB is required. |
• Researchers must obtain IRB approval from each MCO whose data they would need to undertake the analyses. |
• Academic (faculty members of accredited institutions of higher education) or government (federal, state, or local): $200/day. • Student (currently enrolled in an accredited graduate or undergraduate program): $50/day. • Other: $500/day. |
Costs are developed on an individual basis and include: • $500 initial set-up fee • $65/hour for guest research access • $140/hour for programming services • $120/hour to run programs • $1,000 minimum fee per project. Charges are determined by actual time spent on the project. |
• Set-up charge of $500/day for merging files or creating custom file formats. • Guest researchers at $200/day (2-day minimum, 10-day maximum). |
• The HRS DCC-WG reviews the application. When the application is adequate, the DCC-WG contacts the researchers and lets them know that they can submit the application to their local IRB for review. Once the researchers have IRB approval, their application is complete, and they can submit it to the DCC for review and final approval. |
• DAC staff prepares a summary of the application. The CHIS Data Disclosure Review Committee meets biweekly, reviews the application, and makes a recommendation to the CHIS PI to approve or reject the application or to request further information from the researchers. |
• Completed proposals are sent to the NCHS RDC for review by a committee consisting of the director of NCHS RDC, the RDC staff liaison, the NCHS confidentiality officer, and the director of the NCHS data division whose data are included in the proposal. • Approval for use of the VSD requires approval by the MCOs’ IRBs. |
|
Medical Expenditure Panel Survey (MEPS) |
Census Research Data Centers (RDCs) |
Response Time |
• Applications are accepted continuously. • About 4–6 weeks after a proposal is approved, researchers can go into the CFACT-DC. |
• There is at least a 6-month period between the deadline for the final proposal submission and the commencement of research. |
Assistance from Program Staff |
• Currently, there are limited staff resources to help at the CFACT-DC, so extensive programming support must be contracted ahead of time for a fee. |
• Researchers work closely with the RDC administrator to develop a preliminary proposal. |
Available Data Programs |
SAS, Stata, SPSS, and SUDAAN are the software packages most suitable for analyzing MEPS data. |
(Information not available.) |
Health and Retirement Study (HRS) |
California Health Interview Survey (CHIS) |
Vaccine Safety Datalink (VSD) |
• When HRS receives an application, it is logged and review is scheduled. |
• The CHIS PI will respond to the request within 21 days after receiving the application. • Computer programs that are e-mailed to the DAC staff will be run within five working days. |
• Response time varies, but NCHS tries to respond to the initial proposal as soon as possible. • The time between securing proposal approval and using the RDC also varies (it depends on the complexity of the work, how long it will take to prepare the data files, and what other work is already scheduled at the RDC). |
At the MiCDA data center: • Enclave users are responsible for developing and implementing all data-management procedures necessary to produce datasets to be used for analysis. • Enclave staff provide assistance with dataset installation, software installation, operating-system problems, statistical-package operation, backups, and user-interface issues. • Staff members do not provide assistance in carrying out statistical analysis. |
• Researchers are encouraged to consult the DAC manager while developing their proposals. • Researchers are provided with limited technical assistance on CHIS variables, weighting, and variance calculation. • A senior programmer contact is assigned to the project. • Dummy data files are sent to the researchers. |
• Researchers are encouraged to check with RDC staff before writing their proposals to ensure that the data of interest can be made available to them. • Researchers must be able to conduct their analyses with the software specified in their research proposal. |
Stata (v6.0), SAS (v6.12), SPSS (v9.0). |
SAS, SPSS, Stata, Stat/Transfer, SUDAAN, and WesVar; custom software is installed on request. |
Hardware: Pentium computers with Windows 2000. SAS is the standard program for use of VSD data, but other languages can be made available with sufficient lead time. |
|
Medical Expenditure Panel Survey (MEPS) |
Census Research Data Centers (RDCs) |
Where Data Can Be Accessed |
• Public-use datasets can be downloaded from the MEPS web site (http://www.meps.ahrq.gov). • Restricted data can be accessed by approved researchers at the CFACT-DC, in Rockville, MD. • Researchers may also choose to contract with the AHRQ data-processing contractor (Social and Scientific Systems) to develop and run their programs. |
• All analysis must be done on site in the RDC. |
Disclosure Review Before Material Leaves the RDC |
• All materials must be reviewed by AHRQ staff before they can be removed from the data center. |
• Researchers cannot remove any confidential data from the RDC on any medium. All output must be submitted to Census Bureau personnel for disclosure review. |
Other Requirements |
Researchers must also provide: • List of publication plans and other intended uses of data in the proposal • Sources of funding • Estimated timeframe for viewing data and completing their work • Resumes or CVs for all persons who will access the data center |
Researchers must provide: • Purpose of the research • Funding source • CVs for all investigators • Abstract of the proposal • Project description • Statement of benefits to Census Bureau. Preliminary and final proposals are completed through the Census Bureau web site. |
Health and Retirement Study (HRS) |
California Health Interview Survey (CHIS) |
Vaccine Safety Datalink (VSD) |
• Public-use files can be accessed through the HRS web site (http://hrsonline.isr.umich.edu/data/avail.html). • Restricted-use data can be viewed by approved researchers at the MiCDA data enclave in the Institute for Social Research. |
• Public-use files can be accessed through the CHIS web site after free registration (http://www.chis.ucla.edu/main/default.asp?page=puf). • Restricted data can be viewed at the DAC at the UCLA Center for Health Policy Research after submitting and gaining approval of a proposal. • Researchers can also gain access to restricted files after proposal approval by e-mailing computer programs to DAC staff, who will run them and send results to the researchers. |
• All analyses must be done on site in the RDC in Hyattsville, MD. • A maximum of three collaborating researchers can sit at a computer station at the RDC. |
• Users are allowed to remove results of statistical analysis from the data enclave only after enclave staff have conducted a disclosure-limitation review to protect respondent confidentiality. |
• DAC manager or senior programmer conducts a disclosure review for all output before it is removed from the DAC. |
• All output and materials removed from the RDC are subject to disclosure-limitation review. • Researchers must provide a list of the table shells, equations, and test statistics of statistical output they plan to take out of the RDC. |
• If institutional or physical circumstances of the researchers change, HRS is to be contacted to modify the underlying agreement. • Yearly recertification of the certification and data agreement is required. • Researchers must submit a renewal request if the initial agreement expires and they want continued access to the data. |
If the output contains many small cells, the programmer recommends recoding variables to eliminate them. If the output contains only a few small cells, the programmer must suppress them and apply complementary suppression. DAC applications include: • DAC application forms • Personal and organizational information • Service request • Abstract |
Researchers must provide: • Current resumes or CVs • Dates of proposed use of the RDC • Source of funding • Summary of proposed study • Background of the study • Data dictionary. NCHS complies with the 308(d) Confidentiality Statute. |
Health and Retirement Study (HRS) |
California Health Interview Survey (CHIS) |
Vaccine Safety Datalink (VSD) |
Researchers must also provide: • Current resumes or CVs • Dates of proposed tenure at the data enclave • Funding sources for user project and for data enclave cost recovery |
Supplemental materials include: • Biographic sketch or resume • List of CHIS variables requested • Detailed description of any user-supplied files |
|
(Information not available.) |
Researchers must also: • Acknowledge CHIS in their manuscript for publication • Submit copies of publications to DAC |
• External researchers are required to submit a copy of the data-sharing guidelines and a copy of the signed confidentiality agreement with any manuscript submitted to a journal. • Researchers must include certain disclaimers in their manuscripts. |
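The Census RDC figures in the Costs and Fees row are consistent with a simple pro-rata calculation: $3,125 per month for a full-time seat, scaled to a 40% level of access over 12 months, gives the $15,000 quoted. The short Python sketch below reproduces that arithmetic. The function name `rdc_seat_cost` and the pro-rata rule itself are assumptions inferred from the table's single example, not a published Census Bureau formula, and the additional fees mentioned in the same cell are not modeled.

```python
from fractions import Fraction

# Monthly rate for a full-time seat, taken from the Census RDC cell of the table.
FULL_TIME_SEAT_PER_MONTH = 3125  # dollars


def rdc_seat_cost(access_level_percent: int, months: int) -> Fraction:
    """Estimate the seat cost in dollars under an ASSUMED pro-rata rule:
    monthly full-time rate x number of months x fraction of access."""
    if not 0 <= access_level_percent <= 100:
        raise ValueError("access_level_percent must be between 0 and 100")
    if months < 0:
        raise ValueError("months must be non-negative")
    # Exact rational arithmetic avoids floating-point rounding on money.
    return Fraction(FULL_TIME_SEAT_PER_MONTH * months * access_level_percent, 100)


# The table's example: 40% access (about 2 days/week) for 1 year.
cost = rdc_seat_cost(40, 12)
print(f"${float(cost):,.2f}")  # prints $15,000.00
```

Under the same assumed rule, a full-time seat for one year would come to $37,500; actual charges may differ where a project uses datasets outside the core or imposes other special costs.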
REFERENCES
AHRQ (Agency for Healthcare Research and Quality). 2004a. CFACT Data Center (CFACT-DC). [Online]. Available: http://www.meps.ahrq.gov/datacenter.htm [accessed June 30, 2004].
AHRQ. 2004b. CFACT Data Center (CFACT-DC): User Guide/Application Information. [Online]. Available: http://www.meps.ahrq.gov/datacenter/dcuserguide.htm [accessed June 30, 2004].
AHRQ. 2004c. Overview of the Medical Expenditure Panel Study. [Online]. Available: http://www.meps.ahrq.gov/WhatIsMEPS/Overview.HTM [accessed June 30, 2004].
AHRQ. 2004d. FAQs about the Data Center. [Online]. Available: http://www.meps.ahrq.gov/FAQs/FAQ_DataCenter.HTM [accessed May, 2004].
CDC (Centers for Disease Control and Prevention). 2004. Procedures and Costs for Use of the Research Data Center. Notice and Request for Action. Federal Register 69:222.
Census Bureau. 2004a. CES RDC Research Proposal Guidelines. [Online]. Available: http://148.129.75.160/ces.php/guidelines [accessed July 8, 2004].
Census Bureau. 2004b. The Research Data Center Program. [Online]. Available: http://148.129.75.160/ces.php/research [accessed July 8, 2004].
CHIS (California Health Interview Survey). 2003. CHIS Research Clearinghouse. [Online]. Available: http://www.chis.ucla.edu/rc/ [accessed November 3, 2004].
CHIS. 2004a. About the California Health Interview Survey. [Online]. Available: http://www.chis.ucla.edu/about.html [accessed June 28, 2004].
CHIS. 2004b. The Data Access Center at the UCLA Center for Health Policy Research. [Online]. Available: http://www.chis.ucla.edu/chis_dac.html [accessed June 28, 2004].
HRS (Health and Retirement Study). 2004a. Background Information. [Online]. Available: http://hrsonline.isr.umich.edu/intro/sho_intro.php?hfyle=uinfo [accessed June 16, 2004].
HRS. 2004b. HRS Restricted Data: Application Processing Overview. [Online]. Available: http://hrsonline.isr.umich.edu/rda/rdanarrative.htm [accessed July 1, 2004].
IOM (Institute of Medicine). 2005. Vaccine Safety Research, Data Access, and Public Trust. Washington, DC: The National Academies Press.