Appendix H
Data Processing for Program Evaluation
The committee solicited data from NIFA to explore the relationship between resource input and the output of AFRI-funded research. Information solicited for the analyses included the following for all new grants funded from 2009 to 2012:
• Title, type, and size of grants (e.g., total value, annual amount, and number of years funded).
• Duration of each award (e.g., start and actual or expected end date).
• Characteristics of each award, such as
o Program area (e.g., foundation, challenge-area, or fellowship grant).
o Award type (e.g., standard award, CAP, conference grant, or FASE award).
o Project function (e.g., research, education, extension, or integrated, that is, integrating at least two of the three functions).
o Percentage dedicated to research, education, and extension of each award.
o Program code, which reflects the subject area of the project.
o Percentage basic and percentage applied research.
o Awardee institution and type of institution (e.g., 1862 land-grant university, 1890 land-grant university, or public non–land-grant university).
• Demographics of principal investigators (PIs), including
o Rank of each PI (e.g., assistant, associate, or full professor).
o Each PI’s current and pending funding.
• Human resources, including
o Number of co-PIs.
o Number of undergraduate students and number of months supported.
o Number of graduate students and number of months supported.
o Number of postdoctoral researchers and number of months supported.
• Research output as reported in USDA CRIS.
The committee also requested the same data for at least 1 year of the NRI for comparison. NIFA submitted multiple Excel files, each of which contained some of the requested data exported from CRIS. The files as submitted were not organized in a way that would allow regression analyses. For example, some files included duplicate entries for a grant (mostly for the continuous grants that require annual reporting). In addition, the number of undergraduate and graduate students and postdoctoral researchers trained and the number of months trained (also called number of student months) were all grouped together in one column. Those data had to be parsed into separate columns, one for each of the following categories: number of undergraduate students supported, number of undergraduate-student months, number of graduate students supported, number of graduate-student months, number of postdoctoral researchers supported, and number of postdoctoral-researcher months. To render all the submitted data in an analyzable form, National Research Council staff sorted the data, removed duplicate entries, collated data from the various files into one Excel file, and created dummy variables for the regression analyses. In the process of sorting the data, the staff noticed some gaps in the data and a few inconsistencies among datasets (e.g., some entries for PI ranks or grant types were missing). In those cases, the staff either sought the information from the Web or sought clarification from NIFA staff.
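The cleanup steps described above (removing duplicate entries, parsing the combined training column, and creating dummy variables) can be sketched in Python. This is only an illustration under stated assumptions, not the procedure the staff followed, which was carried out in Excel. The field names (`grant_id`, `report_year`, `training`, `award_type`), the `UG 2/18; GR 1/12; PD 0/0` layout of the combined column, and the rule of keeping the most recent report for each grant are all hypothetical; the appendix does not describe the actual export format or the deduplication rule.

```python
import re

# Hypothetical layout of the combined training column:
#   "UG 2/18; GR 1/12; PD 1/6"
# read as <category> <number of people>/<months supported>.
TRAINING_PATTERN = re.compile(r"(UG|GR|PD)\s+(\d+)/(\d+)")
CATEGORY_NAMES = {"UG": "undergrad", "GR": "grad", "PD": "postdoc"}


def split_training(cell):
    """Split one combined training cell into six separate fields.

    Categories absent from the cell are left as None so that a missing
    value is not silently recorded as zero.
    """
    fields = {}
    for name in CATEGORY_NAMES.values():
        fields[f"{name}_count"] = None
        fields[f"{name}_months"] = None
    for code, count, months in TRAINING_PATTERN.findall(cell or ""):
        name = CATEGORY_NAMES[code]
        fields[f"{name}_count"] = int(count)
        fields[f"{name}_months"] = int(months)
    return fields


def drop_duplicates(rows, key="grant_id", order="report_year"):
    """Keep one row per grant: the row with the latest report year.

    Keeping the latest report is an assumption made for this sketch.
    """
    latest = {}
    for row in rows:
        kept = latest.get(row[key])
        if kept is None or row[order] > kept[order]:
            latest[row[key]] = row
    return list(latest.values())


def add_dummies(rows, column, reference):
    """Add a 0/1 indicator field for every level of `column` except `reference`."""
    levels = sorted({row[column] for row in rows} - {reference})
    for row in rows:
        for level in levels:
            row[f"{column}_{level}"] = int(row[column] == level)
    return rows


def clean(rows):
    """Deduplicate, split the training column, and add award-type dummies."""
    rows = drop_duplicates(rows)
    rows = [{**row, **split_training(row.get("training"))} for row in rows]
    return add_dummies(rows, "award_type", reference="standard")
```

Calling `clean` on a list of row dictionaries returns one row per grant with six training fields and one indicator field per non-reference award type. The reference level is omitted so that the indicators are not collinear with a regression intercept.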
In addition to the Excel files, the committee received thousands of folders, each of which contained all the files for PIs’ and any co-PIs’ pending and current funding in pdf. The National Research Council staff had to identify the file that corresponded to each grant and manually record, in the Excel file for regression analyses, the number of pending and current funding sources that the PI had from various agencies or types of organizations. For about 5% of the awards, the staff could not identify the pdf files that contained the pending and current funding information. The committee found the results of the analyses without those data rather robust, and their addition would be unlikely to alter the results. To limit the time and effort required of the staff and NIFA, the committee decided not to seek those data from NIFA.
The committee did not receive any Excel files that had a column that specified the number of co-PIs on each AFRI award. However, such information was embedded within each folder that had all the pdfs for PIs’ and co-PIs’ pending support. Under the assumption that all co-PIs completed a form to disclose their current and pending support, the number of those forms completed for each project was used as a proxy for the number of co-PIs on each project. For a sample of projects, the number of co-PIs determined that way was compared with that listed in CRIS in order to confirm that the committee’s method of tallying the number of co-PIs in a project was reasonable.
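The proxy count and the spot check against CRIS described above can be sketched as follows. This is a minimal illustration, not the committee's actual tally: the `(project_id, filename)` input structure and the function names are hypothetical, and the appendix does not say whether the PI's own form was excluded from the count, so the sketch simply counts whichever forms it is given.

```python
from collections import Counter


def count_forms(form_index):
    """Count current-and-pending support forms per project.

    `form_index` is an iterable of (project_id, filename) pairs, one pair
    per form found in a project's folder (a hypothetical structure). The
    caller decides which forms to pass in, for example co-PI forms only.
    """
    return Counter(project_id for project_id, _ in form_index)


def agreement_rate(proxy_counts, cris_counts):
    """Share of sampled projects whose proxy count equals the CRIS count.

    `cris_counts` maps each sampled project_id to the number of co-PIs
    listed in CRIS; projects with no forms get a proxy count of 0.
    """
    if not cris_counts:
        raise ValueError("sample is empty")
    matches = sum(
        1
        for project_id, cris_n in cris_counts.items()
        if proxy_counts.get(project_id, 0) == cris_n
    )
    return matches / len(cris_counts)
```

A high agreement rate on the sample would support using the form count as a proxy. A low rate would suggest that some co-PIs did not submit forms or that forms were filed in the wrong folder.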
Although CRIS includes data on publications, presentations, and conferences held in connection with each project submitted by PIs, PIs cannot add information to the system after the project terminates. Given the lag time between the conduct of research and the publication of results, it is unlikely that all publications from every project are accounted for by CRIS. Therefore, the committee solicited help from Yunguang Chen, of Oregon State University, to search Google Scholar for publications that acknowledge AFRI as a source of funding for the 2009–2012 grants and for the NRI awards initiated in 2008. Written materials submitted to the present committee by external sources, including data submitted by NIFA, are listed in the project’s public-access file and can be made available to the public on request.