A robust peer review process is critical to the Congressionally Directed Medical Research Programs’ (CDMRP’s) mission of funding innovative health-related research. Peer review is used throughout the scientific research community, both for determining funding decisions for research applications and for evaluating publications on research outcomes. The purpose of peer review is to have proposed research assessed for scientific and technical merit by experts in the same field who are free from conflicts of interest (COIs) or bias and, in some instances, by lay people who might be affected by the results of the proposed study. The process should be transparent to the applicant as well as to the research and stakeholder communities in order to ensure that each application receives a competent, thorough, timely, and fair review. The peer review process is the standard by which the most meritorious science is developed, assessed, and distributed.
This chapter describes the process of assigning peer reviewers to panels and applications; the criteria and scoring used by peer reviewers; activities that occur prior to, during, and after the peer review meeting; and the quality assurance procedures used by CDMRP to ensure that the applications are scored and critiqued correctly. Peer review occurs after CDMRP has received full applications in response to the programmatic panel activities described in Chapter 4 (see Figure 5-1). Peer review panels are responsible for conducting both the scientific and technical merit review and an impact assessment of all of the full applications submitted to each CDMRP research program.
As described in Chapter 3, the selection of peer reviewers begins shortly after the award mechanisms are chosen by the programmatic panel during the vision setting meeting. The peer review support contractor works with the CDMRP program manager to determine the number of peer review panels that will be required and the types of expertise that will be necessary to review the anticipated applications.
Panel and Application Assignments
Depending on the program and the number of award mechanisms and anticipated applications, multiple peer review panels—tailored to the specific expertise required by the research program and award mechanism—may be needed for a single funding opportunity or topic area (for example, if several applications propose to investigate a specific protein group or metabolic pathway). In general, a peer review panel reviews applications for a single award mechanism (or a group of similar types of mechanisms). Because the review criteria differ to some extent by mechanism, having multiple panels allows peer reviewers to focus on the specific review criteria and programmatic intent of an award mechanism (Kaime et al., 2010). For example, in 2014 there were 249 separate review panels (and 3,195 peer reviewers) across 23 programs. Large programs, such as the Peer Reviewed Medical Research Program and the Breast Cancer Research Program, had the most peer review panels (85 and 38, respectively), and small programs, such as the Duchenne Muscular Dystrophy Research Program and the Multiple Sclerosis Research Program, had only one panel each (Salzer, 2016c).
The number of peer review panels and the mechanism(s) or topic(s) they cover are discussed by the integrated program team and inform the CDMRP program manager’s task order assumptions regarding the types and number of reviewers to be recruited (Salzer, 2016a). As mentioned in Chapter 3, the peer review contractor begins recruiting potential reviewers when the program announcement is publicly released. The committee notes that in its scanning of program announcements for 2016, there appear to have been about 5–6 months between the release of the program announcement and the peer review meeting.
In general, a single panel reviews a maximum of 60 applications; there is no minimum number of applications a panel may review (CDMRP, 2011). Scientific review officers (SROs) and consumer review administrators assign, respectively, scientist and consumer reviewers to specific panels (Kaime et al., 2010).
Each application is assigned to at least two scientist reviewers (a primary reviewer and a secondary reviewer) and one consumer reviewer, but reviewers are responsible for being familiar with all applications to be discussed and reviewed by their panel. Each scientist reviewer is generally assigned six to eight applications, but this may vary depending on the program and the guidance provided by the program manager.
One to four consumer reviewers serve on each panel, depending on the number of applications and the number of panels for the program. Each consumer reviewer is assigned at most 20 applications as these reviewers are required to review only selected sections of an application, such as the impact statement and lay abstract. Any consumer who has a scientific background is assigned to a panel that is reviewing applications outside his or her area of expertise to avoid confusion between the “science” and “advocacy” roles of the panel member (CDMRP, 2016f). A specialty reviewer may also be assigned to review and assess specific sections of applications, in addition to the two scientist reviewers (CDMRP, 2011). For example, for a clinical trial application, a biostatistician specialty reviewer would review the statistical analysis plan.
The criteria used to assign specific reviewers to an application include
COIs (discussed in Chapter 3) and expertise. Applicants also assign primary and secondary research classification codes to their applications; the codes can be used to assign applications to peer review panels, inform recruitment of peer reviewers, and balance panel workloads. Panel membership is confidential until the end of the review process for that funding year, when the names and affiliations of all peer reviewers for each research program are posted to the CDMRP website (CDMRP, 2016g).
Primary peer reviewers are assigned to applications on the basis of their expertise (CDMRP, 2011). Once applications are assigned to a panel, all scientist reviewers indicate their level of expertise for reviewing individual applications using an adjectival rating scale of high, medium, low, or none, based on review of the title and abstract. The scale and description are as follows (Salzer, 2016c):
- High: You are able to review the application with little or no need to make use of background material or the relevant literature. You have likely published in areas closely related to the science/topics presented in the application.
- Medium: You have most of the knowledge to review the application although it would require some review of relevant literature to fill in details or increase familiarity with the system employed. You may employ similar methodologies in your own work, or study similar molecules, processes, and/or topics, but you may need to review the literature for recent advances pertinent to the application.
- Low: You understand the broad concepts but are unfamiliar with the specific system or other details, and reviewing the application would require considerable preparation.
- None: You have only superficial or no familiarity with the concepts and systems described in the application.
The committee finds these definitions of expertise and experience to be very helpful and likely to be quite informative in assigning applications for review. There does not appear to be a similar process at the National Institutes of Health (NIH).
The committee assumes that the SRO uses the reviewers’ responses to confirm their expertise and to assign primary, secondary, and specialty reviewers (if applicable). In general, only reviewers who have indicated high or medium expertise for an application are assigned to be primary or secondary reviewers for that application; the primary reviewer for any
application must have high expertise (Salzer, 2016d). The peer review contractor also ensures that all reviewers on a panel have a minimum level of expertise.
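The assignment rule just described (primary reviewers must self-rate high expertise for an application; secondary reviewers may be high or medium) can be sketched as a simple filter. The function and data shapes below are illustrative assumptions, not CDMRP’s actual assignment software:

```python
# Illustrative sketch of the expertise-based assignment rules described
# above; the data structures and function name are assumptions, not CDMRP code.

def eligible_reviewers(expertise_ratings):
    """Given one application's self-ratings {reviewer: 'high'/'medium'/'low'/'none'},
    return the candidate primary and secondary reviewers.

    Per the rules above: a primary reviewer must have rated the application
    'high'; secondary reviewers may have rated it 'high' or 'medium'.
    """
    primaries = [r for r, level in expertise_ratings.items() if level == "high"]
    secondaries = [r for r, level in expertise_ratings.items()
                   if level in ("high", "medium")]
    return primaries, secondaries

ratings = {"Reviewer A": "high", "Reviewer B": "medium",
           "Reviewer C": "low", "Reviewer D": "none"}
primaries, secondaries = eligible_reviewers(ratings)
# primaries   -> ["Reviewer A"]
# secondaries -> ["Reviewer A", "Reviewer B"]
```

In practice the SRO would also apply the COI screens discussed in Chapter 3 before making final assignments; that step is omitted here.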
Preliminary Critiques and Initial Scores
Timing of the Preliminary Reviews
In general, it appears that the time between the deadline for the submission of a full application and the peer review meeting is about 2 months, although for some programs or awards this may extend to about 3 months. CDMRP states that approximately 40 hours of pre-meeting preparation over 4–6 weeks is required for peer review. This time includes registration, training, reviewing assigned applications, and writing critiques and comments for assigned applications. CDMRP states that peer reviewers typically have 3–4 weeks to review their assigned applications, although review time may vary by program, or if additional reviewers are added to the panel at a later time. In response to the committee’s solicitation of input, several peer reviewers stated that they would like more time to review and critique their assigned applications.
Preliminary critiques and scores are submitted to the electronic biomedical research application portal (eBRAP), after which the contractor SRO and the CDMRP program manager review them. Applications, preliminary critiques, and preliminary scores are also available electronically to all other panel members (not just assigned reviewers), except for reviewers who have a COI with a particular application.
CDMRP reports that peer reviewers then have 4–5 weeks to review all preliminary critiques and scores assigned to their panel before the panel meeting (Salzer, 2016c). This length of time is intended to allow the reviewers to become familiar with all the critiques before the meeting, so that the discussion at the meeting can be focused and the reviewers who were not assigned a specific application will have enough time to be informed about it and contribute to the discussion.
Peer Review Criteria
Each application is evaluated according to the peer review criteria published in the program announcement. Usually, two sets of scores are given during peer review. The first set consists of scores on evaluation criteria such as impact, research strategy and feasibility, and the transition plan; each of these criteria receives numeric scores from the primary and secondary reviewers. Other criteria, such as environment, budget, and application presentation, are also evaluated but do not receive numeric scores. The individual criteria are not given different weights, but they are generally presented in order of decreasing importance (Kaime et al., 2010). The scale uses whole numbers from 1 (deficient) to 10 (outstanding) (see Table 5-1).
The second score given is the overall score, which represents an overall assessment of the application’s merit. Overall scores range from 1.0 (outstanding) to 5.0 (deficient) (see Table 5-2). Some award mechanisms use only an overall score, whereas others use only adjectival scores rather than numeric ones (Salzer, 2016b). Reviewers are instructed to base the overall score on the evaluation criterion scores, but they may also consider additional criteria that are not individually scored, such as budget and application presentation (Kaime et al., 2010).
As can be seen from Tables 5-1 and 5-2, the scale used to score the evaluation criteria differs from the scale used for the overall score. The overall and criterion scores are not combined or otherwise mathematically manipulated to connect them, but the overall score is expected to correspond to the individual criterion scores (for example, if the individual criteria receive scores in the excellent range, the overall score should also be in the excellent range) (MOMRP, 2014). The use of two different scales to score the application is deliberate and is intended to discourage averaging the evaluation criterion scores into an overall score (Kaime et al., 2010). However, several peer and programmatic reviewers who responded to the committee’s solicitation of input noted that the use of the two scales was confusing. The committee finds that because CDMRP funding announcements dictate the importance of each criterion for the overall score, it would be easier for reviewers to appropriately consider each criterion if the same scale were used for both overall and criterion scores.

TABLE 5-1 CDMRP Evaluation Criteria Scoring Scale

| Evaluation Criteria Score Range | Category Descriptions |
|---|---|

SOURCE: PCRP, 2014.

TABLE 5-2 CDMRP Overall Scoring Scale

| Overall Score Range | Category Descriptions |
|---|---|

SOURCE: PCRP, 2014.
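Because the two scales run in opposite directions (10 is best on the criterion scale, 1.0 is best on the overall scale), checking that an overall score corresponds to the criterion scores requires mapping one scale onto the other. The linear mapping and tolerance below are assumptions for illustration, not an official CDMRP formula:

```python
# Hypothetical consistency check between CDMRP's two scales; the linear
# mapping and the 0.5 tolerance are assumptions for illustration only.

def criterion_to_overall(score):
    """Map a criterion score (1 = deficient ... 10 = outstanding) onto the
    overall scale (1.0 = outstanding ... 5.0 = deficient) linearly."""
    return 5.0 - (score - 1) * (4.0 / 9.0)

def roughly_consistent(criterion_scores, overall_score, tolerance=0.5):
    """Flag overall scores that fall far from where the criterion scores
    would place the application on the overall scale."""
    expected = sum(map(criterion_to_overall, criterion_scores)) / len(criterion_scores)
    return abs(expected - overall_score) <= tolerance

# Criterion scores of 9 and 10 map near the outstanding end of the overall
# scale (about 1.0-1.4), so an overall score of 1.2 passes this rough check:
print(roughly_consistent([9, 10], 1.2))  # True
```

A shared scale, as the committee suggests, would make this kind of correspondence check unnecessary.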
The committee notes that in 2009 the NIH, which reviews tens of thousands of applications per year, moved to a 9-point, whole-number scale (see Table 5-3) that may be used for both criterion scoring and the overall score (NIH, 2015b). Several other agencies that conduct peer review have adopted NIH’s revised scoring system. The new scoring system includes additional guidance for the category descriptors. For example, an application that is given a score of 1 has a descriptor of “exceptional,” which corresponds to “exceptionally strong with essentially no weaknesses.” Major, moderate, and minor weaknesses are further defined (NIH, 2015b). The NIH 9-point scale was also adopted in part on the basis of its psychometric properties. CDMRP’s use of such a 9-point scale would better reflect current peer review practices and help experts who review for multiple funding agencies be more comfortable with the CDMRP scoring system.

TABLE 5-3 NIH Peer Review Scoring Scale

| Impact | Score | Descriptor | Additional Guidance on Strengths/Weaknesses |
|---|---|---|---|
| High | 1 | Exceptional | Exceptionally strong with essentially no weaknesses |
| High | 2 | Outstanding | Extremely strong with negligible weaknesses |
| High | 3 | Excellent | Very strong with only some minor weaknesses |
| Medium | 4 | Very Good | Strong but with numerous minor weaknesses |
| Medium | 5 | Good | Strong but with at least one moderate weakness |
| Medium | 6 | Satisfactory | Some strengths but also some moderate weaknesses |
| Low | 7 | Fair | Some strengths but with at least one major weakness |
| Low | 8 | Marginal | A few strengths and a few major weaknesses |
| Low | 9 | Poor | Very few strengths and numerous major weaknesses |

Non-numeric score options: AB = Abstention, CF = Conflict, DF = Deferred, ND = Not Discussed, NP = Not Present, NR = Not Recommended for Further Consideration.

Minor Weakness: An easily addressable weakness that does not substantially lessen impact. Moderate Weakness: A weakness that lessens impact. Major Weakness: A weakness that severely limits impact.

SOURCE: NIH, 2009.
Scientist reviewers evaluate an entire application; consumer reviewers are required to critique only specific sections—the most important of which is the “impact statement”—but they may read and critique other sections if they choose. The impact statement describes the potential of the proposed research to address the goal of the program and the potential effects it will have on the scientific research community or on people who have the health condition or are survivors of it. Directions for preparing an impact statement are included in each program announcement, but the specific content varies by award mechanism (CDMRP, 2016f). Specialty scientist reviewers are responsible for critiquing and scoring only designated sections of the applications, such as the research plan or statistical design, but, as with consumer reviewers, they may critique additional sections of the application if they choose.
Each peer reviewer writes a preliminary critique on the strengths and weaknesses of his or her assigned applications and scores each criterion; the data are entered directly into eBRAP. Specialty or ad hoc peer reviewers provide criterion scores for those specific areas that they are charged with reviewing (for example, the statistical plan for a biostatistician specialty reviewer). Each assigned reviewer, including specialty reviewers, also provides an overall score (Salzer, 2016b). The committee is concerned about having reviewers who may have read only a portion of an application provide an overall score for it. However, the committee recognizes that the preliminary peer review scores, including the overall scores, can be revised by a reviewer during the full panel discussion of an application.
There are five possible formats for conducting scientific peer review: onsite, in-person peer review panels; online/virtual peer review panels; teleconference peer review panels; video teleconference peer review
panels; and online/electronic individual peer reviews. The peer review format to be used is decided by the program manager and included in the task order assumptions. An onsite peer review meeting lasts approximately 2–3 days, whereas teleconference or videoconference panels generally meet over 1–3 afternoons. In 2014, across 23 CDMRP research programs, there were 124 in-person panels, 23 videoconferences, 41 teleconferences, and 61 online conferences (49 of the online conferences were held by the Peer Reviewed Medical Research Program alone) (Salzer, 2016c). The majority of the programs had only in-person meetings or in-person meetings plus another meeting format, a few programs had only teleconferences, and none of the programs used only videoconferences or online meetings. Among peer reviewers who responded to the committee’s solicitation of input, many stated that the in-person panel meetings were far superior to meetings held by teleconference or another meeting format.
When possible and depending on program size, most or all peer review panels for a given program are scheduled to take place in consecutive sessions. At the start of peer review panel meetings, the program manager presents a plenary briefing for all reviewers which includes an overview of CDMRP, the history and background of the program, the award mechanisms and their intent, the goals and expectations of the review process, and a summary of how the peer review deliverables inform the programmatic review panel. Program managers observe the performance of the panel as a whole as well as the performance of the individual peer reviewers, the panel chair, and the scientific review officers. Program managers also ensure that the panel discussions are consistent with the program announcement. For large programs with multiple peer review panels, the program manager may request that the program’s science officer(s) attend to provide additional oversight (Salzer, 2016b).
During the meeting, the chair calls each individual application for discussion. Chairs are responsible for being familiar with all applications assigned to their panel and their preliminary critiques and scores. Each peer review panel must maintain a quorum of at least 80% of all panel members. The discussion of an application begins with the primary reviewer summarizing its goals, strengths, and weaknesses, followed by additional comments from the secondary and specialty peer reviewers, if applicable, and consumer reviewers. For in-person meetings, the chair facilitates further discussion of the application between the assigned reviewers and the other panel members (Kaime et al., 2010).
All panel members assign a final score to each application following deliberations on it (Kaime et al., 2010). Specialty peer reviewers are considered equal voting members of the panel for applications that they reviewed, but their input is limited to their assigned applications and not
the panel’s entire portfolio (Salzer, 2016b). The overall scores provided by the voting panel members are averaged to produce the application’s overall score, which is provided to the applicant along with the summary statement (MOMRP, 2014).
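The averaging step can be expressed directly; together with the standard deviation later reported in the summary statement, it gives applicants a sense of how much the panel agreed. The scores below are hypothetical values on CDMRP’s 1.0–5.0 overall scale, used only for illustration:

```python
import statistics

# Hypothetical final overall scores from each voting panel member for one
# application (1.0 = outstanding ... 5.0 = deficient).
panel_scores = [1.4, 1.6, 2.0, 1.8, 1.2]

# The application's overall score is the mean across voting members; the
# standard deviation shows the applicant how much panel scores varied.
overall = statistics.mean(panel_scores)
spread = statistics.stdev(panel_scores)
print(round(overall, 2), round(spread, 2))  # 1.6 0.32
```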
Reviewers may revise their preliminary critiques, if necessary, and complete the standardized application evaluation forms following the general discussion of the application. The panel chair then gives an oral summary of the discussion, including any reviewer’s revisions. Panel members must finalize their scores immediately after the discussion of each application, and critiques may be modified at any time during the meeting and up to 1 hour after the meeting is concluded (Salzer, 2016d). The discussion is incorporated into the final summary statement for the application (discussed under Post-Meeting Activities and Deliverables). All panel reviewers are given an opportunity to comment on the chair’s oral summary (Salzer, 2016b).
Programs that handle large volumes of applications may use an expedited review process (Kaime et al., 2010). This process is a form of review triage that was instituted to decrease the cost and increase the efficiency of the peer review process (Salzer, 2016c). In this process, all applications are reviewed in the pre-meeting phase by assigned peer reviewers as previously described. Pre-meeting scores are collated by the peer review contractor, who then sends the reviewers’ overall and criterion scores for all applications to the CDMRP program manager. The program manager then reviews the scores and selects the range of scores (and thus applications) that will be discussed at the plenary peer review meeting as well as those that will not. Typically, the scoring threshold for discussion is the top 10% of applications, although it may be as high as 40%. Applications designated for expedited review are not discussed at the plenary peer review meeting unless championed; an expedited application may be championed by any member of the peer review panel and is then immediately added to the docket for full panel discussion (Salzer, 2016c).
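The triage just described amounts to ranking applications by pre-meeting score, docketing the top fraction for discussion, and adding back any championed applications. The sketch below is a reconstruction under assumed data shapes and thresholds, not CDMRP software:

```python
import math

# Illustrative expedited-review triage; the field names, the 10% default
# threshold, and the representation of championing are all assumptions.

def build_docket(premeeting_scores, championed, discuss_fraction=0.10):
    """Select applications for full panel discussion.

    premeeting_scores: {app_id: mean pre-meeting overall score}, where lower
    is better on the 1.0 (outstanding) - 5.0 (deficient) overall scale.
    championed: app_ids championed by any panel member; these are always
    added to the docket regardless of score.
    """
    ranked = sorted(premeeting_scores, key=premeeting_scores.get)  # best first
    n_discuss = math.ceil(len(ranked) * discuss_fraction)
    docket = set(ranked[:n_discuss]) | (set(championed) & set(premeeting_scores))
    expedited = set(premeeting_scores) - docket
    return docket, expedited

# 20 hypothetical applications with steadily worsening scores (APP-00 best).
scores = {f"APP-{i:02d}": 1.0 + 0.1 * i for i in range(20)}
docket, expedited = build_docket(scores, championed={"APP-15"})
# 10% of 20 -> the 2 best-scoring apps, plus the championed APP-15
print(sorted(docket))  # ['APP-00', 'APP-01', 'APP-15']
```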
Post-Meeting Activities and Deliverables

The main outcome of a peer review meeting is a summary statement for each application. Summary statements are used to help inform the deliberations of the programmatic panel when making funding recommendations (see Chapter 6). After the review process is complete and funding decisions have been made, the summary statements are provided as feedback to the respective principal investigators.
Following a peer review panel meeting, the peer review contractor generates an overall debriefing report summarizing all comments made by the reviewers during the panel discussion. For each application reviewed by a panel, the SRO prepares a multi-page summary statement that consists of
- identifying information for the application;
- an overview of the proposed research which may include the specific aims;
- the average overall score from all panel members who participated in the peer review meeting (with standard deviation so that the applicant can see how much variability there was in panel scores);
- the average criterion-based scores (standard deviation may be provided);
- for each criterion section, the assigned reviewer’s written critiques of the application’s strengths and weaknesses, including specialty reviews;
- any panel discussion notes captured by the SRO during the panel meeting, such as comments from panel members who were not assigned to the application or the chair’s oral summary; and
- for unscored criteria, such as budget and application presentation, a summary of the strengths and weaknesses noted by assigned scientist reviewers and any relevant panel discussion notes (Salzer, 2016b).
The SRO does not alter the reviewers’ written critiques except to correct formatting, spelling, and typographical errors. The SRO drafts all summary statements, and the program manager reviews them (Salzer, 2016b).
After the meeting, the CDMRP program manager receives a final scoring report as well as administrative and budgetary notes (Salzer, 2016b). Program managers review the summary statements and a final scoring report and may request rewrites from reviewers or the contractor if, for example, the critiques and summary do not match the scores or a summary statement is inadequate. If there are any actions that need to be completed prior to programmatic review, such as clarifying eligibility issues, the program manager will act on them before the programmatic review meeting (Salzer, 2016b).
Quality Assurance and Control of the Peer Review Process
The CDMRP program manager reviews all deliverables and evaluates contractor performance according to the individual contract’s Quality Assurance Surveillance Plan. Contractor performance is reviewed and evaluated on a quarterly basis. Through the evaluations, CDMRP is able to provide feedback on the members of the peer review panels. All comments are sent to the contracting officer’s representative, who gathers them and sends them to the contracting officer. Any discrepancies or deficiencies in the Quality Assurance Surveillance Plan are discussed with the contractor, and a resolution is sought (Salzer, 2016a).
For each funding cycle, the U.S. Army Medical Research and Materiel Command (USAMRMC) evaluates a random sample of 20 applications across all programs to assess whether reviewer expertise was appropriate for the application and to compare that information with the reviewers’ self-assessments of their areas of expertise (CDMRP, 2015c). However, this random sample equates to less than one application per program. No additional information was provided on this evaluation process.
In addition to the USAMRMC evaluation, the program manager reviews at least 10% of the draft summary statements from each panel to assess the reviewers’ evaluation of applications and to ensure that the summary statements accurately capture the panel’s discussions (Salzer, 2016b,d). Summary statements are not chosen randomly; factors that may flag a summary for review include large changes in pre- versus post-discussion scores, applications with high scores, applications for which the panel had disagreements, and applications considered to be “high-profile” (Salzer, 2016d). Summary statements are reviewed to ensure that there is concordance among the evaluation criteria, overall scores, and the written critiques; that key issues of the panel discussion were captured; and that the critiques are appropriate for each criterion. If an issue that may have affected the peer review outcome is identified, the program manager may request a new peer review for the application. Other assessments that do not directly affect a peer review outcome are given as feedback to the contractor (Salzer, 2016d).
A mechanism for the quality assurance of peer reviewers is the post-meeting evaluation by the CDMRP program manager with support from the contractor’s scientific review manager and SRO. The goal of this evaluation is to identify reviewers and chairs who should be asked to serve again, should their expertise be needed. Panel members are also evaluated to identify those who might be good chairs in future review cycles. There is no standardized form or criteria for peer reviewer evaluations (Salzer, 2016c), and no examples of program-specific evaluations were provided. The performance assessments of individual peer reviewers consider expertise, the ability to communicate ideas and rationale, group interactions, the ability to present and debate an opposing view in a professional manner, strong critique writing skills, and adherence to policies on nondisclosure and confidentiality (Salzer, 2016c). As part of the quality assurance check of summary statements, the overall quality of each reviewer’s product is evaluated to be sure that there is a match between scores and written critiques, that there is solid reasoning, and that reviewers have demonstrated an understanding of how to judge each criterion (Salzer, 2016d). It is unclear to the committee whether the evaluation also includes any feedback from the peer reviewer, such as whether the panel had the appropriate expertise. Because program needs and award mechanisms may change from year to year, there is no guarantee that a particular expertise will be required each year, so even reviewers or chairs who are exemplary may not be invited to participate in subsequent panels if their expertise is not needed (Salzer, 2016b).
Post-Peer Review Survey
Following the peer review meetings, scientist and consumer reviewers complete an online survey and evaluation form to provide feedback on the experience, the process, and the areas that could use improvement (Salzer, 2016b). CDMRP informed the committee that the survey includes reviewer demographics; satisfaction with the process, including whether consumers are engaged; an evaluation of pre-meeting support (including logistics, webinars, and review guidance) and the technological interfaces used for review; and whether reviewers have recently submitted an application to another CDMRP research program or to another award mechanism in the same research program. Scores and comments are compiled and used by the peer review contractor and CDMRP program managers to assess the peer review process and to evaluate the program announcement for future modifications—for example, to clarify overall intent of the award, focus areas, or peer review criteria. Other aspects may be discussed at the program’s subsequent vision setting meeting (Salzer, 2016a).
CDMRP stated that program managers have an opportunity to add program-specific or award mechanism–specific questions as needed to the post-review survey. For example, Box 5-1 shows the questions that were added to the post-peer review survey for the Amyotrophic Lateral Sclerosis Research Program at CDMRP’s request in 2014 and 2015.
The committee requested a copy of the post-peer review survey as well as the aggregated responses from a sample program. However, neither the survey nor the aggregate sample results were provided because the post-peer review survey is owned by the peer review support contractor and considered a contract deliverable.
The consumer reviewers meet separately as a group at the end of the meeting to debrief on their experience and provide feedback to CDMRP through the consumer review administrator on how the experience could be improved (CDMRP, 2016f). Questionnaires are used to evaluate the mentor program and other consumer aspects of the CDMRP program. The results of the questionnaires were not available to the committee, but individual testimony from consumers who attended the committee’s open sessions as well as consumers and scientists who responded to the committee’s solicitation of input reported that consumers’ input to the peer review process was helpful.