Strengthening the System
Our guiding principle in setting forth the recommendations that follow derives from our dual conclusions, detailed in the previous chapter, that peer review of education research proposals ought to focus explicitly on two aims: identifying and supporting the highest quality research, and strengthening the field to foster a culture of rigorous inquiry.
Peer review in research agencies sets a foundation for developing and sustaining a culture of inquiry and rigor—both within the walls of the agency and in the fields it supports. More than any single process or practice, creating and nurturing this culture in a research agency is consistently emphasized in reports on peer review in education as well as in a range of other fields (National Research Council, 1998, 1999b, 2002; President’s Commission on Excellence in Special Education, 2002). Developing this shared sense of commitment to best practice in an agency provides a healthy environment for peer review to function effectively. Specific practices in peer review systems can in turn sustain the broader culture in the field by reinforcing these norms among applicants, staff, and reviewers, who together are key members of the research field.
In our recommendations, we focus exclusively on the government side of the peer review partnership. We chose this perspective on the basis of an assessment of the current policy landscape and the near-term needs of two key parts of the federal government that support education research: the Institute of Education Sciences (IES) and the Office of Special Education Programs (OSEP). The president has nominated members for appointment to the National Board for Education Sciences of the IES, which, pending confirmation by the Senate, will work with the director to maintain a high-quality peer review system for the agency. And the pending reauthorization for OSEP will almost certainly require changes in its peer review process. Thus, in this short report we have attempted to provide a framework for research policy makers who need to revisit their peer review systems.
Overall, the recommendations we provide are of two kinds: (1) specific ideas to enact the conclusions we set out in the previous chapter and (2) our suggestions for addressing some of the problems related to such issues as logistics and training that were aired during our workshop. In this context, we reemphasize that the evidence we use to support these ideas is based primarily on workshop discussions, and our recommendations should be interpreted in this light.
A final caveat: although we focus on the government’s role, we should clarify that the responsibility for ensuring a well-functioning peer review system does not rest solely within the walls of federal agencies. As our depiction of peer review in this report makes clear, the research communities as well as the broader education community all play a part in the process—and thus all share accountability for its success. Without the support, integrity, and participation of these communities—particularly researchers—it is sure to fail.
The committee makes 10 recommendations:
1. Peer review is the best available method to support education research funding decisions. Thus, we recommend that this mechanism be maintained and strengthened.
2. Agencies that fund education research should explicitly focus on two key objectives for peer review: (a) to identify and support high-quality education research and (b) to promote the professional development of a culture of rigor among applicants, reviewers, and agency staff in the education research communities.
3. Agencies that fund education research should develop clear statements of the intended outcomes of their peer review systems and establish organizational routines for ongoing, systematic evaluation of the peer review process.
4. Agencies that fund education research should build strong infrastructures to support their peer review processes. This infrastructure should include (a) knowledgeable staff, (b) systems for managing the logistics of peer review, (c) technologies to support review and discussion of proposals, (d) a clear mechanism for providing feedback, and (e) standing panels when research priorities are relatively stable.
5. Effective peer review systems require planning and organization in advance of a review. In order to schedule reviews, agencies that fund education research need relatively predictable levels and timing of funding. Internal barriers that slow down program announcements or make peer review difficult to schedule should be minimized. To the extent possible, scheduling problems and complicated or burdensome logistics should be eliminated to support the availability and participation of highly qualified reviewers.
6. Agencies that fund education research should uphold basic principles of peer review but retain flexibility in designing peer review systems to meet their individual needs. Agencies should be accountable for upholding these principles and should provide data on how well their process achieves its goals. External mandates that extend beyond these foundations should be minimal, as they can hinder the development and implementation of high-quality peer review systems.
7. The criteria by which reviewers rate proposals should be clearly delineated, and the meaning of different score levels on each scale should be defined and illustrated. Reviewers should be trained in the use of these scales.
8. As a group, peer review panels should have the research experience and expertise to judge the theoretical and technical merits of the proposals they review. In addition, peer review panels should be composed so as to minimize conflicts of interest and balance biases, promote the participation of people from a range of scholarly perspectives and traditionally underrepresented groups, and provide opportunities for professional development.
9. Agencies that fund education research should involve practitioners and community members in their work to ensure the relevance and significance of their portfolio. If practitioners and community members participate on peer review panels, they should focus on the relevance, significance, applicability, and impact of an education research proposal.
10. Agencies that fund education research and professional associations should create training opportunities that educate scholars in what the peer review process entails and how to be an effective reviewer.
We begin with two general recommendations that follow from our
overarching theme that peer review be strengthened by focusing on two objectives.
Recommendation 1: Peer review is the best available method to support education research funding decisions. Thus, we recommend that this mechanism be maintained and strengthened.
A strong peer review system is a hallmark of successful research and development programs across the federal government. Although there are alternatives to the peer review process for allocating research dollars, none has the same potential for identifying high-quality research proposals and promoting the further development of a professional culture of inquiry among education researchers. Each of the most common alternatives—a strong program manager who makes the determination, allocations based on formulas, and legislative earmarking—fails to address adequately both of the critical goals that peer review can serve. Peer review is the best mechanism to support the objectives we hold as critical for the future of the field and, in turn, for the use of education research to improve policy and practice.
The effectiveness of peer review depends heavily on the extent to which specific procedures are designed and implemented to support the stated objectives of the system. Thus, following the logic of the previous chapter, we first state the purposes that in our judgment should undergird peer review of education proposals in federal agencies, and then we make a series of suggestions for how various processes can serve them. We then recommend some basic principles for peer review systems and outline key features of an infrastructure that can support its functions. Peer review consumes considerable resources, primarily federal dollars and researchers’ time, and it is important to ensure that this investment is worthwhile for all.
Recommendation 2: Agencies that fund education research should explicitly focus on two key objectives for peer review: (a) to identify and support high-quality education research and (b) to promote the professional development of a culture of rigor among applicants, reviewers, and agency staff in the education research communities.
Peer review practices in federal agencies have historically been designed based on culture and tradition. We recommend that federal agencies that support education research adopt two objectives—identifying and supporting high-quality education research and promoting a shared culture of rigor in the field—as guideposts as they design, revamp, or evaluate their systems.
One main objective of peer review is to ensure that any research that is funded, published, or used to support education policies or practices meets high standards of quality (Chubin and Hackett, 1990; Hackett and Chubin, 2003). This goal has historically been at the center of agency peer review systems, and should continue to be. Upholding this goal requires that agencies work toward achieving clarity on what is meant by research quality in the context of peer review—a formidable task. Although consensus on this question has been and continues to be elusive in education research more generally (Lagemann, 2000), a starting point for these discussions in peer review panels with respect to technical or intellectual merit criteria could be the principles of scientific education research outlined in Scientific Research in Education (National Research Council, 2002, pp. 3-5): 1
Pose significant questions that can be investigated empirically.
Link research to relevant theory.
Use methods that permit direct investigation of the question.
Provide a coherent and explicit chain of reasoning.
Replicate and generalize across studies.
Disclose research to encourage professional scrutiny and critique.
These principles were not designed to be, nor should they be used as, strict standards to which individual research applications are subjected in peer review.2 Central concepts embedded in the principles can and should
be applied fruitfully, however; for example, applications should be scrutinized for the explicit positioning of a researchable question in the broader theoretical and empirical line of inquiry within which it is situated. Also, to extend the principle of a logical chain of reasoning in peer review of funding applications, reviewers can assess the degree to which proposals include a discussion of the kinds of data that will be collected to inform the question and of the sorts of alternative explanations for expected outcomes that could arise during the investigation.
More detailed considerations of what constitutes technical quality in particular competitions and research areas will be required. At the level of specifics, standards for high-quality research require that the research design be consistent with up-to-date accepted practice (or its extension or development) for that design. If an ethnographic study of a school is proposed, for example, reviewers should judge whether the amount of time allocated for investigators to spend in the school is adequate and whether the observational system meets accepted standards for reliability.
Assessment of quality should also include consideration of the relevance or significance of the proposed project: Will it build on what is known in productive ways? Will it contribute to a knowledge base that can inform educational improvement? Is it likely to contribute to solving an important problem of educational practice?3
Ensuring that high-quality research is funded at the agency also extends to the internal decision making on fund allocation once the slate of applications is generated by the peer reviewers. Typically the main product of the peer review process is a slate, or list of proposals, which rank-orders the proposals according to how peers rated them on the evaluation criteria. With variations on the theme, it is common practice for authorized agency staff (typically the director of the division or directorate) to go down the list, funding proposals until they reach the cutoff—that is, the point at which available resources are depleted.
In practice, there are often reasons to deviate from the linear process of funding the top proposals that can be supported with available resources. For example, funding proposals up to the cutoff is often independent of the relative quality differences between ranked proposals: the line may very well be drawn at a point at which differences in scores are just as easily attributable to random errors as to true differences in quality. Or the cutoff may fall well below the point at which the quality of the proposals drops off, so that a set of proposals that do not meet high standards would nonetheless be funded. The converse could also be true: the quality of a proposal could be very high (scoring, say, 90 on a 100-point scale), but because it happened to be part of an especially high-caliber batch of proposals (all 95 or above) or to be competing for an especially limited number of dollars, it does not make the cut.
These circumstances require careful consideration in making final funding decisions. When the quality of proposals drops off above the cutoff, a decision might be made to fund only those that meet quality standards and therefore not to use all available funding. But declining to fund proposals when funds are available is politically risky: the decision signals to appropriators that the agency cannot absorb current funding levels, setting the agency up for funding cuts in the next fiscal year. However, as Whitehurst suggested in describing recent decisions on a slate at the IES, the risks of populating the literature with inferior work may be far greater than a short-term dip in the dollars available for annual research spending.
Finally, federal agencies that fund education research should strive to fund some high-risk proposals in their portfolios. When review criteria include explicit reference to innovation or originality, it is important for applicants to describe and justify the ways in which the proposed work departs from current knowledge and how it could advance understanding. However, since the peer review process can be conservative with respect to risk, additional attention to innovation is likely to be needed. For example, a funding mechanism designed to invest a proportion of research funding in high-risk, innovative research could be an effective strategy. Agency leadership should also have the flexibility to make funding decisions “out of order” to support a proposal that might not have been rated highly because of its innovative character.
It is important to clarify that taking calculated risks in funding federal education research does not require a retreat from high standards of quality, nor does it justify supporting work that is technically flawed or of questionable significance to education. Creative, innovative research proposals, like any other, should adhere to standards of rigor and relevance. But because fundamentally novel questions or approaches can be easily dismissed when peers assess the merits of proposed work through the lens of established traditions and paradigms, agencies need to find ways to reward risk-taking in strategically managing and developing their portfolios over time.
A second objective we hold for peer review is to encourage further development of a culture of inquiry among education researchers. This purpose has not been explicitly adopted on a large scale in the federal agencies considered in this report, and it is especially critical given current concern about the capacity of the field of education to conduct scientific research. Implementing this recommendation will take some time and effort. In Chapter 2, we analyzed several facets of peer review that can be leveraged to promote these professional development opportunities, including panel membership, standing panels, and feedback mechanisms. We elaborate on several specific strategies in recommendations 4, 8, and 10.
In terms of this objective of further developing the field, decisions to contract out aspects of the review process to nongovernment sources should be made sparingly. Outsourcing may be an attractive method for managing the agency’s workload, but it can also significantly curtail opportunities for the substantive interactions in the peer review process that foster learning and intellectual growth. It is critically important that agency staff have an opportunity to participate in meaningful ways and that the review process be grounded in the agency’s priorities and expertise; outsourcing makes both difficult to attain. Although it may be desirable to contract out logistics, these professional development considerations strongly suggest that the capabilities to support the intellectual tasks associated with review be retained in-house. For a large research operation like that at the National Institutes of Health (NIH), for example, such internal capabilities have served the agency well. For agencies like the IES, where staff capacity is less developed, however, contracting out aspects of peer review may be an effective interim strategy while a basic infrastructure is built. As a general matter, decisions about outsourcing or otherwise delegating parts of the peer review process to external groups should be based on a careful assessment of needs and goals in each agency; “one size fits all” prescriptions should be avoided.
FEATURES OF PEER REVIEW
The next set of recommendations addresses infrastructure, policy, logistics, and specific mechanisms for strengthening peer review in federal agencies that support education research.
Recommendation 3: Agencies that fund education research should develop clear statements of the intended outcomes of their peer review systems and establish organizational routines for ongoing, systematic evaluation of the peer review process.
Articulating the objectives of the agency’s peer review system is a point of departure not only for the design of the system but also for its evaluation. There is surprisingly little systematic evaluation of how and the extent to which peer review processes support or hinder the attainment of system objectives in practice, and we think this status quo is unacceptable.
The benefits of evaluation for organizational growth and improvement are well known. Formative evaluation that generates systematic, ongoing data can identify gaps in service, highlight opportunities for growth, and suggest potential reform strategies. Regularly analyzing these data and feeding them into a continuous improvement loop can be a powerful tool for promoting organizational excellence. And summaries of such data can provide stakeholders with information about how well organizations have met their goals. A sustained focus on the outcomes of government activities is also the spirit of the Government Performance and Results Act of 1993 (P.L. 103-62), which mandated that all federal agencies develop strategic plans and monitor their performance against specific annual milestones, and of more recent initiatives, like the President’s Management Agenda. Thus, we recommend that education research agencies establish procedures for the ongoing review of their peer review practices and results and periodically evaluate whether the goals and purposes of peer review are being met.
Carefully developed process and outcome measures that map to agreed-on objectives must guide data collection and analysis. Using the two objectives we recommend to guide peer review systems in federal agencies that support education research, we offer a few examples of potential data collection and analysis efforts.
Evaluating the extent to which the peer review process results in the identification and support of high-quality research could include such strategies as retrospective assessments of research portfolios by researchers and
practitioners (see Recommendation 9), periodic review of the quality of work in progress in a sample of funded projects, and monitoring the publications of investigators supported by agency funds in peer-reviewed journals. Examining the links between peer review practices and the professional development of the field can be pursued in a number of ways. The specificity and quality of reviewer comments on each evaluation criterion and the nature and speed with which applicants were furnished with this feedback can be evaluated, and unsuccessful applicants can be interviewed to ascertain how helpful reviews were for improving their subsequent work. Analyzing the characteristics of reviewers along a number of dimensions (e.g., disciplinary background, gender, race/ethnicity) and their reports of the value of the experience for their career and the field in which they work can provide clues as to how well the system is engaging and helping to develop a broad range of scholars. Management practices can be assessed by systematically surveying agency customers—reviewers and investigators submitting proposals (successful and unsuccessful), for example—and by compiling data on the time allowed for review, scheduling practices, and other activities designed to ease the burden on applicants and reviewers.
Furthermore, the use of committees of visitors to assess peer review practices—a common practice at the National Science Foundation (NSF)—can effectively evaluate the legitimacy and fairness of peer review systems. By focusing and publicly reporting on the implementation of procedures designed to guard against abuses that arise from conflicts of interest, bias, and other potential problems, agencies can institute checks on the system and garner useful information about how to improve it. We see this practice as a useful element of a broader evaluation strategy in federal agencies that support education research.
Calling for evaluation of results is easier said than done. Effectively implementing these activities will require an investment of scarce time and money. However, agencies cannot continue to operate without empirical evidence of the effectiveness of their peer review practices for supporting their objectives. Research agencies by their very nature have the capacity for rigorous inquiry and investigation. We think they ought to use it to critically examine their own practices and to set an example for other organizations inside and outside government.
Recommendation 4: Agencies that fund education research should build strong infrastructures to support their peer review processes. This infrastructure should include (a) knowledgeable
staff, (b) systems for managing the logistics of peer review, (c) technologies to support review and discussion of proposals, (d) a clear mechanism for providing feedback, and (e) standing panels when research priority areas are relatively stable.
In all of the current peer review systems described during our workshop, agency staff play important roles in the process. While the specifics vary, common tasks include preparing grant announcements, identifying and recruiting reviewers, developing and managing reviewer training, handling logistics of the review process, summarizing the comments of reviewers, participating in review meetings, communicating with those submitting proposals, and (in fewer cases) using their own judgment in making final funding decisions. These responsibilities require expertise both in logistics and in the substance of the research areas. A strong peer review system depends on having staff with the managerial and substantive expertise to make the system run smoothly and to capitalize on the knowledge that they and scholarly peers bring to the process.
In particular, the role of the program manager—staff charged with writing requests for proposals for a competition and who will oversee the portfolio of work that will result—requires careful consideration. As described in Chapter 2, there are many different models for this role in peer review, ranging from the firewall approach at the NIH (clear division of labor between program and review staff), to the strong manager approach at the Office of Naval Research (the same staff performs program and review tasks), with many hybrid approaches in between. Whatever the role of program managers, the boundaries should be transparent and articulated to all persons involved in the process and the weaknesses of the preferred staff model compensated for elsewhere in the peer review system.
Models with shared roles threaten at least the appearance of the integrity of peer review: conflicts of interest can arise easily in selecting reviewers and assigning them to proposals, and even the perception of a conflict, along with the inevitable concerns about cronyism, will raise questions about the review’s integrity. Thus, if such approaches are adopted, they will require the development and consistent implementation of checks and balances to ensure fairness (e.g., having program managers organize reviews of programs other than their own). We see the benefits of substantively engaged staff for promoting their professional development and the learning of those who come in contact with a system staffed by knowledgeable personnel. At a minimum, however, program managers
should never attempt to influence reviewers’ assessment of the theoretical and technical merit of proposals, nor should their own views on merit be provided at a peer review panel meeting.
Logistical support for peer review is critical to its smooth functioning and to the ability to recruit scholars at the leading edge of their fields to participate in the process. For example, in many competitions, the numbers of proposals and reviewers are large. In the absence of established procedures for handling a high volume of proposals, the process can easily become chaotic. Staff must ensure manageable workloads for reviewers. Practices such as prescreening—sorting proposals before review to weed out applications that do not meet minimum standards or to focus review on controversial or borderline proposals—can be helpful in this regard. For example, we support enacting one of the recommendations emanating from the evaluation of the Office of Educational Research and Improvement (OERI) peer review (August and Muraskin, 1998) suggesting that staff eliminate applications for funding activities that do not involve research.
In determining the optimal size of review panels, staff must balance cost and efficiency criteria with the need for multiple perspectives and backgrounds. An ideal size does not exist; these determinations must be made in the context of the particular circumstances of each review. A reasonable minimum is three: a single reviewer obviates the very notion of peer review, and with two people there is no way to adjudicate highly divergent conclusions. At the other end of the range, when a panel grows larger than 12 or 14, in-depth discussion of individual proposals is inhibited by the short amount of time each panelist can speak as well as by abbreviated opportunities for meaningful and inclusive interaction. The overarching consideration is to ensure adequate expertise (see Recommendation 8) while keeping group size manageable. Soliciting additional ad hoc reviews from those who work in the specialized areas of particular competitions can help infuse needed expertise while maintaining reasonable panel size.
Additional logistical supports could include well-managed databases. To assist staff in recruiting groups of reviewers that cover the breadth and depth of expertise needed in a given competition, agencies can develop electronic banks of outstanding reviewers. These databases can include contact information, areas of expertise, disciplinary affiliation, demographic data, and strengths and weaknesses of previous reviewers (e.g., skilled writer, synthesizer of information, excellent listener). Active maintenance of this tool will be required to ensure its usefulness. Identifying and tracking the
pool of reviewers can also be an effective way to ensure the broad participation of people with diverse viewpoints and backgrounds in an agency’s peer review process, an issue addressed by Recommendation 8.
Written documentation of processes, timelines, roles and responsibilities, and review criteria as well as face-to-face training in such areas as how to apply review criteria (see Recommendation 7) are also extremely helpful in fostering a positive environment. To the extent possible, schedules with ample lead time for initial reviews and time for group discussion should be established and followed; without sufficient time resources, reviewers cannot carefully study or discuss proposals, and few purposes will be served well. Such circumstances have long-term implications as well, since they will almost surely leave reviewers dissatisfied and less likely to agree to participate in the future.
Enhanced applications of technology can also facilitate peer review and smooth logistics. At NSF, computer systems provide reviewers with quick and easy ways of recording their reviews, access to other reviewers’ comments, and support for substantive discussions of proposals. These systems can be used to minimize the time reviewers have to spend on “process” considerations and enhance opportunities for interactions. Technology can also improve communication at a distance, when costs or timing prohibit face-to-face meetings of reviewers. However, we concur with workshop participants who argued that face-to-face meetings both improve the quality of discussion and provide incentives for reviewers to participate in reviews (since they are more likely to gain from such interactions).
It is worth emphasizing the point that the immediate benefits from competent staff, clear and well-executed procedures, and useful technologies all contribute to longer term payoffs in the form of incentives for scholars to serve as peer reviewers. Several workshop participants noted that the motivations for serving as a reviewer are not financial, but rather a desire to serve and influence the field and the opportunities to learn from the discussions. When the process runs smoothly, discussions are engaging, and the impact of the reviews is evident, reviewers will be motivated to continue; when the process seems clumsy, communication thin and hurried, and impact uncertain, reviewers may decline the next invitation.
For peer review systems to meet the overall goals of supporting high-quality research and providing professional development for the field, procedures must be developed to facilitate communication about the proposals, both among the group of reviewers and among the reviewers, applicants, and staff. The amount and type of feedback given on proposals vary across the
agencies represented at the workshop, in part because of the effort needed and the varying instruction and training given to reviewers. A greater proportion of agency budgets devoted to consistent and thorough feedback from reviewers to applicants through agency staff and written communications, along with better training for reviewers (e.g., giving them model reviews, providing training in the application of the review criteria), would enhance the role that peer review plays in strengthening research. Here again, these dual objectives of peer review in federal agencies that fund education research should always be kept in mind.
A policy that allows unsuccessful applicants to revise and resubmit their proposals based on previous review can be an excellent vehicle for explicitly linking feedback to future (improved) submissions. Such opportunities can help develop a field, allowing opportunities (especially for researchers in early career stages) to hone and refine ideas and to ensure that promising ideas are not lost in the vagaries of the review process. Indeed, education researchers commonly complain about one-time submission processes (President’s Commission on Excellence in Special Education, 2002).
Standing panels can be an attractive part of the agency infrastructure as well. If the general areas of research to be supported are stable, standing panels have several advantages: they make it easier to recruit top scholars as reviewers by increasing the prestige of serving and reducing the need for extensive training before each review cycle. Standing panels facilitate development of consistent interpretations of rating criteria. They are also especially conducive to the professional development goal of peer review, since panels can set standards that are maintained as reviewers begin and finish their terms. The use of standing panels (“study sections”) at NIH exemplifies these possibilities. Agencies that rely solely on ad hoc review panels miss valuable opportunities to develop the human resources of education research. We therefore recommend that standing panels be established for review of education research proposals whenever the substantive focus is reasonably stable over time.
Standing panels often require that additional ad hoc reviewers be added when the topics of supported research change substantially across competitions and each new competition requires a different mix of expertise. Furthermore, the extended terms of standing panel members can lead to stagnant or narrow perspectives, so policies such as staggered terms, which infuse new people and ideas into the group, are necessary to counterbalance this effect. And because of the length of service and influence of standing
panel members, carefully selecting individuals with a range of perspectives and backgrounds is very important.
In this context, the role of ad hoc panels should be clearly conceptualized in the peer review philosophy at the agency. Ad hoc panels are most useful when a new priority is under consideration or when agencies desire to initiate a significant program of research in a particular area. Augmenting standing panels with ad hoc panels can often temper the tendency of standing panels to be risk averse. Ad hoc panels may also be necessary in agencies with relatively small research portfolios, given the cost of conducting peer review, especially if priorities fluctuate from year to year. The latter situation makes a standing panel especially difficult to sustain unless the priorities are related and the pooled expertise of the review group is adequate across different priorities.
Recommendation 5: Effective peer review systems require planning and organization in advance of a review. To schedule reviews in advance, agencies that fund education research need relatively predictable levels and timing of funding. Internal barriers that slow down program announcements or make peer review difficult to schedule should be minimized. To the extent possible, scheduling problems and complicated or burdensome logistics should be eliminated to support the availability and participation of highly qualified reviewers.
Ensuring a well-managed process, especially providing ample time and organized scheduling, once again invokes the trade-off of costs and efficiency versus the efficacy of peer review in accomplishing its core objectives (see Recommendation 3 for specifics on implications for agency infrastructure).
In education research, there is a long history of uncertainty over appropriation cycles (see, for example, National Research Council, 1992). When funding is uncertain, competitions cannot be established well in advance and creating standing panels can be difficult. This situation not only affects the quality of research proposed (as applicants are forced to produce rushed proposals), but also makes it difficult to recruit reviewers. If schedules cannot be set up in advance, managing peer review effectively is almost impossible. Reviewers are often unavailable due to prior commitments and, if they do agree to serve, insufficient lead time to thoroughly evaluate the proposals can doom a review. These problems are compounded if internal
clearances prior to a competition are extensive and time-consuming (President’s Commission on Excellence in Special Education, 2002).
When schedules of announcements are not regular, the field cannot plan and develop thoughtful proposals. Unpredictable and infrequent competitions may also draw a high proportion of the top scholars in an area as applicants, leaving an impoverished pool from which to select reviewers. For a peer review system to work well, both strong applicants and strong reviewers must be available on a regular basis. Standing panels can help promote regularity in scheduling and training. But such panels should meet on regular schedules, not serve merely as a pool from which reviewers can be drawn when an irregular competition is held. Establishing this kind of stability is essential to growing a culture of rigorous inquiry in education and, in turn, improving the knowledge base.
Again, we encourage the management of workloads through prescreening. So long as applicants receive written feedback, not every proposal needs to be discussed exhaustively at a meeting. The panel needs time to deliberate and should focus on proposals of potentially high merit as well as those on which reviewers disagree.
Finally, attracting and maintaining a high-quality pool of reviewers requires smooth logistics. If the meeting requires complicated travel, poor accommodations, or other sources of administrative burden, reviewers will be tired, demoralized, and less inclined to participate in future panels.
Recommendation 6: Agencies that fund education research should uphold basic principles of peer review but retain flexibility in designing peer review systems to meet their individual needs. Agencies should be accountable for upholding these principles and should provide data on how well their process achieves its goals. External mandates that extend beyond these foundations should be minimal, as they can hinder the development and implementation of high-quality peer review systems.
Peer review mechanisms must adhere to basic principles and be accountable for results. Basic principles for peer review include a dedication to ensuring the scientific merit of proposals; independence from political interference with respect to merit; appropriate expertise on panels; straightforward and publicly available procedures for reviews that promote fairness and integrity; and a mechanism for providing feedback to applicants. Over-
all, agencies should be accountable for instituting a fair process that leads to funding high-quality research that is seen as significant by consumers. Policy makers and consumers should expect to see independent evaluations of the peer review system as well as evaluations of the agency portfolio (see Recommendation 9 for the role of practitioners in ensuring accountability).
Beyond these considerations, agencies need the flexibility to design and manage their peer review systems. Research evolves in unpredictable ways. Peer review systems must be supple enough to respond to emerging needs and opportunities and to guard against a narrowing of the field. Mandating mechanisms for peer review through legislation may rob an agency of the flexibility it needs. In particular, mandates that extend beyond granting the authority to conduct peer review and outlining its overall structure can impede the capacity of the peer review mechanism to meet its objectives. For example, as the President’s Commission on Excellence in Special Education recently noted in its report (2002), legislative requirements about the composition of review panels are especially difficult for an agency and can have deleterious effects on the ability of the peer review system to identify and support high-quality education research.
Recommendation 7: The criteria by which reviewers rate proposals should be clearly delineated, and the meaning of different score levels on each scale should be defined and illustrated. Reviewers should be trained in the use of these scales.
The agencies represented at our workshop all used different evaluation criteria in their peer review processes. The extent to which the criteria were defined, and the nature and intensity of training for reviewers on how to apply those criteria, varied as well. Given differences in mission and other factors, it is reasonable to expect variation in review criteria; however, we recommend that attention be paid to ensuring criteria are clearly defined and based on valid and reliable measures. We also recommend that the development of training materials and implementation of tutorials for reviewers become standard operating procedure.
Agencies should strive to ensure that the evaluation criteria for peer review be clearly defined and based on valid and reliable measures. In our judgment, reliability (and validity) can be improved for the ratings assigned to proposals as well as for the descriptive feedback associated with scores and group discussion.
At the workshop, Domenic Cicchetti concluded that there was potential for significant improvement in the reliability of ratings across reviewers
through careful training on the rating scale criteria and on the rating process itself. This finding is consistent with a large literature on job performance ratings (Woehr and Huffcutt, 1994; Zedeck and Cascio, 1982) indicating the importance of careful definition of scale “anchors” and training in the rating process. Training could not only improve the consistency of initial ratings across reviewers on a panel, but also facilitate group discussion that leads to stronger consensus and reliability of group ratings. It can have the added benefit of improving the validity of the feedback provided to applicants by better aligning the feedback with the specific evaluation criteria, both in terms of the particular scores given and the descriptions of strengths and weaknesses. For all of these reasons, we concur that clearly defined measures and effective training for reviewers on the use of the scales are essential.
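To make the notion of interrater reliability concrete, the sketch below computes an intraclass correlation coefficient, ICC(2,1), one common agreement statistic for panels of raters. This illustration is ours, not part of the workshop analyses; the function name and the ratings are invented for the example.

```python
# Illustrative only: ICC(2,1), a two-way random-effects, single-rater
# intraclass correlation. Rows are proposals, columns are reviewers;
# all scores here are hypothetical.

def icc_2_1(scores):
    """Return ICC(2,1) for a proposals-by-reviewers table of ratings."""
    n = len(scores)      # number of proposals (targets)
    k = len(scores[0])   # number of reviewers (raters)
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-proposal mean square
    msc = ss_cols / (k - 1)                # between-reviewer mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Four proposals scored on a 1-5 scale by three hypothetical reviewers.
ratings = [[4, 5, 4],
           [2, 2, 3],
           [5, 5, 5],
           [1, 2, 1]]
print(round(icc_2_1(ratings), 2))  # → 0.92, i.e., high agreement
```

Training on scale anchors aims, in effect, to raise this kind of statistic by shrinking the between-reviewer and residual variance components relative to true differences among proposals.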
We point to the benefits of training throughout this report. In the context of review criteria, training is important to ensure that reviewers understand how to approach the evaluation of proposals and how to assign specific ratings on each criterion. At the workshop, Teresa Levitin of the National Institute on Drug Abuse provided several useful ideas for how to illustrate key concepts of the review criteria to reviewers. To our knowledge, there are few such models from which to learn about effective training practices in the context of peer review in federal agencies. We therefore recommend that agencies place stronger emphasis on developing, evaluating, and refining training programs to ensure that reviewers are applying criteria as intended, contributing to the process in effective ways, and learning from the experience.
PEOPLE: ROLES OF REVIEWERS, APPLICANTS, STAFF, AND PRACTITIONERS
The next set of recommendations addresses the types of people who should participate in reviews and the kinds of training needed for the education research communities.
Recommendation 8: As a group, peer review panels should have the research experience and expertise to judge the theoretical and technical merits of the proposals they review. In addition, peer review panels should be composed so as to minimize conflicts of interest and balance biases, promote the participation of people from a range of scholarly perspectives and traditionally under-
represented groups, and provide opportunities for professional development.
The first priority for assembling a peer review panel is to ensure that it encompasses the research experience and expertise necessary to evaluate the theoretical and technical aspects of the proposals to be reviewed. For agencies that fund education research, we define “theoretical and technical aspects” to refer to three areas: (1) the substance or topics of the proposals, (2) the research methods proposed, and (3) the educational practice or policy contexts in which the proposal is situated. Relevant experience and expertise should be determined broadly, based on the range of proposal types and program priorities. If, for example, a specialized quantitative research design is being proposed, at least some reviewers should have expertise in this design; if a specialized qualitative research design is proposed, some reviewers should have expertise in this design.
In addition, it is the range of proposal types and program priorities, not their frequency or conventionality, that should determine the scope of the panel’s experience and expertise. In most cases, individual panelists will have relevant experience and expertise in one or more, but not all, of the topics and techniques under review. Thus, it is the distributed expertise of the review panel as a whole, and not the characteristics of individual members, that establishes the appropriateness of the panel for the task. In this way, peer review is “intended to free [decision making] from the domination of any particular individual’s preferences, making it answerable to the peer community as a whole, within the discipline or specialty” (Harnad, 1998, p. 110).
Thus, peer reviewers of research proposals should be chosen first and foremost for their experience and expertise in an area of investigation under review. Ideally, reviewers will not harbor biases against other researchers or forms of research, will not have conflicts of interest that arise from the possibility of gaining or losing professionally or financially from the work under review, and can be counted on to judge research proposals on merit alone. But in practice, researchers in the same field often do know each other’s work and may even know each other personally. They may have biases for or against a certain type of research. They may be competitors for the same research dollars or the same important discovery or have other conflicts of interest associated with the research team proposed in a study. In such situations, impartiality is easily compromised and partiality not always acknowledged (Eisenhart, 2002). Indeed, Chubin and Hackett (1990) argue that increases in specialization and interdisciplinary research have shrunk the pool of qualified reviewers to the point at which only those with a conflict of interest are truly qualified to conduct the review. Potential conflicts of interest and unchecked biases are a serious limitation of peer review. In the long term, these limitations can be addressed by expanding the pools of qualified reviewers through training and outreach to experts traditionally underrepresented in the process.
In assembling peer review panels, attention to the diversity of potential reviewers with respect to disciplinary orientation as well as social background characteristics is important for a number of reasons. Diverse membership promotes the legitimacy of the process among a broad range of scholars and stakeholders. If peer review panels are consistently homogeneous with respect to discipline, race and ethnicity, or other categories, it will send a signal to those who have been excluded from participating that they are not relevant actors in education research and that their concerns and perspectives are not valued in the work of the agency. Thus, efforts to promote diversity should be part of the public record.
Diversity is also related to quality. Peer review panels made up of experts who come from different fields and disciplines and who rely on different methodological tools can together promote a technically strong, relevant research portfolio that builds and extends on that diversity of perspectives. Similarly, diverse panels with respect to salient social characteristics of researchers can be an effective tool for grounding the review in the contexts in which the work is done and for promoting research that is relevant to a broad range of educational issues and student populations.
Finally, actively recruiting panelists from diverse backgrounds to participate in the process can extend professional opportunity to a broader pool of researchers, building capacity in the field as a whole. Social characteristics affect the training researchers receive (because of the schools they attend, the topics and designs they are likely to pursue in school, and the jobs they anticipate for themselves) and in turn affect the experiences and expertise they develop (Harding, 1991; Howe, 2004). Thus, explicit attempts to engage traditionally underrepresented groups in the peer review process can improve access and opportunity, resulting in an overall stronger field and more relevant research.
As we have discussed (see Recommendation 2 in particular), peer review can provide a rich context for socializing researchers into the culture of their profession and should be explicitly designed to promote the attainment of this objective. This function of peer review is often
underutilized in the push to make funding decisions efficiently. Opportunities for engaging panel members in activities that further their professional development are compromised when panels do not include broad representation of relevant experience and expertise, when panel members do not deliberate together, and when time does not permit differences of perspective and position to be aired and debated. Such opportunities for socializing investigators—both experienced and inexperienced with respect to sitting on review panels—into the research ethos are likewise compromised when clear requests for proposals are not available and when good feedback is not provided to applicants. These limitations also reduce the incentive for strong researchers to contribute their time and expertise to peer review: Why should they contribute if so little will come of their efforts and if they will gain so little from the experience?
Attending to promising scholars at early stages of their careers can also target professional development opportunities for up-and-coming researchers who have solid credentials but less experience reviewing. The testimony of many workshop participants who cited early experiences serving on NIH (standing) panels as career-changing indicates the potential of peer review to develop early career researchers. It is important, however, that promoting the participation of rising scholars in the context of peer review be balanced against the need to tap the best intellectual talent for review.
We need to be clear that in supporting peer review as a mechanism for developing researchers, we do not mean to suggest inculcating researchers into a culture based on cronyism and traditionalism. To prevent the isolation of perspectives and favoritism toward well-established names and institutions from taking hold, checks on the system must be in place. That said, the very foundation of the research process rests on the development of a commitment to scientific norms and values, which can and should be reinforced in the context of peer review (National Research Council, 2002).
Recommendation 9: Agencies that fund education research should involve practitioners and community members in their work to ensure the relevance and significance of their portfolio. If practitioners and community members participate on peer review panels, they should focus on the relevance, significance, applicability, and impact of an education research proposal.
Education research, by its very nature, focuses on issues with high social relevance. As a result, it is important that the process of funding that research in federal agencies includes input from a variety of stakeholders
who include, but are not limited to, teachers, principals, superintendents, curriculum developers, chief state school officers, school board members, college faculty, parents, and federal policy makers (for ease of exposition, we refer to these groups collectively as “practitioners and community members” henceforth). These individuals can provide a wider base of expertise from which to examine educational issues, and they can also typically provide a wider demographic perspective to inform decision making.
The question, then, is not whether to involve practitioners and community members in the work of federal agencies that support education research, but how. All research agencies have mandates that require them to address the societal benefits of proposed research. In education, this often translates into ensuring that the research is relevant to practice and feasible in the context in which it is proposed. To adequately assess research on this criterion, it is incumbent upon agencies to involve persons beyond the research communities who can help judge the social relevance and benefits of the funded projects in their portfolios. Indeed, the inclusion of practitioners and community members in the work of federal agencies that support education research can be thought of as another dimension of diversity in peer review and funding deliberations.
As we describe in Chapter 2, the variation in agency practice suggests that there are many ways in which practitioners and community members can provide input to the work of federal research agencies. We struggled ourselves to sort out the best place to engage practitioners and community members. Ultimately, we conclude that these agencies should have the flexibility to use one or more of the four mechanisms we describe here to ensure their active participation:
Panel review of proposals alongside researchers;
Second-level review of proposals after researchers’ reviews;
Priority-setting or policy boards; and
Retrospective reviews of agency portfolios.
The first and most controversial practice used in some agencies involves the inclusion of practitioners and community members on peer review panels alongside researchers. Since this approach is a significant topic of interest generally and among the workshop participants specifically, we analyze the underlying issues as they pertain to the review of education research proposals in federal agencies in some detail and outline the conditions under which such an approach could be beneficial to all involved.
A major concern with the practice of including reviewers without research expertise on panels is that it could lead to inadequate reviews with respect to technical merit criteria, a critical aspect of research proposal review in all agencies. In addition, since the field of education research is in the early stages of developing scientific norms for peer review, this important process could be complicated or slowed by the participation of individuals who do not have a background in research.
We also see the potential benefits of including practitioners and community members on panels evaluating education research funding applications to help identify high-quality proposals and to contribute to professional development opportunities for researchers, practitioners, and community members alike. With respect to quality, practitioners and community members are well suited to provide insights about the relevance and significance of research proposals—an important evaluation criterion across all agencies represented at the workshop. As we argue above, evaluating the technical merits of research—another critical evaluation criterion—is best addressed by seasoned researchers. However, there may be feasibility or practical issues associated with particular study design features that practitioners and community members could help identify. In this way, they can contribute to the judgment of technical merit by lending expertise about the likelihood that a design can be successfully implemented in a particular educational setting (although many researchers also have experience with implementation). Because practical and technical issues overlap in this way, if practitioners and community members serve on panels, they should fully participate in all aspects of the review process, including written reports prior to meetings and discussion and ratings processes during panel meetings.
Including practitioners and community members on panels can also enhance their professional development and that of their researcher peers by providing opportunities for these two disparate groups to understand and appreciate each other’s perspectives and contributions. Thus, the peer review process would need to be structured to provide opportunities for meaningful interactions among panelists. Through these interactions, practitioners and community members could learn more about education research and its methods and become more enthusiastic about its potential for improving schools. And researchers could learn more about pressing educational issues and the practicalities of researching them, leading to improvements in research design and implementation that better fit district, school, or classroom practices and organizational features.
If, for these or related reasons, practitioners and community members are included on panels, we recommend that the ratio of researchers to practitioners and community members be high and that maintaining a manageable panel size be an additional consideration in whether and how to include them in reviews (see discussion of Recommendation 4). In the authorizing statute for the Office of Special Education and Rehabilitative Services, for example, many categories of practitioners and community members are required to participate on panels. As a practical matter, this could mean that these groups have a disproportionately large impact on evaluating research proposals relative to their research peers or that, in an attempt to ensure that adequate research expertise is represented, panel sizes become unwieldy. Neither scenario is ideal.
Furthermore, attention to developing a pool of qualified reviewers (see Recommendation 8) would need to extend to practitioner and community member groups as well—it is critical that all reviewers, including those without research expertise, be carefully and rigorously selected so that they participate and contribute in a positive way. All peer reviewers—whether they are researchers, teacher trainers, dissemination specialists, administrators, parent trainers, policy makers, or others—should be deeply knowledgeable about the area under investigation and screened for potential conflicts of interest and biases.
Finally, to engage practitioners and community members on peer review panels successfully, it is critical that agencies provide thorough training to all reviewers so that they understand the expertise they are expected to bring to bear on the review and can participate in the process effectively. In the case of practitioners and community members, special attention should be focused on their role in evaluating the significance, feasibility, and applicability of the research. We also think that panel chairs will need additional training to facilitate group processes effectively when disparate groups are represented at the same peer review table.
There are additional promising ways in which practitioners and community members can and should be meaningfully involved in the education research allocation process in federal agencies. Most of the conditions we describe as important to ensure the success of direct practitioner in-
volvement on panels apply to these options as well: no matter what the strategy, fostering opportunities for meaningful interactions, providing training, and developing and vetting candidates are all essential practices.
The NIH model for involving stakeholders is an attractive option. Stakeholders serve on advisory boards that provide a second level of review—after the assessment of technical merit has been made—that evaluates the grants proposed for funding in terms of their significance and relevance for practice. These advisory boards at NIH also help establish priority areas for their respective institutes.
Priority setting, the role played by the NIH advisory boards, is yet another way that agencies have engaged practitioners and community members. For example, the former National Educational Research Policy and Priorities Board and the National Board for Education Sciences were both created to work with agency leadership to guide the development of programmatic priorities for education research, and both by statute require the inclusion of researchers as well as practitioners and community members. While the role of these board members does not include the review of individual proposals, practitioners and community members who serve on such boards nonetheless can exert a powerful influence over the policies and practices used in peer review and the nature and type of research the agency seeks to fund.
A final way that agencies can ensure active practitioner and community member involvement is one currently used by the U.S. Department of Education’s OSEP to comply with the Government Performance and Results Act of 1993, P.L. 103-62. The agency assembles panels of researchers and stakeholders (including parents of children with disabilities and special education teachers, among others) to comprehensively and retrospectively evaluate the investments OSEP has made along two key dimensions: rigor and relevance. In this way, people “on the front line” are engaged in critically assessing the relevance of the research for addressing their needs. We see this process as a good example of tapping expertise appropriately in federal research agencies, which should be investigated further for its applicability to other agencies and settings.
In sum, education research is strengthened by a rigorous review of proposed projects. Practitioners and community members who represent diverse viewpoints bring important perspectives to education research. Their participation in the work of federal agencies that support education research should be ensured in ways that capitalize on their strengths: assessing the relevance and societal significance of the research.
Recommendation 10: Agencies that fund education research and professional associations should create training opportunities that educate scholars in what the peer review process entails and how to be an effective reviewer.
No matter what regulations or procedures are established for the peer review process, the process can only be as good as the individuals involved. Without explicit training, many scholars in education research may be unfamiliar with the peer review process for obtaining agency funding. The particulars of how to develop proposals for various agencies may not be transparent. And if they are asked to act as reviewers, it cannot be assumed that researchers understand their responsibilities or how to conduct themselves on a panel. This situation is particularly acute for investigators who are beginning their careers and have little or no experience as peer reviewers. Thus, to improve the quality of proposals submitted to funding agencies and the reviews of those proposals, training activities must be developed for writing proposals and conducting reviews.
Because researchers write proposals for paper presentations, books, and other activities, it is often assumed that the requirements for writing a proposal for external funding are similar. While most scholarly proposals do contain similar elements—the importance of the research question, how the study is to be conducted, what one expects to find, and what importance it has to the field—proposals for external funding require a particular level of clarity and specificity that is not typical in other areas. It is not just the level of detail that distinguishes proposals for external funding: such proposals also have to be uniquely conceived and written according to guidelines that can change within and across the various agencies that support education research.
Requests for proposals from agencies typically specify the range of questions, the populations that can be studied, the designs for studying them, and the funding level available to conduct such work. Although announcements accompany most funding opportunities, investigators may still need assistance in understanding whether their work falls within the call for proposals, or whether designs that are not entirely in scope but are innovative in approach would be acceptable. Much of this uncertainty could be dispelled through regular communication between funding agency personnel and investigators and through efforts to make the process transparent to all.
One way to address these issues systematically would be for the professional associations, such as the American Educational Research Association
(AERA), the American Sociological Association, the American Psychological Association, the American Anthropological Association, and others, to hold workshops on how to write proposals to federal agencies. These workshops should involve agency personnel and individuals who have been successful in the proposal process. Examples of outstanding proposals that succeeded in the review process could be distributed and discussed. It would be particularly worthwhile if the agencies identified proposals that were especially well written and resulted in work that made important contributions to the field.
Several organizations already sponsor similar kinds of opportunities (e.g., the Spencer Foundation, AERA, and the National Academy of Education). Access to these and related experiences should continue to be made available to all, with special emphasis (as is currently the case with some existing programs) on engaging researchers who are early in their careers and those who come from traditionally underrepresented groups.
Roles and responsibilities for peer reviewers will differ across agencies, areas under investigation, the level of development of a field of research, and the resources available for peer review. Regardless of the roles chosen, they must be made clear to reviewers in advance. As we discuss in connection with Recommendation 7, reviewer training on the use of evaluation criteria is essential. Furthermore, reviewers should follow a basic code of conduct, which includes acting professionally, avoiding personal innuendo, listening to others, disclosing disciplinary and ideological biases, and continually scrutinizing the potential contribution of the study being proposed. Other training needs will vary with agency goals and associated processes.
Based on workshop discussions, we see this as an area in major need of improvement at most agencies. The National Institute on Drug Abuse has a program, described at the workshop by Teresa Levitin, to train investigators selected for review panels. Although not widely implemented, it could serve as a prototype for developing materials and training reviewers, especially in education and other social science fields. Other agencies that use peer review are likely to offer additional models worth examining in this context.
Peer review has been held up as a standard for enhancing the quality and utility of education research. Understanding the basic issues associated with this tool is required for the standard to be used effectively. We offer
this brief treatment to help education research policy makers approach the task of improving peer review in this era of evidence-based education. It is our view that the current emphasis on peer review is welcome—provided that those charged with overseeing the process understand the strengths and weaknesses of various approaches and implement them with clarity of purpose.