Strengthening Peer Review in Federal Agencies that Support Education Research

2
Analyzing Key Elements

On the surface, setting up an effective peer review system seems straightforward: contact the top experts in the field, have them review a set of proposals and provide their input, and fund research projects according to the advice provided. Experience has shown, however, that enacting this simple concept is complicated. How should expertise be defined, especially in the many areas of education research that are multidisciplinary? What kind of criteria should be used to judge the proposals, and how should they be quantified or summarized? How should the process be structured so it is seen as legitimate by a range of stakeholders? What is the best way to organize and support the group? What is the nature of the relationship between the peers and the agency staff and leadership, who typically make, and are ultimately accountable for, final funding decisions? These are but a few of the multidimensional questions involved in designing, revamping, or evaluating peer review systems.

This chapter provides an overview of some of the major components of peer review systems designed to assess education research proposals in federal agencies. We describe and analyze components of peer review processes with respect to how they promote particular objectives. We chose to consider aspects of peer review from this perspective not only because it serves as an effective organizational framework, but also because in our view research policy makers ought to approach their own systems in a similar manner. We conclude
with an examination of management issues that influence the extent to which such systems can produce desired results.

From our analysis, we draw six major conclusions:

1. Peer review serves a number of worthwhile purposes. For peer review systems for federally funded education research, two objectives important in their design are the identification and support of high-quality research and the further development of a culture of rigorous inquiry in the field.
2. Federal agencies that fund education research use a range of models for peer review that serve different purposes and objectives.
3. Developing peer review systems involves balancing multiple, and sometimes conflicting, values and thus often requires making trade-offs.
4. Peer review in the federal government is a tool by which agency goals are accomplished and therefore can only be developed, evaluated, and understood as framed by these objectives.
5. Although peer review is not perfect, it is the best available mechanism for identifying and supporting high-quality research.
6. Peer review of education research proposals in federal agencies could be improved in a number of ways.

MULTIPLE PURPOSES AND VALUES

In Chapter 1, we described the nature of peer review in the federal government as one that serves both scientific and political ends. In their paper (see http://www7.nationalacademies.org/core/HacketChubin_peer_review_paper.pdf) and presentation to the committee, Hackett and Chubin elaborate the many functions that peer review is called on to serve. At the most basic level, peer review is a mechanism for evaluating the merits of proposals for research funding, thereby influencing the distribution of federal research funds. But it also serves several additional and related functions.
For example, a major reason scientists participate in peer review—a time-consuming task in addition to existing professional obligations—is to have an impact on the field beyond their own investigations. Thus, peer review shapes the accumulation of knowledge over time by recommending a subset of proposed research for implementation. This idea was prominent
in workshop discussions. Both Hilda Borko, education professor at the University of Colorado and president of the American Educational Research Association, and Penelope Peterson, dean, school of education and social policy, and Eleanor R. Baldwin Professor of Education at Northwestern University, speaking on behalf of a group of education school deans,1 articulated peer review as a force that “shape[s] and envisions” the future of a field. Edward Hackett, sociology professor at Arizona State University, highlighted the “communication function” of peer review and its role in “prepar[ing] the ground for the acceptance of new ideas.” Finbarr Sloane, of the Education and Human Resources (EHR) Directorate of the National Science Foundation (NSF), echoed these ideas, stating that “there is a huge return on investment for serving on a panel…. [Reviewers] get a sense … for what national questions other people are posing, and responses to those questions.” And Edward Redish—a physicist and physics education researcher at the University of Maryland—also pointed to the benefits for researchers who serve on peer review panels, citing the value he has experienced in “see[ing] what people were thinking about in the field.”

Delivering feedback to proposers can also signal the field’s (often implicit) standards of quality, reinforcing them in a formal context. Redish made this point about the purpose of peer review most directly, arguing that “peer review is not just about finding scientific merit in particular areas. It is about defining it and creating it.” This purpose is particularly salient in education, since current standards of evidence often vary by discipline and subfield. Redish’s point also underscores the fact that judging the scientific merit of a proposal for research is different from judging the merits of a research product.
Research is by its nature an exercise in being alert to, and systematically dealing with, unexpected issues and questions that arise in the course of an investigation. Therefore, the nature and level of specificity of quality criteria are different when considering a description of how an investigator plans to approach the work than when considering the product of a completed investigation.

1 This group—called the Education Deans’ Alliance—was formed in 2000 to share information and to improve the doctoral training of education researchers at their institutions. It includes deans from schools of education at Columbia University, Emory University, Harvard University, Michigan State University, Northwestern University, Stanford University, the University of California at Berkeley, the University of California at Los Angeles, the University of Michigan, and the University of Pennsylvania.

Peer review can also be used as a tool for building interdisciplinary
trust among groups of investigators from different research traditions—again, an important endeavor in an area like education, in which multiple fields and disciplines focus on various aspects of teaching, learning, and schooling. Kenneth Dodge, director of the Center for Child and Family Policy at Duke University, described how engaging in peer review helps draw disparate fields together to better reflect and understand the complexities of educational phenomena.

Another function of peer review is its role as a buffer, creating a privileged space for researchers to make judgments largely apart from political considerations (Hackett and Chubin, 2003). While political considerations drive funding levels and can affect statements of priority areas (National Research Council, 2002), peer review is used to remove decisions about the funding of individual projects from the influence of special interests or other political groups and agendas. Thus, the peer review process offers a space for researchers to apply scientific principles, debate and identify promising lines of inquiry, and offer crucial advice to decision makers that draws on their expertise to advance research-based knowledge.

Workshop discussions also highlighted the role of peer review as a tool for professional development—for proposers, reviewers, and agency staff—to promote a professional culture of inquiry and rigor among researchers. This culture includes an ethos steeped in self-reflection and integrity, as well as a commitment to working toward shared standards of practice (Shulman, 1999; National Research Council, 2002; Feuer, Towne, and Shavelson, 2002).
Many workshop participants pointed to the broad “educative” function of peer review: to mentor an incoming generation of scholars, to train investigators to review the scholarly quality of proposals, to produce higher quality proposals in the future, and to strengthen connections throughout the field of education research.

Although rarely explicit, peer review is often expected to meet these and many other purposes equally well. It is therefore not surprising that the process can come under fire for not serving any one of them fully. Designing peer review systems, improving existing ones, and assessing their effectiveness requires cognizance of these expectations and the implementation of process options accordingly.

In addition to serving multiple purposes, peer review systems are also designed to serve a set of values, like those of the agency and the fields it supports. These values are sometimes in tension, and they always require a careful balancing act in choosing a course of action. For example, peer review is expected to uphold the value of effectiveness—“to recommend
projects that would benefit the field and confer some greater social benefit, to offer advice to proposers, to circulate ideas within a community, and more. Peer review is also asked to be efficient, to do all of this at very low cost, with cost measured in terms of dollars spent on reviews (infrastructure, travel, reviewer compensation) and in hours expended by proposal writers and reviewers” (Hackett and Chubin, 2003, p. 15).

Another example of these value tensions is the trade-off between risk and tradition. Hackett and Chubin (2003) argue that this tension in peer review is a reflection of the tension in scientific communities more generally: research is expected to chart new progress, but to do so systematically and within the broad parameters set by existing knowledge and standards of rigor. During her presentation, Peterson argued that peer review systems ought to “create opportunities for risk-taking and innovative education research.” Balancing efficiency against effectiveness, and risk against tradition, are just two examples of the many kinds of values to be weighed—explicitly or implicitly—by peer review systems (see Hackett and Chubin, 2003, for a more complete treatment).

The multiple purposes and competing values inherent to peer review, coupled with the complex nature of education and education research, are reflected in a high degree of variability in peer review systems among the many agencies that fund education research. Culture, tradition, and the mission of the agency also exert a powerful influence over the nature of peer review practices. Indeed, it is clear that no single model could suit all purposes, all situations, and all fields equally well. Whether a particular practice will work well depends in large part on the specifics of the situation and the purposes the system is intended to serve.
To guide our analysis of peer review practices, we first articulate two broad purposes best served by peer review systems in federal agencies that support education research.

KEY OBJECTIVES OF PEER REVIEW FOR EDUCATION RESEARCH

Taking our cue from this discussion of multiple purposes, we conclude that two broad objectives ought to guide the design of peer review systems in federal agencies: the identification and support of high-quality education research and the professional development of the field.

The first objective of using peer review as a process to achieve quality
research has been front and center in federal agencies that have funded education research for some time (although it is a matter of debate how well various agencies have done so in the past). We strongly endorse explicit attention to education research quality as well as redoubled efforts to strengthen peer review systems for this purpose. Rigorous studies of educational phenomena can provide important insights into policy and practice (and have—see National Research Council, 2002, for examples). But poor research is in many ways worse than no research at all, because it is wasteful and promotes flawed models for effective knowledge generation. Quality is of the essence, and having leaders in the field carefully scrutinize and screen proposed work is one essential way to promote it.

Although what is meant by quality with respect to education research is a matter of some debate in the field, attending to the rigor and relevance of education research is essential to its health. Peer review systems in federal agencies offer a natural place to engage the field in the contested but crucial task of developing and applying high standards for evaluating the merits of proposed research. Strict rules are not advisable given the interdisciplinary nature of education and the prospective nature of research proposals. However, broad standards, consistently applied in peer review settings, are needed to ensure quality. Moreover, the current enthusiasm for, and debates surrounding, calls for “scientifically based research” in education and references to the use of peer review provide opportunities for a stronger and more consistent focus on peer review as the means to promote research quality.
By defining and upholding high standards of quality in the peer review process, researchers can exert a powerful influence on questions of what counts as high-quality research in particular contexts—providing input directly from the scholarly communities with respect to the implementation of policies stemming from the now numerous definitions of quality research that appear in federal education law (e.g., the No Child Left Behind Act of 2001, the Education Sciences Reform Act of 2002, and bills pending to reauthorize the Individuals with Disabilities Education Act of 1997 and parts of the Higher Education Act of 1965, P.L. 89-329). The insulation of peer review from the political process is important for facilitating this goal.

In our view, the second objective that should guide peer review in federal agencies that support education research is to contribute to the further development of a culture of inquiry in the field. Peer review has not historically been designed to promote such professional development in the federal agencies that support education research. We think it deserves
far more attention. As the authors of Scientific Research in Education (National Research Council, 2002) argue, it is a professional responsibility of education researchers to participate in peer review in federal agencies, and the field ought to harness this system to promote the development of the profession. Federal education research policy makers also have a major responsibility for organizing peer review in ways that foster growth among education researchers.

If deliberately developed with this objective in mind, peer review systems can serve this purpose among the many players in the education research field. In the context of peer review, these players can usefully be categorized as applicants (people who are seeking agency funds to initiate new work), reviewers (people who review the merits of the proposals for new work), and staff (people who work in the research agencies). All three of these categories of people are members of the research community, operating in the broader public domain. In the ideal, peer review systems foster enriching interactions, and each group serves both a teaching and a learning function, to their own benefit and that of others. Chubin and Hackett (1990) argue that this dynamic can improve understanding among all members of the community, enhancing the capacity of the field as a whole.

For example, an applicant can communicate to reviewers cutting-edge ideas in an area of study, stimulating thinking among a broader set of researchers on potential new directions for a field or subfield. In much the same way, the feedback that reviewers provide to applicants often signals areas of contention about new ideas or techniques, preparing the ground for broader scrutiny and consideration of where and how to push the knowledge base and its application.
Agency staff teach and learn as well: they familiarize reviewers with relevant agency priorities, goals, review criteria, process specifics, and the particular objectives held in a research competition for advancing the field. In managing and participating in the process, the staff often gain a significant breadth of understanding and knowledge of a field by reading proposals and listening to reviewers’ dialogue about the status of the field and the quality of the batch of proposals under review, across and within panels. In some cases, agency staff are themselves accomplished researchers who are serving in temporary posts in research agencies. Overall, knowledgeable staff sharpen internal thinking about how to shape and run future competitions.

Having described and justified our choice of the two objectives we
hold as most salient for shaping peer review of education research proposals, we now analyze several design features of peer review systems described at the workshop with respect to how likely they are to promote them. Other purposes, including those mentioned in this report, may be relevant to promote in particular contexts and at particular points in the evolution of a line of inquiry in education research. Our intent in setting forth these two objectives is to identify explicitly the purposes we see as most relevant for organizing peer review systems in federal agencies, as well as to provide a structure for analyzing various aspects of peer review systems. Since some peer review practices serve more than one purpose, there is some overlap in the discussion of peer review practices and considerations between the two main sections that follow. In some of these cases, we highlight the tensions that arise and the trade-offs that are often required when peer review attempts to serve multiple purposes.

IDENTIFYING AND SUPPORTING HIGH-QUALITY RESEARCH

The formal review of education research proposals by professional peers must be designed to identify and support high-quality research. There are many decisions and practices that undergird this critical function, most of which can be categorized into two areas: the people in the process—Who counts as a peer?—and the criteria by which quality is judged—How is research quality defined? Within each, we take up a set of peer review practices described at the workshop that relates to them most directly.

Who Reviews: Identifying Peers

Deciding who counts as a peer is the very crux of the matter: the peer review process, no matter how well designed, is only as good as the people involved. Judging the competence of peers in any research field is a complex task requiring assessment on a number of levels.
In education research, it is particularly difficult because the field is so diverse (e.g., with respect to disciplinary training and background, epistemological orientation) and diffuse (e.g., housed in various university departments and research institutions, working on a wide range of education problems and issues). The workshop discussions brought out several related issues and illustrated the difficulties in, and disagreements associated with, assembling the right people for the job.
Unpacking Expertise

What are the required skills, experiences, and knowledge for peer reviewers to perform their duties? Workshop participants answered this question in a number of ways. In their presentation of the main findings from an evaluation of the peer review system at the former Office of Educational Research and Improvement (OERI) during the mid-1990s, Diane August, senior research scientist, Center for Applied Linguistics, and Penelope Peterson reported on an analysis of the fit between the expertise of reviewers and the competitions they reviewed for. Using the standards for peer reviewers that were in place at the time, they focused on the extent to which each reviewer had content, theory, and methodological expertise. They found a number of disconnects, including a relatively low level of fit on the methodological aspects of the research proposals under review (August and Muraskin, 1998).

Expertise is required in three main areas to identify high-quality education research in the review process: the content areas of the proposed work, the methods and analytic techniques proposed to address the research questions, and the practice and policy contexts in which the work is situated.

At one level, it is self-evident that reviewers need to know something about content to review education research proposals. But “education” is a term covering a vast territory of potential areas of study. Some competitions for research dollars are cast quite broadly (e.g., early childhood development), while others carve out a well-defined subtopic (e.g., effectiveness of pre-K curriculum on school readiness). Content expertise, then, is defined by the research priorities in the competition itself. Even in relatively circumscribed competitions, a wide range of content knowledge is typically required to adequately judge the merits of a set of proposals.
Furthermore, knowledge of content as it applies to teaching and learning that content is important. Referencing Shulman (1986), Borko made this point at the workshop, asserting that in order “to review proposals about mathematics teaching and learning, [reviewers] really do need to know about mathematics, and … teaching and learning. Pedagogical content knowledge is kind of the nexus of those aspects of knowledge.”

Another dimension of expertise necessary for peer review of education research proposals is knowledge of relevant methodological and analytic techniques. As in any profession, familiarity and facility with the tools of the trade are an essential part of the job. Reviewers must possess a solid grounding in the methodological approaches best suited for studying the particular problems or topics reflected in the competition. Competent peer review of the quality of research must be conducted by groups of researchers who are together familiar with both general standards (like those outlined in Scientific Research in Education, National Research Council, 2002) and specific standards (relative to particular subfields) and who practice these standards in their own research studies (National Research Council, 1992; Chubin and Hackett, 1990; Cole, 1979).

Finally, reviewers must be grounded in the overarching practice and policy contexts associated with the area under consideration. This foundation is necessary to place the potential contribution of new work in the context of current issues and problems facing education policy makers and practitioners, as well as to consider the kinds of expertise that might be required to carry out the work effectively.

Do all reviewers need to have each kind of expertise to participate effectively? Most workshop participants agreed not only that it was nearly impossible to find people with such breadth and depth of experience and expertise, but also that it wasn’t necessary. Rather, we agree with most participants that it is the combined expertise of the group that matters. That is, constructing panels with appropriate expertise requires ensuring that the group as a whole reflects appropriate coverage. Hackett made this point most directly, arguing that it is the “distributed” expertise on a peer review panel that is relevant.

Beyond these three broad areas of competence that we view as essential for peer review panels, additional kinds of expertise relevant to the process surfaced in workshop discussions.
For example, Robert Sternberg, director of the Yale Center for the Psychology of Abilities, Competencies and Expertise and president of the American Psychological Association, suggested that creativity is an undervalued yet critical talent for assessing research quality.2 Teresa Levitin, director, Office of Extramural Affairs, speaking from her experience running panels at the National Institute on Drug Abuse at the National Institutes of Health (NIH), referred to a number of personal qualities that make for effective reviewers. Such people listen respectfully and are intellectually open to genres of research outside their realm of expertise. They neither dominate nor acquiesce during face-to-face deliberations about proposals under review. Although we deem these traits secondary to the three dimensions of expertise we describe here, they are some of the intangibles that influence the success of the peer review process in a very real way and therefore must be considered in vetting reviewer candidates.

2 Due to illness, Sternberg did not attend the workshop but sent his presentation slides for the committee’s consideration.

Conflicts of Interest and Bias

For peer review to be an effective tool for identifying and supporting high-quality research, it must be credible. Essential to the integrity and legitimacy of the process is ensuring that reviewers do not have a vested interest in the outcomes of the competition that could introduce criteria other than quality into the process. Thus, it is essential to vet potential reviewers for whether they have a conflict of interest that would prevent them from fairly judging a proposal or set of proposals. At one level, it is the responsibility of agency staff to probe these potential problems. But it is also a critical part of an ethical code of conduct among investigators to be forthcoming about their relationships to the proposed work. As Levitin put it: “the integrity of the system really depends on the integrity of the individual reviewers.”

Conflicts of interest may arise in situations in which there is a possibility, or a perceived possibility, that a reviewer, or his or her associates, might gain from a decision about funding. Agencies deal with these issues in different ways. Steven Breckler, of the Social, Behavioral, and Economic Sciences Directorate at the NSF, referenced a “complex array of conflict of interest rules” that applies to peer review of research proposals submitted to the NSF.
Brent Stanfield, deputy director of NIH’s Center for Scientific Review, mentioned that applicants for funding from the NIH are encouraged to identify “competitors” who they feel would be too influenced by the outcome of the review to serve as fair reviewers, and that panelists with potential conflicts of interest on a particular proposal would recuse themselves from the discussion of its merits. Louis Danielson, director of the Research to Practice Division, Office of Special Education Programs (OSEP), described the interpretation of these and related rules by the U.S. Department of Education that precludes the participation of reviewers with particular affiliations.

A related but distinct idea that shapes the vetting of panelists is bias. Biases are preferences that may influence the degree to which proposals are judged fairly. Everyone has preferences, and researchers are no exception:
FURTHER DEVELOPING A PROFESSIONAL CULTURE OF INQUIRY

Peer review of education research proposals also ought to be designed to support the development of the field of education research. In this section, we analyze facets of peer review that relate most directly to upholding this objective: diversity of perspectives and backgrounds, standing panels, feedback, the role of staff, and training.

Diversity

Several workshop participants suggested that since peer review can and should serve an educative function, efforts to involve a diversity of research perspectives, as well as the participation of people from traditionally underrepresented populations, in the process were imperative. In response to a question about how agencies ensure diverse perspectives on peer review panels, Steven Breckler told the group that NSF program officers spend a significant amount of time trying to identify people and places that “ordinarily are not plugged into the NSF review process.” He also pointed to the “broader impacts” criterion in the NSF criteria for reviewing research applications, which requires an assessment of the extent to which the proposed activity will broaden the participation of such groups. According to Stanfield, NIH also pays close attention to these issues, relying on a number of mechanisms to promote broader participation, including the use of discretionary funding to support research among underrepresented groups and institutions.

In terms of this professional development goal, workshop discussions also focused on the role of peer review in developing junior scholars, another way to view diversity in the composition of panels. At the workshop, Peterson argued that a critical function of peer review in education research was to promote learning opportunities and growth among early career researchers.
Borko made a similar argument, suggesting that peer review be used to “mentor the next generation of researchers.” Agency representatives offered examples of how this goal is pursued in practice. For example, Sloane noted that in his work, “we make an effort to have about 20 to 25 percent of our panels be people who are not tenured.”
Standing Panels

Panelists can be assembled once to review a single set of proposals (an ad hoc panel) or on a regular basis to meet over a predetermined length of time and consider a particular area of research (a standing panel). There are strengths and weaknesses to both approaches. Ad hoc panels may be prudent when efficiency must be maximized; the review of small, exploratory grants may also be best served by assembling one-time groups.

To promote professional development and capacity building in the field, standing panels are a very attractive mechanism. Since education researchers come from so many fields and orientations, panels focused on particular issues or problems in education can promote a collective expertise that builds interdisciplinary bridges and facilitates the integration of knowledge across domains. Hackett, drawing from his own experience participating on NSF peer review panels, asserted that establishing interdisciplinary trust is difficult when panels are ad hoc. In contrast, he argued that standing panels that convene groups of investigators regularly around issues or problems can be quite promising in this regard. Standing panels provide a context for researchers to build relationships with scholars they might not otherwise know. Panel members can carry these experiences into their own work and that of their colleagues, forging broader disciplinary connections among more and more researchers studying common phenomena and questions but approaching them from different perspectives.

The use of standing panels is also likely to encourage the participation of top-flight investigators, as these longer term experiences are more attractive as professional learning opportunities than short-term panels. Offering this benefit is particularly needed in education.
In their evaluation of OERI, Diane August and Lana Muraskin reported that many former panelists do not view peer review as worthwhile for their career development and trajectory (August and Muraskin, 1998). Although there are surely many factors that lead to this sentiment, it is worth noting that peer review panels at OERI were always ad hoc.

Standing panels can also provide the kind of stability and institutional knowledge that can facilitate positive outcomes in resubmitted proposals. Not all agencies have standard resubmission policies—that is, formal procedures that unsuccessful proposers can follow to respond to the reviews of the proposal and potentially receive funding at a future date. Such processes can identify promising projects in need of further development for funding and provide concrete direction for improvements in specific areas. When an (improved) application is resubmitted, the panel members know
the history of its development and can more knowledgeably evaluate it on how well the proposers have responded to specific critiques rendered during its initial review. In addition, when groups of scholars meet regularly in peer review, they provide continuity of vision to programs of research—lines of inquiry in particular areas that together point to new insights, raise new questions, and suggest future directions for agency competitions. Over time, panelists acquire an understanding of the roles and relationships between the field and the agency, enhancing mutual understanding and reinforcing the norms of the culture in the context of the agency’s operations. It is the continuity that standing panels bring to an agency’s peer review system that is the basis for fostering powerful learning among proposers, reviewers, and staff.

Although well suited as a professional development tool, standing panels have their drawbacks. Retaining the same people over time can have a narrowing effect on the advice given to agency leadership, which is why many standing panels have term limits. Standing groups develop a consensus view of the field and its needs, which can result in neglecting potentially important lines of inquiry, methodological approaches, or contextual factors. Worse, they can institutionalize the biases the members bring to the work. The potential for these negative consequences is heightened if the members are not explicitly and carefully selected to represent a range of perspectives, if they do not approach their work with a willingness to listen and to consider differences of opinion and approach thoughtfully, and if their biases are not declared, considered, and balanced.

Feedback

Most peer review systems are designed in one way or another to provide substantive feedback to proposers (or would-be proposers) on the strengths and weaknesses of their plans.
The mode of feedback can take any number of forms. At the Office of Naval Research (ONR), for example, program officers spend substantial amounts of time working directly with investigators before they write a formal proposal for funding consideration. At many other agencies (e.g., NIH, NSF, and OSEP), the primary feedback mechanism is the provision of written products from the proposal review process—forms completed by reviewers that detail strengths and weaknesses for each evaluation criterion.

Substantive feedback—as well as clear guidelines for resubmission of rejected proposals—can play a vital role in promoting peer review’s educative function. At the workshop we learned that a major finding of the OERI evaluation by August and Muraskin (1998) was that the written reviews of proposals were cursory and often merely descriptive summaries of the content of the proposals themselves (rather than analyses of the content with respect to the review criteria). Both Borko and Peterson emphasized the value of feedback in the process and the need to upgrade its use in current systems. At the same time, agency staff from OSEP cited persistent problems getting reviewers to fully document their comments and to clearly justify their ratings, and August and Muraskin (1998) noted this problem in their evaluation of the former OERI’s peer review system as well. If peer review is to serve a professional development function effectively, agency staff and reviewers should take these responsibilities seriously and invest the time to fulfill them.

Yet another issue aired at the workshop showed how difficult establishing high-quality feedback can be. Both representatives from NIH described difficulties the agency encountered in recent years because investigators bristled at what they perceived to be inappropriate directives from reviewers. In response, then-director Harold Varmus determined that summary statements emanating from reviews should evaluate the proposed research according to established review criteria, but they should not be tutorials telling investigators how to do their research. In this context, there was considerable discussion about the appropriate level of detail that ought to be part of reviews: How do reviewers and staff balance the need to justify ratings and to communicate effectively with applicants while respecting the professional judgment of those applicants?
Danielson also raised the issue of resource constraints in this context, suggesting that if the agency were to provide detailed feedback on each of the roughly 4,500 applications it receives each year, it would have to contract the work out because of limited staff resources. We support erring on the side of more detailed information and critique, as this documentation is a key component of a feedback loop that can lead to future improvements in a field.

In addition to reviewers’ written feedback, agency staff can also interact with members of the research community—at professional association meetings, workshops convened specifically for principal investigators and future principal investigators, and other such venues—to orient investigators to the agency’s priorities and processes. The level of detail, approach, and other particulars associated with the content and format of proposals are not the same across or even within federal research agencies, and the more familiar proposers and reviewers are with these important process
mechanisms, the better the review and, most importantly, the better the products of the review. Explicit training on the nature of feedback should also be provided to reviewers; we take up such training issues in a later section.

Role of Staff

Another key feature of peer review systems is the role of staff in the process. Agency staff are part of the human resources of the research field, playing both teaching and learning roles. There are very real trade-offs associated with the various models of staff involvement in practice today. Three of the agencies represented at the workshop—NIH, NSF, and ONR—nicely illustrate two models at opposite extremes and a hybrid approach to staff involvement.

At NIH, the system is very deliberately built to erect a clear separation (sometimes called a firewall) between the staff who write the grant announcements soliciting proposals and develop scientific programs and the staff who select and interact with peers in the review of proposals received in response to those solicitations. In contrast, at ONR, a single staff person (sometimes called a strong manager) performs all of these functions. The system at NSF falls somewhere in between—endowing program officers with a fair degree of authority to shape competitions and to select peers, while creating checks and balances in the system to guard against improprieties.

The benefit of the ONR approach is continuity of expertise. Knowledgeable staff can follow the process from beginning to end, substantively interacting with members of the field in ways that facilitate learning on both sides and result in work with tight alignment to agency goals.
As Susan Chipman, director of the Cognitive Science Program at ONR, described the process, “ONR staff are the peers—they review proposals and make recommendations for funding.” Program officers at ONR often use multiple internal peers to judge research proposals, including potential consumers of the work, since ONR’s work is very applied and mission-oriented. Program managers like Chipman actively develop research programs based on the needs of their agency.

The trade-off is that this kind of participation across all parts of the peer review process can result in a loss of external legitimacy. Whitehurst, in describing his plans for peer review at the IES, articulated this downside. In the former OERI, program officers who developed solicitations also selected the peers to review proposals. He acknowledged that this continuity
is beneficial because that person becomes expert in all aspects of the competition. The problem, as he described it, is that having responsibility for both kinds of tasks raises the possibility of infusing bias into the system, thereby weakening its overall legitimacy. As he put it, investigators might reasonably wonder: Is everyone getting a fair shake, or are those researchers who are chummy with the program officer getting an unfair advantage? The NIH model, with its built-in firewall, creates a clear boundary and, as Dodge put it, this “keeps it pure.”

Describing the NSF process as it relates to these two models, Breckler argued that their hybrid approach taps the best of both worlds by relying on external panels of experts while allowing program officers substantive involvement. He asserted that the tenets of social psychology suggest that the best way to get people to act responsibly is to make them identifiable and responsible for what they are doing, supporting the kinds of roles that staff are authorized to serve: crafting program announcements, selecting peers for review, and settling on a slate to pass on for funding decisions. This approach, he suggested, allows one person to go against the group tendency to be conservative—that is, to reject innovative ideas. And a high level of responsibility helps to attract high-quality officers to the agency.

Responding to questions about the potential abuses of such a system, Breckler argued that the system is rarely compromised because the process is open. The agency mandates extensive documentation of peer review panels, requiring program officers to certify that they have completed parts of the process to the best of their ability and in concordance with relevant policy.
To address charges that some investigators may not get the fair shake to which Whitehurst referred, Breckler pointed to a complex array of conflict of interest rules for program officers. Furthermore, NSF has a long-standing tradition of instituting a final check in the process by engaging a committee of visitors to periodically and comprehensively assess research programs on a host of dimensions, including whether such conflict of interest rules were followed. The researchers who are called on to serve this function are asked to carefully scrutinize all aspects of the process to assess its fairness and legitimacy, and the results of the assessment are made publicly available.

Training

For peer review to fulfill a professional development function, explicit training for reviewers, proposers, and staff must be part of the process. But
workshop participants revealed that in-depth training is the rare exception rather than the rule in practice. Training reviewers was raised repeatedly at the workshop as an important element of the peer review process, and agency participants discussed strategies and identified impediments to facilitating successful training. Stanfield described ways NIH helps to familiarize reviewers with the peer review process, including brokering meetings prior to the panel discussion and setting its tone by beginning with experienced reviewers. Breckler asserted that providing model reviews to reviewers would be a helpful strategy, lamenting that this practice is not permitted at NSF. Chipman agreed, suggesting that the use of model reviews could help strengthen a tradition of high-quality reviews in peer review settings for education research.

In describing some training techniques she has used for reviewers at the National Institute on Drug Abuse at NIH, Levitin highlighted several potentially helpful strategies. She suggested that training starts well before the first meeting of the group, is both formal and informal, and is grounded in “general principles and policies.” Levitin suggested that if reviewers are well versed in a “few fundamental” ideas, they will be able to provide a fair review. She made clear that there cannot be hard and fast rules for every circumstance, given the very complex nature of review, different types of applications, and other factors, but that there are policies and procedures to guide reviewers in making fair judgments. One key area of training she described relates to teaching reviewers how to apply the review criteria. At NIH, ratings range from 1 to 5, with 1 being the most meritorious.
Levitin also stressed that it is important to communicate to reviewers how to provide balanced and thorough reviews, so that the strengths and weaknesses of every application are described and only the stated review criteria are used to assess them.

Training potential applicants was also discussed at the workshop. The agencies represented relied on a range of largely informal strategies to promote better proposals—such as program officers talking with junior scholars about the grant-writing process—and the degree to which this issue was addressed varied quite a bit. The resubmission procedure at NIH was the most formal described: with clear and comprehensive written feedback on the weaknesses of a submission, proposers get insights into how to improve their future proposals to the agency and are informed of specific guidelines for resubmitting a revised application in a future grant cycle. Milton Hakel, an industrial and organizational psychologist from Bowling Green State University, suggested that
the ability to write rejoinders to reviews could also be instructive. In many respects, the opportunity to revise an application in response to peer review provides this type of opportunity. One-time submission policies without explicit requirements for identifying a proposal as a resubmission and explaining how the proposal has been revised miss valuable opportunities for the professional development of researchers.

Finally, the training of staff is similarly important, but no one at the workshop mentioned any kind of professional development for staff involved in peer review systems. Indeed, in their recommendations, August and Muraskin (1998) suggested staff training as a strategy for improving the peer review process at OERI. How to develop training for agency staff would depend on the specific tasks the staff are expected to perform and the skills and knowledge needed to accomplish them effectively.

AGENCY MANAGEMENT AND INFRASTRUCTURE

Like any system, the peer review process must be effectively managed. Negative experiences of many reviewers of education research proposals—especially in the competitions studied in the evaluation of the former OERI by August and Muraskin (1998) and in testimony about peer review at OSEP to the President’s Commission on Excellence in Special Education (2002)—in large part derived from poor logistics. Active, careful attention to logistical arrangements enables a smooth peer review process that encourages participation and improves its outcomes. For example, lead time is critical to engaging top scholars in the process. Last-minute planning (often deriving from either legislative or executive branch delays) invariably leads to conflicts with previous commitments, seriously reducing the likelihood of tapping top talent to participate.
It also leaves little time for substantive reflection on proposals, leading to cursory and incomplete feedback and, in extreme cases, poor advice to decision makers about funding priorities. Infrequent and inconsistent announcements can set off a “now or never” mentality among researchers, ensuring a high rate of rejection given scarce resources and depleting the pool of potential reviewers. Active proposal management—through triage processes that involve an initial cut through the proposals and assignment of only promising projects to reviewers—can minimize workloads, focusing attention on high-priority areas and making participation manageable for reviewers. Despite the many anecdotes of how important peer review is to the field and to individual research careers, agency representatives consistently
pointed to increasing difficulties in recruiting reviewers. Stanfield identified the logistical hurdles involved with convening face-to-face meetings as particularly problematic: “It is very difficult to get very busy scientists to come to Washington three times a year for four years.” Similarly, Breckler stated, “it is difficult to get people who are going to dedicate themselves to do peer review” and “it is getting increasingly difficult.”

Incentives for scholars to serve as peer reviewers derive from a number of sources and compel individuals to behave in a variety of ways. Many of these sources are outside the control of any given federal agency (e.g., whether service on peer review panels is recognized in promotion and tenure decisions). Agencies can do their part to enable the recruitment of top-flight investigators by ensuring that their systems are managed effectively and reviewer workloads are minimized to the extent possible. For example, the August and Muraskin (1998) evaluation reported that many reviewers at the former OERI spent far longer reviewing than the estimated time commitment they had been given by agency staff.

FLAWS AND ALTERNATIVES

To this point, we have not taken on what might be considered the threshold question: What are the drawbacks to peer review as a mechanism for informing the allocation of federal research dollars, and are there viable alternatives? There are indeed problems with peer review, some of them significant (see Finn, 2002; Horrobin, 2001; McCutchen, 1997). And there are other ways that research dollars have been and are distributed. The workshop discussions did not address these questions in any detail. Hackett and Chubin’s paper (2003), however, does provide an overview of some of these issues.
To set the stage for the committee’s recommendations in Chapter 3, and drawing on Hackett and Chubin’s analysis, we acknowledge and describe some of the most worrisome weaknesses of peer review. We also identify some of the alternatives they describe for allocating federal education research dollars, ultimately concluding that, despite its flaws, peer review is nonetheless the best available mechanism for allocating scarce education research dollars.

A persistent complaint about the peer review process is the possibility of cronyism—that is, that engaging peers predisposes outcomes to benefit friends or colleagues with little or no regard for the actual merit of a given proposal (Kostoff, 1994). This situation can lead to a kind of protectionism
that repeatedly rewards an elite few, narrowing the breadth of perspectives and ideas that is so critical to scientific progress and stunting potentially promising lines of inquiry.

The peer review process can also inhibit innovation. Arguably, peer review is expected to draw the line “between sound innovation and reckless speculation.” As Hackett and Chubin (2003, p. 17) argue, “a review system at one extreme could reward novelty, risk-taking, originality, and bold excursions in a field … [or] it could sustain the research trajectory established by the body of accepted knowledge by imposing skeptical restraint on new ideas.” The closer a system comes to the latter pole, the more easily it can reject promising ideas as implausible. Current practice is often criticized for being too conservative—a well-known example recounted by Hakel at the workshop is that when the original manuscript describing the double-helix structure of DNA was submitted for publication, it was subjected to peer review and rejected.

What about other ways to allocate research dollars? As Hackett and Chubin (2003) report, Congress has the prerogative to allocate funds through direct appropriation (also termed “earmarking” or “pork barreling”). In fiscal year 2002, Congress earmarked $1.8 billion for projects at colleges and universities. While not all of this money is for research, earmarks for academia are a useful indicator of the exercise of direct appropriation. And although $1.8 billion is relatively small compared with the roughly $100 billion federal investment in research and development, it looms larger against the $25 billion federal budget for basic research (all data from the American Association for the Advancement of Science analysis of the R&D budget; http://www.aaas.org/spp/rd/guihist.htm).
The main deficiency of earmarking is that it circumvents technical expertise, jettisoning altogether the principle that scientific quality ought to be the primary basis for the allocation of research dollars. It also has a corrosive effect on the development of the research profession: without a clear link between rewards (continued funding) and performance (quality of proposals for future work), the core values of science would be eroded significantly (Hackett and Chubin, 2003). Another alternative is to rely on a single, so-called strong manager who makes decisions on behalf of the agency according to his or her best judgment (as is done in ONR). As Hackett and Chubin (2003, p. 5) observe, “In effect, this is peer review with one peer, so this steward had better be on a par (intellectually and in stature within the field) with those applying for support … [and] should understand the field and its needs (which should
be clear and widely shared) to ensure that decisions and allocations are wise, legitimate, and effective.” The arguments offered in support of the strong manager arrangement include that it is flexible, responsive, and an efficient way to distribute relatively small pots of money. It may also be appealing because the manager is held accountable for performance outcomes (e.g., research-based products that benefit the Navy). However, it would be nearly impossible to scale this approach up to the size of NIH (about $27 billion in fiscal year 2003), and it would face similar difficulties in mid-sized research agencies. More importantly, such concentrated power limits the breadth and depth of expertise that can be brought to bear on proposals and invites serious questions of bias and partiality.

Hackett and Chubin (2003) discuss a third funding alternative—using a formula to allocate resources. Funds may be allocated to states, universities, or institutes, then suballocated to groups or individuals according to a variety of additional criteria. Or formulas may be devised based on the past performance of individual scientists, with funds awarded accordingly. Some measure of current need or potential payoff may factor into the equation, as well as the number of researchers at a university or residents in a state. Fair and effective formulas would be hard to devise, and the relative merits of the various options would be endlessly debated.

None of these options for allocating research dollars is perfect, including peer review. When peer review is compared with these alternatives, however, it emerges as the mechanism best suited to promote merit-based decisions about research quality and to enhance the development of the field. This statement does not preclude some type of blended approach in making decisions about what research to fund, however.
Indeed, maintaining a variety of funding mechanisms can help offset the weaknesses of peer review. And there are additional design features that can be used in peer review to minimize potential problems. For example, the role of a peer review panel should always be to rank proposals, not to recommend particular decisions about what should be funded, as empowering panelists to make direct recommendations can more easily lead to questions about cronyism and conflict of interest. Term limits, blended expertise on panels, and attention to systematic evaluation of peer review processes and outcomes are additional examples of the kind we address in Chapter 3 that can and should be used to counterbalance the flaws of peer review systems.

In short, peer review as a system for vetting education research proposals in federal agencies is worth preserving and improving. So the question for us is how to strengthen it—a topic we address in the next chapter.