4

Assessing Technical Quality

It is important to consider such factors, discussed above, as type of R&D organization, type of assessment, and timescale of the assessment, so that the assessment of technical quality can be appropriately tailored to achieve its purpose for the given organization.

CHARACTERISTICS OF QUALITY

Quality is the characteristic of an R&D organization best suited to quantitative assessment and metrics. This characteristic encompasses both the workforce and the work being performed, as well as the adequacy of resources provided. Because no single statistic captures quality, it is important to define an appropriate set of tools that measure both the quality of the workforce and the quality of the output of the research. The aspects of quality listed here interact with and are also part of the assessment of the management practices, discussed in Chapter 3.

Quality of Personnel

The importance of the quality and expertise of people in an organization cannot be overemphasized.1 Many organizations have determined that the quality of the workforce is the most reliable predictor of future R&D performance, independent of mission drivers or impact. An assessment of the quality of the workforce is a fundamental best practice common to almost all assessments. This assessment can be both quantitative and qualitative, and benchmarking can be useful. A productive skill-set balance of the workforce will match the mission of the organization and be of an appropriate quality. It is advisable to evaluate management’s plans for recruiting the type of person most suited to the job.

Of special significance for many R&D organizations is the presence in any particular division, laboratory, or project of people with deep and creative technical capabilities.2 It is important to assess the organization’s policies and actions aimed at steadily building on the sets of capabilities associated with individuals possessing both breadth of experience across multiple projects and depth in one or more systems and disciplines. An effective organization enables its staff to capture new skills required for a given set of tasks at hand, while over the long term building the network required to make team members effective participants in global efforts to achieve the overall goals of the organization. To facilitate the creation of such capabilities, a diversity of personnel and work experience is vital. An effective assessment of an organization

____________________________

1 J. Lyons, 2012. Reflections on Over Fifty Years in Research and Development: Some Lessons Learned. National Defense University, Washington, D.C.

2 Y. Saad, 2012. “Review of IBM’s Technical Review Path.” Presentation to the National Research Council’s Panel for Review of Best Practices in Assessment of Research and Development Organizations, March 20, Washington, D.C.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.


includes consideration of a diverse workforce whose contributions may affect and advance the R&D mission of the organization.3,4,5,6 The success of a strategy that builds and strengthens this workforce can be assessed by examining movement of personnel both within and outside a given organization. Over time this strategy will provide for a diversity of experience with the broadest possible base of professional interactions. Thus, a key element of any meaningful assessment will be the agility with which an organization acts, at all stages of a program, to train its team and to provide a high level of exposure to diverse programs and organizations working with related or overlapping technical agendas and expertise.

At the heart of looking forward to the next generation of scientific and technological opportunities are the organization's scientists and engineers. Their knowledge of cutting-edge research will be an essential starting point in all such forward-looking efforts. To be effective, they will have flexibility to attend scientific and technical professional meetings and to participate in the international community of scholars. They will be encouraged to think about next-generation efforts, and they will be rewarded for that effort by some strategy that brings resources to bear on the most promising ideas and allows some to pursue high-risk, high-payoff efforts. More difficult but equally important is a parallel peer-review process that continually evaluates such highly speculative programs and aids in determining when available data indicate a clear likelihood of failure and suggest reallocation of assets.

Organization directors' discretionary funding, internal allocations for basic research, and outside ("other agency") funding sources are all key factors that enable the process of fostering a high-quality technical staff. This approach, enriched by interaction with academia, has proven to continue to yield dividends in new and unanticipated discoveries, often without guidance by a well-defined and planned timetable for discovery.

Quality of R&D

The desired outputs of the R&D organization differ by type of organization, but they can typically be represented by measurable quantities such as publications (and their quality), patents, copyrights, and peer awards. The absolute value may not be as important as the trends in such quantities. An effort to benchmark these metrics across similar organizations can make the absolute value more meaningful.

Preparedness

Preparedness is defined as the actions taken by the organization to identify and maintain the resources and strategies necessary to respond flexibly to future challenges.

____________________________

3 C. Herring, 2009. Does diversity pay? Race, gender, and the business case for diversity. American Sociological Review 74:208-224.

4 O.C. Richard, 2000. Racial diversity, business strategy, and firm performance: A resource-based view. Academy of Management Journal 43:164-177.

5 O. Richard, A. McMillan, K. Chadwick, and S. Dwyer, 2003. Employing an innovation strategy in racially diverse workforces: Effects on firm performance. Group and Organization Management 28:107-126.

6 O.C. Richard, B.P.S. Murthi, and K. Ismail, 2007. The impact of racial diversity on intermediate and long-term performance: The moderating role of environmental context. Strategic Management Journal 28:1213-1233.

Capabilities

A core capability is defined as an area of sustained investment by the organization. It is frequently measured by comparison with peer organizations. Once core capabilities are identified, typically as part of any strategic planning exercise, the maintenance of those capabilities will be one essential component of an assessment of an organization's strategy. The development of new capabilities, when applicable to an organization, is also part of a preparedness assessment.

Research Infrastructure

The quality of the facilities and equipment, including buildings as well as capital equipment, is also an important part of any assessment of quality, since the lack of this infrastructure can stand in the way of even the highest-quality workforce. Benchmarking is an effective method here. Assessment of the plans for future infrastructure, including the physical plant, capital equipment, and other factors, is part of the assessment of preparedness. Assessing the current state of infrastructure is best done as part of an overall assessment of near-term quality; the planned future state of the infrastructure is properly assessed as part of preparedness for long-term quality.

It is important that an evaluation of the quality of research management be done in the context of the possible, considering whether management is doing the best that it can with the hand it has been dealt (and is making its best case for a better hand). For example, an assessment of the facility infrastructure necessarily reflects budget realities: if there are strong budget constraints, it is appropriate to ask whether management is investing in the best combination of new and upgraded facilities. If the assessment is also intended for an audience that includes budgeting authorities, it is appropriate to assess the potential value of additional infrastructure funding.

Technical facilities encompass the physical space of the organization and how it is occupied. Does the workforce have what it needs to carry out the research program as identified or planned? An evaluation of facilities independent of a clear understanding of program content is just as imprudent as an evaluation of program content without recognition of appropriate facilities needs.

Metrics for the quality of the research infrastructure are not readily generalized, but it is important that the assessment panels be presented data to assist their evaluation and not merely asked to "eyeball" a site while looking at posters of technical accomplishments. Ultimately, a good assessment will help management to judge whether the quality, cost, and capability of facilities are matched to the technical need. For example, space in an old facility may be acceptable, as long as it is at least adequate for optimum equipment utilization. Some of the required elements are adequate power and appropriate power backup, and environmental control (e.g., temperature, pressure, vibration) that is consistent with the requirements of the equipment needed.

How much equipment the organization has is another metric that may be examined. It is not possible to specify a generalizable fraction of available funds that should be allocated for infrastructure maintenance and upgrade. To determine this requires the evaluation of context, including available funds and R&D priorities. "Home-built" equipment is a potentially useful measure of creativity that can be properly identified and documented for the assessment team.

The successful capture of material intellectual property from custom efforts is a possible indicator of quality.

HOW TO ASSESS QUALITY

An effective assessment includes appropriate qualitative and quantitative measures for all aspects of the organizational research activities. For an effective assessment, research activities will not be viewed in isolation, but as part of the entire research portfolio of the organization, and in the context of the priorities within that research portfolio. An assessment will often include the utilization of panels of domain expertise, but the use of such domain expertise is not sufficient to ensure that the assessment is appropriate and adequate.

R&D organizations are frequently organized according to scientific or technical disciplines, and when their projects and programs do not involve multidisciplinary collaborations, they are amenable to being assessed by discipline-oriented panels of peers. This was the historical precedent created by the National Research Council (NRC) in concert with the National Bureau of Standards (NBS; now the National Institute of Standards and Technology [NIST]). This same process was adopted by the NRC as it established an assessment process for the Army Research Laboratory (ARL) and other federal organizations. This type of process also describes the assessments of the Air Force Research Laboratory (AFRL) made by its external Scientific Advisory Board and is similar to that used by many other organizations.

However, as technology and related programs become more complex, one increasingly finds a matrix-like organizational structure developing in organizations. Even if formal management and personnel policy do not reflect this matrix, technical programs crossing disciplinary organizational lines do. Such cross-organizational programs will also be reviewed in a comprehensive assessment. The process and frequency of cross-organizational program assessments will differ in most respects from one another, and an effective assessment will not be found in a process focused exclusively on visits to discipline-organized subsets of an organization. Assessment of a cross-disciplinary program may benefit significantly from establishment of a peer group drawn from a cadre of experts already fully engaged in disciplinary assessment of relevant programs.

Appendix I presents examples of recent cross-organizational assessments that may clarify some of the issues faced in such assessments. The examples presented in Appendix I are recent reviews of the manufacturing-related programs at NIST and the autonomous systems program at ARL, each of which involved projects whose participants were drawn from multiple laboratories.

The Role of Peer Review

Peer review, coupled with quantitative and qualitative metrics (see the section below), offers an opportunity to gain a better understanding of and then to assess an R&D organization. A well-selected team of experts can produce valuable insights with respect to the overall quality of the R&D strategy and its execution within an organization. Detailed critiques and insightful suggestions from experts permit checks and balances among contrasting points of view. Including individuals with expertise in emerging areas at the margin of the main focus of the assessment can help to identify new or missed opportunities. Peer review is an accepted process that is understood by all, and it provides scientific accountability. It can also identify links to the external scientific and technical community and to relevant R&D performed elsewhere. Peer review can also provide advice on decision making, particularly with regard to resource allocation for the array of R&D directions that an organization may be considering.

External reviewers bring to an assessment their individual perspectives, which may constitute biases or conflicts of interest. There is generally more credibility for independent external assessments. A fully independent assessment will be arranged by and managed by an independent contractor. It is essential that candidates for membership on assessment panels be required to present any biases or potential conflicts of interest to the contractor so that the appointment decision can take such potential conflicts into consideration. In any case, external peer review is essential for evaluating any R&D organization. Peer review is the most valuable and credible best practice an organization can employ to assess its quality and, in many cases, its impact.7,8,9,10,11

Appendix J summarizes assessment processes at NIST, ARL, and Sandia National Laboratories. Appendix K summarizes assessment processes at some other government laboratories: within the DOD (Army, Air Force, and Navy); the National Institutes of Health; and the Department of Energy. Each of the processes described in Appendixes J and K involves peer review of technical quality.

Metrics

Before considering the best metrics to use and how to use them, it is necessary to determine what is to be measured and why.12,13 A report of the National Research Council committee that examined metrics for the U.S. Environmental Protection Agency suggested that metrics must be meaningful to the recipients of the results of the assessment, simple and understandable, integrated into an overall assessment, aligned with the goals of the organization, both process- and outcome-oriented, accurate, consistent, cost-effective, and timely.14 In evaluating metrics for R&D, it is important to avoid simply counting (i.e., numbers of papers, patents, etc.) and instead to identify how the measures will be mapped to decisions that are expected to rely on the results of the assessment. It is also helpful to identify and distinguish between leading and lagging indicators among metrics.

Investments in R&D may provide for a number of outputs such as those identified in the following list (a detailed discussion of possible metrics is also provided by Geisler15):

• Publications, patents, reports, and the citations garnered;
• Technical assistance provided to end users, customers, and stakeholders;
• Invited presentations (e.g., at conferences and workshops);
• Training and mentoring of personnel;
• New and improved products, materials, and processes;
• Patents leading to new products;
• Development of test and evaluation protocols, codes, and standards;
• Technology transfer;
• Maintained competencies;
• New competencies;
• Cost savings (e.g., in materials or processes);
• Increased productivity;
• Safety practices and culture;
• Effectiveness of management structure and strategy; and
• Recognition of the R&D organization as best, among the best, or unique.

____________________________

7 National Research Council, 1995. On Peer Review in NASA Life Sciences Programs. National Academy Press, Washington, D.C.

8 National Research Council, 1998. Peer Review in Environmental Technology Development Programs. National Academy Press, Washington, D.C.

9 National Research Council, 1999. Evaluating Federal Research Programs: Research and the Government Performance and Results Act. National Academy Press, Washington, D.C.

10 National Research Council, 2007. Assessment of the Results of External Independent Reviews for U.S. Department of Energy Projects. The National Academies Press, Washington, D.C.

11 U.S. Office of Science and Technology Policy (OSTP), 1983. Report of the White House Science Council ("Packard Report"). OSTP Report Number NP-3902794. OSTP, Washington, D.C.

12 National Research Council, 2005. Thinking Strategically: The Appropriate Use of Metrics for the Climate Change Science Program. The National Academies Press, Washington, D.C.

13 R. Behn, 2003. Why measure performance? Different purposes require different measures. Public Administration Review 63(5).

14 National Research Council, 2003. The Measure of STAR: Review of the U.S. Environmental Protection Agency's Science to Achieve Results (STAR) Research Grants Program. The National Academies Press, Washington, D.C.

15 E. Geisler, 2000. The Metrics of Science and Technology. Quorum Books, Westport, Conn.

Bibliometrics

Bibliometrics are methods to quantitatively analyze scientific and technological literature. Bibliometrics can be used in the evaluation of individuals, groups, or institutions as a whole. The collection of the data and its analysis are straightforward. Bibliometric measures allow for the quantitative assessment of R&D outputs by simple counts of papers, citations, and patents. This process allows for clear assessments of core journals and their relative impact, including their journal impact factors. Citation analysis (Science Citation Index) can be used to help determine the role that individual scientists, their groups, and their institutions have in the evolution of new ideas and their technological development. With proper analysis, bibliometrics can identify trends and emerging concepts in science and technology even across diverse scientific disciplines. Since bibliometrics are accepted by the general scientific community, they can be used as a reasonable representation of the outputs of a research organization.

In using bibliometrics, it is important to recognize that the analysis does not include all published articles and that other types of written outputs such as technical reports are not covered. Also, they do not account for work in progress.
Citations are not a measure of quality; a high citation count may simply be a measure of those papers that are in concert with others and that are not truly important. It is also hard to validate cross-disciplinary research in publications, owing to different structures and procedures for different disciplines. New, evolving, and mature areas all have different publication and citation rates. There is no standard for the validation of counts of papers and citations as they relate to quality. For example, much can be written about a mistake. Another factor to consider is that bias can be an issue in assessing the merit of paper and publication counts. For example, for papers that discuss research near the boundaries of their discipline, publication and citation in top journals may be more difficult to achieve. Also, bibliometric databases may not adequately cover conference proceedings, which are increasingly important in some fields such as computer science. Given the caveats above, absolute bibliometrics can be less than useful. Assessing the trend in such metrics can, however, provide a lagging indicator of research quality.

The visibility and global impact of research investments may also be measured by means of the number of presentations given at meetings both internal and external to an organization. Of particular significance is the number of invited presentations. This measure has value in the assessment of both the organization and individual effort. Although presentations internal to an organization are not searchable through bibliometric databases, those given at symposia, workshops, and conferences are available in databases. As with the other bibliometric data described above, it is important to consider this metric in context and to avoid its use in isolation.

Patents

Patents are another measure of the outcome of investments, and they may be viewed as a measure of the potential market applications resulting from R&D. Patents are also legal documents and may be viewed both as a measure of scientific productivity and as a measure of intermediate outputs. Patents are a measure applied across the science and technology organization. Because the format of patents is uniform, comparisons can be made across diverse research organizations and even between countries. Because patents contain information on inventions that have resulted from science and technology activity, it is possible to reconstruct levels of investment. Science and technology can lead to a high quantity and quality of patents, which may correlate with an improved knowledge base, improved pool of skills, the protection of promising ideas, new products, and improved innovation activity.

Data related to patents are easily quantifiable; patent databases are large and easily searched, and the data can be manipulated and readily cross-correlated. Also, the citations are readily available. Patents provide important information about the actual work and level of effort that led to the patent. The potential impact of a patent suggests that it can be considered a reasonable measure of output. Successful patents are generally considered to be a measure of technological performance. As such, they provide indicators of the knowledge base and the quality of the research that led to them.

In making assessments based on patents, there are a number of points to consider. Whether or not to file a patent is decided by the individual organization, and there are marked differences in the propensity of organizations to file for patents; a more meaningful metric, particularly for long-term assessment, is the quantity and quality of licensed patents. Also, not all of the research investment of an organization will lead to a patent, and thus patents alone, licensed or not, cannot be used as a stand-alone metric of quality. Further, there is not a one-to-one correspondence between a patent and a product, and such correlations are not easily derived from a database. Although there may not be a clear method for modeling patents and their relationship to R&D performance, the patent metric can be a useful tool when considered in the proper context.

Cultural Metrics

Some metrics associated with the quality of an R&D effort relate to the culture of the organization. Some factors that may allow an assessment of the cultural environment include the importance placed on the training and mentoring of personnel and the commitment to safety practices and culture. A healthy organization requires buy-in and commitment to the importance of these elements at all levels, from the most senior manager to the individual executing a given program.16 Some elements of the assessment of cultural factors will be quantitative (for example, numbers of courses or degree programs associated with training, safety initiatives, accidents, and other indicators relating to safety), although the following important metric is largely qualitative in nature: Does the cultural environment provide meaningful support?

Other considerations related to organizational culture may include:

• To what extent are organizational members clear about organizational policies and processes, explicit and implicit?
• To what extent do organizational members agree that "this is a great place to work"?
• To what extent are organizational members treated with respect and dignity?
• Are differences among people respected and encouraged, or is the expectation one of bias and prejudice?
• Is conflict surfaced and managed, or is it avoided?

As is readily gleaned from the above discussion, indicators may be either quantitative or qualitative in nature and may be found in various data sources, including employee surveys. For instance, quantitative measures include number of publications and presentations, citations, and new products and processes. Return on investment and performance outputs can be important metrics. Other outputs not as readily associated with quantitative measures might also have significance. Examples include impact on customer satisfaction, contributions to the pool of innovations, global recognition, the effectiveness of organizational leadership, communication among various entities within the organization and with relevant stakeholders, and the ability to transition research from invention/innovation to later stages of development.

Appendix L provides a set of assessment metrics and criteria applied by NRC panels that review the ARL. This set of metrics and criteria is not presented as a prescription, but, rather, as an example of a tailored set developed to meet the perceived assessment needs of one organization. The assessment items identified fell into the following categories: relevance to the wider scientific and technical community, impact to customers, formulation of the goals and plans for projects, methodology applied to the research and development activities, adequacy of supporting capabilities and resources, and responsiveness to the findings from previous assessments.

Benchmarking

One commonly used assessment approach is to compare one R&D organization with one or more others judged to be at the top level of performance. This is usually done with metrics that are normalized to account for size differences. Thus one may cite the number of archival publications for each technical professional. Using percentages accounts for size differences, e.g., the percentage of doctorates among the professional population. It is important that comparisons made are among R&D organizations operating in similar contexts.17 For example, comparing an engineering research organization with an academic department provides little meaningful information, because the two operate in different contexts. A problem with benchmarking with metrics is that such assessments do not reveal the effectiveness of the organizations. A first-class organization may reside in a parent that fails to capitalize on the organization's breakthroughs. Nonetheless, benchmarking can be a useful addition to the assessment tool kit.

____________________________

16 B. Jaruzelski, J. Loehr, and R. Holman, 2011. Why Culture Is Key. Booz & Company, New York, N.Y.

17 National Research Council, 2000. Experiments in International Benchmarking of U.S. Research Fields. National Academy Press, Washington, D.C.

SUMMARY OF FINDINGS

Different aspects of the assessment of technical quality are suited to appropriate quantitative and qualitative metrics. The quality of the research activities is best viewed in the context of the entire portfolio of the organization, including suitable assessment of cross-organizational programs. Peer review, coupled with quantitative and qualitative metrics, is a critical part of an effective assessment of the R&D organization. Quantitative metrics can play a key role in assessment, but it is important to determine first what will be measured and why, and to avoid counting the numbers without a good rationale. Typical bibliometric measures are publications and presentations. Patents are another quantifiable metric. For each of these, the total number is not nearly as important as the quality and the contribution (and commonly accepted measures such as number of publication citations can be a poor surrogate for quality). Other indicators, generally qualitative, are associated with the culture of the organization; these include training and mentoring, safety initiatives, and the effectiveness of management. Quantitative metrics can usually be associated with such additional aspects as return on investment, the development of new products and processes, and internal productivity and/or cost savings. Other indicators that are more qualitative include technical assistance, customer satisfaction, communication (both internally and with stakeholders), and global recognition (including benchmarking). The transitioning of research into products has both quantitative and qualitative aspects.
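
As an illustration of the size-normalized metrics described in the Benchmarking section (archival publications for each technical professional, and the percentage of doctorates among the professional population), the arithmetic might be sketched as follows. This is a minimal sketch and not a method prescribed by the report: the class name, field names, organization names, and all figures are hypothetical, and the two organizations are simply assumed to operate in similar contexts.

```python
from dataclasses import dataclass


@dataclass
class OrgData:
    """Hypothetical raw counts for one R&D organization (illustrative only)."""
    name: str
    technical_professionals: int
    doctorates: int
    archival_publications: int


def normalized_metrics(org: OrgData) -> dict:
    """Return size-normalized metrics so organizations of different sizes can be compared."""
    if org.technical_professionals <= 0:
        raise ValueError("technical_professionals must be positive")
    return {
        # Archival publications for each technical professional.
        "publications_per_professional": org.archival_publications / org.technical_professionals,
        # Percentage of doctorates among the professional population.
        "percent_doctorates": 100.0 * org.doctorates / org.technical_professionals,
    }


# Made-up figures for two organizations assumed to operate in similar contexts.
orgs = [
    OrgData("Lab A", technical_professionals=200, doctorates=120, archival_publications=300),
    OrgData("Lab B", technical_professionals=50, doctorates=35, archival_publications=90),
]

for org in orgs:
    m = normalized_metrics(org)
    print(f"{org.name}: {m['publications_per_professional']:.2f} publications per professional, "
          f"{m['percent_doctorates']:.0f}% doctorates")
```

In this made-up case the smaller organization has fewer publications in absolute terms but a higher per-professional rate, which is the kind of size effect that normalization is meant to remove. As the text cautions, such figures say nothing about the effectiveness of either organization and are best read alongside peer review.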