4
Assessing Technical Quality
It is important to consider such factors, discussed above, as type of R&D organization,
type of assessment, and timescale of the assessment, so that the assessment of technical quality
can be appropriately tailored to achieve its purpose for the given organization.
CHARACTERISTICS OF QUALITY
Quality is the characteristic of an R&D organization best suited to quantitative
assessment and metrics. This characteristic encompasses both the workforce and the work being
performed, as well as the adequacy of resources provided. Because no single statistic captures
quality, it is important to define an appropriate set of tools that measure both the quality of the
workforce and the quality of the output of the research. The aspects of quality listed here interact
with and are also part of the assessment of the management practices, discussed in Chapter 3.
Quality of Personnel
The importance of the quality and expertise of people in an organization cannot be
overemphasized.1 Many organizations have determined that the quality of the workforce is the
most reliable predictor of future R&D performance, independent of mission drivers or impact.
An assessment of the quality of the workforce is a fundamental best practice common to almost
all assessments. This assessment can be both quantitative and qualitative, and benchmarking can
be useful. A productive skill-set balance of the workforce will match the mission of the
organization and be of an appropriate quality. It is advisable to evaluate management's plans for
recruiting the type of person most suited to the job.
Of special significance for many R&D organizations is the presence in any particular
division, laboratory, or project of people with deep and creative technical capabilities.2 It is
important to assess the organization's policies and actions aimed at steadily building on the sets
of capabilities associated with individuals possessing both breadth of experience across multiple
projects and depth in one or more systems and disciplines. An effective organization enables its
staff to capture new skills required for a given set of tasks at hand, while over the long term
building the network required to make team members effective participants in global efforts to
achieve the overall goals of the organization. To facilitate the creation of such capabilities, a
diversity of personnel and work experience is vital. An effective assessment of an organization
includes consideration of a diverse workforce whose contributions may affect and advance the
R&D mission of the organization.3,4,5,6
1 J. Lyons, 2012. Reflections on Over Fifty Years in Research and Development: Some Lessons Learned. National
Defense University, Washington, D.C.
2 Y. Saad, 2012. "Review of IBM's Technical Review Path." Presentation to the National Research Council's
Panel for Review of Best Practices in Assessment of Research and Development Organizations, March 20,
Washington, D.C.
The success of a strategy that builds and strengthens this workforce can be assessed by
examining movement of personnel both within and outside a given organization. Over time this
strategy will provide for a diversity of experience with the broadest possible base of professional
interactions. Thus, a key element of any meaningful assessment will be the agility with which an
organization acts, at all stages of a program, to train its team and to provide a high level of
exposure to diverse programs and organizations working with related or overlapping technical
agendas and expertise.
At the heart of looking forward to the next generation of scientific and technological
opportunities are the organization's scientists and engineers. Their knowledge of cutting-edge
research will be an essential starting point in all such forward-looking efforts. To be effective,
they will have flexibility to attend scientific and technical professional meetings and to
participate in the international community of scholars. They will be encouraged to think about
next-generation efforts, and they will be rewarded for that effort by some strategy that brings
resources to bear on the most promising ideas and allows some to pursue high-risk, high-payoff
efforts. More difficult but equally important is a parallel peer-review process that continually
evaluates such highly speculative programs and aids in determining when available data indicate
a clear likelihood of failure and suggest reallocation of assets.
Organization directors' discretionary funding, internal allocations for basic research, and
outside ("other agency") funding sources are all key factors that enable the process of fostering a
high-quality technical staff. This approach, enriched by interaction with academia, has continued
to yield dividends in new and unanticipated discoveries, often without guidance by a
well-defined and planned timetable for discovery.
Quality of R&D
The desired outputs of the R&D organization differ by type of organization, but they can
typically be represented by measurable quantities such as publications (and their quality),
patents, copyrights, and peer awards. The absolute value may not be as important as the trends in
such quantities. An effort to benchmark these metrics across similar organizations can make the
absolute value more meaningful.
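As an illustration of examining the trend in an output measure rather than its absolute value, the following is a minimal Python sketch that fits a least-squares slope to yearly counts. The function name `linear_trend` and the publication counts are hypothetical, chosen only for illustration; they are not data from any assessed organization, and a real assessment would weigh such a trend alongside the qualitative factors discussed in this chapter.

```python
def linear_trend(years, values):
    """Return the least-squares slope of values against years.

    The result is expressed in units of the metric per year. This is an
    illustrative helper, not a method prescribed by the report.
    """
    if len(years) != len(values) or len(years) < 2:
        raise ValueError("need at least two (year, value) pairs of equal length")
    year_mean = sum(years) / len(years)
    value_mean = sum(values) / len(values)
    numerator = sum((y - year_mean) * (v - value_mean)
                    for y, v in zip(years, values))
    denominator = sum((y - year_mean) ** 2 for y in years)
    if denominator == 0:
        raise ValueError("years must not all be identical")
    return numerator / denominator


# Hypothetical yearly publication counts (assumed values for illustration only).
years = [2008, 2009, 2010, 2011, 2012]
publications = [40, 44, 43, 50, 53]

# Positive slope: the count is rising over the period, whatever its absolute level.
slope = linear_trend(years, publications)
print(f"trend: {slope:.1f} publications per year")
```

Computing the same slope for peer organizations would be one simple way to place the trend in a benchmarking context.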
Preparedness
Preparedness is defined as the actions taken by the organization to identify and maintain
the resources and strategies necessary to respond flexibly to future challenges.
3 C. Herring, 2009. Does diversity pay? Race, gender, and the business case for diversity. American Sociological
Review 74:208-224.
4 O.C. Richard, 2000. Racial diversity, business strategy, and firm performance: A resource-based view. Academy
of Management Journal 43:164-177.
5 O. Richard, A. McMillan, K. Chadwick, and S. Dwyer, 2003. Employing an innovation strategy in racially diverse
workforces: Effects on firm performance. Group and Organization Management 28:107-126.
6 O.C. Richard, B.P.S. Murthi, and K. Ismail, 2007. The impact of racial diversity on intermediate and long-term
performance: The moderating role of environmental context. Strategic Management Journal 28:1213-1233.
Capabilities
A core capability is defined as an area of sustained investment by the organization. It is
frequently measured by comparison with peer organizations. Once core capabilities are
identified, typically as part of any strategic planning exercise, the maintenance of those
capabilities will be one essential component of an assessment of an organization's strategy. The
development of new capabilities, when applicable to an organization, is also part of a
preparedness assessment.
Research Infrastructure
The quality of the facilities and equipment, including buildings as well as capital
equipment, is also an important part of any assessment of quality, since the lack of this
infrastructure can stand in the way of even the highest-quality workforce. Benchmarking is an
effective method here. Assessment of the plans for future infrastructure, including the physical
plant, capital equipment, and other factors, is part of the assessment of preparedness. Assessing
the current state of infrastructure is best done as part of an overall assessment of near-term
quality; the planned future state of the infrastructure is properly assessed as part of preparedness
for long-term quality.
It is important that an evaluation of the quality of research management be done in the
context of the possible, considering whether management is doing the best that it can with the
hand it has been dealt (and is making its best case for a better hand). For example, an assessment
of the facility infrastructure necessarily reflects budget realities--if there are strong budget
constraints, it is appropriate to ask whether management is investing in the best combination of
new and upgraded facilities. If the assessment is also intended for an audience that includes
budgeting authorities, it is appropriate to assess the potential value of additional infrastructure
funding.
Technical facilities encompass the physical space of the organization and how it is
occupied. Does the workforce have what it needs to carry out the research program as identified
or planned? An evaluation of facilities independent of a clear understanding of program content
is just as imprudent as an evaluation of program content without recognition of appropriate
facilities needs. Metrics for the quality of the research infrastructure are not readily generalized,
but it is important that the assessment panels be presented data to assist their evaluation and not
merely asked to "eyeball" a site while looking at posters of technical accomplishments.
Ultimately, a good assessment will help management to judge whether the quality, cost,
and capability of facilities are matched to the technical need. For example, space in an old
facility may be acceptable, as long as it is at least adequate for optimum equipment utilization.
Some of the required elements are adequate power and appropriate power backup, and
environmental control (e.g., temperature, pressure, vibration) that is consistent with the
requirements of the equipment needed.
How much equipment the organization has is another metric that may be examined. It is
not possible to specify a generalizable fraction of available funds that should be allocated for
infrastructure maintenance and upgrade. To determine this requires the evaluation of context,
including available funds and R&D priorities. "Home-built" equipment is a potentially useful
measure of creativity that can be properly identified and documented for the assessment team.
The successful capture of material intellectual property from custom efforts is a possible
indicator of quality.
HOW TO ASSESS QUALITY
An effective assessment includes appropriate qualitative and quantitative measures for all
aspects of the organizational research activities. For an effective assessment, research activities
will not be viewed in isolation, but as part of the entire research portfolio of the organization, and
in the context of the priorities within that research portfolio. An assessment will often use
panels of domain experts, but the use of such domain expertise alone is not sufficient to
ensure that the assessment is appropriate and adequate.
R&D organizations are frequently organized according to scientific or technical
disciplines, and when their projects and programs do not involve multidisciplinary
collaborations, they are amenable to being assessed by discipline-oriented panels of peers. This
was the historical precedent created by the National Research Council (NRC) in concert with the
National Bureau of Standards (NBS; now the National Institute of Standards and Technology
[NIST]). This same process was adopted by the NRC as it established an assessment process for
the Army Research Laboratory (ARL) and other federal organizations. This type of process also
describes the assessments of the Air Force Research Laboratory (AFRL) made by its external
Scientific Advisory Board and is similar to that used by many other organizations. However, as
technology and related programs become more complex, one increasingly finds a matrix-like
organizational structure developing in organizations. Even if formal management and personnel
policy do not reflect this matrix, technical programs crossing disciplinary organizational lines do.
Such cross-organizational programs will also be reviewed in a comprehensive
assessment. The processes and frequencies of cross-organizational program assessments will
differ from one another in most respects, and an effective assessment will not result from a process
focused exclusively on visits to discipline-organized subsets of an organization. Assessment of a
cross-disciplinary program may benefit significantly from establishment of a peer group drawn
from a cadre of experts already fully engaged in disciplinary assessment of relevant programs.
Appendix I presents examples of recent cross-organizational assessments that may clarify some
of the issues faced in such assessments. The examples presented in Appendix I are recent reviews
of the manufacturing-related programs at NIST and the autonomous systems program at ARL--
each of which involved projects whose participants were drawn from multiple laboratories.
The Role of Peer Review
Peer review, coupled with quantitative and qualitative metrics (see the section below),
offers an opportunity to gain a better understanding of and then to assess an R&D organization.
A well-selected team of experts can produce valuable insights with respect to the overall quality
of the R&D strategy and its execution within an organization. Detailed critiques and insightful
suggestions from experts permit checks and balances among contrasting points of view.
Including individuals with expertise in emerging areas at the margin of the main focus of the
assessment can help to identify new or missed opportunities.
Peer review is an accepted process that is understood by all, and it provides scientific
accountability. It can also identify links to the external scientific and technical community and to
relevant R&D performed elsewhere. Peer review can also provide advice on decision making,
particularly with regard to resource allocation for the array of R&D directions that an
organization may be considering.
External reviewers bring to an assessment their individual perspectives, which may
constitute biases or conflicts of interest. Independent external assessments generally carry more
credibility. A fully independent assessment will be arranged and managed by an
independent contractor. It is essential that candidates for membership on assessment panels be
required to present any biases or potential conflicts of interest to the contractor so that the
appointment decision can take such potential conflicts into consideration.
In any case, external peer review is essential for evaluating any R&D organization. Peer
review is the most valuable and credible best practice an organization can employ to assess its
quality and, in many cases, its impact.7,8,9,10,11
Appendix J summarizes assessment processes at NIST, ARL, and Sandia National
Laboratories. Appendix K summarizes assessment processes at some other government
laboratories: within the DOD (Army, Air Force, and Navy); the National Institutes of Health; and
the Department of Energy. Each of the processes described in Appendixes J and K involves peer
review of technical quality.
Metrics
Before considering the best metrics to use and how to use them, it is necessary to
determine what is to be measured and why.12,13 A report of the National Research Council
committee that examined metrics for the U.S. Environmental Protection Agency suggested that
metrics must be meaningful to the recipients of the results of the assessment, simple and
understandable, integrated into an overall assessment, aligned with the goals of the organization,
both process- and outcome-oriented, accurate, consistent, cost-effective, and timely.14 In
evaluating metrics for R&D, it is important to avoid simply counting (i.e., numbers of papers,
patents, etc.) and instead to identify how the measures will be mapped to decisions that are
expected to rely on the results of the assessment. It is also helpful to identify and distinguish
between leading and lagging indicators among metrics. Investments in R&D may provide for a
number of outputs such as those identified in the following list (a detailed discussion of possible
metrics is also provided by Geisler15):
Publications, patents, reports, and the citations garnered;
Technical assistance provided to end users, customers, and stakeholders;
Invited presentations (e.g., at conferences and workshops);
Training and mentoring of personnel;
New and improved products, materials, and processes;
Patents leading to new products;
Development of test and evaluation protocols, codes, and standards;
Technology transfer;
Maintained competencies;
New competencies;
Cost savings (e.g., in materials or processes);
Increased productivity;
Safety practices and culture;
Effectiveness of management structure and strategy; and
Recognition of the R&D organization as best, among the best, or unique.
7 National Research Council, 1995. On Peer Review in NASA Life Sciences Programs. National Academy Press,
Washington, D.C.
8 National Research Council, 1998. Peer Review in Environmental Technology Development Programs. National
Academy Press, Washington, D.C.
9 National Research Council, 1999. Evaluating Federal Research Programs: Research and the Government
Performance and Results Act. National Academy Press, Washington, D.C.
10 National Research Council, 2007. Assessment of the Results of External Independent Reviews for U.S.
Department of Energy Projects. The National Academies Press, Washington, D.C.
11 U.S. Office of Science and Technology Policy (OSTP), 1983. Report of the White House Science Council
("Packard Report"). OSTP Report Number NP-3902794. OSTP, Washington, D.C.
12 The National Research Council, 2005. Thinking Strategically: The Appropriate Use of Metrics for the Climate
Change Science Program. The National Academies Press, Washington, D.C.
13 R. Behn, 2003. Why measure performance? Different purposes require different measures. Public Administration
Review 63(5).
14 National Research Council, 2003. The Measure of STAR: Review of the U.S. Environmental Protection Agency's
Science to Achieve Results (STAR) Research Grants Program. The National Academies Press, Washington, D.C.
15 E. Geisler, 2000. The Metrics of Science and Technology. Quorum Books, Westport, Conn.
Bibliometrics
Bibliometrics are methods to quantitatively analyze scientific and technological literature.
Bibliometrics can be used in the evaluation of individuals, groups, or institutions as a whole. The
collection of the data and its analysis are straightforward. Bibliometric measures allow for the
quantitative assessment of R&D outputs by simple counts of papers, citations, and patents. This
process allows for clear assessments of core journals and their relative impact, including their
journal impact factors. Citation analysis (Science Citation Index) can be used to help determine
the role that individual scientists, their groups, and their institutions have in the evolution of new
ideas and their technological development. With proper analysis, bibliometrics can identify
trends and emerging concepts in science and technology even across diverse scientific
disciplines. Since bibliometrics are accepted by the general scientific community, they can be
used as a reasonable representation of the outputs of a research organization.
In using bibliometrics, it is important to recognize that the analysis does not include all
published articles and that other types of written outputs such as technical reports are not
covered. Also, they do not account for work in progress. Citations are not a measure of quality;
a high citation count may simply be a measure of those papers that are in concert with others and
that are not truly important. It is also hard to validate cross-disciplinary research in publications,
owing to different structures and procedures for different disciplines. New, evolving, and mature
areas all have different publication and citation rates. There is no standard for the validation of
counts of papers and citations as they relate to quality. For example, much can be written about
a mistake.
Another factor to consider is that bias can be an issue in assessing the merit of paper and
publication counts. For example, for papers that discuss research near the boundaries of their
discipline, publication and citation in top journals may be more difficult to achieve. Also,
bibliometric databases may not adequately cover conference proceedings, which are increasingly
important in some fields such as computer science. Given the caveats above, absolute
bibliometrics can be less than useful. Assessing the trend in such metrics can, however, provide
a lagging indicator of research quality.
The visibility and global impact of research investments may also be measured by means
of the number of presentations given at meetings both internal and external to an organization.
Of particular significance is the number of invited presentations. This measure has value in the
assessment of both the organization and individual effort. Although presentations internal to an
organization are not searchable through bibliometric databases, those given at symposia,
workshops, and conferences are available in databases. As with the other bibliometric data
described above, it is important to consider this metric in context and to avoid its use in isolation.
Patents
Patents are another measure of the outcome of investments, and they may be viewed as a
measure of the potential market applications resulting from R&D. Patents are also legal
documents and may be viewed both as a measure of scientific productivity and as a measure of
intermediate outputs. Patents are a measure applied across the science and technology
organization. Because the format of patents is uniform, comparisons can be made across diverse
research organizations and even between countries. Because patents contain information on
inventions that have resulted from science and technology activity, it is possible to reconstruct
levels of investment. Science and technology can lead to a high quantity and quality of patents,
which may correlate with an improved knowledge base, improved pool of skills, the protection
of promising ideas, new products, and improved innovation activity.
Data related to patents are easily quantifiable; patent databases are large and easily
searched, and the data can be manipulated and readily cross-correlated. Also, the citations are
readily available. Patents provide important information about the actual work and level of effort
that led to the patent. The potential impact of a patent suggests that it can be considered a
reasonable measure of output. Successful patents are generally considered to be a measure of
technological performance. As such, they provide indicators of the knowledge base and the
quality of the research that led to them.
In making assessments based on patents, there are a number of points to consider.
Whether or not to file a patent is decided by the individual organization, and there are marked
differences in the propensity of organizations to file for patents; a more meaningful metric,
particularly for long-term assessment, is the quantity and quality of licensed patents. Also, not all
of the research investment of an organization will lead to a patent, and thus patents alone,
licensed or not, cannot be used as a stand-alone metric of quality. Further, there is not a one-to-
one correspondence between a patent and a product, and such correlations are not easily derived
from a database. Although there may not be a clear method for modeling patents and their
relationship to R&D performance, the patent metric can be a useful tool when considered in the
proper context.
Cultural Metrics
Some metrics associated with the quality of an R&D effort relate to the culture of the
organization. Some factors that may allow an assessment of the cultural environment include the
importance placed on the training and mentoring of personnel and the commitment to safety
practices and culture. A healthy organization requires buy-in and commitment to the importance
of these elements at all levels, from the most senior manager to the individual executing a given
program.16 Some elements of the assessment of cultural factors will be quantitative--for
example, numbers of courses or degree programs associated with training, safety initiatives,
accidents, and other indicators relating to safety--although the following important metric is
largely qualitative in nature: Does the cultural environment provide meaningful support?
Other considerations related to organizational culture may include: To what extent are
organizational members clear about organizational policies and processes, explicit and implicit?
To what extent do organizational members agree that "this is a great place to work"? To what
extent are organizational members treated with respect and dignity? Are differences among
people respected and encouraged, or is the expectation one of bias and prejudice? Is conflict
surfaced and managed, or is it avoided?
As is readily gleaned from the above discussion, indicators may be either quantitative or
qualitative in nature and may be found in various data sources, including employee surveys. For
instance, quantitative measures include number of publications and presentations, citations, and
new products and processes. Return on investment and performance outputs can be important
metrics.
Other outputs not as readily associated with quantitative measures might also have
significance. Examples include impact on customer satisfaction, contributions to the pool of
innovations, global recognition, the effectiveness of organizational leadership, communication
among various entities within the organization and with relevant stakeholders, and the ability to
transition research from invention/innovation to later stages of development.
Appendix L provides a set of assessment metrics and criteria applied by NRC panels that
review the ARL. This set of metrics and criteria is not presented as a prescription, but, rather, as
an example of a tailored set developed to meet the perceived assessment needs of one
organization. The assessment items identified fell into the following categories: relevance to the
wider scientific and technical community, impact on customers, formulation of the goals and
plans for projects, methodology applied to the research and development activities, adequacy of
supporting capabilities and resources, and responsiveness to the findings from previous
assessments.
Benchmarking
One commonly used assessment approach is to compare one R&D organization with one
or more others judged to be at the top level of performance. This is usually done with metrics
that are normalized to account for size differences. For example, one may cite the number of
archival publications per technical professional. Percentages, such as the percentage of doctorates
among the professional population, likewise account for size differences. It is important that
comparisons made are among R&D organizations operating in similar contexts.17 For example,
comparing an engineering research organization with an academic department provides little
meaningful information, because the two operate in different contexts. A problem with
benchmarking with metrics is that such assessments do not reveal the effectiveness of the
organizations. A first-class organization may reside in a parent that fails to capitalize on the
organization's breakthroughs. Nonetheless, benchmarking can be a useful addition to the
assessment tool kit.
16 B. Jaruzelski, J. Loehr, and R. Holman, 2011. Why Culture Is Key. Booz & Company, New York, N.Y.
17 National Research Council, 2000. Experiments in International Benchmarking of U.S. Research Fields. National
Academy Press, Washington, D.C.
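The size normalization described in the Benchmarking discussion can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the `OrgSnapshot` record, the `normalized_metrics` function, and the two example laboratories with their figures are all hypothetical. The two normalized quantities are the ones named in the text, archival publications per technical professional and the percentage of doctorates among the professional population.

```python
from dataclasses import dataclass


@dataclass
class OrgSnapshot:
    """Hypothetical record of raw counts for one R&D organization."""
    name: str
    technical_professionals: int
    doctorates: int
    archival_publications: int


def normalized_metrics(org: OrgSnapshot) -> dict:
    """Convert raw counts into size-independent benchmarking metrics."""
    if org.technical_professionals <= 0:
        raise ValueError("technical_professionals must be positive")
    return {
        "publications_per_professional":
            org.archival_publications / org.technical_professionals,
        "percent_doctorates":
            100.0 * org.doctorates / org.technical_professionals,
    }


# Assumed figures for two made-up organizations of very different size.
lab_a = OrgSnapshot("Lab A", technical_professionals=200,
                    doctorates=120, archival_publications=300)
lab_b = OrgSnapshot("Lab B", technical_professionals=50,
                    doctorates=35, archival_publications=90)

for lab in (lab_a, lab_b):
    metrics = normalized_metrics(lab)
    print(lab.name,
          f"{metrics['publications_per_professional']:.2f} pubs/professional,",
          f"{metrics['percent_doctorates']:.0f}% doctorates")
```

In this made-up case the smaller laboratory has fewer publications in absolute terms but more per professional, which is the kind of difference normalization is meant to expose. As the text cautions, such figures are meaningful only between organizations operating in similar contexts, and they do not by themselves reveal effectiveness.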
SUMMARY OF FINDINGS
Different aspects of the assessment of technical quality are suited to appropriate
quantitative and qualitative metrics. The quality of the research activities is best viewed in the
context of the entire portfolio of the organization, including suitable assessment of cross-
organizational programs.
Peer review, coupled with quantitative and qualitative metrics, is a critical part of an
effective assessment of the R&D organization.
Quantitative metrics can play a key role in assessment, but it is important to determine
first what will be measured and why, and to avoid counting the numbers without a good
rationale. Typical bibliometric measures are publications and presentations. Patents are another
quantifiable metric. For each of these, the total number is not nearly as important as the quality
and the contribution (and commonly accepted measures such as number of publication citations
can be a poor surrogate for quality). Other indicators, generally qualitative, are associated with
the culture of the organization; these include training and mentoring, safety initiatives, and the
effectiveness of management.
Quantitative metrics can usually be associated with such additional aspects as return on
investment, the development of new products and processes, and internal productivity and/or cost
savings. Other indicators that are more qualitative include technical assistance, customer
satisfaction, communication (both internally and with stakeholders), and global recognition
(including benchmarking). The transitioning of research into products has both quantitative and
qualitative aspects.