of a particular political or economic situation depends on multiple sources of information, including, but not limited to, the comparative data provided by judgments based on rating scales. With regard to interventions and specific project designs, George Lopez suggested that some may be better designed and evaluated by specialists, such as community and organizational consultants, including community-rights organizers, than by researchers.
Additional concerns voiced by the practitioners at the workshop made evident a possible “two cultures” problem. Many of the practitioners worried about “too much” quantification, wondering just what to do with the many highly detailed schemes put forth in the literature. George Lopez elaborated this concern by noting that even the best indicator systems can quickly become outdated: the scholarly community lags behind real-world events, and many data-gathering categories fail to capture certain events or the pace of change in social and political institutions. On the positive side, although noting the difficulty of comparing, for example, different judicial systems, a number of the practitioners found rating scales useful as points of departure for further work and for focusing attention on major differences between countries and within a country over time. It was generally agreed that minor differences uncovered by the ratings may be attributed to sampling error.
The discussion returned to technical issues of measurement. Featured in this discussion were concerns about the quality of the sources for judgments, the merits of combining several partially correlated indicators into a single aggregated unit, the way that different components of an index are weighted and assessments of the impact of different weighting decisions, and the need to go “beyond the numbers” in order to explain why change did or did not occur. Other methodological issues addressed were the need for analysis of the dimensionality of indicator systems (for example, many indicators often reflect relatively few underlying concepts); the trade-off between validity and reliability, illustrated by the observation that fewer coding categories increase reliability (agreement among coders) at the cost of missing nuances in the phenomenon being measured; and the results of sensitivity analyses showing that alternative weighting systems for components of indexes may make little difference in terms of altering statistical relationships among the indexes.
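The sensitivity-analysis point above can be illustrated with a minimal sketch. The setup is hypothetical: fifty simulated “countries,” three partially correlated components invented for the example, and two arbitrary weighting schemes. When components share a common underlying signal, as index components typically do, the alternative composites end up nearly interchangeable.

```python
import random

random.seed(0)
N_COUNTRIES = 50

# Each simulated country has a latent level of the underlying concept;
# each of three components measures it with independent noise, so the
# components are partially correlated (all values here are invented).
latent = [random.gauss(0, 1) for _ in range(N_COUNTRIES)]
components = [[x + random.gauss(0, 0.5) for x in latent] for _ in range(3)]

def composite(weights):
    """Weighted sum of the component scores for each country."""
    return [sum(w * comp[i] for w, comp in zip(weights, components))
            for i in range(N_COUNTRIES)]

def pearson(x, y):
    """Pearson correlation, written out to keep the sketch dependency-free."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

equal = composite([1 / 3, 1 / 3, 1 / 3])   # equal weights
skewed = composite([0.6, 0.3, 0.1])        # a deliberately lopsided alternative

print(round(pearson(equal, skewed), 3))    # correlation close to 1.0
```

The two composites correlate almost perfectly despite the very different weights, which is the sense in which alternative weighting systems “may make little difference” for statistical relationships among indexes.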
Robert Dahl concluded the discussion with the comment that the researchers who work on measuring democracy do not have all the answers, and there will be inevitable frustrations for practitioners asking for tools that can function effectively as decision aids. Researchers are not prepared to provide such tools at this time, although they can provide useful indicators for ranking countries’ performance and even tracking change. It may well be that exchanges such as those held here over the past two days could contribute to an improved craft. Not only would the indicators be more sensitive to change, but they would be developed in the context of issues raised by the policy community. Over the long run, this is likely to benefit both communities by contributing to theory and to practice.
LESSONS LEARNED
In a final session participants revisited the lessons learned from the previous workshop. Three questions were posed: On the basis of the discussions at this workshop, which lessons are still valid? Which lessons have been changed as a result of these
discussions? What new lessons have been added? Of the seven lessons in the earlier workshop report (National Research Council, 1991a:11–12), the participants agreed that six remain valid:
- If one uses a fairly restricted definition of democracy, similar to Dahl’s concept of polyarchy in its focus on political contestation and political rights, then it is possible to measure this kind of democracy reliably [see Inkeles and Sirowy, 1990].
- A number of different indicator systems are available to measure democracy. At present, the Freedom House rankings are the only set produced annually. It would also be relatively easy for A.I.D. to construct its own system.
- It is possible to add components to the indicator system to capture country- and region-specific concerns, as long as these are related to the fundamental concept being measured. Any such additions would have to be justified and defended, however.
- With indicator systems of this type, it is not necessary to weight the various components of a scale.
- In addition to an overall ranking, any indicator system should provide country profiles, that is, a full listing of how each country ranked on each of the components of the indicator.
…
- Initially, it would be desirable to use more than one indicator system and compare the rankings each generates.
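Two of the lessons above — that components need not be weighted, and that a system should report country profiles alongside an overall ranking — can be sketched concretely. The country names, component names, and scores below are invented for illustration; the components loosely echo the polyarchy-style focus on contestation and rights.

```python
# Hypothetical sketch of an indicator system that reports both an overall
# ranking and a per-component country profile. All names and scores are
# invented for the example.
COMPONENTS = ["contestation", "political_rights", "civil_liberties"]

ratings = {
    "Country A": {"contestation": 4, "political_rights": 3, "civil_liberties": 4},
    "Country B": {"contestation": 2, "political_rights": 2, "civil_liberties": 3},
    "Country C": {"contestation": 5, "political_rights": 4, "civil_liberties": 4},
}

def overall(profile):
    """Unweighted sum of components (per the lesson, no weighting is needed)."""
    return sum(profile[c] for c in COMPONENTS)

# Overall ranking, highest-rated country first.
ranking = sorted(ratings, key=lambda country: overall(ratings[country]),
                 reverse=True)

# Print the ranking together with each country's full component profile,
# so readers can see *why* a country ranks where it does.
for country in ranking:
    profile = ratings[country]
    detail = ", ".join(f"{c}={profile[c]}" for c in COMPONENTS)
    print(f"{country}: total={overall(profile)} ({detail})")
```

The point of printing the full profile rather than only the total is the one made in the lessons: two countries with similar totals can differ sharply on individual components, and the profile keeps that visible.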
One of the lessons needs some clarification. This was lesson (6): “Other important concepts, such as governance and human rights, should be measured separately.” Although participants agreed that these concepts should still be measured separately, they emphasized that they should be viewed not as “pure concepts” but as intertwined or correlated parts of a more general concept of rights. Nor are most indicator systems limited to one or another of these concepts; elements of each are usually represented in the system. This had been recognized in the first workshop, but there the participants had emphasized the differences in the relative understanding of the concepts and in the extent of experience with efforts to measure them. Their particular concern was the concept of “governance.” They concluded that it would be better to invest independent effort in developing a better understanding and better measures of this newer concept than to conflate it with indicators of “human rights” and “democracy,” for which there is far more consensus on meaning and measurement.2
A new lesson emerged from the observation that there is a discrepancy between the relatively high levels of agreement found for indicators of key elements of democracy and the lack of a compelling theoretical framework about its causes and development that could guide program design and evaluation decisions. Participants suggested a new lesson: indicator systems should be embedded in a theoretical context that addresses the issue of usefulness and serves as a guide for further research. In this regard, patience will be necessary. Progress will be made toward producing relevant theory, but even the most sophisticated theoretical framework is unlikely to resolve issues of application. Project design and evaluation are fundamentally different enterprises that can be informed by academic research on democracy, but it is unrealistic, and perhaps unwise, to expect that a framework can serve as a tool for making intervention decisions.
Overall, the workshop participants generally agreed that the indicator systems currently available or under development can contribute to practitioners’ interests in several ways. They provide the basis for ranking and comparing the relative standing of different countries, taking regional variations into account. They can document when large changes are occurring in a country and provide some insight into what it would take to change the ratings for a given country. The fact that aid agencies such as the African Bureau and UNDP are engaged in developing and using indicators further enhances the prospects for improved systems. And the quality of indicators will improve as a result of exchanges between researchers who develop and evaluate the systems and practitioners who use them for making day-to-day decisions.