Improving Democracy Assistance: Building Knowledge through Evaluations and Research

3 Measuring Democracy1

INTRODUCTION

One of the U.S. Agency for International Development’s (USAID) charges to the National Research Council committee was to develop an operational definition of democracy and governance (DG) that disaggregates the concept into clearly defined and measurable components. The committee sincerely wishes that it could provide such a definition, based on current research into the measurement of democratic behavior and governance. However, in the current state of research, only the beginnings of such a definition can be provided. As detailed below, there is as much disagreement among scholars and practitioners about how to measure democracy, or how to disaggregate it into components, as on any other aspect of democracy research. The result is that there exists a welter of competing definitions and breakdowns of “democracy,” marketed by rivals, each claiming to be a superior method of measurement, and each the subject of sharp and sometimes scathing criticism. The committee believes that democracy is an inherently multidimensional concept, and that broad consensus on those dimensions and how

1 Helpful comments on this chapter were received from Macartan Humphreys, Fabrice Lehoucq, and Jim Mahoney. The committee is especially grateful to those who attended a special meeting on democracy indicators held at Boston University in January 2007: David Black, Michael Coppedge, Andrew Green, Rita Guenther, Jonathan Hartlyn, Jo Husbands, Gerardo Munck, Margaret Sarles, Fred Schaffer, Richard Snyder, Paul Stern, and Nicolas van de Walle. See Appendix C for further information.
to aggregate them may never be achieved. Thus, if USAID is seeking an operational measure of democracy to track changes in countries over time and where it is engaged, a more practical approach would be to disaggregate the various components of democracy and track changes in democratization by looking at changes in those components. Yet even for the varied components of democracy, there are no available measures that are widely accepted and have demonstrated the validity, accuracy, and sensitivity that would make them useful for USAID in tracking modest changes in democratic conditions in specific countries. The development of a widely recognized disaggregated definition of democracy, with clearly defined and objectively measurable components, would be the result of a considerable research project that is yet to be done. This chapter provides an analysis of existing measures of democracy and points the way toward developing a disaggregated measure of the type requested by USAID. The committee finds that most existing measures of democracy are adequate, and in fair agreement, at the level of crude determination of whether a country is solidly democratic, autocratic, or somewhere in between. However, the committee also finds that all existing measures are severely inadequate at tracking small movements or small differences in levels of democracy between countries or in a single country over time. Moreover, the committee finds that existing measures disaggregate democracy in very different ways and that their measures of various components of democracy do not provide transparent, objective, independent, or reliable indicators of change in those components over time.
While recognizing that it may seem self-serving for an academic committee to recommend “more research,” it is the committee’s belief—after surveying the academic literature and convening a workshop of experts in democracy measures to discuss the issue—that if USAID wishes a measure of democracy that it can use to gauge the impact of its programs and track the progress of countries in which it is active, it faces a stark choice: either rely on the current flawed measures of democracy or help support the development of a research project on democracy indicators that—it is hoped—will eventually produce a set of indicators with the broadly accepted integrity of today’s national accounts indicators for economic development. To provide just a few examples to preview the discussion below, USAID manages its DG programs with an eye toward four broad areas: rule of law, elections, civil society, and good governance. Yet consider the two most widely used indicators of democracy: the Polity autocracy/democracy scale and the Freedom House scales of civil liberties and political rights. The former breaks down its measures of democracy into three components: executive recruitment, executive constraints, and political
competition, measured by six underlying variables. While some of these could be combined to provide indicators of elections, civil society, and aspects of rule of law, Polity does not address “good governance.” Moreover, the validity of the various components and underlying variables in Polity is so greatly debated that there is no reason to believe that a measure of rule of law based on the Polity components would be accepted. Freedom House rates nations on two scales: civil liberties (which conflates rule of law, civil society, and aspects of good governance) and political rights (which conflates rule of law, elections, and aspects of good governance). Even if these scales were based on objective and transparent measurements (and they are not), there would be no way to extract from them information on components relevant to USAID’s DG policy areas. Fortunately, while more sensitive and accurate measures to track sectoral movements toward or away from democracy are vital to improving USAID’s policy planning and targeting of DG programs, USAID can still gain knowledge on the impacts of its programs by focusing on changes in outcome indicators at a level relevant to those projects (for which methodologies are examined in Chapters 5 through 7). That is, USAID should seek to determine whether its projects lead to more independent and effective behavior by judges and legislators, broader electoral participation and understanding by citizens, more competitive and fair election practices, fewer corrupt officials, and other concrete changes. The issue of how much those changes contribute to overall trajectories of democracy or democratic consolidation is one that can only be solved by future experience and study and the development of better disaggregated measures for tracking democracy at the sectoral level.
The committee thus agrees that USAID is correct in focusing its interest in measurement on developing a measure of democracy that is disaggregated into discrete and measurable components. This chapter analyzes existing approaches to measuring democracy, identifies why they are flawed, and points the way toward what the committee believes will be a more useful approach to developing disaggregated sectoral or meso-level measures (Table 2-1).

PROBLEMS WITH EXTANT INDICATORS

A consensus is growing within the scholarly community that existing indicators of democracy are problematic.2 These problems may be grouped into five categories: (1) problems of definition, (2) sensitivity issues, (3) measurement errors and data coverage, (4) aggregation problems, and (5) lack of convergent validity. What follows is a brief, sometimes rather technical, review of these problems and their repercussions. Definitions of key terms are provided in the text or in the Glossary at the end of the report.

The focus of the discussion is on several leading democracy indicators: (1) Freedom House; (2) Polity; (3) ACLP (“ACLP” stands for the names of the creators—Alvarez, Cheibub, Limongi, and Przeworski; Alvarez et al 1996; recently expanded by Boix and Rosato 2001); and (4) the Economist Intelligence Unit (EIU). Freedom House provides two indices: “Political Rights” and “Civil Liberties” (sometimes employed in tandem, sometimes singly). Both are seven-point scales extending back to 1972 and covering most sovereign and semisovereign nations.3 Polity also provides two aggregate indices: “Democracy” and “Autocracy.” Both are 10-point scales and are usually used in tandem (by subtracting one from the other), which provides the 21-point (−10 to +10) Polity2 variable. Coverage extends back to 1800 for sovereign countries with populations greater than 500,000.4 ACLP codes countries dichotomously (autocracy/democracy) and includes most sovereign countries from 1950 to 1990. The expanded dataset provided by Boix and Rosato (2001) stretches back to 1800.5 The EIU has recently developed a highly disaggregated index of democracy with 5 core dimensions and 60 subcomponents, which are combined into a single index of democracy (Kekic 2007). Coverage extends to 167 sovereign or semisovereign nations but only for 2006. Glancing reference will be made to other indicators in an increasingly crowded field,6 and many of the points made in the following discussion apply quite broadly. However, it is important to bear in mind that each indicator has its own particular strengths and weaknesses.

2 See Bollen (1993), Beetham (1994), Gleditsch and Ward (1997), Bollen and Paxton (2000), Foweraker and Krznaric (2000), McHenry (2000), Munck and Verkuilen (2002), Treier and Jackman (2003), Berg-Schlosser (2004a, b), Acuna-Alfaro (2005), and Vermillion (2006).
The following brief survey does not purport to provide a comprehensive review.7

3 See www.freedomhouse.org.
4 Both are drawn from the most recent iteration of this project, known as Polity IV. See www.cidcm.umd.edu/inscr/polity.
5 Jose Cheibub and Jennifer Gandhi are currently engaged in updating the ACLP dataset, but results are not yet available.
6 See Bollen (1980), Coppedge and Reinicke (1990), Arat (1991), Hadenius (1992), Vanhanen (2000), Altman and Pérez-Liñán (2002), Gasiorowski (1996; updated by Reich 2002 [also known as “Political Regime Change—PRC dataset”]), and Moon et al (2006).
7 The most detailed and comprehensive recent reviews are Hadenius and Teorell (2005) and Munck and Verkuilen (2002). See also Bollen (1993), Beetham (1994), Gleditsch and Ward (1997), Bollen and Paxton (2000), Elkins (2000), Foweraker and Krznaric (2000), McHenry (2000), Casper and Tufis (2003), Treier and Jackman (2003), Berg-Schlosser (2004a, b), Acuna-Alfaro (2005), and Bowman et al (2005).
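To make the arithmetic behind the composite Polity2 variable concrete, here is a minimal sketch. It assumes only the convention described above: the 0–10 Autocracy score is subtracted from the 0–10 Democracy score to yield the −10 to +10 combined scale; the function name is ours, not Polity's.

```python
def polity2(democracy: int, autocracy: int) -> int:
    """Combine Polity's Democracy and Autocracy scales (each 0-10)
    into the single -10..+10 Polity2 variable by subtraction."""
    if not (0 <= democracy <= 10 and 0 <= autocracy <= 10):
        raise ValueError("each component scale runs from 0 to 10")
    return democracy - autocracy

# A consolidated democracy (10, 0) scores +10; a full autocracy (0, 10)
# scores -10; mixed regimes fall in between.
print(polity2(10, 0), polity2(0, 10), polity2(6, 3))
```

The subtraction illustrates why the combined scale, despite its 21 points, carries no more information than its two highly aggregated components.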
Definition

There are many ways to define democracy, and each naturally generates a somewhat different approach to measurement (Munck and Verkuilen 2002). Some definitions are extremely “thin,” focusing mainly on the presence of electoral competition for national office. The ACLP index exemplifies this approach: Countries that have changed national leadership through multiparty elections are democracies; other countries are not. Other definitions are rather “thick,” encompassing a wide range of social, cultural, and legal characteristics well beyond elections. For example, the Freedom House Political Rights Index includes the following questions pertaining to corruption:

Has the government implemented effective anticorruption laws or programs to prevent, detect, and punish corruption among public officials, including conflict of interest? Is the government free from excessive bureaucratic regulations, registration requirements, or other controls that increase opportunities for corruption? Are there independent and effective auditing and investigative bodies that function without impediment or political pressure or influence? Are allegations of corruption by government officials thoroughly investigated and prosecuted without prejudice, particularly against political opponents? Are allegations of corruption given wide and extensive airing in the media? Do whistle-blowers, anticorruption activists, investigators, and journalists enjoy legal protections that make them feel secure about reporting cases of bribery and corruption? What was the latest Transparency International Corruption Perceptions Index score for this country? (Freedom House 2007)

It may be questioned whether these aspects of governance, important though they may be, are integral components of democracy.
More generally, many scholars treat good governance as a likely result of democracy; yet many donors (including USAID) treat good governance as an essential component of democracy. Similar complaints might be registered about other concepts and scales of democracy; some are so “thick” as to include diverse elements of accountability, even distributional equity and economic growth.

For example, some definitions treat the United States as a democracy from the passage of its Constitution and first national election in 1789. Yet since George Washington ran uncontested in both 1789 and 1792, even ACLP would not treat the United States as democratic until the appearance of contested multiparty elections in 1796. If slavery is considered a contravention of democracy, the United States could not be considered a democracy until its abolition throughout its territory in 1865. If women’s right to vote is also considered essential to the definition of democracy, the United States does not qualify until 1920. And if the disenfranchisement of African Americans in southern states is considered a block to democracy, the United States does not become a full democracy until passage of the Voting Rights Act in 1965. In short, only a “thin” definition of democracy would classify the United States as “fully democratic” from the early nineteenth century. Yet most donor agencies are reluctant to adopt such thin measures as a guide to current democracy assessments, questioning whether “thin” indices of democracy capture all the critical features of this complex concept. The problem of definition is critical but very difficult to resolve.

Sensitivity

A related issue is that many of the leading democracy indicators are not sensitive to important gradations in the quality of democracy across countries or through time. At the extreme, dichotomous measures such as ACLP reduce democracy to a dummy variable: A country either is or is not a democracy, with no intermediate stages permitted. While useful for certain purposes, one may wonder whether this captures the complexity of such a variegated concept (Elkins 2000). At best it captures one or two dimensions of democracy (those employed as categorizing principles), while the rest are necessarily ignored. Most democracy indicators allow for a more elongated scale. As noted above, Freedom House scores democracy on a seven-point index (14 points if the Political Rights and Civil Liberties indices are combined). Polity provides a total of 21 points if the Democracy and Autocracy scales are merged into the Polity2 variable, which gives the impression of considerable sensitivity. In practice, however, country scores stack up at a few places (notably, −7 for autocracies and +10, the highest possible score, for full democracies), suggesting that the scale is not as sensitive as it purports to be.
The EIU index is by far the most sensitive and does not appear to be arbitrarily “bunched.”8 Note that all extant indicators are bounded to some degree and therefore constrained. This means that there is no way to distinguish the quality of democracy among countries that have perfect negative or positive scores. This is fine as long as there really is no difference in the quality of democracy among these countries. Yet the latter assumption is highly questionable. Consider that in 2004, Freedom House assigned the highest score (1) on its Political Rights Index to the following 58 countries: Andorra, Australia, Austria, Bahamas, Barbados, Belgium, Belize, Bulgaria, Canada, Cape Verde, Chile, Costa Rica, Cyprus (Greek), Czech Republic, Denmark, Dominica, Estonia, Finland, France, Germany, Greece, Grenada, Hungary, Iceland, Ireland, Israel, Italy, Japan, Kiribati, Latvia, Liechtenstein, Luxembourg, Malta, Marshall Islands, Mauritius, Micronesia, Nauru, Netherlands, New Zealand, Norway, Palau, Panama, Poland, Portugal, San Marino, Slovakia, Slovenia, South Africa, South Korea, Spain, St. Kitts and Nevis, St. Lucia, Suriname, Sweden, Switzerland, Tuvalu, United Kingdom, United States, and Uruguay.9 Are we really willing to believe that there are no substantial differences in the quality of democracy among these diverse polities?

8 Questions can also be raised about whether these indices are properly regarded as interval scales (Treier and Jackman 2003). The committee does not envision an easy solution to this problem.

Measurement Errors and Data Coverage

Democracy indicators often suffer from measurement errors and/or missing data.10 Some (e.g., Freedom House) are based largely on expert judgments, judgments that may or may not reflect facts on the ground.11 Some (e.g., Freedom House in the 1970s and 1980s) rely heavily on secondary accounts from a few newspapers such as the New York Times. These accounts may or may not be trustworthy and almost assuredly do not provide comprehensive coverage of the world. Moreover, newspaper accounts suffer from extreme selection bias, depending almost entirely on the location of the newspaper’s reporters. Thus, if the New York Times has a reporter in Mexico but none in Central America, coverage of the latter is going to be much spottier than that of the former. In an attempt to improve coverage and sophistication, some indices (e.g., EIU) impute a large quantity of missing data. This is a dubious procedure wherever data coverage is limited, as it seems to be for many of the EIU variables. Note that many of the EIU variables rely on polling data, which are available on a highly irregular basis for 100 or so nation states. The quality of many of the surveys on which the EIU draws has not been clearly established.
This means that data for these questions must be estimated by country experts for all other cases, estimated to be about half of the sample. (The procedures employed for this estimation are not known.) Wherever human judgments are required for coding, one must be concerned about the basis of the respondent’s decisions. In particular, one wonders whether coding decisions about particular topics (e.g., press freedom) may reflect an outside expert’s overall sense of how democratic country A is, rather than an independent evaluation of the question at hand. The committee also worries about the problem of endogeneity of the evaluations, that is, with experts looking more at what other experts and indicators are doing rather than making their own independent evaluation of the country. The intercoder “reliability” may be little more than an artifact of experts accepting other experts’ judgments. In this respect, “disaggregated” indicators are often considerably less disaggregated than they appear. Note that it is the ambiguity of the questionnaires underlying these surveys that fosters this sort of premature aggregation.

The committee undertook a limited statistical examination of the Freedom House scores for 2007 on their key components—for political rights these included electoral process, pluralism and participation, and functioning of government; for civil liberties these were freedom of expression, association and organizational rights, rule of law, and personal autonomy and individual rights (see Appendix C). Across all countries, two-way correlations among the seven components were never less than 0.86 and in several cases were 0.95 or greater. This high correlation could imply that democracy is indeed a far “smoother” condition than the “lumpy” view expressed in this study. That is, the high correlation among the items suggests that picking any one is just about as good as picking any other. Yet the committee doubts the independence of the judgments on each of the components of the scale. The EIU democracy scale is also divided into components: civil rights, elections, functioning of government, participation, and culture.

9 The precise period in question stretches from December 1, 2003, to November 30, 2004; obtained from http://www.freedomhouse.org/template.cfm?page=15&year=2006 (accessed on September 21, 2006).
10 For general treatments of the problem of conceptualization and measurement, see Adcock and Collier (2001).
11 With respect to the general problem of expert judgments, see Tetlock (2005), who found that expert opinions tended to reflect more the consensus of the expert community than an objective “truth,” inasmuch as his surveys of experts produced answers that were often, in retrospect, no more accurate than a coin toss.
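The two-way correlation computation described above can be sketched as follows. The matrix here is toy data invented for illustration, not the committee's actual Freedom House extract; rows stand for countries and columns for the seven components.

```python
import numpy as np

# Toy matrix: rows are countries, columns are seven component scores.
# Numbers are invented for illustration only.
scores = np.array([
    [7, 7, 6, 7, 6, 7, 6],   # strongly democratic
    [6, 7, 6, 6, 6, 6, 6],
    [3, 3, 2, 3, 3, 2, 3],   # middling
    [2, 2, 2, 1, 2, 2, 2],
    [1, 1, 1, 1, 1, 1, 1],   # strongly autocratic
], dtype=float)

corr = np.corrcoef(scores, rowvar=False)        # 7x7 two-way Pearson r matrix
off_diagonal = corr[~np.eye(7, dtype=bool)]     # drop the trivial diagonal 1s
print(round(float(off_diagonal.min()), 2))
```

When components move in lockstep like this, every pairwise correlation is high, which is exactly the pattern that led the committee to question whether the component judgments are truly independent.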
Taking the Freedom House and EIU components together, a factor analysis reveals that a single factor explains 83 percent of the variance across all 12 components, and the two principal factors explain 90 percent of the variance (Coppedge 2007). This, by itself, is not problematic; it could be that good/bad things go together; that is, countries that are democratic on one dimension are also democratic on another. However, it raises concern about the actual independence of the various components in these indices. It could be, in other words, that respondents (either experts or citizens) who are asked about different dimensions of a polity are, in fact, simply reflecting their overall sense of a country’s democratic culture. It also suggests that the various independent components in fact contain no more useful information than the principal one or two factors.

Adding to worries about measurement error is the general absence of intercoder reliability tests as part of the coding procedure. Freedom House does not conduct such tests (or at least does not make them public). Polity does so, but it requires a good deal of hands-on training before coders reach an acceptable level of coding accuracy. This suggests that other coders would not reach the same decisions simply by reading Polity’s coding manual or that artificial uniformity is imposed. And this, in turn, points to a potential problem of conceptual validity: Key concepts are not well matched to the empirical data.

Aggregation

Since democracy is a multifaceted concept, all composite indicators must wrestle with the aggregation problem—how to weight the components of an index and which components to include. For aggregation to be successful, the rules must be clear, operational, and consistent with common notions of what democracy is; that is, the resulting concept must be valid. It goes almost without saying that different solutions to the aggregation problem lead to quite different results (Munck and Verkuilen 2002; for a possible exception to this dictum, see Coppedge and Reinicke 1990). Although most indicators have fairly explicit aggregation rules, they are often difficult to comprehend, and consequently to apply. They may also include “wild card” elements, allowing the coder free rein to assign a final score that accords with his or her overall impression of a country (e.g., Freedom House). In some cases (e.g., Polity), components are listed separately, which helps clarify the final score a country receives. However, in Polity’s case the components of the index are themselves highly aggregated, so the overall clarity of the indicator is not improved. Even when aggregation rules are clear and unambiguous, because they bundle a host of diverse dimensions into a single score, it is often unclear which of the dimensions is driving a country’s score in a particular year. It is often difficult to articulate what an index value of “4” means within the context of any single indicator. Moreover, even if an aggregation rule is explicit and operational, it is never above challenge.
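The claim that different aggregation rules lead to quite different results can be shown in a few lines. The component names, scores, and weights below are entirely hypothetical; the point is only that the choice of rule, applied to identical data, can reverse a country ranking.

```python
# Two plausible aggregation rules applied to the same (hypothetical)
# component scores, illustrating how the rule alone can reorder countries.
countries = {
    # (elections, civil liberties, rule of law), each scored 0-10
    "A": (9, 4, 4),   # strong elections, weak elsewhere
    "B": (6, 7, 6),   # balanced across components
}

def equal_weights(scores):
    """Simple average: every component counts the same."""
    return sum(scores) / len(scores)

def elections_heavy(scores):
    """Weighted sum that privileges electoral competition."""
    weights = (0.6, 0.2, 0.2)
    return sum(w * s for w, s in zip(weights, scores))

for rule in (equal_weights, elections_heavy):
    ranking = sorted(countries, key=lambda c: rule(countries[c]), reverse=True)
    print(rule.__name__, ranking)
```

Under equal weights country B outranks A; under the elections-heavy rule the order flips. Since many such rules fit everyday notions of democracy, none can claim unique validity, which is the committee's point.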
The Polity index, in Munck and Verkuilen’s estimation, “is based on an explicit but nonetheless quite convoluted aggregation rule” (2002:26). Indeed, a large number of possible aggregation rules fit, more or less, with everyday concepts of democracy and thus meet the minimum requirements of conceptual validity. For this reason the committee regards the aggregation problem as the only problem that is unsolvable in principle. There will always be disagreement over how to aggregate the various components of “Big D democracy” (i.e., the one central concept that is assumed to summarize a country’s regime status).

Convergent Validity

Given the above, it is no surprise that there is significant disagreement among scholars over how to assign scores for particular countries on
the leading democracy indices. Granted, intercorrelations among various democracy indicators are moderately high, suggesting some basic agreement over what constitutes a democratic state. As shown in the analysis undertaken for the committee that is summarized in Appendix C, the Polity2 variable (combining Democracy and Autocracy) drawn from the Polity dataset and the Freedom House Political Rights Index are correlated at .88 (Pearson’s r). Yet when countries with perfect democracy scores (e.g., the United Kingdom and the United States) are excluded from the samples, this intercorrelation drops to .78. And when countries with scores of 1 and 2 on the Freedom House Political Rights scale (the two top scores) are eliminated, the correlation experiences a further drop—to .63, implying that two-thirds of the variance in one scale is unrelated to changes in the other scale for countries outside the upper tier of democracies. The committee similarly finds that correlations between the Freedom House and EIU scores are low when the highest-scoring countries are set aside. For a substantial number of countries—Ghana, Niger, Guinea-Bissau, the Central African Republic, Chad, Russia, Cambodia, Haiti, Cuba, and India—the Freedom House and EIU scores differ so widely that they would be considered democratic by one scale but undemocratic by the other. Indeed, country specialists often take issue with the scoring of countries they are familiar with (e.g., Bowman et al 2005; for more extensive cross-country tests, see Hadenius and Teorell 2005). Since tracking progress in democracy assistance often depends on accurately measuring modest improvements in democracy, it is particularly distressing that the convergence between different scales is so low in this regard.
While the upper “tails” of the distributions on the major indicators (the fully democratic regimes) are highly correlated, the democracy scores for countries in the upper middle to the bottom ranges are not. The analysis commissioned by the committee (see Appendix C) found that the average correlation between the annual Freedom House and Polity scores for autocratic countries (those with Polity scores less than −6) during 1972-2002 was only .274. Among the partially free countries of the former Soviet Union, the correlation between annual Freedom House and Polity scores for the years 1991-2002 was .295; for the partially free countries in the Middle East, it was .40. In many cases the correlations for specific countries were negative, meaning that the two scales gave opposite measures of whether democracy levels were improving or not. This is a serious problem for USAID and other donors, since they are generally most concerned with identifying the level of democracy, and degrees of improvement, precisely for those countries lying in the middle and bottom of the distribution—countries that are mainly undemocratic or imperfectly democratic—rather than for countries already at the upper end of the democracy scale.
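The pattern described above, in which two scales agree over the full range of countries but diverge once the clear-cut democracies are removed, is the classic restriction-of-range effect. The short simulation below uses entirely synthetic data (two noisy readings of the same underlying quality) to reproduce it; nothing here is drawn from the actual Freedom House or Polity series.

```python
import random

random.seed(0)

# Synthetic "true" democracy levels plus independent measurement noise
# in two hypothetical rating scales.
true_levels = [random.uniform(0, 10) for _ in range(2000)]
scale_a = [t + random.gauss(0, 1.5) for t in true_levels]
scale_b = [t + random.gauss(0, 1.5) for t in true_levels]

def pearson(xs, ys):
    """Plain Pearson's r, computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

full = pearson(scale_a, scale_b)
# Drop the upper tier (true level >= 7) and recompute on the remainder.
middle = [(a, b) for a, b, t in zip(scale_a, scale_b, true_levels) if t < 7]
restricted = pearson([a for a, _ in middle], [b for _, b in middle])
print(round(full, 2), round(restricted, 2))
```

With the top cases removed, the shared signal shrinks relative to the noise and the correlation falls, even though both scales measure the same thing with the same error, which is one reason low mid-range agreement is so hard to interpret.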
If there is little agreement on the quality and direction of democracy in countries that lie in between the extremes, it must be concluded that there is relatively little convergent validity across the most widely used democracy indicators. That is, whatever their intent, they are not in fact capturing the same concept. By way of conclusion to this very short review of extant indicators, the committee quotes from another recent review by Jim Vermillion, current executive vice president of the International Foundation for Election Systems:

Initial work in the measurement of democracy has provided some excellent insights into specific measures and has helped enlighten our view of where underlying concepts related to democracy stand. However, we are far from coming up with a uniform, theoretically cohesive definition of the construct of democracy and its evolution that lends itself easily to statistical estimation/manipulation and meaningful hypothesis testing. (Vermillion 2006:30)

The need for a new approach to this ongoing, and very troublesome, problem of conceptualization and measurement is apparent.

Average Versus Country-Specific Results

It is reasonable to ask, if the existing indicators of democracy have so many problems, how can the committee have any confidence in the findings mentioned in Chapter 1, such as that the number of democracies in the world is rising and that USAID DG assistance has, on average, made a significant positive difference in democracy levels? For that matter, how is it possible for scholars to have undertaken more than two decades of quantitative research on democracy and democratization, correlating various causal factors with shifts in these democracy indicators, with any belief in the validity of their research?
The answer to these questions lies in the very different purposes that democracy indicators must serve for scholarly analysis of average or overall global trends, as against the purposes they must serve to support policy analysis of trends in specific countries. For the former purpose it is acceptable for democracy data to have substantial errors regarding levels of democracy in particular states, as long as the errors are not systematically biased. That is, even a democracy scale that makes substantial errors will be useful for looking at average trends as long as its score for any given country is equally likely to be “too high” or “too low.” Such a scale will state the level of democracy as too high in about half the world’s countries and too low in the other half, but the average level of global democracy overall will be fairly correct, and scholars can use statistical methods to “separate out” the random errors from the overall trends.
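The argument that unbiased country-level errors wash out of global averages can be illustrated with a short simulation. The data are synthetic: a "true" democracy level per country plus symmetric, zero-mean measurement noise.

```python
import random

random.seed(1)

# Synthetic illustration: a scale with large but unbiased per-country
# errors still recovers the global average well.
true_scores = [random.uniform(0, 10) for _ in range(200)]
measured = [t + random.gauss(0, 2.0) for t in true_scores]  # noisy readings

true_mean = sum(true_scores) / len(true_scores)
measured_mean = sum(measured) / len(measured)

# Individual countries can be off by several points...
worst_error = max(abs(t - m) for t, m in zip(true_scores, measured))
# ...while the error in the global mean is far smaller.
mean_error = abs(true_mean - measured_mean)
print(round(worst_error, 2), round(mean_error, 3))
```

The worst single-country error dwarfs the error in the mean, which is why a scale too noisy to track any one country can still support valid claims about average global trends.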
[…]
Appendix C. Here, the reader’s attention is called to the following general points:

First, the criteria applying to different dimensions sometimes conflict with one another. For example, strong civil society organizations representing one social group may pressure government to restrict other citizens’ civil liberties (Levi 1996, Berman 1997). This is implicit in democracy’s multidimensional character. Good things do not always go together.

Second, some dimensions are undoubtedly more important in guaranteeing a polity’s overall level of democracy than others. However, since resolving this issue depends on which overall definition of democracy is adopted and on various causal assumptions that are difficult to prove, the committee is not making judgments on this issue.

Third, it is important to note that dimensions of democracy are not always dimensions of good governance. Thus, inclusion of an attribute on this list does not imply that the quality of governance in countries with this attribute will be higher than those without it. For example, some credibly democratic countries (Japan after World War II, the United States in the nineteenth century) have seen enormous corruption scandals. Of course, evaluating whether an attribute of democracy improves the quality of governance hinges on how one chooses to define the latter, about which much has been written but little agreement can be found (Hewitt de Alcantara 1998, Pagden 1998, Knack and Manning 2000). The committee leaves aside the question of how good governance might be defined, noting only that some writers consider democracy an aspect of good governance, some consider good governance an aspect of democracy, and still others prefer to approach these terms as separate and largely independent (nonnested) concepts.

Finally, the committee does not rule out the possibility of alterations to this list of 13.
The list might be longer (including additional components) or shorter (involving a consolidation of categories). There is nothing sacrosanct about this particular list of dimensions. Indeed, the committee does not assume that a truly comprehensive set of dimensions is possible, given the extensive and overlapping set of meanings that have been attached to this multivalent term. However, the committee believes strongly that these 13 dimensions are a plausible place to begin. In any case, whether the index has 13 components or some other (smaller or larger) number is less significant for present purposes than the approach itself. Note that if one begins with a disaggregated set of indicators, it is easy to aggregate upward to create more consolidated concepts. One may also aggregate all the way up to Big D democracy, à la Polity and Freedom House. However, the committee does not propose aggregation rules for this purpose, leaving it as a matter for future scholars and policymakers to decide.
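The upward aggregation described above can be sketched in a few lines of code. The dimension names, groupings, and scores below are invented for illustration and are not the committee's actual 13 dimensions:

```python
# Hypothetical disaggregated dimension scores for one country (0-10 scale).
scores = {
    "free_elections": 7.0, "universal_suffrage": 9.0, "civil_liberties": 6.0,
    "press_freedom": 5.0, "civil_society": 6.0, "rule_of_law": 4.0,
}

# Group dimensions into more consolidated mid-level concepts.
groups = {
    "electoral": ["free_elections", "universal_suffrage"],
    "liberal": ["civil_liberties", "press_freedom"],
    "societal": ["civil_society", "rule_of_law"],
}

# Mid-level aggregation: average the dimensions within each group.
mid_level = {g: sum(scores[d] for d in dims) / len(dims)
             for g, dims in groups.items()}

# Two possible rules for aggregating all the way up to "Big D" democracy.
big_d_mean = sum(mid_level.values()) / len(mid_level)  # compensatory average
big_d_min = min(mid_level.values())                    # "weakest link" rule

print(mid_level)
print(big_d_mean, big_d_min)
```

Because different aggregation rules can score the same country differently, publishing the disaggregated scores, rather than only a composite, is what keeps the aggregation step explicit and contestable.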
Potential Benefits of Disaggregation

No aggregate democracy index offers a satisfactory scale for purposes of country assessment or for answering general questions pertaining to democracy. Thus, the committee strongly supports USAID's inclination to focus its efforts on a more disaggregated set of indicators as a way of capturing the diverse components of this key concept while overcoming difficulties inherent in measures that attempt to summarize, in a single statistic, a country's level of democracy (à la Freedom House or Polity). To be sure, before undertaking a venture of this scope and scale, USAID will want to consider carefully the added value that might be delivered by a new set of democracy indicators. In the committee's view, conceptual disaggregation offers multiple advantages. Even so, this approach will not solve every problem, and the committee does not wish to overstate the potential rewards its proposal could bring.

The first advantage of disaggregation is the prospect of identifying concepts on whose definitions and measurements most people can agree. While the world may never agree on whether the overall level of democracy in India can be summarized as a "4" or a "5" (on some imagined scale), it may yet agree on more specific scores along 13 (or so) dimensions for the world's largest democracy. The importance of creating consensus on these matters can hardly be overemphasized. The purpose of a democracy index is not simply to guide policymakers and policymaking bodies such as USAID, the World Bank, and the International Monetary Fund. Nor could it be so constrained, even if that were desirable. As soon as an index becomes established and begins to influence international policymakers, it also becomes fodder for dispute in other countries around the world. A useful index is one that gains the widest legitimacy.
A poor index is one that is perceived as a tool of Western influence or a mask for the forces of globalization (as Freedom House is sometimes regarded). Indeed, because current democracy scales are produced by proprietary scalings and aggregations by specific organizations rather than by objective measurements, those organizations are often subjected to "lobbying" by countries that wish to shift their scores. The hope is that by disaggregating the components of democracy down to levels that are more operational and less debatable, it might be possible to garner a broader consensus around this vexed subject. Countries would know, more precisely, why they received the scores they did. They would also know, more precisely, what areas remained for improvement. Plausibly, such an index might play an important role in the internal politics of countries around the world, akin to the role of Transparency International's Corruption Perceptions Index (Transparency International 2007).

A second advantage is the degree of precision and differentiation that
a disaggregated index offers relative to the old-fashioned "Big D" concept of democracy. Using the committee's proposed index, a single country's progress and/or regress could be charted through time, allowing for subtle comparisons that escape the purview of highly aggregated measures such as Freedom House and Polity. One would be able to specify which facets of a polity have improved and which have remained stagnant or declined. This means that the longstanding question of regime transitions would be amenable to empirical tests. When a country transitions from autocracy to democracy (or vice versa), which elements come first? Are there common patterns, a finite set of sequences, prerequisites? Or is every transition, in some sense, unique? Similarly, a disaggregated index would allow policymakers to clarify how, specifically, one country's democratic features differ from others in the region or across regions. While Big D democracy floats hazily over the surface of politics, the dimensions of a disaggregated index are comparatively specific and precise. Contrasts and comparisons may become correspondingly more acute.

Applying the Proposed Index to Democracy Assistance Programming

It is important to remember that, although the committee's general goal is to provide a path to democracy measures that will be useful to policymakers and citizens alike, the specific charge is to assist USAID. This means the index must be useful for particular policy purposes. Consider the problem of assessment. How can policymakers in Washington and in the field missions determine which aspects of a polity are most deficient and therefore in need of assistance? While Freedom House and Polity offer only one or several dimensions of analysis (and these are highly correlated and difficult to distinguish conceptually), the committee's proposed index would begin with 13 such parameters.
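The kind of dimension-by-dimension assessment and over-time comparison described above can be sketched as follows. The dimension names and yearly scores are invented for illustration:

```python
# Hypothetical panel of disaggregated scores for one country in two years.
panel = {
    2000: {"elections": 4, "civil_liberties": 5, "rule_of_law": 3, "media": 5},
    2005: {"elections": 7, "civil_liberties": 5, "rule_of_law": 3, "media": 4},
}

# Which facets improved, which stagnated, which declined?
changes = {dim: panel[2005][dim] - panel[2000][dim] for dim in panel[2000]}
improved = [d for d, delta in changes.items() if delta > 0]
declined = [d for d, delta in changes.items() if delta < 0]
print("improved:", improved)   # ['elections']
print("declined:", declined)   # ['media']

# The same data collapsed into a single "Big D" score hides this detail:
avg_2000 = sum(panel[2000].values()) / len(panel[2000])
avg_2005 = sum(panel[2005].values()) / len(panel[2005])
print("aggregate change:", avg_2005 - avg_2000)  # small net movement
```

Note how the aggregate score moves only slightly while the disaggregated view shows exactly which facets improved and which declined.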
It seems clear that for assessing the potential impact of programs focused on different elements of a polity (e.g., rule of law, civil society, governance, and elections—the four subunits of the DG office at USAID), it is helpful to have indicators that offer a differentiated view of the subject. These same features of the proposed index are equally advantageous for causal analysis, which depends on the identification of precise mechanisms, intermediate factors that are often ignored by macro-level cross-national studies. Which aspects of democracy foster (or impede) economic growth? What aspect of democracy is most affected by specific democracy promotion efforts? Whether democracy is looked on as an independent (causal) variable or as a dependent (outcome) variable, we need to know which aspect of this complex construct is at play. Policymakers also wish to know what effect their policy interventions
might have on a given country's quality of democracy (or on a whole set of countries, considered as a sample). There is little hope of answering this question in a definitive fashion if democracy is understood only at a highly aggregated level. The interventions by democracy donors are generally too small relative to the outcome to draw plausible causal inferences between USAID policies, on the one hand, and country A's level of democracy (as measured by Freedom House or Polity), on the other. However, it is plausible—though admittedly still quite difficult—to estimate the causal effects of a project focused on a particular element of democracy if that element can be measured separately. Thus, USAID's election-centered projects might be judged against several specific indicators that measure the characteristics of elections. This is plausible and perhaps quite informative (though, to be sure, many factors other than USAID have an effect on the quality of elections in a country). The bottom line is this: If policymakers cannot avoid reference to country-level outcome indicators, they will be much better served if these indicators are available at a disaggregated meso level.

All of these features should enhance the utility of a disaggregated index for policymakers. Indeed, the need for a differentiated picture of democracy around the world is at least as important for policymakers as it might be for academics. Both are engaged in a common enterprise, an enterprise that has thus far been impeded by the lack of a sufficiently discriminating measurement instrument. Consider briefly the problem that would arise for macroeconomists, finance ministers, and members of the World Bank and International Monetary Fund if they possessed only one highly aggregated indicator of economic performance.
As good as GDP is (and there are, of course, considerable difficulties), it would not go very far without the existence of additional variables that measure the components of this macro-level concept. There is a similar situation in the field of political analysis. We have a crude sense of whether countries are democratic, undemocratic, or in between (e.g., "partly free" or partially democratic), but we have no systematic knowledge of how a country should be scored on the various components of democracy.

Since a disaggregated index can be aggregated in a variety of ways, developing a disaggregated index is advantageous even if a single aggregated measure is sometimes desired for policy purposes. Indeed, it is expected that scholars and policymakers will compose summary scores from the underlying data provided by this index. However, the benefit of beginning with the same underlying data (along each of the identified dimensions) is that the process of aggregation is rendered transparent. Any composite index based on these data would be forced to reveal how the summary score for a particular country in a particular year was determined. Any critic of the proposed score, or of the summary index at large, would be able to contest the aggregation rules used by the author. The methodology is "open source" and thus subject to revision and critique. Further, any causal or descriptive arguments reached on the basis of a summary indicator could be replicated with different aggregation rules. If the results were not robust, it might be concluded that the conclusions were contingent on a particular way of putting together the components of democracy. In short, both policy and scholarly discourse might be much improved by a disaggregated index, even if the ultimate objective involves the composition of a highly aggregated index of Big D democracy.

Funding and Management

Readers of this document might wonder why, if the potential benefits of a disaggregated democracy index are so great, one has not yet been developed. There are two simple answers to this question. First, producing such an index would be a time-consuming and expensive proposition, requiring the participation of many researchers. It would not be easy. Second, although the downstream benefits are great, no single scholar or group of scholars has the resources or the incentives to produce such an index.15 (Academic disciplines do not generally reward members who labor for years to develop new data resources.) Consequently, academics have continued to use—and complain about—Polity, Freedom House, ACLP, and other highly aggregated indices. Policymakers will have to step into this leadership vacuum if they expect the problem of faulty indicators to be solved. Precedents for such support can be found in other social science fields.
USAID served as a principal funder for demographic and health surveys that vastly enhanced knowledge of public health throughout the developing world.16 The State Department and the Central Intelligence Agency served as principal funders of the Correlates of War data collection project.17 On a much smaller scale, the State Department provides ongoing support for the Polity project.

15 Note that while scholars who are discontented with the leading indicators of democracy periodically recode countries of special concern to them (e.g., McHenry 2000; Berg-Schlosser 2004a,b; Acuna-Alfaro 2005; Bowman et al. 2005), this recoding is generally limited to a small set of countries and/or a small period of time.
16 Surveys and findings are described on the USAID Web site: http://www.measuredhs.com/.
17 Information about the project may be found at http://www.correlatesofwar.org/.

To be sure, the entire range of indicators proposed here is probably larger than any single funder is willing or able to undertake. It is highly advisable that several funders share responsibility for the project so that
its financial base is secure into the future and so that the project is not wholly indebted to a single funder, a situation that might raise questions about project independence. Preferably, some of these funders would be non-American (e.g., Canadian, European, or Japanese governments, the European Union, or international organizations like the World Bank or the United Nations Development Program). Private foundations (e.g., the Open Society Institute, the Google Foundation) might also be tapped.

The committee conceptualizes this project as a union of many forces. This makes project management inevitably more complicated. However, the sorts of difficulties encountered here, insofar as they constitute a deliberative process about the substantive issues at stake, may enhance the value of the resulting product. Certainly, it will enhance its legitimacy. Another possibility is that different funders might undertake to develop (or take responsibility for) different dimensions of the index, thus apportioning responsibility. It is preferable, in any case, that some level of supervision be maintained at the top so that the efforts are well coordinated. Coordination involves not only logistical issues (sharing experts in the field, software, and so forth) but also, more importantly, the development of indicators that are mutually exclusive (nonoverlapping) so that the original purpose of the project—disaggregation—is maintained. Note that several of the above-listed components might be employed across several dimensions, requiring coordination on the definition and collection of those variables.
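The coordination problem described above, keeping indicators nonoverlapping when different funders maintain different dimensions, reduces to a simple set comparison. The funder registries below are hypothetical:

```python
# Hypothetical indicator registries maintained by two different funders.
funder_a = {"turnout", "ballot_secrecy", "media_access"}
funder_b = {"media_access", "judicial_independence"}

# Any indicator appearing in both registries needs one shared definition
# and collection protocol, or the disaggregation becomes double-counting.
overlap = funder_a & funder_b
print("needs shared definition:", sorted(overlap))  # ['media_access']
```

A real coordinating body would run this kind of reconciliation over the full set of dimensions and funders before data collection begins.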
As a management structure, the committee proposes an advisory group to be headed by academics—with some remuneration, depending on the time requirements, and suitable administrative support—in partnership with the policy community.18 This partnership is crucial, for any widely used democracy assessment tool should have both a high degree of academic credibility and legitimacy among policymakers. Major shortcomings of previous efforts to develop indices of democracy and governance resulted from insufficient input from methodologists and subject specialists or from a lack of broad collaboration across different stakeholders. For this wide-ranging proposal, experts on each of the identified dimensions will be needed. Their ongoing engagement is essential to the success of the enterprise. Moreover, it is important to solicit help widely within the social science disciplines so that decisions are not monopolized by a few (with perhaps quirky judgments).

18 The Utstein Partnership, a group formed in 1999 by the ministers of international development from the Netherlands, Germany, Norway, and the United Kingdom to formalize their cooperation, is an example of this possible approach applied to a different problem. The U4 Anti-Corruption Resource Centre assists donor practitioners to more effectively address corruption challenges by providing a variety of online resources. See http://www.u4.no/about/u4partnership.cfm.

As a convening body,
there are several possibilities, including the professional associations of political science, economics, and sociology (the American Political Science Association, the American Economic Association, and the American Sociological Association) or a consortium of universities.

CONCLUSIONS

This chapter has reviewed the most widely used indicators that measure "democracy" and arrived at these key findings:

The concept of democracy cannot at present be defined in an authoritative (nonarbitrary) and operational fashion. It is an inherently multidimensional concept, and there is little consensus over its attributes. Definitions range from minimal (a country must choose its leaders through contested elections) to maximal (a country must have universal suffrage; accountable and limited government; sound and fair justice; extensive protection of human rights and political liberties; and economic and social policies that meet popular needs). Moreover, the definition of democracy is itself a moving target; definitions that would have seemed reasonable at one time (such as describing the United States as a democracy in 1900, despite no suffrage for women and few minorities holding office) are no longer considered reasonable today.

To obtain a more reliable and credible method of tracking democratic change to guide USAID DG programming, USAID should foster an effort to develop disaggregated sectoral-level measures of democratic governance. This would likely have to involve numerous parties to attain wide acceptance.

Existing empirical indicators of democracy are flawed. The flaws extend to problems of definition and aggregation, imprecision, measurement errors, poor data coverage, and a lack of convergent validity. These existing measures are useful for identifying whether countries are fully democratic, fully autocratic, or somewhere in between.
They are not reliable, however, as a guide for tracking modest improvements or declines in democracy within a country over the period of time in which most DG projects operate.

While the United States, other donor governments, and international agencies making decisions about policy in the areas of health or economic assistance are able to draw on extensive databases that are compiled and updated at substantial cost by government or multilateral agencies mandated to collect such data (e.g., the World Bank, the World Health Organization, the Organization for Economic Cooperation and Development), no comparable source of data on democracy currently exists. Data on democracy are instead currently compiled by various individual academics on irregular and shoestring budgets, or by nongovernmental
organizations or commercial publishers, using different definitions and indicators of democracy.

These findings lead the committee to make a recommendation that we believe would significantly improve USAID's (and others') ability to track countries' progress and make the type of strategic assessments that will be most helpful for DG programming.

USAID and other policymakers should explore making a substantial investment in the systematic collection of democracy indicators at a disaggregated, sectoral level—focused on the components of democracy rather than (or in addition to) the overall concept.

If they wish to have access to data on democracy and democratization comparable to that relied on by policymakers and foreign assistance agencies in the areas of public health or trade and finance, a substantial government or multilateral effort to improve, develop, and maintain international data on levels and detailed aspects of democracy would be needed. This should not only involve multiple agencies and actors in efforts to initially develop a widely accepted set of sectoral data on democracy and democratic development but should seek to institutionalize the collection and updating of democracy data for a broad clientele, along the lines of the economic, demographic, and trade data collected by the World Bank, United Nations, and International Monetary Fund.

While creating better measures at the sectoral level to track democratic change is a long-term process, there is no need to wait on such measures to determine the impact of USAID's DG projects.
USAID has already compiled an extensive collection of policy-relevant indicators to track specific changes in government institutions or citizen behavior, such as levels of corruption, levels of participation in local and national decision making, quality of elections, professional level of judges or legislators, or the accountability of the chief executive. Since these are, in fact, the policy-relevant outcomes that are most plausibly affected by DG projects, the committee recommends that measurement of these factors, rather than sectoral-level changes, be used to determine whether the projects are having a significant impact on the various elements that compose democratic governance.

REFERENCES

Acuna-Alfaro, J. 2005. Measuring Democracy in Latin America (1972-2002). Working Paper No. 5, Committee on Concepts and Methods (C&M) of the International Political Science Association. Mexico City: Centro de Investigacion y Docencia Economicas.
Adcock, R., and Collier, D. 2001. Measurement Validity: A Shared Standard for Qualitative and Quantitative Research. American Political Science Review 95(3):529-546.
Altman, D., and Pérez-Liñán, A. 2002. Assessing the Quality of Democracy: Freedom, Competitiveness and Participation in Eighteen Latin American Countries. Democratization 9(2):85-100.
Alvarez, M., Cheibub, J.A., Limongi, F., and Przeworski, A. 1996. Classifying Political Regimes. Studies in Comparative International Development 31(2):3-36.
Arat, Z.F. 1991. Democracy and Human Rights in Developing Countries. Boulder: Lynne Rienner.
Beck, T., Clarke, G., Groff, A., Keefer, P., and Walsh, P. 2000. New Tools and New Tests in Comparative Political Economy: The Database of Political Institutions. Policy Research Working Paper 2283. Washington, DC: World Bank, Development Research Group. For further information, see http://www.worldbank.org/research/bios/pkeefer.htm and the Research Group Web site, http://econ.worldbank.org/.
Beetham, D., ed. 1994. Defining and Measuring Democracy. London: Sage.
Berg-Schlosser, D. 2004a. Indicators of Democracy and Good Governance as Measures of the Quality of Democracy in Africa: A Critical Appraisal. Acta Politica 39(3):248-278.
Berg-Schlosser, D. 2004b. The Quality of Democracies in Europe as Measured by Current Indicators of Democratization and Good Governance. Journal of Communist Studies and Transition Politics 20(1):28-55.
Berman, S. 1997. Civil Society and the Collapse of the Weimar Republic. World Politics 49(3):401-429.
Bertelsmann Foundation. 2003. Bertelsmann Transformation Index: Towards Democracy and a Market Economy. Gütersloh, Germany: Bertelsmann Foundation.
Boix, C., and Rosato, S. 2001. A Complete Data Set of Political Regimes, 1800-1999. Chicago: University of Chicago, Department of Political Science.
Bollen, K.A. 1980. Issues in the Comparative Measurement of Political Democracy.
American Sociological Review 45:370-390.
Bollen, K.A. 1993. Liberal Democracy: Validity and Method Factors in Cross-National Measures. American Journal of Political Science 37(4):1207-1230.
Bollen, K.A., and Paxton, P. 2000. Subjective Measures of Liberal Democracy. Comparative Political Studies 33(1):58-86.
Bowman, K., Lehoucq, F., and Mahoney, J. 2005. Measuring Political Democracy: Case Expertise, Data Adequacy, and Central America. Comparative Political Studies 38(8):939-970.
Casper, G., and Tufis, C. 2003. Correlation Versus Interchangeability: The Limited Robustness of Empirical Findings on Democracy Using Highly Correlated Data Sets. Political Analysis 11:196-203.
Coppedge, M. 2007. Presentation to Democracy Indicators for Democracy Assistance. Boston University, January 26.
Coppedge, M. Forthcoming. Approaching Democracy. Cambridge: Cambridge University Press.
Coppedge, M., and Reinicke, W.H. 1990. Measuring Polyarchy. Studies in Comparative International Development 25:51-72.
Dunn, J. 2006. Democracy: A History. New York: Atlantic Monthly Press.
Elkins, Z. 2000. Gradations of Democracy? Empirical Tests of Alternative Conceptualizations. American Journal of Political Science 44(2):287-294.
Finkel, S.E., Pérez-Liñán, A., and Seligson, M.A. 2007. The Effects of U.S. Foreign Assistance on Democracy Building, 1990-2003. World Politics 59(3):404-439.
Finkel, S.E., Pérez-Liñán, A., Seligson, M.A., and Tate, C.N. 2008. Deepening Our Understanding of the Effects of U.S. Foreign Assistance on Democracy Building: Final Report. Available at: http://www.LapopSurveys.org.
Foweraker, J., and Krznaric, R. 2000. Measuring Liberal Democratic Performance: An Empirical and Conceptual Critique. Political Studies 48(4):759-787.
Freedom House. 2006. Freedom of the Press 2006: A Global Survey of Media Independence. New York: Freedom House.
Freedom House. 2007. Methodology, Freedom in the World 2007. New York: Freedom House. Available at: http://www.freedomhouse.org/template.cfm?page=351&ana_page=333&year=2007. Accessed on September 5, 2007.
Gasiorowski, M.J. 1996. An Overview of the Political Regime Change Dataset. Comparative Political Studies 29(4):469-483.
Gleditsch, K.S., and Ward, M.D. 1997. Double Take: A Re-examination of Democracy and Autocracy in Modern Polities. Journal of Conflict Resolution 41:361-383.
Hadenius, A. 1992. Democracy and Development. Cambridge: Cambridge University Press.
Hadenius, A., and Teorell, J. 2005. Assessing Alternative Indices of Democracy. Committee on Concepts and Methods Working Paper Series. Mexico City: Centro de Investigacion y Docencia Economicas (CIDE).
Hewitt de Alcantara, C. 1998. Uses and Abuses of the Concept of Governance. International Social Science Journal 50(155):105-113.
Kaufmann, D., Kraay, A., and Mastruzzi, M. 2006. Governance Matters V: Governance Indicators for 1996-2005. Washington, DC: World Bank.
Kekic, L. 2007. The Economist Intelligence Unit's Index of Democracy. Available at: http://www.economist.com/media/pdf/DEMOCRACY_INDEX_2007_v3.pdf. Accessed on February 23, 2008.
Knack, S., and Manning, N. 2000. Why Is It So Difficult to Agree on Governance Indicators? Washington, DC: World Bank.
Kurtz, M.J., and Schrank, A. 2007. Growth and Governance: Models, Measures, and Mechanisms. Journal of Politics 69(2).
Landman, T. 2003. Map-Making and Analysis of the Main International Initiatives on Developing Indicators on Democracy and Good Governance. Unpublished manuscript, University of Essex.
Levi, M. 1996.
Social and Unsocial Capital: A Review Essay of Robert Putnam's Making Democracy Work. Politics & Society 24:145-155.
McHenry, D.E. 2000. Quantitative Measures of Democracy in Africa: An Assessment. Democratization 7(2):168-185.
Moon, B.E., Birdsall, J.H., Ciesluk, S., Garlett, L.M., Hermias, J.H., Mendenhall, E., Schmid, P.D., and Wong, W.H. 2006. Voting Counts: Participation in the Measurement of Democracy. Studies in Comparative International Development 41(2):3-32.
Munck, G.L. 2006. Standards for Evaluating Electoral Processes by OAS Election Observation Missions. Paper prepared for the Organization of American States.
Munck, G.L., and Verkuilen, J. 2002. Conceptualizing and Measuring Democracy: Alternative Indices. Comparative Political Studies 35(1):5-34.
Pagden, A. 1998. The Genesis of Governance and Enlightenment Conceptions of the Cosmopolitan World Order. International Social Science Journal 50(1):7-15.
Reich, G. 2002. Categorizing Political Regimes: New Data for Old Problems. Democratization 9(4):1-24.
Schaffer, F.C. 1998. Democracy in Translation: Understanding Politics in an Unfamiliar Culture. Ithaca, NY: Cornell University Press.
Schumpeter, J.A. 1942. Capitalism, Socialism and Democracy. New York: Harper.
Tetlock, P. 2005. Expert Political Judgment: How Good Is It? How Can We Know? Princeton, NJ: Princeton University Press.
Thomas, M.A. 2007. What Do the Worldwide Governance Indicators Measure? Unpublished manuscript, School of Advanced International Studies, Johns Hopkins University.
Transparency International. 2007. Corruption Perceptions Index. Available at: http://www.transparency.org/policy_research/surveys_indices/cpi. Accessed on September 5, 2007.
Treier, S., and Jackman, S. 2003. Democracy as a Latent Variable. Paper presented at the Political Methodology meetings, University of Minnesota, Minneapolis-St. Paul.
USAID (U.S. Agency for International Development). 1998. Handbook of Democracy and Governance Program Indicators. Washington, DC: Center for Democracy and Governance. Available at: http://www.usaid.gov/our_work/democracy_and_govenance/publications/pdfs/pnacc390.pdf. Accessed on August 1, 2007.
Vanhanen, T. 2000. A New Dataset for Measuring Democracy, 1810-1998. Journal of Peace Research 37:251-265.
Vermillion, J. 2006. Problems in the Measurement of Democracy. Democracy at Large 3(1):26-30.