Presentation of Information in National Patterns
Although the content and methods for National Patterns were the primary topics for the steering committee and workshop, there was also interest in whether the current presentation of the information, generally through use of tables, could be enhanced either through the use of different tabular presentations or through the use of maps. Thus, one workshop session was devoted to the presentation of information in National Patterns, led by Daniel Carr of George Mason University. He discussed the work he has done for federal agencies, such as the Bureau of Labor Statistics and the National Cancer Institute, and how spatial and longitudinal data could be displayed in a more informative way for National Patterns users.
BASIC PRINCIPLES AND COGNITIVE SCIENCE
Carr began by offering his four design principles for statistical graphics, two focusing on content and two on presentation:
1. Feature meaningful, accurate comparisons.
2. Provide a rich context for interpretation by including reference values, related variables, temporal or spatial context, and an assessment of uncertainty or quality.
3. Strive for a simple appearance.
4. Attract and engage people by providing added-value appearance, interactive choices with guidance and feedback, educational pathways, and opportunities to contribute.
He added that statistical graphics design involves compromise because these four principles can be in conflict and because designs must also address the constraints of the media and audience.
The design of statistical graphics should consider human cognitive limitations and strengths, he noted. Often overlooked limitations include the universal forms of blindness called inattentional and change blindness. To demonstrate inattentional blindness, he described a person-swap scenario. In this scenario, Person A is giving directions to Person B, but during a moment of distraction for Person A, Person C takes the place of Person B. In an experiment designed by Derren Brown,1 Person A continues giving directions without noticing that the person asking for directions is not the same. Carr said that the big bottleneck in human visual processing is visual working memory, which can handle only from one to three (simple) visual objects (see Ware, 2008). Visual objects not immediately needed are not retained and may not be stored in long-term memory.
In addition to visual memory, verbal reasoning is needed to work with numbers and think about quantitative graphics. Human working auditory memory consists of a 2-second sound loop, which is also limiting in terms of presenting information. Carr suggested that although there is a lot to learn from cognitive science, enough is already known to improve statistical graphics designs and the accompanying text.
Although there are many barriers to change, some guidance is relatively easy to put to work. Having too much information to process easily in one chunk makes a plot appear complex, whether or not it is. For example, a graphics panel showing more than four lines (time series) appears complex. Perceptually grouping the lines into panels of four or fewer lines each simplifies the plot's appearance.
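The four-lines-per-panel guidance can be illustrated with a minimal Python sketch. It is not Carr's code: the function name `group_into_panels` and the choice to balance panel sizes are assumptions made here for illustration, and the series names in the usage note are hypothetical.

```python
import math
from typing import List, Sequence


def group_into_panels(series_names: Sequence[str],
                      max_per_panel: int = 4) -> List[List[str]]:
    """Split time-series names into panels of at most `max_per_panel` lines.

    Panel sizes are balanced (an assumption of this sketch), so nine series
    become three panels of three rather than panels of four, four, and one.
    """
    if max_per_panel < 1:
        raise ValueError("max_per_panel must be at least 1")
    n = len(series_names)
    if n == 0:
        return []
    n_panels = math.ceil(n / max_per_panel)
    base, extra = divmod(n, n_panels)  # the first `extra` panels get one more line
    panels: List[List[str]] = []
    start = 0
    for i in range(n_panels):
        size = base + (1 if i < extra else 0)
        panels.append(list(series_names[start:start + size]))
        start += size
    return panels
```

For example, calling `group_into_panels` on ten hypothetical series names yields panels of four, three, and three lines, each of which could then be drawn in its own plot panel with a shared time axis.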
Cognitive strengths include adjustable visual queries, parallel processing of visual and auditory systems, and for some tasks the ability to adapt and learn. This ranges from the priming of neurons to respond faster when a similar pattern appears to training based on the reduction of the cognitive effort associated with learning new tasks.
Carr said that design guidance is applicable to both statistical tables and graphics, and he offered ideas for redesign of a National Patterns table: see Table 6-1. One design feature is the use of black dots to call attention to total and subtotal columns. People can tune their vision to scan for black dots, just as they can tune their vision to scan for red items in the room. As described by Ware (2008), the things that pop out on a page are the things for which people can use their top-down control to tune for low-level visual processing. Among the several other changes is the use of light gray lines in the background to provide smaller perceptual groups of rows to support
______________
1 See http://www.youtube.com/watch?v=vBPG_OBgTWg [March 2013].
TABLE 6-1 Proposed Presentation of R&D Expenditures by Performing Sector and Source of Funds Over Time
Table 2. US Basic Research Expenditures by Performance Sectors and Funding Sources
Dollar Units: Current Millions | Years: 1953−2008
Year | All Performers: All Sources | Federal: Federal • | Industry: Total • | Industry: Federal | Industry: Industry | Industry FFRDCs: Total • | U&C: Total • | U&C: Federal | U&C: Other Government | U&C: Industry | U&C: U&C | U&C: Other Nonprofit | U&C FFRDCs: Total • | Other nonprofit organizations: Total • | Other nonprofit organizations: Federal | Other nonprofit organizations: Industry | Other nonprofit organizations: Nonprofit | Nonprofit FFRDCs: Total • |
1953 | 460 | 102 | 151 | 19 | 132 | NA | 123 | 82 | 7 | 13 | 6 | 16 | 36 | 48 | 27 | 9 | 12 | NA |
1954 | 509 | 96 | 166 | 23 | 143 | NA | 148 | 97 | 10 | 15 | 8 | 18 | 44 | 55 | 31 | 11 | 13 | NA |
1955 | 579 | 98 | 189 | 27 | 162 | NA | 180 | 117 | 14 | 17 | 12 | 21 | 50 | 63 | 36 | 13 | 14 | NA |
1956 | 718 | 114 | 253 | 37 | 216 | NA | 220 | 143 | 19 | 20 | 15 | 24 | 58 | 74 | 42 | 15 | 17 | NA |
1957 | 814 | 124 | 271 | 41 | 230 | NA | 261 | 167 | 25 | 23 | 20 | 27 | 72 | 87 | 49 | 15 | 23 | NA |
1958 | 944 | 149 | 295 | 43 | 252 | NA | 312 | 202 | 31 | 24 | 24 | 31 | 85 | 103 | 59 | 16 | 28 | NA |
1959 | 1,087 | 165 | 320 | 72 | 248 | NA | 388 | 263 | 38 | 24 | 28 | 36 | 95 | 120 | 72 | 18 | 30 | NA |
1960 | 1,286 | 184 | 376 | 79 | 297 | NA | 485 | 341 | 45 | 25 | 33 | 41 | 106 | 136 | 85 | 21 | 30 | NA |
1961 | 1,512 | 230 | 395 | 81 | 314 | NA | 598 | 432 | 54 | 25 | 40 | 48 | 126 | 164 | 105 | 22 | 37 | NA |
1962 | 1,824 | 252 | 488 | 143 | 345 | NA | 737 | 546 | 64 | 25 | 48 | 55 | 148 | 200 | 130 | 24 | 46 | NA |
1963 | 2,115 | 285 | 522 | 147 | 375 | NA | 909 | 689 | 75 | 25 | 58 | 63 | 175 | 225 | 150 | 25 | 50 | NA |
1964 | 2,396 | 339 | 507 | 123 | 384 | 42 | 1,071 | 824 | 84 | 25 | 70 | 68 | 200 | 238 | 166 | 25 | 47 | NA |
1965 | 2,664 | 375 | 563 | 157 | 406 | 29 | 1,221 | 944 | 94 | 27 | 86 | 70 | 218 | 260 | 179 | 29 | 52 | NA |
1966 | 2,930 | 410 | 593 | 142 | 451 | 31 | 1,380 | 1,066 | 104 | 29 | 106 | 75 | 239 | 278 | 188 | 32 | 58 | NA |
1967 | 3,168 | 434 | 595 | 168 | 427 | 34 | 1,554 | 1,188 | 114 | 34 | 136 | 83 | 263 | 289 | 194 | 34 | 61 | NA |
1968 | 3,376 | 482 | 607 | 145 | 462 | 35 | 1,681 | 1,265 | 131 | 38 | 156 | 91 | 276 | 296 | 196 | 37 | 63 | NA |
1969 | 3,491 | 545 | 581 | 123 | 458 | 37 | 1,754 | 1,288 | 153 | 40 | 171 | 103 | 272 | 302 | 192 | 43 | 67 | NA |
1970 | 3,594 | 562 | 566 | 122 | 444 | 36 | 1,855 | 1,323 | 179 | 43 | 196 | 115 | 265 | 311 | 195 | 44 | 72 | NA |
1971 | 3,720 | 581 | 557 | 101 | 456 | 33 | 1,968 | 1,385 | 194 | 50 | 214 | 127 | 252 | 329 | 207 | 45 | 77 | NA |
1972 | 3,850 | 603 | 554 | 91 | 463 | 39 | 2,038 | 1,437 | 195 | 55 | 216 | 134 | 270 | 347 | 216 | 47 | 84 | NA |
1973 | 4,099 | 652 | 595 | 96 | 499 | 36 | 2,103 | 1,489 | 196 | 59 | 223 | 137 | 343 | 371 | 232 | 49 | 90 | NA |
1974 | 4,511 | 715 | 650 | 114 | 536 | 49 | 2,282 | 1,609 | 204 | 66 | 250 | 153 | 415 | 401 | 245 | 54 | 102 | NA |
1975 | 4,875 | 760 | 677 | 104 | 573 | 53 | 2,480 | 1,768 | 212 | 72 | 264 | 164 | 476 | 430 | 255 | 59 | 116 | NA |
1976 | 5,373 | 850 | 750 | 116 | 634 | 69 | 2,675 | 1,924 | 218 | 75 | 283 | 175 | 556 | 474 | 278 | 65 | 131 | NA |
1977 | 6,008 | 943 | 836 | 135 | 701 | 75 | 2,967 | 2,114 | 232 | 89 | 334 | 198 | 667 | 521 | 301 | 72 | 148 | NA |
1978 | 6,959 | 1,044 | 941 | 156 | 785 | 94 | 3,376 | 2,399 | 260 | 107 | 398 | 213 | 906 | 598 | 351 | 79 | 168 | NA |
1979 | 7,836 | 1,112 | 1,054 | 161 | 893 | 104 | 3,828 | 2,719 | 286 | 128 | 466 | 229 | 1,050 | 689 | 413 | 87 | 190 | NA |
1980 | 8,745 | 1,212 | 1,205 | 170 | 1,035 | 120 | 4,315 | 3,061 | 307 | 156 | 544 | 248 | 1,167 | 726 | 416 | 95 | 215 | NA |
1981 | 9,658 | 1,343 | 1,477 | 164 | 1,313 | 137 | 4,737 | 3,331 | 338 | 183 | 615 | 269 | 1,284 | 671 | 324 | 105 | 243 | 9 |
1982 | 10,651 | 1,522 | 1,776 | 253 | 1,523 | 128 | 5,091 | 3,475 | 368 | 215 | 716 | 317 | 1,366 | 759 | 369 | 115 | 275 | 9 |
SOURCE: Daniel Carr’s re-expression of National Patterns of R&D Resources: 2009 Data Update. NSF 12-321.
local focus and more accurate horizontal scanning. The grouping in units of five is a compromise, being more than the suggested perceptual groups of four. However, thinking about years in units of 10 is convenient and use of groups of 5 is compatible with this.
Carr added that interactive tables have a history and a future. Historically, Table Lens software has provided many interactive features (see Rao and Card, 1994). Now, variable selection, row and column reordering, and focusing tools are increasingly common, and statistical methods are more available to support the making of comparisons in a table context.
Although tables remain important for some tasks, there are merits to using statistical graphics for many discovery, analysis, and communication tasks. Carr showed a linked micromap that uses a graphical user interface for variable selection and uses statistical graphics to represent estimates and confidence intervals for both the primary variable of interest and related variables.2 The graphics include reference values and color-linked micromaps that show spatial patterns. The interactive applet provides “drill-down” capabilities from states to counties, and supports reordering rows and columns.
The website for the Nation’s Report Card3 provides instructive example tables related to state achievement (student averages) on standardized tests. These tables foster statistical comparisons in two ways. First, the variables available for interactive selection include differences, such as differences between male and female average scores for each state. The 95 percent confidence intervals of such differences for each state can draw attention to states whose intervals do not include zero. Such differences are unlikely to be due to random variation and so are of interest in investigating disparities.
Second, the interactive table on the National Center for Education Statistics website supports selection of a reference value to use in making comparisons. For example, selecting the national public value adds a table column that shows states in one of three categories: those with confidence intervals below, including, or above the national public reference value. People find it easy to think in terms of three ordered categories, such as small, medium, and large, and so they also find it easy to think of states as belonging to one of three categories. This column of state categories can be
______________
2 The example shown at the workshop is available at: http://statecancerprofiles.cancer.gov/micromaps, a database maintained by the National Cancer Institute.
3 This website of the National Center for Education Statistics is available at: http://nces.ed.gov/nationsreportcard/states [January 2013].
represented in a map, where three colors can encode each state’s category. An alternative encoding shows three panels of maps with one panel for each class. States are then highlighted in the panel indicating their category.
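The two comparison devices described above, a confidence interval for a difference and a three-category classification against a reference value, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the interval uses a normal approximation with independent standard errors, which is an assumption made here and not necessarily the method used on the National Center for Education Statistics website.

```python
import math
from typing import Tuple


def difference_interval(mean_a: float, se_a: float,
                        mean_b: float, se_b: float,
                        z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95 percent confidence interval for mean_a - mean_b.

    Assumes the two estimates are independent and approximately normal
    (a simplifying assumption of this sketch).
    """
    diff = mean_a - mean_b
    half_width = z * math.sqrt(se_a ** 2 + se_b ** 2)
    return diff - half_width, diff + half_width


def classify_interval(lower: float, upper: float, reference: float) -> str:
    """Place an interval in one of three ordered categories.

    'below'     : the whole interval lies below the reference value
    'including' : the interval contains the reference value
    'above'     : the whole interval lies above the reference value
    """
    if upper < reference:
        return "below"
    if lower > reference:
        return "above"
    return "including"
```

With a reference value of zero, `classify_interval` applied to the output of `difference_interval` flags the states whose male-female difference interval excludes zero. With the national public value as the reference, it produces the three-category column that can then be encoded with three map colors.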
Figure 6-1 shows the extension of this encoding approach to representing three variables: 8th grade reading, 4th grade mathematics, and 8th grade mathematics achievement scores. The center map shows Illinois in gray, which indicates that Illinois's 8th grade reading achievement score is similar to the national public achievement score. Because the plot highlights Illinois in the middle row and middle column, its 4th and 8th grade mathematics average scores are also similar to the national public average scores. States whose average score confidence intervals are all below the national public scores are purple and appear in the lower left panel. States whose average score confidence intervals are all above the national public scores are green and appear in the top right panel. This comparative micromap design shows both the association among these three variables and spatial patterns.
Carr noted that there are new designs based on confidence intervals that are emerging from exploratory graphic designs for showing three variables in a geospatial context. He showed additional education examples that he created using dynamic Java software called CCmaps (conditioned choropleth maps), which serves as an exploratory tool. It also uses a 3 × 3 grid of maps with highlighted states. One key difference is that three-class
FIGURE 6-1 State educational achievement scores for three variables: An example of micromaps. SOURCE: Carr (2012).
“sliders”—buttons that allow for continuous changing of class definition—are positioned above, to the right and below the grid of maps and provide control of the partitioning of states to low, middle, and high classes for each of three interactively selected variables. The software also provides dynamic statistical feedback, guidance about slider settings, and alternative views. Carr and Pickle (2010) describe these capabilities and also address other design issues, such as simplifying map boundaries for visualization purposes.
International interest in National Patterns is a reason, Carr said, to make world maps to show spatial-temporal patterns for many nations. He noted, however, that world maps typically have visibility problems for small nations and that for some data the big difference between neighboring nations makes it more difficult to see and learn spatial patterns.
As an alternative to world maps, Carr showed examples from Gapminder.4 He mentioned that Gapminder’s animated bubble scatter-plots help visualize time series for two variables. Although animation poses some visual and cognitive problems, they can be partially addressed.
Carr said that one can juxtapose a few state maps to show all the state class memberships and changes over time. He showed a temporal change maps design that displays state expenditures of R&D funding relative to the gross domestic product for just four of the yearly maps: see Figure 6-2. In this design, an analyst uses a dynamic three-class slider below the maps to put states into low, middle, and high classes based on their values. The blue, gray, and red colors in the middle row of maps indicate the class memberships over time. However, even when studying two maps that are in sight, such as the 1993 and 1998 maps, it is hard to find all the class changes. When people’s eyes jump from map to map in movements called saccades, they are effectively blind, and their change detectors are reset. People can only remember a little area in focal attention long enough to make a comparison across maps.
In general, careful comparison of two juxtaposed similar images requires tedious back-and-forth comparisons of small corresponding areas. People see the new focal location, but the usual feedback about change in the large visual field is absent. Change blindness is the phenomenon of not noticing many changes because one's visual change detectors have been reset. Explicitly showing the class changes, as in Figure 6-2, addresses the change blindness problem. Specifically, the top row of maps shows all the states that changed to a higher category in their new color, which is either gray or red. The bottom row shows all the states that changed to a lower category in their new color, which is either gray or blue.
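The logic behind the change rows can be sketched as follows. This is a minimal Python illustration and not the TCmaps implementation: the function names, the treatment of values exactly at a cut point, and the example values in the usage note are all assumptions. The blue, gray, and red color assignments follow the description of Figure 6-2.

```python
from typing import Dict, Tuple

# Ordered classes and the colors described for Figure 6-2.
CLASS_ORDER = {"low": 0, "middle": 1, "high": 2}
CLASS_COLOR = {"low": "blue", "middle": "gray", "high": "red"}


def assign_class(value: float, low_cut: float, high_cut: float) -> str:
    """Three-class slider: values between the two cuts (inclusive) are 'middle'."""
    if value < low_cut:
        return "low"
    if value > high_cut:
        return "high"
    return "middle"


def class_changes(before: Dict[str, float], after: Dict[str, float],
                  low_cut: float, high_cut: float
                  ) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (moved_up, moved_down), mapping each changed state to its new color.

    Only states present in both years are compared; unchanged states are
    omitted, which is what lets the change rows show changes explicitly.
    """
    moved_up: Dict[str, str] = {}
    moved_down: Dict[str, str] = {}
    for state in sorted(before.keys() & after.keys()):
        old = assign_class(before[state], low_cut, high_cut)
        new = assign_class(after[state], low_cut, high_cut)
        if CLASS_ORDER[new] > CLASS_ORDER[old]:
            moved_up[state] = CLASS_COLOR[new]    # gray or red
        elif CLASS_ORDER[new] < CLASS_ORDER[old]:
            moved_down[state] = CLASS_COLOR[new]  # gray or blue
    return moved_up, moved_down
```

Given hypothetical ratios for two years and slider cuts of 2.0 and 3.0, a state moving from 1.0 to 2.5 would appear in gray in the upper change map, and a state moving from 4.0 to 1.0 would appear in blue in the lower change map.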
Carr and Pickle (2010) describe a variety of comparative micromaps. Their examples include maps that can be indexed by such variables as age
______________
4 For information, see http://www.gapminder.org [January 2013].
FIGURE 6-2 Ratio of state gross R&D expenditures to GDP, as illustrated by temporal change maps. SOURCE: Carr (2012).
group, sex, and race as well as by time. Carr noted that Java software called TCmaps (temporal change maps) now supports dynamic interaction with such graphics and includes some new designs: for an example, see Figure 6-3. This example addresses percent changes in black and white populations for 4 years in Louisiana parishes. Hurricanes Katrina and Rita occurred after the 2005 census and the impact was quite different for the two populations. The example shows linked filter sliders on the left that are set to highlight parishes with changes of more than 1.78 percent. The highlighted parishes appear in the second and fourth rows of maps. The first and fifth rows are change maps that show parishes that changed from unselected to selected using color-filled polygons. Parishes that changed from selected to unselected appear with colored outlines. The maps in the center row are called cross maps. These represent parish class membership in a 2 × 2 matrix indicating selected or not selected for the two populations. Parishes in the background were not selected. Yellow indicates parishes that have percent changes of more than 1.78 for both black and white populations.
Carr said that the above examples suggest a variety of graphics designs that NCSES could produce. He commended NCSES on many of the graphical displays in the 2012 Science and Engineering Indicators Digest, the 2011 Women, Minorities, and Persons with Disabilities in Science and Engineering, the 2010 Key Science and Engineering Indicators Digest, and the 2009 Doctorate Recipients from U.S. Universities. He noted the excellent use of linked graphics and text, the attention given to many details, and high print
FIGURE 6-3 Comparing populations in Louisiana parishes, 2004-2007: Illustration of use of micromaps with linked highlighting sliders and subset cross plots. NOTES: The top panel shows changes in the black population; the middle panel is a cross map, and the bottom panel shows changes in the white population.
SOURCES: Carr (2012) and Zhang (2012).
quality that provided a value-added appearance. He observed some style variations across the documents and suggested that variety can be good.
As an example of possible improvement, he offered a different approach for the legend in a figure from Women, Minorities, and Persons with Disabilities in Science and Engineering that puts the legend in empty space to increase plot resolution: see Figures 6-4a and 6-4b. His variant is designed to use the same space and include the same content. Changes include putting the legend above the plot, changing the y-axis label style, and adding grid line labels on the right where people are likely to assess the values. But, he said, he is not sure whether the result is easier or harder for readers. His concern was the complex appearance due to too many overplotted lines in the same panel and too many color and label links. Thus, his alternate uses web graphics space to reduce the complex appearance.
In conclusion, Carr pointed out that web graphics can open many doors. They can allow for user variable selection, variable transformations, focusing tools, full color, the opportunity to comment and contribute, and
FIGURE 6-4a Time plot of percent minorities with science and engineering bachelor’s degrees, 1989-2008: Original presentation.
NOTE: Data are not available for 1999.
SOURCE: 2011 Women, Minorities, and Persons with Disabilities in Science and Engineering.
FIGURE 6-4b Time plot of percent minorities with science and engineering bachelor’s degrees, 1989-2008: Alternative design.
NOTE: Data are not available for 1999.
SOURCE: Carr (2012).
better access to data and graphics. In addition to his book with Linda Pickle (Carr and Pickle, 2010) and to Rao and Card (1994), Carr suggested that people look at Ware (2008) and Kosslyn, Thompson, and Ganis (2006).5
______________
5 Linked micromaps can be found at http://www.gis.cancer.gov/tools/micromaps [March 2013] and conditioned micromaps can be found at http://www.math.yorku.ca/SCS/sasmac/ccmap.html [March 2013]. Carr also mentioned dynamically conditioned choropleth maps, which are described at http://dgrc.org/dgo2004/disc/demos/tuesdemos/carr.pdf [March 2013].
Chris Hill asked what the programming demands are for implementing such graphical tools. Carr responded that it requires knowledge of Java, which does take some expertise. Michael Cohen asked whether there are concerns regarding disclosure avoidance. Carr answered that the suppression that would be used for a comparable tabular presentation is applied prior to implementation of these graphics. Karen Kafadar asked whether anyone had carried out usability studies as a result of the implementation of such graphical tools. Carr said that users often wanted structure that went against his principles but was closer to what they were accustomed to. Two examples are the need to provide context up front and the need to rank things from top to bottom, rather than have the best be in the middle of the map.
Kafadar wondered whether there was feedback from the general user community as to utility. Carr did not think so. John Jankowski wondered how the variables used to define subgroups were determined. Carr responded that the selection of variables was due to collaboration between the subject-matter experts and himself. Cohen said that there is now some interest in R&D patterns both over time and subnationally, and he asked whether these are the types of tools that could be used to display such patterns. Carr agreed that state structure over time could be represented. Kafadar pointed out that National Patterns had 168 variables and asked whether it would be hard to select the right subset for each response of interest. Carr said that nowadays searching a file with that number of variables was relatively easy.