COMMUNICATING SCIENCE AND ENGINEERING DATA

4 Engaging Data Users

In Chapter 1, the dynamic and growing role of the Internet as a force for change in National Center for Science and Engineering Statistics (NCSES) dissemination practices was briefly discussed. Rosabeth Moss Kanter has made the point that, although the Internet offers new challenges and opportunities, the quality of the customer experience remains centrally important to the success of many organizations (Kanter, 2011). In developing dissemination policies and procedures, meeting the needs of data users in a manner that exceeds their expectations should be a key goal for NCSES.

Although NCSES has long been committed to serving the needs of data users, it has not gathered sufficient information on who its users are, how they use its data, and how well it is meeting their needs. NCSES has made several notable attempts to gather this intelligence about user needs, but it does not have a formal, systematic, consistent, structured, and continuing program for doing so.

One problem for NCSES is that there are multiple communities of users for which products must be developed. Furthermore, the breadth and diversity of NCSES data users will expand as it orients itself to the broader mission mandated by the America COMPETES Act. For the most part, outreach efforts have been addressed to those whom NCSES perceives to be in its main user community. That community consists mostly of researchers and analysts of research and development (R&D) expenditures and the R&D workforce, particularly those concerned with federal science policy.

The panel heard from key data users in the course of its workshop and through interviews conducted by panel members and staff. These users represented the legislative and administrative branches of the federal government, the organizations that support federal government science and engineering (S&E) analysis, the academic community, and regional economic development analysts. In the presentations and interviews, these users were asked to address, from their perspective, the current practices of the National Science Foundation (NSF) for communicating and disseminating information in hard-copy publication format as well as on the Internet through the NCSES website and through the Integrated Science and Engineering Resources Data System (WebCASPAR) and the Scientists and Engineers Statistical Data System (SESTAT) database retrieval systems.

CONGRESSIONAL COMMITTEE STAFF

Panel and staff members met with staff of the House Subcommittee on Research and Science Education to discuss congressional staff uses of the NSF S&E information. Staff work in support of the committee is a fast-turnaround operation, requiring speed in retrieving data and easy access. In its work, the committee staff makes extensive use of S&E Indicators in hard copy. The staff relies on the report narrative to help them interpret the data; the analysis helps them put the numbers into perspective. They expressed the view that data tables lacking explanation are subject to misinterpretation. Like other user groups interviewed by the panel, the congressional staff expressed concern about the timeliness and frequency of the survey-based data.

The staff uses the website mainly when away from the office and from hard copies of the publications. They most often use Google as the search engine for discovering S&E information, commenting that the search capability of the NSF site is cumbersome and unreliable. In response to a question about use of WebCASPAR, there seemed to be confusion as to what WebCASPAR is and whether, in fact, they used it at all.
The staff often turns to the American Association for the Advancement of Science web database when they need NSF statistics, because it is readily available and comprehensive. The House committee staff would like to have access to Indicators in June rather than in the following spring, and the committee had proposed legislation to make that happen; the legislation was not supported in the Senate.

Staff also expressed a need for more usability tools, such as the ability to link to other data. This capability may be available in Data.gov, but the staff has not used Data.gov very much. They were also interested in the possibility of visualization tools for the data. Some data needed for support of legislative initiatives are not presented in the aggregation (i.e., tables and cross-cuts) they desire. For example, the staff would like disaggregated S&E workforce and science, technology, engineering, and mathematics education data by occupation, industry, and geography. They also need more data broken out by field of science and engineering.

CONGRESSIONAL RESEARCH SERVICE

As an arm of the Congress, the Congressional Research Service (CRS) responds to members of Congress and the congressional committees. In meeting the requirements of Congress for objective and impartial analysis, CRS publishes periodic reports on trends in federal support for R&D, as well as reports on special topics in R&D funding. Both types of studies rely heavily on data from NSF, both as originally published and as summarized in such publications as Science and Engineering Indicators and as extracted from the NSF website.

The panel met with Christine Matthews, specialist in science and technology policy in the Resources, Science, and Industry Division of the Congressional Research Service. She is the primary staff contact with NSF. Her recent publications include The U.S. Science and Technology Workforce (2009), Science, Engineering, and Mathematics Education: Status and Issues (2007), and National Science Foundation: Major Research Equipment and Facility Construction (2009).

Matthews is a frequent user of NSF information. She makes 8-10 visits to the NSF website each day and is a listserv subscriber. Although she visits the NSF website often during a given day, many of those searches are on the general NSF awards site and sites for divisions other than NCSES. In addition to general information on S&E expenditures and workforce, she specifically references data on academic R&D for historically black colleges and universities (HBCUs) and information on R&D facilities and equipment. She directs most of her specific inquiries through the NSF congressional liaison office, mentioning staff member George Wilson.
She commented that her use of the data is limited by the reduction in the amount of published information in NSF reports that accompanied the shift from hard-copy to electronic dissemination of several of the key reports. The HBCU data, for example, used to be published in a special report with analysis and extensive tables, but now they appear only as an InfoBrief and in data tables. Most of her data requests are filled by data readily available on the website. She has requested special data runs only a few times, noting that not everyone has the ability to request special data runs. Her experience with WebCASPAR is positive, as it is user-friendly. She has not used SESTAT.

The timeliness of the data is not a particular problem for her. She recognizes that the data require time for collection and processing. For most of her uses, the data are sufficiently timely. She is able to satisfactorily explain the lags to congressional staff members when pressed. She does not generally use visualizations of NCSES data, but when she does, she would prefer visualizations in color.

OFFICE OF SCIENCE AND TECHNOLOGY POLICY

Representing the Office of Science and Technology Policy (OSTP), Kei Koizumi summarized the extensive use of NSF S&E information by this agency of the Executive Office of the President. He accesses the NCSES data primarily through the NCSES website, using the detailed statistical tables for individual surveys. He commented that the InfoBrief series is useful in that it informs him about which data are new. He reads each InfoBrief and explores some of the data further. For data outside his core area (R&D expenditures data), he often looks for the data in S&E Indicators, and, if needed, he goes to the most current data on the NCSES website. He uses WebCASPAR to access historical data and long time series.

His overall comments focused attention on the timeliness of the data, suggesting that, to users, the data are never timely enough, although some of the lags are understandable. He remains optimistic that next year the data will be available earlier. He expressed concerns over the quality of the data and the methodology employed in the federal funds survey, concerns that were summarized in a recent National Research Council report (National Research Council, 2010).

SCIENCE AND TECHNOLOGY POLICY INSTITUTE

The Science and Technology Policy Institute (STPI) was created by Congress in 1991 to provide rigorous objective advice and analysis to OSTP and other executive branch agencies, offices, and councils. Bhavya Lal and Asha Balakrishnan reported on the activities and interests of STPI, which can be considered a very sophisticated user of NSF S&E information. STPI supports sponsors in three broad areas: strategic planning; portfolio, program, and technology evaluation; and policy analysis and assessment.
In their presentation, Lal and Balakrishnan reported on several specific examples of the attempts by STPI to use NSF S&E information. In one task, investigators sought to determine the amount of research funded by government and industry for specific subfields of interest (i.e., networking and information technology). They were able to obtain basic research as a percentage of R&D “by source” and “by performer” for government and industry, but not broken out by specific fields or sectors of interest as broad as networking and information technology. They were able to get data on industry R&D by fields (i.e., North American Industry Classification System codes), but without the breakdown of basic research, applied research, and development funding. Based on this experience, the investigators recommended that NSF provide access to the data in a raw format.

Their overall view was that access to NSF-NCSES data tables and briefs is extremely helpful in STPI’s support of OSTP and executive agencies. However, access to the data in a raw format would better enable assessment of emerging fields. The STPI researchers would like to obtain the data sets that underlie the special tabulations related to publications, patents, and other complex data. Similarly, they would like access to more notes on conversions, particularly for international data, to understand underlying assumptions; one example is China’s S&E doctoral degrees. For their work, they requested more detail on R&D funding and R&D obligations by field of science and by agency, although, for their needs, those data need not be publicly available.

ACADEMIC USES

Paula Stephan of Georgia State University, who classifies herself as a “chronic” user of NSF S&E information, summarized her uses of the data. She has a license with NSF, and about 40 to 50 times a year she uses restricted files pertaining to the Survey of Doctorate Recipients (SDR), the Survey of Earned Doctorates (SED), and SESTAT, as well as InfoBriefs and the Science and Engineering Indicators appendix tables. She accesses data through WebCASPAR. Graduate students use WebCASPAR to build tables and create such variables as stocks of R&D, stocks of graduate students, and stocks of postdoctorates by university and field.

She reported that WebCASPAR can be difficult for new users to navigate, but they have to use it because the NCSES web page does not always have the most up-to-date links to data. For example, the number of doctorates for 2007 and 2008 is available only from WebCASPAR. She commented that the S&E Indicators appendix tables are easy to use and that the tables are very well named, so it is easy to find data. The ability to export the data to Excel makes the data easy to analyze.
Stephan noted that she does not use table tools, but her colleague, Henry Sauermann, did so for a study, and he reported that table tools provided him with exactly what he needed (starting salaries for industry life scientists). She pointed out that NSF staff have been very responsive to user needs. For example, in 2002 users recommended that NCSES collect information on starting salaries of new Ph.D.s in the SED, and, beginning in 2007, the question was on the SED.

She suggested a need for more user support. Data workshops were held for three years that brought together users and potential users of licensed data. This same approach could be useful for acclimating users to web-based data. It would be a good way to find out how people use the data and to find out difficulties with or questions that people have about the data.
Like other users, Stephan commented that a major problem with the data is timeliness. The lack of timeliness affects the ability of researchers to assess current issues, such as the effect of the 2008-2010 recession on salaries, the availability of positions, the length of time individuals stay in postdoctoral status, and the mobility of S&E personnel.

As an example of the lag, she pointed out that the 2008 SDR will be publicly released in November 2010 but the restricted data will not be released for licensed use until sometime in 2011 (the data were collected in October 2008). Owing to this lag, the data will provide little useful information about how the recession affected careers: analysts will have to wait until fall 2012 to get the 2010 data and until sometime in 2013 to get the restricted data. Similarly, the earliest SED data collected during the recession (for July 1, 2008, to June 30, 2009) were not scheduled to be released until November 2010 (note: the data release was subsequently delayed to allow for correction of data quality issues in the racial/ethnic data). These are “early” recession data, although they will be analytically important because this will be the third year for which salary data have been collected in the SED: when these SED salary data are available, analysts will be able to learn a good deal by comparing them with data for earlier years. However, such analyses will have to wait until November 2011, when the 2010 SED (July 1, 2009, to June 30, 2010) data are released (and assuming that salary data are made available).

Stephan pointed out that timeliness is not a new issue. She quoted a 2000 National Research Council report: “SRS must substantially reduce the period of time between the reference date and data release date for each of its surveys to improve the relevance and usefulness of its data” (National Research Council, 2000, p. 5).
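The lags in Stephan's examples can be made concrete with simple date arithmetic. The following is a minimal sketch in Python, not an NCSES method. It uses only the months and years quoted above; the text gives no day of the month for the releases, so the first of the month is an assumed placeholder, and the SED release date is the originally scheduled one rather than the delayed one. The function name `lag_in_months` is made up for illustration.

```python
from datetime import date


def lag_in_months(reference: date, release: date) -> int:
    """Whole calendar months from the reference date to the release date."""
    return (release.year - reference.year) * 12 + (release.month - reference.month)


# Dates as quoted in the text. Day-of-month values for the releases are
# assumed placeholders; they do not affect the calendar-month count.
examples = {
    "2008 SDR, collection to public release": (date(2008, 10, 1), date(2010, 11, 1)),
    "SED 2008-2009, end of period to scheduled release": (date(2009, 6, 30), date(2010, 11, 1)),
}

for label, (reference, release) in examples.items():
    print(f"{label}: {lag_in_months(reference, release)} months")
```

Under these assumptions the sketch reports a lag of roughly two years for the public SDR release and almost a year and a half for the scheduled SED release, which is the kind of gap between reference date and release date that the 2000 report asked to be reduced.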
REGIONAL ECONOMIC DEVELOPMENT USERS

Jeffrey Alexander, a senior science and technology policy analyst with SRI International, is a frequent user of NSF S&E information and a contractor to NSF. In his presentation, he summarized his previous private-sector uses of the information, mainly focused on uses of the data for analysis of technology applications at the state policy level. He accessed data from the website and through use of WebCASPAR.

He stated a major caution about the comparability of data sources and noted that good metadata (data about the data) are not generally available for NCSES data. In particular, he said there is a need for more detailed geographic coding of the data so one can be confident in matching NSF data with data from the Bureau of Labor Statistics and other sources. Like other users, he expressed a concern with the timeliness of the data and said that timeliness is a key factor in the relevance of the data.

With regard to access, Alexander said he often needs trend data, so he most generally goes to the tables on the web page to extract specific data items. He has problems in downloading multiple files, and he finds that the WebCASPAR and SESTAT tools are not very user-friendly. A useful enhancement would be to enable searches for variables across the various surveys. He does not use the printed publications, although he finds that the InfoBriefs are very useful in announcing and highlighting new products.

Alexander suggested that NCSES needs to become a center of information for the user community, and it should devote more attention to reaching out to larger users with information about how to access data as well as to seek input for improvements.

LIMITATIONS OF USER ANALYSIS

The input received in the workshop and in the interviews was very helpful to the panel in framing its analysis of user needs. The users of NCSES data can conveniently, if imprecisely, be classified as primary users (those who directly use NCSES data in their research and analysis); secondary users (those who indirectly rely on NCSES products to understand and gauge the implications for programs, policy, and advocacy, and those who assist others in obtaining access to the data); and tertiary users (the public).

The input of primary users was extensively provided in the panel workshops and in interview sessions, and some information was gathered from secondary users, but information from tertiary users was less systematically gathered and is given less attention in this report. Only now that NCSES has begun to conduct consumer surveys is information about the needs of all user groups becoming known. It is incumbent on NCSES to consider the needs of all of these groups and the technology platforms they use to access the data as the agency considers the program of measurement and outreach discussed in this report.
NCSES could consider novel means of harvesting information about data use to analyze usage patterns, such as reviewing citations to NCSES data in publications, periodicals, and news items. For example, to get a sense of users who are citing S&E Indicators, a panel member did a Web of Science “cited reference search” on *NAT SCI BOARD and (sci eng ind*). This exercise yielded a list of 691 publications going back to 1988, shortly after S&E Indicators was introduced under that name. Google Scholar is another potential source of such information.

Reaching out to a wide variety of data users by means of surveys or interviews would be another worthwhile initiative. Moreover, such interactions would inform NCSES not only about user dissemination needs, but also about their substantive data needs, such as subject, variables, and level of geography. A list of organizations that could be contacted to assist in obtaining input on uses of S&E information would include the American Association for the Advancement of Science, the American Economic Association, the Association of Public and Land-Grant Universities, the Association for Public Policy Analysis and Management, the Association for University Business and Economic Research, the Council for Community and Economic Research, the Industry Studies Association, the International Association for Social Science Information Services and Technology, the Inter-university Consortium for Political and Social Research, the National Association for Business Economics, the Special Libraries Association Divisions of Biomedical and Life Sciences and Engineering, and the State Science and Technology Institute.

One means of ensuring that the needs of the secondary and tertiary data users are met is to ensure that programs of outreach are specially directed to members of the media, who re-release the NCSES data and interpret them for the public.

Among the tools that NCSES has used to assess user needs, according to John Gawalt, NCSES program director for information and technology services at the time of the workshop, are URCHIN, a web statistics analysis program that analyzes web server log file content and displays the traffic information on the basis of the log data, and WebTrends, software that collects and presents information about user behavior on a website. With proper permissions and protections, NCSES is also contemplating using cookies to identify return users and increase the efficiency of filling data requests.

In April 2011, NCSES took another step in the direction of obtaining user feedback when it placed a link on the website that directs users to a short customer survey to formally measure satisfaction and initiated an email-based sample survey, sent to customers who had requested electronic notification of new NCSES reports.
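To illustrate the general idea of log-based traffic analysis mentioned above, the following minimal Python sketch counts successful page requests per path from web server log lines. It is a generic illustration only and does not represent how URCHIN or WebTrends work internally. It assumes the server writes the widely used Common Log Format; the sample lines, host addresses, and paths are made up for the example, and `count_page_requests` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed input: Common Log Format lines of the form
# host ident user [time] "METHOD path protocol" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def count_page_requests(lines):
    """Count successful (2xx) GET requests per requested path."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match["method"] == "GET" and match["status"].startswith("2"):
            counts[match["path"]] += 1
    return counts


# Made-up sample lines; hosts use addresses reserved for documentation.
sample = [
    '192.0.2.1 - - [10/May/2011:09:15:02 -0400] "GET /statistics/ HTTP/1.1" 200 5120',
    '192.0.2.7 - - [10/May/2011:09:16:40 -0400] "GET /statistics/example-report/ HTTP/1.1" 200 20480',
    '192.0.2.1 - - [10/May/2011:09:17:11 -0400] "GET /statistics/ HTTP/1.1" 200 5120',
    '192.0.2.9 - - [10/May/2011:09:18:00 -0400] "GET /missing HTTP/1.1" 404 312',
]

for path, hits in count_page_requests(sample).most_common():
    print(path, hits)
```

Counts of this kind show which pages are requested most often, but not who the users are or whether they found what they needed, which is why the surveys described next are a useful complement.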
As of mid-August 2011, the agency had received 44 responses to the website survey and 20 responses to the email survey. Most of those responding to both surveys were researchers, students, and teachers, with a smaller number of librarians, reporters, and policy makers, including legislative staff members. The respondents viewed the organization of the home page in positive or neutral terms, reporting that they could find what they were looking for using the current topical groupings or that they could find what they needed even though the organization was not satisfactory.

Not surprisingly, researchers tended to want more in-depth reports with extensive data and analysis and detailed data tables, whereas reporters and policy makers were more likely to be satisfied with short, topical reports with summary data and analysis. Students and teachers varied in their needs and were about evenly split between wanting short, topical reports and wanting more in-depth reports. Detailed data tables were commonly requested by this subset of customers as well. The staff of NCSES reports that it will continue to solicit the views of visitors to the website and to periodically solicit views from a sample of requestors of electronic notification of NCSES reports in the future.

Recommendation 4-1. The National Center for Science and Engineering Statistics (NCSES) should analyze the results of its initial online consumer survey and refine it over time. Using input from other sources, such as regular structured user focus groups and panel-based periodic user surveys, NCSES should regularly and systematically collect and analyze patterns of data use by web users in order to develop a typology of data users and to identify usability issues.

The surveys are a useful start, but there is much more that can be accomplished by way of seeking the input of data users. In seeking a model for outreach to users, NCSES could consider modeling its efforts on the very aggressive program of Statistics Canada, described at the workshop by panel member Diane Fournier. Statistics Canada uses a combination of online questionnaires, focus groups, and usability testing to assess user needs and the usability of its website. One advantage of this approach, although it is resource intensive, is the possibility of gathering useful information from a wide range of users, both from regular users, who are knowledgeable, and from secondary and tertiary users, who are less familiar with the data.

Another initiative that NCSES could undertake to better determine user needs is to renew the data workshops that it conducted for several years but has since discontinued. Those workshops brought together users and potential users of licensed data. This same approach could be useful for acclimating users to web-based data and for introducing frequent users to changes in data dissemination practices and procedures. Such data workshops would be a good way to find out how knowledgeable data users use NCSES data and to find out what concerns users have about the data.
These workshops could be conducted onsite, in remote locations (perhaps in conjunction with meetings of interested associations), or by means of webinars (perhaps hosted by interested associations).

The input received in the workshop and in the interviews was very helpful to the panel in framing its analysis of user needs. The panel recognizes that the analysis relies mainly on the input of primary and, to a lesser extent, secondary users. The panel was not able in the time allowed to systematically gather much information from tertiary users (such as policy makers, the media, and librarians). Nonetheless, the panel thinks that it is incumbent on NCSES to consider the needs of all three of these groups and the technology platforms that they use to access the data as it considers the program of measurement and outreach discussed in this report.
The agency can begin by developing a concrete typology of its data users. One approach to this would be to develop user personas, that is, stereotypical characters who represent the variety of user types for the science and engineering data (Pruitt and Adlin, 2006, p. 3). These personas are usually developed by distilling data collected in interviews with users, much as the panel has tried to do in this report. The personas could be formalized in short descriptions to aid data dissemination designers, in that they provide a common description of the needs, skills, and environment of the various user personas.

A related approach would be to develop a typology of user interaction scenarios that describe what users do with the online resources. The user scenario would provide a concrete and flexibly detailed representation of the tasks that users will try to carry out with the systems (Rosson and Carroll, 2002). These two aids (personas and scenarios) provide for a user-centered integration of the system life cycle. Once done, they will serve as a reference for subsequent redesign, and they will help to focus the design of usability tests and user assistance programs.

Recommendation 4-2. The National Center for Science and Engineering Statistics should educate users about the data and learn about the needs of users in a structured way by reinstating the program of user workshops and instituting user webinars.

The outreach activities discussed in this chapter, along with the development of a formal typology of users, will assist NCSES to better understand and respond to user needs. These activities will also assist the agency in allocating its scarce resources to the groups and needs that have the greatest return on the dissemination investment.
USING WIKI AND OTHER COLLABORATION TOOLS FOR COMMUNICATION WITH USERS

Another means of obtaining user input is offered by online collaboration tools, or wikis. Wikis have greatly improved the ability of federal agencies to establish open lines of communication and engage communities interested in their activities (Schroeder, Eynon, and Fry, 2009).

The most widely used wiki tool is Wikipedia, the collaboratively created online encyclopedia. The Wikimedia Foundation provides the computing infrastructure, the servers, wiki software, general rules for entries, and style guidelines. Content is generated by anyone who has access to an Internet browser. Users can edit existing content pages or create new pages on topics not yet covered. The Wikipedia wiki software provides the online editing environment, tracks the changes made to pages, and allows contributors to engage in an online discussion about the content of pages. Page and text formatting is accomplished by simple specialized mark-up tags.

Wiki software tools have increasingly been adopted by government agencies as a platform for sharing information and as a means of encouraging the sharing of best practices and other types of information. Wiki software is available from commercial software vendors and as open-source software. Standard tools include software for group editing of online content, blog pages, threaded discussions, and file management for group access to files and images.

A version of wiki software has served as the foundation of Eurostat’s dissemination system, called “Statistics Explained.”1 This is a new way of publishing European statistics, explaining what they mean, what is behind the figures, and how they can be of use, in easily understandable language. Statistics Explained looks similar to Wikipedia, but unlike it, information can be updated only by Eurostat staff, thus ensuring the authenticity and reliability of the content. The latest data can be accessed through hyperlinks available in each statistical article.

The U.S. General Services Administration (GSA) operates a wiki environment to encourage communication across governmental entities. The GSA site emphasizes a “community of practice” model for taking advantage of wiki software. People who have some engagement in a particular subject or project can benefit from a central online point of contact rather than attempting communication through a series of email conversations.

Wikis and other online collaboration tools can help maintain a dialog with academics and outside experts. Wiki pages on technical issues related to the database could generate a valuable two-way flow of information about technical issues between outside researchers and staff experts at NCSES.
KEEPING USERS INFORMED

The current NCSES websites and published reports appropriately point users to technical descriptions of the data collections and identify staff who are ready and able to assist users in their use of the data. However, a perusal of other federal statistical agency websites turns up further useful examples of information sharing. For example, the Census Bureau’s Manufacturing and Construction Division, which manages the Business Research and Development and Innovation Survey (BRDIS) for NCSES, includes on its website (see http://www.census.gov/mcd/clearance/ [November 2011]) a listing of the open opportunities for public comment noted in the Federal Register, identifies planned changes, and includes copies of the forms and the supporting documents as submitted to the Office of Management and Budget (OMB). The technical information in these OMB clearance packages can assist users in understanding the strengths and weaknesses of the data.

1 See http://epp.eurostat.ec.europa.eu/statistics_explained/index.php/Main_Page [November 2011].

ENHANCING USABILITY OF NCSES DATA

Usability is generally understood to be the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use (ISO 9241-11). The field of website usability is developing rapidly and now includes sophisticated methods to gather feedback from users about their interactions with websites.

Although there is no single broad measure of website usability, some very useful guidelines are contained in a federal government publication, Research-Based Web Design and Usability Guidelines, prepared jointly by the U.S. Department of Health and Human Services and the General Services Administration (U.S. Department of Health and Human Services and the General Services Administration, no date). Identified on the web as Usability.gov, the publication contains guidelines that emphasize the need to design websites “to facilitate and encourage efficient and effective human-computer interactions.” The guidelines call on website designers to strive to reduce the user’s workload by taking advantage of the computer’s capabilities, on the premise that users will make the best use of websites when information is displayed in a directly usable format and content organization is highly intuitive.

The guidelines make the point that task sequences make a difference.
The sequencing of user tasks needs to be consistent with how users typically do the tasks for which they have visited the site, using techniques that do not require them to remember information for more than a few seconds, employing terminology that is understandable, and taking care to refrain from overloading users with information. Likewise, users should never be sent to unsolicited windows or unhelpful graphics. The guidelines emphasize that speed in accomplishing a task is important: users should not have to wait more than a few seconds for a page to load and, while waiting, should be supplied with appropriate feedback. Tasks like printing information should be made easy.2

2 See http://www.usability.gov/pdfs/chapter2.pdf [November 2011].
Evaluation of the NCSES Website

In order to assess how well the current NCSES website (see http://www.nsf.gov/statistics/ [November 2011]) fulfills these basic usability guidelines and criteria, the panel conducted an evaluation of the site as it appeared in May 2011. This review was by no means exhaustive. Rather, the goal was to stimulate the development of a formal usability process by briefly reviewing the current design. Clearly, further user research would be necessary prior to making improvements to the current design. Any decision to change the website’s design, including content and organization, must be based on user feedback and a usability evaluation testing strategy, which is presented at the end of this review.

It is apparent that having the NCSES web pages as a subsite of the NSF website poses limitations for NCSES website designers. If not treated carefully, this fact of life could increase the difficulty of navigating the site for NCSES data users. For example, the design of the NCSES tab “Statistics” is a path to a different site altogether. The visual cue indicating to users that they are still within the NSF website is the use of the same visual design template (same header, footer, and title format with image), which is crucial. However, the main issue with this design is that users can have some difficulty finding NCSES if their point of entry is the NSF home page. From the NSF home page, users are expected to find what they are looking for by exploring the site through main and secondary navigation.

Once users find the NCSES subsite (from the NSF home page or directly via an Internet search engine or bookmark), they are faced with an organization-centric site rather than a user-centric site based on tasks.
The current design appears to try to educate return visitors on how to navigate the site, but it would be best to organize the site in a way that all users (frequent, infrequent, or new users) can quickly and efficiently accomplish the task they are setting out to do. Suggestions for reorganizing the NCSES subsite appear in Appendix B.

On the whole, the evaluation of the NSF website points to the need for more systematic user-centered design and more regular usability evaluation. There are a number of methods in use, including expert heuristic evaluation, usability testing with small samples of actual users, and large-scale web browsing analytics.

A heuristic evaluation is recommended as an initial approach. It is one lightweight method that web designers use for discovering usability problems in a user interface design (electronic or paper prototypes), so that problems can be addressed throughout an iterative design process. The evaluation is usually employed early in the design process (the earlier, the better). Jakob Nielsen, an expert in this field, recommends having between three and five evaluators separately review the interface design. The number of issues discovered increases with each evaluator, but the added benefit relative to cost begins to decline after five (Nielsen, 1994; Nielsen and Molich, 1990). Along with the customer surveys and focus groups recommended in Recommendation 4-1, the heuristic evaluation can intelligently inform the process of designing a more effective and efficient website.

Recommendation 4-3. The National Center for Science and Engineering Statistics should employ user-focused design and user analysis, starting with an initial heuristic evaluation and continuing as a regular and systematic part of its website and tool development.

Meeting Compliance Standards

Websites should be designed to ensure that everyone, including users who have difficulty seeing, hearing, and making precise movements, can use them. Generally, this means ensuring that websites facilitate the use of common assistive technologies. As a federal government agency, NSF is governed by the Section 508 regulations. These amendments to the Rehabilitation Act require federal agencies to make their electronic and information technology accessible to people with disabilities. Section 508 was enacted to eliminate barriers in information technology, to make available new opportunities for people with disabilities, and to encourage development of technologies that will help achieve those goals. The U.S. Access Board has responsibility for the Section 508 standards and has announced its intention to harmonize the web portions of its Section 508 regulations with Web Content Accessibility Guidelines (WCAG) 2.0, for which the Web Accessibility Initiative (WAI) has responsibility.
Statistical Policy Directive Number 4 (March 2008) directs statistical agencies to make information available to all in forms that are readily accessible.3 Some of the major accessibility issues to be dealt with include the following:

• provide text equivalents for nontext elements;
• ensure that scripts allow accessibility;
• provide frame titles;
• enable users to skip repetitive navigation links;
• ensure that plug-ins and applets meet the requirements for accessibility; and
• synchronize all multimedia elements.

3 A summary of Section 508 is available. See http://www.section508.gov/index.cfm?fuseAction=stdsSum [November 2011].
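Several of the items listed above can be checked mechanically. The following is a minimal sketch in Python, using only the standard library's html.parser module, that flags images lacking an alt attribute and frames lacking a title. The class name AccessibilityChecker, the function check_html, and the sample markup are hypothetical illustrations and are not part of any NCSES or Section 508 tooling; automated checks of this kind cover only a small part of an accessibility review.

```python
from html.parser import HTMLParser


class AccessibilityChecker(HTMLParser):
    """Collects a few simple, machine-detectable accessibility problems."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        line, _ = self.getpos()
        # Nontext elements need a text equivalent (an alt attribute on images).
        if tag == "img" and "alt" not in attr_map:
            self.problems.append((line, "img is missing alt text"))
        # Frames need a non-empty title so assistive technology can announce them.
        if tag in ("frame", "iframe") and not attr_map.get("title"):
            self.problems.append((line, f"{tag} is missing a title"))


def check_html(markup):
    """Return a list of (line number, message) pairs for the given markup."""
    checker = AccessibilityChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


# Hypothetical sample page used only to exercise the checker.
sample = """<html><body>
<img src="chart.png">
<img src="logo.png" alt="NSF logo">
<iframe src="table.html"></iframe>
<iframe src="notes.html" title="Technical notes"></iframe>
</body></html>"""

for line, message in check_html(sample):
    print(f"line {line}: {message}")
```

An empty alt attribute is deliberately accepted here, since that is the conventional way to mark purely decorative images. Items such as script accessibility or multimedia synchronization generally require human review and are not attempted in this sketch.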
When it is not possible to ensure that all pages of a site are accessible, designers should provide equivalent information to ensure that all users have equal access to all information.4

Other standards include the Web Accessibility Initiative of the World Wide Web Consortium (W3C), which provides guidance and tools for a range of websites and applications. Even more significant, given the possibility for rich dynamic interaction with these data resources, is that W3C has also developed standards for access to dynamic content, with specific guidelines in four categories:

1. Accessible rich Internet applications: address accessibility of dynamic web content, such as content developed with Ajax, dynamic HTML, or other such technologies.
2. Authoring tool accessibility guidelines: address the accessibility of the tools used to create websites.
3. User agent accessibility guidelines: address assistive technology for web browsers and media players.
4. Web content accessibility guidelines: address the information in a website, including text, images, forms, and sounds.

The convention when considering web design for individuals with disabilities is to ensure that the site is accessible to those who are visually impaired. However, there is a much wider range of ways in which someone’s access to information should be considered when developing websites and web applications. For example, a chart that is color-coded may not be readily interpreted by someone with color blindness, multimedia files may not be accessible to someone with deafness unless they are accompanied by transcripts, and someone with a cognitive disability, such as attention deficit disorder, may find websites that lack a clear and consistent organization difficult to navigate.5

Data Accessibility Issues

The accessibility of tabular data and data visualization is an open research question.
Although W3C has pioneered standards for accessibility of dynamic user interfaces, many other issues, including table navigation, navigation of large numeric data sets, and dynamic data visualization, raise computer-human interaction challenges that have been explored only peripherally. The issue of accessibility is a clear opportunity for NSF to partner with scientists with disabilities and those who work on interface design and so lead by example.

4 U.S. Department of Health and Human Services, Research-based Web Design and Usability Guidelines, p. 23. (2008). See http://www.usability.gov/guidelines/guidelines_book.pdf [May 2011].
5 Presentation of Judy Brewer, director of the Web Accessibility Initiative at the W3C, on the issue of accessibility of information on the web.

In order for NSF S&E information to be used, it must be accessible to users. By nearly eliminating the hard-copy publication of the data in favor of electronic dissemination, mainly through the web, NSF is committed to the provision of web-based data in an accessible format, not only for trained sophisticated users, but also for users who are less confident of their ability to access data on the Internet. Importantly, the user population includes people with disabilities for whom, by law and right, special accommodations need to be made.

The panel benefited from a presentation by Judy Brewer, who directs the WAI at W3C. W3C hosts the WAI to develop standards, guidelines, and resources to make the web accessible for people with disabilities; ensure accessibility of W3C technologies (20-30 per year); and develop educational resources to support web accessibility. Brewer stated that Web 2.0 adds new opportunities for persons with disabilities, and that data visualization is a key to effective communication. However, people with disabilities face a number of barriers to web accessibility, including missing alternative text for images, missing captions for audio, forms that “time out” before users can submit them, images that flash and may cause seizures, text that moves or refreshes before users can interact with it, and websites that do not work with the assistive technologies that many people with disabilities rely on.

In response to a question, Brewer addressed the continued problem of making tabular information accessible, and she requested input on where the WAI should go in this area. She referred to a workshop held by the National Institute of Standards and Technology on complex tabular information that resulted in several recommendations.
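For simple tables, one widely documented technique is to give the table a caption and to mark header cells with a scope attribute, so that a screen reader can announce the relevant row and column headers for each data cell. The sketch below, in Python, generates such markup. The function name accessible_table and the sample values are hypothetical and are not drawn from NCSES data or from Brewer's presentation. It does not address complex tables with multilevel headers, which is where the open problems described above appear to lie.

```python
from html import escape


def accessible_table(caption, column_headers, rows):
    """Render rows as an HTML table with a caption and scoped header cells.

    Each row is a sequence whose first item is treated as the row header.
    """
    parts = ["<table>", f"  <caption>{escape(caption)}</caption>", "  <tr>"]
    for header in column_headers:
        # scope="col" ties this header to every cell in its column.
        parts.append(f'    <th scope="col">{escape(header)}</th>')
    parts.append("  </tr>")
    for row_header, *cells in rows:
        parts.append("  <tr>")
        # scope="row" ties this header to every cell in its row.
        parts.append(f'    <th scope="row">{escape(str(row_header))}</th>')
        for cell in cells:
            parts.append(f"    <td>{escape(str(cell))}</td>")
        parts.append("  </tr>")
    parts.append("</table>")
    return "\n".join(parts)


# Placeholder values for illustration only; these are not NCSES figures.
html_table = accessible_table(
    "Example table (placeholder values)",
    ["Sector", "Year 1", "Year 2"],
    [["Sector A", 10, 12], ["Sector B", 7, 9]],
)
print(html_table)
```

Escaping every cell with html.escape keeps characters such as "&" in labels like "S&E" from producing malformed markup.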
Brewer argued for publishing existing S&E data in compliance with Section 508 requirements, while continuing R&D on accessibility techniques for new technologies, improved accessibility supports for cognitive disabilities, and more affordable assistive technologies, such as tablets. She said WAI would partner with agencies to ensure that dissemination tools are accessible.

Recommendation 4-4. The National Science Foundation should sponsor research and development on accessible data visualization tools and approaches and potential other means for browsing and exploring tabular data that can be offered via web, mobile, and tablet-based applications, or browser-based ones.