Suggested Citation:"4 AI Resurgence." National Academies of Sciences, Engineering, and Medicine. 2020. Information Technology Innovation: Resurgence, Confluence, and Continuing Impact. Washington, DC: The National Academies Press. doi: 10.17226/25961.

autonomous cars, autonomous aircraft, automated farm equipment, communications and networking equipment, electric utilities, medical devices, etc. And since many of these systems connect to the Internet, they must be verified to withstand cyberattacks.

Formal checking technologies are now used to improve the quality of hardware and software systems, but they are not yet used routinely, nor do they cover all the ways that computer systems can fail. Greater use will come as better tools are developed and more smoothly integrated into programming languages and software development environments.

. . .

Chapter 4 continues the exploration of resurgence by looking at several areas of AI research that have also experienced this phenomenon.


4
AI Resurgence

This chapter continues the exploration of resurgence with a deeper look at the phenomenon in several areas of artificial intelligence (AI) research that have experienced it: machine learning, reasoning, natural language, computer vision, and robotics. This landscape of accomplishments in AI also provides key building blocks for the subsequent discussion of confluence.

The chapter illustrates through examples the role of federal agencies and companies in supporting these advances. It traces salient intellectual threads in these five areas and provides examples of federal research programs and individual grants that propelled these advances.

MACHINE LEARNING AND NEURAL NETWORKS

Machine learning is the subfield of AI that studies how computing systems can automatically improve through experience. Often that experience takes the form of historical data, as when machine learning is applied to databases of credit card transactions labeled as fraudulent or not, so that a classifier can be trained and then used to predict which future transactions are likely to be fraudulent. In other cases, the experience may be collected in real time by the AI system, as is the case for many online applications such as search and advertising, and in other areas, including some robotics.
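The fraud-detection example above can be sketched in a few lines. This is an illustrative toy classifier (a nearest-centroid rule over two made-up features, transaction amount and hour of day), not any production fraud-detection system:

```python
# Toy "learning from labeled historical data": a nearest-centroid classifier.
# All transaction data and feature choices here are hypothetical.

def train_centroids(examples):
    """examples: list of (features, label); returns per-label mean vectors."""
    sums, counts = {}, {}
    for features, label in examples:
        counts[label] = counts.get(label, 0) + 1
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
    return {lbl: [s / counts[lbl] for s in acc] for lbl, acc in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is nearest in squared distance."""
    def dist(lbl):
        return sum((a - b) ** 2 for a, b in zip(centroids[lbl], features))
    return min(centroids, key=dist)

history = [  # (amount in dollars, hour of day), label
    ([12.0, 14], "legit"), ([30.0, 10], "legit"), ([25.0, 16], "legit"),
    ([900.0, 3], "fraud"), ([750.0, 2], "fraud"), ([820.0, 4], "fraud"),
]
centroids = train_centroids(history)
print(predict(centroids, [800.0, 3]))  # a large 3 a.m. charge -> fraud
```

A real system would use far richer features and a more expressive model, but the shape is the same: fit parameters on labeled history, then score new transactions.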


U.S. government support for machine learning-related research began in the 1950s and continued for decades in different forms, including some research funding programs for AI broadly and other smaller funding programs specifically for machine learning. Early research in machine learning, such as Oliver Selfridge’s Pandemonium work,1 laid out key concepts, algorithms, and overall architecture that are important in modern machine learning methods. Decades of research on machine learning led eventually to the first significant commercial applications by the early 1990s (such as applications to credit card fraud detection).2 A variety of machine learning approaches were developed, including decision trees and support vector machines.3 There was a burst of related work in the mid-1990s on probabilistic approaches to time-series prediction and control problems that draw on earlier frameworks such as adaptive control models, Kalman filters, and Markov decision processes.

Applications of machine learning and its impact on the economy have grown steadily since then, including marketing and advertising, recognition of addresses and automatic sorting of mail, spam email detection, email prioritization, online recommendation engines, medical diagnosis, and many more. The impact of machine learning accelerated dramatically beginning around 2010 due to several key drivers: (1) development of a new generation of deep neural network learning algorithms that significantly advanced the state of the art, (2) growth in the availability of big data sets to train and test machine learning algorithms, and (3) development of hardware support (e.g., graphics processing units (GPUs)) that allows practical application of deep network algorithms to large data sets. Over the past decade, deep network learning algorithms have produced breakthroughs in several other fields of AI, including computer vision, speech recognition, and natural language processing, each of which has in turn produced its own significant applications and economic impacts (see below).

Sustained funding for machine learning research has been fundamental to producing the results leading to today’s huge economic impacts of AI. During the past 50 years of government funding, machine learning research has gone through multiple paradigms, beginning with work in the 1950s on perceptrons, switching in the 1970s to approaches based on more symbolic representations, continuing into the 1980s with the advent of multilayer neural network algorithms based on gradient

___________________

1 O. Selfridge, 1958, “Pandemonium—A Paradigm for Learning,” pp. 511-526 in Mechanisation of Thought Processes: Proceedings of a Symposium Held at the National Physical Laboratory on 24th, 25th, 26th and 27th November 1958, National Physical Laboratory Symposium No. 10, https://aitopics.org/doc/classics:504E1BAC.

2 Early companies applying machine learning to credit card fraud detection include Hecht-Nielson’s HNC Corporation, which was eventually acquired by Fair Isaac. Machine learning approaches now dominate the business of approving credit card transactions.

3 C. Cortes and V. Vapnik, 1995, Support-vector networks, Machine Learning 20(3): 273-297; supported in part by NSF Award Number 0916200, An Advanced Learning Paradigm: Learning Using Hidden Information.


descent methods. During the 1990s, neural network research receded as researchers moved on to probabilistic approaches, including Bayesian networks, and, more generally, probabilistic graphical models, in a shift that blended the field of machine learning with research in statistics. Probabilistic methods grew, along with kernel-based methods such as support vector machines through the early 2000s. By 2010, however, the drivers mentioned above led neural network approaches to re-emerge and dramatically advance the state of the art, leading for the first time to the ability to achieve human-level competence in a variety of perceptual problems such as converting speech to text and recognizing diverse objects in photographs. At this point, industrial funding in machine learning grew very significantly, and industrial research and development became significant contributors to principles and applications of machine learning. Without sustained government funding, it is very unlikely that today’s machine learning technology would have been developed.
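The gradient-descent idea behind these networks can be illustrated with a single logistic unit trained on a toy task. This is a deliberately minimal sketch with made-up hyperparameters; a real multilayer network stacks many such units and backpropagates gradients through all layers:

```python
import math

# One logistic unit trained by gradient descent on the AND function.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [0.0, 0.0]
b = 0.0
lr = 0.5  # illustrative learning rate

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for _ in range(5000):                # many passes over the training data
    for (x1, x2), y in data:
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        err = p - y                  # gradient of the log loss w.r.t. z
        w[0] -= lr * err * x1        # step each weight downhill
        w[1] -= lr * err * x2
        b -= lr * err

print([round(sigmoid(w[0] * x1 + w[1] * x2 + b)) for (x1, x2), _ in data])
```

After training, the rounded predictions match the labels [0, 0, 0, 1]; deep networks apply the same downhill-step rule to millions of weights at once.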

As of 2020, machine learning methods have had a great impact on the economy. Venture funding for machine learning companies has grown dramatically over the past decade. Machine learning is now fundamental to operations of large cloud providers, Internet search companies, and social network companies, as well as customer modeling, robotic applications in factories and farms, and analysis of data in basic sciences from biology to physics. Machine learning lies at the heart of many AI applications that rely on computer vision, speech recognition, natural language, chatbots, and conversational interfaces to computers. A 2018 report issued by Zion Market Research4 estimates that the global machine learning market stood at $1.58 billion in 2017 and forecasts that it will grow to $20.83 billion by 2024, with a compound annual growth rate of 44 percent. As machine learning systems become more widely adopted and trained on historical data reflecting a variety of human biases, a new issue arises. For example, data sets on which loan applications were approved might show bias, with more approvals for male than female applicants. Creating unbiased systems learned from such biased data gives rise to new research challenges and to entire workshops devoted to addressing them, such as the annual ACM Conference on Fairness, Accountability, and Transparency.

___________________

4 Zion Market Research, 2018, Machine Learning Market by Service (Professional Services, and Managed Services), for BFSI, Healthcare and Life Science, Retail, Telecommunication, Government and Defense, Manufacturing, Energy and Utilities, Others: Global Industry Perspective, Comprehensive Analysis, and Forecast, 2017-2024, https://www.zionmarketresearch.com.


REASONING

Reasoning is the process by which we combine different pieces of information or knowledge to reach new conclusions via rules of inference. From the early days of AI, researchers recognized that reasoning is a core cognitive capability that is an essential component of intelligent decision-making systems. Examples of reasoning include playing complex games such as chess, Go, and poker (where AI systems now outplay the top human players), and logistics problems such as combinations of offline and real-time scheduling for fleets of planes and trucks to minimize delivery delays. It is useful to contrast reasoning with learning. Machine learning is the process of uncovering regularities in data. The uncovered regularities represent useful knowledge about our world. Different sources of data and signals, including real-time information available via different sensors, provide different views of knowledge. Combining knowledge from these different sources through the process of reasoning leads us to new perspectives and to the ability to handle new types of situations that may never have been encountered before.

Reasoning therefore allows humans to “go beyond the data” by providing problem solving capabilities that consider rules of probabilistic or logical inference to reason about the implications of current knowledge, including gaining deeper understanding about entities, actions, plans, and outcomes. Decision-theoretic inference leverages probabilistic reasoning along with rules of expected value decision making to identify optimal actions or recommendations.
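Expected-value decision making of the kind just described can be sketched concretely. The actions, probabilities, and payoffs below are entirely hypothetical; the point is only the rule of picking the action with the highest probability-weighted payoff:

```python
# Decision-theoretic choice: maximize expected value over uncertain outcomes.

def expected_value(outcomes):
    """outcomes: list of (probability, payoff) pairs for one action."""
    return sum(p * v for p, v in outcomes)

actions = {  # hypothetical logistics choices
    "ship_by_air":   [(0.95, 100.0), (0.05, -400.0)],  # fast, costly failures
    "ship_by_truck": [(0.80, 60.0), (0.20, -20.0)],    # slower, safer
}
best = max(actions, key=lambda a: expected_value(actions[a]))
print(best, expected_value(actions[best]))  # ship_by_air has EV 75.0 vs. 44.0
```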

A well-known challenge for machine learning systems is how to handle cases that lie outside the set of examples used to train the system. Such scenarios present a serious challenge in making intelligent autonomous systems safe and robust. For example, the vision system in a self-driving car may find itself in a scenario that it has never encountered during the training of its machine learning model. In such settings, reasoning capabilities can enable the system to draw on other knowledge to still calculate and take the correct actions.

There is a rich history in AI of different reasoning paradigms. Reasoning methods in AI include computational procedures for applying sets of inferential rules such as logical reasoning—for example, to prove theorems from sets of axioms or rules of statistical or decision-theoretic inference to identify patterns, compute likelihoods of different outcomes, or determine best actions from multiple pieces of evidence. Reasoning can also be performed heuristically or approximately—for example, via consideration, in a qualitative manner, of knowledge about causes and effects and of chains of causation. During the 1970s and 1980s, most work focused on various


forms of search and logical reasoning. In a common approach of the time, a methodology referred to as production systems was used to create “expert systems,” where knowledge was encoded in the form of logical rules, and reasoning engines provided diagnoses or made recommendations by chaining rules together. Logical reasoning methods are a good fit with engineered physical systems and applications in hardware and software verification.

Work on reasoning in AI led to the development of modern logical reasoning engines called SAT or SMT solvers. To give one example, work by Selman, Levesque, and Mitchell5 introduced a now widely used approach for solving problems in logical reasoning by iteratively “repairing” a guess rather than by step-by-step reasoning. Today, more sophisticated reasoning systems, such as Microsoft’s Z3, are in wide use across industry for the verification and testing of hardware and software and for solving constraint satisfaction problems seen in such logistical challenges as optimizing the ordering of actions on an assembly line (see “Formal Methods” in Chapter 3). These reasoning engines can chain together millions of facts in a matter of seconds to validate complex configurations or designs.
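The "repair" idea can be sketched as a minimal local-search SAT solver in the spirit of GSAT/WalkSAT. This is a simplified illustration, not the published algorithm or any production solver: it starts from a random truth assignment and repeatedly flips a variable appearing in some unsatisfied clause:

```python
import random

# Clauses are lists of literals: integer v means variable v, -v its negation.

def unsatisfied(clauses, assign):
    """Return the clauses not satisfied by the current assignment."""
    return [c for c in clauses
            if not any((lit > 0) == assign[abs(lit)] for lit in c)]

def repair_solve(clauses, n_vars, max_flips=10000, seed=0):
    rng = random.Random(seed)
    assign = {v: rng.choice([True, False]) for v in range(1, n_vars + 1)}
    for _ in range(max_flips):
        unsat = unsatisfied(clauses, assign)
        if not unsat:
            return assign                    # every clause satisfied
        lit = rng.choice(rng.choice(unsat))  # pick a variable to "repair"
        assign[abs(lit)] = not assign[abs(lit)]
    return None                              # gave up (incomplete method)

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
clauses = [[1, 2], [-1, 3], [-2, -3]]
model = repair_solve(clauses, 3)
print(model is not None and not unsatisfied(clauses, model))
```

Real solvers add heuristics for choosing which variable to flip and restarts, but the core contrast with step-by-step logical deduction is visible even here.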

In the mid-1980s, AI researchers began to focus more intensively on methods for encoding and reasoning with uncertainty. Federal agencies provided funds to nurture the research of a growing community of researchers who pursued probabilistic representations and reasoning methods with the ability to handle incompleteness and uncertainty. For example, National Science Foundation (NSF)-supported work by Judea Pearl and others showed how probabilistic information could be represented and reasoned about in an efficient manner6—work for which Pearl was awarded the ACM Turing Award in 2011. The community of researchers developed expressive probabilistic representations. These network-centric representations, named Bayesian networks, and, more generally, probabilistic graphical models, encode dependencies and independencies among observations, outcomes, and other variables. The models could be constructed by assessing distinctions and probabilistic relationships from experts as well as directly from data. Researchers developed families of Bayesian inference algorithms for performing coherent probabilistic reasoning within Bayesian networks. With traditional Bayesian inference methods, prior knowledge in the form of probabilities is combined with new observations to produce updated, posterior probabilities. Innovations with Bayesian inference procedures were required

___________________

5 See, for example, B. Selman, H. Levesque, and D. Mitchell, 1992, “A New Method for Solving Hard Satisfiability Problems,” pp. 440-446 in Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92), http://www.aaai.org. Related subsequent work was supported by NSF Award Number 9734128, CAREER: Compute Intensive Methods for Artificial Intelligence.

6 See, for example, J. Pearl, 1986, Fusion, propagation, and structuring in belief networks, Artificial Intelligence 29(3): 241-288. This research was supported in part by NSF Award Number 8313875, Toward a Computational Model of Evidential Reasoning.


to propagate uncertainty across many variables of the network-centric representations, as is necessary when the value of any variable changes—for example, when variables representing symptoms in a patient were shifted from false to true during a patient visit. The Bayesian approach is a good fit with machine learning methods because Bayesian network models can be trained on data, yet also take advantage of the designer’s prior knowledge about the structure of the problem and the dependencies among the variables. In the 1990s, learning procedures were developed to learn the structure as well as the parameters of Bayesian networks directly from data.
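The prior-to-posterior update at the heart of this machinery can be shown on the smallest possible network, a single Disease → Symptom link. The probabilities are illustrative, not clinical:

```python
# Bayes' rule on a two-node network: combine a prior with an observation.

def posterior(prior, p_obs_given_h, p_obs_given_not_h):
    """P(H | observation) from P(H), P(obs | H), and P(obs | not H)."""
    joint_h = prior * p_obs_given_h
    joint_not_h = (1.0 - prior) * p_obs_given_not_h
    return joint_h / (joint_h + joint_not_h)

# Before the visit: P(disease) = 0.01. Observing the symptom, with
# P(symptom | disease) = 0.9 and P(symptom | healthy) = 0.05, shifts the belief:
p = posterior(0.01, 0.9, 0.05)
print(round(p, 3))  # the 1% prior rises to about 15.4%
```

General Bayesian networks repeat exactly this kind of update, propagated across many linked variables at once.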

Prominent applications of Bayesian networks include medical diagnosis, machine troubleshooting, traffic prediction and routing, document analysis, user-preference modeling, financial modeling, and pattern recognition, such as identifying junk email.7 On the latter, spam filters demonstrated the power and value of the probabilistic methods in daily life. There was a period in the mid-1990s when email was at risk of losing its effectiveness as a communication medium because of the deluge of spam emails. Bayesian spam filters were trained on large sets of messages (training data) labeled as spam or nonspam email. The method was surprisingly effective and can be viewed as having rescued email as an effective communication tool. Bayesian reasoning has since found a broad range of practical applications. For example, beyond spam filtering, the methods have been leveraged to identify important or urgent email and to help people route email to specific folders.
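A Bayesian spam filter of this kind can be sketched as a naive Bayes classifier over word counts. The training messages below are made up, and this simplified sketch (with add-one smoothing) illustrates the general approach rather than the specific filter described in the cited work:

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (text, label). Returns word counts and label counts."""
    counts = {"spam": Counter(), "ham": Counter()}
    labels = Counter()
    for text, label in messages:
        labels[label] += 1
        counts[label].update(text.lower().split())
    return counts, labels

def classify(counts, labels, text):
    vocab = set(counts["spam"]) | set(counts["ham"])
    scores = {}
    for label in counts:
        total = sum(counts[label].values())
        score = math.log(labels[label] / sum(labels.values()))  # log prior
        for word in text.lower().split():
            # add-one (Laplace) smoothing so unseen words get nonzero mass
            p = (counts[label][word] + 1) / (total + len(vocab))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

train_set = [  # hypothetical labeled messages
    ("win money now", "spam"), ("cheap money offer", "spam"),
    ("meeting at noon", "ham"), ("lunch at noon tomorrow", "ham"),
]
counts, labels = train(train_set)
print(classify(counts, labels, "win cheap money"))
```

The classifier multiplies (in log space) per-word likelihoods with the class prior, which is exactly the Bayesian update described above applied independently per word.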

One attractive feature of Bayesian and other probabilistic approaches to reasoning is that they output probabilistic predictions (e.g., the probability that the stock market will go down tomorrow), which allows their results to be used by formal decision-theoretic reasoning systems that take into account the costs and benefits of different outcomes and identify the best decisions to take under uncertainty. Today, deep neural networks are also often trained in a way that enables them to make similar probabilistic predictions, which has led to their use, to some degree, in applications that previously relied on Bayesian network approaches.

Another active area for reasoning involves planning. In planning, the reasoning system has to find a sequence of steps that leads from a given initial state to a goal state. Planning was first considered in robotics where the steps are physical actions, and a plan is a sequence of actions needed for the robot to complete a given task. When taking into account probabilistic components of the environment, one uses probabilistic planning formalisms such as those captured by Markov decision

___________________

7 See, for example, M. Sahami, S. Dumais, D. Heckerman, and E. Horvitz, 1998, “A Bayesian Approach to Filtering Junk E-mail,” AAAI Workshop on Learning for Text Categorization, AAAI Technical Report WS-98-05, http://www.aaai.org.


processes (MDP).8 MDPs were explored in operations research, management science, and electrical engineering decades before they rose to prominence in AI research. Within AI, they were extended to scenarios and challenges including procedures for learning and updating parameters from observational data, including observations about the world that are actively pursued via exploration policies. The latter extensions to MDPs led to a set of methods referred to as reinforcement learning (RL). A connection between planning and learning can be found in the work on RL and deep RL, which combines deep neural networks with RL. In contrast to supervised learning, which uses labeled input-output pairs for training, RL seeks to maximize a cumulative reward in an environment over time. The DeepMind breakthroughs of AlphaGo and AlphaZero were obtained with the use of deep RL techniques in which a set of neural nets was trained through extensive self-play to reach superhuman performance at Go.9 It should be noted that to reach such performance, the systems also relied on AI search techniques, in particular, Monte Carlo tree search. The deep RL framework thus combines machine learning with reasoning, demonstrating the power of this combination for guiding action in complex domains.
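Planning in an MDP can be illustrated with value iteration, the classic dynamic-programming method that underlies much of RL. The three-state MDP below (states, actions, transition probabilities, and rewards) is entirely made up:

```python
# Value iteration on a tiny MDP. transitions[state][action] is a list of
# (probability, next_state, reward) triples; "off" is a terminal state.
transitions = {
    "cool": {"fast": [(0.5, "cool", 2.0), (0.5, "warm", 2.0)],
             "slow": [(1.0, "cool", 1.0)]},
    "warm": {"fast": [(1.0, "off", -10.0)],
             "slow": [(0.5, "cool", 1.0), (0.5, "warm", 1.0)]},
    "off":  {},
}
GAMMA = 0.9  # discount factor (illustrative)

def value_iteration(transitions, gamma=GAMMA, tol=1e-6):
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:
                continue  # terminal state keeps value 0
            # Bellman backup: best expected one-step reward plus future value
            best = max(sum(p * (r + gamma * V[s2]) for p, s2, r in outs)
                       for outs in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration(transitions)
policy = {s: max(acts, key=lambda a: sum(p * (r + GAMMA * V[s2])
                                         for p, s2, r in acts[a]))
          for s, acts in transitions.items() if acts}
print(policy, round(V["cool"], 1))
```

Here the optimal plan is to go "fast" while cool but "slow" once warm, since the fast action from the warm state risks a large penalty; RL methods learn the same kind of values from sampled experience instead of a known model.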

AI planning and scheduling research helped build modern supply chain management platforms as used in the U.S. Department of Defense and by companies such as Amazon. Decision-making systems incorporate planning under uncertainty and thus are integral to the further advancement of self-driving cars and autonomous drones. Planning under uncertainty and multi-agent reasoning are also a critical component of modern automated trading and finance platforms. The methods are also being used in web search and retrieval—for example, providing methods for crawling and indexing the web so as to keep search systems up-to-date with growing and changing content.10 In other areas, methods for planning and scheduling are being used for resource management in computing systems, with applications in use on computing devices from personal laptops all the way to storage and computing at large data centers.

___________________

8 See, for example, R. Howard, 1960, Dynamic Programming and Markov Processes, Technology Press-Wiley, Cambridge, MA.

9 D. Silver, A. Huang, C.J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, et al., 2016, Mastering the game of Go with deep neural networks and tree search, Nature 529: 484-489, https://doi.org/10.1038/nature16961.

10 A. Kolobov, Y. Peres, C. Lu, and E. Horvitz, 2019, “Staying Up to Date with Online Content Changes Using Reinforcement Learning for Scheduling,” in Advances in Neural Information Processing Systems 32 (NIPS 2019), https://papers.nips.cc/book/advances-in-neural-information-processing-systems-32-2019.


NATURAL LANGUAGE PROCESSING

Natural language processing is the subfield of AI that harnesses computer methods for interpreting human language. Although computers cannot be said to understand the meaning embodied in natural language as humans do, nor to perform as well as humans at many natural language tasks, there has been tremendous technical progress in this field, leading to very significant economic impacts in areas from speech recognition, to web search, to machine translation. The lesson from this experience is that even partial understanding of natural language by computers has had great economic impact, and this impact is likely to grow as future computers understand natural language to a greater degree.

Natural language was called out in the early writings about the goals of AI as a core capability for human and computational intelligence, along with perception, learning, and reasoning. Government funding for natural language research started as early as the 1950s with early efforts to translate Russian into English, and continued in the 1960s with support for two distinct threads of research—one on speech recognition and the application of speech-to-text systems, and a second on text analysis.

Research on speech-to-text transcription was consistently funded by a number of federal agencies (especially the Defense Advanced Research Projects Agency (DARPA)) throughout the 1980s and up until very recently. These funding programs helped establish benchmark speech data sets (e.g., the Switchboard data sets developed by the National Institute of Standards and Technology), which played a key role in enabling researchers to compare the performance of their different algorithms using a single standard. The accuracy of speech-to-text transcription on these data sets increased over the years, leading to breakthroughs over the past few years, where computer speech recognition (specifically, transcribing spoken language to text) rivals or surpasses the performance of humans in several domains. Crossing this threshold enabled the widespread adoption of speech interfaces in mobile phones, automobiles, Amazon Alexa, Google Home, Microsoft Cortana, and other widely used devices.

In parallel with research on speech-to-text, a second line of research focused on algorithms to interpret the meaning of text sentences, paragraphs, and stories. Early influential research in the late 1960s included Terry Winograd’s SHRDLU system,11 which allowed users to interact with a simulated robot using a very limited form of natural language (e.g., “pick up a big red block”), and work by Roger Schank, who framed language understanding as a problem of creating symbolic representations of text meaning, and who introduced a particular “conceptual dependency”

___________________

11 This research was conducted as part of the Massachusetts Institute of Technology’s Project MAC, launched with support from the Defense Advanced Research Projects Agency.


representation to capture those meanings. Research proceeded on parsing algorithms to capture the syntactic structure of sentences, and in parallel on “bag of words” algorithms that instead considered the statistical distribution of words without considering their sequence (e.g., early research by Salton12 on bag of words algorithms for retrieving relevant documents for library search).
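The bag-of-words retrieval idea can be sketched in the vector-space spirit of Salton's work: documents and queries become term-count vectors, and relevance is scored by cosine similarity. The documents below are made up, and this toy sketch omits the term weighting (e.g., tf-idf) that real systems use:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = {  # hypothetical document collection
    "d1": "the robot arm lifted the red block",
    "d2": "stock market prices fell sharply today",
    "d3": "the market for robot arms is growing",
}
vectors = {name: Counter(text.split()) for name, text in docs.items()}
query = Counter("robot block".split())
ranked = sorted(vectors, key=lambda d: cosine(query, vectors[d]), reverse=True)
print(ranked[0])  # d1 shares the most query terms
```

Note that word order is discarded entirely, which is exactly the contrast with the parsing-based line of research described above.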

Throughout the 1980s and up to this day, federal research funding has supported text understanding research, sometimes focusing on fundamental problems, and sometimes targeted toward specific missions (e.g., retrieval of relevant text documents, extracting entities such as persons, locations, and dates from documents, conversational systems, large-scale information extraction from the web).

Over the past decade, the adoption of deep neural network approaches to text analysis has yielded a significant improvement in accuracy for many language processing tasks, including information extraction and summarization, sentiment analysis, and machine translation. One conceptual breakthrough was the introduction of vector-space embeddings (such as Word2vec, created by a team of researchers at Google), which have transformed natural language processing (NLP) from dealing with discrete symbols to continuous spaces based on contextual co-occurrence, in turn allowing rapid development of deep neural network methods for NLP. A second conceptual breakthrough has been language modeling, and especially transformer models, which have made giant strides toward resolving the word sense disambiguation problem and introduced powerful models of which linguistic expressions are actually used.
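The "meaning from context" principle behind such embeddings can be illustrated without any neural network: represent each word by the counts of words that co-occur with it in a small window, so words used in similar contexts get similar vectors. (Word2vec learns dense vectors with a network; this raw-count toy, on a made-up corpus, only illustrates the principle.)

```python
import math
from collections import Counter, defaultdict

corpus = ("the cat sat on the mat . the dog sat on the rug . "
          "the cat chased the dog . the dog chased the cat").split()

# Build a context vector per word from a +/-2-token window.
vectors = defaultdict(Counter)
for i, word in enumerate(corpus):
    for j in range(max(0, i - 2), min(len(corpus), i + 3)):
        if j != i:
            vectors[word][corpus[j]] += 1

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b))

# "cat" and "dog" occur in similar contexts, so their vectors align more
# closely than "cat" and "mat" do:
print(cosine(vectors["cat"], vectors["dog"]) >
      cosine(vectors["cat"], vectors["mat"]))
```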

Speech interfaces are now so pervasive that it is difficult to remember that when the iPhone was introduced in 2007 there was no speech interface to control it—the technology was too immature. Text analysis has now reached the point where translation by computer from one language to another is widely used, although still imperfect, and where search engines are now evolving beyond simply retrieving relevant web pages, to instead providing direct answers to natural language questions in some cases. Conversational systems are beginning to appear more widely, and natural language technology promises to change the very nature of user interfaces across many devices (e.g., home alarm systems that answer questions about which door is open).

The huge economic impact of progress in natural language processing has a direct link to the decades of basic and mission-oriented government funding.

___________________

12 G. Salton, A. Wong, and C.S. Yang, 1975, A vector space model for automatic indexing, Communications of the ACM 18(11): 613, doi:10.1145/361219.361220. This research was supported in part by NSF under award GN43505.


Speech recognition has been directly influenced by years of sustained DARPA support, while research in text analysis has benefited from funding from many agencies (e.g., DARPA’s Machine Reading research program in the 2000s, National Institutes of Health funding for information extraction from journal articles, and NSF funding of basic research in this area).

With the recent commercial impacts of natural language processing, corporate investment in research and development has grown dramatically over the past decade, especially by the largest companies relying on this technology. This investment has accelerated progress, but along with companies’ hiring of faculty from universities and the development of proprietary large-scale text data sets, it has also led to a situation where it may be more difficult for universities to compete with these companies to advance the state of the art. On the other hand, industry has helped boost university research through the release of open-source software (e.g., Allen Institute for AI’s AllenNLP open-source software), large data sets, and, since 2018, the dissemination of deep neural networks (e.g., ELMo, BERT, GPT, and Turing) that have been pre-trained on billions of words of text data and that serve as a general substrate for developing more targeted natural language applications.

ROBOTICS

Robotics13 is the study of realizing our goals in the physical world through the physical action of machines. Today’s achievements in robotics reflect a strong scientific-industry-consumer pipeline founded on early and continued government support. The products of robotics research have had important impacts on manufacturing, transportation, health, national security, and our everyday lives. More importantly, research in autonomous robotics, where robotics meets AI, is breaking down difficult scientific challenges that are leading to the next generation of innovation and technology. Autonomous robotics often frames AI as a “sense-plan-act” loop where computer programs enable the robot to perceive the current state of its world, plan a course of action, and execute this plan through its motors. Robots also use AI to reason in the face of uncertainty, as on public roadways, in more flexible manufacturing, in the handling of improvised explosive threats, and in the care of older adults. The objective of robotics is not to replace humans by mechanizing and automating all tasks, but rather to find

___________________

13 This definition of robotics is a modification of Matt Mason’s from the 2017 Lynch and Park textbook, Modern Robotics: Mechanics, Planning, and Control (K.M. Lynch and F.C. Park, eds.), Cambridge University Press, http://hades.mech.northwestern.edu/index.php/Modern_Robotics.


new ways that allow robots to collaborate with humans more effectively. Robots are better than humans at tasks such as crunching numbers and moving with precision. Robots can lift much heavier objects. Humans are better than robots at tasks like reasoning, defining abstractions, and generalizing or specializing, thanks to our ability to draw on prior experiences. By working together, robots and humans can augment and complement each other’s skills. The vision of “co-robots” capable of working in close proximity and collaboration with people is poised to be realized in the coming years of robotics research.
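The “sense-plan-act” loop described above can be sketched in a few lines. The code below is an illustrative toy (a one-dimensional robot stepping toward a goal position); the function names and structure are assumptions made for this example, not the API of any real robotics framework.

```python
def sense(position, goal):
    """Perceive the current state of the world (here, just the gap to a goal)."""
    return goal - position

def plan(error, max_step=1.0):
    """Plan a course of action: step toward the goal, bounded by actuator limits."""
    return max(-max_step, min(max_step, error))

def act(position, step):
    """Execute the plan through the robot's motors (here, a 1-D move)."""
    return position + step

def run(position, goal, max_iters=100):
    """Repeat the sense-plan-act cycle until the goal is reached."""
    for _ in range(max_iters):
        error = sense(position, goal)
        if error == 0:
            break
        position = act(position, plan(error))
    return position
```

For example, `run(0.0, 5.0)` advances the robot one bounded step per cycle until it reaches the goal. Real systems close this same loop with noisy sensors, probabilistic plans, and imperfect actuators, which is why the reasoning-under-uncertainty methods discussed in this section matter.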

The recent past has seen significant advancements in robotics. Key drivers for this growth have been mission-oriented U.S. agencies and programs, including NASA efforts for Mars exploration, the Air Force 2020 vision, the DARPA Challenges (Desert Driving, Urban Driving, and Dexterous Manipulation), the Office of Naval Research (ONR) Science of Autonomy program, and the NSF National Robotics Initiative. These programs have invested consistently in robotics over the last decade. As a result, annual patent filings in robotics have tripled over the past decade. Venture capital investments doubled over the past year. And the technology research firm International Data Corporation estimates that the robot market was worth $135 billion in 2019.14 As the research investment pays off, and robots move from our imaginations into our homes, offices, and factory floors, they will become the partners that help us do so much more than we can do alone.

Many of the advancements in autonomous robotics can be traced15 back to initial robotics research projects established in the 1960s and 1970s with growth supported by sustained federal funding. Autonomous vehicles and planetary rovers, for example, are descendants of research projects in mobile16 robotics such as SRI’s Shakey the robot, developed with support from DARPA, ONR,17 the U.S. Army, NSF, and NASA. The knowledge and talent resulting from these early investments turned into an industry of significance in the 1990s and 2000s. This industry was led by products such as iRobot’s now ubiquitous Roomba autonomous vacuum cleaner and its ruggedized PackBot platform. The PackBot and similar field robots have been used for searching in the debris of Ground Zero after 9/11, assessing the damage after the Fukushima nuclear incident, assisting service people around the world, and exploring the unknown. The advancements in robotics over the past decade have

___________________

14 R. Waters and T. Bradshaw, 2016, “Rise of the Robots Is Sparking an Investment Boom,” Financial Times, https://www.ft.com/content/5a352264-0e26-11e6-ad80-67655613c2d6.

15 N.J. Nilsson, 2010, “The Quest for Artificial Intelligence (Web Version),” Stanford University, https://ai.stanford.edu/~nilsson/QAI/qai.pdf.

16 D. Szondy, 2015, “Fifty Years of Shakey, the ‘World’s First Electronic Person,’” New Atlas, https://newatlas.com/shakey-robot-sri-fiftieth-anniversary/37668/.

17 D. Smalley, 2018, “From Fiction to Reality: U.S. Navy Technology and Innovation,” Navy Live Blog, https://navylive.dodlive.mil/2018/01/02/from-science-fiction-to-reality-u-s-navy-technology-and-innovation/.


demonstrated that robotic devices can locomote, manipulate, and interact with people and their environment in unique ways. The locomotion capabilities of robots have been enabled by the wide availability of accurate sensors (e.g., laser scanners), high-performance motors, and the development of robust algorithms for mapping, localization, motion planning, and waypoint navigation. Robotic swarms coordinate large numbers of generally simple individual robots so that they behave as a single system.

The impact of autonomous robotics has accelerated since the 2000s due to several key drivers, including the following: (1) the maturation of inexpensive robot hardware components (such as lidar ranging sensors, structured light cameras, and series-elastic actuators) and their interoperation through middleware frameworks (such as the Robot Operating System); (2) the advancement of robot algorithms for control, perception, planning, and state estimation such as simultaneous localization and mapping (or SLAM for short), which are now scalable and deployable at city-scale, enabling autonomous navigation; (3) the proliferation of a diverse array of robot platforms capable of mobility and dexterous manipulation for various situations (drive-by-wire cars, quadrotor drones, fulfillment robots in warehouses); and (4) the availability of data sets and the rise of viable machine learning methods that enable robots to recognize and work with objects in common human environments. Further, with the rise of human-robot interaction and co-robotics research, socially assistive robotics offer the promise of new therapies for improved treatment of social disorders, such as autism, and physical rehabilitation. Over the past decade, through major cross-agency support such as the National Robotics Initiative, these advances have enabled robotics to begin automating tasks in the physical world similar to how computing has automated information tasks in the past.
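The state-estimation methods cited in point (2) rest on recursive probabilistic filtering: fusing noisy motion commands with noisy sensor readings. The one-dimensional Kalman-filter step below is an illustrative sketch of that core idea (full SLAM also estimates the map jointly with the robot's pose, and the function names here are assumptions made for this example).

```python
def kalman_predict(mean, var, motion, motion_var):
    """Propagate the position estimate through a commanded, noisy motion.
    The mean shifts by the motion; uncertainty grows by the motion noise."""
    return mean + motion, var + motion_var

def kalman_update(mean, var, meas, meas_var):
    """Fuse a noisy measurement into the current position estimate."""
    gain = var / (var + meas_var)           # Kalman gain: how much to trust the measurement
    new_mean = mean + gain * (meas - mean)  # shift the estimate toward the measurement
    new_var = (1 - gain) * var              # the fused estimate is more certain than either input
    return new_mean, new_var
```

For instance, starting from an estimate with mean 0 and variance 4, a measurement of 2 with variance 4 yields a fused mean of 1.0 and variance 2.0: the filter splits the difference and ends up more certain than either source alone. Scaling this recursion to thousands of landmarks and poses is what makes city-scale SLAM a substantial algorithmic achievement.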

Recent robot growth18 indicators include 19 percent growth in units sold in 2017 over 2016. Seventy percent of the worldwide sales of robots are in China, Japan, the United States, South Korea, and Germany, with the U.S. economy investing $10.3 billion for robot units and $30.5 billion for robot systems. Commercial success stories include the Kiva robots, which are enabling Amazon’s business model. Building from university-based robotics research19 in the 1990s and 2000s, Amazon now operates more than 200,000 robots in their fulfillment centers. iRobot’s vision-based SLAM solution deployed in all their 900 series robots builds on research from multiple universities, including Carnegie Mellon University, the Massachusetts Institute

___________________

18 B. Raphael, R.O. Duda, R.E. Fikes, P.E. Hart, N.J. Nilsson, P.W. Thorndyke, and B.M. Wilber, 1971, “Research and Applications—Artificial Intelligence,” National Aeronautics and Space Administration, https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19730014552.pdf.

19 S. Crowe, 2020, “Kiva Systems creators inducted into National Inventors Hall of Fame,” The Robot Report, https://www.therobotreport.com/kiva-systems-creators-inducted-into-national-inventors-hall-of-fame.


of Technology, and University of Southern California. More than 25 million units have been deployed by iRobot with this SLAM capability. Intuitive Surgical remains the leader in technologies for robot surgery, with over 5,000 systems deployed worldwide that have performed over 7 million procedures.20 The market for robotic surgery is expected to grow to over $9 billion by the year 2030.21

Robotics has also been a catalyst for emerging areas of growth in the entrepreneurial sector. Start-ups in the area of autonomous vehicles abound, including Waymo, May Mobility, Nutonomy, Argo AI, and Cruise Automation, as well as the now-established Tesla. Such start-ups often have investments or partnerships with larger, more established automotive companies. A number of emerging start-ups are poised to further innovate the retail supply chain. Bossa Nova Robotics is working with Walmart on solutions for inventory tracking in retail environments. Fetch Robotics provides autonomous material-handling solutions for warehouses and cloud-managed supply chain logistics. One limitation of all of these systems is their restriction to moving on the flat surfaces of the roadway or retail floor. Robots are increasingly ready to take on the stairs and move the way people do. University researchers, along with cutting-edge start-ups such as Agility Robotics and Boston Dynamics, are working on next-generation robots capable of bipedal locomotion. Such bipedal robots offer the prospect of autonomy taking the last few steps from the car to the front door.

COMPUTER VISION

Computer vision22 is the science of understanding images and videos. Computer vision methods seek to recover models or descriptions of objects and scenes (e.g., garden versus kitchen versus parking lot) from visual data. The resulting descriptions can vary from detailed geometric models (e.g., measure the size of paint damage on a bridge from drone imagery) to very abstract predictions of likely successful actions. Descriptions can vary from the locations and sizes of particular objects to detailed written captions. Although computer vision methods have found widespread industrial application, including for visual effects, advertising, and autonomous vehicles, it is widely believed that vision remains a complex and poorly understood skill. Many important practical problems remain unsolved or even undescribed.

___________________

20 Intuitive, “About,” https://www.intuitive.com/en-us/about-us/company, accessed July 1, 2020.

21 BIS Research, 2020, “Global Robotic Surgery Consumables Market—Analysis and Forecast, 2020-2030,” https://bisresearch.com/industry-report/robotic-surgery-consumables-market.html.

22 David Forsyth, University of Illinois, Urbana-Champaign, contributed the original draft of this section.


U.S. government funding for computer vision began around the 1960s and has continued in a variety of forms across multiple agencies (DARPA, ONR, NASA, and NSF). These investments reflect the widespread utility of the ideas. For example, agencies with responsibilities for satellites and mapping were early supporters of work in photogrammetry—the problem of recovering geometric descriptions from multiple images. Defense agencies have had a sustained interest in research on object detection—the problem of determining which objects are in an image and where they are. Likewise, the intelligence community has been particularly enthusiastic about interpreting aerial images and understanding mixed resources of images and text. Early techniques from pure computer vision research include scale-invariant feature transforms (for matching),23 histograms of oriented gradients (for recognition),24 and convolutional neural networks (for character recognition).
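To give a concrete sense of the second of these techniques, the sketch below computes a bare-bones histogram of oriented gradients for a single image cell. Real HOG descriptors (Dalal and Triggs) add block normalization, orientation interpolation, and overlapping cells, so this is for intuition only; the function name is an assumption made for this example.

```python
import numpy as np

def hog_cell(patch, n_bins=9):
    """Histogram of oriented gradients for one image cell.

    Each pixel's gradient votes for an orientation bin, weighted by
    its gradient magnitude, so the histogram summarizes the dominant
    edge directions in the cell.
    """
    gy, gx = np.gradient(patch.astype(float))   # image gradients (rows = y, cols = x)
    magnitude = np.hypot(gx, gy)                # gradient strength per pixel
    # Unsigned orientation in [0, 180) degrees, as in the original descriptor.
    orientation = np.degrees(np.arctan2(gy, gx)) % 180
    bins = (orientation / (180 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())  # magnitude-weighted votes
    return hist
```

A patch whose intensity increases purely left to right produces gradients all pointing in one direction, so every vote lands in the first orientation bin; a textured patch spreads its votes across bins. Concatenating normalized histograms from a grid of such cells yields the descriptor used for detection.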

The ImageNet Large Scale Visual Recognition Challenge provided a key benchmark for advances in image recognition. Its conception and development were supported by NSF25 and multiple companies and other institutions. Although neural networks had been around for decades, researchers at the University of Toronto changed the field of computer vision by obtaining dramatically better results in the contest using deep convolutional neural networks than were possible with prior methods.

Industrial research in computer vision has waxed and waned over the past 60 years but has grown explosively over the past 15 years. Some early work focused on robotics problems, including visual processing to support picking objects, automated welding, and moving pallets around warehouses. A significant area of applied research has been in three-dimensional reconstruction from video. This work now drives applications as diverse as creating visual effects for the entertainment industry and measuring the condition of civil engineering infrastructure like bridges. The area blossomed out of early government-funded research on the geometry of multiple views, on reconstruction from video, and on matching using local image appearance. For users with low or no vision, computer vision–based systems that can read documents and bar codes or identify elements of a scene, which are often implemented in mobile devices, can be of tremendous assistance.

___________________

23 D.G. Lowe, 1999, “Object Recognition from Local Scale-Invariant Features,” pp. 1150-1157 in Proceedings of the International Conference on Computer Vision. 2, doi:10.1109/ICCV.1999.790410.

24 N. Dalal and B. Triggs, 2005, “Histograms of Oriented Gradients for Human Detection,” pp. 886-893 in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, doi: 10.1109/CVPR.2005.177.

25 See, for example, J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, 2009, “Imagenet: A Large-Scale Hierarchical Image Database,” pp. 248-255 in 2009 IEEE Conference on Computer Vision and Pattern Recognition, doi:10.1109/CVPR.2009.5206848; supported in part by NSF Award 0509447, CSR-PDOS: Content-Searchable Storage for Feature-Rich Data. Subsequent partial support for ImageNet was provided by NSF Award 1115493, III: Small: Collaborative Research: Using Large-Scale Image Data for Online Social Media Analysis.
