The workshop concluded with an open discussion of the seven questions posed by the workshop sponsor. Planning committee members moderated the discussion for each of the questions.
- What is the current progress in the conceptualization, development, and adoption of machine learning (ML) in the cyberattack and cybersecurity processes, such as in components of the cyber kill chain and cybersecurity operations?
Vinh Nguyen, National Security Agency, began the discussion by encouraging participants to look at the realities of ML and artificial intelligence (AI), to cut through the hype and address where the technology is headed, with particular attention to how adversaries might use these techniques to advance or augment their cyber expertise, operations, or decision-making process. Such insights are needed to help guide policy considerations and identify areas the United States might invest in now to help prepare for the threat landscape 5 years down the road. John Manferdelli, Northeastern University, moderated an open discussion on the question, urging participants to consider areas that had not been discussed previously that might be important or could catch us by surprise, including work being done internationally (e.g., the use of AI for the social credit system being developed in China).
One participant, the chief data scientist for U.S. Cyber Command’s Applied Research and Development Division, said that AI and ML are more likely to be used on the offensive side of the cyber kill chain than to defend cyber systems, largely because of trust issues. He suggested that experts need to do a better job explaining how AI and ML work to cyber defenders so they will trust and adopt them. David Brumley, Carnegie Mellon University and ForAllSecure, said that the use of AI and ML for finding and weaponizing new vulnerabilities is in the conceptualization and development stage in the United States, and likely in China and Israel as well. Despite many efforts considering the deployment of AI technologies and some success in startups, it is premature to say that the technologies have been widely adopted. Jay Stokes, Microsoft Research, mentioned recent work there toward concrete applications of AI and ML across the cyber kill chain. One study1 presented a new system—tested on data sets that included one targeted, real-world attack—for detecting an adversary’s lateral movement2 within a network.
1 Q. Liu, J.W. Stokes, R. Mead, T. Burrell, I. Hellen, J. Lambert, A. Marochko, and W. Cui, 2018, “Latte: Large-Scale Lateral Movement Detection,” 2018 IEEE Military Communications Conference (MILCOM), doi:10.1109/MILCOM.2018.8599748.
2 Rapporteur’s note: Lateral movement occurs when an attacker moves from the originally infected computer to others in the system in order to carry out an objective; this movement typically begins in the middle stages of the cyber kill chain.
He suggested that additional data from the field would be helpful for further progress. A second study looked at the detection of malware signatures based on command-line processing,3 and a third one, soon to be released, would address interpretability for brute force attacks and scanning. He added that although it is difficult to arrive at publishable results, his team is receiving new data sets from Windows Defender ATP to enable them to pursue this work.
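The general idea behind lateral-movement detection can be illustrated with a toy sketch (this is not the Latte system described above; the host names, thresholds, and baseline logic here are hypothetical simplifications): build a baseline of which host-to-host logons are normal, then flag authentication edges that fall outside it.

```python
from collections import Counter

def rare_logon_edges(history, window, min_count=2):
    """Flag remote-logon edges in `window` that were rarely or never
    seen in `history` -- a crude proxy for lateral movement, where a
    compromised host starts authenticating to machines it normally
    does not touch."""
    baseline = Counter(history)  # (src_host, dst_host) -> count
    return [edge for edge in window if baseline[edge] < min_count]

# Typical traffic: workstations talk to the file and mail servers.
history = [("ws1", "files"), ("ws1", "files"), ("ws2", "files"),
           ("ws2", "files"), ("ws1", "mail"), ("ws1", "mail")]
# Today ws1 suddenly logs on to a peer workstation and the domain controller.
today = [("ws1", "files"), ("ws1", "ws2"), ("ws1", "dc1")]

print(rare_logon_edges(history, today))  # [('ws1', 'ws2'), ('ws1', 'dc1')]
```

Real systems such as the one Stokes described must also rank and suppress benign-but-rare edges, which is where the ML and the field data come in.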
An audience member asked about the degree to which knowledge of the domain in which an attack occurs affects how ML systems might be applied to the problem. A system can’t be trained if the environment is unknown; the questioner asked what it would take to develop a rapid-feedback discovery system for automated assessment of attack capabilities. Yevgeniy Vorobeychik, Washington University, responded by pointing out the distinction between AI that solves a general class of problems and point-solution AI that works in a very specific context. Most successes have come with point-solution AI, he said. When you can define the problem precisely—for example, if you want to discover Internet of Things devices that have vulnerabilities—you can collect data and automate the process. This need not necessarily be done via ML—it could be via a decision-support tool. More general categories of the problem cannot be defined, and thus cannot be solved with AI. Wenke Lee, Georgia Institute of Technology, agreed that experts have a very good understanding of how to model specific point attacks, but added that attackers can be very creative, using multiple tools and strategies. He commented that researchers and industry seem to be moving toward modeling those multiple or higher-level strategies upfront.
Still, the audience member pointed out, it remains unclear how ML- or AI-informed cyberattacks might play out when the attacker has little or no information on the system going in. For example, what if an attacker wants to take down Google worldwide with little knowledge of its network? Manferdelli stressed the importance of data to ML models and suggested that it may be possible, in such an instance, to obtain data that the target accidentally discloses (e.g., pizza being ordered for a military office right before a major action), but if there truly is no data and no upfront information about the system, it would not be amenable to ML and instead becomes a general AI question, or worse.
Brumley, Manferdelli, and Subbarao Kambhampati, Arizona State University, discussed distinctions between large technology companies and the U.S. government, emphasizing that one may not be an appropriate stand-in for another when trying to understand hypothetical scenarios. For example, Brumley commented that a large company like Google may employ many AI and ML experts who can tweak their algorithms in order to make any possible gains in efficiency of detecting attacks over time, whereas the U.S. government might see such tools as products to be purchased. Manferdelli noted that Google also has significantly more data. Brumley responded that, while Google may have more data for experimentation, the government might have authorities that permit different kinds of experimentation. Kambhampati commented that it may not be fair or well-informed to make direct comparisons of capabilities of the private sector versus governments, which have different roles and models of operation.
- What are the key cybersecurity problems that can be solved using ML? What AI and ML techniques are well suited for cybersecurity? What are areas of cybersecurity that ML shows great promise for resolving? What are areas of cybersecurity that will prove challenging to address using ML?
Lee launched the discussion with an emphasis on defense, including current and potential applications of AI and ML for attack prevention, detection, and response. In the prevention space, he said ML is poised to help with user authentication, including with biometric data, as well as with automatically generating firewall or intrusion prevention rules. For detection, both industry and academic researchers are looking at ML to help with attack detection, intrusion detection, malware detection, clustering, and anomaly detection. In his view, ML is most suited for classification (figuring out whether there is an attack) and clustering (figuring out what family of attacks to analyze). He also suggested that graph and link analysis algorithms—techniques not addressed so far in the
3 J.W. Stokes, R. Agrawal, M. Marinescu, and K. Selvaraj, 2018, “Robust Neural Malware Detection Models for Emulation Sequence Learning,” presented at 2018 IEEE Military Communications Conference (MILCOM), IEEE, https://www.microsoft.com/en-us/research/uploads/prod/2019/03/Milcom2018_Agrawal.pdf.
discussions—hold potential for helping piece evidence together. Using ML for recovery and response is an area that is not yet well developed, he said.
Lee identified several areas where ML might show great promise further down the road. First, he suggested that in forensic analysis, in particular, attribution is very ripe for ML-based solutions; the availability of forensic data from malware, hosts, and infrastructure suggests that ML could be a useful approach for attribution of the origin of an attack to an individual, group, or nation-state. Lee then suggested that ML also has strong potential to assist in identifying both attack and defense strategies. In particular, it could be helpful for recognizing an adversary’s attack plan, modeling and/or optimizing a security administrator’s response strategy, or prioritizing different security alerts. He related this to Brumley’s insight, from the Defense Advanced Research Projects Agency (DARPA) Cyber Grand Challenge (CGC), that ML was useful in helping to prioritize how teammates should best use their limited time; Lee suggested that ML-assisted, strategic decision-making should be studied as a way to improve efficiency of cybersecurity operations. Among the harder problems for ML, according to Lee, are the generation of patches and risk analysis to inform organizational policy.
Manferdelli agreed that there is likely more that can be done in the area of surveillance assessment and perhaps in automatic scanning of the Web or a specific system to identify new events. He raised the question of what can be learned from side channels, rather than an event itself, which may be hidden. He also raised a more general question about the efficacy of current approaches to research and development (R&D) in the area of AI for cybersecurity. In particular, he noted that some fields, such as computer hardware, rely on phenomenological approaches for making progress, but he expressed a sense that AI researchers are focused more on publishing papers. He suggested that incorporating more phenomenological experiments could be helpful for researchers exploring the potential applications of AI to cybersecurity problems.
Sven Krasser, CrowdStrike, pointed out that data is central to ML, but not all data collection has been designed with ML in mind. Using data sets with proxy rules and policies can result in a model that parrots the proxies and rules, instead of modeling the core issue. He argued that raw data collection is better than data filtered through a policy framework. If a policy framework must be used, he urged data collectors to make the rules consistent in order to make it possible for the data to be used successfully for ML.
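Krasser’s point about proxy labels can be sketched concretely (the ports, labels, and stump-style classifier below are hypothetical illustrations, not anything described at the workshop): if training labels are generated by a policy rule rather than by ground truth, a model fit to those labels simply reproduces the policy.

```python
from collections import defaultdict, Counter

def train_majority_by_port(events):
    """Learn, per destination port, the majority label seen in training.
    This stump stands in for any classifier fit on the labeled data."""
    by_port = defaultdict(Counter)
    for port, label in events:
        by_port[port][label] += 1
    return {p: c.most_common(1)[0][0] for p, c in by_port.items()}

# The labels below come from a proxy *policy* ("block telnet"), not
# from ground truth about which flows were actually malicious.
policy_labeled = [(23, "block"), (23, "block"), (443, "allow"),
                  (443, "allow"), (80, "allow")]
model = train_majority_by_port(policy_labeled)

print(model[23])   # 'block' -- the model has learned the policy...
print(model[443])  # 'allow' -- ...so malicious HTTPS traffic sails through
```

The model parrots the rule that generated its labels, which is exactly the failure mode Krasser warns about when data collection is filtered through a policy framework.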
Tyler Moore, University of Tulsa, mentioned that some progress has been made in collecting external measurements of networks and organizations and correlating that information with security events. For example, companies such as BitSight, QuadMetrics, and Security Scorecard are using ML to relate some measure of network hygiene to security outcomes. It remains an open question how effective this is, he said, but the approach offers hope for connecting an organization’s security practices to significant events, such as data breaches. So far, the models are correlative rather than explanatory, said Moore, but the area warrants more work.
Alex Kantchelian, Google, provided a counterpoint to some of the optimism expressed by other participants about the security problems that have been solved to date. For example, he suggested that the statistical approaches for intrusion detection are still somewhat naïve. While intrusion detection is probably one of the most important cybersecurity applications for industry, he said, the systems deployed today are generally rule-based and not ML-based. There is little work being done that uses ML algorithms to reason on voluminous, complex, temporal sequences of events that are both high and low level. The main challenge resides in collecting and collating these events at scale into semantically meaningful groups that an ML system can in turn analyze. Lee countered that industry is actually lagging behind academic research in the area of intrusion detection, likely by about 10 years.
Una-May O’Reilly, Massachusetts Institute of Technology, agreed with Kantchelian that much of the progress made has been naïve, suggesting that ML as applied today can address only isolated facets of cybersecurity, and fails to elucidate causality. She added that transparency and trustworthiness are also important issues to be addressed. In order to make progress, for example, on collecting measurements and risk analysis, the ML community needs help from others: it needs to be able to share data privately and anonymously and to use that shared data to produce aggregate models, predictions, and forecasts that still manage to keep the data de-identified. She identified technology and policy for secure, private, and trustworthy data sharing as a fundamental problem for ML researchers working in cybersecurity. She noted that she has been interacting with the financial sector to build trust and develop those mechanisms. She also stressed the importance of integrative approaches that address the layered and complex nature of cyber systems and security and urged the community
to reward work aimed at integrating an understanding of threats, their behaviors, and their consequences using a multilayered approach.
Stokes wrapped up the conversation by sharing his perspective on progress in ML systems at Microsoft Research over the past two decades. In 2006, they began building and internally deploying systems for analyzing telemetry data collected from clients. In 2010, teams were building progression-based systems, built on file classification. By 2012, Microsoft Research began working with deep learning for malware classification and in 2013 began building systems that looked at sequences of events to model malware—systems that are still being refined today. Deep learning, he said, has been successful, and the resulting systems have advanced beyond signature-based approaches. Real-time blocking exists, based on calls back to the cloud, which then uploads files very quickly.
- What is the biggest “ask” the cybersecurity community could make of the ML community?
Workshop chair Fred Chang launched the discussion of Question 3 by stating that the cybersecurity community could learn from the ML community’s experience in conditioning, tagging, and fusing data. He expressed a sense that many of the ML techniques being deployed in cybersecurity have been borrowed from the field of computer vision, and that it could be helpful for the ML community to develop and use techniques that are specific to cybersecurity. Lastly, he said that the ML community could focus on tools for assisting analysts, who must maintain situational awareness, fuse data, and make recommendations for actions. Delip Rao, AI Foundation, built on this point, noting that ML can augment analysts’ capabilities, working with the analyst-in-the-loop, to make their jobs easier while improving outcomes by enabling the analyst to provide real-time guidance to the models. There is potential for new models to help, not only in the context of cybersecurity, but also for fact checking and addressing instances of disinformation.
Manferdelli mentioned the company Sqrrl, acquired by Amazon, which is applying ML to NetFlow data for detection of network-based attacks. O’Reilly said the ML community could leverage malware data sets for cybersecurity purposes, but cautioned that network security training data sets would need to be from multiple sources because network attacks are very disparate. Furthermore, ML approaches to network security would only be valuable if aimed at getting to the root of an integrated problem, rather than making marginal improvements to solving a particular problem that was already well-understood.
Lee said his biggest ask is to make ML algorithms, including deep learning models, more explainable. If the models can’t be explained, there is no way to reason about how robust they are or about their accuracy, he added. Krasser agreed that explainability is an important ask. He also echoed Chang’s points about needing to fuse or make sense of multiple data formats and the reliance on computer vision-based approaches in the security contexts discussed. He added that computer vision models are currently popular—in some ways appealing or easy for people to comprehend due to their visual nature—but they do not work with the kinds of hard constraints seen in a network security context, an area in which he’d like to see more work.
Nicolas Papernot, Google Brain, reiterated that data sets need to be in a format that the ML community can understand, and capable of yielding results relevant to threat modeling. In particular, the metrics that researchers report must be meaningful in the context of complex situations in which a defender faces an adaptive adversary. He then went on to express a different perspective on explainability, suggesting that he felt the issue was being overemphasized. He compared the use of a deep neural network to a practical deployment of decision trees, which would in practice include thousands of trees with thousands of leaves. Looking at the layers of a deep neural network reveals some representation of the data, and while we may not be able to explain different features extracted by individual neurons, that might not be necessary. He suggested that ML is being applied to certain problems because no one knows how to write programs for solving them directly—in such instances, logic that humans can understand has not proven capable of solving the problem. Lee commented that a decision tree yields output that conforms to rules that an expert can actually read and understand, and which match their intuition or other heuristics derived from actual domain knowledge. He said the biggest issue with deep neural networks is that the human analyst doesn’t know which neurons will be fired because the function is so complex. The process is a complete black box and hard to ground in reality, in domain knowledge, or in the particulars of a network.
Krasser pointed out that the model that explains can be different from the model that detects. Humans can use deep neural networks to detect and then use something else—possibly descriptions of interim states of that model—to explain what happened. Papernot agreed with this point but argued that the bar is too high on explainability and that, while explainability might bring some insights on machine logic, humans may not be able to understand everything. He also pointed out that humans commonly come up with explanations for their decisions that are perfectly intuitive but actually inaccurate. Explainability is a word being used to mean a lot of things, he said, and this makes for a confusing debate. Krasser added that explainability is a systems problem rather than an algorithmic one, and suggested that explainability may not extend to the lowest levels of a model, which often deal with highly abstract traits with little utility for human operators. Kambhampati said he believed that explainability is already important in any instance of joint work between humans and machines, but he agreed with Papernot that explainability may be overemphasized in the security context. Perhaps, he suggested, debugging tools are actually the more important issue.
Phil Venables, Goldman Sachs, observing that explainability may mean different things to different people, made the case that the cybersecurity and ML communities need a common taxonomy that defines certain terms and provides a common basis for progress. Papernot said that there are now many workshops aimed at bridging these communities and that such a taxonomy may well emerge naturally over time.
Kambhampati questioned whether ML should be considered the only “savior” for cybersecurity and noted that much of the discussion, in his view, has revolved around buzzwords and solutions (e.g., supervised learning and clustering) looking for a problem, rather than starting with the problems the cybersecurity community deals with. O’Reilly acknowledged this point and suggested that the cybersecurity community should ask the ML community for trustworthy ML. Just like systems need to be secure by design—from networks to operating systems to the application layers—ML also needs to be secure by design, she concluded. Papernot added that the National Science Foundation (NSF)-funded Center for Trustworthy Machine Learning4 was conceived to advance such a goal.
- What are the key challenges and barriers to ML adoption for cybersecurity by business and governments?
Addressing Question 4, Venables began by noting that one key challenge for business and government is the lack of viable ML products. While some products are effective, many are not, he said, and some are only algorithms or features that should really be part of another product or service. The challenge, in his view, is to be able to recognize a viable product that doesn’t require special skills to integrate into an environment. Venables also mentioned the challenge of data availability as well as ongoing data curation and governance to ensure data is accurate and free of bias, and has provenance that can be substantiated.
Venables also spoke to the challenges of having scientists in business and government with the right skills and making sure that data scientists and ML professionals are using their talents effectively. In some situations, he said, data scientists can be assigned to tasks such as data tagging and basic curation or other problems that could easily be handled by other people or processes. It is not the case that anything having to do with ML requires a PhD, he asserted. Other challenges and barriers that Venables mentioned included explainability, which in his view should be seen more in terms of the bigger goal of model assurance and confidence, and macro deployment control versus micro controls. While micro controls focus on individual defense techniques, macro controls are the patterns that would be deployed into an environment to limit the “blast radius” when a system is under attack and things go wrong. Lastly, he suggested that the ML community sets its focus too narrowly on certain aspects of cybersecurity, overlooking opportunities to apply AI and ML more generally to functions like auditing, managing information technology (IT) general controls, different kinds of surveillance, improving people’s productivity, or applying AI for physical security.
Krasser reiterated his view that many products claim to use ML but do not actually do so in a substantive way. He suggested that it could be useful to define some way of demonstrating what is actually happening under the hood. At the same time, users must move past seeing ML as just a checkbox: the primary concern should be
4 National Science Foundation, 2018, “NSF Announces $78.2 Million to Support Frontiers of Cybersecurity, Privacy Research,” October 24, https://www.nsf.gov/news/news_summ.jsp?cntn_id=296933.
on the functionality rather than the mechanism for achieving it. Venables suggested that companies that claim to make ML products could predefine their model assurance and their constituent makeup—almost like an AI/ML bill of materials—in order to verify how the techniques are contributing to the end product. Moore suggested that AI/ML vendors are following a longstanding tradition of not having useful product evaluations. In particular, it has been a challenge to evaluate the security of a software product. The software industry has traditionally handled evaluation via common criteria certification schemes, but there are many pitfalls. One way to separate the wheat from the chaff might be to start requiring certain disclosures, such as a list of “materials”—a product’s AI components or even attributes of the underlying data.
Nguyen brought up the human challenges involved in ML adoption in the areas of education, talent, and trust. He said research shows that integrating AI and ML with human processes leads to the best outcomes, and that if the ML community is developing and introducing new technologies quickly, it needs to figure out how to help people through the adoption process quickly. The challenge is finding the right process to achieve the best outcomes—one better than the current practice of porting technologies over and leaving users to figure them out. Kathleen Fisher, Tufts University, noted that DARPA is addressing the challenge of partnering humans with cyber reasoning systems through a new initiative called CHESS.5
O’Reilly said she has found that most people in business and government care about compliance and regulatory security—additional challenges that will need to be addressed.
- What countries or companies are investing in the application of ML to their cybersecurity processes and what are these company or country partnerships?
In Venables’s view, the countries doing the most with ML and AI for cybersecurity are the United States, the United Kingdom, Israel, and China. In some cases, the companies using ML have spun off from government agencies. The business sectors that are applying the technology in the area of cybersecurity are the big technology companies, the big financial companies, and in many cases the security vendors that work with those technology and financial companies. He reiterated that AI and ML technologies are sometimes used even when simpler solutions exist; on the other hand, they may be underused in some areas that could benefit, such as natural language processing for encoding and analyzing data about security incidents, large-scale Web crawling for sentiment analysis, and brand protection. He reiterated Moore’s point that large-scale surveillance and anomaly detection in extended supply chains (to both third and fourth parties) has proven useful.
Stokes said that Microsoft is committed to ML and that Google likely is as well, and there are a huge number of professionals in the field now. Venables reminded the group that companies don’t always require PhDs in ML; many tools exist that regular programmers can use quite effectively. Stokes pointed out a key challenge that he has observed—the overlap of efforts as multiple teams work on similar types of problems. Manferdelli reiterated that the trend is to hire PhD-level ML staff, which in his view shouldn’t be needed for every team member on an ML or AI project.
- How will proliferation and adoption of ML change the potential consequences (including physical, economic, or psychological impacts) of cyberattacks?
Fisher boiled Question 6 down to the following question: How will AI change cybersecurity? She stated that cybersecurity is inherently an arms race, and AI and ML are important parts of the evolution of that race. One aspect of cybersecurity that AI and ML are likely to change is the timescale, she said. The timescale of cyber operations can be too fast for humans to be in the decision loop, which may mean that humans need to serve in more of an oversight capacity while machines proceed at their own pace. Remote systems empowered to make decisions without “phoning home” could dramatically change the timescales of operations. A key question remains: How hard will it be to automate reconnaissance, analysis, bug fixing, patching, and exploits? If
5 For more information, see Defense Advanced Research Projects Agency, “Computers and Humans Exploring Software Security (CHESS),” https://www.darpa.mil/program/computers-and-humans-exploring-software-security.
that turns out to be an easy problem to solve, then faster-than-human timescales and automated decision-making will take on a more important role. Fisher expressed her opinion that automating these processes is likely to be difficult and take years, and that so-called “hard-cyber” will not be particularly amenable to these techniques—in part because the code space is generally non-linear. However, she did note that there is some counterevidence to this view to suggest that progress could be coming. One example is a project called GamePad,6 which looked at using ML in a theorem prover context where the ML system predicted the number of steps in a proof and which tactic would be used in the next step. GamePad was reasonably successful in a very narrow domain. She also pointed to the GPT-2 language model7 as another example of coherent text resulting from a statistical approach. She noted that it is unclear whether these instances could be generalized outside of the specific contexts examined.
The impact of AI and ML on cybersecurity also will vary for different countries depending on their tolerance of risk and their access to data, Fisher noted. In particular, the United States is less likely to tolerate collateral damage. In China, where the lines between government and industry are blurred, data access is easier in part because there is a higher tolerance of surveillance. Russia appears indifferent to what other countries think of its tactics. These realities, she said, give other nations experience and practice that are not likely to be available in the United States.
Noting the difficulty of predicting when hard problems will be solved, O’Reilly added that researchers are making progress on multiple fronts, including ML for both static and dynamic inspection of programs, and for program synthesis—an area that both O’Reilly and Fisher have worked on, as well as others, such as Joshua Tenenbaum.8 She said that these are areas where the ML community could come up with novel ideas.
Moore said that while the United States may have some disadvantages compared to China and Russia, it also has the advantage of being home to the private-sector platform companies who are doing much work in ML and cybersecurity—and that the firm separation between the government and industry in the United States might also be an advantage. He added that if ML does proliferate in the field of cybersecurity, this could actually help reduce the expected shortage of cybersecurity professionals in the coming years. For security purposes, closing that gap would be a positive outcome, he suggested. Lee noted that proliferation of ML would automate many low-skill kinds of attacks, but it would also encourage attackers to become more creative. On the other hand, he added, increased automation of cybersecurity tasks could also have the benefit of helping to reduce potential insider threats because fewer humans are involved.
- Are there other areas of AI research beyond ML that are significant from a cyber threat or cybersecurity perspective?
Kambhampati opened the discussion on Question 7 by stating that cybersecurity is an AI-complete problem, not just an ML problem, urging a much broader view of AI in this context. While classification and clustering are useful tools, cybersecurity is a much broader space. He reiterated Brumley’s perspective that autonomy, rather than ML, will have the biggest impact on cybersecurity. He added that strategic and longitudinal approaches have not been addressed, and while point solutions may help some businesses make money, they don’t solve the breadth of security problems that exist. Kambhampati identified the AI research topics of sequential decision-making and planning as potentially useful on the strategy front. He also emphasized penetration testing—testing a system to find security vulnerabilities that an attacker could exploit and then trying to figure out what an attacker will do—as an example of a more strategic approach to cybersecurity that could be addressed via AI and automation beyond ML.
Kambhampati argued that cybersecurity is a game theory problem and that the community should focus more on game theory techniques than on explainability, data cleaning, and data availability, which he views as smaller and less forward-looking issues. He also encouraged looking at ML applications more broadly rather than with a narrow focus on, for example, detecting malware. He argued that it may be useful to learn transition models, reward models, and policies. Lee responded by pointing out that cybersecurity researchers have tried to apply game theory for years, and the models that have resulted have been too simple to work. Manferdelli said game theory hasn't worked in practice because people don't understand or believe the outcomes. In part, the incentives of AI and cybersecurity may not be aligned, in his view—the issue of trust is huge in cybersecurity in a way that it may not have been in AI historically. He added that technologies such as automation can be valuable, but the whole field is complicated by the fact that one is facing adaptable human adversaries while at the same time facing the challenge of convincing users on the defense side to believe in an automated or AI system.
6 D. Huang, P. Dhariwal, D. Song, and I. Sutskever, 2018, “GamePad: A Learning Environment for Theorem Proving,” to appear as a conference paper at the Seventh International Conference on Learning Representations (ICLR 2019), https://arxiv.org/pdf/1806.00608.pdf.
7 For more on OpenAI’s GPT-2, see A. Radford, J. Wu, D. Amodei, D. Amodei, J. Clark, M. Brundage, and I. Sutskever, 2019, “Better Language Models and Their Implications,” OpenAI.com, February 14, https://openai.com/blog/better-language-models/.
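The transition models, reward models, and policies mentioned here are the standard ingredients of a Markov decision process. As a purely illustrative sketch (not from the workshop), the toy MDP below frames intrusion response as sequential decision-making: the states, actions, probabilities, rewards, and costs are all invented for the example.

```python
# Illustrative toy only: a hand-built Markov decision process for intrusion
# response. All states, actions, probabilities, and rewards are invented.
STATES = ["secure", "probed", "compromised"]
ACTIONS = ["monitor", "patch"]

# Transition model: P[state][action] -> list of (next_state, probability)
P = {
    "secure": {
        "monitor": [("secure", 0.8), ("probed", 0.2)],
        "patch": [("secure", 0.95), ("probed", 0.05)],
    },
    "probed": {
        "monitor": [("probed", 0.5), ("compromised", 0.5)],
        "patch": [("secure", 0.7), ("probed", 0.3)],
    },
    "compromised": {
        "monitor": [("compromised", 1.0)],
        "patch": [("probed", 0.6), ("compromised", 0.4)],
    },
}

# Reward model: per-step reward for each state, plus an action cost
R = {"secure": 0.0, "probed": -1.0, "compromised": -10.0}
COST = {"monitor": 0.0, "patch": -0.5}  # patching has an operational cost

def q_value(state, action, V, gamma):
    """Expected discounted return of taking `action` in `state`."""
    return R[state] + COST[action] + gamma * sum(
        p * V[s2] for s2, p in P[state][action]
    )

def value_iteration(gamma=0.9, iters=500):
    """Compute near-optimal state values, then read off a greedy policy."""
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {s: max(q_value(s, a, V, gamma) for a in ACTIONS) for s in STATES}
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, V, gamma)) for s in STATES}
    return V, policy

values, policy = value_iteration()
print(policy)  # {'secure': 'monitor', 'probed': 'patch', 'compromised': 'patch'}
```

Even in this tiny example the learned policy is not trivial: because patching carries a cost, the cheaper "monitor" action is optimal only while the system is secure, and "patch" becomes optimal once signs of intrusion appear.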
Lura Danley, MITRE Corporation, also highlighted the human aspects of cybersecurity, noting that many of the words used in research and application—such as privacy, security, and trust—are being used in a mathematical and statistical context and might not have the same meaning for the human user. She pointed to the issue of model assurance, and to the question of how to explain convincingly to users what a system actually does—which, as noted, can affect overall adoption. She suggested that cybersecurity is not only about the technology, but also involves user effectiveness, user efficiency, and user satisfaction—and that these things can be studied. She urged technologists to understand the trade-offs among those factors and what kinds of compromises users will tolerate. If there is no translation of meaning between users and technologists, the user may reject the technology. She urged the group not to lose sight of where the human fits into the picture at all stages of development, emphasizing the need to bridge the gaps between technologists, technologies, and the end user.
Kambhampati concluded the discussion by noting that just because people don't understand game theory, or find it counterintuitive, does not mean it can't be used in developing cybersecurity. People also don't understand probabilities, he noted, which doesn't prevent probabilities from being used to solve problems. He suggested that it may actually be necessary to go beyond approaches based on plausible reasoning, given that they have not yet solved all cybersecurity problems; game theory, he argued, could be a cornerstone of such a new strategic approach.
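To make the game theory discussion above concrete, the following sketch (purely illustrative, not from the workshop) models defense as a two-target zero-sum game and finds the equilibrium mixed strategies by fictitious play, where each side repeatedly best-responds to the other's empirical action frequencies. The asset values and payoff matrix are invented for the example.

```python
# Illustrative toy only: a two-target zero-sum security game solved by
# fictitious play. Asset values and the payoff matrix are invented.
# L[d][a] = defender's loss when the defender guards target d and the
# attacker strikes target a (asset A is worth 4, asset B is worth 1).
L = [
    [0.0, 1.0],  # defend A: attack on A is blocked, attack on B costs 1
    [4.0, 0.0],  # defend B: attack on A costs 4, attack on B is blocked
]

def fictitious_play(rounds=200_000):
    """Each side best-responds to the other's empirical action frequencies."""
    def_counts = [1, 1]  # how often the defender has guarded A / B
    atk_counts = [1, 1]  # how often the attacker has struck A / B
    for _ in range(rounds):
        # Defender minimizes expected loss against the attacker's history
        d = min(range(2), key=lambda i: sum(L[i][j] * atk_counts[j] for j in range(2)))
        # Attacker maximizes the defender's loss against the defender's history
        a = max(range(2), key=lambda j: sum(L[i][j] * def_counts[i] for i in range(2)))
        def_counts[d] += 1
        atk_counts[a] += 1
    td, ta = sum(def_counts), sum(atk_counts)
    return [c / td for c in def_counts], [c / ta for c in atk_counts]

def_mix, atk_mix = fictitious_play()
# The empirical mixes approach the equilibrium: the defender guards the
# high-value target A about 80% of the time, and the attacker responds by
# striking the lower-value target B about 80% of the time.
```

The point of the example is the one Kambhampati makes: the equilibrium strategies are randomized and counterintuitive (the attacker mostly hits the asset the defender cares about least), yet they are well defined and computable even when the players cannot explain them.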
The final session of the workshop was reserved for speakers to weigh in on one or more key ideas that each would take away from the convening. The final comments from planning committee members and panelists are summarized below, organized by topic.
Topics for Further Discussion
Chang identified issues he saw as important but underemphasized in the workshop discussions. These included how AI and ML can help the cyber analyst; how these tools can advance forecasting and situational awareness; and issues around the fusion or integration of data. He also thought further discussion of policy implications and oversight would have been useful, such as the level of autonomy we expect of AI systems, whether some sort of “kill switch” is necessary, requirements for having humans in the loop, and how to monitor and correct a system when it drifts beyond established bounds. Finally, he noted that AI, ML, and cybersecurity have important international dimensions, as highlighted in a recent statement by the Department of Defense (DoD)9 that came out the same day as the executive order on AI, suggesting that deeper discussion in this space will be important.
Working with a Full Toolbox
Kambhampati reiterated that he was surprised by the narrow view commonly taken of the connection between cybersecurity and AI and encouraged a broader perspective. He added that data-driven approaches should be combined with knowledge-based approaches. Some rules are extremely hard to learn from data—for example, a rule against removing an email sent in response to a user’s message—and simple semantic rules work better. He encouraged using AI in full, rather than just clustering and classification; these may be the low-hanging fruit, but more interesting possibilities exist.
9 See U.S. Department of Defense, 2019, “DOD Unveils Its Artificial Intelligence Strategy,” February 12, https://dod.defense.gov/News/Article/Article/1755942/dod-unveils-its-artificial-intelligence-strategy/.
Moore encouraged drawing from different approaches to make fundamental improvements to cybersecurity and said those approaches should include AI but also other tools, from secure software engineering practices to formal verification methods. Many tools are available, he said, and we would benefit from seeing more clearly how they relate. Today, he suggested, experts often get siloed in their own toolsets, and more insight is needed into how they can work together.
Overcoming Silos and Making Connections
Lee said a key takeaway is that experts in cybersecurity and ML need to work together more. Manferdelli stated that it is difficult to have coequal partnerships between AI and cybersecurity experts, in part because the individual fields have separate expectations and standards for publication, and the important work at the intersection might be underappreciated by both fields. While there are many opportunities to work together, he said, domain experts need to come to consensus on what the problem is and how to tackle it. Motivating researchers to do so is a concern, especially since government agencies want to address very specific questions, cutting out other approaches.
Rao stressed the importance of understanding the role of humans in the AI/ML loop and encouraged investment in science that will lead to a better understanding of that role, especially in the context of deep learning. He urged expanding the AI versus ML debate to consider more broadly how to engage with other communities, such as psychology and the behavioral sciences.
Papernot proposed spending more energy—technical, legal, and policy—on facilitating cooperation among different entities so that data can be shared more easily. Hospitals and financial institutions, for example, have many barriers to sharing data, and that makes it difficult to have the global perspective needed to apply ML and AI techniques to pressing problems, he said.
Kantchelian suggested that no real progress can be made without real data. For security, that means massive, heterogeneous data across the computing stack, from low-level network flows and CPU activity to high-level application-specific activity, capturing all observable human behavior of interest. Domain-specific knowledge is also essential. While such data may not be easily available outside of corporations, Kantchelian suggested that greater collaboration is needed to support more research on rich and complex data sets.
Open Questions and Research Opportunities
Fisher said she wants to know whether AI and ML techniques will be able to solve the hard cyber problems, such as vulnerability detection, patch detection, and exploit generation at computer speeds in the wild. If AI and ML can achieve this, it will be a total game changer, she said. Fisher added that, while she had offered some reasons why the answer might be no and O’Reilly identified some evidence that the answer could be yes, we do not yet know the answer to this question—but getting to the bottom of it will be extremely important.
Venables reiterated the importance of establishing a taxonomy, defining not only terms but also the entire space of cybersecurity and its intersection with AI and ML. This would help both the cybersecurity and AI communities recognize the narrow elements that researchers are focused on and could help to shed more light on what areas are not being addressed. Lee pointed out that the bigger problems will require longer-term research, and that work addressing Grand Challenge-level problems in cybersecurity should be encouraged and funded by agencies.
Manferdelli suggested that more experimentation is needed and that while cyber offense specialists seem willing to adopt AI—in part because the benefits can be reaped quickly—cyber defense specialists are much more cautious.
Kantchelian noted that there is no single, permanent solution for cybersecurity. Reality is complex and fast-changing, he said, suggesting a game theory approach is too high level. Krasser commented that Brumley’s talk presented the DARPA CGC as a great microcosm of the cybersecurity landscape; the design decisions of Brumley’s
team were geared toward the objectives incentivized by the rules of the game. One important takeaway for him was that cyber defense can’t be accomplished just by working on what is interesting intellectually; instead, work must focus on what is robust and gets the job done. In addition, he underscored that technology transfer from academia will require that researchers adhere to real-world constraints—for example, around resource limitations, robustness, and complexity. He suggested that the simplest solutions will have the most real-world impact in the near term. Rao suggested that it will be important to create incentives for building and deploying products, rather than just carrying out publication-focused research.
Nguyen said he arrived at the workshop expecting two outcomes. First, the proceedings of the workshop will help provide situational awareness about AI and cybersecurity to the White House and inform funding priorities at agencies such as DARPA and NSF. Second, they will also be of value to Congress, including the Senate and House Intelligence Committees, helping members understand the challenges and offer support on national security and in work with partners and allies.
He noted two additional, unexpected outcomes of the workshop. One is that these discussions will help in scoping the AI work of DoD and the Intelligence Community (IC) so that the approach, applicability, governance process, risks, and complexity are all suited to the national security mission—particularly important with the launch of DoD’s Joint Artificial Intelligence Center and the IC’s Augmenting Intelligence using Machines (AIM) strategy, which represent a significant investment on their own, independent of the Intelligence Advanced Research Projects Activity, DARPA, NSF, and other R&D funding. Another outcome, he said, is a better understanding of the gaps that need to be bridged in order to deploy ML and AI effectively for cybersecurity moving forward.