Currently Deployed Artificial Intelligence and Machine Learning Tools for Cyber Defense Operations
John Manferdelli, professor and executive director of the Cybersecurity and Privacy Institute at Northeastern University, moderated a panel addressing current ways in which artificial intelligence (AI) and machine learning (ML) are being applied for cybersecurity defense operations.
Manferdelli began with the caveat that the boundary between offense and defense can be somewhat artificial: An understanding of offense is necessary for a good defense, and exploring offensive operations can be very helpful for informing defensive ones. He then offered an overview of the challenges involved in cyber defense and posed a variety of possible ways in which AI or ML tools may help to address those challenges.
Asymmetry is a central problem in cybersecurity, with the defender typically at a significant disadvantage. For example, attackers can choose the time of attack, often have solid knowledge of the target system, have months or years to prepare (especially with government systems, which persist for a long time), and can deploy a new attack fairly rapidly. Defenders must provide a constant, high level of defense against all attacks by often unknown adversaries whose attack methods and timing are also unknown. Defenders also often contend with a long lag time between developing a defensive technology and actually deploying it, and typically have few metrics by which to evaluate their performance. On the other hand, defenders do have a few advantages. For example, changing the configuration of the system out from under an attacker can undermine an attack in which the attacker has invested substantial time and effort.
Manferdelli asked how AI and ML might help to address asymmetries in operations, transparency, and data access to help improve the defender’s position. For example, might AI tools be a useful complement to cybersecurity personnel, by taking over dull, repetitive, or high-throughput tasks, or helping to improve situational awareness? Might they help reduce the operational burden or resources required (human and financial) in order to maintain constant, effective defenses? Could they help defenders evaluate the strength of their defenses, forecast attacks, assess the impacts of an attack, or use knowledge about what went wrong to defend better against the next attack?
To explore these issues more deeply with concrete examples, Manferdelli introduced the three panelists: Alex Kantchelian, software engineer, Google, Inc.; Dave Baggett, founder and CEO, INKY Technology Corporation; and Sven Krasser, chief scientist at CrowdStrike, Inc. The panel was followed by an open discussion.
Alex Kantchelian, Google, Inc.
Kantchelian spoke about a method for anomaly detection that his team has developed and deployed in Google’s enterprise system.1 While the approach is more general than any single application domain, for the purpose of his presentation, he chose to focus on its application to detecting improper document access. A company’s internal documents often contain intellectual property and are a prime target for corporate espionage, making this application domain an important area of cyber defense.
Shortcomings of the Status Quo
The most common method for protecting documents is through the use of access control lists (ACLs), which identify the specific documents any given user is authorized to access. However, ACLs are hard to manage at scale, and when an organization has a large number of employees and documents it can be easy to inadvertently overshare. As individuals move between projects or teams, they may retain access to documents they no longer need or should no longer have access to, especially if an ACL’s expiration date is too lax and it is not updated to reflect all personnel changes. However, even if ACLs are properly managed, they do not protect against hijacking attacks. For instance, by implanting malware on an employee’s machine, an attacker may inherit and abuse the victim’s otherwise legitimate authorizations. ACLs also do not protect against malicious insiders—individuals authorized to access files for a work purpose but who use them in unauthorized ways.
To address these shortcomings, Kantchelian’s team developed and applied a more dynamic approach to catching unauthorized file access events, making use of ML-driven anomaly detection.
Machine Learning- and Artificial Intelligence-Based Solutions
Before describing the system his team developed, Kantchelian provided some background about potential ML-based solutions, beginning with supervised learning. With this method, a model, such as a neural network or another function approximator, is provided with a set of training data that captures a collection of file-access instances, with each labeled as either benign or malicious. The algorithm takes in this data and uses it to create a formula for determining whether any particular item is benign or malicious. Such approaches can work very well to classify new instances if the training data are abundant, of very high quality, or both. Essentially, supervised learning is good at detecting patterns that are well-understood in the training data—this is almost like a rule-based detection system, except that the rules are optimized by an ML framework rather than written explicitly from a set of clear principles.
Kantchelian argued that this approach is not particularly useful for corporate security for several reasons. First, he noted that analysts have already created sophisticated rule-based detection systems that enable them to detect an array of adversarial behaviors. More fundamentally, the quantity of known, successful attacks against corporate networks is small, so the limited data available for training a supervised algorithm will not be sufficient to enable it to recognize attacks. Furthermore, in general, supervised learning methods may be less likely to succeed in identifying a new attack with characteristics unlike those already observed and present in the training data—a so-called novel attack that belies expectation or breaks known rules.
An alternative approach is anomaly detection. Kantchelian suggested that this has historically had a bad reputation in the security community because it raises a large volume of alerts that must be processed by analysts looking for novel attacks and often turn out to be irrelevant, trivial, or difficult to investigate. For example, even if false-positive rates are extremely low, so many documents are typically accessed in a large organization that this will still translate to a large number of false-positive alerts each day. Bombarded by noise, analysts lose confidence in the system and miss opportunities to learn because much of the system’s behavior goes unobserved. While anomaly detection can be successful at detecting denial-of-service attacks, again, other methods already exist for this, and anomaly detection has had limited utility for most other attack scenarios, Kantchelian said.
1 That is, Google’s corporate security, not its consumer-facing products.
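The base-rate problem Kantchelian described can be made concrete with back-of-the-envelope arithmetic; the volumes below are illustrative assumptions, not figures from the talk.

```python
# Illustrative base-rate arithmetic: even a tiny false-positive rate
# produces a heavy alert load when the daily event volume is large.
daily_accesses = 10_000_000   # assumed document accesses per day (hypothetical)
fp_rate = 0.0001              # assumed 0.01 percent false-positive rate

false_alerts_per_day = daily_accesses * fp_rate
print(false_alerts_per_day)   # 1,000 spurious alerts for analysts to triage
```

Even at one alert per ten thousand accesses, analysts would face on the order of a thousand spurious alerts every day under these assumed volumes.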
A New Approach
In light of these challenges, Kantchelian’s group set out to develop a new, global ML model capable of reviewing events in much the same way as human analysts would. He described the model as applied to the example of document-access events. In this case, the goal is to train the model to determine whether any particular instance of a specific user accessing a particular document is acceptable or malicious. In this context, the model takes a two-part input: (1) information about the user and (2) information about the document accessed. When trained on information across the company, the global model can recognize whether or not a specific instance is expected or unusual given the profile of the user and document in question.
Most security analysts in a large organization do not know each user personally. Instead, they rely on auxiliary information, such as job title, department, previous historical behavior, and the systems and data the user is authorized to access. The team used this same auxiliary information as input for their model. Similarly, given that a large corporation may have billions of files with a variety of content types, Kantchelian’s team chose a metadata-based description of files, where a document is represented by the set of all employees who previously accessed it. Doing so is much simpler than dealing with the document’s content directly, and, again, is similar to the approach used by a security analyst.
To train a model to find anomalous document accesses, the team reduced the learning problem to supervised binary classification. The negative, or “normal,” pairs of employee-document descriptors are defined to be all the accesses that historically happened. The positive, or “anomalous,” pairs are synthetically generated by randomly sampling from all employee-document pairs which were not observed. The general idea is that the model should be able to learn “compatibility rules” between employees and documents. For example, it would be unusual for a receptionist to access the hardware designs for an upcoming product, and this instance is precisely the type that would appear as a synthetically generated positive. This overcomes the previously discussed limitations of supervised learning in that the positive, or “abnormal,” class is not biased toward any pre-existing attack patterns, but is rather a large set that is representative of a diverse and a priori unknown set of attacks.
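The training-set construction described above can be sketched in a few lines; the employees, documents, and helper function below are hypothetical illustrations, not Google's implementation, and a real system would substitute feature descriptors (job title, prior accessors, and so on) for the raw names.

```python
import random

# Sketch of the training-set construction: observed (employee, document)
# accesses form the "normal" class (label 0), and unobserved pairs are
# randomly sampled as synthetic "anomalous" examples (label 1).
employees = ["alice", "bob", "carol"]
documents = ["roadmap", "hw_design", "front_desk_faq"]

# Hypothetical historical access log.
observed = {("alice", "roadmap"), ("bob", "hw_design"),
            ("carol", "front_desk_faq")}

def build_training_set(observed, employees, documents, n_synthetic, seed=0):
    rng = random.Random(seed)
    # All employee-document pairs never seen in the access log.
    unobserved = [(e, d) for e in employees for d in documents
                  if (e, d) not in observed]
    synthetic = rng.sample(unobserved, n_synthetic)
    # Label 0 = normal (historically seen); label 1 = anomalous (synthetic).
    return [(pair, 0) for pair in observed] + [(pair, 1) for pair in synthetic]

training = build_training_set(observed, employees, documents, n_synthetic=3)
```

The resulting labeled pairs can then be fed to any standard supervised binary classifier; the "compatibility rules" emerge because implausible pairs (the receptionist and the hardware designs) appear almost exclusively in the synthetic positive class.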
In practice, this implementation has had success in identifying true positives—not only instances that were detected by an alternate, knowledge-based detection system, but also instances that were previously undetectable. For example, using this system, Google was able to detect both early reconnaissance activities of a red-team exercise and instances of malicious insiders—employees seeking out corporate documents for a competing business. At the same time, the model has also yielded some false positives—for example, instances where an employee was transitioning between positions or accessing out-of-date documents. However, by and large, the system has brought improvements, Kantchelian said. In the 2 years since its implementation at Google, it has been able to identify actual novel attacks with very low false-positive rates. He concluded by commenting that this anomaly detection approach succeeded using pre-existing supervised learning methods (but with a new approach to the training data) and that his team is currently working to apply a similar approach to lower-level problems that are more subtle than document access.
Dave Baggett, INKY Technology Corporation
Baggett’s company, INKY, develops email protection solutions with a particular focus on phishing. Baggett discussed the role of AI in his company’s approaches to identifying novel phishing attacks.
The Phishing Landscape
Baggett identified phishing as one of the most significant cybersecurity problems today—both in terms of loss of money and as the initiating vector in most publicized cyberattacks. Phishing attacks are based on the
impersonation of either a person or a “brand,” typically for the purpose of stealing personal information or credentials and/or for some form of monetary theft. Baggett provided several examples.
One example of brand forgery would be an email that falsely appears to be from Microsoft and prompts the recipient to click on a malicious website disguised as an Office 365 login page. If a user attempts to log in, those credentials are now known by the scam’s orchestrator. A similar brand forgery attack spoofing DocuSign could potentially give the attacker access to any confidential documents signed by the recipient.
Spearphishing attacks, which target a particular user, are on the rise. For example, a spearphishing message may appear to come from someone within the same institution as the target. In one scam using spearphishing, the message states that the organization is doing a marketing giveaway and asks the recipient to assist by purchasing a gift card and providing the card’s identifying numbers for later use in the promotion. The scammer then sells these numbers on the dark Web, and the victim has been robbed of the value of the card.
Traditionally, the main method used to identify phishing emails has been to compare the Web links, or uniform resource locators (URLs), contained in an email against a compiled list of URLs previously reported as malicious. However, not all phishing emails include URLs, and attackers who do use them have adapted to this detection scheme by simply randomizing the URLs so they will not be a match for any on a known threat list.
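The evasion described here is easy to see in a sketch; the domains below are made up for illustration.

```python
# Exact-match URL blocklists fail against randomized URLs: each phishing
# message can carry a link that has never been reported. Domains are fake.
known_bad = {"http://evil-example.test/login"}

def blocklisted(url):
    # Naive exact-match check against the threat list.
    return url in known_bad

print(blocklisted("http://evil-example.test/login"))         # True
# A randomized variant from the same campaign sails through:
print(blocklisted("http://x7k2q9.evil-example.test/login"))  # False
```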
Companies have also aimed to deal with these threats through phishing awareness training, sending human users examples of phishing emails to help teach them to be more cautious. However, Baggett noted that it is increasingly difficult for users to tell the difference between genuine and fraudulent emails, as attackers have become better and better at creating realistic fakes. For example, attackers can save a valid email from a trusted brand as an HTML file, change a single URL, and then resend it fraudulently to trick a recipient into clicking the malicious link. They can even register and use a domain name that looks very similar to that of the real brand—a discrepancy that is very hard for humans to recognize.
Tackling Brand Forgery
Baggett described ways in which his company is using AI to identify phishing emails based on first principles, including what he called “zero-day phishing”—types of attacks that have not been previously detected and reported.
There are a multitude of features that go into branding, including images, layout, colors, and text. When a human looks at an email, they can easily recognize a brand based on the combined impression made by all of these features. However, it is not easy for a human to determine whether a message is authentic, especially given the sophistication of today’s brand forgeries.
If a computer program were able to recognize the brands that a human would, it could easily identify forgeries, for example, by comparing the source or signature of the email to the official domains or public keys of the recognized brand. However, it is hard for traditional computational methods to recognize the brand of an email as a human would. INKY set out to address this problem via several methods.
Baggett noted that significant academic work has been done on using deep learning (DL) to recognize photographic images. He described INKY’s efforts to apply DL to other visual features in emails using the HTML underlying the message. However, he also pointed out that these methods could potentially be tricked by adversarial examples, as alluded to by Rao Kambhampati in his opening remarks.
Baggett said that his team has spent a significant amount of time on what he referred to as a “good, old-fashioned search” to develop approximate string-matching search techniques. These techniques are used to identify words or phrases critical to a company’s branding that have been manipulated to evade conventional searches, in particular, by substituting certain letters with unicode or other symbols that will fool a human but be missed by a naive string-matching search. For example, a phishing email might replace a particular Latin letter with a similar-looking Cyrillic one, or combine letters in ways that fool the human eye (e.g., at first glance “r” and “n” together as “rn” look like the letter “m”) but elude computer detection unless such matching techniques are employed.
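The confusable-character idea can be illustrated with a minimal sketch; the mapping and helper functions below are hypothetical and cover only a tiny subset of Unicode confusables, not INKY's actual tables, data structures, or matching algorithm.

```python
# Normalize look-alike characters to canonical Latin letters before
# string matching. This tiny table is illustrative only; real confusables
# data (e.g., Unicode TR39) covers thousands of mappings.
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic 'а' -> Latin 'a'
    "\u0435": "e",  # Cyrillic 'е' -> Latin 'e'
    "\u043e": "o",  # Cyrillic 'о' -> Latin 'o'
    "rn": "m",      # letter pair that reads as 'm' at a glance
}

def normalize(text):
    text = text.lower()
    for lookalike, canonical in CONFUSABLES.items():
        text = text.replace(lookalike, canonical)
    return text

def matches_brand(text, brand="microsoft"):
    # Approximate match: compare after confusable normalization.
    return normalize(brand) in normalize(text)

# A Cyrillic 'о' defeats naive substring search but not normalization:
print("microsoft" in "Micr\u043esoft".lower())            # False
print(matches_brand("Micr\u043esoft account alert"))      # True
print(matches_brand("rnicrosoft login required"))         # True ('rn' -> 'm')
```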
Significant efforts at INKY have gone into developing data structures to support approximate string matches against a huge set of strings in real time, taking into account all of the strange character transformations that might have been done. The system is currently able to identify major brands as represented in emails with high accuracy and then easily determine in less than a millisecond whether or not the email actually originated with that brand—much better than a human ever could.
Ultimately, Baggett argued, the filtering of phishing emails is something that will be fully automated. While acknowledging that the systems under development are not absolutely perfect, he expressed optimism that email phishing could soon cease to be a successful attack vector, and that this would play out in the market over the next 3 to 5 years. Attackers might then turn to phishing via SMS and voice messages, which would need to be addressed by other methods.
Human-Machine Partnership in Anomaly Detection
Baggett noted that INKY also makes use of ML in anomaly-detection algorithms, in particular, to protect against spearphishing based on the impersonation of an individual. The ML models flag messages that look unusual given the historical profile of a user’s inbox. He reiterated Kantchelian’s point that anomaly-detection algorithms are often susceptible to high rates of false positives—not all unusual messages are actually malicious. One ongoing project at INKY focuses on avoiding false positives in gray-area cases by empowering the user to play a role in deciding whether the mail is fraudulent. To do so, information about the salient features that led to the classification of an email as a potential anomaly is extracted from the model and communicated via a brightly colored banner at the top of the message. The user can then interpret the details and make an informed judgment call.
Baggett provided an illustrative example. If someone traveling abroad sends an email to a user, the user would see an alert banner noting that the email is suspicious because it unexpectedly originated in a foreign country. In this case, if the recipient knew the sender to be traveling outside of the country, the alert could easily be ignored.
He noted that the team has also worked on using natural language processing (NLP) to extract the semantic intent of an email. This can be used to flag commonly malicious content, such as wire transfer requests or prompts to change one’s password. He added that these methods are language dependent.
Sven Krasser, CrowdStrike, Inc.
Krasser, who has been applying ML in the information security industry for nearly 15 years, discussed ML-based methods used by CrowdStrike in its security products, with an emphasis on their use for malware detection. The company’s cloud-based Falcon platform works to automate elements of cybersecurity protection operations in three areas: endpoint security, security operations, and threat intelligence. He described the platform as analogous to a “flight recorder” for events on systems.
While emphasizing that ML has been subject to media hype about applications that aren’t yet possible, he suggested that there do exist many useful techniques that are not widely known. He compared ML expectations and applications to those in the field of robotics: While some people assumed that life would be similar to that portrayed on The Jetsons, with a robot in every home, this is not the current reality. Nonetheless, robotics has proven invaluable in certain contexts, such as in automation of car manufacturing. Similarly, ML can provide real value for certain cybersecurity applications—by automating certain tasks, operating at scale, and in dealing with complex data beyond human cognition. He went on to describe CrowdStrike’s approach.
A Machine Learning-Enabled Security Platform
In general, CrowdStrike’s use of ML is enabled by the huge amount of information collected by its endpoint sensors. Roughly 250 billion events are ingested and processed each day by the CrowdStrike cloud, and the data is analyzed in several ways. Some of the data is saved for long-term batch processing, and some is analyzed in real-time, which enables faster reaction times but is less complete and accurate. All of the collected data is put into a large graph database. All of these collections are analyzed using ML models that provide different insights and advantages that help to prevent, detect, and hunt out attacks. These models can be deployed at various scopes—for example, on graph data, bulk data, or on endpoints. The growing collection of sample files (on the order of 2 petabytes) is indexed for full-content search to detect file-based threats. Based on event data, the files with the
most real-world impact (and files related to those) can be identified and used to train ML models to detect previously unknown malware effectively. This approach is a component of CrowdStrike’s next-generation antivirus (NGAV) software, which demonstrates a consistently high detection rate of malware, including malware that had not yet been seen at the time the model was developed. Krasser showed example data based on the daily performance of a single model on new, unseen malware files. The model sustained detection rates from the high nineties to close to 100 percent over several months without visible decay in efficacy. Krasser contrasted this with standard antivirus (AV) software based on signatures, which succeeds in detecting known threats but requires periodic updates as new malware threats emerge, resulting in a fluctuating performance profile: performance spikes with each daily signature update but falls thereafter, until the next update.
The use case of the NGAV model is helpful for illustrating how the system works. Krasser described how telemetry data—data captured from remote sources—is used as a baseline to establish file provenance and ground truth. In particular, a model can be trained in the cloud using all of the centrally collected data. For static malware detection, models can make use of a range of features such as file size and entropy, digital signature presence, function names, icons, and more. As an example, he shared a visualization of the entropy distribution in an instance of ransomware, noting an underlying pattern—apparent even to the human eye—that can be learned and recognized easily by ML algorithms. CrowdStrike embeds many classifications into a lower-dimensional feature space for use as an input to classification algorithms. The models can be deployed for static analysis of new files either in the cloud or on endpoints, adapted to fit the computational resources available where they are run.
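The entropy feature mentioned above can be illustrated generically; the function below is a textbook Shannon byte-entropy sketch, not CrowdStrike's feature extraction.

```python
import math
from collections import Counter

# Shannon entropy of a byte sequence, in bits per byte (range 0 to 8).
# Encrypted or packed regions, such as ransomware output, sit near 8,
# while repetitive or plain-text regions sit much lower, so entropy is
# a cheap, discriminative static feature for malware classifiers.
def byte_entropy(data: bytes) -> float:
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(byte_entropy(b"AAAAAAAAAAAAAAAA"))  # 0.0: a single repeated byte
print(byte_entropy(bytes(range(256))))    # 8.0: uniform byte distribution
```

Sliding this function over fixed-size windows of a file yields the kind of entropy distribution Krasser visualized, in which packed or encrypted payload regions stand out.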
Krasser emphasized that CrowdStrike’s focus is not on simply training the best model for a curated data set, but rather on creating processes to generate a stream of high-quality models from all of the data being collected—and doing this in an automated fashion. He noted that while most of the data, by several orders of magnitude, is unlabeled, all of it can nonetheless be leveraged for cloud-based representation learning—automatic identification of data representations that, when embedded (expressed as low-dimensional, continuous-variable vectors rather than discrete classifications), can be used as features in the model.
Minimizing False Positives
One key concern is minimizing the system’s rates of false positives and false negatives. To avoid silent failure, CrowdStrike ensures that there are humans in the loop to look at what is being detected in the field. For file-based threats, there are two approaches: the “Similarity API,” which identifies samples that could be related according to a model-based similarity metric, and the “MalQuery User Interface,” which enables one to perform a full-content search on all of the sample files stored in the cloud. Both approaches enable an analyst to find additional specimens that can augment the corpus of labeled data, helping to improve ground truth.
Nonetheless, even a detection system that is 99 percent successful—for example, at malware detection—still gives false negatives 1 percent of the time. An adversary that repeatedly attempts to bypass such a system has a 99.3 percent chance of success after 500 attempts.2 Krasser noted that this highlights the importance of efforts to raise the cost of an attack for adversaries. Furthermore, false positives are another challenge due to the large number of non-malicious instances a security product needs to handle. Curtailing false positives is critical for practical deployment of security software, according to Krasser.
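The 99.3 percent figure follows directly from the per-attempt probabilities, assuming each attempt is independent:

```python
# Repeated-attempt arithmetic: with 99 percent per-attempt detection,
# each try slips through with probability 0.01. Over 500 independent
# attempts, the chance that at least one succeeds is high.
per_attempt_detection = 0.99
attempts = 500

p_all_blocked = per_attempt_detection ** attempts       # ~0.0066
p_at_least_one_success = 1 - p_all_blocked

print(round(p_at_least_one_success, 3))  # 0.993, the figure cited in the text
```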
Krasser pointed out that the data with which they train their models tend to be unwieldy—for example, containing many outliers or displaying multimodal distributions—and some features of the data can be controlled by adversaries. Such characteristics affect the efficacy of different modeling techniques.
In addition to static analysis of artifacts (e.g., malware files), the team also carries out behavioral analysis (e.g., file use, system processes, or user actions). Behavioral data can include information on every process and its subsequent child process. Krasser commented that ML-based analysis of document contents is typically straightforward because the training data has a relatively even distribution of malware. However, behavioral analysis is more complicated, because most system behaviors are benign, yielding few malicious behavioral artifacts for model training. This makes it more challenging to build models to recognize malicious behavior. As one means to address this hurdle, Krasser’s group simulates an environment within a sandbox that contains a higher proportion of malicious behavior than is seen in the real world. However, this strategy is only an approximation. Because the sandbox is created by developers, it lacks key information that might be relevant in a real-world scenario, such as the file’s provenance.
2 Assuming each attempt is statistically independent of the others.
Beyond the use of ML for static artifact analysis and behavioral modeling, Krasser briefly mentioned reviewing call stacks as another useful ML-based strategy for detecting malware. When a process launches, examining the call stack provides insight into why or how that process initiated. Some malicious processes, such as exploits, are easier to detect than others, but sequence learning techniques can be used to detect stack outliers, he said.
Krasser stressed the importance of continuously updating the ground truth, especially for behavioral artifacts. Because so many factors are constantly changing (both in the attacks and in the detection methods), information generated now about current conditions will not necessarily be relevant in a year. In addition, policies and contractual obligations can put limits on the length of time for which data may be retained. This ephemeral nature of ground truth is another complicating factor for ML for behavioral analysis when compared to ML for malware file detection.
Krasser closed with several reflections on the value of ML for cybersecurity tools and operations. He suggested that the most successful companies will maintain perspective on the limitations that result from ML-based analyses. For example, their abstract and high-dimensional nature sometimes leaves no way to understand why or when certain things are detected or missed. In addition, deployments of ML for security operations open up a new set of attack surfaces in and of themselves through adversarial learning. Recognizing this, CrowdStrike maintains human oversight via a service called Falcon Overwatch, to help ensure that appropriate data are collected, analyzed, and integrated properly. Although ML in cybersecurity has solved some problems, Krasser emphasized in his closing remarks that many gaps remain where progress would be needed before good solutions can be found.
A moderated discussion among panelists and workshop participants followed the presentations. Participants explored low-hanging fruit in the realm of AI for cyber defense, as well as risks and limitations of its use.
Manferdelli kicked off the discussion by asking each panelist to point to areas of cyber defense where they can envision AI having an impact in the near term. For example, automating audit log analysis, increasing situational awareness, and automating decisions about setting configurations may be areas ripe for AI-based innovations, Manferdelli posited.
Baggett offered the possibility of developing an ML-based risk metric or scorecard that can shed light on how vulnerable individuals and organizations are, perhaps including situational information derived from active defense—techniques that make it more difficult for attackers to succeed by raising the costs of attacks or lowering the costs of defense. A second opportunity Baggett pointed to was automating the process of identifying the nature, source, and intent of attacks—time-consuming work that is currently done by humans.
Noting the rapidly increasing quantity of data and its growing level of abstraction and dimensionality, Krasser stressed that automation of data management will become increasingly important. Because humans cannot possibly keep up with the flow of data, machine-based methods will be needed, he suggested. He also noted that more work is needed on certain areas, such as in developing better techniques for representation learning.
Kantchelian posited that, unlike AI more generally, ML is mature enough for the needs of cybersecurity. Rather than striving to improve the processing of images or natural language—things that humans interact with—he suggested that a more useful focus is on fundamental methods for learning about objects or algorithms that are intrinsic to machines, and that humans are not inherently capable of understanding. This, he suggested, would help build a basis for a deeper collaboration between ML and security analysts.
Workshop chair Fred Chang suggested that the United States is a key leader in cybersecurity startups and wondered whether the concentration of U.S.-based efforts risks becoming a monoculture that does not learn from the work of its competitors, limiting innovation in the field. Baggett commented that Israel is also seen as a global leader in this space and expressed his view that the disparity between big, established technology companies and the startup community is more significant than that between countries—whose researchers are probably all reading the same technical literature. Because success in deploying ML requires data, companies with access to enormous data troves will have a built-in edge on developing ML and AI tools—a condition that favors the incumbents.
Additionally, Baggett commented that there are many cybersecurity startups touting incredible ML-based solutions that might not actually work and are simply marketing efforts. He said it’s difficult for companies to compete and rise above such a hype-saturated landscape, suggesting that it recalled 19th century snake-oil sales. Building on this, Krasser stressed that there is a risk of over-emphasizing AI and ML as necessary elements of a cybersecurity solution, suggesting that a large percentage of companies include ML in their product pitches but don’t actually use it. He suggested that it is good that companies are also looking to non-AI-based solutions, but bad that companies perceive a need for all tools to contain AI, as this reflects a distraction from the big-picture goals of successful solutions. He commented that non-AI-based approaches could prove valuable in conjunction with AI-based approaches.
For example, simply applying the best ML algorithms to existing data sets will not be as effective as applying them to the best data for solving the particular problem at hand. Instead of relying on whatever data is already available, researchers could leverage potential synergies with other researchers to improve the nature of data available.
Phil Venables, Goldman Sachs, noted that over-emphasis of ML is not unique to cybersecurity startups, but a common problem across all fields requiring venture capital. He then asked the panelists whether their ML-based tools needed to account for the presence of a particular company security product or operating system configuration in order for the ML to function properly. Krasser replied that interoperability between multiple security solutions is a common challenge, noting multiple instances in which unusual design or behavior of a valid program falsely registered as a detection event to his system. He expressed that one should not underestimate the number of benign behaviors that look malicious because someone’s software design is unintuitive or unusual. Without proper attention to this issue, interactions among systems can cause breakdowns or cause systems to flag each other’s behavior as malicious, a common occurrence in his experience.
A related question, posed by Una-May O’Reilly, Massachusetts Institute of Technology, is the degree to which deploying ML tools can actually affect the nature of the subsequent data itself, resulting in a moving-target problem caused by the interventions themselves. Baggett noted that this touches on a common problem in the email filtering space, in which adversaries respond to the filtering algorithms by changing their behavior to dodge them.
Venables added that similar feedback loops can be found in many applications, and they are not always caused by changes in what adversaries are doing but can also stem from the methodology itself and can lead to a gradual narrowing of the behaviors that are examined. Building on this point, O’Reilly noted that gradual narrowing could be analogous to a common phenomenon in economics in which markets become more homogeneous over time.
False Positives and Human Factors
Rao Kambhampati noted that the three speakers had addressed similar types of applications, focused on identification of anomalies or outliers—in particular, malware and phishing emails. He wondered whether such applications were in fact the low-hanging fruit, and if more complicated AI-based defense strategies are still too immature to be useful. He also commented on an experience he had where messages that were part of a genuine email thread were flagged and filtered into a spam folder, wondering how responses to one’s own messages could possibly be spam, and asked whether correspondence history is used in current models.
Baggett said that spam is almost a solved problem in principle, but real-world social pressures can cause managers to shift the calibration on what to treat as spam. For example, in some companies, users may complain if they receive any spam at all, prompting the security managers to recalibrate the system so that it errs on the side of false positives, overfiltering valid messages into spam folders—especially in instances where such complaints are included in the metrics for evaluating a security manager’s job performance. He noted the importance of including some way to deal with false positives in product designs—the spam folder being the common example in the context of emails. Krasser added that a model might not have included important factors that would indicate a message is valid, or might have accounted for them but still somehow yielded an incorrect output. He suggested that this particular type of false positive could be easily avoided via a simple rule: emails that are part of a thread originated by the user himself cannot be spam—without relying upon ML to achieve this.