Vinh Nguyen, National Security Agency, prefaced the panel on deep fakes by noting that the Senate Select Committee on Intelligence had sent a request to his team asking about the potential threats posed by deep fakes, potential mitigations, and what the Intelligence Community is doing about them. This request was prompted largely by concerns about the potential use of deep fakes in the runup to the 2020 Presidential election, underscoring the urgency and importance of the issue. In this limited timeframe, Nguyen and his team are striving to better understand deep fakes, identify possible mitigations or defenses against them, and inform a strategy to protect the integrity of U.S. elections. For further context, Nguyen explained that the government’s current position is to expose the creators of deep fakes, rather than publicly passing judgment on whether the content created is fake.
Subbarao Kambhampati, Arizona State University, offered additional context for the panel discussion. Not to be confused with the notion of “fake news,” deep fakes are wholly unreal digital creations—text, audio, image, or video—that are generated via machine learning (ML)-based sampling of existing data. He also contrasted deep fakes with adversarial examples, which are inputs designed to fool the perceptual system of a machine, whereas deep fakes are designed to fool humans. He pointed out that deep fakes can take the form of images or text that are generated by a machine, and suggested the potential for what he called deep head fakes, where, over time, machine-generated content could override one’s entire mental model of what reality is. He introduced several other important discussion topics related to deep fakes, including their prevalence and relevance, whether they may have implications for cybersecurity in addition to our mental security, how to recognize them, and their potential long-term implications for society.
Today’s deep fakes often display telltale signs resulting from technological limitations of the techniques used to create them, but he suggested that these will eventually be overcome. He considered several proposed solutions. While the government can legislate against deep fakes, such legislation would not stop bad actors from creating and using them. He noted that some people have argued that artificial intelligence (AI) systems should somehow declare that they and anything they generate are AI-based, but pointed out that this somewhat overoptimistically relies on global goodwill to do so. Another option is to require a digital signature on all media verifying that it came from a trusted source, which could in the future become a prerequisite for believing our own eyes.
Upon elucidating these issues and emphasizing the potential gravity of deep fakes, Kambhampati introduced the panel’s two speakers, Jay Stokes, principal research software design engineer in the Cloud and Infrastructure Security Group at Microsoft Research, and Delip Rao, vice president of research at AI Foundation, and then moderated an open discussion.
Jay Stokes, Microsoft Research
Stokes explained that he is part of a group at Microsoft trying to understand the challenges posed by deep fakes and what they might do to solve them; he shared some of the insights gained to date. In his presentation, he provided a brief history and overview of deep fakes, the challenges and implications they present, and some potential methods for detecting them.
The Emergence of Deep Fakes
Stokes introduced deep fakes by showing the audience a synthetic video1 of Barack Obama speaking, voiced over by a famous comedian, and suggested that convincing synthetic audio might not be far away. This technology captured public attention over the past several years when individuals began incorporating photos of celebrities’ faces into pornographic videos using an algorithm called Face2Face, which was made available via arXiv in 2017. One particular Reddit user called “deepfakes” uploaded so many of these videos that they were named after him.
Deep fakes have rapidly emerged as a cruel and destructive tool for character assassination and cyberbullying, Stokes said. Among other examples, he described an early instance where the Facebook photos of a private citizen of Australia were used to create lewd deep fakes that spread across the Internet along with her personally identifiable information; with no real legal recourse, the individual experienced severe online and in-person harassment, exacerbated when the content was removed at her request by some Internet services.
On the industrial and economic front, he pointed out that deep fake-enabled spearphishing is a significant concern. He suggested that fake audio could be used to gain access to a network, or to create fake recordings of a CEO or corporate management saying something that appears to impugn the corporation. Such activity has the potential to wreak havoc on brand reputations, stock prices, and business mergers. When created about political figures, world leaders, or militaries, Stokes said, deep fakes have the potential to affect national elections and could cause political unrest or potentially even war.
Part of what makes deep fakes so pernicious is that they can go viral globally almost instantaneously, there are very few laws to help victims, and removing fake images from multiple servers in multiple countries is difficult at best. Intent is an important facet in determining how best to address deep fakes. Some deep fakes are intended to harm people, but others could be perceived as art, satire, or other protected types of free speech. In addition to disrupting the lives of individuals, deep fakes could have sweeping societal implications. While today people tend to trust video and audio content, deep fakes may undermine public trust in these media. For example, the very existence of deep fakes could provide a plausible alibi, allowing someone to claim that an incriminating photograph was faked.
Technology is improving every year, leading to increasingly realistic deep fakes. Deep fake facial images in 2014 were fuzzy, small scale, and mostly black and white, while today’s deep fake images, just a few years later, are high resolution, high quality, and full color, making detection far more difficult. Deep fake audio is also being enabled and rapidly advanced by text-to-speech technologies such as those being pursued by Google,2 Microsoft Azure, Amazon Web Services, and Adobe. Soon, he suggested, it may be possible to upload a collection of sample audio and obtain a voice model of any individual person, a disturbing prospect. For defenders, detecting deep fakes will be a daunting task, he said.
2 Stokes referred to the paper on Google’s “WaveNet” system as a seminal publication in the field: A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A.W. Senior, and K. Kavukcuoglu, 2016, “WaveNet: A Generative Model for Raw Audio,” p. 135 in SSW9: 9th ISCA Workshop on Speech Synthesis Proceedings, http://ssw9.talp.cat/download/ssw9_proceedings.pdf.
Possible Detection Methods
Stokes outlined three possible ways to detect deep fakes: manual investigation, algorithmic detection, and content provenance. However, there has not been enough research on any of these methods, he noted, and scaling them will be difficult. Manual investigation uses humans to detect fakes. For example, several news outlets have created deep fake task forces that are training editors and reporters to recognize deep fakes and create newsroom guidelines. Unfortunately, technological progress may soon make manual investigation impossible.
A paper published last year pointed out that deep fake videos rarely show blinking, and so a person could conclude that a video without blinking is fake.3 However, now that this is recognized, Stokes speculated that developers will create better methods for generating blinking, and this telltale sign will soon cease to be a reliable way to detect a fake.
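The blink cue described above can be illustrated with the eye-aspect-ratio (EAR) heuristic commonly used for blink detection. The sketch below is purely illustrative and is not the method of the cited paper: it assumes that per-frame eye landmarks (e.g., from a facial landmark detector) are already available, and the threshold values are hypothetical.

```python
import math

def eye_aspect_ratio(eye):
    """Eye aspect ratio from six (x, y) landmarks ordered p1..p6
    around the eye: EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|).
    EAR drops sharply while the eye is closed."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    p1, p2, p3, p4, p5, p6 = eye
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))

def count_blinks(ear_series, threshold=0.2, min_frames=2):
    """Count blinks as runs of at least min_frames consecutive
    frames whose EAR falls below the (hypothetical) threshold."""
    blinks, run = 0, 0
    for ear in ear_series:
        if ear < threshold:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    if run >= min_frames:
        blinks += 1
    return blinks

# Synthetic per-frame EAR values: open eyes (~0.3) with one blink.
ears = [0.31, 0.30, 0.12, 0.10, 0.29, 0.32]
print(count_blinks(ears))  # -> 1
```

A long clip whose EAR series never dips, i.e., a blink count of zero, would be the suspicious signal; as noted, generators can be expected to learn to defeat this cue.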
Algorithmic detection uses computers to analyze synthetic media. Facebook in particular is working on several AI methods in this space, not only for identifying deep fakes but also for targeting misinformation more generally, including content presented out of context and false claims in text and audio. In particular, it is scanning images to perform optical character recognition, or transcribing audio to generate text, which can then be searched to see whether the claims have been debunked. However, this is very difficult to do at scale, Stokes noted.
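The extract-then-search step can be caricatured in a few lines. This is a minimal sketch, not Facebook's system: it assumes the OCR or transcription step has already produced text, and the `DEBUNKED` list and the use of simple string similarity (rather than the semantic matching a production system would need) are illustrative assumptions.

```python
import difflib

# Hypothetical database of previously debunked claims.
DEBUNKED = [
    "the moon landing was staged in a studio",
    "drinking bleach cures the flu",
]

def normalize(text):
    """Lowercase and collapse whitespace before matching."""
    return " ".join(text.lower().split())

def find_debunked(extracted_text, threshold=0.8):
    """Return debunked claims closely matching text extracted from
    an image (via OCR) or audio (via transcription). difflib's
    character-level similarity is a crude stand-in for real
    semantic matching."""
    candidate = normalize(extracted_text)
    hits = []
    for claim in DEBUNKED:
        score = difflib.SequenceMatcher(None, candidate, claim).ratio()
        if score >= threshold:
            hits.append((claim, round(score, 2)))
    return hits

print(find_debunked("The moon landing was STAGED in a studio!"))
```

The scale problem Stokes raised is visible even here: every extracted snippet must be compared against every known debunked claim, which is why production systems need indexing and far more robust matching.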
Content provenance is a digital signature or cryptographic validation of audio or video that is specific to the actual camera or microphone used to record it. For example, Stokes described efforts by an Israeli startup to insert hashes into a video file in a device-specific way and upload them into a publicly available blockchain sequence; a comparison of the video to the blockchain-stored value can then reveal, using a simple color scheme, which parts of a video are real. Content provenance can also be assured through digital signatures, directly analogous to the certificates used to authenticate Web pages. Stokes named one tool, called Proof Mode,4 that embeds metadata signatures in video or images to ensure a chain of custody and inspire confidence that the content collected is real. This app was designed for the purpose of providing credibility to documentation of human rights abuses.
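The chunk-hashing idea behind such provenance schemes can be sketched as follows. This is a simplified illustration, not the startup's actual protocol: it omits the device-specific signing and the blockchain anchoring, and shows only how per-chunk hashes localize which parts of a recording have been altered.

```python
import hashlib

CHUNK = 1024  # bytes per chunk; a real system might hash per frame

def chunk_hashes(data: bytes):
    """Hash fixed-size chunks of a recording. In the scheme
    described, a device would sign these hashes and publish them
    (e.g., to a public blockchain) at capture time."""
    return [hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]

def verify(data: bytes, published):
    """Compare a file of the same length against published hashes;
    return indices of chunks that no longer match, i.e., the
    possibly tampered regions."""
    current = chunk_hashes(data)
    return [i for i, (a, b) in enumerate(zip(current, published))
            if a != b]

original = bytes(range(256)) * 16          # stand-in for a video file
published = chunk_hashes(original)

tampered = bytearray(original)
tampered[1500] ^= 0xFF                     # flip one byte in chunk 1
print(verify(bytes(tampered), published))  # -> [1]
```

This localization is what enables the "simple color scheme" Stokes described: matching chunks can be shown as verified while mismatching chunks are flagged.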
However, content provenance and digital trustworthiness are not new ideas, and several domain experts have long been skeptical of their effectiveness. Skeptics point out that while such methods may be technically feasible, they are extremely difficult to implement. For example, if a certificate is stolen, it must be revoked from all cameras and a new one issued—a high-cost, brittle solution difficult to implement at Internet scale, Stokes said.
While no federal legislation has been passed to address the issue, a bill has been introduced in Congress that would criminalize malicious creation and distribution of deep fakes. In New York State, a law was proposed that would punish individuals who make non-consensual deep fakes of others, but movie companies are fighting back, citing First Amendment rights. Some believe that deep fakes, while used in certain communities, are unlikely to be widespread or cause serious damage and that concerns are overblown—in particular, because posting a deep fake might actually call attention to the malicious actor, making it not worth the risk.
In summary, Stokes stressed that technology is moving incredibly fast, deep fakes are causing real harm to real people, and it is only a matter of time before they are deployed for political manipulation. Given this context, Stokes urged academia, industry, and government to take advantage of this brief window of opportunity to help find solutions.
Delip Rao, AI Foundation
Rao described the AI Foundation’s efforts to develop methods for detecting synthetic digital content. This work is undertaken in order to improve what the foundation terms information safety, a goal with three components: education (degree programs, employee training, and public safety campaigns), enforcement (creating and enforcing effective laws), and empowerment. In the context of empowering users to detect fake content, Rao outlined two broad categories: (1) provenance-based tools, such as the hashes and cryptographic schemes Stokes mentioned; and (2) data-driven methods, which use ML and data science. Data-driven methods are the focus of Rao’s work, and they fall into three categories: metadata-based methods, content-based methods, and a combination of both. Since only the largest companies or governments have enough metadata to use metadata-based methods, Rao’s team focuses on content-based methods. He described several of these methods, which differ depending on whether the content in question is an image (or video), audio, or text forgery.
3 Y. Li, M.-C. Chang, and S. Lyu, 2018, “In Ictu Oculi: Exposing AI Created Fake Videos by Detecting Eye Blinking,” in 2018 IEEE International Workshop on Information Forensics and Security (WIFS), doi:10.1109/WIFS.2018.8630787.
4 Guardian Project, 2017, “Combating ‘Fake News’ with a Smartphone ‘Proof Mode,’” posted on February 24, https://guardianproject.info/2017/02/24/combating-fake-news-with-a-smartphone-proof-mode/.
Detecting Synthetic Images or Videos
Rao described several approaches to detecting fake images according to how they were created. Generative adversarial networks (GANs) have been able to create increasingly realistic synthetic images, but these often contain artifacts that help enable detection. He also described improvements in so-called “image-to-image transfer” approaches, where, for example, the facial expression from one image can be transferred to the face in another.
The AI Foundation and the Technical University of Munich are collaborating on ML-based detection methods for fake or altered videos of faces, training models on more than half a million faces from a data set of 1,000 videos to improve accuracy. For example, he commented that one could apply their model to the images that appear on whichfaceisreal.com and identify the fake in a given pair of images with reasonable accuracy. However, he noted that the model is less accurate for compressed media, because the artifacts from compression are similar to those left in GAN-generated synthetic images, causing the model to judge compressed real images as possibly fake. The team found that accuracy drops as the level of compression increases. It is important to keep refining these models because the human eye is especially bad at detecting fake facial images, Rao added.
Another approach to detecting image manipulation is segmentation, in which manipulated regions of an image are detected and highlighted. The AI Foundation is building a suite of products and services, including a browser plugin that could detect and highlight doctored portions of an image or video. The Face Forensics data set, comprising 1.4 million altered images at varying quality levels to reflect the range of fake images one might encounter, is available for training models to detect fakes. The plugin tool has generally high accuracy rates for the attack vectors it has been trained on, although rates are higher for uncompressed images. In addition, the model’s detection performance varies across forgery techniques, suggesting a general caution against assuming that any detection technology will generalize to detection of all deep fakes.
To mitigate this, Rao’s team is working on a so-called “forensic transfer” learning approach to understand how quickly one model trained on a given data set can be bootstrapped to work on another—for example, corresponding to a different type of image (e.g., faces versus landscapes) or a different type of forgery. So far, they have been able to successfully transfer a detection model to a new domain by fine-tuning the pre-trained model with as few as a dozen new training images.
Detecting Synthetic Audio
Detecting fake audio raises different challenges. There are currently three modalities for generating synthetic audio: voice conversion, text-to-speech synthesis, and replay attack. Voice conversion starts with a “source voice” that says the desired phrase, which is then converted to sound like the “target voice.” Text-to-speech synthesis, very common because of its commercial applications, converts text to audio that sounds like a particular individual. Replay attacks, in which authentic audio samples are strung together out of context with malicious intent, have until recently been very difficult to detect. A recent competition, ASVspoof, drastically improved replay detection rates for certain data sets, Rao noted, although this does not by any means eliminate the problem.
Voice conversion and text-to-speech are particularly worrisome because they sound realistic, and the human ear is unable to distinguish between real and fake as easily as the eye can, Rao said. Because telltale artifacts of audio forgeries are harder for humans to detect, accurate model-based interventions are necessary in Rao’s view.
While multiple companies are collaborating on a data set and shared task to improve such models, the work is in very early stages, he concluded.
Detecting Synthetic Text
Text can be synthetically generated by computers via language models capable of generating words to follow a given sequence of words. Language models are trained on word co-occurrence statistics. First developed decades ago, such models have continually improved over time.5 Recently, deep learning (DL)-based sequence modeling approaches have produced state-of-the-art language models. He described the recent transformer-based GPT-2 model,6 trained to predict the next word in a sequence, as capable of generating text with lexical coherence not only within but also between sentences; however, broader topical and semantic coherence is still largely elusive in today’s language models.
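The next-word-prediction idea can be made concrete with a toy bigram model built purely from co-occurrence counts; modern models replace the counting with deep networks but expose the same predict-the-next-word interface. The training corpus here is, of course, illustrative.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Co-occurrence counts: for each word, count which words follow it.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Most likely continuation under the bigram counts."""
    return follows[word].most_common(1)[0][0]

def generate(start, n=5):
    """Greedily extend a prompt one predicted word at a time."""
    out = [start]
    for _ in range(n):
        out.append(predict_next(out[-1]))
    return " ".join(out)

# Greedy generation loops through whatever the counts favor,
# e.g. "the cat sat on the ..."
print(generate("the"))
```

The locally fluent but topically aimless output of even this toy model hints at the gap Rao described between lexical coherence and genuine semantic coherence.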
Concerns were recently raised about the GPT-2 model by the OpenAI blog—including its potential for generating misleading news articles or automating the production of fake and potentially abusive online content or spam. Rao suggested that current models, while problematic in theory, are not in practice capable of generating text with topical coherence, even with models using around 1.5 billion parameters trained on 8 million Web articles—indicating that simply adding more data and computing power will not be enough to cross this barrier. He noted that synthetic text from language models is not particularly difficult to identify. In particular, models for creating machine-generated text are optimized to minimize perplexity, an entropy-based measure of disagreement between the word frequency predicted by the model and the word frequency observed in natural language.7 However, human-generated text tends not to adhere to this principle while nonetheless maintaining readability and coherence.8 Rao noted that this picture could change as text generation technology continues to advance.
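The perplexity-based fingerprint described above can be sketched with a toy smoothed bigram model. This is an illustrative sketch, not a production detector: the training text and example sentences are invented, and the point is only that text resembling what a model would itself generate scores as more predictable (lower perplexity) than comparably readable human phrasing.

```python
import math
from collections import Counter, defaultdict

train = "the cat sat on the mat . the dog sat on the rug .".split()
vocab = set(train)

bigrams = defaultdict(Counter)
for a, b in zip(train, train[1:]):
    bigrams[a][b] += 1

def perplexity(words):
    """Perplexity = exp(mean negative log-probability) under an
    add-one-smoothed bigram model. Low values mean the text is
    highly predictable, the statistical signature expected of
    output sampled from a model optimized to minimize exactly
    this quantity."""
    logp = 0.0
    for a, b in zip(words, words[1:]):
        count = bigrams[a][b]
        total = sum(bigrams[a].values())
        logp += math.log((count + 1) / (total + len(vocab)))
    return math.exp(-logp / (len(words) - 1))

machine_like = "the cat sat on the mat .".split()   # in-distribution
human_like = "the mat sat on the dog .".split()     # readable but odd
print(perplexity(machine_like) < perplexity(human_like))  # -> True
```

The GLTR tool cited in footnote 8 exploits the same idea at word level, highlighting how consistently each word ranks among a large model's top predictions.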
He went on to state that the kinds of threats posed by language generation (that is, computer models capable of generating entirely new text) are not really new. Misinformation has already been created manually with little effort. For example, fake news articles were generated around the 2016 U.S. election by taking real, existing articles and changing a few minor features to create a false message in a “new” article. Rao took this approach of repurposing existing, human-written text and automated it using natural language processing (NLP) techniques, sharing a fake article that he had generated in fun about his co-panelist. He pointed out that, while this example had some factual inconsistencies that a fact-checker would catch, the approach could be further improved using NLP. In addition, it was not subject to the same telltale artifacts as language generation methods, and the approach cost his team on the order of $100, compared to an estimated $43,000 required to train a model like GPT-2. Given the ease of generating such “cheap fakes,” he suggested that resources should be allocated for enabling their detection.
Wrapping up, Rao concluded that the quality of synthetic media will continue to improve, and as they become increasingly successful at fooling humans, algorithmic detection methods will become more and more indispensable. The potential scale of distribution is a big part of the problem, but the recent adoption of private and ephemeral messaging products will pose further challenges, because while public data can be monitored, fact-checked, and used to create training data sets, private or ephemeral data cannot.
Kambhampati moderated an open discussion period that touched on such themes as human behavior, psychology, and trust.
5 Rao referred to a family of models, including recurrent neural networks (RNNs), long short-term memory models (LSTMs, a type of RNN), and transformer-based models such as the GPT and subsequent GPT-2 models.
6 A. Radford, J. Wu, D. Amodei, D. Amodei, J. Clark, M. Brundage, and I. Sutskever, 2019, “Better Language Models and Their Implications,” OpenAI.com, February 14, https://openai.com/blog/better-language-models/.
8 This observation was successfully exploited in the GLTR detection approach; see H. Strobelt and S. Gehrmann, “Catching a Unicorn with GLTR: A Tool to Detect Automatically Generated Text,” at http://gltr.io/.
Is It Futile to Fight Deep Fakes?
Kathleen Fisher, Tufts University, pointed to the Defense Advanced Research Projects Agency’s (DARPA’s) MediFor9 program, created to forensically detect fake imagery and audio, and relayed a sense conveyed by colleagues that it is only a matter of time before generation technology supersedes detection abilities. For example, she said, it is now possible to fake image metadata, including GPS coordinates, although the same capabilities are not yet mature for fake videos. Building on this, Yevgeniy Vorobeychik, Washington University in St. Louis, shared a pessimistic and an optimistic comment. First, he wondered whether deep fake detection is even worth pursuing, given that it could become a near-impossibility. Would it be healthier in the long run if society as a whole simply agreed that digital content is untrustworthy? There does exist real digital content that is misleading or trusted more than it should be, so the pollution due to digital fakes could perhaps promote a healthy distrust of digital content more generally. On the other hand, he saw opportunities to explore new modes of detection, based on common sense or on some sort of side-channel analysis. For example, it may be intuitive that a video of a well-known leader saying something out of character (and unreported in the mainstream media) is quite possibly fake.
Delip Rao responded that a high-profile deep fake of a celebrity or politician is likely to attract scrutiny and prompt people to try to validate or invalidate what is being claimed, but that the harder challenge is to protect individuals and civilians from targeted, malicious deep fakes that could bring significant harm to their safety and reputation. He also doubted that humans could be so skeptical as to truly “trust nothing,” to which Tyler Moore, University of Tulsa, added that the social costs of trusting nothing may be worse than the costs of the occasional deep fake.
Phil Venables, Goldman Sachs, wondered whether people would believe something is a deep fake if an algorithm says it’s fake but they cannot discern the fake themselves. He also wondered whether this mattered, given that outrage about the fake content would get attention anyway. Stokes noted that humans have cognitive bias: they prefer to believe things that already fit into their worldview. Rao added that people also tend to harden their position if someone claims an item is false; in this context, binary labels like “real” or “fake” can actually increase polarization and distrust.
Kambhampati pointed out a need to clarify that there are different kinds of “fakes”: human-manufactured content that is objectively untrue, and machine-manufactured content that may or may not be objectively true. Nicolas Papernot, Google, recalled an earlier comment from Kambhampati that deep fakes and adversarial examples are distinct, and suggested that the relationship between the two may be more complicated than originally considered. He pointed to a 2018 paper by his team finding that humans are affected by adversarial examples even after a short exposure, and that some of what advanced machines misclassify could be something that humans might also misclassify; this information could be extracted by a malicious actor for the purpose of tricking humans under certain conditions.
Another participant noted that, besides deep fakes designed to deceive the public, they could be used in a military scenario where a deep fake influences short-term decision-making with catastrophic consequences. In that scenario, the ability to detect a deep fake quickly is crucial, but Kambhampati pointed out that false positives are an impediment to this goal. Rao agreed, adding that false positives undermine trust in detection tools and limit their use. Kambhampati also pointed out that military decisions are important enough that there may be many people examining the content in question, and potentially other methods of cross-checking its veracity through other sources.
Facets of Potential Solutions
Delip Rao noted that technology will only be a small part of the solution for detecting deep fakes, which will also require public education and a fuller understanding of human psychology and behavior. Stokes pointed out that humans are predisposed to seek out and repeat things that are new and negative, to which Rao added the saying that “[if] you repeat a lie enough times it becomes the truth.” In this age of social media amplification, using algorithmic approaches to detect misinformation before it goes viral could be essential to prevent this truth by repetition.
Wenke Lee, Georgia Institute of Technology, compared deep fake creation and detection to an arms race. Whatever an algorithm is detecting, creators will eventually find a workaround. Might it be better to stop trying to detect fakes and look instead to random biometric authentication challenges, which make it harder for an attacker to convincingly pose as a human?
While several speakers raised the challenges involved in addressing deep fakes via legislation, Tyler Moore, University of Tulsa, argued that intent matters and can be legislated against, in the way that impersonating a police officer is illegal in certain circumstances. Kambhampati agreed that laws can be helpful in this context, but noted that they are unlikely to be a global, Internet-scale solution.
Applications of Deep Fakes
David Brumley, Carnegie Mellon University and ForAllSecure, raised the potential for justifiable uses of deep fakes, such as reinforcement in learning, akin to a language learning program generating sample sentences to help users learn another language. Rao likened such potential uses of deep fakes to advertising, where a constructed narrative is used to achieve a certain agenda. Even if an ad is not necessarily perceived as real, it can still be effective. Kambhampati added that deep fakes have so far been discussed only in a negative light, but are actually exciting in the context of virtual reality applications for games and other entertainment. These applications will also drive interest in advancing some of the same methods used to generate deep fakes, and could contribute to the persistence of their negative aspects over time.