Lying has never looked so good, literally. Concern over increasingly sophisticated technology able to create convincingly faked videos and audio, so-called 'deepfakes', is rising around the world. But even as that technology develops, technologists are fighting back against the falsehoods.
“The concern is that there will be a growing movement globally to undermine the quality of the information sphere and undermine the quality of discourse necessary in a democracy,” Eileen Donahoe, a member of the Transatlantic Commission on Election Integrity, told CNBC in December 2018. She said deepfakes are potentially the next generation of disinformation.
Lawmakers in California are so concerned about the risks of deepfakes in the lead-up to the 2020 elections that they passed a law in October 2019 banning the distribution of "materially deceptive audio or visual media" within 60 days of an election.
However, political interference through deepfakes hasn't happened yet. The majority of deepfakes created so far are pornographic, made without the consent of the people depicted. That doesn't mean there aren't other potential malicious uses, including faked criminal evidence, fraud and blackmail. The other possibility, that the cry of "deepfake!" might be used to wrongly dismiss real footage as fake, is equally worrying.
California's 'Anti-Deepfake Bill' has multiple flaws, but one of them raises an important question: who is responsible for proving that audio or video has been manipulated?
Some researchers around the world are working on building the tools to do just that, by fighting AI with AI. The whole point of deepfakes is that they are convincing enough to fool a human audience. As the technology for creating deepfakes becomes ever more effective, and ever more easily accessible to bad actors, creating equally powerful technology for deepfake detection and analysis will become a crucial battleground for the truth.
"Audio and visual deepfakes that are done well are hard to catch even for humans," says Ragavan Thurairatnam. Thurairatnam is the co-founder and Chief of Machine Learning for Dessa, a start-up which has built a tool to fake Joe Rogan's voice to demonstrate the capabilities of audio deepfakes. The company is currently working on developing AI for detecting audio deepfakes.
"If we try a traditional software based approach, it would be very difficult to figure out the rules to write in order to catch deepfakes. On top of this, deepfake technology will constantly change and traditional software would have to be rewritten by hand every time," explains Thurairatnam. "AI, on the other hand, can learn to detect deepfakes on its own as long as you have enough data. In addition, it can adapt to new deepfake techniques as they surface even when detection is difficult to human eyes."
Siwei Lyu of the University at Albany also believes that deep learning may hold the key, at least for now. "Data-driven deep learning methods seem to be the most effective methods so far. Because they learn classification rules from training data, they are more flexible and can be adapted to complex conditions in which the videos are spread, for example through video compression, social media laundering, and other counter-measures applied by the forgers."
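In code, that data-driven approach boils down to training a binary classifier on examples of real and faked faces. The sketch below, written in PyTorch with a hypothetical dataset of labelled face crops, is only an illustration of the idea, not Lyu's actual system; a real detector would use far more data, augmentation and a stronger network.

```python
# Minimal sketch of a data-driven deepfake detector: a small convolutional
# network trained to label face crops as real or fake. The dataset is
# assumed to yield (face_crop_tensor, label) pairs with label 1.0 = fake.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

class DeepfakeClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(128, 1)  # single logit: fake vs real

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def train(model, dataset, epochs=5):
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for images, labels in loader:
            optimiser.zero_grad()
            loss = loss_fn(model(images).squeeze(1), labels.float())
            loss.backward()
            optimiser.step()
```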
Training machine-learning models requires a lot of data, however, and the lack of training data has been a significant obstacle for researchers trying to build effective deepfake detection systems. A recent report by Deeptrace, an Amsterdam-based start-up which aims to counter deepfakes, identified 14,678 deepfake videos online, the overwhelming majority of which were porn. The rate of increase is alarming (the number of videos identified has almost doubled since the previous audit in December 2018, although some of that rise may be down to better detection by Deeptrace), but in absolute terms this is still a relatively small number of examples on which to train AI algorithms.
This is a structural advantage on the side of forgers. Whilst the good guys need huge numbers of deepfake videos to train on, the forgers might only need to place one video in the right place at the right time to achieve their goal.
While the number of deepfakes is fairly small, the threat is being taken seriously. To help address the problem of a lack of training data, Facebook, Google, Amazon Web Services and Microsoft recently came together to announce the Deepfake Detection Challenge. The Challenge, due to launch next month, will release a specially created dataset of deepfakes, made with paid actors, for researchers around the world to use as training data for their models. Developing effective deepfake detection systems is obviously in the public good, but it's not entirely an act of altruism by the tech giants, who are likely to be on the front lines of enforcing legislation like California's Anti-Deepfake Bill and therefore have a strong incentive to find practical detection mechanisms.
Lyu and his colleague Yuezun Li have proposed an alternative detection method which is less data-hungry. Because current deepfake algorithms can only generate images at limited resolutions, which then have to be warped to match the faces in the original video, it is possible to identify deepfaked videos by measuring those warping artefacts. Training this model requires far less data than many other deep learning methods. The weakness, of course, is that forgers may find a way to reduce face warping, and then detectors will be back at square one.
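The intuition can be illustrated crudely: a face generated at low resolution and then warped into a high-resolution frame tends to be smoother than its surroundings. The snippet below is only a rough sketch of that idea using OpenCV, not Lyu and Li's actual model, which trains a neural network to spot these artefacts; the face detector and the ratio test are stand-ins.

```python
# Crude illustration of the warping-artefact intuition: a swapped face that
# has been upscaled and warped into place is often blurrier than the rest
# of the frame. This is NOT Lyu and Li's published method, only the idea.
import cv2

def sharpness(gray_region):
    # Variance of the Laplacian is a standard, if blunt, sharpness measure.
    return cv2.Laplacian(gray_region, cv2.CV_64F).var()

def face_blur_ratio(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, 1.3, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    face = gray[y:y + h, x:x + w]
    # A suspiciously low ratio suggests the face is much smoother
    # than its surroundings, one possible sign of a pasted-in face.
    return sharpness(face) / (sharpness(gray) + 1e-9)
```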
The same applies to other telltale signs of deepfakes. For example, Lyu and his team observed that people in deepfaked videos rarely blink, and built a model to detect deepfakes based on this. In the paper describing their research, however, they noted that "[s]ophisticated forgers can still create realistic blinking effects with post-processing and more advanced models and more training data", and that this method is therefore unlikely to be effective once the forgers figure out the process.
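A simplified version of the blink test can be sketched with the standard eye-aspect-ratio trick: track how often the eyes close over the course of a clip and flag videos where blinks are unusually rare. In the sketch below the facial-landmark detector is passed in as a placeholder function and the thresholds are purely illustrative; it shows the idea rather than the team's published model.

```python
# Sketch of the blink-rate cue: compute an eye aspect ratio (EAR) per frame
# from six eye landmarks and flag clips in which blinks are unusually rare.
# The landmark detector is supplied by the caller (dlib, MediaPipe, etc.);
# the threshold values below are illustrative only.
import numpy as np

def eye_aspect_ratio(eye):
    # eye: (6, 2) array of landmark points around one eye, ordered in the
    # usual way for the EAR measure (corners first, then upper/lower lids).
    vertical = np.linalg.norm(eye[1] - eye[5]) + np.linalg.norm(eye[2] - eye[4])
    horizontal = np.linalg.norm(eye[0] - eye[3])
    return vertical / (2.0 * horizontal)

def blinks_look_suspicious(frames, get_eye_landmarks, fps=30,
                           ear_threshold=0.2, min_blinks_per_minute=5):
    blinks, eye_closed = 0, False
    for frame in frames:
        eye = get_eye_landmarks(frame)   # returns a (6, 2) array or None
        if eye is None:
            continue
        if eye_aspect_ratio(eye) < ear_threshold:
            eye_closed = True
        elif eye_closed:                 # eye has re-opened: count one blink
            blinks += 1
            eye_closed = False
    minutes = len(frames) / (fps * 60.0)
    return minutes > 0 and (blinks / minutes) < min_blinks_per_minute
```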
"The competition between forgery making and detection is an ongoing cat-and-mouse game, each side will learn from the other and improves," says Lyu.
Thanh Thi Nguyen and his team at Australia’s Deakin University recently compared different deep learning models for deepfake detection. He agrees that deep learning is currently the most promising method, but says there are other options also worth exploring.
“Another method [which does not use deep learning] is the use of photo response non uniformity analysis to detect deepfakes from authentic ones. PRNU is a noise pattern stemming from factory defects in light sensitive sensors of digital cameras,” Nguyen says. “PRNU is different for every digital camera and often considered as the fingerprint of digital images. The analysis is widely used in image forensics because the swapped face is supposed to alter the local PRNU pattern in the facial area of video frames.”
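Roughly, that analysis works by estimating a camera's noise fingerprint from trusted footage and then checking whether the face region of a suspect frame still carries it. The snippet below is a heavily simplified sketch of that idea; real PRNU forensics uses far more careful denoising and statistical testing, and the correlation threshold here is purely illustrative.

```python
# Simplified sketch of PRNU-based checking: estimate a camera's noise
# fingerprint from trusted frames, then test whether the face region of a
# suspect frame (same resolution, same camera) still matches it.
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(gray):
    # Residual = image minus a denoised version of itself.
    g = gray.astype(np.float64)
    return g - gaussian_filter(g, sigma=2)

def estimate_fingerprint(trusted_gray_frames):
    return np.mean([noise_residual(f) for f in trusted_gray_frames], axis=0)

def normalised_correlation(a, b):
    a = a - a.mean()
    b = b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def face_region_consistent(suspect_gray, fingerprint, face_box, threshold=0.02):
    # face_box = (x, y, w, h); a low correlation in the face region hints
    # that the face may have been replaced, altering the local PRNU pattern.
    x, y, w, h = face_box
    residual = noise_residual(suspect_gray)
    face_corr = normalised_correlation(residual[y:y + h, x:x + w],
                                       fingerprint[y:y + h, x:x + w])
    return face_corr >= threshold
```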
"There is no single model that performs well for all kinds of deepfakes," Nguyen explains. "This is especially true for deepfake detection because most of the existing deepfake detection methods are based on finding weaknesses in the deepfake creation methods. However, researchers still need to work towards a goal of finding a deepfake detection model which can perform effectively on most kinds of deepfakes."
Developing technical methods for deepfake detection is only the first step to addressing the problem, however. Any solution is only as effective as its implementation. That comes back to the key question left unanswered by California's Anti-Deepfake Bill. Who is actually responsible for identifying deepfakes, and how can it be done at scale?
"Building awareness about deepfakes is one of the best ways to combat their potential for malicious uses. Think about the audio deepfake fraud case that happened recently, for example: the executive on the end of the phone who thought he was talking to his boss probably had zero awareness of hyperrealistic synthesized audio," says Thurairatnam, referring to a recent case in which a deepfake of a CEO's voice was used to steal €220,000 (£189,000) from a UK-based energy company. "But what if he had? When we know to question whether or not to believe our ears, or eyes, especially when things seem off, that’s the first step to ensuring people are less easily deceived by these technologies."
Awareness amongst individual users is important, but the scale of the problem means that ultimately it is likely to be the tech companies who will have to take point in sorting truth from fakes.
"The technology that is used for both generating and detecting deepfakes will only evolve from here. It’s going to be a cat and mouse game where the technologies build off the other. Because of that, it’s critical we have other strategies in place to mitigate deepfakes’ risk – these won’t rely on technology but instead on participation from the government or different policy groups," Thurairatnam says.
Integrating deepfake detection systems at the level of tech platforms, particularly the social media giants like Facebook, Twitter and YouTube, may be the only realistic way of cracking down on the proliferation of deepfake content. That has both advantages and disadvantages. On the one hand, the scale and reach of these platforms allow for an unmatchable level of global visibility and data analysis. Cooperation between the tech giants on the Deepfake Detection Challenge may lay the groundwork for further collective action in the battle against deepfakes.
One option might be a system similar to the current Global Internet Forum to Counter Terrorism (GIFCT). The GIFCT manages a database which allows tech companies to share hashes of problematic content. This means that if, for example, Facebook finds a terrorist video on its platform, it can share a hash of that content to the database so that YouTube or Twitter can investigate whether the same video has also been uploaded onto their services.
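In its simplest form, such a scheme is just a shared lookup table of content hashes, as in the sketch below. (Production systems such as GIFCT's rely on perceptual hashes so that re-encoded or lightly edited copies still match; the plain SHA-256 used here for simplicity only catches byte-identical files.)

```python
# Minimal sketch of hash sharing between platforms: the platform that first
# flags a video contributes its hash, and others check uploads against the
# shared set. A real deployment would use perceptual hashing and a shared
# service rather than an in-memory set.
import hashlib

shared_hashes = set()  # stand-in for the shared industry database

def file_hash(path):
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def flag_content(path):
    # Called by the platform that first identifies the problematic video.
    shared_hashes.add(file_hash(path))

def is_known_problem(path):
    # Called by other platforms when new content is uploaded.
    return file_hash(path) in shared_hashes
```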
But the global scale of the tech giants also brings challenges. Cultural misunderstandings, recognising satire and protected political speech, and the complexity of legal jurisdictions and incompatible national laws are all problems which have impacted the GIFCT, and are likely to affect any coordinated effort to manage deepfakes as well.
As complex as the technology of deepfake detection is, it may turn out to be the easy part compared to the politics of policing what's 'real' amidst the morass of sex, lies and videotape.