But it’s not just that models can’t recognize accents, languages, syntax, or faces less common in Western countries. “A lot of the initial deepfake detection tools were trained on high quality media,” says Gregory. But in much of the world, including Africa, cheap Chinese smartphone brands that offer stripped-down features dominate the market. The photos and videos these phones produce are much lower quality, further confusing detection models, says Ngamita.
Gregory says that some models are so sensitive that even background noise in a piece of audio, or compressing a video for social media, can result in a false positive or negative. “But those are exactly the circumstances you encounter in the real world, rough and tumble detection,” he says. The free, public-facing tools that most journalists, fact-checkers, and civil society members are likely to have access to are also “the ones that are extremely inaccurate, in terms of dealing both with the inequity of who is represented in the training data and of the challenges of dealing with this lower quality material.”
Generative AI is not the only way to create manipulated media. So-called cheapfakes, or media manipulated by adding misleading labels or simply slowing down or editing audio and video, are also very common in the Global South, but can be mistakenly flagged as AI-manipulated by faulty models or untrained researchers.
Diya worries that if groups rely on tools that are more likely to flag content from outside the US and Europe as AI-generated, it could have serious repercussions at the policy level, encouraging legislators to crack down on imaginary problems. “There’s a huge risk in terms of inflating those kinds of numbers,” she says. And developing new tools is hardly a matter of pressing a button.
As with every other form of AI, building, testing, and running a detection model requires access to energy and data centers that are simply not available in much of the world. “If you talk about AI and local solutions here, it’s almost impossible without the compute side of things for us to even run any of our models that we are thinking about coming up with,” says Ngamita, who is based in Ghana. Without local alternatives, researchers like Ngamita are left with few options: pay for access to an off-the-shelf tool like the one offered by Reality Defender, the costs of which can be prohibitive; use inaccurate free tools; or try to get access through an academic institution.
For now, Ngamita says that his team has had to partner with a European university where they can send pieces of content for verification. In the meantime, his team has been compiling a dataset of possible deepfake instances from across the continent, which he says is valuable for academics and researchers who are trying to diversify their models’ datasets.
But sending data to someone else also has its drawbacks. “The lag time is quite significant,” says Diya. “It takes at least a few weeks by the time someone can confidently say that this is AI generated, and by that time, that content, the damage has already been done.”
Gregory says that Witness, which runs its own rapid response detection program, receives a “huge number” of cases. “It’s already challenging to handle those in the time frame that frontline journalists need, and at the volume they’re starting to encounter,” he says.
But Diya says that focusing so much on detection might divert funding and support away from organizations and institutions that make for a more resilient information ecosystem overall. Instead, she says, funding needs to go towards news outlets and civil society organizations that can engender a sense of public trust. “I don’t think that’s where the money is going,” she says. “I think it is going more into detection.”