Meta is facing a class action lawsuit filed by five major book publishers and one author over claims the company “engaged in one of the most massive infringements of copyrighted materials in history” when training its Llama AI models, as reported earlier by The New York Times. In their suit, Macmillan, McGraw Hill, Elsevier, Hachette, Cengage, and author Scott Turow allege that Meta “repeatedly copied” their books and journal articles without permission.
The lawsuit accuses Meta of knowingly ripping copyrighted work from “notorious pirate sites,” such as LibGen, Anna’s Archive, Sci-Hub, Sci-Mag, and others, and then feeding that material into its AI model. It also claims that Meta trained Llama on data from the Common Crawl dataset, which is allegedly “full of unauthorized copies of copyrighted works.” As a result, the suit says, Llama “outputs verbatim and near-verbatim substitutes” of copyrighted material:
“For example, when prompted with two brief sentences from Cengage’s best-selling textbook, Calculus: Early Transcendentals, 9th edition, by James Stewart, Llama begins reproducing word-for-word the continuation of the section.”
A group of authors also sued Anthropic over copyright infringement. While a federal judge ruled that training AI models on legally purchased books without permission is considered fair use, he allowed the authors to move forward with a class action lawsuit over the “millions” of works Anthropic allegedly pirated. Anthropic agreed to pay writers $1.5 billion last year to settle the class action lawsuit.
Turow and the group of publishers are suing Meta for damages and asking the court to order the company to stop its allegedly unlawful activities. They also ask the court to require the company to provide a list of the books, journal articles, and other copyrighted works that it used to train its Llama AI models.
“AI is powering transformative innovations, productivity and creativity for individuals and companies, and courts have rightly found that training AI on copyrighted material can qualify as fair use,” Meta spokesperson Dave Arnold said in an emailed statement to The Verge. “We will fight this lawsuit aggressively.”



