AI companies sued for training chatbots with ‘pirated literature’

Tue Dec 23 2025
Rajesh Sharma

A coalition of journalists and writers in the United States has filed a lawsuit against prominent artificial intelligence companies, including OpenAI and Elon Musk’s xAI, claiming that they used copyrighted books without authorization to develop their AI systems. The plaintiffs, among them John Carreyrou, filed the lawsuit on Monday in a federal court in California. In addition to OpenAI and xAI, the defendants include Google, Anthropic, Meta Platforms, and the AI search startup Perplexity. The lawsuit claims that the companies copied and used protected literary works to build the large language models that power commercial chatbots, without obtaining licenses or compensating the authors.

The complaint characterizes the alleged conduct as deliberate copyright infringement. “This case concerns a straightforward and deliberate act of theft that constitutes copyright infringement,” the filing states. It alleges that the companies “illegally copied vast quantities of copyrighted books without permission and then used those stolen copies to build and train their commercial large language models.” The plaintiffs assert that the defendants obtained pirated copies of books from shadow libraries such as LibGen, Z-Library, and OceanofPDF. These copies were allegedly reproduced, analyzed, re-copied, and embedded into AI systems to accelerate commercial development. “The Copyright Act prohibits exactly this conduct,” the complaint states.

The lawsuit asserts that the alleged infringement affected hundreds of authors, including bestselling writers and Pulitzer Prize-winning journalists. It is the first copyright lawsuit to name xAI as a defendant, and it joins a growing list of legal challenges brought by authors, artists, and publishers against technology firms over the use of copyrighted material in AI training. The plaintiffs contend that current class-action settlements do not reflect the scale of the alleged infringement. “The danger is not hypothetical,” the complaint states, referencing a pending class action against Anthropic. It notes that authors in that case are expected to receive approximately $3,000 per work, before legal expenses, which it characterizes as “a tiny fraction (just 2 per cent)” of the Copyright Act’s statutory damages limit of $150,000 per infringed work. “LLM companies should not be able to so easily extinguish thousands upon thousands of high-value claims at bargain-basement rates,” the filing reads.

The lawsuit makes clear that the authors are not pursuing a class action. Instead, they want their individual claims to be decided by a jury. “Under established Supreme Court precedent, ‘the amount of statutory damages is a question for the jury’,” the complaint states. The complaint emphasizes that the Copyright Act empowers authors to hold alleged infringers accountable independently of class settlements. “This is not how Plaintiffs plan to proceed,” the filing states.

AI companies have consistently argued that using copyrighted content to train AI models falls under fair use, because these systems produce new and transformative results instead of merely replicating original works. In a previous case, a US judge determined that Anthropic’s use of copyrighted books for AI training constituted fair use. However, the court ruled that the company violated copyright law by storing millions of pirated books in a central database, irrespective of their eventual use for training. Carreyrou later told the court that using pirated books to develop AI systems was Anthropic’s “original sin,” according to reports.

Rajesh Sharma is a Mumbai-based correspondent covering the stock markets of South East Asia. He has been covering Asian markets for more than five years.