Anthropic Facing Copyright Lawsuit Over Alleged Misuse of Pirated Books in AI Training
The artificial intelligence company Anthropic has been hit with a class-action lawsuit in California federal court by three authors who claim the company misused their copyrighted books, along with hundreds of thousands of others, to train its popular AI chatbot, Claude.
The complaint, filed on Monday by writers Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, alleges that Anthropic knowingly used pirated versions of their books and other copyrighted materials to teach Claude how to respond to user prompts. The authors argue that this constitutes a “large-scale theft” of their intellectual property.
“It is no exaggeration to say that Anthropic’s model seeks to profit from strip-mining the human expression and ingenuity behind each one of those works,” the lawsuit states.
This lawsuit marks the latest in a series of legal challenges facing AI companies over their use of copyrighted material to train their language models. Similar cases have been filed against industry leaders like OpenAI and Meta, with authors, news outlets, and even music publishers accusing these firms of infringing on their intellectual property rights.
Anthropic, a San Francisco-based startup founded by former OpenAI executives, has positioned itself as a more responsible and safety-focused developer of generative AI. However, the authors allege that the company’s actions have “made a mockery of its lofty goals” by tapping into repositories of pirated writings to build its AI product.
The lawsuit claims that Anthropic used a dataset called “The Pile,” which includes a massive library of pirated ebooks, to train Claude. The authors argue that Anthropic could have obtained licenses to use the copyrighted material, but instead “made the deliberate decision to cut corners and rely on stolen materials.”
The plaintiffs are seeking an unspecified amount of monetary damages and a permanent injunction to prevent the company from further misusing their work.
The case highlights growing tensions between technology companies and creators as the AI industry grapples with unresolved questions of copyright and fair use. The outcome of this and similar lawsuits could shape how AI models are developed and trained.