Settling Up: The First AI v. Copyright Dominoes Fall
Authors and Anthropic come to terms but raise as many questions as they resolve
Within days of each other, the first two settlements have been reached ahead of trial in AI copyright lawsuits. On Thursday (8/23), the parties in Vacker et al. v. Eleven Labs told the judge they had reached a settlement with the help of a mediator. And on Tuesday (8/26), Anthropic reached a settlement with a group of authors on the remaining charges in a class-action lawsuit against the AI company, also with the aid of a mediator.
Perhaps the dominoes are beginning to fall.
The terms of neither settlement have been disclosed pending court approval of the agreements. But a lawyer for the authors in the Anthropic case said they would be “announcing details of the settlement in coming weeks.” There was no indication whether either party in the Eleven Labs case, which pitted voice actors against the AI company over the unauthorized use of their voices, intends to reveal the terms.
The Anthropic settlement, at least, had been telegraphed in advance. Although Judge William Alsup found Anthropic’s use of digitized copies of books it had lawfully acquired to be fair use, he ruled that its downloading of millions of pirated copies of other works from shadow libraries was not and ordered Anthropic to stand trial on that count. He further certified a plaintiff class on that charge involving potentially millions of authors with tens of millions of works.
That raised the stakes considerably for Anthropic. With up to $150,000 in potential statutory damages for each work, it was facing hundreds of billions of dollars in possible liability.
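The scale of that exposure is easy to confirm with back-of-the-envelope arithmetic. The statutory range under U.S. copyright law runs from $750 per infringed work up to $150,000 per work for willful infringement; the class size below is a hypothetical placeholder, since the certified count is not public.

```python
# Rough statutory-damages exposure under 17 U.S.C. § 504(c).
# Per-work figures are the statutory range; the number of works is a
# hypothetical assumption for illustration, not a figure from the case.

STATUTORY_MIN = 750              # minimum statutory damages per work
STATUTORY_MAX_WILLFUL = 150_000  # maximum per work for willful infringement

def exposure(num_works: int, per_work: int) -> int:
    """Total potential statutory liability for a given number of works."""
    return num_works * per_work

# Assume a conservative 5 million pirated works in the certified class.
works = 5_000_000
print(f"At the statutory minimum: ${exposure(works, STATUTORY_MIN):,}")
print(f"At the willful maximum:   ${exposure(works, STATUTORY_MAX_WILLFUL):,}")
```

Even at the statutory minimum, liability runs into the billions; at the willful maximum it reaches the hundreds of billions cited above.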
The judge also rejected Anthropic’s request for permission to file an immediate appeal of the downloading ruling, further turning the screws on the AI company. And in an unmistakable signal of the court’s preferred outcome, Judge Alsup took the unusual step of granting permission for the parties to negotiate a class settlement even before he formally certified a class, all but telling Anthropic he expects it to make a deal.
While the settlement may be in the best interests of the parties under the circumstances, the outcome of the case is not necessarily in the best interests of anyone involved in similar cases. Critically, Judge Alsup’s denial of Anthropic’s request for permission to appeal robs the Ninth Circuit Court of Appeals of the opportunity to weigh in on a crucial question on fair use that might have gone a long way toward clarifying its application to the training of AI models with copyrighted works.
In its petition, Anthropic drew attention to a conflict between Alsup’s ruling and a ruling handed down two days later by a different judge in the same district court in Kadrey v. Meta. In that case, in which Meta was also accused of illegally downloading pirated books to train its model, Judge Vince Chhabria found Meta’s use of the books in training to be fair use. As to Judge Alsup’s ruling that the use of pirated works negates a claim of fair use, Judge Chhabria said that “begs the question,” as “the whole point of fair use analysis is to determine whether a given act of copying was unlawful.”
Anthropic was asking to have the Ninth Circuit resolve the “novel and consequential legal questions about the proper fair-use standard in the context of copyright-infringement challenges to groundbreaking generative artificial intelligence (“AI”) technology.”
Specifically, it said the conflict “presents the question whether fair use is analyzed based on the defendant’s ultimate purpose in copying a copyrighted work or instead may be parsed into separately analyzed constituent steps.” Judge Alsup, it said, “concluded that a supposed preliminary step—downloading books from “pirate” websites to be retained in a general-purpose library—was not compatible with a fair use of those books to train an LLM.” Judge Chhabria, on the other hand, viewed Meta’s downloading “in light of Meta’s ultimate objective of LLM development,” deeming that copying fair use.
Resolving that conflict, Anthropic said, “is critically important to the outcome of this and the many other pending copyright challenges to LLMs.”
With the settlement, however, that “novel and consequential” question is unlikely to get to the Ninth Circuit, at least for now.
Yet, the settlement does raise at least two other consequential questions, although neither is directly related to the fair use question.
While we don’t (yet) know the terms of the Anthropic settlement, the dollar figure is likely to be substantial. So substantial, in fact, that only a handful of the largest AI companies likely would be able to shoulder it: OpenAI, Microsoft, Google, Meta, Amazon, maybe Perplexity. Even with Amazon’s backing, it is probably a heavy lift for Anthropic, which may partially explain why the company was recently looking to raise $10 billion in new financing.
Should the settlement become a template for how other copyright infringement lawsuits against AI developers are resolved, it could effectively hand a de facto AI data monopoly to the largest tech companies. Only those with the ability to absorb class-action-size damage awards could afford to risk the liability, providing them the kind of competitive “moat” the technology itself has failed to deliver.
We also don’t yet know whether the Anthropic settlement includes any forward-looking provisions dealing with ongoing or future uses of the data. But even without those, it would amount to a kind of retroactive compulsory license for the tens of millions of works Anthropic used to train the Claude LLM. The same would apply to any similar future settlements with the few defendants that could afford it, further widening the moat would-be competitors would need to leap.
The other critical issue the Anthropic settlement could reveal is the murkiness of data on copyright ownership and rights assignments. Given the size of the class, the number of potential claimants on the money damages could be huge. But there is no comprehensive database or registry of who controls the reproduction and distribution rights to every title still under copyright.
I am a member of the rights committee of the Book Industry Study Group, which recently released Find a Rights Holder. It is a directory of contact information for people seeking to clear translation and other subsidiary rights or permissions, organized by imprint and mapped to the publishing house that currently owns each imprint. It was a Herculean effort, literally years in the making, and it’s still very much a work in progress. Like music catalogs, publishing imprints and the rights they control frequently change hands or go out of business. Before FRH, it could be nearly impossible to track down who actually holds the rights you’re trying to secure for a particular title. Sound familiar?
As good as it is — and there is nothing else like it out there — Find a Rights Holder would not be adequate for disbursing retrospective damages from the settlement to the appropriate parties, because it is not organized by author, nor does it contain information on individual publishing contracts.
There are also questions about money potentially due to authors or rights holders who do not agree with the terms of the settlement, and whether they will be swept into it unless they affirmatively opt out.
Logistics aside, it’s illustrative of challenges created by the scale of the datasets needed to train generative AI models. Rights are managed contractually and individually, whereas AI operates on them generically and collectively. It’s an irreducible conflict that will haunt all efforts to develop a sustainable licensing system for AI training, whether forward-looking or retroactive.
By creating the conditions where a class settlement was the only viable option for the parties, the court in Bartz v. Anthropic, perhaps unwittingly, has laid the conflict bare.