Unresolved: Did the Bartz v. Anthropic Settlement Actually Settle Anything?
We could know more this week
We’ll presumably find out more this week about whether the much-ballyhooed settlement in Bartz v. Anthropic really settled anything. Judge William Alsup was decidedly unimpressed with the parties’ first effort, telling them he was “disappointed” that the proposed settlement agreement did not include some important information.
He then assigned the parties homework, issuing a list of 17 questions to address, ordering them to submit “complete and succinct written answers” by this Tuesday (9/23). He later followed those up with 17 more, bringing the total to 34, ending his order, pointedly, with a warning that “These questions do not necessarily identify all [sic] issues that could bar approval — nor even all instances where one of the above [contradictions or omissions] comes up.”
If he still is not satisfied, the case could go to trial in December on whether Anthropic’s downloading of millions of books from online shadow libraries amounts to copyright infringement.
Among the issues Judge Alsup seems concerned with are the precise procedures by which claims on the settlement will be assessed and paid out.
“What documentary proof, if any, will be required to back up a claim of beneficial or legal ownership of a work on the Works List?” he wants to know. “Relatedly, the Settlement Agreement provides that the Settlement Administrator may request information beyond whatever the Claim Form requires; describe the process for any such further requests and any review.”
And later, “Please explain whatever methods, data, and application of methods to data support the assertions that planned general notice will actually reach 70 to 95 percent of class members or more.”
Well, he did say “please.”
Going deeper into the weeds, the judge proposes the following hypothetical and wants to know how it would be resolved.
The class defined in the order on class certification and in the Settlement Agreement requires that a member of the class be a beneficial or legal owner of the exclusive right to reproduce copies of a relevant work (under Section 106(1)). Suppose, however, that while the author and publisher have retained beneficial and legal ownership of that right, a movie studio has acquired the exclusive legal right to prepare derivative works (under Section 106(2)). Does this mean that the movie studio is not in the class and therefore has not given up its potential claim against Anthropic, one turning on its contention that Anthropic created derivative works?
Judge Alsup’s concerns don’t precisely parallel the potential problems with the class-action settlement I raised in my previous post on the case. But they further highlight how fraught the process of resolving copyright disputes involving AI training could be, especially in class actions, of which numerous prospective cases are pending. Rights holdings are often complex and fragmented, involving multiple parties, and data on ownership and contractual arrangements are often imperfect. Resolving such cases in ways that don’t simply end up enriching the lawyers and class administrators at the expense of the nominal victims, and that actually remove liability from the defendants, will be challenging.
Here’s another complication: The Bartz settlement requires Anthropic to delete the original files it downloaded from two shadow libraries and all copies created from them so they cannot be used again, or used for purposes other than AI training. Such destruction of ill-gotten copies of protected works is a remedy available under the Copyright Act and is often imposed in settlements and judgments.
But what does it mean in the context of AI? Deleting the original downloaded files after they already have been used to train a model does not remove them from the model. Whatever the model actually took from the original files used in training becomes part of the model itself. It cannot simply be excised, like a diseased organ, while leaving the rest of the body intact. It is the body, or at least the body as it now exists would not exist without it.
Further, the model will continue to make use of whatever it has incorporated of its training data every time it generates an output. It cannot stop using that material unless it is retrained from scratch on a dataset that excludes the infringing works, which the settlement does not require. Moreover, even Judge Alsup ruled that the use Anthropic’s models made of its training data, irrespective of its provenance, was a fair use based on its transformativeness.
Requiring retraining could also conflict with providing successful plaintiffs with meaningful financial remuneration for the infringement. Training AI models takes time and is hugely expensive, requiring access to vast amounts of computing power for a sustained period, for which most developers must rely on third-party providers.
Defendants would almost certainly insist that the cost of retraining be factored into the overall financial value of any settlement, reducing the funds available to pay claimants.
Artificial intelligence doesn’t just confound traditional notions of copyright and copyright infringement; it confounds the traditional remedies for infringement as well.