GEMA, the German music collecting society, this week sued OpenAI in Munich Regional Court over the unlicensed use of GEMA artists’ lyrics to train ChatGPT.
The filing marks the first time a collecting society has brought a legal action against an AI developer over model training. But if GEMA is successful, it could also mark a significant step toward defining the role of collective management organizations in the licensing of copyrighted works for use in AI.
"Our members‘ songs are not free raw material for generative AI systems providers‘ business models,” GEMA CEO Dr. Tobias Holzmüller said in a news release announcing the lawsuit’s filing. “Anyone who wants to use these songs must acquire a license and remunerate the authors fairly."
GEMA has clearly been laying the groundwork for the latest move for some time. For several years it has offered its 95,000 members the option to grant the CMO “graphic rights” to their song lyrics. In 2022 it amended its basic authorization agreement to include text-and-data mining rights, including the right to opt out of TDM in the EU on members’ behalf. Last month, it released details of a proposed generative AI “Licensing Framework,” and earlier this month issued an “AI Charter” outlining 10 “Legal and Ethical Principles for Dealing With Generative AI.”
The complaint against OpenAI also has been carefully tailored to advance a broader strategy.
“[T]he lawsuit is… intended to clarify numerous legal problems associated with the use of AI in general,” GEMA said in an FAQ on the case. “The ‘lyrics lawsuit’ is also representative of other uses such as the generation of sound files by unlicensed services. The associated legal problems are similar. Lyrics have the advantage for a lawsuit that infringements can be clearly established. This is much more difficult with music recordings as there is greater room for interpretation regarding whether a composition constitutes plagiarism.”
GEMA has spent more than two years carefully positioning itself as a vehicle for the collective management of AI rights on behalf of its members, and by extension a potential source of blanket licensing for AI developers. OpenAI will no doubt oppose the lawsuit. But if it loses, and its liability is clearly established, it should welcome the establishment of one-stop shops for the licenses it will need.
For AI Developers, More Is Yielding Less
Bloomberg reports this week that OpenAI, Google and Anthropic are seeing diminishing performance returns from scaling up the latest iterations of their models. That followed an earlier report in The Information, citing sources inside OpenAI, that the quality improvement from GPT-4 to the company’s latest, yet-to-be-released model is smaller, relative to the increase in training cost, than the gain from GPT-3 to GPT-4. That suggests the scaling laws that have ruled AI development over the past several generations have hit a wall.
AI executives, including OpenAI’s Sam Altman, dispute the reports. But those denials are likely aimed at investors who have been fueling the AI arms race and recently pumped another $6.6 billion into Altman’s company, and who expect to see increasing returns on capital.
If there’s truth in the reports, however, one likely explanation is that AI developers have run out of high-quality data to train their models on, having scraped nearly all of the low- and medium-hanging fruit from the internet, and their models are increasingly sensitive to the quality of the data they’re trained on. That’s already tipping the balance of leverage toward the owners of high-quality data repositories — as evident from the increasing number of licensing deals OpenAI and others are signing with leading publishers.
ICYMI
Bundle of Joy Don’t look now, but a music streaming platform may be on the verge of realizing actual GAAP-compliant net income. Spotify posted a boffo Q3, with revenue from its premium tier spiking 24% on a 12% YoY jump in paid subs, and offered a bullish outlook for the next fiscal year. That’s due in no small part to its recently implemented bundling strategy, which has drawn angry denunciations from music publishers and labels and a lawsuit from NMPA. But the strategy has also delivered substantial savings: $99.15 million in reduced royalty payments between March and the end of September, which will no doubt anger music owners further.
Updated Guidance At a Senate oversight hearing this week, Register of Copyrights Shira Perlmutter confirmed that sometime next year, the Copyright Office will issue an update to its 2023 guidance on registering works that incorporate AI-generated content. It will also invite public comments on a proposal for revisions to its basic operating manual, the Compendium of U.S. Copyright Office Practices, including AI-related revisions. Perlmutter also said the office expects, but cannot guarantee, to release the final two sections of its report on its 2023 study of copyright issues related to AI, covering the use of copyrighted content to train AI models and the copyrightability of works produced in whole or in part by AI.
Weekend Reading The EU’s AI Office, created by the AI Act, this week released a first draft of its Code of Practice for providers of general-purpose AI models such as LLM foundation models, including sections related to copyright compliance and data transparency. More on this next week.