When Copyright Is Not the Only Right
The websites of news organizations are among the most popular feeding grounds for AI crawlers looking for fodder to train their models, by virtue of the volume, frequency and timeliness of the content news outlets typically produce. Just as typically, AI companies don’t bother to ask before inviting themselves in, to the consternation and fury of many publishers.
Yet, as a culture of actually paying their tab has begun to take hold among AI developers, news organizations have also become among the most sought-after sources of licensed data. That’s so both because of the virtues mentioned above, and because news organizations typically own or control the copyrights outright on most of what they publish, making licensing from them a (reasonably) straightforward affair.
Or, maybe not so straightforward, at least not within the European Union. The Italian news publisher Gedi, which owns several outlets in that country, announced a deal with OpenAI in September to provide the ChatGPT developer with Italian-language content for its AI models. Late last month, however, the Italian Privacy Guarantor sent a formal notice to Gedi warning it to “be careful about selling personal data contained in the newspaper archive” to the AI company for use in training (h/t Luiza Jarovsky).
While Gedi may own the copyright on its articles, in other words, that’s not the only equity at issue. The Privacy Guarantor cited Article 9 of the EU’s General Data Protection Regulation (GDPR), under which, "It is prohibited to process personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, as well as to process genetic data, biometric data intended to uniquely identify a natural person, data concerning health or data concerning a natural person's sex life or sexual orientation."
Gedi is hardly the only European publisher to sign a licensing deal with AI companies. Le Monde, Der Spiegel, Axel Springer and many others have also gotten with the program. The warning to Gedi does not amount to a formal prohibition of licensing. The privacy authority states, “the impact assessment, carried out by the company and transmitted to the Guarantor, does not sufficiently analyze the legal basis by virtue of which the publisher could transfer or license for use by third parties the personal data present in its archive to OpenAI.”
If it were to get to a formal ban, however, and were other EU countries to follow Italy’s lead, news organizations could find themselves effectively shut out of the data licensing market. And with the EU moving toward including AI training under the bloc’s text-and-data mining (TDM) exception to copyright, they would be without a clear path to profiting from the biggest digital use case of their content since social media.
A Collective Reigns in Spain
Italy is not the only EU country where regulators could reshape the market for licensing content for use in AI. The Spanish government is set to debate a draft royal decree meant to facilitate the collective licensing of copyrighted works by collective management organizations (CMOs) for training AI models. The draft decree cites Article 12 of the EU Copyright Directive that allows for “extended collective licensing” (ECL) of works under certain conditions.
ECL refers to permitting CMOs to grant non-exclusive licenses to their entire repertoire for certain purposes, even if some members have not explicitly granted the organization the right to license their works for those purposes. It also allows for the granting of licenses for all works within a given category, whether formally represented by the CMO or not, if the CMO is deemed sufficiently representative of the category as a whole. The royal decree would establish AI training as one of the purposes to which ECL could apply.
I have no earthly idea how the Spanish political system operates, or what the legal status is of a royal decree. But if the proposal were to become law there, it would be yet another precedent other EU countries might follow and open the door more widely to collective management of AI training rights.
Announcing RightsTech@DEW 2025
RightsTech will host a special AI-focused track at the Digital Entertainment World conference in Los Angeles in February. Topics will include the buying & selling of data for use in AI; opting-in and opting-out of AI training; deepfakes & identity management; the role of collective rights management in the age of AI; and more. Reach out if you’re interested in speaking or (especially!) sponsoring.
ICYMI
AI companies coming around on licensing? It might be just lip service in a public forum, but both OpenAI’s Sam Altman and Google’s Sundar Pichai made positive-sounding noises at this week’s New York Times DealBook Summit in New York about (eventually) embracing some measure of content licensing for use in their AI systems. “I think we do need a new deal, standard, protocol, whatever you want to call it, for how creators are going to get rewarded,” Altman said. “We need to have new economic models where creators can earn revenue streams.” One example he offered: compensating moderator Andrew Ross Sorkin if someone wanted to create something in his “style” with AI. Pichai was more definitive. “I do think people will develop [economic] models around it,” he said. “There will be a marketplace in the future, I think. There will be creators who create for AI models and get paid for it. I really think that’s part of the future and people will figure it out.”
Whither TikTok? The DC Circuit Court of Appeals on Friday upheld the controversial law forcing ByteDance to sell TikTok by January 19th or see it banned in the U.S. A unanimous court rejected TikTok’s First Amendment challenge to the law. Absent an emergency stay from the Supreme Court, the ban will now (likely) go into effect, although what that actually means or how it would happen is unclear. The deadline also comes one day before Donald Trump returns to the Oval Office. During the campaign he came out in support of TikTok after previously supporting the ban. Whether he would, or could, try to reverse the ban after taking office is similarly unclear. He says a lot of contradictory things about a lot of things, so you never know.
Friendly Skies I’m on Bluesky now (@dcpaul.bsky.social). Haven’t had much to say, yet. Still learning my way around. But it seems nicer than that other place.