WASHINGTON, DC — It’s election day here, and downtown shops and restaurants have boarded up their doors and windows, while riot fencing has gone up around the White House, the Capitol, and the vice president’s official residence. Construction crews near Howard University, where Kamala Harris will be watching the returns later tonight, have been told to stop work and remove from their sites any tools and objects that could be used as projectiles.
Welcome to the Now Normal here in the seat of the world’s self-proclaimed Oldest Democracy™.
On second thought, let’s talk about anything else. Across the pond, the U.K. government is sending conflicting signals about the use of copyrighted works to train AI models.
Prime Minister Keir Starmer last week wrote an op-ed that ran in the Guardian calling journalism “the life blood of British democracy” (which, depending on your definition, is actually older than the World’s Oldest), and vowing to protect those who produce and publish it from threats both political and technological. Of the latter, he wrote, “We recognise the basic principle that publishers should have control over and seek payment for their work, including when thinking about the role of AI.”
Two weeks earlier, however, the Financial Times reported that Starmer’s government is preparing to launch a consultation on adopting a regime that would allow AI companies to scrape content freely from the internet, including journalistic content, unless the publisher affirmatively opts out of allowing their content to be used in AI training.
The venerable BBC joined other news organizations in calling the proposal “an existential threat” to that very same life blood. “It’s critical that publishers and media companies retain control over how their content is used when it comes to AI,” the broadcaster said in a statement. “The onus should remain on AI developers to seek permission for use of content, not on publishers to opt out.”
The impetus for the consultation, according to the FT report, comes from a desire on the part of some ministers to align U.K. policy, more or less, with that of the European Union, of which the U.K. is no longer a part, to maintain Britain’s technological prowess amid geo-economic competition.
As discussed here in a previous post, the EU’s Digital Single Market directive allows researchers to conduct text and data mining (TDM) on lawfully obtained copyrighted works without seeking permission from the copyright owner, unless the rights owner affirmatively opts out of allowing their content to be used. There remains some dispute as to whether the TDM exception applies to AI training. But as I noted in the earlier post, the text of the EU AI Act suggests that it does, and the one European court to address the issue so far seemed to imply that it does. Many media organizations also seemed to expect that it would apply, proactively issuing general opt-out statements before the AI Act came into force and taking technical measures to limit scraping of their websites.
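In practice, the technical measures publishers have taken mostly amount to crawler directives in a site’s robots.txt file. A minimal, illustrative sketch follows; the user-agent tokens are ones their operators have publicly documented (OpenAI’s GPTBot, Common Crawl’s CCBot, Google’s Google-Extended), though whether any crawler honors them is entirely up to the crawler.

```
# Illustrative robots.txt for a news site opting out of AI-training crawlers
# while still permitting ordinary visitors and search indexing.

User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
```

Note that this is exactly the dynamic the BBC objects to: the directives are voluntary on the crawler’s side, and the burden of maintaining them falls on the publisher.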
If the U.K. proposal ultimately is adopted, and the EU TDM exception is confirmed to apply to AI training by commercial AI developers, it would lend considerable momentum toward an emerging opt-out global standard for AI training. The one big holdout for now is the U.S.
The U.S. Congress has yet to weigh in on the question. The fair use standard in U.S. copyright law creates a legal framework around the issue that differs from its European counterparts, and nearly three dozen infringement lawsuits have been filed in U.S. courts that will test the application of the fair use standard to AI training.
The U.S. Copyright Office has also been studying issues around AI for going on two years, and its promised report on its findings has been delayed from its self-imposed timetable. In a somewhat defensively worded letter sent last week to the chair and ranking member of the House Administration Committee, Register of Copyrights Shira Perlmutter offered a new (but not guaranteed) timeline for delivering the report by the end of calendar 2024.
Both courts and Congress tend to be deferential to the USCO on questions related to the applicability of various copyright exceptions, and it is likely they will wait to read what it says before weighing in. But even if the U.S. ultimately decides that AI training is presumptively infringing unless authorized by the copyright owner — as many copyright scholars expect it will — as a practical matter the U.S. could find itself bucking a de facto global standard of placing the burden to affirmatively opt out on creators and rights owners.
As most of the leading AI developers are based, for now at least, in the U.S., that tension could have significant geo-political and geo-economic repercussions. And it could leave both AI developers and creators trying to serve two masters.