Trans-Atlantic Antitrust Regulators Try to Get on Same AI Page
The heads of the chief antitrust authorities in the U.S., U.K. and the European Union this week issued a joint statement outlining an agreement for a cooperative approach to ensuring “effective competition and the fair and honest treatment of consumers and businesses” by AI developers, with a particular eye on the incumbent technology giants with significant market power in existing digital markets.
While the top competition cops — U.S. Justice Department antitrust chief Jonathan Kanter, U.K. Competition and Markets Authority CEO Sarah Cardell, EU Competition Commissioner Margrethe Vestager, and Federal Trade Commission chair Lina Khan — stressed that specific actions and decisions will remain sovereign and independent in each jurisdiction, the statement emphasized that if the risks outlined in the agreement were to materialize, “they will likely do so in a way that does not respect international boundaries.” Thus, the need for a coordinated approach.
Notably, among the risks identified in the announcement is the potential for dominant AI developers to exercise monopsony power over content creators and rights owners in the market for training data.
“For content creators, choice among buyers could limit the exercise of monopsony power that can harm the free flow of information in the marketplace of ideas,” according to the statement.
I’ve noted here before comments by Kanter and by Khan expressing concerns over the danger that monopsony power among dominant tech companies could pose to creators.
"What incentive will tomorrow's writers, creators, journalists, thinkers, and artists have if AI has the ability to extract their ingenuity without appropriate compensation?" Kanter asked, rhetorically, at a conference at Stanford University in June. "Absent competition to adequately compensate creators for their works, AI companies could exploit monopsony power on levels that we have never seen before, with devastating consequences.”
Khan has also fought to put competition law at the center of the discussion around the use of creative works to train generative AI models.
“Conduct that may violate the copyright laws––such as training an AI tool on protected expression without the creator’s consent or selling output generated from such an AI tool, including by mimicking the creator’s writing style, vocal or instrumental performance, or likeness—may also constitute an unfair method of competition or an unfair or deceptive practice,” the FTC wrote in comments filed with the U.S. Copyright Office last year. “Many large technology firms possess vast financial resources that enable them to indemnify the users of their generative AI tools or obtain exclusive licenses to copyrighted (or otherwise proprietary) training data, potentially further entrenching the market power of these dominant firms.”
The trans-Atlantic agreement now elevates those concerns to the international stage.
Other notable call-outs in the statement: It warns that “Any claims that interoperability requires sacrifices to privacy and security will be closely scrutinized,” and cautions that “Firms that deceptively or unfairly use consumer data to train their models can undermine people’s privacy, security, and autonomy.”
The former seems pretty clearly aimed at Apple, which announced earlier this month that it would not enable its new Apple Intelligence AI capabilities on iPhones sold within the EU, citing privacy and security risks from the platform interoperability requirements in last year’s Digital Markets Act.
The latter is likely a reference to Meta, which announced it will not make the latest version of its Llama foundation model available in the EU because it was trained in part on data from Facebook and Instagram users, which could run afoul of the EU’s General Data Protection Regulation (GDPR).
By signing onto the accord, DOJ and FTC are signaling they are eager to steer U.S. AI regulation in the direction of the stricter European approach even in the absence of comprehensive legislation by Congress.
ICYMI
Small World
I’ve alluded to the possibility before, but now it appears to be really happening: generative AI models are getting smaller. Google and Nvidia are the latest to roll out slimmed-down versions of their flagship models, but it’s becoming a genuine trend, with small language models coming from OpenAI, Microsoft, Mistral, Falcon and others. The slimmed-down models are intended to run faster and more efficiently than big foundation models, requiring less computing power and less data to train on. Some users are creating their own small models tailored to specific uses and often trained on proprietary data. Driving the trend is enterprises’ need to make use of AI without investing in large data centers to train and run big models, along with growing concerns about the costs and environmental impact of model training. Smaller models can also run on a wider array of devices, including mobile devices, as Apple is enabling with Apple Intelligence. Small can be beautiful.
Here We Go Again
SAG-AFTRA, which shut down movie and TV production last year with a protracted strike against the Hollywood studios over the use of AI on and off set, has done it again, this time to video game studios. Voice performers walked off the job just after midnight on Friday after the major game studios declined to sign on to a new agreement guaranteeing performers protection against displacement by the technology. “Although agreements have been reached on many issues important to SAG-AFTRA members, the employers refuse to plainly affirm, in clear and enforceable language, that they will protect all performers covered by this contract in their A.I. language,” SAG-AFTRA said in a statement. Studios affected by the work stoppage include Activision Productions Inc., Blindlight LLC, Disney Character Voices Inc., Electronic Arts Productions Inc., Formosa Interactive LLC, Insomniac Games Inc., Llama Productions LLC, Take 2 Productions Inc., VoiceWorks Productions Inc., and WB Games. Talks between the sides on a new contract have been going on for more than a year. Among the main sticking points in the negotiations has been the studios’ demand for AI voice-training rights for dialogue recording sessions.
AI Snake May Start Eating Its Tail
Further to my previous post on the rapid closing off of the open web in response to massive unauthorized data scraping by AI companies to train their models, developers could soon have another training headache to deal with. According to a new paper published in Nature, the problem will worsen as more of the data available on the web is itself produced by AI. “If the training data of most future models are also scraped from the web, then they will inevitably train on data produced by their predecessors,” according to the paper. That could lead to what the researchers call model collapse. “We discover that indiscriminately learning from data produced by other models causes ‘model collapse’—a degenerative process whereby, over time, models forget the true underlying data distribution, even in the absence of a shift in the distribution over time,” the authors write. When trained on “polluted data,” models begin to misperceive reality. “In early model collapse, the model begins losing information about the tails of the distribution,” the authors write. “[I]n late model collapse, the model converges to a distribution that carries little resemblance to the original one, often with substantially reduced variance.” The rush to block scraping of human-produced data will only exacerbate the problem by increasing the incentive to rely on synthetic data for training.
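The dynamic the researchers describe can be illustrated with a toy simulation (not the paper’s actual method or models): repeatedly fit a simple Gaussian “model” to data sampled from the previous generation’s fitted Gaussian. Sampling error compounds across generations, and the fitted variance collapses, echoing the “substantially reduced variance” the authors report. The function name and parameters below are illustrative choices, not from the paper.

```python
import random
import statistics

def simulate_collapse(n_samples=20, generations=1000, seed=0):
    """Toy model-collapse simulation: each generation 'trains' a Gaussian
    on data sampled from the previous generation's fitted Gaussian, so
    estimation error compounds instead of averaging out."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # the "true" underlying data distribution
    fitted_stds = []
    for _ in range(generations):
        # Generation N's training data comes from generation N-1's model
        data = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        mu = statistics.fmean(data)     # refit the model to synthetic data
        sigma = statistics.stdev(data)
        fitted_stds.append(sigma)
    return fitted_stds

stds = simulate_collapse()
print(f"generation 1 std: {stds[0]:.3f}")
print(f"generation 1000 std: {stds[-1]:.6f}")
```

Because the sample standard deviation is a slightly biased, noisy estimate, the fitted spread drifts downward generation after generation: the tails of the distribution disappear first, then the whole distribution narrows toward a point, which is a crude analogue of the early- and late-collapse stages described in the paper.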