AI Goes to Market
Microsoft, Amazon eye the next generation of search
Last week, Amazon Web Services quietly revealed plans to launch a marketplace where publishers can offer their content to technology companies developing AI systems. According to a report by The Information, Amazon circulated slides ahead of an AWS conference that included a reference to the plan in a presentation on its AI developer platforms, including Bedrock and Quick Suite.
The announcement followed on the heels of Microsoft’s somewhat less quiet unveiling of its own AI licensing platform a week earlier, which it dubbed Publishers Content Marketplace (PCM).
Unlike similar content marketplaces such as the recently acquired Human Native, which are largely focused on providing datasets for training AI models, the Microsoft and Amazon offerings are positioned as platforms for securing licensed access to real-time data retrieval to support AI-powered search applications.
In a blog post, Microsoft described PCM as “a solution that gives publishers a new revenue stream, provides AI systems with scaled access to premium content, and delivers better responses for consumers.” The goal, according to the post, is to make “authoritative content [that] lives behind paywalls or within specialized archives,” available to AI developers while providing publishers with “sustainable, transparent ways to govern how their premium content is used and to license it when it makes the most sense.”
Microsoft has lined up a charter group of publishers for the launch, including AP, Business Insider, Condé Nast, Hearst Magazines, People, USA TODAY, and Vox Media (notably missing from the list: the New York Times, still locked in litigation with Microsoft over unlicensed scraping of Times content), and it designated its own Copilot system as the initial buyer in the marketplace.
Amazon’s announcement offered fewer details, but its slides aligned its marketplace with Bedrock, its platform for developing so-called agentic-AI applications, which would include AI-powered search tools. Given how many websites are hosted on AWS servers, Amazon also has a large stable of potential publisher clients to pitch.
I suggested a year ago that real-time, AI-powered search (what folks used to call “retrieval-augmented generation,” or RAG, but now seem to prefer calling “conversational search”) presents a greater potential threat to publishers than traditional search, but could also offer a more clear-cut licensing opportunity for their content than merely selling data to pre-train large language models.
With training, once the tokenized data have been extracted and incorporated into a model’s weights, the dataset has little to no further value to the model. For the data provider, therefore, a deal for training data is effectively a one-and-done proposition, not a recurring revenue stream.
AI-powered search, on the other hand, is…well, search. As with traditional search, it puts a premium on perpetual access to up-to-date, authoritative content. For certain types of publishers (those with the most up-to-date, authoritative content), that need for perpetual access by automated agents could potentially translate into recurring revenue.
As Microsoft put it in its blog post, “The open web was built on an implicit value exchange where publishers made content accessible, and distribution channels - like search - helped people find it. That model does not translate cleanly to an AI-first world, where answers are increasingly delivered in a conversation… As the AI web grows, publishers need sustainable, transparent ways to govern how their premium content is used and to license it when it makes the most sense.”
What’s needed, in other words, is the technical infrastructure to enable those “sustainable, transparent ways to govern” how content is used. That could include a scalable marketplace, robust usage analytics, and some sort of gating mechanism to regulate access, among other things.
Microsoft and Amazon are two of the largest cloud-computing providers on the planet and can certainly deliver scale. They also have the means to generate and report analytics and can no doubt devise a technical means to effectively regulate access to content archives (something like Cloudflare’s managed-robots.txt platform).
What Microsoft and Amazon need is the incentive to go to the trouble of providing the infrastructure. And that incentive is revealed in Microsoft’s blog post: “PCM also provides usage-based reporting, enabling publishers to understand how content has been valued in the past and where it can provide increased value in the future.”
It’s the data. Particularly search-related data. Data on usage and user intent. The kind of data that made Google the undisputed champion of the previous generation of search.
Google did not invent the search engine. But it devised a superior strategy for indexing websites and organizing information. That one weird trick led more people to use Google to find the information they were searching for, and the more people who used Google, the more data Google collected on their interests and intentions, further improving the results it delivered. The more data Google had on consumers’ interests and intentions, the better it also became at targeting advertising to them, which in turn drew more advertisers to use its platform.
That feedback loop ultimately allowed Google to achieve an effective monopoly over both search and online advertising.
Microsoft and Amazon are betting that the next generation of search, powered by AI, will be less about the best algorithm for finding information and more about superior access to premium content to answer queries directly.
By providing access to premium content via their marketplaces, and collecting data on how that content is being used, they’re hoping they will be the ones that end up with the data on consumers’ interests and intent.
Just as Google’s superior algorithm initially drew more users, who in turn drew more advertisers, Microsoft and Amazon are further betting that marketplaces with the best content will attract the most developers looking for premium content, and the more buyers they draw into the market the more sellers they will attract—in this case rightsholders—creating the same sort of feedback loop that powered Google.
As the market makers, Microsoft and Amazon will benefit directly from transactions that happen there, taking a small percentage of whatever payments developers end up making to rightsholders. But the greater treasure could be the mountain of usage data they end up sitting atop.
For all Microsoft’s talk of providing publishers with “sustainable, transparent ways to govern how their premium content is used,” the arrangement’s benefit to publishers is incidental to the strategy. But that doesn’t make it any less real. It’s the price to be paid to ensure the infrastructure gets built so that publishers can benefit at all.

