Nvidia’s rivals and biggest customers are rallying behind an OpenAI-led initiative to build software that would make it easier for developers of artificial intelligence to switch away from its chips.
Silicon Valley-based Nvidia has become the world’s most valuable chipmaker thanks to a near monopoly on the chips needed to create large AI systems, but supply shortages and high prices are pushing customers to seek out alternatives.
Making new AI chips only solves part of the problem, however. While Nvidia is best known for its powerful processors, industry figures say its “secret sauce” is its software platform, Cuda, which enables chips originally designed for graphics to speed up AI applications.
At a time when Nvidia is investing heavily to expand its software platform, rivals including Intel, AMD and Qualcomm are taking aim at Cuda in the hopes of luring away customers — with the eager support of some of the biggest companies in Silicon Valley.
Engineers at Meta, Microsoft and Google are assisting in the development of Triton, software for making code run efficiently on a wide range of AI chips, which OpenAI released in 2021.
Even as they continue to spend billions of dollars on Nvidia’s latest products, big tech companies hope that Triton will help break the stranglehold that the chipmaker has on AI hardware.
“Essentially it breaks the Cuda lock-in,” said Greg Lavender, Intel’s chief technology officer.
Nvidia dominates the market for building and deploying large language models, including the system behind OpenAI’s ChatGPT. That has propelled its valuation to more than $2tn and left rivals Intel and AMD rushing to catch up. Analysts expect Nvidia to report this week that its latest quarterly revenues more than tripled year on year, with profits increasing more than six times over.
But Nvidia’s hardware has only become such a hot commodity because of the accompanying software that it has developed over almost two decades, creating a formidable “moat” that competitors have struggled to overcome.
“What Nvidia does for a living is not [just] build the chip: we build an entire supercomputer, from the chip to the system to the interconnects . . . but very importantly the software,” chief executive Jensen Huang said at its GPU Technology Conference in March. He has described Nvidia’s software as the “operating system” of AI.
Nvidia was founded more than 30 years ago to target video gamers. Its pivot to AI was facilitated by its Cuda software, which the company created in 2006 to enable general-purpose applications to run on its graphics processing units.
Since then, Nvidia has invested billions of dollars to build hundreds of software tools and services to make running AI applications on its GPUs quicker and easier. Nvidia executives say it now hires twice as many software engineers as hardware staff.
“I think people are underestimating what Nvidia has actually built,” said David Katz, a partner at Radical Ventures, an AI-focused investor.
“They built an ecosystem of software around their products which is efficient, easy to use and actually works — and makes very complex things simple,” he added. “It is something that has evolved with a massive community of users over a very long time.”
Nonetheless, the high price of Nvidia’s products and the long queue to buy its most advanced gear, such as the H100 and forthcoming GB200 “super chip”, have driven some of its biggest customers — including Microsoft, Amazon and Meta — to seek out alternatives or develop their own.
However, because most AI systems and applications already run on Nvidia’s Cuda software, it is time-consuming and risky for developers to rewrite them for other processors, such as AMD’s MI300, Intel’s Gaudi 3 or Amazon’s Trainium.
“The thing is, if you want to compete with Nvidia in this space, you not only need to build hardware that is competitive, you also need to make it easy to use,” said Gennady Pekhimenko, chief executive of CentML, a start-up making software to optimise AI tasks, and associate professor of computer science at the University of Toronto. “Nvidia’s chips are really good but, in my opinion, their biggest advantage is on the software side.”
Rival products such as Google’s TPU AI chips might offer comparable performance in benchmark tests but “convenience and software support makes a big difference” in Nvidia’s favour, according to Pekhimenko.
Nvidia executives argue that its software work makes it possible to deploy a new AI model on its latest chips in “seconds” and offers continuous efficiency improvements. But those benefits come with a catch.
“We are seeing a lot of Cuda lock-in in the [AI] ecosystem, which makes it very difficult to utilise non-Nvidia hardware,” said Meryem Arik, co-founder of TitanML, a London-based AI start-up. TitanML started out using Cuda but last year’s GPU shortages prompted it to rewrite its products in Triton. Arik said this helped TitanML win new customers who wanted to avoid what she called the “Cuda tax”.
Triton, whose co-creator Philippe Tillet was hired by OpenAI in 2019, is open source, meaning anyone can view, adapt or improve its code. Proponents argue that gives Triton an inherent appeal to developers over Cuda, which is proprietary to Nvidia. Triton started out working with Nvidia’s GPUs only but it now supports AMD’s MI300, with support for Intel’s Gaudi and other accelerator chips planned soon.
Meta, for example, has put Triton software at the heart of its in-house-developed AI chip, MTIA. When Meta released the second generation of MTIA last month, its engineers said that Triton was “highly efficient” and “sufficiently hardware-agnostic” to work with a range of chip architectures.
Developers at OpenAI’s rivals such as Anthropic — and even Nvidia itself — have also worked on improving Triton, according to logs on GitHub and talks about the toolkit.
Triton is not the only attempt to challenge Nvidia’s software advantage. Intel, Google, Arm and Qualcomm are among the members of the UXL Foundation, an industry alliance developing a Cuda alternative based on Intel’s open-source OneAPI platform.
Chris Lattner, a prominent former senior engineer at Apple, Tesla and Google, has launched Mojo, a programming language for AI developers whose pitch includes: “No Cuda required”. Only a small minority of the world’s software developers know how to code with Cuda and it is hard to learn, he argues. With his start-up Modular, Lattner hopes Mojo will make it “dramatically easier” to build AI for “developers of all kinds — not just the elite experts at the biggest AI companies”.
“Today’s AI software is built using software languages from over 15 years ago, akin to using a BlackBerry today,” he said.
Even if Triton or Mojo proves competitive, it will still take Nvidia’s rivals years to catch up to Cuda’s head start. Analysts at Citi recently estimated that Nvidia’s share of the generative AI chip market will decline from about 81 per cent next year to about 63 per cent by 2030, suggesting it will remain dominant for many years to come.
“Building a competitive chip against Nvidia is a hard problem but an easier problem than building the entire software stack and making people start using it,” said Pekhimenko.
Intel’s Lavender remains optimistic. “The software ecosystem will evolve,” he said. “I think the playing field will level.”
Additional reporting by Camilla Hodgson in London and Michael Acton in San Francisco