Etched is building an AI chip that only runs one type of model



As generative AI touches a growing number of industries, the companies producing chips to run the models are benefitting enormously. Nvidia in particular, which commands an estimated 70% to 95% of the market for AI chips, wields massive influence. Cloud providers from Meta to Microsoft are spending billions of dollars on Nvidia GPUs, wary of falling behind in generative AI.

Generative AI vendors aren’t pleased with the status quo for understandable reasons. A large portion of their success hinges on the whims of the dominant chipmakers. And so they, along with opportunist VCs, are on the hunt for promising upstarts to challenge the AI chip incumbents.

Etched is among the many, many alternative chip companies vying for a seat at the table — but it’s also among the most intriguing. Only two years old, Etched was founded by a pair of Harvard dropouts, Gavin Uberti (ex-OctoML and ex-Xnor.ai) and Chris Zhu, who along with Robert Wachen and former Cypress Semiconductor CTO Mark Ross sought to create a chip that could do one thing: run AI models.

That’s not unusual. Plenty of startups and tech giants have developed, or are developing, chips that exclusively run AI models, also known as inferencing chips. Meta has MTIA, Amazon has Inferentia, and so on. But Etched’s chips are unique in that they only run a single type of model: transformers.

The transformer, proposed by a team of Google researchers back in 2017, has become the dominant generative AI model architecture by far.

Transformers underpin OpenAI’s video-generating model Sora. They’re at the heart of text-generating models like Anthropic’s Claude and Google’s Gemini. And they power art generators such as the newest version of Stable Diffusion.

“In 2022, we made a bet that transformers would take over the world,” Uberti, Etched’s CEO, told TechCrunch in an interview. “We’ve hit a point in the evolution of AI where specialized chips that can perform better than general-purpose GPUs are inevitable — and the technical decision-makers of the world know this.”

Etched’s chip, called Sohu, is an ASIC (application-specific integrated circuit) — a chip tailored for a particular application, in this case running transformers. Manufactured using TSMC’s 4nm process, Sohu can deliver dramatically better inferencing performance than GPUs and other general-purpose AI chips while drawing less energy, claims Uberti.

“Sohu is an order of magnitude faster and cheaper than even Nvidia’s next generation of Blackwell GB200 GPUs when running text, image and video transformers,” Uberti said. “One Sohu server replaces 160 H100 GPUs … Sohu will be a more affordable, efficient and environmentally-friendly option for business leaders that need specialized chips.”

How does Sohu achieve all this? In a few ways, but the most obvious — and intuitive — is a streamlined inferencing hardware-and-software pipeline. Because Sohu doesn’t run non-transformer models, the Etched team was able to do away with hardware components not relevant to transformers while trimming the software overhead traditionally used to deploy and run non-transformers.

A graph from Etched comparing hardware performance running Meta’s open model Llama 70B.
Image Credits: Etched

Etched is arriving on the scene at an inflection point in the race for generative AI infrastructure. Beyond cost concerns, the GPUs and other hardware components necessary to run models at scale today are dangerously power-hungry.

Goldman Sachs predicts that AI is poised to drive a 160% increase in data center electricity demand by 2030, contributing to a significant uptick in greenhouse gas emissions. Researchers at UC Riverside, meanwhile, estimate that global AI usage could cause data centers to suck up 1.1 trillion to 1.7 trillion gallons of fresh water by 2027, impacting local resources. (Many data centers use water to cool servers.)

Uberti optimistically — or bombastically, depending on how you interpret it — pitches Sohu as the solution to the industry’s consumption problem.

“In short, our future customers won’t be able to afford not to switch to Sohu,” Uberti said. “Companies are willing to take a bet on Etched because speed and cost are existential to the AI products they are trying to build.”

But can Etched, assuming the company meets its goal of bringing Sohu to the mass market in the next few months, succeed when so many others are following close behind it?

While Etched lacks a direct competitor at present, AI chip startup Perceive recently previewed a processor with hardware acceleration for transformers. Groq has also invested heavily in transformer-specific optimizations for its ASIC.

Competition aside, what if transformers one day fall out of favor? Uberti says that, in that case, Etched will do the obvious: design a new chip. Fair enough. But that’s a pretty drastic fallback, considering how long it’s taken to bring Sohu to fruition.

None of these concerns have dissuaded investors from pouring an enormous amount of money into Etched.

Today, Etched announced that it closed a $120 million Series A funding round co-led by Primary Venture Partners and Positive Sum Ventures. The round, which brings Etched’s total raised to $125.36 million, had participation from heavyweight angel backers including Peter Thiel (Uberti, Zhu and Wachen are Thiel Fellowship alums), GitHub CEO Thomas Dohmke, Cruise (and the Bot Company) co-founder Kyle Vogt and Quora co-founder Charlie Cheever.

These investors presumably believe that Etched has a reasonable chance at successfully scaling up its business of selling servers. And perhaps it does — Uberti claims that unnamed customers have reserved “tens of millions of dollars” in hardware so far. The forthcoming launch of the Sohu Developer Cloud, which will let customers preview Sohu via an online interactive playground, should drive additional sales, Uberti suggested.

It still seems too early to tell, though, whether this will be enough to propel Etched and its 35-person team into the future the company’s co-founders are envisioning. The AI chip segment can be unforgiving in the best of times — see the high-profile near-failures of AI chip startups like Mythic and Graphcore, and, relatedly, plunging funding for AI chip ventures in 2023.

Uberti makes a strong sales pitch, though: “Video generation, audio to audio modalities, robotics and other future AI use cases will only be possible with a faster chip like Sohu. The entire future of AI technology will be shaped by whether the infrastructure can scale.”



