Inference

Tensormesh raises $4.5M to squeeze more inference out of AI server loads

With the AI infrastructure push reaching staggering proportions, there’s more pressure than ever on AI companies to squeeze as much inference as possible out of the GPUs they have. And for researchers with expertise in a particular technique, it’s a great time to raise funding. That’s part of the driving force behind Tensormesh, launching out of stealth this […]

DeepSeek releases ‘sparse attention’ model that cuts API costs in half

Researchers at DeepSeek on Monday released a new experimental model called V3.2-exp, designed to dramatically lower inference costs in long-context operations. DeepSeek announced the model in a post on Hugging Face and posted a linked academic paper on GitHub. The most important feature of the new model is called DeepSeek Sparse Attention, […]
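
DeepSeek’s exact mechanism is described in its paper; as rough intuition, sparse attention lets each query attend to a small, selected subset of keys instead of every token in the context, so the expensive softmax-and-mix step scales with the selection budget rather than the full context length. Below is a minimal, generic top-k sketch of the idea; it is not DeepSeek’s implementation, and the function name and `k_top` budget are illustrative assumptions.

```python
# Illustrative sketch of generic top-k sparse attention -- NOT DeepSeek's
# actual DSA mechanism. Shapes and the k_top budget are assumptions.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, k_top=64):
    """For each query, attend only to its k_top highest-scoring keys.

    q: (n_queries, d); k, v: (n_keys, d). The softmax and value mix
    touch only k_top keys per query instead of the full context.
    """
    d = q.shape[-1]
    scores = q @ k.T / d**0.5                  # (n_queries, n_keys)
    k_top = min(k_top, k.shape[0])
    top_vals, top_idx = scores.topk(k_top, dim=-1)
    weights = F.softmax(top_vals, dim=-1)      # softmax over selected keys only
    return torch.einsum("qk,qkd->qd", weights, v[top_idx])
```

Note that this naive version still scores every query-key pair before selecting; production systems also cut the scoring cost, typically with a cheap auxiliary index over the keys.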

Clarifai’s new reasoning engine makes AI models faster and less expensive

On Thursday, the AI platform Clarifai announced a new reasoning engine that it claims will make running AI models twice as fast and 40% less expensive. Designed to be adaptable to a variety of models and cloud hosts, the system employs a range of optimizations to get more inference power out of the same hardware.

Hugging Face makes it easier for devs to run AI models on third-party clouds

AI dev platform Hugging Face has partnered with third-party cloud vendors including SambaNova to launch Inference Providers, a feature designed to make it easier for devs on Hugging Face to run AI models using the infrastructure of their choice. Other partners involved with the new effort include Fal, Replicate, and Together AI. Hugging Face says […]
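
On the developer side, routing a request through a chosen provider looks roughly like the sketch below, assuming a recent `huggingface_hub` release with Inference Providers support; the model ID and token are placeholders.

```python
# Minimal sketch of an Inference Providers call via the huggingface_hub
# client. Assumes a recent huggingface_hub release with provider support;
# the model ID and token are placeholders.
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="sambanova",  # or e.g. "replicate", "together", "fal-ai"
    token="hf_xxx",        # placeholder Hugging Face access token
)

response = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model ID
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```

The point of the abstraction is that swapping the `provider` argument reroutes the same call to different infrastructure without changing the rest of the code.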
