DeepSeek releases ‘sparse attention’ model that cuts API costs in half
Researchers at DeepSeek on Monday released a new experimental model called V3.2-exp, designed to dramatically lower inference costs in long-context operations. DeepSeek announced the model in a post on Hugging Face and published a linked academic paper on GitHub. The most important feature of the new model is called DeepSeek Sparse Attention, […]









