
DeepSeek Unveils 671B-Parameter Math Model with Advanced Efficiency Features

Published on Apr 30, 2025
Image Credit: Matheus Bertelli

DeepSeek has released a new large-scale AI model, DeepSeek-Prover-V2-671B, on the open-source platform Hugging Face.

The model, which appears to be a successor to last year's Prover-V1.5, has 671 billion parameters. Its weights are distributed in the safetensors file format and support multiple numeric precisions, which is intended to make training and deployment faster and less resource-intensive.

Built on the DeepSeek-V3 architecture with a Mixture-of-Experts (MoE) design, the model has 61 Transformer layers and a hidden dimension of 7,168. Its maximum position embedding is 163,840, giving it a context window long enough to handle lengthy, complex mathematical proofs. The model also uses FP8 quantization to reduce its size and improve inference efficiency.
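To put the FP8 claim in perspective, the rough arithmetic below estimates how much storage the weights alone would need at different precisions. It is a back-of-the-envelope sketch, not a measurement of the published checkpoint: it assumes every one of the 671 billion parameters is stored at the same width, and it ignores quantization scale factors, activations, and the KV cache. The names `PARAM_COUNT`, `BYTES_PER_PARAM`, and `weight_storage_gb` are illustrative only.

```python
# Rough weight-storage estimate at different numeric precisions.
# Assumption: all parameters are stored at a single uniform width.

PARAM_COUNT = 671e9  # total parameter count, taken from the model name

# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"FP32": 4, "BF16": 2, "FP8": 1}


def weight_storage_gb(param_count: float, bytes_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return param_count * bytes_per_param / 1e9


estimates = {
    name: weight_storage_gb(PARAM_COUNT, width)
    for name, width in BYTES_PER_PARAM.items()
}

for name, gigabytes in estimates.items():
    print(f"{name}: ~{gigabytes:,.0f} GB")
```

Under these assumptions, FP8 halves the footprint relative to BF16 and quarters it relative to FP32. Because this is an MoE model, all experts still have to be stored even though only a subset is active for any given token, so the total parameter count is what drives storage.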
