Articles from Cerebras
Cerebras today announced the launch of Qwen3-32B, one of the most advanced open-weight models in the world, now available on the Cerebras Inference Platform. Developed by Alibaba, Qwen3-32B rivals the performance of leading models such as GPT-4.1 and DeepSeek R1, and now, for the first time, it runs on Cerebras with real-time responsiveness.
By Cerebras · Via Business Wire · May 15, 2025
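
For developers who want to try the model, a minimal sketch of an API call is shown below. The cerebras-cloud-sdk client and the "qwen-3-32b" model identifier are assumptions for illustration, not details given in the announcement; the official Cerebras Inference documentation is authoritative for the exact package and model names.

```python
import os

# Assumption: the cerebras-cloud-sdk package and the "qwen-3-32b" model ID are
# illustrative; check the Cerebras Inference docs for the exact names.
from cerebras.cloud.sdk import Cerebras

client = Cerebras(api_key=os.environ["CEREBRAS_API_KEY"])

response = client.chat.completions.create(
    model="qwen-3-32b",  # assumed identifier for Qwen3-32B on Cerebras
    messages=[
        {"role": "user", "content": "Summarize the benefits of wafer-scale inference."},
    ],
)

print(response.choices[0].message.content)
```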

Cerebras and Hugging Face today announced a new partnership to bring Cerebras Inference to the Hugging Face platform. Hugging Face has integrated Cerebras into the Hugging Face Hub, bringing the world’s fastest inference to the platform’s more than five million developers. Cerebras Inference runs the industry’s most popular models at more than 2,000 tokens/s – 70x faster than leading GPU solutions. Cerebras Inference models, including Llama 3.3 70B, will be available to Hugging Face developers, enabling seamless API access to Cerebras CS-3 powered AI models.
By Cerebras · Via Business Wire · March 11, 2025
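
As a minimal sketch of how a Hugging Face developer might route a chat completion to Cerebras through this integration: the provider="cerebras" argument and the meta-llama/Llama-3.3-70B-Instruct repo ID below are assumptions based on the huggingface_hub InferenceClient, not details given in the announcement, so consult the Hugging Face inference providers documentation for the supported names.

```python
import os

# Assumption: the "cerebras" provider name and the Llama 3.3 70B repo ID are
# illustrative; see the Hugging Face inference providers docs for exact values.
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="cerebras",             # assumed provider name on the Hugging Face Hub
    api_key=os.environ["HF_TOKEN"],  # Hugging Face access token
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Explain speculative decoding in one paragraph."}],
)

print(completion.choices[0].message.content)
```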