Nvidia’s AI Server Promises 10x Speed Boost for Next-Gen Models

Thu Dec 04 2025
Julie Young

Nvidia on Wednesday released new performance data indicating that its latest artificial intelligence server can run next-generation models, including two prominent Chinese models, up to 10 times faster, according to reports. The update arrives as the global AI industry undergoes a significant shift in focus. Nvidia remains the leader in the market for training AI systems, but competition is far stiffer when it comes to deploying those models to millions of users, and rivals such as Advanced Micro Devices and Cerebras have been working to close the gap in this expanding sector. Nvidia’s latest results focus on mixture-of-experts (MoE) models, an increasingly popular AI technique. MoE systems break incoming work into smaller pieces and route each piece to a few specialized “experts” within the model, so only a fraction of the network runs for any given input, which improves efficiency.
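
That routing step is easier to see in code. Below is a minimal sketch of top-k MoE routing in Python with NumPy; the expert count, hidden size, and top-k value are illustrative assumptions, not details of any model mentioned in this article.

```python
import numpy as np

rng = np.random.default_rng(0)

D_MODEL, N_EXPERTS, TOP_K = 16, 8, 2   # assumed toy sizes, not from any real model

# Each "expert" is a small feed-forward weight matrix; the router scores them.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02

def moe_layer(tokens: np.ndarray) -> np.ndarray:
    """Send each token to its top-k experts and blend their outputs."""
    logits = tokens @ router                        # (n_tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]   # indices of the chosen experts
    out = np.zeros_like(tokens)
    for i, token in enumerate(tokens):
        scores = logits[i, top[i]]
        weights = np.exp(scores) / np.exp(scores).sum()   # softmax over the k picks
        for w, e in zip(weights, top[i]):
            out[i] += w * (token @ experts[e])      # only 2 of 8 experts actually run
    return out

tokens = rng.standard_normal((4, D_MODEL))
print(moe_layer(tokens).shape)   # (4, 16): same output shape, far fewer FLOPs than dense
```

Because only a handful of experts fire per token, a sparse model can carry far more total parameters than a dense one at the same per-token compute cost.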

The approach gained significant attention after China’s DeepSeek released a high-performing open-source model in early 2025 that required far less training on Nvidia chips than many rival models. Since then, several major players, including OpenAI, France-based Mistral, and China’s Moonshot AI, have embraced the MoE style. Moonshot released its highly regarded Kimi K2 Thinking model in November, further fueling interest in the technique. As MoE gains traction, Nvidia has been working to show that its hardware remains crucial not only for training these models but also for serving them to users. The company says its latest AI server packs 72 high-performance Nvidia chips linked by exceptionally fast data connections. Nvidia said this setup ran Moonshot’s Kimi K2 Thinking model 10 times faster than the previous generation of Nvidia servers, and it reported comparable gains on DeepSeek’s models.

Nvidia attributed these improvements to two key strengths (a rough sketch of how the two compound follows this list):

  • The large number of chips it can assemble into a single system
  • The high-speed connections between those chips

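These two strengths compound in MoE serving: when experts are spread across many chips, every inference step triggers a bulk exchange of token activations between chips, and the time that exchange takes is set largely by link bandwidth. The back-of-the-envelope Python sketch below illustrates the effect; every number in it is an illustrative assumption, not a measured or vendor-supplied figure.

```python
# Every number below is an illustrative assumption, not a measured figure.
TOKENS_PER_STEP = 4096      # assumed tokens in flight per inference step
HIDDEN_BYTES = 2 * 7168     # assumed fp16 activations with hidden size 7168
TOP_K = 8                   # assumed experts consulted per token

def dispatch_ms(link_gb_per_s: float) -> float:
    """Milliseconds to shuttle one step's tokens to remote experts and back."""
    payload = TOKENS_PER_STEP * TOP_K * HIDDEN_BYTES * 2   # out and back
    return payload / (link_gb_per_s * 1e9) * 1e3

for name, bandwidth in [("slower interconnect", 64), ("faster interconnect", 900)]:
    print(f"{name}: {dispatch_ms(bandwidth):6.2f} ms per expert dispatch")
```

Under these assumed numbers, roughly a gigabyte moves between chips per step, so a faster interconnect cuts the exchange from double-digit milliseconds to about one, and fitting more chips into a single system keeps that traffic on fast local links rather than slower network hops.
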
Nvidia stated that both areas continue to give it a significant edge over competitors. As Nvidia forges ahead, rivals are making strides as well: AMD is developing a multi-chip AI server akin to Nvidia’s, with a launch planned for next year. On Tuesday, Amazon’s cloud division AWS said it plans to incorporate Nvidia’s “NVLink Fusion” technology into its forthcoming AI chip, Trainium4, according to reports. NVLink is one of Nvidia’s most significant technologies: it provides ultra-fast connections between different types of chips, letting them work together efficiently on large AI workloads. The announcement was made at AWS’s annual cloud conference in Las Vegas.

Nvidia has been working to persuade more chipmakers to adopt NVLink, and the technology is spreading across the industry, with Intel, Qualcomm, and now AWS signing on. AWS said NVLink Fusion will enable the construction of significantly larger AI systems capable of faster communication and synchronized operation, which is crucial for training large AI models that depend on thousands of interconnected machines (a dynamic sketched below). Through the partnership, AWS customers will gain access to what the company calls “AI Factories”: specialized AI infrastructure within their data centers aimed at delivering greater speed, security, and readiness for large-scale AI initiatives.
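
To see why synchronized operation matters for training, consider the gradient exchange at the heart of data-parallel training: every machine must receive the averaged result before the next step can begin. The Python sketch below is a toy, single-process stand-in for that collective operation, with assumed worker and parameter counts.

```python
import numpy as np

rng = np.random.default_rng(1)
N_WORKERS, N_PARAMS = 8, 1_000_000   # assumed counts for illustration

# Each worker computes gradients on its own shard of the training data.
local_grads = [rng.standard_normal(N_PARAMS).astype(np.float32)
               for _ in range(N_WORKERS)]

def all_reduce_mean(grads: list[np.ndarray]) -> np.ndarray:
    """After this exchange, every worker holds the same averaged gradient."""
    return np.mean(grads, axis=0)

synced = all_reduce_mean(local_grads)
# No worker can begin the next step until this exchange finishes, so the
# speed of the links between machines sits directly on the critical path.
print(synced.shape, synced.dtype)   # (1000000,) float32
```

In a real cluster this averaging happens over the interconnect every training step, which is why faster chip-to-chip links translate directly into faster training.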

Julie Young

Julie Young is a Senior Market Reporter and Analyst. She has been covering stock markets for many years.