Mistral AI Releases Mistral Large 3, a 675B Open-Weight MoE
Summary
Mistral AI released Mistral Large 3, a 675-billion-parameter mixture-of-experts model that activates 41 billion parameters per forward pass, under Apache 2.0, making it one of the largest fully permissive open-weight models available. With native vision capabilities, a 256K-token context window, multilingual support, and an LMSYS Chatbot Arena Elo of 1418, it positioned the French startup as a serious competitor at the MoE frontier. The model was trained on 3,000 Nvidia H200 GPUs.
What Happened
Released on December 2, 2025 — one day after DeepSeek V3.2 — Mistral Large 3 entered a dense competitive window for open-weight frontier releases. The 675B total / 41B active MoE architecture incorporated a native vision pathway for image understanding alongside text, a 256K token context window, and multilingual instruction following across European and Asian languages.
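To make the total-versus-active distinction concrete, here is a minimal sketch of top-k expert routing, the standard sparse MoE mechanism: a router scores every expert for each token, but only the top few actually run, which is how a 675B-parameter model can touch only about 41B parameters (roughly 6% of its weights) per forward pass. The expert count, top-k value, and hidden size below are illustrative assumptions, not published Mistral Large 3 hyperparameters.

```python
import torch
import torch.nn.functional as F

# Illustrative assumptions only: Mistral has not published Large 3's
# expert count, routing top-k, or hidden size. These are toy values.
NUM_EXPERTS = 64   # assumed experts per MoE layer
TOP_K = 4          # assumed experts activated per token
D_MODEL = 1024     # toy hidden size

class ToyMoELayer(torch.nn.Module):
    """Top-k sparse MoE layer: a router scores all experts per token,
    but only TOP_K of NUM_EXPERTS actually run. This sparsity is how a
    675B-total model can activate only ~41B parameters per forward pass."""

    def __init__(self):
        super().__init__()
        self.router = torch.nn.Linear(D_MODEL, NUM_EXPERTS)
        self.experts = torch.nn.ModuleList(
            [torch.nn.Linear(D_MODEL, D_MODEL) for _ in range(NUM_EXPERTS)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, D_MODEL)
        scores = self.router(x)                  # (num_tokens, NUM_EXPERTS)
        top_scores, top_idx = scores.topk(TOP_K, dim=-1)
        weights = F.softmax(top_scores, dim=-1)  # renormalize over chosen experts
        out = torch.zeros_like(x)
        for t in range(x.shape[0]):              # naive per-token dispatch
            for w, e in zip(weights[t], top_idx[t]):
                out[t] += w * self.experts[int(e)](x[t])
        return out

layer = ToyMoELayer()
tokens = torch.randn(8, D_MODEL)
print(layer(tokens).shape)  # torch.Size([8, 1024])
```

Production MoE implementations batch tokens per expert rather than looping, but the accounting is the same: only the top-k expert blocks contribute compute per token, so the model runs with roughly 41/675, about 6%, of its weights active on any given forward pass.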
Mistral licensed the model under Apache 2.0, with no distillation restrictions or commercial-use carve-outs, the most permissive terms available for a model at this scale. On the LMSYS Chatbot Arena, the model achieved an Elo rating of 1418, placing it among the top-ranked publicly accessible models. Mistral reported a training cluster of 3,000 Nvidia H200 GPUs, a notable degree of transparency into the compute investment behind a major open release.
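Since the headline capability number is an Arena Elo of 1418, a quick worked example shows what such a rating means in head-to-head terms. This sketch assumes the standard logistic Elo formula with a 400-point scale, the convention Arena-style leaderboards are built on; the 1380-rated opponent is hypothetical, not a real leaderboard entry.

```python
def elo_expected_score(r_a: float, r_b: float) -> float:
    """Standard logistic Elo: probability that model A is preferred
    over model B in a single pairwise comparison."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

# The reported 1418 rating vs. a hypothetical 1380-rated rival:
print(f"{elo_expected_score(1418, 1380):.1%}")  # -> 55.4%
```

A 38-point gap works out to only about a 55/45 preference split, which is why small Elo differences near the top of the leaderboard reflect genuinely close head-to-head matchups.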
The same announcement included a refreshed set of smaller Mistral models, reinforcing the company's strategy of maintaining both edge-deployable and frontier-class open offerings.
Why It Matters
Mistral Large 3 was significant both for its scale and its timing. Released the day after DeepSeek V3.2, it contributed to a concentrated December 2025 moment in which open-weight frontier models shipped at a pace matching or exceeding proprietary API product cycles: two frontier MoE models with hundreds of billions of parameters, one MIT-licensed and one Apache-licensed, released on consecutive days by labs in China and France.
For the open-vs-closed debate, Mistral Large 3 demonstrated that fully permissive open-weight release was now viable at frontier scale from a commercially funded European lab. The Apache 2.0 license, with no restrictions on derivative works, distillation, or commercial deployment, represented a commitment to genuine openness that distinguished Mistral from larger labs releasing weights with usage restrictions. The LMSYS Elo placement confirmed that this openness came with competitive capability.