In this post, I introduce the architectural innovations, particularly the Mixture of Experts (MoE) design, that DeepSeek pioneered in their V3-Base model and that laid the foundation for their state-of-the-art reasoning model, R1.