Why Google Gemma 4 is in the spotlight as an open-source AI game changer

Summit Pande. [Photo: Summit Pande LinkedIn page]

Google DeepMind has unveiled Gemma 4, an open-source AI model. The reaction has been broadly positive. Hugging Face CTO Julien Chaumond went as far as saying, "Google is back in the game."

The Gemma series has 400 million downloads and more than 100,000 community-derived models, but it has lagged behind DeepSeek, Qwen and Llama in real-world use. Attention is now on whether Gemma 4 can reverse that trend.

Summit Pande, a data scientist and machine learning engineer, drew notice for rating Gemma 4’s potential highly, saying performance has improved sharply from earlier versions.

Gemma 4 is available in four sizes. E2B (Effective 2B parameters) and E4B (Effective 4B parameters) are edge models that run on devices such as smartphones and Raspberry Pi, and they handle speech recognition directly on-device without going through the cloud. The 26B MoE (Mixture of Experts, 4B active) has 25.2 billion parameters, but only 3.8 billion are activated during inference. The 31B Dense is Gemma 4’s flagship model and is currently ranked third on the overall open-model leaderboard.
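To put the MoE figures in perspective, the short Python sketch below works out what share of the 26B MoE model's parameters is active during inference. It uses only the two numbers quoted above; the function name `active_fraction` is made up for illustration and does not come from any Gemma tooling.

```python
# Figures quoted in the article for the Gemma 4 26B MoE model.
TOTAL_PARAMS_B = 25.2   # total parameters, in billions
ACTIVE_PARAMS_B = 3.8   # parameters activated during inference, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that are active during inference."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share of parameters: {fraction:.1%}")
```

By this arithmetic, roughly 15 percent of the model's parameters are in use for any given inference step, which is why the model is described as "4B active" despite its 25.2 billion total.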

The context window is 128,000 tokens for the edge models and 256,000 tokens for the large models. Pande said it is large enough to put an entire codebase into a single prompt.

Benchmarks show Gemma 4’s performance has improved significantly. On AIME, a math-competition benchmark, Gemma 3 27B scored 20.8 percent, while Gemma 4 31B jumped to 89.2 percent. On Codeforces ELO, a measure of coding ability, the score rose to 2,150 from 110. On GPQA Diamond, a PhD-level science benchmark, Gemma 4 31B scored 84.3 percent. By comparison, human experts in the field score about 65 percent.

Pande said Gemma 4 31B’s GPQA Diamond score is "7 to 8 points lower than Claude Opus 4.6 (91.3 percent) and GPT-5.2 (92.4 percent), but those two are large proprietary models that use tens of billions of parameters." He stressed that Gemma 4 31B runs on a laptop, yet still leads Claude Sonnet 4.6 (74.1 percent), which many developers use daily, by more than 10 points.

Pande singled out the 26B MoE. He said it "scored 82.3 percent on GPQA Diamond while activating only 3.8 billion parameters." He said Kimi K2.5, developed by Chinese AI startup Moonshot AI, "has 32 billion active parameters, eight times more than Gemma 4 26B MoE, and scored 87.6 percent." He added: "It is using eight times more computing for a five-point gap."
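The comparisons Pande draws can be checked with simple arithmetic. The Python sketch below recomputes the point gaps and the active-parameter ratio from the GPQA Diamond scores and parameter counts quoted in this article; the dictionary and the helper name `point_gap` are illustrative and not part of any benchmark tooling.

```python
# GPQA Diamond scores (percent) as quoted in the article.
GPQA_DIAMOND = {
    "Gemma 4 31B": 84.3,
    "Gemma 4 26B MoE": 82.3,
    "Claude Opus 4.6": 91.3,
    "GPT-5.2": 92.4,
    "Claude Sonnet 4.6": 74.1,
    "Kimi K2.5": 87.6,
}

# Active parameters (billions) as quoted in the article.
ACTIVE_PARAMS_B = {"Gemma 4 26B MoE": 3.8, "Kimi K2.5": 32.0}


def point_gap(leader: str, trailer: str) -> float:
    """Return the leader's score minus the trailer's, in percentage points."""
    return round(GPQA_DIAMOND[leader] - GPQA_DIAMOND[trailer], 1)


opus_gap = point_gap("Claude Opus 4.6", "Gemma 4 31B")
gpt_gap = point_gap("GPT-5.2", "Gemma 4 31B")
sonnet_gap = point_gap("Gemma 4 31B", "Claude Sonnet 4.6")
kimi_gap = point_gap("Kimi K2.5", "Gemma 4 26B MoE")
active_ratio = round(
    ACTIVE_PARAMS_B["Kimi K2.5"] / ACTIVE_PARAMS_B["Gemma 4 26B MoE"], 1
)

print(f"Opus 4.6 over Gemma 4 31B:   {opus_gap} points")
print(f"GPT-5.2 over Gemma 4 31B:    {gpt_gap} points")
print(f"Gemma 4 31B over Sonnet 4.6: {sonnet_gap} points")
print(f"Kimi K2.5 over 26B MoE:      {kimi_gap} points")
print(f"Kimi K2.5 active parameters: {active_ratio}x the 26B MoE")
```

The results line up with the quoted claims: gaps of 7.0 and 8.1 points to the two proprietary models, a 10.2-point lead over Claude Sonnet 4.6, and a 5.3-point gap to Kimi K2.5 at about 8.4 times the active parameters.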

Beyond benchmark results, Pande also focused on licensing. Earlier Gemma models were based on Google’s own license, but Gemma 4 uses the Apache 2.0 license. It is the same license used by Kubernetes and TensorFlow.

Pande said, "The Apache 2.0 license has no restrictions on use. It can be used commercially without any reporting obligation. You can fork and fine-tune as you like." He added, "One of the biggest obstacles for startups and companies building AI products has disappeared. You can directly own the model, the data and deployment."

Until now, Chinese models such as DeepSeek have dominated the top ranks of open-source AI leaderboards, while Meta’s Llama and Nvidia’s Nemotron have led on the U.S. side. Pande said, "With Gemma 4 31B ranked third and 26B MoE sixth, Google has also joined the open-source model race in earnest." He added, "With a 16GB RAM laptop, you can run the E4B model right away."
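A back-of-the-envelope estimate suggests why a 16GB laptop could be enough for E4B. The Python sketch below rests on loud assumptions that are not from the article: it treats the "effective 4B" figure as the number of weights held in memory, counts only the weights (no activations, context cache or runtime overhead), and uses generic bytes-per-parameter values for 16-bit and 4-bit storage. The real footprint depends on the runtime and the actual parameter layout.

```python
# Assumption: treat E4B's "effective 4B" as the parameter count held in memory.
E4B_PARAMS = 4_000_000_000
LAPTOP_RAM_GB = 16  # laptop size quoted in the article

# Assumption: generic storage sizes per parameter, not Gemma-specific figures.
BYTES_PER_PARAM = {"16-bit": 2.0, "4-bit": 0.5}


def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Estimate the memory needed for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


estimates = {
    precision: weight_memory_gb(E4B_PARAMS, size)
    for precision, size in BYTES_PER_PARAM.items()
}

for precision, gigabytes in estimates.items():
    share = gigabytes / LAPTOP_RAM_GB
    print(f"{precision} weights: about {gigabytes:.0f} GB ({share:.0%} of {LAPTOP_RAM_GB} GB RAM)")
```

Under these assumptions the weights come to about 8 GB at 16-bit and about 2 GB at 4-bit, both below 16 GB, though the headroom left for the operating system and a long context will vary.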

He also said, "Gemma 4 cannot beat Claude Opus 4.6 or GPT-5.2. But that is not the right comparison. If the question is what is the best model you can use without API costs, without data leaving the device, and without vendor lock-in, Gemma 4 has become a strong candidate."

Chi-gyu Hwang delight@d-today.co.kr
