Google Launches Gemma 4: Scaling Open-Weight AI from Smartphones to Workstations

Google has officially pulled back the curtain on Gemma 4, the latest iteration of its open-weight model family. This 2026 release marks a significant departure from previous generations, primarily through a pivot to the highly permissive Apache 2.0 license – a move that Hugging Face CEO Clément Delangue described as a “huge milestone” for the open AI ecosystem.

Here is a breakdown of the new lineup and its technical milestones:

The Gemma 4 Lineup: Four Sizes, One Goal

The new family is built on the same “frontier-class” research as Gemini 3 and is categorized into two distinct groups based on their deployment targets:

  • The Edge Models (E2B & E4B): Designed for on-device inference, the Effective 2B and Effective 4B variants are optimized for high-performance use on smartphones, Raspberry Pi, and Jetson Nano hardware. These were developed in deep collaboration with the Google Pixel, Qualcomm, and MediaTek teams.
  • The Powerhouses (26B MoE & 31B Dense): Aimed at developers and researchers, these models are built for offline use on consumer GPUs and workstations. The 26B Mixture-of-Experts (MoE) model uses only 4B active parameters per forward pass, providing high-tier quality at a fraction of the compute cost, while the 31B Dense model currently ranks #3 globally among all open models on the Arena AI leaderboard.

Technical Breakthroughs

Gemma 4 isn’t just a size upgrade; it’s a multimodal leap. Key features include:

  • Native Multimodality: All four models can natively process text, images, and video. The E2B and E4B models go a step further by supporting native audio input for seamless speech recognition.
  • Massive Context Windows: The edge models feature a 128K token window, while the 26B and 31B variants support up to 256K tokens, allowing for the processing of entire code repositories or long-form documents in a single prompt.
  • Agentic Readiness: Google has prioritized “multi-step reasoning” and native function-calling, enabling developers to build autonomous agents that can generate structured JSON outputs and interact with third-party APIs reliably.
  • Global Reach: The models were trained on a dataset spanning over 140 languages, ensuring high-level performance for a global developer base.
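To make the function-calling point above concrete, here is a minimal sketch of how a developer might route a model's structured JSON output to a tool. The `get_weather` schema and the sample reply are invented for illustration; Gemma 4's exact function-calling conventions are an assumption here, not drawn from Google's documentation:

```python
import json

# Hypothetical tool schema in the widely used JSON-schema style for
# function calling; the exact format Gemma 4 expects may differ.
weather_tool = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# A structured JSON reply of the kind a function-calling model emits.
model_reply = '{"tool": "get_weather", "arguments": {"city": "Sydney"}}'

call = json.loads(model_reply)               # parse the model's output
assert call["tool"] == weather_tool["name"]  # route to the right tool
print(call["arguments"])                     # → {'city': 'Sydney'}
```

The reliability claim in the bullet above is what makes this loop practical: if the model's output is guaranteed to be valid JSON matching the declared schema, the `json.loads` step never needs a retry path.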

Efficiency and Performance

For Android developers, the Gemma 4 rollout is a battery-life win. The new Effective 2B model runs 3x faster than the E4B variant, while the entire edge family is up to 4x faster than Gemma 3. More importantly, these models draw up to 60% less power, making high-end AI features viable for all-day mobile use.

The E2B and E4B models also serve as the architectural foundation for Gemini Nano 4, which is slated to arrive as a system-wide update for Android devices later this year via the AICore Developer Preview.

Availability

Gemma 4 is available immediately for download on Hugging Face, Kaggle, and Ollama. Enterprise users can access the larger 31B and 26B models through Google AI Studio, while edge developers can find the E2B and E4B weights in the AI Edge Gallery.

With over 400 million downloads across the Gemma series to date, this Apache 2.0 release effectively removes the final legal barriers for large-scale enterprise adoption, positioning Google as a dominant force in the “open-weights” landscape.

Ewan Hurst

About Author

Based in Sydney, Australia, Ewan Hurst is a seasoned web designer and digital marketing strategist with over 12 years of industry experience. He specializes in crafting compelling online experiences that blend creative design with data-driven growth strategies. When he isn’t shaping digital landscapes, Ewan finds inspiration through traditional artistry. An avid painter and dedicated reader of novels, he enjoys exploring new narratives both on the canvas and the page.
