Mountain View, California, April 14, 2026, 14:03 PDT.
- Google’s AI Edge Gallery now supports Gemma 4 on Android and iPhone, and the Google Play listing shows more than 500,000 downloads.
- Together AI and Microsoft Foundry have added hosted Gemma 4 options, widening distribution beyond Google’s own apps and cloud.
Google is widening the reach of Gemma 4, its newest open-weight AI family, across phones and third-party cloud platforms, turning what began as a developer release into a broader push for offline and self-hosted AI. AI Edge Gallery is now in the main mobile app stores, while third-party providers are already offering larger hosted versions.
That matters because Google is pitching Gemma as more than another chatbot model. Open-weight AI — meaning developers can download the model parameters and run them themselves — can keep prompts and files on a device, cut API bills and keep working when a connection drops, according to Google’s own product material and developer posts.
Google DeepMind introduced Gemma 4 on April 2 under an Apache 2.0 license in four sizes, from E2B and E4B versions aimed at phones to 26B A4B and 31B models for larger hardware. Clement Farabet, a Google DeepMind vice president, and Olivier Lacombe, a group product manager, called the release “breakthrough capabilities made widely accessible” in a company blog post.
Google says the family is built for coding, reasoning and “agentic workflows” — software that can call tools and take several steps rather than stop at one answer. The models support up to 128,000 or 256,000 tokens of context, industry shorthand for how much text, code or other data they can process in one pass.
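To put those context figures in perspective, a common rule of thumb — an approximation, not a figure from Google — is roughly four characters of English text per token:

```python
def approx_tokens(text_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the common ~4-characters-per-token heuristic."""
    return round(text_chars / chars_per_token)

# A 400-page book at ~2,000 characters per page comes to roughly 200,000 tokens:
# within a 256,000-token window, but beyond a 128,000-token one.
print(approx_tokens(400 * 2_000))  # 200000
```

Actual token counts vary by language and tokenizer, but the heuristic gives a sense of scale: the larger window can hold a book-length input in a single pass.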
On phones, the company is using AI Edge Gallery as the storefront. Google’s developer blog says the app is available on Android and iOS, and Android Authority reported that users can now summarize PDFs, write code, analyze images and transcribe audio locally instead of routing those requests back to Google’s servers.
Karl Weinmeister, a developer relations specialist at Google Cloud, wrote last week that the refreshed app “shows what’s now possible” on-device because it can generate structured code and control device settings with natural language, offline. Google is also threading the same model family into Android itself: product manager David Chou and developer relations engineer Caren Chang said Gemma 4 is the “foundation for the next generation of Gemini Nano,” with Gemini Nano 4-enabled devices due later this year.
For heavier work, Google released Gemma 4 on Google Cloud on April 3, saying the models could be deployed inside enterprise environments with data kept under customer control. Together AI has since listed Gemma 4 31B on its serverless platform with image input, function calling and a 256,000-token context window, pricing it at $0.20 per million input tokens and $0.50 per million output tokens.
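At those rates, the cost of a hosted request is easy to estimate. A minimal sketch, using Together AI's listed per-token prices for Gemma 4 31B; the request sizes in the example are hypothetical:

```python
# Together AI's listed rates for hosted Gemma 4 31B, in dollars per million tokens
INPUT_RATE = 0.20
OUTPUT_RATE = 0.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a single hosted request."""
    return (input_tokens / 1_000_000) * INPUT_RATE + \
           (output_tokens / 1_000_000) * OUTPUT_RATE

# Hypothetical example: a 50,000-token document summarized in 1,000 tokens
print(f"${request_cost(50_000, 1_000):.4f}")  # $0.0105
```

At roughly a penny per long-document summary, the hosted tier is priced well below most proprietary frontier models — one reason open-weight families draw cost-sensitive workloads.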
The cloud footprint widened again on Tuesday when Microsoft said Gemma 4 had become available in Microsoft Foundry, giving Azure users another place to evaluate and deploy the models inside their own environments. Weinmeister wrote separately that prompts and skills built locally in Edge Gallery can work the same way with larger Gemma 4 variants in the cloud, a sign Google wants one model family to span phone and backend use.
Competition is tightening. Artificial Analysis said Tuesday that Alibaba’s Qwen3.5 27B and Google’s Gemma 4 31B are now the two leading open-weight families under 32 billion parameters, with Qwen scoring higher overall on its benchmark while Gemma used tokens more efficiently.
But the trade-offs have not gone away. Google says AI Edge Gallery is still in active development and that performance depends on a device’s CPU and GPU, while its Android preview needs AICore-enabled hardware for representative speed. Artificial Analysis said models in this size class still trail top proprietary systems on factual knowledge and hallucination avoidance.