Microsoft’s Maia 200 AI chip takes a swing at Nvidia’s CUDA lock-in as Azure rollout starts


San Francisco, January 26, 2026, 09:12 PST

  • Microsoft unveiled Maia 200, its second-generation in-house AI inference chip, along with new software tools for programming it
  • The company said initial deployments begin this week in Iowa, followed by Arizona
  • Microsoft is positioning Maia 200 to power OpenAI’s newest GPT-5.2 models and other services within Azure

Microsoft revealed its Maia 200 chip on Monday, marking the second generation of its custom AI hardware. The company also introduced new software tools designed to chip away at Nvidia’s lead among developers. (Reuters)

The move matters because the cost of running generative AI systems is climbing rapidly, and cloud providers are scrambling to manage both the availability and the price of the hardware behind them. Nvidia remains the leader in AI computing, largely because so many developers rely on its CUDA software platform.

Microsoft designed Maia 200 specifically for “inference” — the phase when a trained model generates answers — not for training itself. Inference often drives up daily costs for chatbots and assistants since they create output token by token, with each token representing a small piece of text.

The company said Maia 200 will go live this week at a data center near Des Moines, Iowa, with a second site near Phoenix, Arizona, to follow soon after. The chip succeeds Maia 100, which Microsoft introduced in 2023.

Microsoft revealed in a blog post that Maia 200 is built on TSMC’s cutting-edge 3-nanometer process, aiming to slash the cost of “AI token generation.” The chip packs 216GB of HBM3e memory to keep data flowing smoothly, plus 272MB of on-chip SRAM, which boosts performance when multiple users access a model simultaneously. With over 140 billion transistors under the hood, Maia 200 delivers more than 10 petaFLOPS in FP4 and over 5 petaFLOPS in FP8—both lower-precision formats that speed up AI tasks. (The Official Microsoft Blog)
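The FP4 and FP8 figures track a simple trade-off: fewer bits per value means the chip can store and move twice as many values in the same memory and bandwidth. A back-of-the-envelope Python sketch shows the arithmetic; the parameter count below is illustrative, not tied to any real model:

```python
# Back-of-the-envelope: bytes needed to hold model weights at different
# precisions. The parameter count is a hypothetical example.

BITS = {"fp32": 32, "fp16": 16, "fp8": 8, "fp4": 4}

def weight_bytes(n_params, fmt):
    """Memory footprint of n_params weights stored in the given format."""
    return n_params * BITS[fmt] // 8

n = 200_000_000_000  # hypothetical 200B-parameter model
for fmt in ("fp16", "fp8", "fp4"):
    print(f"{fmt}: {weight_bytes(n, fmt) / 1e9:.0f} GB")
```

FP4 halves the FP8 footprint, which is one reason a chip quoting more than 10 petaFLOPS in FP4 but over 5 in FP8 can process roughly twice as many values per second at the lower precision.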

Microsoft directly compared its new chip against competitors this time. The Maia 200 reportedly offers roughly three times the FP4 performance of Amazon’s third-gen Trainium and surpasses Google’s seventh-gen TPU in FP8 performance. Microsoft also claimed a 30% boost in performance per dollar compared to the newest hardware in its own fleet.

Microsoft’s announcement focused heavily on software. The company is rolling out a Maia software development kit that works with PyTorch, the widely used AI framework, and bundles a Triton compiler plus a kernel library. Triton, an open-source project with significant contributions from OpenAI, is pitched as an alternative to the low-level optimizations developers typically write in CUDA.

Scott Guthrie, executive vice president of Microsoft’s Cloud and AI division, claimed Maia 200 can “run today’s largest models” and still has capacity to scale. Microsoft plans to deploy Maia 200 to power OpenAI’s GPT-5.2 and other models within Microsoft Foundry and Microsoft 365 Copilot. (The Verge)

This chip positions Microsoft alongside Amazon and Google, both of which have been developing their own AI processors for cloud service clients. Nvidia is moving forward with its next “Vera Rubin” platform, while Microsoft’s chip relies on an older generation of high-bandwidth memory compared to what Nvidia plans for its upcoming models.

Silicon isn’t usually the toughest challenge. Developers have invested years in CUDA code and tooling, so Microsoft must prove its Triton-based stack is reliable, fast, and able to absorb large-scale workload migration. And because the rollout begins in select regions, capacity — not just speed — will be under close scrutiny.

Microsoft said Maia 200 will support several models, including GPT-5.2. Microsoft’s Superintelligence team plans to use the chips for synthetic data generation and reinforcement learning as it develops its next generation of models.
