Google (GOOGL) is making its most direct move yet against Nvidia in the AI chip market. The company announced its 8th-generation TPU on Wednesday and, for the first time, split the chip line into two dedicated products: TPU 8t for training AI models and TPU 8i for running them.
Both chips will be available later this year, with cloud customers and internal Google products expected to be among the first adopters.
The Performance Claims
Google says the TPU 8t delivers 2.8x the performance of the 7th-generation "Ironwood" TPU at the same price, an unusually large jump for a single generation. The TPU 8i delivers 80% better inference performance than the prior generation.
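Taken at face value, the pricing claim implies a steep drop in cost per unit of compute. A quick back-of-envelope check (our arithmetic on the stated figures, not numbers Google has published):

```python
# Back-of-envelope math on Google's stated figures.
# 2.8x the performance at the same price means each unit of compute
# costs 1/2.8 of what it did on the prior-generation Ironwood TPU.
speedup = 2.8
relative_cost = 1 / speedup
print(f"Cost per unit of compute: {relative_cost:.0%} of Ironwood's")  # ~36%

# The 80% inference claim, restated as a multiplier.
inference_gain = 0.80
print(f"TPU 8i vs prior generation: {1 + inference_gain:.1f}x")  # 1.8x
```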
Amin Vahdat, Google's senior vice president and chief technologist for AI and infrastructure, said in a blog post that the company determined "the community would benefit from chips individually specialized to the needs of training and serving."
Why Split Training and Inference
Training a model is a very different workload from running one, so using a single chip for both forces trade-offs. Training needs huge memory bandwidth and raw compute throughput, while inference needs low latency and tight cost control at massive scale.
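To make the distinction concrete, here is a minimal JAX sketch (a toy model with made-up shapes, not Google's code) of why the two workloads stress hardware differently: a training step runs a forward and a backward pass over large batches, while a serving call is a single forward pass, often over a batch of one, where latency dominates.

```python
import jax
import jax.numpy as jnp

# Toy single-layer model; all shapes here are illustrative assumptions.
params = {"w": jnp.ones((512, 512)) * 0.01, "b": jnp.zeros(512)}

def forward(params, x):
    return jnp.tanh(x @ params["w"] + params["b"])

def loss_fn(params, x, y):
    return jnp.mean((forward(params, x) - y) ** 2)

@jax.jit
def train_step(params, x, y, lr=1e-3):
    # Training: forward pass plus backward pass. Gradients roughly double
    # the compute and add memory pressure for activations, which is why
    # training silicon prioritizes throughput and memory.
    grads = jax.grad(loss_fn)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

@jax.jit
def infer(params, x):
    # Inference: forward pass only, no gradients or optimizer state.
    # What matters here is latency and cost per query.
    return forward(params, x)

# Training feeds large batches to keep the chip saturated...
xb, yb = jnp.ones((4096, 512)), jnp.ones((4096, 512))
params = train_step(params, xb, yb)

# ...while a serving request is often a batch of one.
print(infer(params, jnp.ones((1, 512))).shape)  # (1, 512)
```

That asymmetry is the whole argument for splitting the line: training silicon can lean into throughput and memory bandwidth, while inference silicon can trade peak compute for lower latency and lower cost per query.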
Nvidia's H100 and B200 chips are designed to handle both use cases, which has made them the default for most AI builders. Google's bet is that specialized chips can beat generalist ones on both cost and performance, especially as AI agents push inference volume higher.
The Supply Chain
Google now runs two non-Nvidia supply chains:
- Broadcom (AVGO) designs both the current Ironwood and the new TPU 8t training chips, under an agreement running through 2031.
- MediaTek designs the TPU 8i inference chip along with Google's cost-optimized variants, including the TPU v7e and v8e.
Having two separate chip designers gives Google real redundancy across its core AI compute stack. It also splits partnership risk between two large suppliers rather than concentrating it in a single dependency, the position most of the industry is in with Nvidia.
The Cloud Customer Play
Google Cloud has been pitching TPUs as a way to cut AI inference costs for enterprise customers, and this split is the clearest version of that pitch yet. Anthropic already runs a large share of its workloads on TPUs, and Apple has been testing similar workloads with Google for its own AI features.
If more model builders follow, Google earns both the hardware revenue and the recurring cloud spend. That combination is the playbook Nvidia and the hyperscalers have been fighting over.
What to Watch
Google is still a big Nvidia customer, and that will not change overnight. If cloud customers start to prefer TPUs on cost alone, especially for inference workloads, Nvidia loses some of the pricing power it has relied on since 2023.
The next signal is how AWS and Microsoft respond, since both are spending heavily on custom chips of their own. Custom silicon is where most of the AI infrastructure money is headed from here.
