Google Splits AI Chips for Training and Inference
23 Apr
Summary
- Google Cloud launched new TPUs for training and inference.
- New chips offer up to 3x faster AI model training.
- Google's chips supplement, not replace, Nvidia's infrastructure.

Google Cloud announced that its eighth generation of custom-built AI chips, known as tensor processing units (TPUs), will be offered in two specialized versions: the TPU 8t, designed for AI model training, and the TPU 8i, optimized for inference, the process of running trained models on new data.
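The split reflects how differently the two workloads stress hardware: training runs backward passes and weight updates, while inference is forward passes only. A minimal JAX sketch of the distinction follows; the toy linear model, parameter names, and learning rate are illustrative assumptions, not anything from Google's announcement or TPU software stack.

```python
import jax
import jax.numpy as jnp

# Toy linear model; names are illustrative, not Google's.
def predict(params, x):
    return x @ params["w"] + params["b"]

def loss_fn(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

# Training: compute gradients and update weights, the workload a
# training-oriented chip like the TPU 8t is built to accelerate.
@jax.jit
def train_step(params, x, y, lr=0.01):
    grads = jax.grad(loss_fn)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Inference: a forward pass only, no gradients, the workload the
# TPU 8i targets.
@jax.jit
def infer(params, x):
    return predict(params, x)

key = jax.random.PRNGKey(0)
params = {"w": jax.random.normal(key, (4, 1)), "b": jnp.zeros(1)}
x = jax.random.normal(key, (8, 4))
y = jnp.ones((8, 1))

params = train_step(params, x, y)  # one gradient update
preds = infer(params, x)           # apply the trained model to new data
```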
The new TPUs reportedly deliver substantial improvements over previous generations. Google claims up to three times faster AI model training and an 80% improvement in performance per dollar. The supporting infrastructure can also scale to more than one million TPUs working together in a single cluster, which Google says yields greater compute efficiency along with lower energy consumption and costs for customers.
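The announcement does not detail how chips at that scale coordinate, but the basic building block of multi-chip training is collective communication: each device computes gradients on its shard of the batch, then the gradients are averaged across devices. A generic JAX data-parallelism sketch, not specific to Google's cluster software, is shown below.

```python
import functools
import jax
import jax.numpy as jnp

def loss_fn(w, x, y):
    return jnp.mean((x @ w - y) ** 2)

# Each device computes gradients on its own batch shard; pmean then
# averages them across the device axis so every replica applies the
# same weight update.
@functools.partial(jax.pmap, axis_name="devices")
def parallel_grads(w, x, y):
    grads = jax.grad(loss_fn)(w, x, y)
    return jax.lax.pmean(grads, axis_name="devices")

n = jax.local_device_count()
w = jnp.broadcast_to(jnp.ones((4, 1)), (n, 4, 1))  # weights replicated per device
x = jnp.ones((n, 8, 4))                            # one batch shard per device
y = jnp.ones((n, 8, 1))
g = parallel_grads(w, x, y)  # identical averaged gradients on every device
```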
However, these custom chips are not yet positioned as a direct replacement for Nvidia's dominant offerings. Like other major cloud providers, Google intends to use its TPUs to supplement the Nvidia-based systems it already offers. Google has committed to making Nvidia's latest chip, Vera Rubin, available on its cloud later this year.
Looking ahead, the hyperscalers that develop their own AI chips, including Google, Amazon, and Microsoft, may eventually reduce their reliance on Nvidia as more enterprises adopt cloud-based AI services and port their applications to these custom chips. Despite past predictions that its dominance would erode, Nvidia has maintained a strong market position, and its market capitalization has soared.
Furthermore, Google is collaborating with Nvidia to enhance the performance of Nvidia-based systems within its cloud. This collaboration focuses on improving Falcon, Google's open-source, software-based networking technology, to boost efficiency.