Meta Bets Big on Custom Silicon for AI Inference
20 Mar
Summary
- Meta is deploying hundreds of thousands of custom MTIA chips.
- A new 1700W superchip delivers 30 PFLOPs and 512GB HBM.
- Future MTIA chips will support GenAI inference and ranking workloads.

Meta is bolstering its AI capabilities with a vast deployment of custom MTIA chips, focused on inference tasks across its diverse applications. The company has introduced a powerful 1700W superchip, capable of delivering 30 PFLOPs and incorporating 512GB of HBM. This custom silicon strategy prioritizes efficiency for inference, aiming for cost-effectiveness.

Meta has already integrated hundreds of thousands of MTIA chips into its production environment, supporting critical functions like ranking and ad serving. The company is committed to rapid development, with plans for four new chip generations over the next two years. These future iterations, including the MTIA 400, 450, and 500, will specifically enhance support for GenAI inference and other demanding workloads. Meta's modular design approach enables swift integration and accelerates time to market for new chip advancements.

While the company acknowledges the need for a spectrum of AI solutions, its MTIA development emphasizes inference-first designs, in contrast with the many chips that prioritize pre-training. This strategy aims to meet the anticipated surge in inference demand efficiently.
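Taken at face value, the headline figures above imply a rough performance-per-watt number. A minimal sketch of that arithmetic, assuming the 30 PFLOPs is peak throughput (the article does not state the numeric precision) and the 1700W is the superchip's total power draw:

```python
# Rough efficiency math from the article's headline figures.
# Assumptions (not stated in the article): 30 PFLOPs is peak
# throughput, and 1700W is total superchip power.
peak_pflops = 30     # claimed peak compute
power_watts = 1700   # claimed superchip power
hbm_gb = 512         # on-package HBM capacity

# Convert PFLOPs to TFLOPs (x1000), then divide by power.
tflops_per_watt = peak_pflops * 1000 / power_watts
print(f"{tflops_per_watt:.1f} TFLOPs per watt")  # → 17.6 TFLOPs per watt
```

The useful takeaway is the ratio itself, not the absolute numbers: an inference-first design is judged largely on throughput per watt, which is the efficiency claim Meta is making here.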




