Microsoft's Tiny AI: Big Brains, Small Footprint
5 Mar
Summary
- New AI model matches larger systems with less compute.
- Trained on significantly less data than rivals.
- Balances reasoning for complex tasks and speed for simple ones.

Microsoft has released Phi-4-reasoning-vision-15B, a small, open-weight multimodal AI model built to compete with much larger systems. The 15-billion-parameter model processes both images and text, showing strong performance on complex reasoning, math, and science problems as well as visual tasks, while using a fraction of the compute and training data of its larger rivals.
The model was trained on approximately 200 billion tokens, far fewer than the trillions of tokens used by many competitors. Microsoft attributes this efficiency to meticulous data curation and a novel training approach. The model also employs a mixed reasoning strategy: it invokes detailed step-by-step analysis for complex problems and gives direct responses for simpler visual tasks, balancing quality against speed.
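Microsoft has not published the internals of this mixed reasoning strategy, but conceptually it amounts to routing: decide per prompt whether to engage an expensive step-by-step mode or answer directly. The sketch below illustrates that idea only; the function names and the keyword heuristic are hypothetical, not Microsoft's API.

```python
# Illustrative sketch of a mixed reasoning router (hypothetical; not
# Microsoft's actual implementation or API).

def looks_complex(prompt: str) -> bool:
    """Crude heuristic: treat math/science cues or long prompts as complex."""
    keywords = ("prove", "derive", "calculate", "explain why", "step")
    return len(prompt.split()) > 40 or any(k in prompt.lower() for k in keywords)

def answer(prompt: str) -> str:
    if looks_complex(prompt):
        # Complex task: spend extra tokens on step-by-step reasoning.
        return f"[reasoning mode] thinking step by step about: {prompt!r}"
    # Simple visual/query task: respond directly for speed.
    return f"[direct mode] quick answer to: {prompt!r}"

print(answer("What color is the traffic light in this image?"))
print(answer("Derive the closed-form solution of the quadratic equation."))
```

In a production system the routing decision would more likely come from the model itself (or a learned classifier) rather than a keyword heuristic, but the control flow is the same: one cheap path, one expensive path, chosen per request.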
The release points to a potential shift in AI deployment, making advanced AI more practical in resource-constrained environments. Its architecture emphasizes high-resolution image understanding, which is crucial for applications such as autonomous agents that navigate computer interfaces. The model is available through Microsoft Foundry, HuggingFace, and GitHub.
Phi-4-reasoning-vision-15B is the latest in Microsoft's Phi family, which has expanded to include models for on-device inference, education, and robotics. This efficient, adaptable model could unlock new deployment scenarios for enterprises seeking powerful AI solutions without the immense costs associated with larger systems.
