Microsoft's OPCD: Smarter AI, Faster Apps
28 Feb
Summary
- New AI training framework bakes knowledge into models.
- OPCD reduces inference latency and per-query costs.
- Framework improves model performance for specific tasks.

Enterprises deploying large language models (LLMs) often face challenges with long system prompts that increase inference latency and costs. Microsoft researchers have developed On-Policy Context Distillation (OPCD), a novel training framework that integrates essential company knowledge and application-specific instructions directly into AI models.
OPCD trains models to internalize this information, compressing lengthy instructions into the model's parameters so they no longer need to be supplied at inference time. Unlike off-policy distillation, which trains the student on teacher-generated text and suffers from exposure bias and mode-covering behavior, OPCD samples from the student's own generation trajectories and scores them against the teacher using reverse KL divergence. This on-policy approach lets the student model learn from its own mistakes, promotes mode-seeking behavior, and reduces hallucinations.
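The distinction between the two divergence directions is the core of the mode-seeking claim. As a rough illustration (not Microsoft's implementation, and with entirely hypothetical toy distributions), the sketch below compares reverse KL, which OPCD minimizes over student-sampled tokens, against forward KL, the mode-covering objective used by conventional distillation:

```python
import numpy as np

def reverse_kl(student_probs, teacher_probs, eps=1e-12):
    """Reverse KL divergence D_KL(student || teacher).

    Mode-seeking: the student is penalized for placing probability
    mass where the teacher (the model given the full system prompt)
    places little, so it concentrates on the teacher's main modes.
    """
    s = np.clip(student_probs, eps, 1.0)
    t = np.clip(teacher_probs, eps, 1.0)
    return float(np.sum(s * np.log(s / t)))

def forward_kl(student_probs, teacher_probs, eps=1e-12):
    """Forward KL D_KL(teacher || student): the mode-covering baseline."""
    s = np.clip(student_probs, eps, 1.0)
    t = np.clip(teacher_probs, eps, 1.0)
    return float(np.sum(t * np.log(t / s)))

# Toy next-token distributions over a 4-token vocabulary (hypothetical values).
teacher = np.array([0.70, 0.25, 0.04, 0.01])  # prompted teacher: two strong modes
spread  = np.array([0.25, 0.25, 0.25, 0.25])  # student spreading mass everywhere
peaked  = np.array([0.60, 0.35, 0.04, 0.01])  # student matching the main modes

# Reverse KL punishes the spread-out student far more than the peaked one:
# that asymmetry is the mode-seeking behavior the article describes.
print(reverse_kl(spread, teacher), reverse_kl(peaked, teacher))
```

In the full training loop, the student would also sample its own continuations (the on-policy part) and these divergences would be computed per generated token; the snippet only isolates why the reverse direction discourages the student from hedging across low-probability tokens, which is what shows up as hallucination.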