AI Learns to Discover: New Technique Unleashes Problem-Solving Power
6 Feb
Summary
- New AI technique trains models during problem-solving, not just before.
- Optimized a critical GPU kernel to run up to 2x faster than human-expert versions.
- Costs about $500 per discovery problem, so it is best suited to high-value assets.

A new AI technique called Test-Time Training to Discover (TTT-Discover) breaks with the usual practice of freezing a model once training ends. Developed by researchers from Stanford, Nvidia, and Together AI, the method lets a model keep training and updating its weights while it attempts to solve a problem, rather than relying on static, pre-trained parameters.
This approach is particularly effective for complex discovery problems that lie outside a model's original training data. Unlike 'frozen' models, which can only search within what they have already learned, TTT-Discover treats each problem as its own environment to master. The model learns from its failures and partial successes on that one problem and concentrates on finding the best solution to it.
TTT-Discover needs a continuous reward signal that registers incremental progress, which sets it apart from standard reinforcement learning setups. Experiments put the cost at approximately $500 per discovery run, a price considered economical for high-impact, low-frequency decisions. Examples include optimizing critical GPU kernels, potentially reaching double the speed of human-expert solutions, or finding faster routes in logistics.
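To see why a continuous reward matters, consider how a kernel-optimization attempt might be scored. The sketch below is an assumption for illustration, not the scoring used in the experiments: `binary_reward` and `continuous_reward` are hypothetical functions, and measured runtimes in seconds are assumed to be available for the candidate and for a baseline.

```python
def binary_reward(candidate_s: float, baseline_s: float, is_correct: bool) -> float:
    """Pass/fail score: 1.0 if the candidate is correct and beats the baseline."""
    return 1.0 if is_correct and candidate_s < baseline_s else 0.0


def continuous_reward(candidate_s: float, baseline_s: float, is_correct: bool) -> float:
    """Graded score: the speedup over the baseline, or 0.0 for a wrong result."""
    if not is_correct or candidate_s <= 0:
        return 0.0
    return baseline_s / candidate_s


# Two correct candidates against a 1.0 s baseline (hypothetical timings).
slightly_faster = (0.9, 1.0, True)
much_faster = (0.5, 1.0, True)

print(binary_reward(*slightly_faster), binary_reward(*much_faster))
print(continuous_reward(*slightly_faster), continuous_reward(*much_faster))
```

The pass/fail score gives both candidates the same value, so the model cannot tell a small gain from a large one. The graded score separates them, which is the kind of incremental signal the technique depends on.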
The researchers have released the TTT-Discover code, which works with open-weights models like gpt-oss-120b. This allows companies to run the discovery loop securely within their own infrastructure. Implementation is feasible for enterprises already using reinforcement learning, and tools like the Tinker API further reduce setup complexity.



