Home / Technology / AI Orchestra: Agents Conduct Themselves!
AI Orchestra: Agents Conduct Themselves!
8 May
Summary
- RL Conductor automates agent orchestration for LLMs.
- Outperforms GPT-5 and Claude Sonnet 4 on benchmarks.
- Sakana Fugu commercializes this orchestration technology.

Sakana AI has introduced the RL Conductor, a novel small language model trained via reinforcement learning to autonomously orchestrate multiple worker LLMs. This system dynamically analyzes inputs, delegates tasks to specialized agents, and coordinates their efforts, demonstrating superior performance on challenging reasoning and coding benchmarks. It surpasses individual frontier models like GPT-5 and Claude Sonnet 4, as well as complex human-designed multi-agent systems, all while operating at a reduced cost and with fewer API calls.
The RL Conductor addresses the inherent limitations of static, hard-coded agentic frameworks, which often break when query distributions shift. Unlike rigid pipelines, the Conductor generates flexible, natural language-driven workflows tailored to each input. It learns advanced orchestration strategies, such as iterative refinement and meta-prompt optimization, through trial-and-error reinforcement learning, automatically leveraging the distinct strengths of various LLMs without human intervention.
This technology has been productized as Sakana Fugu, a commercial multi-agent orchestration service accessible via an OpenAI-compatible API. Fugu aims to unlock AI productivity gains in industries like finance and defense, where current AI adoption is hindered by generalization limitations. Sakana Fugu offers variants like Fugu Mini for low-latency operations and Fugu Ultra for maximum performance, providing enterprise developers with seamless integration and automated complex collaboration topologies.