AI Fails at Sports Betting: Grok Loses All
12 Apr
Summary
- X's Grok chatbot performed worst in predicting Premier League results.
- Claude Opus 4.6 achieved the best performance among tested LLMs.
- Overall, the AI models underperformed humans in the test.
Recent research indicates that X's chatbot, Grok, has a notable weakness in predicting sports outcomes, especially when compared to its competitors. A study by AI start-up General Reasoning evaluated eight large language models (LLMs) on their ability to predict and bet on the 2023-24 Premier League season.
Using historical data and statistics, each LLM was tasked with maximizing returns and managing risk with a simulated pot of £100,000 (about $133,000). Grok performed worst, losing all its money in one of its three attempts and failing to complete the task in the other two. Anthropic's Claude Opus 4.6 performed best, finishing with an average pot of £89,035.
OpenAI's GPT-5.4 and Google's Gemini 3.1 Pro posted mixed results, with Gemini's outcomes varying widely between attempts. The report concluded that AI systems generally underperform humans in these predictive tasks, highlighting the gap between AI hype and real-world application.
The findings come as Grok may be headed for wider corporate adoption, with reports suggesting Elon Musk is encouraging banks involved in the SpaceX IPO to use it. The research underscores the current limitations of AI in complex, long-term prediction scenarios.