How did X's Grok chatbot perform in Premier League predictions?

X's Grok chatbot performed the worst of the eight large language models tested: it lost its entire simulated pot in one attempt and failed to complete the task in the other two.

Which AI chatbot performed best in predicting sports results?

Anthropic's Claude Opus 4.6 performed the best among the tested AI models in predicting and betting on the 2023-24 Premier League season.

Do AI models currently outperform humans in sports betting predictions?

No. According to the study, AI models generally underperform humans at predicting sports results and managing a simulated betting pot.

AI Fails at Sports Betting: Grok Loses All

12 Apr

Summary

X's Grok chatbot performed worst in predicting Premier League results.
Claude Opus 4.6 achieved the best performance among tested LLMs.
AI models generally underperformed humans in the test.

Recent research indicates that X's chatbot, Grok, has a notable weakness in predicting sports outcomes, especially when compared to its competitors. A study by AI start-up General Reasoning evaluated eight large language models (LLMs) on their ability to predict and bet on the 2023-24 Premier League season.

Using historical data and statistics, each LLM was tasked with maximizing returns and managing risk with a simulated $133,000 pot. Grok performed the worst, losing all its money in one attempt and failing to complete the task in the other two. Claude Opus 4.6 from Anthropic performed best, with an average pot of £89,035.

OpenAI's GPT-5.4 and Google's Gemini 3.1 Pro produced mixed results, with Gemini's outcomes varying widely. The report concluded that AI systems generally underperform humans in these predictive tasks, highlighting the gap between AI hype and real-world application.

The findings come as Grok faces potential wider corporate adoption, with reports suggesting Elon Musk is encouraging its use within banks involved in the SpaceX IPO. This research underscores the current limitations of AI in complex, long-term prediction scenarios.

Disclaimer: This story has been auto-aggregated and auto-summarised by a computer program. This story has not been edited or created by the Feedzop team.
