


Nine-Month-Old AI Startup Unveils Benchmark Challenging Industry Giants

18 Nov

Summary

  • Artificial Analysis announces new benchmark for AI knowledge and hallucination
  • Benchmark covers over 40 topics, with most models more likely to hallucinate than provide correct answers
  • Claude 4.1 Opus takes first place in the benchmark's key metric

Artificial Analysis, a little-known AI company founded just nine months ago, has announced AA-Omniscience, a new benchmark that evaluates knowledge and hallucination across more than 40 topics. Revealed only last month, the benchmark has already made waves in the industry.

The results of the AA-Omniscience benchmark are startling. According to the data, all but three of the language models tested were more likely to hallucinate, that is, to provide incorrect information, than to give a correct answer. This underscores the significant challenges that remain in developing AI systems with robust and reliable knowledge.
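The article does not specify how AA-Omniscience scores this trade-off, but the finding can be illustrated with a simple net score (a hypothetical sketch, not the benchmark's actual formula): reward correct answers, penalise hallucinated ones, and treat abstentions as neutral, so any model that hallucinates more often than it answers correctly lands below zero.

```python
from collections import Counter

def net_knowledge_score(outcomes):
    """Hypothetical net score in [-1, 1]: +1 per correct answer,
    -1 per incorrect (hallucinated) answer, 0 for abstentions,
    averaged over all test items."""
    if not outcomes:
        raise ValueError("no outcomes to score")
    counts = Counter(outcomes)
    return (counts["correct"] - counts["incorrect"]) / len(outcomes)

# A model that hallucinates more often than it answers correctly
# ends up with a negative score under this scheme.
results = ["correct"] * 3 + ["incorrect"] * 5 + ["abstain"] * 2
score = net_knowledge_score(results)  # (3 - 5) / 10 = -0.2
```

Under a scheme like this, declining to answer is strictly better than guessing wrong, which is one plausible way a benchmark could penalise hallucination rather than mere ignorance.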

Despite the sobering findings, there were some bright spots. Anthropic's Claude 4.1 Opus took first place in the benchmark's key metric, demonstrating its relative strength in accurately conveying information. That even the best-performing model stands out mainly by hallucinating less is itself a telling measure of where the field currently is.

As the industry continues to grapple with the complexities of building truly knowledgeable and trustworthy AI systems, the AA-Omniscience benchmark promises to play a crucial role in guiding future research and development efforts. With its comprehensive coverage and insightful results, this new tool could help shape the future of the AI landscape.

Disclaimer: This story has been auto-aggregated and auto-summarised by a computer program. This story has not been edited or created by the Feedzop team.
