Nine-Month-Old AI Startup Unveils Benchmark Challenging Industry Giants

18 Nov

Summary

  • Artificial Analysis announces new benchmark for AI knowledge and hallucination
  • Benchmark covers over 40 topics, with most models more likely to hallucinate than provide correct answers
  • Claude 4.1 Opus takes first place in the benchmark's key metric

Artificial Analysis, described as a little-known nine-month-old AI company, has launched AA-Omniscience, a benchmark that evaluates knowledge and hallucination across more than 40 topics. The benchmark was revealed last month and has already drawn attention in the industry.

The results of the AA-Omniscience benchmark are stark. According to the data, all but three of the language models tested were more likely to hallucinate, that is, to give an incorrect answer, than to give a correct one. This highlights the significant challenges that remain in developing AI systems with robust and reliable knowledge.

Despite the sobering findings, there were some bright spots. Claude 4.1 Opus, a model developed by Anthropic rather than by Artificial Analysis, took first place in the benchmark's key metric, demonstrating its relative strength in accurately conveying information.

As the industry works to build knowledgeable and trustworthy AI systems, the AA-Omniscience benchmark could help guide future research and development by measuring both what models know and how often they hallucinate.

Disclaimer: This story has been auto-aggregated and auto-summarised by a computer program. This story has not been edited or created by the Feedzop team.