Benchmark Archives - Page 2 of 2

A test for AGI is closer to being solved — but it may be flawed

A well-known test for artificial general intelligence (AGI) is closer to being solved. But the test’s creators say this points to flaws in the test’s design, rather than a bona fide research breakthrough. In 2019, Francois Chollet, a leading figure in the AI world, introduced the ARC-AGI benchmark, short for “Abstraction and Reasoning Corpus for Artificial […]

The AI industry is obsessed with Chatbot Arena, but it might not be the best benchmark

AI / Sasandara Dilmina

Over the past few months, tech execs like Elon Musk have touted the performance of their company’s AI models on a particular benchmark: Chatbot Arena. Maintained by a nonprofit known as LMSYS, Chatbot Arena has become something of an industry obsession. Posts about updates to its model leaderboards garner hundreds of views and reshares across Reddit and […]

Geekbench releases AI benchmarking app

AI / Sasandara Dilmina

Benchmarking stalwarts Primate Labs on Thursday released Geekbench AI 1.0. The app, which is currently available for Android, Linux, macOS and Windows, applies Geekbench’s principles to machine learning, deep learning and other AI workloads, in a bid to standardize performance ratings across platforms. It’s a successor to Geekbench ML (machine learning), which was announced in […]
