Top AI models fail to score even 1% on ARC-AGI-3. Humans ace it easily. I asked two members from the ARC team why their new puzzle game collection stumps the world's most powerful AIs.
Very enjoyable read. No offense intended, but seeing someone I once knew as a League of Legends communicator tackle the topic of AGI was a pleasant surprise.
I’m as surprised as anyone, no offense taken
How do you know the AI isn't aware and fakes being bad at the test?
hell yeah
Strong proof that current AI is still pattern-heavy, not truly adaptive, especially when faced with unfamiliar problems
I've been thinking that forcing LLMs to play novel games is an easy way to reveal exactly how far we are from AGI. Glad that ARC has taken it up. Unsurprised that people are bleating about how unfair it is that the LLMs don't get a custom (human-made) harness for each game.
My starting assumption if models start doing well in the test suite is that it leaked. But I'm jaded.
Heck, can they even play chess yet?
the hottest question of 1997
As with many AI capabilities, the answer depends on whether you think occasionally going insane and trying to move pieces that don't exist is disqualifying: https://jenshahade.substack.com/p/mate-in-none-lessons-from-large-language
I mean, if it can't make correct moves then it can't play.