DeepSeek-R1 & OpenAI's o1 Aren't Still As…

Dr. Ashish Bamania

Jan 30

Here are the results of testing R1 and o1 on the toughest mathematical questions known to humans.

4 Comments

E J Hermann

Jan 31

Conclusion: Stop the Hype TrAIn LOL, thanks for sharing and testing!

John Allard

Feb 1

We went from models not being able to do even the most basic math to “look, they can’t even reliably pass the most difficult mathematics benchmark on earth!” in 18 months, and you’re saying “stop the hype!”?? They’re putting up new SOTA results every 3 weeks with no sign of slowing down, and you think that’s a reason to *temper* expectations??

John Allard

Feb 1

Where do you think these models will be one year from now? Why should we overfit to their capabilities at the time this article was written instead of gawking at a nearly vertical trend line in capabilities?
