Humanity's Last Exam

Frontier Multimodal Benchmark

Rank (UB): 1 + the number of models whose lower CI bound exceeds this model’s upper CI bound.
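
As a rough illustration of the Rank (UB) rule above, the following is a minimal Python sketch. The function name `rank_upper_bound`, the model names, and the interval values are all hypothetical and chosen only for illustration; they are not taken from the leaderboard, and the leaderboard's actual implementation may differ.

```python
from typing import Dict, Tuple


def rank_upper_bound(intervals: Dict[str, Tuple[float, float]]) -> Dict[str, int]:
    """Compute Rank (UB) for each model.

    `intervals` maps a model name to its (lower, upper) confidence-interval
    bounds. A model's rank is 1 plus the number of other models whose lower
    bound is strictly greater than this model's upper bound.
    """
    ranks = {}
    for name, (_, upper) in intervals.items():
        clearly_better = sum(
            1
            for other, (other_lower, _) in intervals.items()
            if other != name and other_lower > upper
        )
        ranks[name] = 1 + clearly_better
    return ranks


# Hypothetical scores, for illustration only (not real leaderboard data).
example = {
    "model_a": (24.0, 28.0),
    "model_b": (20.0, 25.0),
    "model_c": (10.0, 14.0),
}
ranks = rank_upper_bound(example)
print(ranks)
```

In this made-up example, `model_a` and `model_b` have overlapping intervals, so neither is clearly ahead of the other and both receive rank 1; `model_c` sits below both intervals and receives rank 3.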

o1-pro coming soon.