Frontier Leaderboards
© 2025 Scale AI. All rights reserved.
Humanity's Last Exam (Preview)
Challenging LLMs at the frontier of human knowledge
Last updated: April 10, 2025
Performance Comparison
Rank  Model                                          Score
1     -                                              18.81±1.47
2     -                                              9.15±1.09
2     -                                              8.93±1.08
2     -                                              8.81±1.07
2     -                                              7.22±0.98
3     -                                              7.07±0.97
5     -                                              6.78±0.95
5     -                                              6.41±0.92
5     Llama 3.2 90B Vision Instruct                  5.52±0.86
7     -                                              5.22±0.84
7     Gemini 2.0 Flash Experimental (December 2024)  5.19±0.84
7     -                                              5.07±0.83
7     -                                              5.04±0.83
8     -                                              4.78±0.80
9     Qwen2-VL-72B-Instruct                          4.67±0.80
9     -                                              4.63±0.79
9     -                                              4.56±0.79
9     Claude 3 Opus                                  4.19±0.76
9     Gemini-1.5-Flash-002                           4.15±0.75
9     -                                              3.96±0.74
18    -                                              3.07±0.65
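
The page does not say how the tied ranks are assigned. One rule that reproduces every rank listed above is: a model's rank is 1 plus the number of models whose lower bound (score minus margin) lies strictly above that model's upper bound (score plus margin). The sketch below checks that rule against the listed scores. The rule is an assumption inferred from the numbers, not a documented methodology, and the names `parse_score` and `interval_ranks` are hypothetical.

```python
from decimal import Decimal

# Scores and ranks copied from the listing above, in the order shown.
SCORES = [
    "18.81±1.47", "9.15±1.09", "8.93±1.08", "8.81±1.07", "7.22±0.98",
    "7.07±0.97", "6.78±0.95", "6.41±0.92", "5.52±0.86", "5.22±0.84",
    "5.19±0.84", "5.07±0.83", "5.04±0.83", "4.78±0.80", "4.67±0.80",
    "4.63±0.79", "4.56±0.79", "4.19±0.76", "4.15±0.75", "3.96±0.74",
    "3.07±0.65",
]
LISTED_RANKS = [1, 2, 2, 2, 2, 3, 5, 5, 5, 7, 7, 7, 7, 8, 9, 9, 9, 9, 9, 9, 18]


def parse_score(text):
    """Split 'mean±margin' into two exact decimals."""
    mean, margin = text.split("±")
    return Decimal(mean), Decimal(margin)


def interval_ranks(scores):
    """Assumed rule: rank = 1 + number of models whose lower bound
    is strictly greater than this model's upper bound."""
    bounds = []
    for text in scores:
        mean, margin = parse_score(text)
        bounds.append((mean - margin, mean + margin))

    ranks = []
    for _, upper in bounds:
        clearly_better = sum(1 for lower, _ in bounds if lower > upper)
        ranks.append(1 + clearly_better)
    return ranks


if __name__ == "__main__":
    computed = interval_ranks(SCORES)
    for score, listed, got in zip(SCORES, LISTED_RANKS, computed):
        print(f"{score:>12}  listed={listed:<3} computed={got}")
```

Under this reading, two models share a rank when neither interval clears the other, which is why several rank numbers are skipped. `Decimal` is used instead of floats so that bounds differing by only 0.01 compare exactly.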