© 2025 Scale AI. All rights reserved.
Humanity's Last Exam
Challenging LLMs at the frontier of human knowledge
Last updated: April 3, 2025
Performance Comparison
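Rank  Model                                          Score
1     [model name missing]                           18.81 ±1.47
2     [model name missing]                           9.15 ±1.09
2     [model name missing]                           8.93 ±1.08
2     [model name missing]                           8.81 ±1.07
2     [model name missing]                           7.22 ±0.98
3     [model name missing]                           7.07 ±0.97
5     [model name missing]                           6.41 ±0.92
5     Llama 3.2 90B Vision Instruct                  5.52 ±0.86
7     [model name missing]                           5.22 ±0.84
7     Gemini 2.0 Flash Experimental (December 2024)  5.19 ±0.84
7     [model name missing]                           5.07 ±0.83
7     [model name missing]                           5.04 ±0.83
7     [model name missing]                           4.78 ±0.80
8     Qwen2-VL-72B-Instruct                          4.67 ±0.80
8     [model name missing]                           4.63 ±0.79
8     [model name missing]                           4.56 ±0.79
8     Claude 3 Opus                                  4.19 ±0.76
8     Gemini-1.5-Flash-002                           4.15 ±0.75
8     [model name missing]                           3.96 ±0.74
17    [model name missing]                           3.07 ±0.65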
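
The repeated ranks above (several entries at rank 2, a jump from 8 to 17) are not explained in this excerpt. One rule that happens to reproduce every listed rank is: a model's rank is one plus the number of models whose lower bound (score minus margin) lies strictly above that model's upper bound (score plus margin). The sketch below checks that reading against the listed values. The rule itself, the name `rank_by_interval`, and the `ROWS` layout are assumptions made for illustration, not something the page states.

```python
# (score, margin) pairs copied from the listing above, in the order shown.
ROWS = [
    (18.81, 1.47), (9.15, 1.09), (8.93, 1.08), (8.81, 1.07), (7.22, 0.98),
    (7.07, 0.97), (6.41, 0.92), (5.52, 0.86), (5.22, 0.84), (5.19, 0.84),
    (5.07, 0.83), (5.04, 0.83), (4.78, 0.80), (4.67, 0.80), (4.63, 0.79),
    (4.56, 0.79), (4.19, 0.76), (4.15, 0.75), (3.96, 0.74), (3.07, 0.65),
]

# Ranks exactly as they appear in the listing above.
LISTED_RANKS = [1, 2, 2, 2, 2, 3, 5, 5, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 17]


def rank_by_interval(rows):
    """Assumed rule: rank = 1 + count of rows whose lower bound
    (score - margin) is strictly greater than this row's upper bound
    (score + margin)."""
    ranks = []
    for score, margin in rows:
        upper = score + margin
        clearly_better = sum(1 for s, m in rows if s - m > upper)
        ranks.append(1 + clearly_better)
    return ranks


if __name__ == "__main__":
    computed = rank_by_interval(ROWS)
    print(computed)
    print("matches listed ranks:", computed == LISTED_RANKS)
```

Under this reading, two models share a rank whenever the same set of models is separated from them by non-overlapping intervals, which is why a small gap in score does not necessarily change the rank.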