© 2025 Scale AI. All rights reserved.
Humanity's Last Exam
Challenging LLMs at the frontier of human knowledge
Last updated: April 30, 2025
Performance Comparison
Rank   Accuracy (%)    Calib Err (%)
1      20.32 ± 1.58    34
1      19.20 ± 1.54    39
1      18.16 ± 1.51    71
1      18.08 ± 1.51    57
1      17.80 ± 1.50    70
6      14.28 ± 1.37    59
6      12.08 ± 1.28    80
7      10.96 ± 1.22    82
9       8.12 ± 1.07    82
9       8.04 ± 1.07    80
9       7.96 ± 1.06    83
9       6.56 ± 0.97    82
9       5.68 ± 0.91    83
12      5.44 ± 0.89    85
12      5.40 ± 0.89    89
13      4.60 ± 0.82    88
13      4.52 ± 0.81    77
14      4.40 ± 0.80    80
14      4.08 ± 0.78    84
16      3.64 ± 0.73    82
19      2.72 ± 0.64    89