Humanity's Last Exam (Preview)

Challenging LLMs at the frontier of human knowledge

Last updated: April 10, 2025

Performance Comparison

Rank  Score         Model
1     18.81 ±1.47
2     9.15 ±1.09
2     8.93 ±1.08
2     8.81 ±1.07
2     7.22 ±0.98
3     7.07 ±0.97
5     6.78 ±0.95
5     6.41 ±0.92
5     5.52 ±0.86    Llama 3.2 90B Vision Instruct
7     5.22 ±0.84
7     5.19 ±0.84    Gemini 2.0 Flash Experimental (December 2024)
7     5.07 ±0.83
7     5.04 ±0.83
8     4.78 ±0.80
9     4.67 ±0.80    Qwen2-VL-72B-Instruct
9     4.63 ±0.79
9     4.56 ±0.79
9     4.19 ±0.76    Claude 3 Opus
9     4.15 ±0.75    Gemini-1.5-Flash-002
9     3.96 ±0.74
18    3.07 ±0.65
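
The repeated and skipped ranks above (2, 2, 2, 2, 3, 5, ...) are not explained on the page. One rule that appears consistent with them, offered here as an assumption rather than the leaderboard's documented method, is: an entry's rank is 1 plus the number of entries whose lower bound (score minus the ± value) lies strictly above that entry's upper bound (score plus the ± value). The sketch below applies that rule to the listed scores; the names `ROWS` and `upper_bound_ranks` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (listed rank, score, ± half-width), copied from the rows above.
ROWS: List[Tuple[int, float, float]] = [
    (1, 18.81, 1.47), (2, 9.15, 1.09), (2, 8.93, 1.08), (2, 8.81, 1.07),
    (2, 7.22, 0.98), (3, 7.07, 0.97), (5, 6.78, 0.95), (5, 6.41, 0.92),
    (5, 5.52, 0.86), (7, 5.22, 0.84), (7, 5.19, 0.84), (7, 5.07, 0.83),
    (7, 5.04, 0.83), (8, 4.78, 0.80), (9, 4.67, 0.80), (9, 4.63, 0.79),
    (9, 4.56, 0.79), (9, 4.19, 0.76), (9, 4.15, 0.75), (9, 3.96, 0.74),
    (18, 3.07, 0.65),
]


def upper_bound_ranks(rows: List[Tuple[int, float, float]]) -> List[int]:
    """Assumed rule: rank = 1 + count of entries whose interval sits
    entirely above this entry's interval."""
    ranks = []
    for _, score, half_width in rows:
        upper = score + half_width
        # Count entries whose lower bound strictly exceeds this upper bound.
        clearly_better = sum(1 for _, s, h in rows if s - h > upper)
        ranks.append(1 + clearly_better)
    return ranks


computed = upper_bound_ranks(ROWS)
listed = [rank for rank, _, _ in ROWS]
print(computed == listed)
```

Under this assumed rule, entries whose intervals overlap share a rank, which would explain why four entries sit at rank 2 and why no entry is listed at rank 4 or rank 6.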