2025 Scale AI. All rights reserved.
EnigmaEval
Puzzle Solving
Last updated: April 3, 2025
Performance Comparison
Rank  Model                                      Score
1     -                                          6.14 ±1.02
1     o1 (December 2024)                         5.65 ±0.52
3     -                                          4.23 ±0.45
3     -                                          4.14 ±0.25
5     -                                          3.18 ±0.28
6     -                                          2.26 ±0.63
6     -                                          2.17 ±0.48
8     Gemini 2.0 Flash Thinking (January 2025)   1.10 ±0.17
8     Claude 3.5 Sonnet (October 2024)           0.91 ±0.16
8     Pixtral Large (November 2024)              0.84 ±0.19
8     -                                          0.69 ±0.42
9     Claude 3 Opus                              0.82 ±0.05
9     GPT-4o (November 2024)                     0.80 ±0.12
9     Gemini 2.0 Flash (February 2025)           0.63 ±0.24
10    -                                          0.58 ±0.12
13    Llama 3.2 90B Vision Instruct              0.38 ±0.06