EnigmaEval

Puzzle Solving

Last updated: April 3, 2025

Performance Comparison

| Rank | Model | Score |
|------|-------|-------|
| 1 | | 13.09±1.92 |
| 1 | | 11.91±1.85 |
| 2 | | 9.21±1.65 |
| 3 | | 6.81±0.83 |
| 4 | | 6.14±1.37 |
| 4 | o1 (December 2024) | 5.65±1.32 |
| 5 | | 4.23±1.17 |
| 7 | | 4.14±1.16 |
| 7 | | 3.18±1.02 |
| 7 | | 2.26±0.86 |
| 8 | | 2.17±0.84 |
| 10 | Gemini 2.0 Flash Thinking (January 2025) | 1.10±0.60 |
| 10 | Claude 3.5 Sonnet (October 2024) | 0.91±0.55 |
| 11 | Pixtral Large (November 2024) | 0.84±0.53 |
| 12 | Claude 3 Opus | 0.82±0.45 |
| 12 | GPT-4o (November 2024) | 0.80±0.44 |
| 12 | | 0.69±0.48 |
| 12 | | 0.63±0.45 |
| 12 | | 0.58±0.43 |
| 12 | Llama 3.2 90B Vision Instruct | 0.38±0.35 |