Coding

Deprecated (as of March 2025)

Last updated: March 27, 2025

Performance Comparison

| Rank | Model | Score |
| --- | --- | --- |
| 1 | | 1237.00 ±31.00 |
| 2 | o3-mini | 1137.00 ±34.00 |
| 3 | GPT-4o (November 2024) | 1132.00 ±31.00 |
| 4 | | 1123.00 ±35.00 |
| 5 | Gemini 2.0 Flash Experimental (December 2024) | 1111.00 ±26.00 |
| 6 | Gemini 2.0 Pro (December 2024) | 1109.00 ±33.00 |
| 7 | Gemini 2.0 Flash Thinking (January 2025) | 1108.00 ±37.00 |
| 8 | DeepSeek R1 | 1100.00 ±31.00 |
| 9 | o1 (December 2024) | 1083.00 ±30.00 |
| 10 | | 1079.00 ±22.00 |
| 11 | | 1045.00 ±25.00 |
| 12 | GPT-4o (May 2024) | 1036.00 ±24.00 |
| 13 | GPT-4 Turbo Preview | 1034.00 ±22.00 |
| 14 | Mistral Large 2 | 1029.00 ±23.00 |
| 15 | Llama 3.1 405B Instruct | 1022.00 ±24.00 |
| 16 | | 1007.00 ±26.00 |
| 17 | Gemini 1.5 Pro (May 2024) | 994.00 ±25.00 |
| 18 | GPT-4 (November 2024) | 992.00 ±28.00 |
| 19 | DeepSeek V3 | 985.00 ±25.00 |
| 20 | Llama 3.2 90B Vision Instruct | 984.00 ±30.00 |
| 21 | Claude 3 Opus | 959.00 ±26.00 |
| 22 | Gemini 1.5 Flash | 943.00 ±26.00 |
| 23 | Gemini 1.5 Pro (April 2024) | 891.00 ±32.00 |
| 24 | Claude 3 Sonnet | 879.00 ±31.00 |
| 25 | Llama 3 70B Instruct | 871.00 ±26.00 |
| 26 | Mistral Large | 811.00 ±25.00 |
| 27 | Gemini 1.0 Pro | 685.00 ±34.00 |
| 28 | CodeLlama 34B Instruct | 598.00 ±38.00 |