Scale AI logo
SEAL Logo

Coding

Deprecated (as of March 2025)

Last updated: March 27, 2025

Performance Comparison

1

1237.00±31.00

2

o3-mini

1137.00±34.00

3

GPT-4o (November 2024)

1132.00±31.00

4

1123.00±35.00

5

Gemini 2.0 Flash Experimental (December 2024)

1111.00±26.00

6

Gemini 2.0 Pro (December 2024)

1109.00±33.00

7

Gemini 2.0 Flash Thinking (January 2025)

1108.00±37.00

8

DeepSeek R1

1100.00±31.00

9

o1 (December 2024)

1083.00±30.00

10

1079.00±22.00

11

1045.00±25.00

12

GPT-4o (May 2024)

1036.00±24.00

13

GPT-4 Turbo Preview

1034.00±22.00

13

Claude 3 Opus

959.00±26.00

14

Mistral Large 2

1029.00±23.00

15

Llama 3.1 405B Instruct

1022.00±24.00

16

1007.00±26.00

17

Gemini 1.5 Pro (May 2024)

994.00±25.00

18

GPT-4 (November 2024)

992.00±28.00

19

Deepseek V3

985.00±25.00

20

Llama 3.2 90B Vision Instruct

984.00±30.00

22

Gemini 1.5 Flash

943.00±26.00

23

Gemini 1.5 Pro (April 2024)

891.00±32.00

24

Claude 3 Sonnet

879.00±31.00

25

Llama 3 70B Instruct

871.00±26.00

26

Mistral Large

811.00±25.00

27

Gemini 1.0 Pro

685.00±34.00

28

CodeLlama 34B Instruct

598.00±38.00