Chinese

Deprecated (as of March 2025)

Last updated: March 20, 2025

Performance Comparison

Rank  Model                                     Score
1     o1 (December 2024)                        1165.00 ±30.00
2     o3-mini                                   1156.00 ±32.00
3     o1-preview                                1120.00 ±44.00
4     Gemini 1.5 Pro (August 27, 2024)          1120.00 ±35.00
5     Gemini 2.0 Pro (December 2024)            1117.00 ±33.00
6     Gemini Pro Flash 2                        1115.00 ±28.00
7     Gemini 1.5 Pro (November 2024)            1077.00 ±28.00
8     Gemini 2.0 Flash Thinking (January 2025)  1060.00 ±33.00
9     DeepSeek R1                               1052.00 ±32.00
10    DeepSeek V3                               1031.00 ±26.00
11    GPT-4o (August 2024)                      1029.00 ±30.00
12    Gemini 1.5 Flash                          1015.00 ±53.00
13    Mistral Large 2                           1006.00 ±34.00
14    DeepSeek V2 Chat                           996.00 ±24.00
15    GPT-4 (November 2024)                      985.00 ±28.00
16    Aya Expanse 32B                            967.00 ±29.00
17    Gemma 2 27B                                966.00 ±29.00
18    Claude 3.5 Sonnet (June 2024)              930.00 ±42.00
19    Qwen 2 72B Instruct                        902.00 ±37.00
20    Llama 3.3 70B Instruct                     883.00 ±33.00
21    Yi 1.5 34B Chat                            780.00 ±41.00