Chinese

Deprecated (as of March 2025)

Last updated: March 20, 2025

Performance Comparison

Rank  Model                                     Score
1     o1 (December 2024)                        1165.00±30.00
2     o3-mini                                   1156.00±32.00
3     o1-preview                                1120.00±44.00
4     Gemini 1.5 Pro (August 27, 2024)          1120.00±35.00
5     Gemini 2.0 Pro (December 2024)            1117.00±33.00
6     Gemini Pro Flash 2                        1115.00±28.00
7     Gemini 1.5 Pro (November 2024)            1077.00±28.00
8     Gemini 2.0 Flash Thinking (January 2025)  1060.00±33.00
9     DeepSeek R1                               1052.00±32.00
10    DeepSeek V3                               1031.00±26.00
11    GPT-4o (August 2024)                      1029.00±30.00
12    Gemini 1.5 Flash                          1015.00±53.00
12    Aya Expanse 32B                            967.00±29.00
13    Mistral Large 2                           1006.00±34.00
14    DeepSeek V2 Chat                           996.00±24.00
15    GPT-4 (November 2024)                      985.00±28.00
17    Gemma 2 27B                                966.00±29.00
18    Claude 3.5 Sonnet (June 2024)              930.00±42.00
19    Qwen 2 72B Instruct                        902.00±37.00
20    Llama 3.3 70B Instruct                     883.00±33.00
21    Yi 1.5 34B Chat                            780.00±41.00