Japanese

Deprecated (as of March 2025)

Last updated: March 27, 2025

Performance Comparison

Rank  Model                             Score
1     o1-preview                        1118.00 ±43.00
2     Gemini 2.0 Pro (December 2024)    1117.00 ±36.00
3     o3-mini                           1104.00 ±31.00
4     Gemini 1.5 Pro (August 27, 2024)  1093.00 ±44.00
5     o1 (December 2024)                1086.00 ±30.00
6     Gemini 1.5 Pro (November 2024)    1084.00 ±30.00
7     GPT-4o (August 2024)              1077.00 ±42.00
8     Claude 3.5 Sonnet (June 2024)     1069.00 ±36.00
9     Gemini Pro Flash 2                1064.00 ±33.00
10    GPT-4 (November 2024)              978.00 ±27.00
11    Mistral Large 2                    967.00 ±41.00
12    Gemini 1.5 Flash                   966.00 ±56.00
13    Aya Expanse 32B                    960.00 ±32.00
14    Llama 3.1 405B Instruct            895.00 ±83.00
15    Gemma 2 27B                        892.00 ±42.00
16    Llama 3.3 70B Instruct             778.00 ±35.00
17    Aya 23 35B*                        760.00 ±48.00