Frontier Leaderboards
2025 Scale AI. All rights reserved.
Japanese
Deprecated (as of March 2025)
Last updated: March 27, 2025
Performance Comparison
Rank  Model                               Score
1     o1-preview                          1118.00±43.00
2     Gemini 2.0 Pro (December 2024)      1117.00±36.00
3     o3-mini                             1104.00±31.00
4     Gemini 1.5 Pro (August 27, 2024)    1093.00±44.00
5     o1 (December 2024)                  1086.00±30.00
6     Gemini 1.5 Pro (November 2024)      1084.00±30.00
7     GPT-4o (August 2024)                1077.00±42.00
8     Claude 3.5 Sonnet (June 2024)       1069.00±36.00
9     Gemini Pro Flash 2                  1064.00±33.00
10    GPT-4 (November 2024)                978.00±27.00
11    Mistral Large 2                      967.00±41.00
12    Gemini 1.5 Flash                     966.00±56.00
13    Aya Expanse 32B                      960.00±32.00
14    Llama 3.1 405B Instruct              895.00±83.00
15    Gemma 2 27B                          892.00±42.00
16    Llama 3.3 70B Instruct               778.00±35.00
17    Aya 23 35B*                          760.00±48.00
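
Several neighboring entries differ by less than their ± values, so adjacent ranks are not always clearly separated. As a minimal sketch, assuming each entry reads as a score plus or minus an interval half-width (the page does not state what the interval represents), the following Python checks whether the ranges of two entries overlap. The helper names parse_entry and intervals_overlap are hypothetical, chosen for illustration only.

```python
def parse_entry(text):
    """Split a 'score±margin' string into (score, margin) floats."""
    score, margin = text.split("±")
    return float(score), float(margin)


def intervals_overlap(a, b):
    """Return True if the ranges [score - margin, score + margin] intersect.

    Assumption: the ± value is a symmetric half-width around the score.
    """
    a_score, a_margin = parse_entry(a)
    b_score, b_margin = parse_entry(b)
    return (a_score - a_margin <= b_score + b_margin
            and b_score - b_margin <= a_score + a_margin)


# Values copied from the leaderboard entries.
# o1-preview vs Gemini 2.0 Pro (December 2024)
print(intervals_overlap("1118.00±43.00", "1117.00±36.00"))  # True
# Gemini Pro Flash 2 vs GPT-4 (November 2024)
print(intervals_overlap("1064.00±33.00", "978.00±27.00"))   # False
```

Under that assumption, the top two entries are not distinguishable by score alone, while the gap between ranks 9 and 10 is larger than both intervals combined.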