Frontier Leaderboards
2025 Scale AI. All rights reserved.
Arabic
Deprecated (as of March 2025)
Last updated: March 20, 2025
Performance Comparison
Rank  Model                                     Score
1     Gemini 1.5 Pro (August 27, 2024)          1147.00±33.00
2     Gemini 2.0 Pro (December 2024)            1138.00±29.00
3     Gemini 2.0 Flash Thinking (January 2025)  1120.00±29.00
4     o1 (December 2024)                        1120.00±27.00
5     Gemini 1.5 Pro (November 2024)            1116.00±28.00
6     o3-mini                                   1093.00±30.00
7     Gemini Pro Flash 2                        1090.00±28.00
8     o1-preview                                1087.00±36.00
9     GPT-4o (August 2024)                      1066.00±47.00
11    GPT-4 (November 2024)                     1011.00±26.00
12    Claude 3.5 Sonnet (June 2024)             995.00±44.00
13    Mistral Large 2                           970.00±54.00
14    Gemini 1.5 Flash                          967.00±39.00
15    Aya 23 35B*                               932.00±25.00
16    Aya Expanse 32B                           1025.00±24.00
16    Llama 3.1 405B Instruct                   875.00±55.00
17    Llama 3.3 70B Instruct                    808.00±36.00
18    Jais Adapted 70B                          787.00±25.00
19    Gemma 2 27B                               661.00±29.00