Frontier Leaderboards
© 2025 Scale AI. All rights reserved.
Adversarial Robustness
Deprecated (as of January 2025)
Last updated: April 3, 2025
Performance Comparison
Rank  Model                          Score
1     Gemini 1.5 Pro (May 2024)      8.00±8.00
2     Llama 3.1 405B Instruct        10.00±8.00
3     Claude 3 Opus                  13.00±9.00
4     Gemini 1.5 Flash               14.00±9.00
5     Claude 3.5 Sonnet (June 2024)  16.00±10.00
6     GPT-4 Turbo Preview            20.00±11.00
7     Mistral Large                  37.00±14.00
8     GPT-4o (May 2024)              67.00±17.00