Adversarial Robustness

Deprecated (as of January 2025)

Last updated: April 3, 2025

Performance Comparison

Rank  Model                          Score
1     Gemini 1.5 Pro (May 2024)       8.00 ±8.00
2     Llama 3.1 405B Instruct        10.00 ±8.00
3     Claude 3 Opus                  13.00 ±9.00
4     Gemini 1.5 Flash               14.00 ±9.00
5     Claude 3.5 Sonnet (June 2024)  16.00 ±10.00
6     GPT-4 Turbo Preview            20.00 ±11.00
7     Mistral Large                  37.00 ±14.00
8     GPT-4o (May 2024)              67.00 ±17.00