Claude 3.5 Sonnet (June 2024)   96.60±1.02
GPT-4o (August 2024)            95.68±1.15
Llama 3.1 405B Instruct         95.60±1.16
Claude 3 Opus                   95.19±1.21
GPT-4 Turbo Preview             95.10±1.22
GPT-4o (May 2024)               94.85±1.25
Gemini 1.5 Pro (August 2024)    94.69±1.27
Mistral Large 2                 93.94±1.35
Claude 3 Sonnet                 93.28±1.41
Gemini 1.5 Pro (May 2024)       92.28±1.51
Gemini 1.5 Pro (April 2024)     90.54±1.65
Llama 3 70B Instruct            90.12±1.69
Gemini 1.5 Flash                90.12±1.69
Mistral Large                   87.47±1.87
Gemini 1.0 Pro                  79.83±2.27
CodeLlama 34B Instruct          37.51±2.73