Frontier Multimodal Benchmark
18.81   +1.47 / -1.47   88.52
 8.93   +1.08 / -1.08   88.34
 8.81   +1.07 / -1.07   92.79
 7.22   +0.98 / -0.98   90.58
 7.07   +0.97 / -0.97   92.98
 6.41   +0.92 / -0.92   90.53
 5.52   +0.86 / -0.86   88.61
 5.22   +0.84 / -0.84   93.04
 5.19                   95.08
 5.07   +0.83 / -0.83   90.81
 5.04                   82.30
 4.78   +0.80 / -0.80   88.53
 4.67                   86.48
 4.63   +0.79 / -0.79   85.02
 4.56                   89.40
 4.19   +0.76 / -0.76   85.06
 4.15   +0.75 / -0.75   88.66
 3.96   +0.74 / -0.74   86.39
 3.07   +0.65 / -0.65   92.27
Rank (UB): 1 + the number of models whose lower CI bound exceeds this model’s upper CI bound.
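The Rank (UB) rule above can be sketched in a few lines of Python. This is only an illustration, not the leaderboard's own code: the function name `rank_ub` and the tuple layout are hypothetical, and it assumes each model's CI bounds are its score plus the "+" value and its score minus the "-" value. The sample rows are four score/CI pairs taken from the table, so the ranks printed here are relative to that subset only, not to the full leaderboard.

```python
from typing import List, Tuple

# Each entry is (score, ci_plus, ci_minus); this layout is an assumption
# made for the example, not something the page specifies.
Entry = Tuple[float, float, float]


def rank_ub(entries: List[Entry]) -> List[int]:
    """Return Rank (UB) for every entry, in input order.

    Rank (UB) = 1 + the number of other models whose lower CI bound
    (score - ci_minus) exceeds this model's upper CI bound (score + ci_plus).
    """
    ranks = []
    for i, (score, ci_plus, _ci_minus) in enumerate(entries):
        upper = score + ci_plus
        clearly_better = sum(
            1
            for j, (other_score, _other_plus, other_minus) in enumerate(entries)
            if j != i and (other_score - other_minus) > upper
        )
        ranks.append(1 + clearly_better)
    return ranks


# Four rows copied from the table above (score, +CI, -CI).
rows = [
    (18.81, 1.47, 1.47),
    (8.93, 1.08, 1.08),
    (8.81, 1.07, 1.07),
    (3.07, 0.65, 0.65),
]
ranks = rank_ub(rows)
print(ranks)  # [1, 2, 2, 4]
```

Note that the second and third rows share rank 2: their intervals overlap, so neither one's lower bound exceeds the other's upper bound, and only the top row is counted as clearly ahead of both.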
o1-pro coming soon.