MultiChallenge

Realistic multi-turn conversation

Rank (UB): 1 + the number of models whose lower confidence-interval (CI) bound exceeds this model’s upper CI bound.
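
As a minimal sketch of the Rank (UB) definition above, the following Python function counts, for each model, how many other models have a lower CI bound strictly above that model's upper CI bound, then adds 1. The function name `rank_upper_bound`, the model names, and the interval values are hypothetical and chosen only for illustration; they are not taken from the leaderboard.

```python
from typing import Dict, Tuple


def rank_upper_bound(intervals: Dict[str, Tuple[float, float]]) -> Dict[str, int]:
    """Compute Rank (UB) for each model.

    `intervals` maps a model name to its (lower, upper) CI bounds.
    Rank (UB) = 1 + number of other models whose lower bound
    strictly exceeds this model's upper bound.
    """
    ranks = {}
    for name, (_, upper) in intervals.items():
        clearly_better = sum(
            1
            for other, (other_lower, _) in intervals.items()
            if other != name and other_lower > upper
        )
        ranks[name] = 1 + clearly_better
    return ranks


# Hypothetical scores, for illustration only.
example = {
    "model_a": (50.0, 54.0),
    "model_b": (48.0, 52.0),
    "model_c": (40.0, 44.0),
}
print(rank_upper_bound(example))
```

Under this definition, models with overlapping intervals can share a rank: in the hypothetical data, `model_a` and `model_b` both receive rank 1, while `model_c` receives rank 3 because both other intervals lie entirely above its own.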