Note that this is NOT the newest Aya model; we are actively working on evaluating Aya Expanse.

Potential contamination warning: o1-preview was evaluated approximately 4 months after GPT-4o (May 2024), possibly allowing OpenAI to access the prompt set from API logs. However, OpenAI's policy states they don't train on these data.

Potential contamination warning: GPT-4o (August 2024) was evaluated approximately 3 months after GPT-4o (May 2024), possibly allowing OpenAI to access the prompt set from API logs. However, OpenAI's policy states they don't train on these data.

Potential contamination warning: Claude 3.5 Sonnet was evaluated six weeks after Claude 3 Opus, possibly allowing Anthropic to access the prompt set from API logs. However, Anthropic's policy states they don't train on these data.