March 9, 2026

Scale Labs is Scale’s new research hub studying how advanced AI systems behave in real-world environments. Building on the work of SEAL, the lab focuses on evaluation, agentic and multimodal systems, post-training methods, enterprise deployment, and AI risk and oversight infrastructure. Its research spans frontier capability measurement, long-horizon agent behavior, and national-scale safety evaluation.
November 20, 2025

Today, we are adding several new models to Showdown. One surprising finding is that users consistently rank GPT-5 significantly lower than other models. In this blog post, we share our preliminary analysis of GPT-5's ranking on Showdown, examining the effects of thinking effort, task type, and evaluation setting.