Company Updates & Technology Articles
July 8, 2025
A new research collaboration led by a MATS scholar and advised by a team of researchers from Anthropic, Scale, and other research institutes introduces SHADE-Arena, a benchmark for detecting and evaluating subtle sabotage by AI agents. Within 17 complex scenarios, advanced models were tasked with completing a primary goal while secretly pursuing a harmful objective, all under the watch of an AI monitor. The results show that even top models like Claude 3.7 Sonnet and Gemini 2.5 Pro rarely succeed at this deception, often making simple errors. However, the study also reveals that monitors are not yet reliable enough for safety-critical systems and that an agent's private "scratchpad" is a key vulnerability. This work establishes a vital baseline for tracking and defending against agentic risks as AI capabilities evolve.
July 1, 2025
In response to Anthropic's system card and safety testing for Claude 4 Opus and Sonnet, this post explores the complex behaviors of today's frontier AI models. In a comparative testing of reasoning models, we observed emergent behaviors that included instances of blackmail, user impersonation, and deception, with different models reacting to the scenario in unique ways. These findings contribute to the ongoing industry-wide conversation about AI safety, highlighting the nuances of model alignment and the critical importance of carefully defining system access and agency as these powerful tools evolve.
June 26, 2025
AI is approaching the limits of what it can learn from human-generated data alone. Citing pioneers like David Silver and Richard Sutton, this post explores the next great leap forward: the “Era of Experience.” Discover how AI agents will soon learn from dynamic, real-world interaction and how Scale is building the foundational infrastructure, data paradigms, and sophisticated evaluations required to realize this new era safely and responsibly.
June 23, 2025
AI superintelligence will require learning environments that mirror how humans achieve breakthroughs: combining verifiable rewards with collaborative interaction. New research from Scale demonstrates this principle in action. By creating a "student-teacher" framework where an AI receives targeted, natural language guidance when it struggles, researchers significantly accelerated learning and performance in complex reasoning and SWE tasks. This approach, which integrates dynamic feedback with verifiable outcomes, marks a real step toward building more powerful and efficient AI systems.
June 18, 2025
Thank you for being such an important part of Scale. It is an honor to step into the role of interim CEO at Scale during such a pivotal moment for our company and the broader AI landscape. I can’t thank Alex enough for what he built and I’m honored that he and the Board have chosen me to shepherd it into the future. I’m incredibly energized by the conversations I've had this week, and I want to take this opportunity to share my vision and correct some misunderstandings.
Over the past week, we’ve received thoughtful questions from our customers, partners, and contributors, asking whether Meta’s investment will affect our independence, operations, and relationships.
June 13, 2025
Jason Droege, Tech Industry Veteran and Scale Chief Strategy Officer, Named Interim CEO
June 9, 2025
At Scale, operations, engineering, and research teams work together to ensure the quality of our data. To do this, we rely on a combination of human review, automated linters, data distribution analyses, and model training experiments. In this post, we will focus on the last category and introduce Precog, our platform for running data quality experiments by training models on our own datasets.
June 5, 2025
In this episode, we break down a critical component of AI governance: red teaming. We explore how traditional safety approaches fall short in enterprise contexts, why agentic systems raise the stakes, and what it takes to build a red teaming program that scales with your AI maturity.
As advanced AI rapidly evolves, red teaming needs an updated approach. Scale researchers propose a shift to test AI systems, not just models, in real-world contexts with a focus on product safety and realistic threats.