The Scale Team

65 articles

December 19, 2025

Government

The Agentic Era: Building the Foundation for Autonomous Mission Assurance

Agentic AI marks a shift from reactive chatbots to autonomous mission partners. Government must adopt unified Agentic Infrastructure—combining resilient agent execution and governed AgentOps—to enable machine-speed decisions. Platforms like Scale’s SGP and Agentex deliver interoperable, durable, and accountable autonomy for mission assurance.

April 25, 2025

Company

Scale’s Role In Building a Safer Internet

Training AI models to behave responsibly in the real world means preparing them for the full range of online content — including the challenging parts. It’s not easy work, but it’s necessary. At Scale, we believe that building AI systems that avoid harmful, abusive, or dangerous behavior is one of the most important challenges of our time. And we’re proud to support the people who make this possible.

April 24, 2025

Company

Introducing the Scale AI and University of Missouri-St. Louis Geospatial Collaborative

As part of Scale’s ongoing investment in its AI workforce in St. Louis, Scale and the University of Missouri-St. Louis (UMSL) are officially launching a collaborative education effort.

April 3, 2025

Company

Outlier Updates to Empower Contributors

Since its inception in 2023, Outlier has become a cornerstone of the AI industry, connecting hundreds of thousands of people across the globe with meaningful and flexible work. Hailing from cities and small towns across the world, Outlier contributors have together earned hundreds of millions of dollars helping build the foundation of today’s most advanced AI models.

April 2, 2025

Product

Advancing Frontier Model Evaluation

Frontier AI development has reached an inflection point: as models rapidly advance in capabilities, the need for sophisticated evaluation has become a decisive factor in competitive success. That’s why today we're announcing updates to Scale Evaluation, our platform that helps teams identify model weaknesses and validate improvements. Our updated platform introduces four key capabilities: instant model comparison across thousands of tests, multi-dimensional performance visualization, automated error discovery, and targeted improvement guidance—all designed to help teams identify weaknesses faster and make more confident release decisions. These updates build on Scale Evaluation’s foundation introduced last year, broadening access to frontier evaluation capabilities.

March 26, 2025

Product

Scale AI products approved for purchase on AWS Marketplace for the U.S. National Security Community

Scale AI products have been approved for purchase on AWS Marketplace for the U.S. Intelligence Community (ICMP). ICMP is a digital catalog that makes it easy for customers in the U.S. national security community to find, test, buy, and deploy software that runs on AWS.

March 5, 2025

Government

Introducing Thunderforge: AI for American Defense

Scale is proud to have been awarded a prime contract by the Defense Innovation Unit (DIU) for Thunderforge, the DoD’s flagship program leveraging AI for military planning and wargaming. Thunderforge represents our commitment to advancing U.S. military capabilities. Following its initial deployment, Thunderforge will expand throughout combatant commands, leveraging Scale AI’s agentic applications and GenAI evaluation expertise.

February 27, 2025

Government

Scale AI & Center for Strategic and International Studies (CSIS) Introduce Foreign Policy Decision Benchmark

Scale AI, in collaboration with the Center for Strategic and International Studies (CSIS), is proud to introduce the Critical Foreign Policy Decision (CFPD) Benchmark—a pioneering effort to evaluate large language models (LLMs) on national security and foreign policy decision-making tendencies.

February 23, 2025

General

MCIT & Scale AI: Paving the Way for Qatar’s Digital Future

The Ministry of Communications and Information Technology (MCIT) and Scale AI, the leader in frontier AI solutions, are announcing a strategic, long-term partnership to drive Qatar’s digital transformation.

February 11, 2025

Research

Jailbreaking to Jailbreak: A Novel Approach to Safety Testing

Scale researchers have discovered a groundbreaking method for AI safety testing called J2 (Jailbreaking to Jailbreak), where language models are taught to systematically test their own and other models' safety measures. This hybrid approach combines human-like strategic reasoning with automated scalability, achieving success rates of over 90% in vulnerability testing, nearly matching professional human red-teaming effectiveness. While highlighting significant advances in automated security testing, these findings also reveal important challenges for the future of AI safety.
