AI & ML Blog | Scale AI

Loading...

Scale AI logo

Enterprise
Government

Field Unknown Model.Unknown Field

Blog

Company Updates & Technology Articles

November 7, 2025

Beyond "Out-of-the-Box": Why Enterprises Need Specialized RL Agents

While general-purpose AI models are powerful, they often fail to deliver on complex, specialized enterprise workflows that use private data. We share results from our real world work in the insurance and legal industries, highlighting how our RL-tuned agents outperformed leading LLMs and dive into how we achieved these performance gains.

November 5, 2025

Expanding Our Presence with New Offices Around the World

Scale AI is expanding offices in New York, London, Washington D.C., and St. Louis to support growth, innovation, and reliable AI development worldwide.

October 29, 2025

Why I Joined Scale: Building the Applications for Saudi Arabia's AI Future

Talal AlBakr joins Scale AI to build production-ready AI applications that power Saudi Arabia’s Vision 2030.

October 29, 2025

The Remote Labor Index: Measuring the Automation of Work

Can AI actually automate complex, professional jobs? The new Remote Labor Index (RLI) from Scale and the Center for AI Safety (CAIS) provides the first data-driven answer. By testing AI agents against 240 real-world, paid freelance projects, the RLI found that the best-performing agents could only successfully automate 2.5% of them. This new benchmark reveals a critical gap between AI's generative skill and the end-to-end reliability required for professional work, showing the immediate impact is augmentation, not mass automation.

October 28, 2025

Scale AI Partners with Korea’s AI Safety Institute to Advance Global AI Evaluation and Governance

Scale AI Partners with Korea’s AI Safety Institute to Advance Global AI Evaluation and Governance

October 27, 2025

Beyond Code Exploits: Red Teaming the New AI Attack Surface

Your cybersecurity playbook is obsolete. In the age of AI, the greatest risks aren't traditional code exploits but unpredictable model behaviors—from prompt injections and data leakage to emergent misuse. Drawing on insights from live red teaming exercises with members of Congress, NATO, and the UK Parliament, AI security expert David Campbell explains why we must treat the model itself as the new attack surface. This post unveils an enterprise playbook for proactive AI red teaming, moving beyond static checks to continuously test systems like an adversary. Learn how to map, score, and measure AI risks to get ahead of the threat before an incident occurs.

October 22, 2025

Partnering with Lindbergh Schools to Prepare the Next Generation for the Age of AI

As part of our Pledge to America’s Youth, Scale AI is helping bring AI literacy into classrooms across America, starting in St. Louis.

October 16, 2025

Agentic infra is the problem you're probably not thinking about | Human in the Loop: Episode 14

Today on the podcast, the team is talking about the latest with enterprise agents including the problem you're probably not thinking about but should: agentic infrastructure.

October 16, 2025

Why I Joined Scale: Answering the AI Revolution's Biggest Challenge

The world is buzzing about the potential of artificial intelligence. We've seen groups proclaiming a "$10 Trillion AI Revolution" and comparing its impact to the Industrial Revolution. In the last 6-12 months, we've watched as true Generative AI solutions have begun to move from the lab and into the enterprise.

October 16, 2025

VisualToolBench: Testing the Limits of AI Vision

Our new benchmark, VisualToolBench, reveals a striking limitation in today's most advanced AI: models are much better at "thinking about images" than "thinking with them." While AI can describe what's in a picture, it fails when asked to manipulate an image by cropping, editing, or enhancing it to solve complex, real-world problems. The results are stark, with no model scoring above 19% correctness. Dive into our breakdown of why even the best models fail, what this reveals about the core bottleneck of visual perception, and how these findings create a new roadmap for the future of AI.

The future of your industry starts here

Products
- Scale Data Engine
- Scale GenAI Platform
- Scale Donovan
- Government
  - Public Sector
Company
Resources
Guides
Follow Us

Copyright © 2025 Scale AI, Inc. All rights reserved.Terms of Use & Privacy Policy