Company Updates & Technology Articles
April 24, 2025
Welcome to Human in the Loop with Scale AI. We're kicking off with an episode diving into the current AI agent landscape and covering what’s important for enterprises to move beyond demos to real, reliable agentic systems.
As part of Scale’s ongoing investment in its AI workforce in St. Louis, Scale and the University of Missouri-St. Louis (UMSL) are officially launching a collaborative education effort.
April 17, 2025
When we evaluated o3 and o4-mini on Humanity’s Last Exam, we noticed that their calibration errors were significantly lower than those of their predecessors. A well-calibrated model is like someone who knows when they are likely to be right or wrong. If a well-calibrated model says it’s 70% confident on a set of questions, it should be correct on about 70% of them. Calibration error measures the gap between the model’s stated confidence and its actual accuracy; ideally it is 0%. Previously benchmarked models all exhibited much higher calibration errors. Is the newer generation of reasoning models from OpenAI truly better calibrated?
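To make the idea of calibration error concrete, here is a minimal sketch of one common estimator, the binned expected calibration error. It is an illustration only and not necessarily the exact metric used for the benchmark above. The function name, the bin count, and the sample data are all assumptions made for this example.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed bin count, chosen for illustration
) -> float:
    """Binned gap between stated confidence and observed accuracy.

    confidences: stated confidence per answer, each in [0, 1].
    correct: whether each answer was actually right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's gap by the share of predictions it holds.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical data: ten answers at 70% confidence, seven of them right.
well_calibrated = expected_calibration_error([0.7] * 10, [True] * 7 + [False] * 3)

# Hypothetical data: ten answers at 90% confidence, only five of them right.
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
```

In this sketch the first case yields an error of essentially zero, because stated confidence matches accuracy, while the second yields a gap of roughly 0.4, because the model claims 90% confidence but is right only half the time.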
April 14, 2025
As LLMs become more sophisticated, maintaining a distinct human voice isn't just stylistic—it's essential. Explore why your unique perspective matters more than ever and learn actionable techniques for working with LLMs to enhance your writing process while keeping your authentic voice front and center.
April 3, 2025
Since its inception in 2023, Outlier has become a cornerstone of the AI industry, connecting hundreds of thousands of people around the globe with meaningful and flexible work. Hailing from cities and small towns worldwide, Outlier contributors have together earned hundreds of millions of dollars while helping to build the foundation of today’s most advanced AI models.
April 2, 2025
Frontier AI development has reached an inflection point: as models rapidly advance in capabilities, the need for sophisticated evaluation has become a decisive factor in competitive success. That’s why today we're announcing updates to Scale Evaluation, our platform that helps teams identify model weaknesses and validate improvements. Our updated platform introduces four key capabilities: instant model comparison across thousands of tests, multi-dimensional performance visualization, automated error discovery, and targeted improvement guidance—all designed to help teams identify weaknesses faster and make more confident release decisions. These updates build on Scale Evaluation’s foundation introduced last year, broadening access to frontier evaluation capabilities.
March 26, 2025
Scale AI products have been approved for purchase on AWS Marketplace for the U.S. Intelligence Community (ICMP). ICMP is a digital catalog that makes it easy for customers in the U.S. national security community to find, test, buy, and deploy software that runs on AWS.
March 17, 2025
The next four years will be critical to the future of AI leadership around the world, and the case for bold action has never been clearer. More than one year ago the United States was leading the world in the development of AI systems, but today that is no longer the case. Chinese AI advancements, most notably the launch of DeepSeek, have shown that China has closed the gap and that the race is now nearly tied. It is not enough for the United States to match China’s intensity on AI; we must exceed it or, simply put, we lose. President Trump rightly called DeepSeek’s release “a wake up call,” and now the US needs to heed the call to action and determine how best to respond in order to win.
March 5, 2025
Scale is proud to have been awarded a prime contract by the Defense Innovation Unit (DIU) for Thunderforge, the DoD’s flagship program leveraging AI for military planning and wargaming. Thunderforge represents our commitment to advancing U.S. military capabilities. Following its initial deployment, Thunderforge will expand throughout combatant commands, leveraging Scale AI's agentic applications and GenAI evaluation expertise.
March 3, 2025
Scale AI, a leader in building frontier AI solutions, and Inception, a G42 company developing AI-native products for enterprises, have announced a strategic partnership aimed at accelerating global AI adoption across the public and private sectors. The partnership agreement was signed by Ashish Koshy, COO of Inception, and Trevor Thompson, Global Managing Director at Scale AI.