Scale’s Series F: Expanding the Data Foundry for AI

May 21, 2024

For 8 years, Scale has been the leading AI data foundry helping fuel the most exciting advancements in AI, including autonomous vehicles, defense applications, and generative AI.

Today, we’re announcing that Scale has closed a $1B financing transaction at a $13.8B valuation. The financing, a mix of primary and secondary, is led by existing investor Accel with nearly all of our existing investors participating: Y Combinator, Nat Friedman, Index Ventures, Founders Fund, Coatue, Thrive Capital, Spark Capital, NVIDIA, Tiger Global Management, Greenoaks, and Wellington Management. We are also excited to welcome new investors: Cisco Investments, DFJ Growth, Intel Capital, ServiceNow Ventures, AMD Ventures, WCM, Amazon, Elad Gil, Meta, and Qualcomm Ventures.

With this milestone, here’s where we are on our journey and what comes next.

Becoming the Data Foundry for AI

In 2016, I was studying AI at MIT. Even then, it was clear that AI is built from three fundamental pillars: data, compute, and algorithms. I founded Scale to supply the data pillar that advances AI by fueling its entire development lifecycle. 

Over the past 8 years, Scale has powered breakthroughs in nearly every major field of AI:

  • Scale’s Autonomy Data Engine powered breakthroughs in L4 autonomy.

  • Scale’s Public Sector Data Engine has powered many major AI programs within the U.S. Department of Defense.

  • Scale partnered with OpenAI in the first experiments of reinforcement learning from human feedback (RLHF) on GPT-2, and scaled these techniques to InstructGPT and beyond.

  • Scale worked on the White House-supported DEFCON 31 red-teaming event and with the U.S. Department of Defense on rigorous evaluation, testing, and red-teaming of LLMs.

Today, Scale supplies data to power nearly every leading AI model, serving organizations like OpenAI, Meta, Microsoft, and more.

The Next Phase: Data Abundance for Frontier AI

Major problems in AI data remain. The scaling laws imply an exponentially growing need for data as models get bigger, which raises a key question: will we run out of data?

Just as data, compute, and algorithms make up the three pillars of AI, we believe the future of AI data in turn rests on three principles:

  • Data Abundance: We must build the data foundry that ushers in an era of AI-ready data abundance, and not resign ourselves to data scarcity. 

  • Frontier Data: As we develop progressively more powerful AI, we must build frontier data that always pushes the boundaries of AI capabilities towards complex reasoning, agents, multimodality, and more.

  • Measurement and Evaluation: We must build an evaluation system that enables measurement of AI to build confidence, drive adoption, and scale impact.

Abundance is not the default; it’s a choice. It requires bringing together the best minds in engineering, operations, and AI.

Our calling is to build the data foundry for AI, and with today’s funding, we’re moving into the next phase of that journey – accelerating the abundance of frontier data that will pave our road to AGI.

There’s a lot left to do. If this challenge excites you, join us.

