Please rotate your device for the best experience.

Log inBook demoBook demo
←Blog

Madhu Sehwag

November 25, 2025

Research

Crumbling Under Pressure: PropensityBench Reveals AI’s Weaknesses

Crumbling Under Pressure: PropensityBench Reveals AI’s Weaknesses

To measure the propensity of agents to make unsafe choices, Scale, the University of Maryland, and other collaborators developed PropensityBench. This benchmark simulates real-world pressure by allowing agents to choose between a safe approach that consistently fails and a functional, harmful shortcut, revealing their true inclinations. The benchmark reveals that agent safety compromises significantly under pressure.

Read more

Scale AI's logo

Products

Scale data engineScale GenAI PlatformScale Donovan

Solutions

EnterpriseInsuranceHealthcareUS Public SectorGlobal Public Sector

Company

AboutCareersSecurityTermsPrivacyModern Slavery Statement

Resources

BlogContact UsEventsDocumentation

Guides

Data LabelingML Model TrainingDiffusion ModelsGuide to AI for eCommerceComputer Vision ApplicationsLarge Language Models

Reliable AI for the world’s most important decisions

Manage your 

Copyright © 2026 Scale AI, Inc. All rights reserved

Terms of Use & Privacy Policy