Role Overview

Scale’s rapidly growing International Public Sector team is focused on using AI to address critical challenges facing the public sector around the world. Our core work consists of:

Creating custom AI applications that will impact millions of citizens

Generating high-quality training data for national LLMs

Providing upskilling and advisory services to spread the impact of AI

As a Production AI Ops Lead, you will design and develop the production lifecycle of full-stack AI applications, while supporting end-to-end system reliability, real-time inference observability, sovereign data orchestration, high-security software integration, and the resilient cloud infrastructure required for our international government partners.

At Scale, we’re not just building AI solutions; we’re enabling the public sector to transform its operations and better serve citizens through cutting-edge technology. If you’re ready to shape the future of AI in the public sector and be a founding member of our team, we’d love to hear from you.

You will:

Take full accountability for the long-term performance and reliability of AI use cases deployed across international government agencies.

Ensure full-stack integrity: Oversee the end-to-end health of the platform, ensuring seamless integration between the AI core and all full-stack components, from APIs to UI, to maintain a responsive and production-ready environment.

Lead the response for production issues in mission-critical environments, ensuring rapid resolution and building the guardrails to prevent them from happening again.

Bridge the gap: Translate deep technical performance metrics into clear insights for senior international government officials.

Drive product evolution: Partner with our Engineering and ML teams to ensure the lessons learned in the field directly influence the technical architecture and decisions of future use cases.

Ideally, you have:

You treat every production deployment as your own. You race toward solving hard problems before the customer even sees them.

Reliability: You understand that in the public sector, a model failure may be a risk to public safety or privacy.

Customer communication: The ability to explain to a high-ranking official why the performance of the system has degraded and how we are fixing it.