A Holistic Approach for Test and Evaluation of Large Language Models

Dylan Slack*, Jean Wang*, Denis Semenenko*, Kate Park, Sean Hendryx

*Equal Contribution

Abstract: As large language models (LLMs) become increasingly prevalent across diverse applications, ensuring the utility and safety of model generations becomes paramount. We present a holistic approach for the test and evaluation of LLMs, encompassing the design of evaluation taxonomies, a novel framework for safety testing, and hybrid methodologies that improve the scalability and reduce the cost of evaluations. In particular, we introduce a hybrid methodology for LLM evaluation that leverages both human expertise and AI assistance. This methodology generalizes across both LLM capabilities and safety, accurately identifying areas where AI assistance can be used to automate evaluation. Similarly, we find that by combining automated evaluations, generalist red teamers, and expert red teamers, we can discover new vulnerabilities more efficiently. We share our approach in the hope that it contributes to the development of common standards and practices for the test and evaluation of LLMs.