Principled OOD Detection via Multiple Testing


We study the problem of out-of-distribution (OOD) detection, that is, detecting whether a machine learning (ML) model's output can be trusted at inference time. While a number of tests for OOD detection have been proposed in prior work, a formal framework for studying this problem is lacking. We propose a definition of OOD that includes both the input distribution and the ML model, which provides insights for the construction of powerful tests for OOD detection. We also propose a procedure, inspired by multiple hypothesis testing, that systematically combines any number of different statistics from the ML model using conformal p-values. We further provide strong guarantees on the probability of incorrectly classifying an in-distribution sample as OOD. In our experiments, we find that threshold-based tests proposed in prior work perform well in specific settings, but not uniformly well across different OOD instances. In contrast, our proposed method, which combines multiple statistics, performs uniformly well across different datasets and neural network architectures.
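To make the idea of combining several statistics through conformal p-values concrete, here is a minimal sketch. It is not the procedure from the paper, which the abstract does not spell out. It swaps in a plain Bonferroni correction as a simple stand-in. The function names, the toy Gaussian scores, and the convention that a larger score means "more OOD-like" are all assumptions made for illustration.

```python
import random


def conformal_p_value(cal_scores, test_score):
    """Conformal p-value for one statistic.

    cal_scores: scores computed on held-out in-distribution samples.
    Assumed convention (hypothetical): larger score = more OOD-like.
    """
    n = len(cal_scores)
    at_least_as_extreme = sum(1 for s in cal_scores if s >= test_score)
    return (1 + at_least_as_extreme) / (n + 1)


def flag_ood(cal_scores_per_stat, test_scores, alpha=0.05):
    """Combine K statistics with a Bonferroni correction (a stand-in rule).

    Flags the sample as OOD if any of the K conformal p-values is at most
    alpha / K. Returns the decision together with the p-values.
    """
    k = len(test_scores)
    p_values = [
        conformal_p_value(cal, score)
        for cal, score in zip(cal_scores_per_stat, test_scores)
    ]
    return min(p_values) <= alpha / k, p_values


# Toy data: three statistics, 999 calibration scores each (illustrative only).
rng = random.Random(0)
cal = [[rng.gauss(0, 1) for _ in range(999)] for _ in range(3)]

# A sample whose scores all look like the calibration scores.
is_ood_typical, p_typical = flag_ood(cal, [0.0, 0.1, -0.2])

# A sample for which one statistic is far outside the calibration range.
is_ood_extreme, p_extreme = flag_ood(cal, [0.0, 10.0, -0.2])
```

If the calibration scores and the scores of an in-distribution test sample are exchangeable, each conformal p-value is at most t with probability at most t, so the union bound keeps the probability of flagging an in-distribution sample at or below alpha. Bonferroni is conservative, so a less conservative multiple-testing rule may detect more OOD samples at the same level.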

In Journal of Machine Learning Research, 2023
Susmit Jha
Technical Director, NuSCI

My research interests include artificial intelligence, formal methods, machine learning and dynamical systems.

Anirban Roy
Senior Computer Scientist

Anirban Roy is a Senior Computer Scientist at SRI International. His current interests include generative models, assured machine learning, AI for creativity and design, and AI for education. In the recent past, he has worked on activity recognition, object recognition, and multi-object tracking. He has led or been involved in multiple government and commercial projects with clients including DARPA, IARPA, NSF, and ARL.