URSABench: A System for Comprehensive Benchmarking of Bayesian Deep Neural Network Models and Inference methods


While deep learning methods continue to improve in predictive accuracy across a wide range of application domains, significant issues remain with other aspects of their performance, including their ability to quantify uncertainty and their robustness. Recent advances in approximate Bayesian inference hold significant promise for addressing these concerns, but the computational scalability of these methods can be problematic when applied to large-scale models. In this paper, we present URSABench (the Uncertainty, Robustness, Scalability, and Accuracy Benchmark), an open-source suite of models, inference methods, tasks, and benchmarking tools. URSABench supports comprehensive assessment of Bayesian deep learning models and approximate Bayesian inference methods, with a focus on classification tasks performed on both server and edge GPUs.

In the Conference on Machine Learning and Systems 2022

For code, please refer to URSABench.