Concept-based Analysis of Neural Networks via Vision-Language Models

Ravi Mangal, Nina Narodytska, Divya Gopinath, Boyue Caroline Hu, Anirban Roy, Susmit Jha, Corina Pasareanu

January 2024

Abstract

The analysis of vision-based deep neural networks (DNNs) is highly desirable but very challenging, owing to the difficulty of expressing formal specifications for vision tasks and the lack of efficient verification procedures. In this paper, we propose to leverage emerging multimodal vision-language foundation models (VLMs) as a lens through which we can reason about vision models. VLMs have been trained on a large body of images accompanied by their textual descriptions, and are thus implicitly aware of high-level, human-understandable concepts describing the images. We describe a logical specification language, 𝙲𝚘𝚗𝚜𝚙𝚎𝚌, designed to facilitate writing specifications in terms of these concepts. To define and formally check 𝙲𝚘𝚗𝚜𝚙𝚎𝚌 specifications, we build a map between the internal representations of a given vision model and those of a VLM, leading to an efficient verification procedure for natural-language properties of vision models. We demonstrate our techniques on a ResNet-based classifier trained on the RIVAL-10 dataset, using CLIP as the multimodal model.
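As a rough illustration of the idea in the abstract, the sketch below maps a vision model's internal representation into a VLM's embedding space and then compares how strongly two concepts are present. It is a minimal sketch under stated assumptions, not the paper's implementation: the affine form of the map, the cosine-similarity comparison, the function names (`apply_affine`, `stronger`), and all numeric values are hypothetical. In practice the map would be learned from data and the concept vectors would come from the VLM's text encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def apply_affine(weights, bias, z):
    """Map a vision-model representation z into the VLM embedding space.

    ASSUMPTION: the map is affine (weights @ z + bias); the abstract only
    says that a map between the two representation spaces is built.
    """
    return [sum(w * x for w, x in zip(row, z)) + b
            for row, b in zip(weights, bias)]

def stronger(z, concept_a, concept_b, weights, bias, text_embeddings):
    """Hypothetical predicate: is concept_a more strongly present than
    concept_b for representation z, judged by cosine similarity between
    the mapped representation and each concept's text embedding?"""
    mapped = apply_affine(weights, bias, z)
    return (cosine(mapped, text_embeddings[concept_a])
            > cosine(mapped, text_embeddings[concept_b]))

# Toy, hand-written values for illustration only (not learned, not from CLIP).
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # 2-d vision space -> 3-d VLM space
b = [0.0, 0.0, 0.0]
TEXT = {"wheels": [1.0, 0.0, 0.5], "wings": [0.0, 1.0, 0.5]}

z_truck = [2.0, 0.1]                        # made-up representation of a truck image
result = stronger(z_truck, "wheels", "wings", W, b, TEXT)
print(result)
```

A specification over concepts could then be phrased as a requirement on such predicates, for example that every input the classifier labels as a truck satisfies `stronger(z, "wheels", "wings", ...)`.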

Type

Conference paper

Publication

In 7th Symposium on AI Verification 2024

Deep Learning, Formal Methods

Susmit Jha

Technical Director, NuSCI
