CPS

Neuro-symbolic Co-designer for Symbiotic Design of Cyber Physical Systems (DARPA) : [SRI funding: 5.63M ]

ALES in Assured Autonomy (DARPA) [SRI funding: 2.02M]

Self-Improving Cyber-Physical Systems (NSF CPS Small) [SRI funding: 500K]

iDECODe: In-distribution Equivariance for Conformal Out-of-distribution Detection

Machine learning methods such as deep neural networks (DNNs), despite their success across different domains, are known to often generate incorrect predictions with high confidence on inputs outside their training distribution. The deployment of DNNs in safety-critical domains requires detection of out-of-distribution (OOD) data so that DNNs can abstain from making predictions on those. A number of methods have been recently developed for OOD detection, but there is still room for improvement. We propose the new method iDECODe, leveraging in-distribution equivariance for conformal OOD detection. It relies on a novel base non-conformity measure and a new aggregation method, used in the inductive conformal anomaly detection framework, thereby guaranteeing a bounded false detection rate. We demonstrate the efficacy of iDECODe by experiments on image and audio datasets, obtaining state-of-the-art results. We also show that iDECODe can detect adversarial examples.

Shaping Noise for Robust Attributions in Neural Stochastic Differential Equations

It has recently been shown that neural SDEs with Brownian motion as noise lead to smoother attributions than traditional ResNets. Various attribution methods such as saliency maps, integrated gradients, DeepSHAP and DeepLIFT have been shown to be more robust for neural SDEs than ResNets using the recently proposed sensitivity metric. In this paper, we show that neural SDEs with adaptive attribution-driven noise lead to even more robust attributions and smaller sensitivity metrics than traditional neural SDEs with Brownian motion as noise. In particular, attribution-driven shaping of noise leads to 6.7\%, 6.9\% and 19.4\% smaller sensitivity metric for integrated gradients computed on three discrete approximations of neural SDEs with standard Brownian motion noise- stochastic ResNet-50, WideResNet-101 and ResNeXt-101 models respectively. The neural SDE model with adaptive attribution-driven noise leads to 25.7\% and 4.8\% improvement in the SIC metric over traditional ResNets and Neural SDEs with Brownian motion as noise. To the best of our knowledge, we are the first to propose the use of attributions for shaping the noise injected in neural SDEs, and demonstrate that this process leads to more robust attributions than traditional neural SDEs with standard Brownian motion as noise.

Detecting out-of-context objects using graph contextual reasoning network.

This paper presents an approach to detect out-of-context (OOC) objects in an image. Given an image with a set of objects, our goal is to determine if an object is inconsistent with the scene context and detect the OOC object with a bounding box. In this work, we consider commonly explored contextual relations such as co-occurrence relations, the relative size of an object with respect to other objects, and the position of the object in the scene. We posit that contextual cues are useful to determine object labels for in-context objects and inconsistent context cues are detrimental to determining object labels for out-of-context objects. To realize this hypothesis, we propose a graph contextual reasoning network (GCRN) to detect OOC objects. GCRN consists of two separate graphs to predict object labels based on the contextual cues in the image - 1) a representation graph to learn object features based on the neighboring objects and 2) a context graph to explicitly capture contextual cues from the neighboring objects. GCRN explicitly captures the contextual cues to improve the detection of in-context objects and identify objects that violate contextual relations. In order to evaluate our approach, we create a large-scale dataset by adding OOC object instances to the COCO images. We also evaluate on recent OCD benchmark. Our results show that GCRN outperforms competitive baselines in detecting OOC objects and correctly detecting in-context objects.

Principal Manifold Flows

Normalizing flows map an independent set of latent variables to their samples using a bijective transformation. Despite the exact correspondence between samples and latent variables, their high level relationship is not well understood. In this paper we characterize the geometric structure of flows using principal manifolds and understand the relationship between latent variables and samples using contours. We introduce a novel class of normalizing flows, called principal component flows (PCF), whose contours are its principal manifolds, and a variant for injective flows (iPCF) that is more efficient to train than regular injective flows. PCFs can be constructed using any flow architecture, are trained with a regularized maximum likelihood objective and can perform density estimation on all of their principal manifolds. In our experiments we show that PCFs and iPCFs are able to learn the principal manifolds over a variation of datasets. Additionally, we show that PCFs can perform density estimation on data that lie on a manifold with variable dimensionality, which is not possible with existing normalizing flows.

Trinity: Trust, Resilience and Interpretability of Machine Learning Models

Despite the remarkable strides over the last decade in the performance of machine learning techniques, their applications are typically limited to nonadversarial benign environments. The use of deep learning in applications such as biometric recognition, and intrusion detection, require them to operate in adversarial environments. But the overwhelming empirical studies and theoretical results have shown that these methods are extremely fragile and susceptible to adversarial attacks. The rationale for why these methods make the decisions they do are also notoriously difficult to interpret; understanding such rationale may be crucial for the aforementioned applications. In this chapter, we discuss the connections between these related challenges, and describe a novel integrated approach, Trinity (Trust, Resilience and INterpretabilITY ), for analyzing these models.

On Smoother Attributions using Neural Stochastic Differential Equations

Several methods have recently been developed for computing attributions of a neural network's prediction over the input features. However, these existing approaches for computing attributions are noisy and not robust to small perturbations of the input. This paper uses the recently identified connection between dynamical systems and residual neural networks to show that the attributions computed over neural stochastic differential equations (SDEs) are less noisy, visually sharper, and quantitatively more robust. Using dynamical systems theory, we theoretically analyze the robustness of these attributions. We also experimentally demonstrate the efficacy of our approach in providing smoother, visually sharper and quantitatively robust attributions by computing attributions for ImageNet images using ResNet-50, WideResNet-101 models and ResNeXt-101 models.

Learning Certified Control Using Contraction Metric

In this paper, we solve the problem of finding a certified control policy that drives a robot from any given initial state and under any bounded disturbance to the desired reference trajectory, with guarantees on the convergence or bounds on the tracking error. Such a controller is crucial in safe motion planning. We leverage the advanced theory in Control Contraction Metric and design a learning framework based on neural networks to co-synthesize the contraction metric and the controller for control-affine systems. We further provide methods to validate the convergence and bounded error guarantees. We demonstrate the performance of our method using a suite of challenging robotic models, including models with learned dynamics as neural networks. We compare our approach with leading methods using sum-of-squares programming, reinforcement learning, and model predictive control. Results show that our methods indeed can handle a broader class of systems with less tracking error and faster execution speed.