Source author record

Prasad Chalasani

Prasad Chalasani appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computational Complexity math.CO

Catalog footprint

What is connected

4works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods

We study the problem of learning Granger causality between event types from asynchronous, interdependent, multi-type event sequences. Existing work suffers from either limited model flexibility or poor model explainability and thus fails to uncover Granger causality across a wide variety of event sequences with diverse event interdependency. To address these weaknesses, we propose CAUSE (Causality from AttribUtions on Sequence of Events), a novel framework for the studied task. The key idea of CAUSE is to first implicitly capture the underlying event interdependency by fitting a neural point process, and then extract from the process a Granger causality statistic using an axiomatic attribution method. Across multiple datasets riddled with diverse event interdependency, we demonstrate that CAUSE achieves superior performance on correctly inferring the inter-type Granger causality over a range of state-of-the-art methods.

preprint2020arXiv

Concise Explanations of Neural Networks using Adversarial Training

We show new connections between adversarial learning and explainability for deep neural networks (DNNs). One form of explanation of the output of a neural network model in terms of its input features, is a vector of feature-attributions. Two desirable characteristics of an attribution-based explanation are: (1) $\textit{sparseness}$: the attributions of irrelevant or weakly relevant features should be negligible, thus resulting in $\textit{concise}$ explanations in terms of the significant features, and (2) $\textit{stability}$: it should not vary significantly within a small local neighborhood of the input. Our first contribution is a theoretical exploration of how these two properties (when using attributions based on Integrated Gradients, or IG) are related to adversarial training, for a class of 1-layer networks (which includes logistic regression models for binary and multi-class classification); for these networks we show that (a) adversarial training using an $\ell_\infty$-bounded adversary produces models with sparse attribution vectors, and (b) natural model-training while encouraging stable explanations (via an extra term in the loss function), is equivalent to adversarial training. Our second contribution is an empirical verification of phenomenon (a), which we show, somewhat surprisingly, occurs $\textit{not only}$ $\textit{in 1-layer networks}$, $\textit{but also DNNs}$ $\textit{trained on }$ $\textit{standard image datasets}$, and extends beyond IG-based attributions, to those based on DeepSHAP: adversarial training with $\ell_\infty$-bounded perturbations yields significantly sparser attribution vectors, with little degradation in performance on natural test data, compared to natural training. Moreover, the sparseness of the attribution vectors is significantly better than that achievable via $\ell_1$-regularized natural training.

preprint2020arXiv

Representation Bayesian Risk Decompositions and Multi-Source Domain Adaptation

We consider representation learning (hypothesis class $\mathcal{H} = \mathcal{F}\circ\mathcal{G}$) where training and test distributions can be different. Recent studies provide hints and failure examples for domain invariant representation learning, a common approach for this problem, but the explanations provided are somewhat different and do not provide a unified picture. In this paper, we provide new decompositions of risk which give finer-grained explanations and clarify potential generalization issues. For Single-Source Domain Adaptation, we give an exact decomposition (an equality) of the target risk, via a natural hybrid argument, as sum of three factors: (1) source risk, (2) representation conditional label divergence, and (3) representation covariate shift. We derive a similar decomposition for the Multi-Source case. These decompositions reveal factors (2) and (3) as the precise reasons for failure to generalize. For example, we demonstrate that domain adversarial neural networks (DANN) attempt to regularize for (3) but miss (2), while a recent technique Invariant Risk Minimization (IRM) attempts to account for (2) but does not consider (3). We also verify our observations experimentally.

preprint1994arXiv

An on-line algorithm for improving performance in navigation

Recent papers have shown optimally-competitive on-line strategies for a robot traveling from a point $s$ to a point $t$ in certain unknown geometric environments. We consider the question: Having gained some partial information about the scene on its first trip from $s$ to $t$, can the robot improve its performance on subsequent trips it might make? This is a type of on-line problem where a strategy must exploit {\em partial information \/} about the future (e.g., about obstacles that lie ahead). For scenes with axis-parallel rectangular obstacles where the Euclidean distance between $s$ and $t$ is $n$, we present a deterministic algorithm whose {\em average\/} trip length after $k$ trips, $k \leq n$, is $O(\rootnbyk)$ times the length of the shortest $s$-$t$ path in the scene. We also show that this is the best a deterministic strategy can do. This algorithm can be thought of as performing an optimal tradeoff between search effort and the goodness of the path found. We improve this algorithm so that for {\em every\/} $i \leq n$, the robot's $i$th trip length is $O(\rootnbyi)$ times the shortest $s$-$t$ path length. A key idea of the paper is that a {\em tree\/} structure can be defined in the scene, where the nodes are portions of certain obstacles and the edges are ``short'' paths from a node to its children. The core of our algorithms is an on-line strategy for traversing this tree optimally.

Prasad Chalasani

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods

Concise Explanations of Neural Networks using Adversarial Training

Representation Bayesian Risk Decompositions and Multi-Source Domain Adaptation

An on-line algorithm for improving performance in navigation