Researcher profile

Chetan Gupta

Chetan Gupta contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
8 works
0 followers
9 topics
4 close collaborators

Published work

8 published items

preprint | 2023 | arXiv

CDA: Contrastive-adversarial Domain Adaptation

Recent advances in domain adaptation reveal that adversarial learning on deep neural networks can learn domain-invariant features to reduce the shift between source and target domains. While such adversarial approaches achieve domain-level alignment, they ignore the class (label) shift. When class-conditional data distributions are significantly different between the source and target domains, such alignment can generate ambiguous features near class boundaries that are more likely to be misclassified. In this work, we propose a two-stage model for domain adaptation called Contrastive-adversarial Domain Adaptation (CDA). While the adversarial component facilitates domain-level alignment, two-stage contrastive learning exploits class information to achieve higher intra-class compactness across domains, resulting in well-separated decision boundaries. Furthermore, the proposed contrastive framework is designed as a plug-and-play module that can be easily embedded into existing adversarial methods for domain adaptation. We conduct experiments on two widely used benchmark datasets for domain adaptation, namely Office-31 and Digits-5, and demonstrate that CDA achieves state-of-the-art results on both datasets.

preprint | 2022 | arXiv

K-nearest Multi-agent Deep Reinforcement Learning for Collaborative Tasks with a Variable Number of Agents

Traditionally, the performance of multi-agent deep reinforcement learning algorithms is demonstrated and validated in gaming environments, where the number of agents is often fixed. In many industrial applications, the number of available agents can change on any given day, and even when the number of agents is known ahead of time, it is common for an agent to break during operation and become unavailable for a period of time. In this paper, we propose a new deep reinforcement learning algorithm for multi-agent collaborative tasks with a variable number of agents. We demonstrate the application of our algorithm using a fleet management simulator developed by Hitachi to generate realistic scenarios in a production site.

preprint | 2022 | arXiv

Sparse Matrix Multiplication in the Low-Bandwidth Model

We study matrix multiplication in the low-bandwidth model: there are $n$ computers, and we need to compute the product of two $n \times n$ matrices. Initially computer $i$ knows row $i$ of each input matrix. In one communication round each computer can send and receive one $O(\log n)$-bit message. Eventually computer $i$ has to output row $i$ of the product matrix. We seek to understand the complexity of this problem in the uniformly sparse case: each row and column of each input matrix has at most $d$ nonzeros, and in the product matrix we only need to know the values of at most $d$ elements in each row or column. This is exactly the setting that we have, e.g., when we apply matrix multiplication for triangle detection in graphs of maximum degree $d$. We focus on the supported setting: the structure of the matrices is known in advance; only the numerical values of nonzero elements are unknown. There is a trivial algorithm that solves the problem in $O(d^2)$ rounds, but for large $d$, better algorithms are known to exist; in the moderately dense regime the problem can be solved in $O(dn^{1/3})$ communication rounds, and for very large $d$, the dominant solution is the fast matrix multiplication algorithm using $O(n^{1.158})$ communication rounds (for matrix multiplication over fields and rings supporting fast matrix multiplication). In this work we show that it is possible to overcome the quadratic barrier for all values of $d$: we present an algorithm that solves the problem in $O(d^{1.907})$ rounds for fields and rings supporting fast matrix multiplication and $O(d^{1.927})$ rounds for semirings, independent of $n$.
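
The $O(d^2)$ figure for the trivial algorithm can be motivated by a simple count. The sketch below is an illustration under assumptions, not the paper's algorithm: it stores a uniformly sparse matrix as row dictionaries (the helper `sparse_rows` and the banded test pattern are hypothetical), multiplies two such matrices sequentially, and records how many entries of the second matrix each output row reads. With at most $d$ nonzeros per row, that count never exceeds $d^2$, which is consistent with a computer that receives one value per round needing on the order of $d^2$ rounds.

```python
from collections import defaultdict

def sparse_rows(n, d, offset=0):
    """Hypothetical test matrix as a list of {column: value} rows.

    Row i has nonzeros in columns i+offset, ..., i+offset+d-1 (mod n),
    so every row and every column has exactly d nonzeros when d <= n.
    """
    return [{(i + offset + s) % n: float(i + s + 1) for s in range(d)}
            for i in range(n)]

def multiply_and_count(a_rows, b_rows):
    """Return (product rows, number of B-entries read per output row)."""
    product, reads = [], []
    for a_row in a_rows:
        out = defaultdict(float)
        count = 0
        for k, a_ik in a_row.items():          # at most d nonzeros in row i of A
            for j, b_kj in b_rows[k].items():  # at most d nonzeros in row k of B
                out[j] += a_ik * b_kj
                count += 1
        product.append(dict(out))
        reads.append(count)
    return product, reads

n, d = 12, 3
A = sparse_rows(n, d)
B = sparse_rows(n, d, offset=2)
C, reads = multiply_and_count(A, B)
print(max(reads), d * d)
```

The count depends only on $d$, not on $n$, which matches the abstract's emphasis on bounds that are independent of $n$.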

preprint | 2020 | arXiv

Dynamic Dispatching for Large-Scale Heterogeneous Fleet via Multi-agent Deep Reinforcement Learning

Dynamic dispatching is one of the core problems for operation optimization in traditional industries such as mining, as it is about how to smartly allocate the right resources to the right place at the right time. Conventionally, the industry relies on heuristics or even human intuition, which often yield short-sighted and sub-optimal solutions. Leveraging the power of AI and the Internet of Things (IoT), data-driven automation is reshaping this area. However, the area faces its own challenges, such as large-scale, heterogeneous truck fleets running in a highly dynamic environment, so it can barely adopt methods developed in other domains (e.g., ride-sharing). In this paper, we propose a novel Deep Reinforcement Learning approach to solve the dynamic dispatching problem in mining. We first develop an event-based mining simulator with parameters calibrated in real mines. Then we propose an experience-sharing Deep Q Network with a novel abstract state/action representation to learn from the memories of heterogeneous agents together and realize learning in a centralized way. We demonstrate that the proposed methods significantly outperform the most widely adopted approaches in the industry by $5.56\%$ in terms of productivity. The proposed approach has great potential in a broader range of industries (e.g., manufacturing, logistics) that have large-scale heterogeneous equipment working in a highly dynamic environment, as a general framework for dynamic resource allocation.

preprint | 2020 | arXiv

Health Indicator Forecasting for Improving Remaining Useful Life Estimation

Prognostics is concerned with predicting the future health of equipment and any potential failures. With advances in the Internet of Things (IoT), data-driven approaches for prognostics that leverage the power of machine learning models are gaining popularity. One of the most important categories of data-driven approaches relies on a predefined or learned health indicator to characterize the equipment condition up to the present time and make inferences on how it is likely to evolve in the future. In these approaches, health indicator forecasting, which constructs the health indicator curve over the lifespan using partially observed measurements (i.e., health indicator values within an initial period), plays a key role. Existing health indicator forecasting algorithms, such as the functional Empirical Bayesian approach, the regression-based formulation, and naive scenario matching based on the nearest neighbor, have certain limitations. In this paper, we propose a new 'generative + scenario matching' algorithm for health indicator forecasting. The key idea behind the proposed approach is to first non-parametrically fit the underlying health indicator curve with a continuous Gaussian Process using a sample of run-to-failure health indicator curves. The proposed approach then generates a rich set of random curves from the learned distribution, attempting to obtain all possible variations of the target health condition evolution process over the system's lifespan. The health indicator extrapolation for a piece of functioning equipment is inferred as the generated curve that has the highest matching level within the observed period. Our experimental results show the superiority of our algorithm over other state-of-the-art methods.
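
The 'generative + scenario matching' idea described above can be illustrated with a small sketch. It is not the authors' implementation, and several pieces are assumptions: the continuous Gaussian Process fit is replaced by an empirical mean and covariance on a fixed time grid, the "matching level" is taken to be the smallest squared error over the observed period, and the run-to-failure curves are synthetic. All function names are hypothetical.

```python
import math
import random

def fit_mean_cov(curves, jitter=1e-6):
    """Empirical mean and covariance of curves sampled on a shared time grid."""
    n, t = len(curves), len(curves[0])
    mean = [sum(c[i] for c in curves) / n for i in range(t)]
    cov = [[sum((c[i] - mean[i]) * (c[j] - mean[j]) for c in curves) / (n - 1)
            for j in range(t)] for i in range(t)]
    for i in range(t):
        cov[i][i] += jitter  # small diagonal term keeps the matrix positive definite
    return mean, cov

def cholesky(m):
    """Lower-triangular factor L with L L^T = m."""
    t = len(m)
    low = [[0.0] * t for _ in range(t)]
    for i in range(t):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / low[j][j]
    return low

def generate_curves(mean, low, count, rng):
    """Draw curves mean + L z, where z is a standard normal vector."""
    t = len(mean)
    curves = []
    for _ in range(count):
        z = [rng.gauss(0.0, 1.0) for _ in range(t)]
        curves.append([mean[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                       for i in range(t)])
    return curves

def match_error(curve, observed):
    """Squared error between a curve and the observed prefix (assumed matching level)."""
    return sum((curve[i] - y) ** 2 for i, y in enumerate(observed))

def forecast(observed, generated):
    """Scenario matching: return the generated curve closest to the observed prefix."""
    return min(generated, key=lambda c: match_error(c, observed))

rng = random.Random(0)
grid = [i / 9 for i in range(10)]

# Hypothetical run-to-failure curves: quadratic degradation with a random rate plus noise.
history = []
for _ in range(40):
    rate = rng.uniform(0.5, 1.5)
    history.append([rate * x * x + rng.gauss(0.0, 0.02) for x in grid])

mean, cov = fit_mean_cov(history)
generated = generate_curves(mean, cholesky(cov), 500, rng)

observed = [1.2 * x * x for x in grid[:4]]  # health indicator seen so far
best = forecast(observed, generated)
print([round(v, 3) for v in best])
```

The values of `best` beyond the first four grid points serve as the extrapolated health indicator; generating more curves gives the matching step more candidates to choose from.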

preprint | 2020 | arXiv

Spatio-Temporal Functional Neural Networks

Explosive growth in spatio-temporal data and its wide range of applications have attracted increasing interest from researchers in the statistical and machine learning fields. The spatio-temporal regression problem is of paramount importance from both the methodology development and real-world application perspectives. Given observed spatially encoded time series covariates and real-valued response data samples, the goal of spatio-temporal regression is to leverage the temporal and spatial dependencies to build a mapping from covariates to response with minimized prediction error. Prior art, including the convolutional Long Short-Term Memory (ConvLSTM) and variations of functional linear models, cannot learn the spatio-temporal information in a simple and efficient format for proper model building. In this work, we propose two novel extensions of the Functional Neural Network (FNN), a temporal regression model whose effectiveness and superior performance over alternative sequential models have been proven by many researchers. The effectiveness of the proposed spatio-temporal FNNs in handling varying spatial correlations is demonstrated in comprehensive simulation studies. The proposed models are then deployed to solve a practical and challenging precipitation prediction problem in the meteorology field.

preprint | 2020 | arXiv

Time Space Optimal Algorithm for Computing Separators in Bounded Genus Graphs

A graph separator is a subset of vertices of a graph whose removal divides the graph into small components. Computing small graph separators for various classes of graphs is an important computational task. In this paper, we present a polynomial-time algorithm that uses $O(g^{1/2}n^{1/2}\log n)$ space to find an $O(g^{1/2}n^{1/2})$-sized separator of a graph having $n$ vertices and embedded on a surface of genus $g$.

preprint | 2019 | arXiv

Building chatbots from large scale domain-specific knowledge bases: challenges and opportunities

Popular conversational agent frameworks such as Alexa Skills Kit (ASK) and Google Actions (gActions) offer unprecedented opportunities for facilitating the development and deployment of voice-enabled AI solutions in various verticals. Nevertheless, understanding user utterances with high accuracy remains a challenging task with these frameworks, particularly when building chatbots with a large volume of domain-specific entities. In this paper, we describe the challenges and lessons learned from building a large-scale virtual assistant for understanding and responding to equipment-related complaints. In the process, we describe an alternative scalable framework for: 1) extracting the knowledge about equipment components and their associated problem entities from short texts, and 2) learning to identify such entities in user utterances. We show through evaluation on a real dataset that the proposed framework, compared to popular off-the-shelf ones, scales better with a large volume of entities, being up to 30% more accurate, and is more effective in understanding user utterances with domain-specific entities.