Researcher profile

Chirag Gupta

Chirag Gupta contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
7 works
0 followers
10 topics
4 close collaborators

Published work

7 published items

preprint, 2022, arXiv

Distribution-free binary classification: prediction sets, confidence intervals and calibration

We study three notions of uncertainty quantification -- calibration, confidence intervals and prediction sets -- for binary classification in the distribution-free setting, that is without making any distributional assumptions on the data. With a focus towards calibration, we establish a 'tripod' of theorems that connect these three notions for score-based classifiers. A direct implication is that distribution-free calibration is only possible, even asymptotically, using a scoring function whose level sets partition the feature space into at most countably many sets. Parametric calibration schemes such as variants of Platt scaling do not satisfy this requirement, while nonparametric schemes based on binning do. To close the loop, we derive distribution-free confidence intervals for binned probabilities for both fixed-width and uniform-mass binning. As a consequence of our 'tripod' theorems, these confidence intervals for binned probabilities lead to distribution-free calibration. We also derive extensions to settings with streaming data and covariate shift.
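As a rough illustration of the binning-based calibration this abstract refers to, the following minimal sketch implements uniform-mass histogram binning with NumPy. The function names (`fit_uniform_mass_binning`, `predict_binned`) and the fallback value of 0.5 for empty bins are assumptions made for this example, not taken from the paper, and the paper's distribution-free confidence intervals are not reproduced here.

```python
import numpy as np

def fit_uniform_mass_binning(scores, labels, n_bins):
    """Fit a histogram-binning calibrator whose bins hold roughly equal
    numbers of calibration points (uniform-mass binning).

    Hypothetical helper for illustration only.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Interior bin edges sit at empirical quantiles of the scores.
    edges = np.quantile(scores, np.linspace(0, 1, n_bins + 1)[1:-1])
    # Bin b holds scores with edges[b-1] <= score < edges[b].
    bin_ids = np.searchsorted(edges, scores, side="right")
    # Calibrated probability of a bin = empirical label frequency in it.
    # Assumption: an empty bin falls back to 0.5.
    bin_means = np.array([
        labels[bin_ids == b].mean() if np.any(bin_ids == b) else 0.5
        for b in range(n_bins)
    ])
    return edges, bin_means

def predict_binned(scores, edges, bin_means):
    """Map raw scores to the calibrated probability of their bin."""
    scores = np.asarray(scores, dtype=float)
    return bin_means[np.searchsorted(edges, scores, side="right")]
```

The calibrated output takes at most `n_bins` distinct values, so its level sets partition the feature space into finitely many sets, which is the structural property the abstract identifies as necessary for distribution-free calibration.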

preprint, 2022, arXiv

Faster online calibration without randomization: interval forecasts and the power of two choices

We study the problem of making calibrated probabilistic forecasts for a binary sequence generated by an adversarial nature. Following the seminal paper of Foster and Vohra (1998), nature is often modeled as an adaptive adversary who sees all activity of the forecaster except the randomization that the forecaster may deploy. A number of papers have proposed randomized forecasting strategies that achieve an $ε$-calibration error rate of $O(1/\sqrt{T})$, which we prove is tight in general. On the other hand, it is well known that it is not possible to be calibrated without randomization, or if nature also sees the forecaster's randomization; in both cases the calibration error could be $Ω(1)$. Inspired by the equally seminal works on the "power of two choices" and imprecise probability theory, we study a small variant of the standard online calibration problem. The adversary gives the forecaster the option of making two nearby probabilistic forecasts, or equivalently an interval forecast of small width, and the endpoint closest to the revealed outcome is used to judge calibration. This power of two choices, or imprecise forecast, accords the forecaster with significant power -- we show that a faster $ε$-calibration rate of $O(1/T)$ can be achieved even without deploying any randomization.

preprint, 2022, arXiv

Nested conformal prediction and quantile out-of-bag ensemble methods

Conformal prediction is a popular tool for providing valid prediction sets for classification and regression problems, without relying on any distributional assumptions on the data. While the traditional description of conformal prediction starts with a nonconformity score, we provide an alternate (but equivalent) view that starts with a sequence of nested sets and calibrates them to find a valid prediction set. The nested framework subsumes all nonconformity scores, including recent proposals based on quantile regression and density estimation. While these ideas were originally derived based on sample splitting, our framework seamlessly extends them to other aggregation schemes like cross-conformal, jackknife+ and out-of-bag methods. We use the framework to derive a new algorithm (QOOB, pronounced cube) that combines four ideas: quantile regression, cross-conformalization, ensemble methods and out-of-bag predictions. We develop a computationally efficient implementation of cross-conformal, that is also used by QOOB. In a detailed numerical investigation, QOOB performs either the best or close to the best on all simulated and real datasets. Code for QOOB is available at https://github.com/aigen/QOOB.
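The nested-set view described in this abstract can be sketched for the simplest case, split conformal regression with symmetric intervals. Here the nested family is assumed to be F_t(x) = [mu(x) - t, mu(x) + t], so the score of a point (x, y) is the smallest t whose set contains y, namely |y - mu(x)|. All function names below are hypothetical, and this sketch is not the QOOB algorithm.

```python
import math
import numpy as np

def nested_score(y, center):
    """Smallest t such that y lies in [center - t, center + t]."""
    return np.abs(np.asarray(y, dtype=float) - np.asarray(center, dtype=float))

def calibrate_nested_threshold(cal_scores, alpha):
    """Choose the index t of the nested family from held-out scores.

    Returns the ceil((1 - alpha) * (n + 1))-th smallest calibration score,
    the usual split-conformal quantile. If that rank exceeds n, the only
    valid choice is the largest set, represented here by t = infinity.
    """
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = len(scores)
    k = math.ceil((1 - alpha) * (n + 1))
    if k > n:
        return math.inf
    return scores[k - 1]

def prediction_interval(center, t):
    """The calibrated set F_t(x) for a new point with prediction `center`."""
    return center - t, center + t
```

Other nested families, for example intervals built from quantile regression estimates, would change only `nested_score` and `prediction_interval`; the calibration step stays the same.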

preprint, 2022, arXiv

Top-label calibration and multiclass-to-binary reductions

A multiclass classifier is said to be top-label calibrated if the reported probability for the predicted class -- the top-label -- is calibrated, conditioned on the top-label. This conditioning on the top-label is absent in the closely related and popular notion of confidence calibration, which we argue makes confidence calibration difficult to interpret for decision-making. We propose top-label calibration as a rectification of confidence calibration. Further, we outline a multiclass-to-binary (M2B) reduction framework that unifies confidence, top-label, and class-wise calibration, among others. As its name suggests, M2B works by reducing multiclass calibration to numerous binary calibration problems, each of which can be solved using simple binary calibration routines. We instantiate the M2B framework with the well-studied histogram binning (HB) binary calibrator, and prove that the overall procedure is multiclass calibrated without making any assumptions on the underlying data distribution. In an empirical evaluation with four deep net architectures on CIFAR-10 and CIFAR-100, we find that the M2B + HB procedure achieves lower top-label and class-wise calibration error than other approaches such as temperature scaling. Code for this work is available at https://github.com/aigen/df-posthoc-calibration.
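To make the distinction between confidence calibration and top-label calibration concrete, here is a minimal sketch of a binned plug-in calibration error that can optionally condition on the predicted class. The function name `binned_calibration_error`, the fixed-width binning and the flag `condition_on_top_label` are assumptions made for this example; the paper's own estimators and code may differ.

```python
import numpy as np

def binned_calibration_error(probs, labels, n_bins=10,
                             condition_on_top_label=True):
    """Binned estimate of |accuracy - confidence|, weighted by group size.

    With condition_on_top_label=True, points are grouped by (predicted
    class, confidence bin); with False, by confidence bin only, which
    corresponds to confidence calibration. Hypothetical helper.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    pred = probs.argmax(axis=1)            # top label
    conf = probs.max(axis=1)               # reported top-label probability
    correct = (pred == labels).astype(float)
    # Fixed-width confidence bins on [0, 1].
    edges = np.linspace(0, 1, n_bins + 1)[1:-1]
    bins = np.searchsorted(edges, conf, side="right")
    groups = pred if condition_on_top_label else np.zeros_like(pred)
    total = 0.0
    for g in np.unique(groups):
        for b in np.unique(bins[groups == g]):
            mask = (groups == g) & (bins == b)
            gap = abs(correct[mask].mean() - conf[mask].mean())
            total += mask.sum() * gap
    return total / len(labels)

# Toy data: both classes are predicted with confidence 0.75, but class 0
# predictions are always right and class 1 predictions are right half the time.
probs = [[0.75, 0.25]] * 4 + [[0.25, 0.75]] * 4
labels = [0, 0, 0, 0, 1, 1, 0, 0]
print(binned_calibration_error(probs, labels, condition_on_top_label=False))
print(binned_calibration_error(probs, labels, condition_on_top_label=True))
```

On this toy data the pooled accuracy matches the reported confidence, so the confidence-calibration error is zero, while conditioning on the top label exposes a gap of 0.25 for each class.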

preprint, 2021, arXiv

Modern Machine and Deep Learning Systems as a way to achieve Man-Computer Symbiosis

Man-Computer Symbiosis (MCS) was originally envisioned by the famous computer pioneer J.C.R. Licklider in 1960, as a logical evolution of the then inchoate relationship between computers and humans. In his paper, Licklider provided a set of criteria by which to judge whether a Man-Computer System is a symbiotic one, and also provided some predictions about such systems in the near and far future. Since then, innovations in computer networks and the invention of the Internet were major developments towards that end. However, with most systems based on conventional logical algorithms, many aspects of Licklider's MCS remained unfulfilled. This paper explores the extent to which modern machine learning systems in general, and deep learning ones in particular, best exemplify MCS systems, and why they are the prime contenders to achieve, in the future, a true Man-Computer Symbiosis as described by Licklider in his original paper. The case for deep learning is built by illustrating each point of the original criteria as well as the criteria laid out by subsequent research into MCS systems, with specific examples and applications provided to strengthen the arguments. The efficacy of deep neural networks in achieving Artificial General Intelligence, which would be the perfect version of an MCS system, is also explored.

preprint, 2020, arXiv

Self learning robot using real-time neural networks

With advancements in high-volume, low-precision computational technology and applied research on cognitive, artificially intelligent heuristic systems, machine learning solutions through neural networks with real-time learning have seen immense interest in the research community as well as in industry. This paper involves research, development and experimental analysis of a neural network implemented on a robot with an arm, through which it evolves to learn to walk in a straight line or as required. The neural network learns using the algorithms of gradient descent and backpropagation. Both the implementation and the training of the neural network are done locally on the robot, on a Raspberry Pi 3, so that its learning process is completely independent. The neural network is first tested on a custom simulator developed in MATLAB and then implemented on the Raspberry Pi. Data at each generation of the evolving network is stored, and both mathematical and graphical analysis is done on the data. The impact of factors like the learning rate and error tolerance on the learning process and final output is analyzed.

preprint, 2020, arXiv

Shallow Encoder Deep Decoder (SEDD) Networks for Image Encryption and Decryption

This paper explores a new framework for lossy image encryption and decryption using a simple shallow encoder neural network E for encryption, and a complex deep decoder neural network D for decryption. E is kept simple so that encoding can be done on low-power and portable devices, and it can in principle be any nonlinear function which outputs an encoded vector. D is trained to decode the encodings using the dataset of (image, encoded vector) pairs obtained from E, and this training happens independently of E. Because the encodings come from E, which, while being a simple neural network, still has thousands of random parameters, the encodings would be practically impossible to crack without D. This approach differs from autoencoders as D is trained completely independently of E, although the structure may seem similar. Therefore, this paper also explores empirically whether a deep neural network can learn to reconstruct the original data in any useful form given the output of a neural network or any other nonlinear function, which can have very useful applications in cryptanalysis. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the decoded images from D, along with some limitations.