Source author record

Sayan Ghosh

Sayan Ghosh appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning; Computation and Language; Artificial Intelligence; astro-ph.HE; astro-ph.IM; Computer Vision; Cryptography and Security; Distributed, Parallel, and Cluster Computing; hep-ph; math.OC; physics.comp-ph; physics.ins-det; Social and Information Networks

Catalog footprint

What is connected

15 works

13 topics

4 close collaborators

preprint, 2023, arXiv

Pruning Compact ConvNets for Efficient Inference

Neural network pruning is frequently used to compress over-parameterized networks by large amounts, while incurring only marginal drops in generalization performance. However, the impact of pruning on networks that have been highly optimized for efficient inference has not received the same level of attention. In this paper, we analyze the effect of pruning for computer vision, and study state-of-the-art ConvNets, such as the FBNetV3 family of models. We show that model pruning approaches can be used to further optimize networks trained through NAS (Neural Architecture Search). The resulting family of pruned models can consistently obtain better performance than existing FBNetV3 models at the same level of computation, and thus provide state-of-the-art results when trading off between computational complexity and generalization performance on the ImageNet benchmark. In addition to better generalization performance, we also demonstrate that when limited computation resources are available, pruning FBNetV3 models incurs only a fraction of the GPU-hours involved in running a full-scale NAS.

preprint, 2022, arXiv

A Comprehensive Review of Digital Twin -- Part 2: Roles of Uncertainty Quantification and Optimization, a Battery Digital Twin, and Perspectives

As an emerging technology in the era of Industry 4.0, digital twin is gaining unprecedented attention because of its promise to further optimize process design, quality control, health monitoring, decision and policy making, and more, by comprehensively modeling the physical world as a group of interconnected digital models. In a two-part series of papers, we examine the fundamental role of different modeling techniques, twinning enabling technologies, and uncertainty quantification and optimization methods commonly used in digital twins. This second paper presents a literature review of key enabling technologies of digital twins, with an emphasis on uncertainty quantification, optimization methods, open source datasets and tools, major findings, challenges, and future directions. Discussions focus on current methods of uncertainty quantification and optimization and how they are applied in different dimensions of a digital twin. Additionally, this paper presents a case study where a battery digital twin is constructed and tested to illustrate some of the modeling and twinning methods reviewed in this two-part review. Code and preprocessed data for generating all the results and figures presented in the case study are available on GitHub.

preprint, 2022, arXiv

CLUES: A Benchmark for Learning Classifiers using Natural Language Explanations

Supervised learning has traditionally focused on inductive learning by observing labeled examples of a task. In contrast, humans have the ability to learn new concepts from language. Here, we explore training zero-shot classifiers for structured data purely from language. For this, we introduce CLUES, a benchmark for Classifier Learning Using natural language ExplanationS, consisting of a range of classification tasks over structured data along with natural language supervision in the form of explanations. CLUES consists of 36 real-world and 144 synthetic classification tasks. It contains crowdsourced explanations describing real-world tasks from multiple teachers and programmatically generated explanations for the synthetic tasks. To model the influence of explanations in classifying an example, we develop ExEnt, an entailment-based model that learns classifiers using explanations. ExEnt generalizes up to 18% better (relative) on novel tasks than a baseline that does not use explanations. We delineate key challenges for automated learning from explanations, addressing which can lead to progress on CLUES in the future. Code and datasets are available at: https://clues-benchmark.github.io.

preprint, 2022, arXiv

ePiC: Employing Proverbs in Context as a Benchmark for Abstract Language Understanding

While large language models have shown exciting progress on several NLP benchmarks, evaluating their ability for complex analogical reasoning remains under-explored. Here, we introduce a high-quality crowdsourced dataset of narratives for employing proverbs in context as a benchmark for abstract language understanding. The dataset provides fine-grained annotation of aligned spans between proverbs and narratives, and contains minimal lexical overlaps between narratives and proverbs, ensuring that models need to go beyond surface-level reasoning to succeed. We explore three tasks: (1) proverb recommendation and alignment prediction, (2) narrative generation for a given proverb and topic, and (3) identifying narratives with similar motifs. Our experiments show that neural language models struggle on these tasks compared to humans, and these tasks pose multiple learning challenges.

preprint, 2022, arXiv

Inelastic charged current interaction of supernova neutrinos in two-phase liquid xenon dark matter detectors

It has been known that neutrinos from supernova (SN) bursts can give rise to nuclear recoil (NR) signals arising from coherent elastic neutrino-nucleus scattering (CE$ν$NS) interaction, a neutral current (NC) process, of the neutrinos with xenon nuclei in future large (multi-ton scale) liquid xenon (LXe) detectors employed for dark matter search, depending on the SN progenitor mass and distance to the SN. In this paper, we show that the same detectors will also be sensitive to inelastic charged current (CC) interactions of the SN electron neutrinos ($ν_e$CC) with the xenon nuclei. Such interactions, while creating an electron in the final state, also leave the post-interaction target nucleus in an excited state, the subsequent deexcitation of which produces, among other particles, gamma rays and neutrons. The electron and deexcitation gamma rays will give "electron recoil" (ER) type signals, while the deexcitation neutrons produce, through their multiple scattering on the xenon nuclei, further xenon nuclear recoils that will also give NR signals (in addition to those produced through the CE$ν$NS interactions). We discuss the observable scintillation and ionization signals associated with SN neutrino induced CE$ν$NS and $ν_e$CC events in a generic LXe detector and argue that upcoming sufficiently large LXe detectors should be able to detect both these types of events due to neutrinos from reasonably close by SN bursts. We also note that since the total CC induced ER and NR signals receive contributions predominantly from $ν_e$CC interactions while the CE$ν$NS contribution comes from NC interactions of all the six species of neutrinos, identification of the $ν_e$CC and CE$ν$NS origin events may offer the possibility of extracting useful information about the distribution of the total SN explosion energy going into different neutrino flavors.

preprint, 2022, arXiv

Learning to Mediate Disparities Towards Pragmatic Communication

Human communication is a collaborative process. Speakers, on top of conveying their own intent, adjust the content and language expressions by taking the listeners into account, including their knowledge background, personalities, and physical capabilities. Towards building AI agents with similar abilities in language communication, we propose Pragmatic Rational Speaker (PRS), a framework extending Rational Speech Act (RSA). The PRS attempts to learn the speaker-listener disparity and adjust the speech accordingly, by adding a lightweight disparity adjustment layer into working memory on top of the speaker's long-term memory system. By fixing the long-term memory, the PRS only needs to update its working memory to learn and adapt to different types of listeners. To validate our framework, we create a dataset that simulates different types of speaker-listener disparities in the context of referential games. Our empirical results demonstrate that the PRS is able to shift its output towards the language that listeners are able to understand, significantly improving the collaborative task outcome.

preprint, 2022, arXiv

Opacus: User-Friendly Differential Privacy Library in PyTorch

We introduce Opacus, a free, open-source PyTorch library for training deep learning models with differential privacy (hosted at opacus.ai). Opacus is designed for simplicity, flexibility, and speed. It provides a simple and user-friendly API, and enables machine learning practitioners to make a training pipeline private by adding as little as two lines to their code. It supports a wide variety of layers, including multi-head attention, convolution, LSTM, GRU (and generic RNN), and embedding, right out of the box and provides the means for supporting other user-defined layers. Opacus computes batched per-sample gradients, providing higher efficiency compared to the traditional "micro batch" approach. In this paper we present Opacus, detail the principles that drove its implementation and unique features, and benchmark it against other frameworks for training models with differential privacy as well as standard PyTorch.
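The batched per-sample gradient idea mentioned in this abstract can be illustrated with a minimal sketch of one differentially private gradient step: clip each per-sample gradient, sum, add Gaussian noise, then average. This is plain Python and does not use the Opacus API; the function names `clip_gradient` and `private_mean_gradient` and the parameters `max_norm` and `noise_multiplier` are assumptions chosen for illustration.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one per-sample gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_sample_grads, max_norm, noise_multiplier, rng):
    """Clip every per-sample gradient, sum, add Gaussian noise, then average.

    Hypothetical helper: the noise standard deviation is taken to be
    noise_multiplier * max_norm, the usual DP-SGD convention.
    """
    clipped = [clip_gradient(g, max_norm) for g in per_sample_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_sample_grads) for v in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.0, 0.5]]
    print(private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single example can move the update, which is what lets the added noise translate into a privacy guarantee; the privacy accounting itself is outside this sketch.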

preprint, 2022, arXiv

Reconciling Security and Communication Efficiency in Federated Learning

Cross-device Federated Learning is an increasingly popular machine learning setting to train a model by leveraging a large population of client devices with high privacy and security guarantees. However, communication efficiency remains a major bottleneck when scaling federated learning to production environments, particularly due to bandwidth constraints during uplink communication. In this paper, we formalize and address the problem of compressing client-to-server model updates under the Secure Aggregation primitive, a core component of Federated Learning pipelines that allows the server to aggregate the client updates without accessing them individually. In particular, we adapt standard scalar quantization and pruning methods to Secure Aggregation and propose Secure Indexing, a variant of Secure Aggregation that supports quantization for extreme compression. We establish state-of-the-art results on LEAF benchmarks in a secure Federated Learning setup with up to 40$\times$ compression in uplink communication with no meaningful loss in utility compared to uncompressed baselines.
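As a rough illustration of why scalar quantization fits aggregation over masked updates, the sketch below quantizes client updates to small integers, adds pairwise masks modulo a fixed modulus so that the masks cancel in the sum, and lets the server recover only the aggregate. It is a toy stand-in, not the paper's Secure Aggregation or Secure Indexing protocol; all function names, the modulus, and the quantization range are assumptions.

```python
import random


def quantize(values, lo, hi, levels):
    """Map each float (clamped to [lo, hi]) to an integer in [0, levels - 1]."""
    step = (hi - lo) / (levels - 1)
    return [round((min(max(v, lo), hi) - lo) / step) for v in values]


def masked_updates(quantized, modulus, rng):
    """Add pairwise random masks that cancel when all clients are summed."""
    n, dim = len(quantized), len(quantized[0])
    masked = [list(q) for q in quantized]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.randrange(modulus) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] = (masked[i][k] + mask[k]) % modulus
                masked[j][k] = (masked[j][k] - mask[k]) % modulus
    return masked


def aggregate(masked, modulus):
    """Server-side sum modulo the modulus; individual updates stay hidden."""
    dim = len(masked[0])
    return [sum(m[k] for m in masked) % modulus for k in range(dim)]


def dequantize_sum(int_sum, n_clients, lo, hi, levels):
    """Turn the summed integer buckets back into an approximate float sum."""
    step = (hi - lo) / (levels - 1)
    return [q * step + n_clients * lo for q in int_sum]


if __name__ == "__main__":
    rng = random.Random(0)
    updates = [[-1.0, 1.0], [0.3, -0.4], [1.0, 1.0]]
    q = [quantize(u, -1.0, 1.0, 4) for u in updates]
    # The modulus must exceed n_clients * (levels - 1) so the sum does not wrap.
    total = aggregate(masked_updates(q, 16, rng), 16)
    print(dequantize_sum(total, len(updates), -1.0, 1.0, 4))
```

Because the server only ever adds integers modulo a fixed value, fewer quantization levels mean a smaller modulus and fewer bits per coordinate on the uplink.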

preprint, 2022, arXiv

Simulation of Nuclear Recoils due to Supernova Neutrino-induced Neutrons in Liquid Xenon Detectors

Neutrinos from supernova (SN) bursts can give rise to a detectable number of nuclear recoil (NR) events through the coherent elastic neutrino-nucleus scattering (CE$ν$NS) process in large-scale liquid xenon detectors designed for direct dark matter search, depending on the SN progenitor mass and distance. Here we show that, in addition to the direct NR events due to the CE$ν$NS process, the SN neutrinos can give rise to additional nuclear recoils due to the elastic scattering of neutrons produced through inelastic interaction of the neutrinos with the xenon nuclei. We find that the contribution of the supernova neutrino-induced neutrons ($ν$I$n$) can significantly modify the total xenon NR spectrum at large recoil energies compared to that expected from the CE$ν$NS process alone. Moreover, for recoil energies $\gtrsim20$ keV, the dominant contribution comes from the $ν$I$n$ events. We numerically calculate the observable S1 and S2 signals due to both CE$ν$NS and $ν$I$n$ processes for a typical liquid xenon based detector, accounting for the multiple scattering effects of the neutrons in the case of $ν$I$n$, and find that sufficiently large signal events, those with S1$\gtrsim$50 photo-electrons (PE) and S2$\gtrsim$2300 PE, come mainly from the $ν$I$n$ scatterings.

preprint, 2021, arXiv

Measurements of gamma ray, cosmic muon and residual neutron background fluxes for rare event search experiments at an underground laboratory

Ambient radiation background contributed by the penetrating cosmic ray particles and the radionuclides present in the rock materials has been measured at an underground laboratory located inside a mine at 555 m depth. The laboratory is being set up to explore rare event search processes, such as direct dark matter search, neutrinoless double beta decay, axion search, supernova neutrino detection, etc., that require specific knowledge of the nature and extent of the radiation environment in order to assess the sensitivity reach and also to plan for its reduction for the targeted experiment. The gamma ray background, which is mostly contributed by the primordial radionuclides and their decay chain products, has been measured inside the laboratory and found to be dominated by rock radioactivity for $E_γ\lesssim 3 \,{\rm MeV}$. Shielding of these residual gamma rays for the experiment was also evaluated. The cosmic muon flux, measured inside the laboratory using a large-area plastic scintillator telescope, was found to be $(2.051 \pm 0.142 \pm 0.009) \times 10^{-7}\, {\rm cm}^{-2}\,{\rm s}^{-1}$, which agrees reasonably well with simulation results. The neutron background flux has been measured for the radiogenic neutrons and found to be $(1.61 \pm 0.03) \times 10^{-4} \, {\rm cm}^{-2}\,{\rm s}^{-1}$ for no threshold cut. Detailed GEANT4 simulations for the radiogenic neutrons and the cosmogenic neutrons have been carried out. Effects of multiple scattering of both types of neutrons within the surrounding rock and the cavern walls were studied, and the results for the radiogenic neutrons are found to be in reasonable agreement with experimental results. Neutron fluxes contributed by neutrons of cosmogenic origin have been reported as a function of the energy threshold.

preprint, 2020, arXiv

Advances in Bayesian Probabilistic Modeling for Industrial Applications

Industrial applications frequently pose a notorious challenge for state-of-the-art methods in the contexts of optimization, designing experiments and modeling unknown physical response. This problem is aggravated by limited availability of clean data, uncertainty in available physics-based models and additional logistic and computational expense associated with experiments. In such a scenario, Bayesian methods have played an impactful role in alleviating the aforementioned obstacles by quantifying uncertainty of different types under limited resources. These methods, usually deployed as a framework, allow decision makers to make informed choices under uncertainty while being able to incorporate information on the fly, usually in the form of data, from multiple sources while being consistent with the physical intuition about the problem. This is a major advantage that Bayesian methods bring to fruition, especially in the industrial context. This paper is a compendium of the Bayesian modeling methodology that is being consistently developed at GE Research. The methodology, called GE's Bayesian Hybrid Modeling (GEBHM), is a probabilistic modeling method, based on the Kennedy and O'Hagan framework, that has been continuously scaled-up and industrialized over several years. In this work, we explain the various advancements in GEBHM's methods and demonstrate their impact on several challenging industrial problems.

preprint, 2020, arXiv

Analysing the Extent of Misinformation in Cancer Related Tweets

Twitter has become one of the most sought after places to discuss a wide variety of topics, including medically relevant issues such as cancer. This helps spread awareness regarding the various causes, cures and prevention methods of cancer. However, no proper analysis has been performed, which discusses the validity of such claims. In this work, we aim to tackle the misinformation spread in such platforms. We collect and present a dataset regarding tweets which talk specifically about cancer and propose an attention-based deep learning model for automated detection of misinformation along with its spread. We then do a comparative analysis of the linguistic variation in the text corresponding to misinformation and truth. This analysis helps us gather relevant insights on various social aspects related to misinformed tweets.

preprint, 2020, arXiv

Bayesian task embedding for few-shot Bayesian optimization

We describe a method for Bayesian optimization by which one may incorporate data from multiple systems whose quantitative interrelationships are unknown a priori. All general (nonreal-valued) features of the systems are associated with continuous latent variables that enter as inputs into a single metamodel that simultaneously learns the response surfaces of all of the systems. Bayesian inference is used to determine appropriate beliefs regarding the latent variables. We explain how the resulting probabilistic metamodel may be used for Bayesian optimization tasks and demonstrate its implementation on a variety of synthetic and real-world examples, comparing its performance under zero-, one-, and few-shot settings against traditional Bayesian optimization, which usually requires substantially more data from the system of interest.

preprint, 2020, arXiv

Data-Informed Decomposition for Localized Uncertainty Quantification of Dynamical Systems

Industrial dynamical systems often exhibit multi-scale response due to material heterogeneities, operation conditions and complex environmental loadings. In such problems, the smallest length-scale of the system's dynamics controls the numerical resolution required to effectively resolve the embedded physics. In practice, however, high numerical resolution is only required in a confined region of the system where fast dynamics or localized material variability are exhibited, whereas a coarser discretization can be sufficient in the remaining majority of the system. To this end, a unified computational scheme with uniform spatio-temporal resolutions for uncertainty quantification can be very computationally demanding. Partitioning the complex dynamical system into smaller easier-to-solve problems based on the localized dynamics and material variability can reduce the overall computational cost. However, identifying the region of interest for high-resolution and intensive uncertainty quantification can be problem dependent. The region of interest can be specified based on the localization features of the solution, user interest, and correlation length of the random material properties. For problems where a region of interest is not evident, Bayesian inference can provide a feasible solution. In this work, we employ a Bayesian framework to update our prior knowledge on the localized region of interest using measurements and system response. To address the computational cost of the Bayesian inference, we construct a Gaussian process surrogate for the forward model. Once the localized region of interest is identified, we use polynomial chaos expansion to propagate the localization uncertainty. We demonstrate our framework through numerical experiments on a three-dimensional elastodynamic problem.

preprint, 2016, arXiv

Learning Representations of Affect from Speech

There has been a lot of prior work on representation learning for speech recognition applications, but not much emphasis has been given to an investigation of effective representations of affect from speech, where the paralinguistic elements of speech are separated out from the verbal content. In this paper, we explore denoising autoencoders for learning paralinguistic attributes, i.e., categorical and dimensional affective traits, from speech. We show that the representations learnt by the bottleneck layer of the autoencoder are highly discriminative of activation intensity and effective at separating out negative valence (sadness and anger) from positive valence (happiness). We experiment with different input speech features (such as FFT and log-mel spectrograms with temporal context windows), and different autoencoder architectures (such as stacked and deep autoencoders). We also learn utterance-specific representations by a combination of denoising autoencoders and BLSTM-based recurrent autoencoders. Emotion classification is performed with the learnt temporal/dynamic representations to evaluate the quality of the representations. Experiments on a well-established real-life speech dataset (IEMOCAP) show that the learnt representations are comparable to state-of-the-art feature extractors (such as voice quality features and MFCCs) and are competitive with state-of-the-art approaches at emotion and dimensional affect recognition.
