Researcher profile

Manoj Kumar

Manoj Kumar contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
14 works
0 followers
14 topics
4 close collaborators

Published work

14 published items

preprint, 2022, arXiv

Functional Optimization Reinforcement Learning for Real-Time Bidding

Real-time bidding (RTB) is the new paradigm of programmatic advertising. An advertiser wants to make the intelligent choice of utilizing a Demand-Side Platform to improve the performance of their ad campaigns. Existing approaches struggle to provide a satisfactory solution for bidding optimization due to stochastic bidding behavior. In this paper, we propose a multi-agent reinforcement learning architecture for RTB with functional optimization. We designed a four-agent bidding environment: three Lagrange-multiplier-based functional optimization agents and one baseline agent (without any attribute of functional optimization). First, numerous attributes have been assigned to each agent, including biased or unbiased win probability, Lagrange multiplier, and click-through rate. In order to evaluate the proposed RTB strategy's performance, we demonstrate the results on ten sequential simulated auction campaigns. The results show that agents with functional actions and rewards had the most significant average winning rate and winning surplus, given biased and unbiased winning information respectively. The experimental evaluations show that our approach significantly improves the campaign's efficacy and profitability.

preprint, 2022, arXiv

IISERB Brains at SemEval 2022 Task 6: A Deep-learning Framework to Identify Intended Sarcasm in English

This paper describes the system architectures and the models submitted by our team "IISERBBrains" to the SemEval 2022 Task 6 competition. We competed in all three sub-tasks offered for the English dataset. On the leaderboard, we ranked 19th out of 43 teams for sub-task A, 8th out of 22 teams for sub-task B, and 13th out of 16 teams for sub-task C. Apart from the submitted results and models, we also report the other models and results that we obtained through our experiments after the organizers published the gold labels of their evaluation data.

preprint, 2022, arXiv

Minimizing the phase structure of quark mass matrices

Fritzsch-Xing matrices are a particular class of texture 4-zero Hermitian quark mass matrices, known to be successful in accommodating the quark mixing data. In the present work, it is shown that these texture 4-zero matrices with only one phase parameter, unlike the usually considered two phase parameters, are not only consistent with the latest experimental quark mixing data, but also predict the CP violation parameter $J$ and the corresponding phase $δ$, in agreement with the recent global analyses. We also show that the mass matrix elements do not exhibit a strong hierarchy and that there is a strong correlation between some of the mass matrix elements of the up and down sectors. A precision measurement of $δ$ as well as of the small quark masses would have the potential to constrain the phase structure of the matrices further.

preprint, 2022, arXiv

Titchmarsh theorems on Damek-Ricci spaces via moduli of continuity of higher order

A classical theorem of Titchmarsh relates the $L^2$-Lipschitz functions to the decay of their Fourier transforms. In this note, we prove the Titchmarsh theorem for Damek-Ricci spaces (also known as harmonic $NA$ groups) via moduli of continuity of higher orders. We also prove an analogue of another Titchmarsh theorem which provides integrability properties of the Fourier transform for functions in the Hölder Lipschitz spaces.

preprint, 2021, arXiv

Colorization Transformer

We present the Colorization Transformer, a novel approach for diverse high fidelity image colorization based on self-attention. Given a grayscale image, the colorization proceeds in three steps. We first use a conditional autoregressive transformer to produce a low resolution coarse coloring of the grayscale image. Our architecture adopts conditional transformer layers to effectively condition grayscale input. Two subsequent fully parallel networks upsample the coarse colored low resolution image into a finely colored high resolution image. Sampling from the Colorization Transformer produces diverse colorings whose fidelity outperforms the previous state-of-the-art on colorising ImageNet based on FID results and based on a human evaluation in a Mechanical Turk test. Remarkably, in more than 60% of cases human evaluators prefer the highest rated among three generated colorings over the ground truth. The code and pre-trained checkpoints for Colorization Transformer are publicly available at https://github.com/google-research/google-research/tree/master/coltran

preprint, 2021, arXiv

ProtoDA: Efficient Transfer Learning for Few-Shot Intent Classification

Practical sequence classification tasks in natural language processing often suffer from low training data availability for target classes. Recent works towards mitigating this problem have focused on transfer learning using embeddings pre-trained on often unrelated tasks, for instance, language modeling. We adopt an alternative approach by transfer learning on an ensemble of related tasks using prototypical networks under the meta-learning paradigm. Using intent classification as a case study, we demonstrate that increasing variability in training tasks can significantly improve classification performance. Further, we apply data augmentation in conjunction with meta-learning to reduce sampling bias. We make use of a conditional generator for data augmentation that is trained directly using the meta-learning objective and simultaneously with prototypical networks, hence ensuring that data augmentation is customized to the task. We explore augmentation in the sentence embedding space as well as the prototypical embedding space. Combining meta-learning with augmentation provides up to 6.49% and 8.53% relative F1-score improvements over the best performing systems in 5-shot and 10-shot learning, respectively.
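
For readers unfamiliar with prototypical networks, the classification step the abstract relies on can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, intent labels, and 2-D "embeddings" are hypothetical, and the learned encoder, the conditional generator, and the training loss are all omitted. Each class prototype is the mean of its few support embeddings, and a query is assigned to the nearest prototype.

```python
from math import dist  # Euclidean distance, Python 3.8+


def class_prototypes(support):
    """Average the support embeddings of each class into one prototype."""
    prototypes = {}
    for label, vectors in support.items():
        n = len(vectors)
        prototypes[label] = [sum(component) / n for component in zip(*vectors)]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is nearest to the query."""
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))


# Made-up 2-D embeddings for a 2-way, 3-shot episode (assumed values).
support = {
    "book_flight": [[0.9, 0.1], [1.0, 0.0], [0.8, 0.2]],
    "play_music": [[0.1, 0.9], [0.0, 1.0], [0.2, 0.8]],
}
prototypes = class_prototypes(support)
prediction = classify([0.7, 0.3], prototypes)
```

In a real system the vectors would come from a trained sentence encoder, and the distances would feed a softmax during meta-training rather than a hard nearest-prototype choice.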

preprint, 2020, arXiv

Age structured SIR model for the spread of infectious diseases through indirect contacts

In this article, we discuss an age-structured SIR model in which disease spreads not only through direct person-to-person contacts but also through indirect contacts, e.g. infection due to surface contamination. It is evident that age also plays a crucial role in SARS virus infection, including COVID-19 infection. We formulate our model as an abstract semilinear Cauchy problem in an appropriate Banach space to show the existence of a solution, and we also show the existence of steady states. We assume that the population is in a demographic stationary state and show that there is no disease-free equilibrium point as long as there is transmission of infection due to indirect contacts in the environment.

preprint, 2020, arXiv

Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap

In this study, we propose a new spectral clustering framework that can auto-tune the parameters of the clustering algorithm in the context of speaker diarization. The proposed framework uses normalized maximum eigengap (NME) values to estimate the number of clusters and the parameters for the threshold of the elements of each row in an affinity matrix during spectral clustering, without the use of parameter tuning on the development set. Even with this hands-off approach, we achieve comparable or better performance across various evaluation sets than the results found using traditional clustering methods that apply careful parameter tuning and development data. A relative improvement of 17% in the speaker error rate on the well-known CALLHOME evaluation set shows the effectiveness of our proposed spectral clustering with auto-tuning.

preprint, 2020, arXiv

Designing Neural Speaker Embeddings with Meta Learning

Neural speaker embeddings trained using classification objectives have demonstrated state-of-the-art performance in multiple applications. Typically, such embeddings are trained on an out-of-domain corpus on a single task e.g., speaker classification, albeit with a large number of classes (speakers). In this work, we reformulate embedding training under the meta-learning paradigm. We redistribute the training corpus as an ensemble of multiple related speaker classification tasks, and learn a representation that generalizes better to unseen speakers. First, we develop an open source toolkit to train x-vectors that is matched in performance with pre-trained Kaldi models for speaker diarization and speaker verification applications. We find that different bottleneck layers in the architecture variedly favor different applications. Next, we use two meta-learning strategies, namely prototypical networks and relation networks, to improve over the x-vector embeddings. Our best performing model achieves a relative improvement of 12.37% and 7.11% in speaker error on the DIHARD II development corpus and the AMI meeting corpus, respectively. We analyze improvements across different domains in the DIHARD corpus. Notably, on the challenging child speech domain, we study the relation between child age and the diarization performance. Further, we show reductions in equal error rate for speaker verification on the SITW corpus (7.68%) and the VOiCES challenge corpus (8.78%). We observe that meta-learning particularly offers benefits in challenging acoustic conditions and recording setups encountered in these corpora. Our experiments illustrate the applicability of meta-learning as a generalized learning paradigm for training deep neural speaker embeddings.

preprint, 2020, arXiv

Growth Kinetics and Aging Phenomena in a Frustrated System

We study numerically the ordering kinetics in a two-dimensional Ising model with random coupling where the fraction of antiferromagnetic links $a$ can be gradually tuned. We show that, upon increasing such fraction, the behavior changes in a radical way. Small $a$ does not prevent the system from a complete ordering, but this occurs in an extremely (logarithmically) slow manner. However, larger values of this parameter destroy complete ordering, due to frustration, and the evolution is comparatively faster (algebraic). Our study shows a precise correspondence between the kind of developing order, ferromagnetic versus frustrated, and the speed of evolution. The aging properties of the system are studied by focusing on the scaling properties of two-time quantities, the autocorrelation and linear response functions. We find that the contribution of equilibrium and an aging part to these functions occurs differently in the various regions of the phase diagram of the model. When quenching inside the ferromagnetic phase, the two-time quantities are obtained by the addition of these parts. Instead, in the paramagnetic phase, these two contributions enter multiplicatively. Both of the scaling forms are shown with excellent accuracy, and the corresponding scaling functions and exponents have been determined and discussed.

preprint, 2020, arXiv

Meta-learning with Latent Space Clustering in Generative Adversarial Network for Speaker Diarization

Most speaker diarization systems with x-vector embeddings are vulnerable to noisy environments and lack domain robustness. Earlier work on speaker diarization using a generative adversarial network (GAN) with an encoder network (ClusterGAN) to project input x-vectors into a latent space has shown promising performance on meeting data. In this paper, we extend the ClusterGAN network to improve diarization robustness and enable rapid generalization across various challenging domains. To this end, we fetch the pre-trained encoder from the ClusterGAN and fine-tune it by using prototypical loss (meta-ClusterGAN or MCGAN) under the meta-learning paradigm. Experiments are conducted on CALLHOME telephonic conversations, AMI meeting data, DIHARD II (dev set), which includes a challenging multi-domain corpus, and two child-clinician interaction corpora (ADOS, BOSCC) related to the autism spectrum disorder domain. Extensive analyses of the experimental data are done to investigate the effectiveness of the proposed ClusterGAN and MCGAN embeddings over x-vectors. The results show that the proposed embeddings with a normalized maximum eigengap spectral clustering (NME-SC) back-end consistently outperform the Kaldi state-of-the-art x-vector diarization system. Finally, we employ embedding fusion with x-vectors to provide further improvement in diarization performance. We achieve a relative diarization error rate (DER) improvement of 6.67% to 53.93% on the aforementioned datasets using the proposed fused embeddings over x-vectors. Besides, the MCGAN embeddings provide better performance in estimating the number of speakers and in short speech segment diarization as compared to x-vectors and ClusterGAN on telephonic data.

preprint, 2020, arXiv

On the comparison of optimization algorithms for the random-field Potts model

For many systems with quenched disorder the study of ground states can crucially contribute to a thorough understanding of the physics at play, be it for the critical behavior if that is governed by a zero-temperature fixed point or for uncovering properties of the ordered phase. While ground states can in principle be computed using general-purpose optimization algorithms such as simulated annealing or genetic algorithms, it is often much more efficient to use exact or approximate techniques specifically tailored to the problem at hand. For certain systems with discrete degrees of freedom such as the random-field Ising model, there are polynomial-time methods to compute exact ground states. But even as the number of states increases beyond two, as in the random-field Potts model, the problem becomes NP-hard and one cannot hope to find exact ground states for relevant system sizes. Here, we compare a number of approximate techniques for this problem and evaluate their performance.

preprint, 2020, arXiv

Transport, correlations, and chaos in a classical disordered anharmonic chain

We explore transport properties in a disordered nonlinear chain of classical harmonic oscillators and thereby identify a regime exhibiting behavior analogous to that seen in quantum many-body-localized systems. Through extensive numerical simulations of this system connected at its ends to heat baths at different temperatures, we computed the heat current and the temperature profile in the nonequilibrium steady state as a function of system size $N$, disorder strength $Δ$, and temperature $T$. The conductivity $κ_N$, obtained for finite length ($N$) systems, saturates to a value $κ_\infty >0$ in the large $N$ limit, for all values of disorder strength $Δ$ and temperature $T>0$. We show evidence that for any $Δ>0$ the conductivity goes to zero faster than any power of $T$ in the $(T/Δ) \to 0$ limit, and find that the form $κ_\infty \sim e^{-B |\ln(C Δ/T)|^3}$ fits our data. This form has earlier been suggested by a theory based on the dynamics of multi-oscillator chaotic islands. The finite-size effect can be $κ_N < κ_{\infty}$ due to boundary resistance when the bulk conductivity is high (the weak disorder case), or $κ_N > κ_{\infty}$ due to direct bath-to-bath coupling through bulk localized modes when the bulk is weakly conducting (the strong disorder case). We also present results on equilibrium dynamical correlation functions and on the role of chaos on transport properties. Finally, we explore the differences in the growth and propagation of chaos in the weak and strong chaos regimes by studying the classical version of the Out-of-Time-Ordered-Commutator.

preprint, 2020, arXiv

VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation

Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions. However, a central challenge in video prediction is that the future is highly uncertain: a sequence of past observations of events can imply many possible futures. Although a number of recent works have studied probabilistic models that can represent uncertain futures, such models are either extremely expensive computationally as in the case of pixel-level autoregressive models, or do not directly optimize the likelihood of the data. To our knowledge, our work is the first to propose multi-frame video prediction with normalizing flows, which allows for direct optimization of the data likelihood, and produces high-quality stochastic predictions. We describe an approach for modeling the latent space dynamics, and demonstrate that flow-based generative models offer a viable and competitive approach to generative modelling of video.