Researcher profile

Rohit Kumar

Rohit Kumar contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
9 works
0 followers
11 topics
4 close collaborators

Published work

9 published items

preprint | 2022 | arXiv

End-to-End Speech Recognition With Joint Dereverberation Of Sub-Band Autoregressive Envelopes

End-to-end (E2E) automatic speech recognition (ASR) systems are often required to operate in reverberant conditions, where the long-term sub-band envelopes of the speech are temporally smeared. In this paper, we develop a feature enhancement approach using a neural model operating on sub-band temporal envelopes. The temporal envelopes are modeled using the framework of frequency domain linear prediction (FDLP). The proposed neural enhancement model performs an envelope-gain-based enhancement of the temporal envelopes. The model architecture combines convolutional and long short-term memory (LSTM) neural network layers. Further, the envelope dereverberation, feature extraction and acoustic modeling using transformer-based E2E ASR can all be jointly optimized for the speech recognition task. The joint optimization ensures that the dereverberation model targets the ASR cost function. We perform E2E speech recognition experiments on the REVERB challenge dataset as well as on the VOiCES dataset. In these experiments, the proposed joint modeling approach yields significant improvements over the baseline E2E ASR system (average relative improvements of 21% on the REVERB challenge dataset and about 10% on the VOiCES dataset).

preprint | 2022 | arXiv

Evaluation of the suitable analytical techniques for the investigation of the toxic elements and compounds in the Pyrotechnic materials (Green crackers)

The present manuscript reports an elemental as well as molecular study of green crackers. Laser-induced breakdown spectroscopy (LIBS) has been used for elemental analysis, while UV-Vis and photoacoustic spectroscopy (PAS) are used for the molecular study. The spectral lines of several elements, including heavy/toxic ones such as Al, Ba, Sr, Cr, and Cu, are observed in the LIBS spectra of green crackers, as they are in normal crackers. In addition, the electronic bands of diatomic molecules like AlO, SrO, and CaO are observed in the LIBS spectra of the green crackers. PAS, which is non-destructive and useful for scattering and opaque substances, is more suitable than the UV-Vis method for investigating the various organic compounds/molecules present in firecrackers. Molecular bands of these molecules (AlO, SrO, and CaO) also appear in the absorption spectra of the crackers recorded using the PAS and UV-Vis techniques. Beyond these, absorption bands of some additional compounds/molecules like AlO, SrO, CaCO3, KNO3, NH4NO3, NHClO4 are observed in the PA spectra of the green crackers, which shows that PAS is a more appropriate technique than the UV-Vis method for investigating the organic compounds/molecules in firecrackers. To determine the exact concentration of the constituents (Al, Cr, Cu) in green crackers, AAS has been used. The results show that green crackers are also toxic to the environment and to humans, although to a lesser degree than traditional/normal crackers.

preprint | 2022 | arXiv

Local Relighting of Real Scenes

We introduce the task of local relighting, which changes a photograph of a scene by switching on and off the light sources that are visible within the image. This new task differs from the traditional image relighting problem, as it introduces the challenge of detecting light sources and inferring the pattern of light that emanates from them. We propose an approach for local relighting that trains a model without supervision of any novel image dataset by using synthetically generated image pairs from another model. Concretely, we collect paired training images from a stylespace-manipulated GAN; then we use these images to train a conditional image-to-image model. To benchmark local relighting, we introduce Lonoff, a collection of 306 precisely aligned images taken in indoor spaces with different combinations of lights switched on. We show that our method significantly outperforms baseline methods based on GAN inversion. Finally, we demonstrate extensions of our method that control different light sources separately. We invite the community to tackle this new task of local relighting.

preprint | 2022 | arXiv

Nanomechanical Characterization of an Antiferromagnetic Topological Insulator

The antiferromagnetic topological insulator MnBi2Te4 (MBT) provides an ideal platform to study exotic topological phenomena and magnetic properties. The transport signatures of magnetic phase transitions in the MBT family of materials have been well studied. However, their mechanical properties and magneto-mechanical coupling have not been well explored. We use nanoelectromechanical systems to study the intrinsic magnetism in MBT thin flakes via their magnetostrictive coupling. We investigate mechanical resonance signatures of magnetic phase transitions from antiferromagnetic (AFM) to canted antiferromagnetic (cAFM) to ferromagnetic (FM) phases versus magnetic field at different temperatures. The spin-flop transitions in MBT are revealed by frequency shifts of the mechanical resonance. As the temperature rises above TN, the transitions disappear in the resonance frequency map, consistent with transport measurements. We use a magnetostrictive model to correlate the frequency shifts with the spin-canting states. Our work demonstrates a technique to study magnetic phase transitions, magnetization and magnetoelastic properties of the magnetic topological insulator.

preprint | 2022 | arXiv

Nodeless superconductivity in the topological nodal-line semimetal CaSb2

CaSb2 is a topological nodal-line semimetal that becomes superconducting below 1.6 K, providing an ideal platform to investigate the interplay between topologically nontrivial electronic bands and superconductivity. In this work, we investigated the superconducting order parameter of CaSb2 by measuring its magnetic penetration depth change Δλ(T) down to 0.07 K, using a tunneling diode oscillator (TDO) based technique. Well inside the superconducting state, Δλ(T) shows an exponential activated behavior, and provides direct evidence for a nodeless superconducting gap. By analyzing the temperature dependence of the superfluid density and the electronic specific heat, we find both can be consistently described by a two-gap s-wave model, in line with the presence of multiple Fermi surfaces associated with distinct Sb sites in this compound. These results demonstrate fully-gapped superconductivity in CaSb2 and constrain the allowed pairing symmetry.

preprint | 2022 | arXiv

Transport Model Comparison Studies of Intermediate-Energy Heavy-Ion Collisions

Transport models are the main method to obtain physics information from low to relativistic-energy heavy-ion collisions. The Transport Model Evaluation Project (TMEP) has been pursued to test the robustness of transport model predictions in reaching consistent conclusions from the same type of physical model. Calculations under controlled conditions of physical input and set-up were performed with various participating codes. These included both calculations of nuclear matter in a box with periodic boundary conditions and more realistic calculations of heavy-ion collisions. In this intermediate review, we summarize and discuss the present status of the project. We also provide condensed descriptions of the 26 participating codes, which contributed to some part of the project. These include the major codes in use today. We review the main results of the studies completed so far. They show that in box calculations the differences between the codes can be well understood and a convergence of the results can be reached. These studies also highlight the systematic differences between the two families of transport codes, known as BUU-type and QMD-type codes. However, when the codes were compared in full heavy-ion collisions using different physical models, as recently for pion production, they still yielded substantially different results. This calls for further comparisons of heavy-ion collisions with controlled models and for box comparisons of important ingredients, like momentum-dependent fields, which are currently underway. We often indicate improved strategies for performing transport simulations and thus provide guidance to code developers. Results of transport simulations of heavy-ion collisions from a given code will have more significance if the code can be validated against benchmark calculations such as the ones summarized in this review.

preprint | 2020 | arXiv

Deep Learning Based Dereverberation of Temporal Envelopes for Robust Speech Recognition

Automatic speech recognition in reverberant conditions is a challenging task, as the long-term envelopes of the reverberant speech are temporally smeared. In this paper, we propose a neural model for enhancement of sub-band temporal envelopes for dereverberation of speech. The temporal envelopes are derived using the autoregressive modeling framework of frequency domain linear prediction (FDLP). The proposed neural enhancement model performs an envelope-gain-based enhancement of the temporal envelopes and consists of a series of convolutional and recurrent neural network layers. The enhanced sub-band envelopes are used to generate features for automatic speech recognition (ASR). The ASR experiments are performed on the REVERB challenge dataset as well as the CHiME-3 dataset. In these experiments, the proposed neural enhancement approach provides significant improvements over a baseline ASR system with beamformed audio (average relative improvements in word error rate of 21% on the development set and about 11% on the evaluation set for the REVERB challenge dataset).

preprint | 2020 | arXiv

Distributed Algorithm for Dynamic Cognitive Ad-hoc Networks

Cognitive ad-hoc networks allow users to access an unlicensed/shared spectrum without the need for any coordination via a central controller and are being envisioned for futuristic ultra-dense wireless networks. The ad-hoc nature of these networks requires each user to learn and regularly update various network parameters, such as channel quality and the number of users, and to use the learned information to improve spectrum utilization and minimize collisions. For such a learning and coordination task, we propose a distributed algorithm based on a multi-player multi-armed bandit approach and a novel signaling scheme. The proposed algorithm does not need prior knowledge of network parameters (users, channels) and can detect as well as adapt to changes in the network parameters, making it suitable for static as well as dynamic networks. The theoretical analysis and extensive simulation results validate the superiority of the proposed algorithm over existing state-of-the-art algorithms.

preprint | 2020 | arXiv

LEAP Submission to CHiME-6 ASR Challenge

This paper reports the LEAP submission to the CHiME-6 challenge. The CHiME-6 Automatic Speech Recognition (ASR) challenge Track 1 involved the recognition of speech in noisy and reverberant acoustic conditions in home environments with multiple-party interactions. For the challenge submission, the LEAP system used extensive data augmentation and a factorized time-delay neural network (TDNN) architecture. We also explored a neural architecture that interleaved the TDNN layers with LSTM layers. The submitted system improved on the Kaldi recipe by 2% relative in word error rate.