Source author record

Roshan Sharma

Roshan Sharma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence, eess.AS, quant-ph, Sound, Computation and Language, math-ph, math.DG, math.MP

Catalog footprint

What is connected

5 works

8 topics

4 close collaborators

Preprint, 2022, arXiv

Self-supervision and Learnable STRFs for Age, Emotion, and Country Prediction

This work presents a multitask approach to the simultaneous estimation of age, country of origin, and emotion given vocal burst audio for the 2022 ICML Expressive Vocalizations Challenge ExVo-MultiTask track. The method of choice utilized a combination of spectro-temporal modulation and self-supervised features, followed by an encoder-decoder network organized in a multitask paradigm. We evaluate the complementarity between the tasks posed by examining independent task-specific and joint models, and explore the relative strengths of different feature sets. We also introduce a simple score fusion mechanism to leverage the complementarity of different feature sets for this task. We find that robust data preprocessing in conjunction with score fusion over spectro-temporal receptive field and HuBERT models achieved our best ExVo-MultiTask test score of 0.412.
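The abstract does not spell out the score fusion mechanism, so the following is only a minimal sketch of one common form: a weighted average of per-example scores from several models. The function name `fuse_scores`, the list-of-lists layout, and the numbers are all assumptions made for illustration; they are not taken from the paper.

```python
def fuse_scores(score_sets, weights=None):
    """Weighted average of per-example scores from several models.

    score_sets: one list of scores per model, all over the same examples
                (hypothetical layout, not the paper's).
    weights:    optional non-negative weight per model; equal weights if omitted.
    """
    if not score_sets:
        raise ValueError("need at least one set of scores")
    n = len(score_sets[0])
    if any(len(scores) != n for scores in score_sets):
        raise ValueError("all models must score the same examples")
    if weights is None:
        weights = [1.0] * len(score_sets)
    if len(weights) != len(score_sets):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalise by the weight total so the fused scores stay on the input scale.
    return [
        sum(w * scores[i] for w, scores in zip(weights, score_sets)) / total
        for i in range(n)
    ]


# Placeholder values only; these are not results from the paper.
strf_scores = [0.2, 0.8, 0.5]
hubert_scores = [0.4, 0.6, 0.9]
fused = fuse_scores([strf_scores, hubert_scores])
```

With equal weights this reduces to a plain mean of the two models' scores; unequal weights would typically be tuned on a validation set.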

Preprint, 2022, arXiv

Speech Summarization using Restricted Self-Attention

Speech summarization is typically performed by using a cascade of speech recognition and text summarization models. End-to-end modeling of speech summarization models is challenging due to memory and compute constraints arising from long input audio sequences. Recent work in document summarization has inspired methods to reduce the complexity of self-attentions, which enables transformer models to handle long sequences. In this work, we introduce a single model optimized end-to-end for speech summarization. We apply the restricted self-attention technique from text-based models to speech models to address the memory and compute constraints. We demonstrate that the proposed model learns to directly summarize speech for the How-2 corpus of instructional videos. The proposed end-to-end model outperforms the previously proposed cascaded model by 3 points absolute on ROUGE. Further, we consider the spoken language understanding task of predicting concepts from speech inputs and show that the proposed end-to-end model outperforms the cascade model by 4 points absolute F-1.
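To illustrate why restricting self-attention helps with long inputs, here is a minimal sketch that assumes the restriction is a fixed local window around each position. The abstract does not state the exact variant used, so the window form and the names `restricted_attention_mask` and `allowed_pairs` are assumptions.

```python
def restricted_attention_mask(seq_len, window):
    """Boolean mask: position i may attend to position j only if |i - j| <= window."""
    if seq_len < 0 or window < 0:
        raise ValueError("seq_len and window must be non-negative")
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]


def allowed_pairs(mask):
    """Number of (query, key) pairs the mask lets through."""
    return sum(sum(row) for row in mask)


mask = restricted_attention_mask(seq_len=6, window=1)
full = restricted_attention_mask(seq_len=6, window=6)  # window covers everything
```

Full attention scores every one of the `seq_len * seq_len` pairs, while the windowed mask keeps at most `2 * window + 1` keys per query, so the number of scored pairs grows linearly with sequence length for a fixed window.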

Preprint, 2015, arXiv

Quantum State Synthesis of Superconducting Resonators

We present a theoretical analysis of different methods to synthesize entangled states of two superconducting resonators. These methods use experimentally demonstrated interactions of resonators with artificial atoms, and offer efficient routes to generate nonclassical states. We analyze the theoretical structure of these algorithms and their average performance for arbitrary states and for deterministically preparing NOON states. Using a new state synthesis algorithm, we show that NOON states can be prepared in a time linear in the desired photon number and without any state-selective interactions.

Preprint, 2015, arXiv

States that "look the same" with respect to every basis in a mutually unbiased set

A complete set of mutually unbiased bases in a Hilbert space of dimension $d$ defines a set of $d+1$ orthogonal measurements. Relative to such a set, we define a "MUB-balanced state" to be a pure state for which the list of probabilities of the $d$ outcomes of one of these measurements is independent of the choice of measurement, up to permutations. In this paper we explicitly construct a MUB-balanced state for each prime power dimension $d$ for which $d \equiv 3 \pmod{4}$. These states have already been constructed by Appleby in unpublished notes, but our presentation here is different in that both the expression for the states themselves and the proof of MUB-balancedness are given in terms of the discrete Wigner function, rather than the density matrix or state vector. The discrete Wigner functions of these states are "rotationally symmetric" in a sense roughly analogous to the rotational symmetry of the energy eigenstates of a harmonic oscillator in the continuous two-dimensional phase space. Upon converting the Wigner function to a density matrix, we find that the states are expressible as real state vectors in the standard basis. We observe numerically that when $d$ is large (and not a power of 3), a histogram of the components of such a state vector appears to form a semicircular distribution.
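The definitions in this abstract can be checked numerically in a small case. The sketch below builds the $d+1$ bases for an odd prime $d$ using the standard quadratic-phase construction (an assumption here; the paper works with the discrete Wigner function and covers prime powers), then tests mutual unbiasedness and the MUB-balanced property as defined above. All function names are hypothetical, and no balanced state from the paper is reproduced.

```python
import cmath
import math


def mub_bases(d):
    """d + 1 bases for an odd prime d: the standard basis plus d quadratic-phase bases."""
    omega = cmath.exp(2j * cmath.pi / d)
    bases = [[[complex(j == m) for j in range(d)] for m in range(d)]]
    for k in range(d):
        bases.append([
            [omega ** ((k * j * j + m * j) % d) / math.sqrt(d) for j in range(d)]
            for m in range(d)
        ])
    return bases


def overlap_sq(u, v):
    """Squared magnitude of the inner product <u|v>."""
    return abs(sum(a.conjugate() * b for a, b in zip(u, v))) ** 2


def is_mutually_unbiased(bases, tol=1e-9):
    """True if vectors from different bases always overlap with probability 1/d."""
    d = len(bases[0])
    return all(
        abs(overlap_sq(u, v) - 1.0 / d) < tol
        for a in range(len(bases))
        for b in range(a + 1, len(bases))
        for u in bases[a]
        for v in bases[b]
    )


def is_mub_balanced(state, bases, tol=1e-9):
    """True if the sorted outcome probabilities are the same in every basis."""
    lists = [sorted(overlap_sq(v, state) for v in basis) for basis in bases]
    return all(
        abs(p - q) < tol for probs in lists[1:] for p, q in zip(lists[0], probs)
    )


bases = mub_bases(3)  # d = 3 is prime and congruent to 3 mod 4
```

A standard basis vector is a quick negative example: it gives probabilities $(1, 0, 0)$ in its own basis and $(1/3, 1/3, 1/3)$ in every other, so `is_mub_balanced(bases[0][0], bases)` is false.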

Preprint, 2012, arXiv

The Weierstrass Representation always gives a minimal surface

We give a simple, direct proof of the easy fact about the Weierstrass Representation, namely, that it always gives a minimal surface. Most presentations include the much harder converse that every simply connected minimal surface is given by the Weierstrass Representation.
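For reference, the representation is usually stated as below. This is the standard textbook form, not necessarily the notation or the argument used in the paper. It assumes $f$ is holomorphic and $g$ is meromorphic on a simply connected domain, with $f g^2$ holomorphic.

```latex
\begin{aligned}
\varphi_1 &= \tfrac{1}{2} f (1 - g^2), \qquad
\varphi_2 = \tfrac{i}{2} f (1 + g^2), \qquad
\varphi_3 = f g, \\
x_k(z) &= \operatorname{Re} \int_{z_0}^{z} \varphi_k(w)\, dw, \qquad k = 1, 2, 3, \\
\varphi_1^2 + \varphi_2^2 + \varphi_3^2
  &= \tfrac{1}{4} f^2 (1 - g^2)^2 - \tfrac{1}{4} f^2 (1 + g^2)^2 + f^2 g^2 = 0 .
\end{aligned}
```

The vanishing sum of squares says the parametrization is conformal, and each $x_k$ is the real part of a holomorphic function and therefore harmonic. A conformal harmonic immersion has zero mean curvature, which is the "easy" direction the abstract refers to.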