Source author record

Abhinav Mehrotra

Abhinav Mehrotra appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, eess.AS, Sound, cs.CY, Human-Computer Interaction

Catalog footprint

What is connected

4 works

5 topics

4 close collaborators


preprint, 2022, arXiv

Federated Self-supervised Speech Representations: Are We There Yet?

The ubiquity of microphone-enabled devices has led to large amounts of unlabelled audio data being produced at the edge. The integration of self-supervised learning (SSL) and federated learning (FL) into one coherent system can potentially offer data privacy guarantees while also advancing the quality and robustness of speech representations. In this paper, we provide a first-of-its-kind systematic study of the feasibility and complexities of training speech SSL models under FL scenarios from the perspective of algorithms, hardware, and systems limits. Despite the high potential of their combination, we find that existing system constraints and algorithmic behaviour make SSL and FL systems nearly impossible to build today. Yet critically, our results indicate specific performance bottlenecks and research opportunities that would allow this situation to be reversed. While our analysis suggests that, given existing trends in hardware, hybrid SSL and FL speech systems will not be viable until 2027, we believe this study can act as a roadmap to accelerate work towards reaching this milestone much earlier.

preprint, 2020, arXiv

Bunched LPCNet: Vocoder for Low-cost Neural Text-To-Speech Systems

LPCNet is an efficient vocoder that combines linear prediction and deep neural network modules to keep the computational complexity low. In this work, we present two techniques to further reduce its complexity, aiming for a low-cost LPCNet vocoder-based neural Text-to-Speech (TTS) system. These techniques are: 1) Sample-bunching, which allows LPCNet to generate more than one audio sample per inference; and 2) Bit-bunching, which reduces the computations in the final layer of LPCNet. With the proposed bunching techniques, LPCNet, in conjunction with a Deep Convolutional TTS (DCTTS) acoustic model, shows a 2.19x improvement over the baseline run-time when running on a mobile device, with a decrease of less than 0.1 in TTS mean opinion score (MOS).

preprint, 2020, arXiv

Iterative Compression of End-to-End ASR Model using AutoML

Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques. Past research has shown that an AutoML-based Low Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state-of-the-art in ASR compression.

preprint, 2015, arXiv

Anticipatory Mobile Digital Health: Towards Personalised Proactive Therapies and Prevention Strategies

The last two centuries saw groundbreaking advances in the field of healthcare: from the invention of the vaccine to organ transplants and the eradication of numerous deadly diseases. Yet, these breakthroughs have only illuminated the role that individual traits and behaviours play in the health state of a person. Continuous patient monitoring and individually-tailored therapies can help in early detection and efficient tackling of health issues. However, even the most developed nations cannot afford proactive personalised healthcare at scale. Mobile computing devices, nowadays equipped with an array of sensors and high-performance computing power, and carried by their owners at all times, promise to revolutionise modern healthcare. These devices can enable continuous patient monitoring and, with the help of machine learning, can build predictive models of a patient's health and behaviour. Finally, through their close integration with a user's lifestyle, mobiles can be used to deliver personalised proactive therapies. In this article, we develop the concept of anticipatory mobile-based healthcare - anticipatory mobile digital health - and examine the opportunities and challenges associated with its practical realisation.