Source author record

Robert Gmyr

Robert Gmyr appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Topics: Distributed, Parallel, and Cluster Computing; Emerging Technologies; Machine Learning; Artificial Intelligence; Computation and Language; eess.AS; Computer Vision

Catalog footprint

What is connected

9 works

7 topics

4 close collaborators

Preprint, 2022, arXiv

Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition

The sparsely-gated Mixture of Experts (MoE) can increase a network's capacity with little additional computational complexity. In this work, we investigate how multi-lingual Automatic Speech Recognition (ASR) networks can be scaled up with a simple routing algorithm in order to achieve better accuracy. More specifically, we apply the sparsely-gated MoE technique to two types of networks: Sequence-to-Sequence Transformer (S2S-T) and Transformer Transducer (T-T). We demonstrate through a set of ASR experiments on data from multiple languages that the MoE networks can reduce word error rates by a relative 16.3% and 4.6% with the S2S-T and T-T, respectively. Moreover, we thoroughly investigate the effect of the MoE on the T-T architecture in various conditions: streaming mode, non-streaming mode, the use of language ID, and the label decoder with the MoE.
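As a rough illustration of what sparse gating means, the sketch below routes one input vector to its top-k experts and mixes their outputs with renormalized gate probabilities. It is a minimal toy in plain Python, not the routing used in the paper: the linear gate, the `make_expert` scaling experts, and all parameter names are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: scales every input feature by a constant."""
    return lambda x: [scale * v for v in x]


def moe_layer(x, gate_weights, experts, k=1):
    """Route x to the k highest-scoring experts and mix their outputs.

    gate_weights holds one weight vector per expert (an assumed linear gate).
    Only the selected experts are evaluated, which is why capacity can grow
    with the number of experts while per-input compute stays roughly fixed.
    """
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top])  # renormalize over chosen experts
    out = [0.0] * len(x)
    for p, i in zip(probs, top):
        y = experts[i](x)
        out = [o + p * y_j for o, y_j in zip(out, y)]
    return out, top


x = [1.0, 2.0]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
experts = [make_expert(1.0), make_expert(10.0), make_expert(100.0)]
output, selected = moe_layer(x, gate_weights, experts, k=1)
```

With `k=1` only the second expert runs for this input, so adding more experts would not change the cost of processing it.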

Preprint, 2022, arXiv

i-Code: An Integrative and Composable Multimodal Learning Framework

Human intelligence is multimodal; we integrate visual, linguistic, and acoustic signals to maintain a holistic worldview. Most current pretraining methods, however, are limited to one or two modalities. We present i-Code, a self-supervised pretraining framework where users may flexibly combine the modalities of vision, speech, and language into unified and general-purpose vector representations. In this framework, data from each modality are first given to pretrained single-modality encoders. The encoder outputs are then integrated with a multimodal fusion network, which uses novel attention mechanisms and other architectural innovations to effectively combine information from the different modalities. The entire system is pretrained end-to-end with new objectives including masked modality unit modeling and cross-modality contrastive learning. Unlike previous research using only video for pretraining, the i-Code framework can dynamically process single, dual, and triple-modality data during training and inference, flexibly projecting different combinations of modalities into a single representation space. Experimental results demonstrate how i-Code can outperform state-of-the-art techniques on five video understanding tasks and the GLUE NLP benchmark, improving by as much as 11% and demonstrating the power of integrative multimodal pretraining.

Preprint, 2020, arXiv

Federated Transfer Learning with Dynamic Gradient Aggregation

In this paper, a Federated Learning (FL) simulation platform is introduced. The target scenario is Acoustic Model training based on this platform. To our knowledge, this is the first attempt to apply FL techniques to Speech Recognition tasks, given their inherent complexity. The proposed FL platform can support different tasks thanks to its modular design. As part of the platform, a novel hierarchical optimization scheme and two gradient aggregation methods are proposed, leading to almost an order of magnitude improvement in training convergence speed compared to other distributed or FL training algorithms like BMUF and FedAvg. The hierarchical optimization offers additional flexibility in the training pipeline besides the enhanced convergence speed. On top of the hierarchical optimization, a dynamic gradient aggregation algorithm is proposed, based on a data-driven weight inference. This aggregation algorithm acts as a regularizer of the gradient quality. Finally, an unsupervised training pipeline tailored to FL is presented as a separate training scenario. The experimental validation of the proposed system is based on two tasks. The first is the LibriSpeech task, which shows a speed-up of 7x and a 6% Word Error Rate reduction (WERR) compared to the baseline results. The second is based on session adaptation and provides an improvement of 20% WERR over a competitive production-ready LAS model. The proposed Federated Learning system is shown to outperform the gold standard of distributed training in both convergence speed and overall model performance.
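To make the idea of weighted gradient aggregation concrete, the sketch below combines per-client gradients with either uniform weights or weights derived from each client's training loss. The abstract does not specify how the data-driven weights are inferred, so the softmax over negative losses used here is purely an assumption for illustration, and the names `aggregate`, `loss_based_weights`, and `temperature` are hypothetical.

```python
import math


def aggregate(gradients, weights):
    """Weighted average of client gradients (lists of floats of equal length)."""
    total = sum(weights)
    dim = len(gradients[0])
    return [
        sum(w * g[j] for w, g in zip(weights, gradients)) / total
        for j in range(dim)
    ]


def loss_based_weights(losses, temperature=1.0):
    """Assumed weighting rule: unnormalized softmax over negative losses.

    Clients with a lower training loss receive a larger weight. Subtracting
    the minimum loss keeps the exponentials in a safe numeric range.
    """
    m = min(losses)
    return [math.exp(-(loss - m) / temperature) for loss in losses]


client_gradients = [[1.0, 0.0], [3.0, 2.0]]

# Uniform weights reduce to a plain average of the client gradients.
uniform = aggregate(client_gradients, [1.0, 1.0])

# Loss-based weights down-weight the client that reports a much higher loss.
weighted = aggregate(client_gradients, loss_based_weights([0.1, 5.0]))
```

Under this assumed rule, a client whose update looks unreliable contributes less to the aggregate, which is one simple way a weighting scheme can act as a regularizer on gradient quality.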

Preprint, 2020, arXiv

Sleeping is Efficient: MIS in $O(1)$-rounds Node-averaged Awake Complexity

Maximal Independent Set (MIS) is one of the fundamental problems in distributed computing. Work on the round (time) complexity of distributed MIS has traditionally focused on the worst-case time for all nodes to finish. The best-known (randomized) MIS algorithms take $O(\log{n})$ worst-case rounds on general graphs (where $n$ is the number of nodes). Motivated by the goal to reduce total energy consumption in energy-constrained networks such as sensor and ad hoc wireless networks, we take an alternative approach to measuring performance. We focus on minimizing the total (or equivalently, the average) time for all nodes to finish. It is not clear whether the currently best-known algorithms yield constant-round (or even $o(\log{n})$) node-averaged round complexity for MIS in general graphs. We posit the sleeping model, a generalization of the traditional model, that allows nodes to enter either "sleep" or "waking" states at any round. While the waking state corresponds to the default state in the traditional model, in the sleeping state a node is "offline", i.e., it does not send or receive messages (and messages sent to it are dropped as well) and does not incur any time, communication, or local computation cost. Hence, in this model, only rounds in which a node is awake are counted, and we are interested in minimizing the average as well as the worst-case number of rounds a node spends in the awake state. Our main result is that MIS can be solved in (expected) $O(1)$ rounds under the node-averaged awake complexity measure in the sleeping model. In particular, we present a randomized distributed algorithm for MIS that has expected $O(1)$-rounds node-averaged awake complexity and, with high probability, has $O(\log{n})$-rounds worst-case awake complexity and $O(\log^{3.41}n)$-rounds worst-case complexity.
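The difference between worst-case and node-averaged round complexity can be seen in a small simulation. The sketch below runs a classical Luby-style randomized MIS procedure in the traditional model (it is not the sleeping-model algorithm from the paper) and records the round in which each node finishes. The function names and the sequential simulation of synchronous rounds are assumptions made for this example.

```python
import random


def luby_mis(adj, seed=0):
    """Simulate a Luby-style randomized MIS on an undirected graph.

    adj maps each node to a list of its neighbors. Returns the MIS and a
    dict giving the round in which each node finished (joined or was covered).
    """
    rng = random.Random(seed)
    active = set(adj)
    mis = set()
    finish_round = {}
    rnd = 0
    while active:
        rnd += 1
        # Every active node draws a random priority for this round.
        prio = {v: rng.random() for v in sorted(active)}
        # A node joins the MIS if it beats all of its still-active neighbors.
        joiners = {
            v for v in active
            if all(prio[v] < prio[u] for u in adj[v] if u in active)
        }
        # Joiners and their active neighbors are done after this round.
        removed = set(joiners)
        for v in joiners:
            removed.update(u for u in adj[v] if u in active)
        mis |= joiners
        for v in removed:
            finish_round[v] = rnd
        active -= removed
    return mis, finish_round


def round_statistics(finish_round):
    """Return (node-averaged rounds, worst-case rounds)."""
    rounds = list(finish_round.values())
    return sum(rounds) / len(rounds), max(rounds)


# Cycle on 6 nodes as a small test graph.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
mis, finish_round = luby_mis(cycle, seed=1)
average_rounds, worst_rounds = round_statistics(finish_round)
```

The average can never exceed the worst case, and on many graphs most nodes finish early while a few stragglers determine the worst-case figure. The sleeping model goes further by counting only the rounds in which a node is awake.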

Preprint, 2016, arXiv

Leader Election and Shape Formation with Self-Organizing Programmable Matter

We consider programmable matter consisting of simple computational elements, called particles, that can establish and release bonds and can actively move in a self-organized way, and we investigate the feasibility of solving fundamental problems relevant for programmable matter. As a suitable model for such self-organizing particle systems, we will use a generalization of the geometric amoebot model first proposed in SPAA 2014. Based on the geometric model, we present efficient local-control algorithms for leader election and line formation requiring only particles with constant size memory, and we also discuss the limitations of solving these problems within the general amoebot model.

Preprint, 2016, arXiv

Universal Coating for Programmable Matter

The idea behind universal coating is to have a thin layer of a specific substance covering an object of any shape so that one can measure a certain condition (like temperature or cracks) at any spot on the surface of the object without requiring direct access to that spot. We study the universal coating problem in the context of self-organizing programmable matter consisting of simple computational elements, called particles, that can establish and release bonds and can actively move in a self-organized way. Based on that matter, we present a worst-case work-optimal universal coating algorithm that uniformly coats any object of arbitrary shape and size that allows a uniform coating. Our particles are anonymous, do not have any global information, have constant-size memory, and utilize only local interactions.

Preprint, 2015, arXiv

An Algorithmic Framework for Shape Formation Problems in Self-Organizing Particle Systems

Many proposals have already been made for realizing programmable matter, ranging from shape-changing molecules, DNA tiles, and synthetic cells to reconfigurable modular robotics. Envisioning systems of nano-sensor devices, we are particularly interested in programmable matter consisting of systems of simple computational elements, called particles, that can establish and release bonds and can actively move in a self-organized way, and in shape formation problems relevant for programmable matter in those self-organizing particle systems (SOPS). In this paper, we present a general algorithmic framework for shape formation problems in SOPS, and show direct applications of this framework to the problems of having the particle system self-organize to form a hexagonal or triangular shape. Our algorithms utilize only local control, require only constant-size memory particles, and are asymptotically optimal in terms of the total number of movements needed to reach the desired shape configuration.

Preprint, 2014, arXiv

Infinite Object Coating in the Amoebot Model

The term programmable matter refers to matter which has the ability to change its physical properties (shape, density, moduli, conductivity, optical properties, etc.) in a programmable fashion, based upon user input or autonomous sensing. This has many applications, such as smart materials, autonomous monitoring and repair, and minimally invasive surgery. While programmable matter might have been considered pure science fiction more than two decades ago, in recent years a large amount of research has been conducted in this field. Often programmable matter is envisioned as a very large number of small locally interacting computational particles. We propose the Amoebot model, a new model which builds upon this vision of programmable matter. Inspired by the behavior of amoeba, the Amoebot model offers a versatile framework to model self-organizing particles and facilitates rigorous algorithmic research in the area of programmable matter. We present an algorithm for the problem of coating an infinite object under this model, and prove that the algorithm is correct and work-optimal.

Preprint, 2013, arXiv

Ameba-inspired Self-organizing Particle Systems

Particle systems are physical systems of simple computational particles that can bond to neighboring particles and use these bonds to move from one spot to another (non-occupied) spot. These particle systems are supposed to be able to self-organize in order to adapt to a desired shape without any central control. Self-organizing particle systems have many interesting applications like coating objects for monitoring and repair purposes and the formation of nano-scale devices for surgery and molecular-scale electronic structures. While there has been quite a lot of systems work in this area, especially in the context of modular self-reconfigurable robotic systems, only very little theoretical work has been done in this area so far. We attempt to bridge this gap by proposing a model inspired by the behavior of ameba that allows rigorous algorithmic research on self-organizing particle systems.