Source author record

Daniel Wong

Daniel Wong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning; Distributed, Parallel, and Cluster Computing; eess.AS; eess.SP; Information Theory; math.IT; Sound

Catalog footprint

What is connected

3 works

7 topics

4 close collaborators

Preprint, 2023, arXiv

Rethinking complex-valued deep neural networks for monaural speech enhancement

Despite multiple efforts made towards adopting complex-valued deep neural networks (DNNs), it remains an open question whether complex-valued DNNs are generally more effective than real-valued DNNs for monaural speech enhancement. This work is devoted to presenting a critical assessment by systematically examining complex-valued DNNs against their real-valued counterparts. Specifically, we investigate complex-valued DNN atomic units, including linear layers, convolutional layers, long short-term memory (LSTM), and gated linear units. By comparing complex- and real-valued versions of fundamental building blocks in the recently developed gated convolutional recurrent network (GCRN), we show how different mechanisms for basic blocks affect the performance. We also find that the use of complex-valued operations hinders the model capacity when the model size is small. In addition, we examine two recent complex-valued DNNs, i.e. deep complex convolutional recurrent network (DCCRN) and deep complex U-Net (DCUNET). Evaluation results show that both DNNs produce identical performance to their real-valued counterparts while requiring much more computation. Based on these comprehensive comparisons, we conclude that complex-valued DNNs do not provide a performance gain over their real-valued counterparts for monaural speech enhancement, and thus are less desirable due to their higher computational costs.

Preprint, 2021, arXiv

Information Decoding and SDR Implementation of DFRC Systems Without Training Signals

Recent performance analysis of dual-function radar communications (DFRC) systems, which embed information using phase shift keying (PSK) into multiple-input multiple-output (MIMO) frequency hopping (FH) radar pulses, shows promising results for addressing spectrum sharing issues between radar and communications. However, the problem of decoding information at the communication receiver remains challenging, since the DFRC transmitter is typically assumed to transmit only information-embedded radar waveforms and not the training sequence. We propose a novel method for decoding information at the communication receiver without using training data, which is implemented using a software-defined radio (SDR). The performance of the SDR implementation is examined in terms of bit error rate (BER) as a function of signal-to-noise ratio (SNR) for differential binary and quadrature phase shift keying modulation schemes and compared with the BER versus SNR obtained with numerical simulations.

Preprint, 2021, arXiv

Transferable Graph Optimizers for ML Compilers

Most compilers for machine learning (ML) frameworks need to solve many correlated optimization problems to generate efficient machine code. Current ML compilers rely on heuristic-based algorithms to solve these optimization problems one at a time. However, this approach is not only hard to maintain but often leads to sub-optimal solutions, especially for newer model architectures. Existing learning-based approaches in the literature are sample inefficient, tackle a single optimization problem, and do not generalize to unseen graphs, making them infeasible to deploy in practice. To address these limitations, we propose an end-to-end, transferable deep reinforcement learning method for computational graph optimization (GO), based on a scalable sequential attention mechanism over an inductive graph neural network. GO generates decisions on the entire graph rather than on each individual node autoregressively, drastically speeding up the search compared to prior methods. Moreover, we propose recurrent attention layers to jointly optimize dependent graph optimization tasks and demonstrate 33%-60% speedup on three graph optimization tasks compared to TensorFlow default optimization. On a diverse set of representative graphs consisting of up to 80,000 nodes, including Inception-v3, Transformer-XL, and WaveNet, GO achieves on average 21% improvement over human experts and 18% improvement over the prior state of the art with 15x faster convergence, on a device placement task evaluated in real systems.