Source author record

Wai Ho Mow

Wai Ho Mow appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Topics: Information Theory, math.IT, eess.SP, Machine Learning, Performance, math.PR

Catalog footprint

What is connected

11 works

6 topics

4 close collaborators


preprint · 2026 · arXiv

OptiVote: Non-Coherent FSO Over-the-Air Majority Vote for Communication-Efficient Distributed Federated Learning in Space Data Centers

The rapid deployment of mega-constellations is driving the long-term vision of space data centers (SDCs), where interconnected satellites form in-orbit distributed computing and learning infrastructures. Enabling distributed federated learning in such systems is challenging because iterative training requires frequent aggregation over inter-satellite links that are bandwidth- and energy-constrained, and the link conditions can be highly dynamic. In this work, we exploit over-the-air computation (AirComp) as an in-network aggregation primitive. However, conventional coherent AirComp relies on stringent phase alignment, which is difficult to maintain in space environments due to satellite jitter and Doppler effects. To overcome this limitation, we propose OptiVote, a robust and communication-efficient non-coherent free-space optical (FSO) AirComp framework for federated learning toward Space Data Centers. OptiVote integrates sign stochastic gradient descent (signSGD) with a majority-vote (MV) aggregation principle and pulse-position modulation (PPM), where each satellite conveys local gradient signs by activating orthogonal PPM time slots. The aggregation node performs MV detection via non-coherent energy accumulation, transforming phase-sensitive field superposition into phase-agnostic optical intensity combining, thereby eliminating the need for precise phase synchronization and improving resilience under dynamic impairments. To mitigate aggregation bias induced by heterogeneous FSO channels, we further develop an importance-aware, channel state information (CSI)-free dynamic power control scheme that balances received energies without additional signaling. We provide theoretical analysis by characterizing the aggregate error probability under statistical FSO channels and establishing convergence guarantees for non-convex objectives.
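The majority-vote idea in this abstract can be illustrated with a minimal sketch. It is not the OptiVote implementation: it assumes two PPM slots per gradient coordinate (one for each sign), real-valued received intensity gains, and optional additive Gaussian noise on the accumulated slot energies. The function name `ppm_majority_vote` and its parameters are hypothetical.

```python
import random

def ppm_majority_vote(signs, gains, noise_std=0.0, rng=None):
    """Sketch of non-coherent majority vote over two PPM slots per coordinate.

    signs: one +1/-1 vector per satellite (local gradient signs).
    gains: assumed received optical intensity gain per satellite.
    """
    rng = rng or random.Random(0)
    dim = len(signs[0])
    votes = []
    for j in range(dim):
        # Each satellite puts its energy in the slot that matches its sign;
        # the receiver only sees the summed intensity per slot (no phase).
        e_plus = sum(g for s, g in zip(signs, gains) if s[j] > 0)
        e_minus = sum(g for s, g in zip(signs, gains) if s[j] < 0)
        e_plus += rng.gauss(0.0, noise_std)
        e_minus += rng.gauss(0.0, noise_std)
        votes.append(1 if e_plus >= e_minus else -1)
    return votes

local_signs = [[1, -1, 1], [1, -1, -1], [-1, -1, 1]]
equal = ppm_majority_vote(local_signs, gains=[1.0, 1.0, 1.0])
skewed = ppm_majority_vote(local_signs, gains=[1.0, 1.0, 5.0])
```

With equal gains the energy comparison reduces to a plain majority vote. With one dominant gain the strong link can outvote the other two, which is the kind of aggregation bias that the abstract's power control scheme is meant to mitigate.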

preprint · 2022 · arXiv

Fast Performance Evaluation of Linear Block Codes over Memoryless Continuous Channels

In a growing number of communication scenarios, the noise exhibits impulsive behavior and is not adequately modeled by the Gaussian distribution. The generalized Gaussian distribution is instead an effective model for real-world systems with impulsive noise. In this paper, the problem of efficiently evaluating the error performance of linear block codes over an additive white generalized Gaussian noise (AWGGN) channel is considered. Monte Carlo (MC) simulation is a widely used but inefficient performance evaluation method, especially in the low error probability regime. As a variance-reduction technique, importance sampling (IS) can significantly reduce the sample size needed for reliable estimation, given a well-designed IS distribution. By deriving the optimal IS distribution on the one-dimensional space mapped from the observation space, we present a general framework for designing IS estimators for memoryless continuous channels. Specifically, for the AWGGN channel, we propose an $L_p$-norm-based minimum-variance IS estimator. As an efficiency measure, the asymptotic IS gain of the proposed estimator is derived in a multiple integral form as the signal-to-noise ratio tends to infinity. In particular, for Laplace and Gaussian noise, the gains can be derived in a one-dimensional integral form, which makes the numerical calculation affordable. In addition, by limiting the use of the union bound to an optimized $L_1$-norm sphere, we derive the sphere bound for the additive white Laplace noise channel. Simulation results verify the accuracy of the derived IS gain in predicting the efficiency of the proposed IS estimator.
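Why importance sampling helps in the low error probability regime can be seen in a small sketch. This is not the paper's $L_p$-norm-based estimator: it is the textbook mean-shifted IS estimator for a Gaussian tail probability $P(Z > t)$, chosen here only because its exact value is available for comparison. All function names are hypothetical.

```python
import math
import random

def tail_prob_mc(t, n, rng):
    """Plain Monte Carlo estimate of P(Z > t) for Z ~ N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) > t for _ in range(n)) / n

def tail_prob_is(t, n, rng):
    """Importance sampling estimate using the shifted proposal N(t, 1).

    The likelihood ratio of N(0, 1) to N(t, 1) at x is exp(-t*x + t^2/2).
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(t, 1.0)
        if x > t:
            total += math.exp(-t * x + 0.5 * t * t)
    return total / n

def tail_prob_exact(t):
    """Exact Gaussian tail probability via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

rng = random.Random(1)
exact = tail_prob_exact(4.0)
is_estimate = tail_prob_is(4.0, 20000, rng)
mc_estimate = tail_prob_mc(4.0, 20000, rng)
```

At $t = 4$ the exact probability is about $3.2 \times 10^{-5}$, so 20000 plain MC samples usually contain no tail events or only one, while roughly half of the samples from the shifted proposal fall in the tail and contribute a weighted term.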

preprint · 2020 · arXiv

Deep Multi-Task Learning for Cooperative NOMA: System Design and Principles

Envisioned as a promising component of the future wireless Internet-of-Things (IoT) networks, the non-orthogonal multiple access (NOMA) technique can support massive connectivity with a significantly increased spectral efficiency. Cooperative NOMA is able to further improve the communication reliability of users under poor channel conditions. However, the conventional system design suffers from several inherent limitations and is not optimized from the bit error rate (BER) perspective. In this paper, we develop a novel deep cooperative NOMA scheme, drawing upon the recent advances in deep learning (DL). We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner. On this basis, we construct multiple loss functions to quantify the BER performance and propose a novel multi-task oriented two-stage training method to solve the end-to-end training problem in a self-supervised manner. The learning mechanism of each DNN module is then analyzed based on information theory, offering insights into the proposed DNN architecture and its corresponding training method. We also adapt the proposed scheme to handle the power allocation (PA) mismatch between training and inference and incorporate it with channel coding to combat signal deterioration. Simulation results verify its advantages over orthogonal multiple access (OMA) and the conventional cooperative NOMA scheme in various scenarios.

preprint · 2020 · arXiv

Detection and Performance Analysis for Non-Coherent DF Relay Networks with Optimized Generalized Differential Modulation

This paper studies the detection and performance analysis problems for a relay network with $N$ parallel decode-and-forward (DF) relays. Due to the distributed nature of this network, it is practically very challenging to fulfill the requirement of instantaneous channel state information for coherent detection. To bypass this requirement, we consider the use of non-coherent DF relaying based on a generalized differential modulation (GDM) scheme, in which transmission power allocation over the $M$-ary phase shift keying symbols is exploited when performing differential encoding. In this paper, a novel detector at the destination of such a non-coherent DF relay network is proposed. It is an accurate approximation of the state-of-the-art detector, called the almost maximum likelihood detector (AMLD), but the detection complexity is considerably reduced from $\mathcal{O}(M^2N)$ to $\mathcal{O}(MN)$. By characterizing the dominant error terms, we derive an accurate approximate symbol error rate (SER) expression. An optimized power allocation scheme for GDM is further designed based on this SER expression. Our simulation demonstrates that the proposed non-coherent scheme can perform close to the coherent counterpart as the block length increases. Additionally, we prove that the diversity order of both the proposed detector and the AMLD is exactly $\lceil N/2 \rceil + 1$. Extensive simulation results further verify the accuracy of our results in various scenarios.
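The reason differential modulation removes the need for instantaneous channel state information can be sketched with ordinary differential $M$-PSK. This is the classical equal-power scheme, not the generalized differential modulation or the detector proposed in the paper, and it assumes a single noiseless link whose unknown complex gain stays constant over the block. The function names are hypothetical.

```python
import cmath

def dpsk_encode(symbols, M):
    """Differentially encode M-ary symbols, starting from a reference symbol."""
    tx = [1 + 0j]
    for m in symbols:
        # Information is carried by the phase change between consecutive symbols.
        tx.append(tx[-1] * cmath.exp(2j * cmath.pi * m / M))
    return tx

def dpsk_decode(rx, M):
    """Recover symbols from phase differences; the channel gain cancels out."""
    out = []
    for prev, cur in zip(rx, rx[1:]):
        phase = cmath.phase(cur * prev.conjugate())
        out.append(round(phase * M / (2 * cmath.pi)) % M)
    return out

data = [0, 1, 3, 2]
h = 0.8 * cmath.exp(1.234j)  # unknown channel gain, never used by the decoder
received = [h * s for s in dpsk_encode(data, 4)]
decoded = dpsk_decode(received, 4)
```

Because the product of a received sample with the conjugate of the previous one has phase equal to the transmitted phase step, the decoder never needs an estimate of `h`.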

preprint · 2020 · arXiv

Near-optimal Detector for SWIPT-enabled Differential DF Relay Networks with SER Analysis

In this paper, we analyze the symbol error rate (SER) performance of the simultaneous wireless information and power transfer (SWIPT) enabled three-node differential decode-and-forward (DDF) relay networks, which adopt the power splitting (PS) protocol at the relay. The use of non-coherent differential modulation eliminates the need for sending training symbols to estimate the instantaneous channel state information (CSI) at all network nodes, and therefore improves the power efficiency, as compared with the coherent modulation. However, performance analysis results are not yet available for the state-of-the-art detectors such as the approximate maximum-likelihood detector. Existing works rely on Monte-Carlo simulation to show that there exists an optimal PS ratio that minimizes the overall SER. In this work, we propose a near-optimal detector with linear complexity with respect to the modulation size. We derive an accurate approximate SER expression, based on which the optimal PS ratio can be accurately estimated without requiring any Monte-Carlo simulation.

preprint · 2020 · arXiv

SER Analysis for SWIPT-Enabled Differential Decode-and-Forward Relay Networks

In this paper, we analyze the symbol error rate (SER) performance of the simultaneous wireless information and power transfer (SWIPT) enabled three-node differential decode-and-forward (DDF) relay networks, which adopt the power splitting (PS) protocol at the relay. The use of non-coherent differential modulation eliminates the need for sending training symbols to estimate the instantaneous channel state information (CSI) at all network nodes, and therefore improves the power efficiency, as compared with the coherent modulation. However, performance analysis results are not yet available for the state-of-the-art detectors such as the maximum-likelihood detector (MLD) and approximate MLD. Existing works rely on the Monte-Carlo simulation method to show the existence of an optimal PS ratio that minimizes the overall SER. In this work, we propose a near-optimal detector with linear complexity with respect to the modulation size. We derive an approximate SER expression and prove that the proposed detector achieves the full diversity order. Based on our expression, the optimal PS ratio can be accurately estimated without requiring any Monte-Carlo simulation. We also extend the proposed detector and its SER analysis for adopting the time switching (TS) protocol at the relay. Simulation results verify the effectiveness of our proposed detector and the accuracy of our SER results in various network scenarios for both PS and TS protocols.

preprint · 2016 · arXiv

A Quadratic Programming Relaxation Approach to Compute-and-Forward Network Coding Design

Using physical layer network coding, compute-and-forward is a promising relaying scheme that effectively exploits the interference between users and thus achieves high rates. In this paper, we consider the problem of finding the optimal integer-valued coefficient vector for a relay in the compute-and-forward scheme to maximize the computation rate at that relay. Although this problem turns out to be a shortest vector problem, which is suspected to be NP-hard, we show that it can be relaxed to a series of equality-constrained quadratic programs. The solutions of the relaxed problems serve as real-valued approximations of the optimal coefficient vector, and are quantized to a set of integer-valued vectors, from which a coefficient vector is selected. The key to the efficiency of our method is that the closed-form expressions of the real-valued approximations can be derived with the Lagrange multiplier method. Numerical results demonstrate that compared with the existing methods, our method offers comparable rates at an impressively low complexity.

preprint · 2016 · arXiv

An Efficient Algorithm for Optimally Solving a Shortest Vector Problem in Compute-and-Forward Protocol Design

We consider the problem of finding the optimal coefficient vector that maximizes the computation rate at a relay in the compute-and-forward scheme. Based on the idea of sphere decoding, we propose a highly efficient algorithm that finds the optimal coefficient vector. First, we derive a novel algorithm to transform the original quadratic form optimization problem into a shortest vector problem (SVP) using the Cholesky factorization. Instead of computing the Cholesky factor explicitly, the proposed algorithm realizes the Cholesky factorization with only $\mathcal{O}(n)$ flops by taking advantage of the structure of the Gram matrix in the quadratic form. Then, we propose some conditions that can be checked with $\mathcal{O}(n)$ flops, under which a unit vector is the optimal coefficient vector. Finally, by taking into account some useful properties of the optimal coefficient vector, we modify the Schnorr-Euchner search algorithm to solve the SVP. We show that the estimated average complexity of our new algorithm is $\mathcal{O}(n^{1.5}P^{0.5})$ flops for i.i.d. Gaussian channel entries with SNR $P$ based on the Gaussian heuristic. Simulations show that our algorithm is not only much more efficient than the existing ones that give the optimal solution, but also faster than some best known suboptimal methods. Besides, we show that our algorithm can be readily adapted to output a list of $L$ best candidate vectors for use in the compute-and-forward design. The estimated average complexity of the resultant list-output algorithm is $\mathcal{O}\left(n^{1.5}P^{0.5}\log L + nL\right)$ flops for i.i.d. Gaussian channel entries.
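The optimization problem described here can be made concrete with a brute-force sketch. It is not the paper's sphere-decoding algorithm: it simply enumerates small integer vectors, which is only feasible for tiny dimensions. It assumes the standard real-valued compute-and-forward computation rate, $R(\mathbf{h}, \mathbf{a}) = \tfrac{1}{2}\log_2^{+}\big(1 / (\|\mathbf{a}\|^2 - \tfrac{P(\mathbf{h}^T\mathbf{a})^2}{1 + P\|\mathbf{h}\|^2})\big)$, which the abstract does not state explicitly. The function names and the search bound are hypothetical.

```python
import itertools
import math

def computation_rate(h, a, P):
    """Computation rate for channel h, integer coefficients a and SNR P.

    The denominator q equals a^T G a with G = I - P/(1 + P*||h||^2) * h h^T,
    the Gram matrix of the quadratic form to be minimized.
    """
    c = P / (1.0 + P * sum(x * x for x in h))
    q = sum(x * x for x in a) - c * sum(x * y for x, y in zip(h, a)) ** 2
    return max(0.0, 0.5 * math.log2(1.0 / q))

def brute_force_coefficients(h, P, bound=2):
    """Exhaustively search nonzero integer vectors with entries in [-bound, bound]."""
    best_a, best_rate = None, -1.0
    for a in itertools.product(range(-bound, bound + 1), repeat=len(h)):
        if any(a):
            r = computation_rate(h, a, P)
            if r > best_rate:
                best_a, best_rate = a, r
    return best_a, best_rate

best_a, best_rate = brute_force_coefficients([1.0, 1.0], P=10.0)
```

The search cost grows as $(2\,\text{bound} + 1)^n$, which is why the abstract's structured search with an estimated average complexity of $\mathcal{O}(n^{1.5}P^{0.5})$ flops matters in practice. Note that `a` and `-a` give the same rate, so the optimum is unique only up to sign.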

preprint · 2015 · arXiv

Improving Two-Way Selective Decode-and-forward Wireless Relaying with Energy-Efficient One-bit Soft Forwarding

Motivated by applications such as battery-operated wireless sensor networks (WSN), we propose an easy-to-implement energy-efficient two-way relaying scheme. In particular, we address the challenge of improving the standard two-way selective decode-and-forward protocol (TW-SDF) in terms of block-error-rate (BLER) with minor additional complexity and energy consumption. By following the principle of soft relaying, our solution is the two-way one-bit soft forwarding (TW-1bSF) protocol in which the relay forwards the one-bit quantization of a posterior information metric about the transmitted bits, associated with an appropriately designed reliability parameter. In WSN-related standards (such as IEEE802.15.6 and Bluetooth), block codes are adopted instead of convolutional and other sophisticated codes, due to their efficient decoder hardware implementation. As the second main contribution, we derive tight upper bounds on the BLER performance for both TW-SDF and TW-1bSF, when the two-way relaying network employs block codes and hard decoding. The error probability analysis confirms the superiority of TW-1bSF. Moreover, we derive the asymptotic performance gain of TW-1bSF over TW-SDF, which further suggests that the proposed protocol is a good choice, especially when long block codes are used.

preprint · 2011 · arXiv

Exact Regenerating Codes for Byzantine Fault Tolerance in Distributed Storage

Due to the use of commodity software and hardware, crash-stop and Byzantine failures are likely to be more prevalent in today's large-scale distributed storage systems. Regenerating codes have been shown in the literature to be a more efficient way to disperse information across multiple nodes and recover from crash-stop failures. In this paper, we present the design of regenerating codes in conjunction with integrity checks that allow exact regeneration of failed nodes and data reconstruction in the presence of Byzantine failures. A progressive decoding mechanism is incorporated in both procedures to leverage computation performed thus far. The fault-tolerance and security properties of the schemes are also analyzed.

preprint · 2010 · arXiv

Variants of the LLL Algorithm in Digital Communications: Complexity Analysis and Fixed-Complexity Implementation

The Lenstra-Lenstra-Lovász (LLL) algorithm is the most practical lattice reduction algorithm in digital communications. In this paper, several variants of the LLL algorithm with either lower theoretical complexity or fixed-complexity implementation are proposed and/or analyzed. First, the $O(n^4\log n)$ theoretical average complexity of the standard LLL algorithm under the model of i.i.d. complex normal distribution is derived. Then, the use of effective LLL reduction for lattice decoding is presented, where size reduction is only performed for pairs of consecutive basis vectors. Its average complexity is shown to be $O(n^3\log n)$, which is an order lower than previously thought. To address the issue of variable complexity of standard LLL, two fixed-complexity approximations of LLL are proposed. One is fixed-complexity effective LLL, while the other is fixed-complexity LLL with deep insertion, which is closely related to the well-known V-BLAST algorithm. Such fixed-complexity structures are highly desirable in hardware implementation since they allow straightforward constant-throughput implementation.
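For orientation, the standard LLL algorithm that these variants start from can be sketched as follows. This is a plain textbook version for real integer bases with exact rational arithmetic, not the effective or fixed-complexity variants proposed in the paper, and it recomputes the Gram-Schmidt data at every step for clarity rather than speed. The function names and the default $\delta = 3/4$ are assumptions made for illustration.

```python
from fractions import Fraction

def gram_schmidt(basis):
    """Return orthogonalized vectors and Gram-Schmidt coefficients mu[i][j]."""
    ortho = []
    mu = [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            denom = sum(x * x for x in ortho[j])
            mu[i][j] = sum(Fraction(x) * y for x, y in zip(b, ortho[j])) / denom
            v = [vi - mu[i][j] * oj for vi, oj in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu

def norm_sq(v):
    return sum(x * x for x in v)

def lll(basis, delta=Fraction(3, 4)):
    """Textbook LLL reduction of a list of linearly independent integer row vectors."""
    basis = [list(b) for b in basis]
    k = 1
    while k < len(basis):
        # Size reduction of b_k against all earlier vectors.
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(basis)
            q = round(mu[k][j])
            if q:
                basis[k] = [a - q * b for a, b in zip(basis[k], basis[j])]
        ortho, mu = gram_schmidt(basis)
        # Lovasz condition: advance if satisfied, otherwise swap and step back.
        if norm_sq(ortho[k]) >= (delta - mu[k][k - 1] ** 2) * norm_sq(ortho[k - 1]):
            k += 1
        else:
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            k = max(k - 1, 1)
    return basis

reduced = lll([[3, 1], [1, 0]])
```

The size-reduction loop here runs over all earlier vectors. The effective LLL variant described in the abstract restricts it to consecutive pairs, which would correspond to keeping only the `j = k - 1` iteration.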
