Source author record

Xiao Ma

Xiao Ma appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

61 works

22 topics

4 close collaborators

preprint, 2026, arXiv

Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation

We present Chain-of-Action (CoA), a novel visuo-motor policy paradigm built upon Trajectory Autoregressive Modeling. Unlike conventional approaches that predict next step action(s) forward, CoA generates an entire trajectory by explicit backward reasoning with task-specific goals through an action-level Chain-of-Thought (CoT) process. This process is unified within a single autoregressive structure: (1) the first token corresponds to a stable keyframe action that encodes the task-specific goals; and (2) subsequent action tokens are generated autoregressively, conditioned on the initial keyframe and previously predicted actions. This backward action reasoning enforces a global-to-local structure, allowing each local action to be tightly constrained by the final goal. To further realize the action reasoning structure, CoA incorporates four complementary designs: continuous action token representation; dynamic stopping for variable-length trajectory generation; reverse temporal ensemble; and multi-token prediction to balance action chunk modeling with global structure. As a result, CoA gives strong spatial generalization capabilities while preserving the flexibility and simplicity of a visuo-motor policy. Empirically, we observe CoA achieves the state-of-the-art performance across 60 RLBench tasks and 8 real-world manipulation tasks.

preprint, 2026, arXiv

GR-Dexter Technical Report

Vision-language-action (VLA) models have enabled language-conditioned, long-horizon robot manipulation, but most existing systems are limited to grippers. Scaling VLA policies to bimanual robots with high degree-of-freedom (DoF) dexterous hands remains challenging due to the expanded action space, frequent hand-object occlusions, and the cost of collecting real-robot data. We present GR-Dexter, a holistic hardware-model-data framework for VLA-based generalist manipulation on a bimanual dexterous-hand robot. Our approach combines the design of a compact 21-DoF robotic hand, an intuitive bimanual teleoperation system for real-robot data collection, and a training recipe that leverages teleoperated robot trajectories together with large-scale vision-language and carefully curated cross-embodiment datasets. Across real-world evaluations spanning long-horizon everyday manipulation and generalizable pick-and-place, GR-Dexter achieves strong in-domain performance and improved robustness to unseen objects and unseen instructions. We hope GR-Dexter serves as a practical step toward generalist dexterous-hand robotic manipulation.

preprint, 2026, arXiv

Hand-in-the-Loop: Improving Dexterous VLA via Seamless Interventional Correction

Vision-Language-Action (VLA) models are prone to compounding errors in dexterous manipulation, where high-dimensional action spaces and contact-rich dynamics amplify small policy deviations over long horizons. While Interactive Imitation Learning (IIL) can refine policies through human takeover data, applying it to high-degree-of-freedom (DoF) robotic hands remains challenging due to a command mismatch between human teleoperation and policy execution at the takeover moment, which causes abrupt robot-hand configuration changes, or "gesture jumps". We present Hand-in-the-Loop (HandITL), a seamless human-in-the-loop intervention method that blends human corrective intent with autonomous policy execution to avoid gesture jumps during bimanual dexterous manipulation. Compared with direct teleoperation takeover, HandITL reduces takeover jitter by 99.8% and preserves robust post-takeover manipulation, reducing grasp failures by 87.5% and mean completion time by 19.1%. We validate HandITL on tasks requiring bimanual coordination, tool use, and fine-grained long-horizon manipulation. When used to collect intervention data for policy refinement, HandITL yields policies that outperform those trained with standard teleoperation data by 19% on average across three long-horizon dexterous tasks.

preprint, 2026, arXiv

RePose: A Real-Time 3D Human Pose Estimation and Biomechanical Analysis Framework for Rehabilitation

We propose a real-time 3D human pose estimation and motion analysis method termed RePose for rehabilitation training. It is capable of real-time monitoring and evaluation of patients' motion during rehabilitation, providing immediate feedback and guidance to assist patients in executing rehabilitation exercises correctly. First, we introduce a unified pipeline for end-to-end real-time human pose estimation and motion analysis using RGB video input from multiple cameras, which can be applied to the field of rehabilitation training. The pipeline can help to monitor and correct patients' actions, thus aiding them in regaining muscle strength and motor functions. Second, we propose a fast tracking method for medical rehabilitation scenarios with multiple-person interference, which requires less than 1 ms to track a single frame. Additionally, we modify SmoothNet for real-time posture estimation, effectively reducing pose estimation errors and restoring the patient's true motion state, making it visually smoother. Finally, we use the Unity platform for real-time monitoring and evaluation of patients' motion during rehabilitation, and to display the muscle stress conditions to assist patients with their rehabilitation training.

preprint, 2026, arXiv

Test-time generative augmentation for medical image segmentation

Medical image segmentation is critical for clinical diagnosis, treatment planning, and monitoring, yet segmentation models often struggle with uncertainties stemming from occlusions, ambiguous boundaries, and variations in imaging devices. Traditional test-time augmentation (TTA) techniques typically rely on predefined geometric and photometric transformations, limiting their adaptability and effectiveness in complex medical scenarios. In this study, we introduced Test-Time Generative Augmentation (TTGA), a novel augmentation strategy specifically tailored for medical image segmentation at inference time. Different from conventional augmentation strategies that suffer from excessive randomness or limited flexibility, TTGA leverages a domain-fine-tuned generative model to produce contextually relevant and diverse augmentations tailored to the characteristics of each test image. Built upon diffusion model inversion, a masked null-text inversion method is proposed to enable region-specific augmentations during sampling. Furthermore, a dual denoising pathway is designed to balance precise identity preservation with controlled variability. We demonstrate the efficacy of our TTGA through extensive experiments across three distinct segmentation tasks spanning nine datasets. Our results consistently demonstrate that TTGA not only improves segmentation accuracy (with DSC gains ranging from 0.1% to 2.3% over the baseline) but also offers pixel-wise error estimation (with DSC gains ranging from 1.1% to 29.0% over the baseline). The source code and demonstration are available at: https://github.com/maxiao0234/TTGA.
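
The DSC figures quoted above refer to the Dice similarity coefficient, the standard overlap score for segmentation masks. As a minimal sketch of how that score is computed (not the TTGA code itself), the function below takes two binary masks flattened to 0/1 sequences; the name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient of two binary masks given as flat 0/1 sequences.

    Hypothetical helper for illustration; not taken from the TTGA repository.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    prediction = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print(round(dice_coefficient(prediction, ground_truth), 4))
```

A gain of 2.3% in DSC therefore means the overlap ratio itself rose by 0.023 on this 0-to-1 scale, if the reported gains are absolute differences.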

preprint, 2022, arXiv

Almost sharp wave kinetic theory of multidimensional KdV type equations with $d\ge 3$

In this work, we study the random series expansion of a multidimensional KdV type equation with a diffusion term, the so-called Zakharov-Kuznetsov (ZK) equation. We impose random initial data and periodic boundary condition with period $L$ on this equation. Using the random series expansion, we derive the $3$-wave kinetic equation on the inertial range for $t\lesssim L^{-\varepsilon}T_{\text{kin}}$. Our result reaches kinetic time scale up to $\varepsilon$ loss.

preprint, 2022, arXiv

Benchmarking of DL Libraries and Models on Mobile Devices

Deploying deep learning (DL) on mobile devices has been a notable trend in recent years. To support fast inference of on-device DL, DL libraries play a critical role as algorithms and hardware do. Unfortunately, no prior work ever dives deep into the ecosystem of modern DL libs and provides quantitative results on their performance. In this paper, we first build a comprehensive benchmark that includes 6 representative DL libs and 15 diversified DL models. We then perform extensive experiments on 10 mobile devices, which help reveal a complete landscape of the current mobile DL libs ecosystem. For example, we find that the best-performing DL lib is severely fragmented across different models and hardware, and the gap between those DL libs can be rather huge. In fact, the impacts of DL libs can overwhelm the optimizations from algorithms or hardware, e.g., model quantization and GPU/DSP-based heterogeneous computing. Finally, atop the observations, we summarize practical implications to different roles in the DL lib ecosystem.

preprint, 2022, arXiv

From Earth to Space: A First Deployment of 5G Core Network on Satellite

Recent developments in the aerospace industry have led to a dramatic reduction in the manufacturing and launch costs of low Earth orbit satellites. The new trend enables the paradigm shift of satellite-terrestrial integrated networks with global coverage. In particular, the integration of 5G communication systems and satellites has the potential to restructure next-generation mobile networks. By leveraging the network function virtualization and network slicing, the orbital 5G core networks will facilitate the coordination and management of network functions in satellite-terrestrial integrated networks. We are the first to deploy a lightweight 5G core network on a real-world satellite to investigate its feasibility. We conducted experiments to validate the onboard 5G core network functions. The validated procedures include registration and session setup procedures. The results show that the 5G core network can function normally and generate correct signaling.

preprint, 2022, arXiv

Hierarchical Reinforcement Learning under Mixed Observability

The framework of mixed observable Markov decision processes (MOMDP) models many robotic domains in which some state variables are fully observable while others are not. In this work, we identify a significant subclass of MOMDPs defined by how actions influence the fully observable components of the state and how those, in turn, influence the partially observable components and the rewards. This unique property allows for a two-level hierarchical approach we call HIerarchical Reinforcement Learning under Mixed Observability (HILMO), which restricts partial observability to the top level while the bottom level remains fully observable, enabling higher learning efficiency. The top level produces desired goals to be reached by the bottom level until the task is solved. We further develop theoretical guarantees to show that our approach can achieve optimal and quasi-optimal behavior under mild assumptions. Empirical results on long-horizon continuous control tasks demonstrate the efficacy and efficiency of our approach in terms of improved success rate, sample efficiency, and wall-clock training time. We also deploy policies learned in simulation on a real robot.

preprint, 2022, arXiv

Imitation Learning via Differentiable Physics

Existing imitation learning (IL) methods such as inverse reinforcement learning (IRL) usually have a double-loop training process, alternating between learning a reward function and a policy and tend to suffer long training time and high variance. In this work, we identify the benefits of differentiable physics simulators and propose a new IL method, i.e., Imitation Learning via Differentiable Physics (ILD), which gets rid of the double-loop design and achieves significant improvements in final performance, convergence speed, and stability. The proposed ILD incorporates the differentiable physics simulator as a physics prior into its computational graph for policy learning. It unrolls the dynamics by sampling actions from a parameterized policy, simply minimizing the distance between the expert trajectory and the agent trajectory, and back-propagating the gradient into the policy via temporal physics operators. With the physics prior, ILD policies can not only be transferable to unseen environment specifications but also yield higher final performance on a variety of tasks. In addition, ILD naturally forms a single-loop structure, which significantly improves the stability and training speed. To simplify the complex optimization landscape induced by temporal physics operations, ILD dynamically selects the learning objectives for each state during optimization. In our experiments, we show that ILD outperforms state-of-the-art methods in a variety of continuous control tasks with Brax, requiring only one expert demonstration. In addition, ILD can be applied to challenging deformable object manipulation tasks and can be generalized to unseen configurations.

preprint, 2022, arXiv

Label Adversarial Learning for Skeleton-level to Pixel-level Adjustable Vessel Segmentation

You can have your cake and eat it too. Microvessel segmentation in optical coherence tomography angiography (OCTA) images remains challenging. Skeleton-level segmentation shows clear topology but without diameter information, while pixel-level segmentation shows a clear caliber but low topology. To close this gap, we propose a novel label adversarial learning (LAL) for skeleton-level to pixel-level adjustable vessel segmentation. LAL mainly consists of two designs: a label adversarial loss and an embeddable adjustment layer. The label adversarial loss establishes an adversarial relationship between the two label supervisions, while the adjustment layer adjusts the network parameters to match the different adversarial weights. Such a design can efficiently capture the variation between the two supervisions, making the segmentation continuous and tunable. This continuous process allows us to recommend high-quality vessel segmentation with clear caliber and topology. Experimental results show that our results outperform manual annotations of current public datasets and conventional filtering effects. Furthermore, such a continuous process can also be used to generate an uncertainty map representing weak vessel boundaries and noise.

preprint, 2022, arXiv

Learning Latent Graph Dynamics for Visual Manipulation of Deformable Objects

Manipulating deformable objects, such as ropes and clothing, is a long-standing challenge in robotics, because of their large degrees of freedom, complex non-linear dynamics, and self-occlusion in visual perception. The key difficulty is a suitable representation, rich enough to capture the object shape, dynamics for manipulation and yet simple enough to be estimated reliably from visual observations. This work aims to learn latent Graph dynamics for DefOrmable Object Manipulation (G-DOOM). G-DOOM approximates a deformable object as a sparse set of interacting keypoints, which are extracted automatically from images via unsupervised learning. It learns a graph neural network that captures abstractly the geometry and the interaction dynamics of the keypoints. To handle object self-occlusion, G-DOOM uses a recurrent neural network to track the keypoints over time and condition their interactions on the history. We then train the resulting recurrent graph dynamics model through contrastive learning in a high-fidelity simulator. For manipulation planning, G-DOOM reasons explicitly about the learned dynamics model through model-predictive control applied at each keypoint. Preliminary experiments of G-DOOM on a set of challenging rope and cloth manipulation tasks indicate strong performance, compared with state-of-the-art methods. Although trained in a simulator, G-DOOM transfers directly to a real robot for both rope and cloth manipulation.

preprint, 2022, arXiv

On Exploring Pose Estimation as an Auxiliary Learning Task for Visible-Infrared Person Re-identification

Visible-infrared person re-identification (VI-ReID) has been challenging due to the existence of large discrepancies between visible and infrared modalities. Most pioneering approaches reduce intra-class variations and inter-modality discrepancies by learning modality-shared and ID-related features. However, an explicit modality-shared cue, i.e., body keypoints, has not been fully exploited in VI-ReID. Additionally, existing feature learning paradigms impose constraints on either global features or partitioned feature stripes, which neglects the prediction consistency of global and part features. To address the above problems, we exploit Pose Estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework. By jointly training these two tasks in a mutually beneficial manner, our model learns higher quality modality-shared and ID-related features. On top of it, the learning of global features and local features is seamlessly synchronized by Hierarchical Feature Constraint (HFC), where the former supervises the latter using the knowledge distillation strategy. Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently improves state-of-the-art methods by significant margins. Specifically, our method achieves nearly 20% mAP improvements against the state-of-the-art method on the RegDB dataset. Our intriguing findings highlight the usage of auxiliary task learning in VI-ReID.

preprint, 2022, arXiv

Spectral radius and rainbow matchings of graphs

Let $n,m$ be integers such that $1\leq m\leq (n-2)/2$ and let $[n]=\{1,\ldots,n\}$. Let $\mathcal{G}=\{G_1,\ldots,G_{m+1}\}$ be a family of graphs on the same vertex set $[n]$. In this paper, we prove that if for any $i\in [m+1]$, the spectral radius of $G_i$ is not less than $\max\{2m,\frac{1}{2}(m-1+\sqrt{(m-1)^2+4m(n-m)})\}$, then $\mathcal{G}$ admits a rainbow matching, i.e. a choice of disjoint edges $e_i\in G_i$, unless $G_1=G_2=\ldots=G_{m+1}$ and $G_1\in \{K_{2m+1}\cup (n-2m-1)K_1, K_m\vee (n-m)K_1\}$.
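
To make the spectral radius condition concrete, here is a small sketch that evaluates the bound $\max\{2m,\frac{1}{2}(m-1+\sqrt{(m-1)^2+4m(n-m)})\}$ for given $n$ and $m$. The function name `spectral_radius_threshold` is an assumption for illustration; the two terms are the spectral radii of the two exceptional graphs, $K_{2m+1}\cup (n-2m-1)K_1$ and $K_m\vee (n-m)K_1$ respectively.

```python
import math


def spectral_radius_threshold(n, m):
    """Evaluate max{2m, (m - 1 + sqrt((m - 1)^2 + 4m(n - m))) / 2}.

    Hypothetical helper: it only evaluates the bound stated in the abstract.
    """
    if not (1 <= m <= (n - 2) / 2):
        raise ValueError("requires 1 <= m <= (n - 2) / 2")
    # Spectral radius of K_{2m+1} plus isolated vertices.
    clique_term = 2 * m
    # Spectral radius of the join K_m v (n - m)K_1.
    join_term = 0.5 * (m - 1 + math.sqrt((m - 1) ** 2 + 4 * m * (n - m)))
    return max(clique_term, join_term)


if __name__ == "__main__":
    # For small n the clique term dominates; for larger n the join term does.
    print(spectral_radius_threshold(6, 2))
    print(spectral_radius_threshold(10, 2))
```

With $m=2$, the clique term $2m=4$ is the larger one at $n=6$, while at $n=10$ the join term $\frac{1}{2}(1+\sqrt{65})\approx 4.53$ takes over.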

preprint, 2022, arXiv

Towards Sustainable Satellite Edge Computing

Recently, Low Earth Orbit (LEO) satellites have experienced rapid development, and satellite edge computing has emerged to address the limitation of the bent-pipe architecture in existing satellite systems. Introducing energy-consuming computing components in satellite edge computing increases the depth of battery discharge. This shortens the batteries' life and affects the satellites' operation in orbit. In this paper, we aim to extend batteries' life by minimizing the depth of discharge for Earth observation missions. Facing the challenges of wireless uncertainty and energy harvesting dynamics, our work develops an online energy scheduling algorithm within an online convex optimization framework. Our algorithm achieves sub-linear regret, and the constraint violation asymptotically approaches zero. Simulation results show that our algorithm can reduce the depth of discharge significantly.

preprint, 2022, arXiv

Transmission of Bernoulli Sources Using Convolutional LDGM Codes

We propose in this paper to exploit convolutional low density generator matrix (LDGM) codes for transmission of Bernoulli sources over binary-input output-symmetric (BIOS) channels. To this end, we present a new framework to prove the coding theorems for linear codes, which unifies the channel coding theorem, the source coding theorem and the joint source-channel coding (JSCC) theorem. In the presented framework, the systematic bits and the corresponding parity-check bits play different roles. Precisely, the noisy systematic bits are used to limit the list size of typical codewords, while the noisy parity-check bits are used to select from the list the maximum likelihood codeword. This new framework for linear codes allows that the systematic bits and the parity-check bits are transmitted in different ways and over different channels. With this framework, we prove that the Bernoulli generator matrix codes (BGMCs) are capacity-achieving over BIOS channels, entropy-achieving for Bernoulli sources, and also system-capacity-achieving for JSCC applications. A lower bound on the bit-error rate (BER) is derived for linear codes, which can be used to predict the error floors and hence serves as a simple tool to design the JSCC system. Numerical results show that the convolutional LDGM codes perform well in the waterfall region and match well with the derived error floors, which can be lowered down if required by simply increasing the encoding memory.

preprint, 2021, arXiv

Ab Initio Particle-based Object Manipulation

This paper presents Particle-based Object Manipulation (Prompt), a new approach to robot manipulation of novel objects ab initio, without prior object models or pre-training on a large object data set. The key element of Prompt is a particle-based object representation, in which each particle represents a point in the object, the local geometric, physical, and other features of the point, and also its relation with other particles. Like the model-based analytic approaches to manipulation, the particle representation enables the robot to reason about the object's geometry and dynamics in order to choose suitable manipulation actions. Like the data-driven approaches, the particle representation is learned online in real-time from visual sensor input, specifically, multi-view RGB images. The particle representation thus connects visual perception with robot control. Prompt combines the benefits of both model-based reasoning and data-driven learning. We show empirically that Prompt successfully handles a variety of everyday objects, some of which are transparent. It handles various manipulation tasks, including grasping, pushing, etc. Our experiments also show that Prompt outperforms a state-of-the-art data-driven grasping method on the daily objects, even though it does not use any offline training data.

preprint, 2021, arXiv

Detecting and modelling real percolation and phase transitions of information on social media

It is widely believed that information spread on social media is a percolation process, with parallels to phase transitions in theoretical physics. However, evidence for this hypothesis is limited, as phase transitions have not been directly observed in any social media. Here, through analysis of 100 million Weibo and 40 million Twitter users, we identify percolation-like spread, and find that it happens more readily than current theoretical models would predict. The lower percolation threshold can be explained by the existence of positive feedback in the coevolution between network structure and user activity level, such that more active users gain more followers. Moreover, this coevolution induces an extreme imbalance in users' influence. Our findings indicate that the ability of information to spread across social networks is higher than expected, with implications for many information spread problems.

preprint, 2021, arXiv

HAVANA: Hierarchical and Variation-Normalized Autoencoder for Person Re-identification

Person Re-Identification (Re-ID) is of great importance to many video surveillance systems. Learning discriminative features for Re-ID remains a challenge due to the large variations in the image space, e.g., continuously changing human poses, illuminations and points of view. In this paper, we propose HAVANA, a novel extensible, light-weight HierArchical and VAriation-Normalized Autoencoder that learns features robust to intra-class variations. In contrast to existing generative approaches that prune the variations with heavy extra supervised signals, HAVANA suppresses the intra-class variations with a Variation-Normalized Autoencoder trained with no additional supervision. We also introduce a novel Jensen-Shannon triplet loss for contrastive distribution learning in Re-ID. In addition, we present Hierarchical Variation Distiller, a hierarchical VAE to factorize the latent representation and explicitly model the variations. To the best of our knowledge, HAVANA is the first VAE-based framework for person Re-ID.

preprint, 2021, arXiv

Tiansuan Constellation: An Open Research Platform

Satellite networks are the first step toward interstellar voyages. They can provide Internet connectivity everywhere on Earth, where most areas cannot access the Internet through terrestrial infrastructure because of limited geographic accessibility and high cost. The space industry is experiencing a rise in large low-Earth-orbit satellite constellations that aim to achieve universal connectivity. The research community also urgently needs to conduct leading research to bridge the connectivity divide. Researchers currently conduct their work by simulation, which is far from enough. However, experiments on real satellites are blocked by the high threshold of space technology, such as deployment cost and unknown risks. To resolve this dilemma, we aim to contribute to universal connectivity by building an open research platform, the Tiansuan constellation, to support experiments on real satellite networks. We discuss the potential research topics that would benefit from the Tiansuan constellation. We provide two case studies that have already been deployed on two experimental satellites of the Tiansuan constellation.

preprint, 2021, arXiv

Twisted-Pair Superposition Transmission

We propose in this paper a new coding scheme called twisted-pair superposition transmission (TPST). The encoding is to "mix together" a pair of basic codes by superposition, while the decoding can be implemented as a successive cancellation list decoding algorithm. The most significant features of the TPST code are its predictable performance, which can be estimated numerically from the basic codes, and its flexible construction, in the sense that it can be easily adapted to different coding rates. To construct good TPST codes in the finite length regime, we propose two design approaches: rate allocation and partial superposition. By taking tail-biting convolutional codes (TBCC) as basic codes, we show by numerical results that the TPST codes can have near-capacity performance in the short length regime.

preprint, 2020, arXiv

AI-Mediated Exchange Theory

As Artificial Intelligence (AI) plays an ever-expanding role in sociotechnical systems, it is important to articulate the relationships between humans and AI. However, the scholarly communities studying human-AI relationships -- including but not limited to social computing, machine learning, science and technology studies, and other social sciences -- are divided by the perspectives that define them. These perspectives vary both by their focus on humans or AI, and in the micro/macro lenses through which they approach subjects. These differences inhibit the integration of findings, and thus impede science and interdisciplinarity. In this position paper, we propose the development of a framework AI-Mediated Exchange Theory (AI-MET) to bridge these divides. As an extension to Social Exchange Theory (SET) in the social sciences, AI-MET views AI as influencing human-to-human relationships via a taxonomy of mediation mechanisms. We list initial ideas of these mechanisms, and show how AI-MET can be used to help human-AI research communities speak to one another.

preprint, 2020, arXiv

Analysis on Computation-Intensive Status Update in Mobile Edge Computing

In status update scenarios, the freshness of information is measured in terms of age-of-information (AoI), which essentially reflects the timeliness for real-time applications to transmit status update messages to a remote controller. For some applications, computationally expensive and time-consuming data processing is inevitable before the status information of messages can be displayed. Mobile edge servers are equipped with adequate computation resources and are placed close to users. Thus, mobile edge computing (MEC) can be a promising technology to reduce AoI for computation-intensive messages. In this paper, we study the AoI for computation-intensive messages with MEC, and consider three computing schemes: local computing, remote computing at the MEC server, and partial computing, i.e., some part of the computing tasks is performed locally, and the rest is executed at the MEC server. A zero-wait policy is adopted in all three schemes. Specifically, in local computing, a new message is generated immediately after the previous one is revealed by computing, while in remote computing and partial computing, a new message is generated once the previous one is received by the remote MEC server. With infinite queue size and exponentially distributed transmission time, the closed-form average AoI for exponentially distributed computing time is derived for the three computing schemes. For deterministic computing time, the average AoI is analyzed numerically. Simulation results show that by carefully partitioning the computing tasks, the average AoI in partial computing is the smallest compared to local computing and remote computing. The results also indicate numerically the conditions under which remote computing attains a smaller average AoI than local computing.

preprint, 2020, arXiv

AoI-Delay Tradeoff in Mobile Edge Caching with Freshness-Aware Content Refreshing

Mobile edge caching can effectively reduce service delay but may introduce information staleness, calling for timely content refreshing. However, content refreshing consumes additional transmission resources and may degrade the delay performance of mobile systems. In this work, we propose a freshness-aware refreshing scheme to balance the service delay and content freshness measured by Age of Information (AoI). Specifically, the cached content items will be refreshed to the up-to-date version upon user requests if the AoI exceeds a certain threshold (named the refreshing window). The average AoI and service delay are derived approximately in closed form, revealing an AoI-delay tradeoff with respect to the refreshing window. In addition, the refreshing window is optimized to minimize the average delay while meeting the AoI requirements, and the results suggest setting a smaller refreshing window for popular content items. Extensive simulations are conducted on the OMNeT++ platform to validate the analytical results. The results indicate that the proposed scheme can restrain frequent refreshing as the request arrival rate increases, whereby the average delay can be reduced by around 80% while maintaining the AoI below one second in heavily-loaded scenarios.
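
The refresh-on-request rule described above can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' implementation: the function name `serve_request`, its parameters, and the simplification that a refresh resets the age to zero instantly (ignoring transmission delay) are all assumptions.

```python
def serve_request(now, last_refresh_time, refreshing_window):
    """Decide whether a cached item is refreshed when a user request arrives.

    Hypothetical sketch: AoI is taken as the time since the cached copy was
    last refreshed, and a refresh is assumed to complete instantly.
    Returns (refreshed, new_last_refresh_time).
    """
    age_of_information = now - last_refresh_time
    if age_of_information > refreshing_window:
        # Cached copy is too stale: fetch the up-to-date version.
        return True, now
    # Cached copy is fresh enough: serve it directly from the edge cache.
    return False, last_refresh_time


if __name__ == "__main__":
    # Requests at t = 1, 4, 9 against an item cached at t = 0, window of 3.
    last_refresh = 0.0
    for request_time in (1.0, 4.0, 9.0):
        refreshed, last_refresh = serve_request(request_time, last_refresh, 3.0)
        print(request_time, refreshed, last_refresh)
```

Under these assumptions a smaller window triggers more refreshes (fresher content, more transmission load), which is the tradeoff the abstract refers to.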

preprint, 2020, arXiv

Binary Representaion for Non-binary LDPC Code with Decoder Design

The equivalent binary parity check matrices for the binary images of the cycle-free non-binary LDPC codes have numerous bit-level cycles. In this paper, we show how to transform these binary parity check matrices into their cycle-free forms. It is shown that the proposed methodology can be adopted not only for the binary images of non-binary LDPC codes but also for a large class of binary LDPC codes. Specifically, we present an extended $p$-reducible (EPR) LDPC code structure to eliminate the bit-level cycles. For the non-binary LDPC codes with short length symbol-level cycles, the EPR-LDPC codes can largely avoid the corresponding short length bit-level cycles. As to the decoding of the EPR-LDPC codes, we propose a hybrid hard-decision decoder and a hybrid parallel decoder for binary symmetric channel and binary input Gaussian channel, respectively. A simple code optimization algorithm for these binary decoders is also provided. Simulations show the comparative results and justify the advantages, i.e., better performance and lower decoding complexity, of the proposed binary constructions.

preprint2020arXiv

Challenges in Supporting Exploratory Search through Voice Assistants

Voice assistants have been successfully adopted for simple, routine tasks, such as asking for the weather or setting an alarm. However, as people become more familiar with voice assistants, they may come to expect support for more complex tasks, such as exploratory search, e.g., "What should I do when I visit Paris with kids? Oh, and ideally not too expensive." Compared to simple search tasks such as "How tall is the Eiffel Tower?", which can be answered with a single-shot answer, the response to exploratory search is more nuanced, especially through voice-based assistants. In this paper, we outline four challenges in designing voice assistants that can better support exploratory search: addressing situationally induced impairments; working with mixed-modal interactions; designing for diverse populations; and meeting users' expectations and gaining their trust. Addressing these challenges is important for developing more "intelligent" voice-based personal assistants.

preprint2020arXiv

Cooperative Service Caching and Workload Scheduling in Mobile Edge Computing

Mobile edge computing helps reduce service response time and core network traffic by pushing cloud functionalities to the network edge. Equipped with storage and computation capacities, edge nodes can cache services of resource-intensive and delay-sensitive mobile applications and process the corresponding computation tasks without outsourcing to central clouds. However, the heterogeneity of edge resource capacities and the inconsistency between edge storage and computation capacities make it difficult to fully utilize both when there is no cooperation among edge nodes. To address this issue, we consider cooperation among edge nodes and investigate cooperative service caching and workload scheduling in mobile edge computing. This problem can be formulated as a mixed integer nonlinear programming problem, which has non-polynomial computation complexity. To overcome the challenges of subproblem coupling, computation-communication tradeoff, and edge node heterogeneity, we develop an iterative algorithm called ICE. This algorithm is designed based on Gibbs sampling, which has provably near-optimal results, and the idea of water filling, which has polynomial computation complexity. Simulations are conducted and the results demonstrate that our algorithm can jointly reduce the service response time and the outsourcing traffic compared with the benchmark algorithms.

preprint2020arXiv

DinerDash Gym: A Benchmark for Policy Learning in High-Dimensional Action Space

It has been arduous to assess the progress of policy learning algorithms on hierarchical tasks with high-dimensional action spaces due to the lack of a commonly accepted benchmark. In this work, we propose a new light-weight benchmark task called Diner Dash for evaluating performance in a complicated task with a high-dimensional action space. In contrast to the traditional Atari games, which have only a flat structure of goals and very few actions, the proposed benchmark task has a hierarchical task structure and an action space of size 57, and hence can facilitate the development of policy learning in complicated tasks. On top of that, we introduce Decomposed Policy Graph Modelling (DPGM), an algorithm that combines graph modelling and deep learning to allow explicit domain knowledge embedding and achieves significant improvement compared to the baseline. In the experiments, we show the effectiveness of the domain knowledge injection via a specially designed imitation algorithm, as well as results of other popular algorithms.

preprint2020arXiv

Discriminative Particle Filter Reinforcement Learning for Complex Partial Observations

Deep reinforcement learning is successful in decision making for sophisticated games, such as Atari, Go, etc. However, real-world decision making often requires reasoning with partial information extracted from complex visual observations. This paper presents Discriminative Particle Filter Reinforcement Learning (DPFRL), a new reinforcement learning framework for complex partial observations. DPFRL encodes a differentiable particle filter in the neural network policy for explicit reasoning with partial observations over time. The particle filter maintains a belief using learned discriminative update, which is trained end-to-end for decision making. We show that using the discriminative update instead of standard generative models results in significantly improved performance, especially for tasks with complex visual observations, because they circumvent the difficulty of modeling complex observations that are irrelevant to decision making. In addition, to extract features from the particle belief, we propose a new type of belief feature based on the moment generating function. DPFRL outperforms state-of-the-art POMDP RL models in Flickering Atari Games, an existing POMDP RL benchmark, and in Natural Flickering Atari Games, a new, more challenging POMDP RL benchmark introduced in this paper. Further, DPFRL performs well for visual navigation with real-world data in the Habitat environment.

preprint2020arXiv

Linear Stability of the 2D Irrotational Circulation Flow around An Elliptical Cylinder

In this article we prove a linear inviscid damping result with optimal decay rates of the 2D irrotational circulation flow around an elliptical cylinder. In our result, all components of the asymptotic velocity field do not vanish and the asymptotic flow lines are no longer ellipses.

preprint2020arXiv

R3: A Reading Comprehension Benchmark Requiring Reasoning Processes

Existing question answering systems can only predict answers without explicit reasoning processes, which hinders their explainability and makes us overestimate their ability to understand and reason over natural language. In this work, we propose a novel task of reading comprehension, in which a model is required to provide final answers and reasoning processes. To this end, we introduce a formalism for reasoning over unstructured text, namely Text Reasoning Meaning Representation (TRMR). TRMR consists of three phrases, which is expressive enough to characterize the reasoning process to answer reading comprehension questions. We develop an annotation platform to facilitate TRMR's annotation, and release the R3 dataset, a Reading comprehension benchmark Requiring Reasoning processes. R3 contains over 60K question-answer pairs and their TRMRs. Our dataset is available at: http://anonymous.

preprint2020arXiv

Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction

Understanding crowd motion dynamics is critical to real-world applications, e.g., surveillance systems and autonomous driving. This is challenging because it requires effectively modeling the socially aware crowd spatial interaction and complex temporal dependencies. We believe attention is the most important factor for trajectory prediction. In this paper, we present STAR, a Spatio-Temporal grAph tRansformer framework, which tackles trajectory prediction using only attention mechanisms. STAR models intra-graph crowd interaction by TGConv, a novel Transformer-based graph convolution mechanism. The inter-graph temporal dependencies are modeled by separate temporal Transformers. STAR captures complex spatio-temporal interactions by interleaving spatial and temporal Transformers. To calibrate the temporal prediction for the long-lasting effect of disappeared pedestrians, we introduce a read-writable external memory module that is consistently updated by the temporal Transformer. We show that with only attention mechanisms, STAR achieves state-of-the-art performance on 5 commonly used real-world pedestrian prediction datasets.

preprint2020arXiv

Successive Cancellation List Decoding of Semi-random Unit Memory Convolutional Codes

We present in this paper a special class of unit memory convolutional codes (UMCCs), called semi-random UMCCs (SRUMCCs), where the information block is first encoded by a short block code and then transmitted in a block Markov (random) superposition manner. We propose a successive cancellation list decoding algorithm, by which a list of candidate codewords is generated serially until one passes an empirical divergence test instead of the conventional cyclic redundancy check (CRC). The threshold for testing the correctness of candidate codewords can be learned off-line based on the statistical behavior of the introduced empirical divergence function (EDF). The performance-complexity tradeoff and the performance-delay tradeoff can be achieved by adjusting the statistical threshold and the decoding window size. To analyze the performance, a closed-form upper bound and a simulated lower bound are derived. Simulation results verify our analysis and show that: 1) the proposed list decoding algorithm with the empirical divergence test outperforms sequential decoding in the high signal-to-noise ratio (SNR) region; 2) taking tail-biting convolutional codes (TBCC) as the basic codes, the proposed list decoding of SRUMCCs has comparable performance with polar codes under the constraint of equivalent decoding delay.

preprint2020arXiv

Systematic Convolutional Low Density Generator Matrix Code

In this paper, we propose a systematic low density generator matrix (LDGM) code ensemble, which is defined by the Bernoulli process. We prove that, under maximum likelihood (ML) decoding, the proposed ensemble can achieve the capacity of binary-input output symmetric (BIOS) memoryless channels in terms of bit error rate (BER). The proof technique reveals a new mechanism, different from lowering the frame error rate (FER), by which the BER can be lowered: assigning light codeword vectors to light information vectors. The finite length performance is analyzed by deriving an upper bound and a lower bound, both of which are shown to be tight in the high signal-to-noise ratio (SNR) region. To improve the waterfall performance, we construct the systematic convolutional LDGM (SC-LDGM) codes by a random splitting process. The SC-LDGM codes are easily configurable in the sense that any rational code rate can be realized without complex optimization. As a universal construction, the main advantage of the SC-LDGM codes is their near-capacity performance in the waterfall region and predictable performance in the error-floor region, which can be lowered to any target as required by increasing the density of the uncoupled LDGM codes. Numerical results are also provided to verify our analysis.

preprint2020arXiv

Transmitting Extra Bits by Rotating Signal Constellations

In this letter, we propose a novel LDPC coding scheme to transmit extra bits aided by rotated signal constellations without any additional cost in transmission power or bandwidth. In the proposed scheme, the LDPC coded data are modulated by a rotated two-dimensional signal constellation, in which the rotation angle is specified by the given extra bits. At the receiver, the rotation angle is estimated with the aid of the statistical learning of the syndrome of the LDPC code. After recovering the rotation angle, the coded payload data can be decoded by the LDPC decoder. The simulation results show that, for an LDPC code of length 2304, up to four extra bits can be transmitted with negligible influence on the reliability of the LDPC coded data.
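The transmitter side of this idea, where the extra bits select a rotation angle applied to the whole modulated block, can be illustrated as follows. This is a hedged sketch: the QPSK constellation, the natural (non-Gray) bit labelling, the uniform angle grid over $[0, \pi/2)$, and the function names are assumptions for illustration; the abstract does not specify them, and the syndrome-based angle estimation at the receiver is not shown.

```python
import cmath
import math

# Unit-energy QPSK points (assumed constellation; natural labelling for simplicity).
QPSK = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]


def rotation_angle(extra_bits):
    """Map a string of extra bits to an angle in [0, pi/2) (hypothetical uniform grid)."""
    index = int(extra_bits, 2)
    return index * (math.pi / 2) / (2 ** len(extra_bits))


def modulate(coded_bits, extra_bits):
    """Map pairs of coded bits to QPSK points, then rotate the whole block."""
    rot = cmath.exp(1j * rotation_angle(extra_bits))
    symbols = [QPSK[int(coded_bits[i:i + 2], 2)]
               for i in range(0, len(coded_bits), 2)]
    return [s * rot for s in symbols]
```

Because a rotation has unit magnitude, every transmitted symbol keeps its energy, which matches the abstract's point that the extra bits cost no additional transmission power or bandwidth.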

preprint2016arXiv

Partially Block Markov Superposition Transmission of Gaussian Source with Nested Lattice Codes

This paper studies the transmission of Gaussian sources through additive white Gaussian noise (AWGN) channels in the bandwidth expansion regime, i.e., the channel bandwidth is greater than the source bandwidth. To mitigate the error propagation phenomenon of conventional digital transmission schemes, we propose a new capacity-approaching joint source channel coding (JSCC) scheme based on partially block Markov superposition transmission (BMST) of nested lattice codes. In the proposed scheme, first, the Gaussian source sequence is discretized by a lattice-based quantizer, resulting in a sequence of lattice points. Second, these lattice points are encoded by a short systematic group code. Third, the coded sequence is partitioned into blocks of equal length and then transmitted in the BMST manner. The main characteristics of the proposed JSCC scheme include: 1) Entropy coding is not used explicitly. 2) Only the parity-check sequence is superimposed, hence the term partially BMST (PBMST). This is different from the original BMST. Extensive simulation results show that the proposed scheme performs within 1.0 dB of the Shannon limits. Hence, the proposed scheme provides an attractive candidate for transmission of Gaussian sources.

preprint2016arXiv

Systematic Block Markov Superposition Transmission of Repetition Codes

In this paper, we propose systematic block Markov superposition transmission of repetition (BMST-R) codes, which can support a wide range of code rates but maintain essentially the same encoding/decoding hardware structure. The systematic BMST-R codes resemble the classical rate-compatible punctured convolutional (RCPC) codes, except that they are typically non-decodable by the Viterbi algorithm due to the huge constraint length induced by the block-oriented encoding process. The information sequence is partitioned equally into blocks and transmitted directly, while their replicas are interleaved and transmitted in a block Markov superposition manner. By taking into account that the codes are systematic, we derive both upper and lower bounds on the bit-error-rate (BER) under maximum a posteriori (MAP) decoding. The derived lower bound reveals connections among BER, encoding memory and code rate, which provides a way to design good systematic BMST-R codes and also allows us to make trade-offs among efficiency, performance and complexity. Numerical results show that: 1) the proposed bounds are tight in the high signal-to-noise ratio (SNR) region; 2) systematic BMST-R codes perform well in a wide range of code rates; and 3) systematic BMST-R codes outperform spatially coupled low-density parity-check (SC-LDPC) codes under an equal decoding latency constraint.

preprint2015arXiv

Block Markov Superposition Transmission of RUN Codes

In this paper, we propose a simple procedure to construct (decodable) good codes with any given alphabet (of moderate size) for any given (rational) code rate to achieve any given target error performance (of interest) over additive white Gaussian noise (AWGN) channels. We start with constructing codes over groups for any given code rates. This can be done in an extremely simple way if we ignore the error performance requirement for the time being. Actually, this can be satisfied by repetition (R) codes and uncoded (UN) transmission along with time-sharing technique. The resulting codes are simply referred to as RUN codes for convenience. The encoding/decoding algorithms for RUN codes are almost trivial. In addition, the performance can be easily analyzed. It is not difficult to imagine that a RUN code usually performs far away from the corresponding Shannon limit. Fortunately, the performance can be improved as required by spatially coupling the RUN codes via block Markov superposition transmission (BMST), resulting in the BMST-RUN codes. Simulation results show that the BMST-RUN codes perform well (within one dB away from Shannon limits) for a wide range of code rates and outperform the BMST with bit-interleaved coded modulation (BMST-BICM) scheme.
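One way to read the time-sharing step is that a rate of $P/Q$ symbols per channel use can be met by repeating some symbols $N$ times and the rest $N+1$ times, with $N = \lfloor Q/P \rfloor$ (and $N = 1$ meaning uncoded transmission). The sketch below illustrates that reading only; the function names and the choice to place the longer repetitions first are assumptions, and the paper's exact construction may differ.

```python
def run_split(p, q):
    """Split p information symbols so that the total transmitted length is q.

    Returns (n, short_count, long_count): short_count symbols are repeated
    n times and long_count symbols are repeated n + 1 times.
    """
    if not 0 < p <= q:
        raise ValueError("need 0 < p <= q")
    n = q // p
    long_count = q - p * n             # symbols that get one extra repetition
    return n, p - long_count, long_count


def run_encode(symbols, q):
    """Encode a block of symbols into exactly q transmitted symbols."""
    n, _, long_count = run_split(len(symbols), q)
    out = []
    for i, s in enumerate(symbols):
        reps = n + 1 if i < long_count else n
        out.extend([s] * reps)
    return out
```

For example, three symbols sent over seven channel uses (rate 3/7) become one symbol repeated three times and two symbols repeated twice.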

preprint2015arXiv

EXIT Chart Analysis of Block Markov Superposition Transmission of Short Codes

In this paper, a modified extrinsic information transfer (EXIT) chart analysis that takes into account the relation between mutual information (MI) and bit-error-rate (BER) is presented to study the convergence behavior of block Markov superposition transmission (BMST) of short codes (referred to as basic codes). We show that the threshold curve of BMST codes using an iterative sliding window decoding algorithm with a fixed decoding delay achieves a lower bound in the high signal-to-noise ratio (SNR) region, while in the low SNR region, due to error propagation, the thresholds of BMST codes become slightly worse as the encoding memory increases. We also demonstrate that the threshold results are consistent with finite-length performance simulations.

preprint2015arXiv

Performance Analysis of Block Markov Superposition Transmission of Short Codes

In this paper, we consider the asymptotic and finite-length performance of block Markov superposition transmission (BMST) of short codes, which can be viewed as a new class of spatially coupled (SC) codes with the generator matrices of short codes (referred to as basic codes) coupled. A modified extrinsic information transfer (EXIT) chart analysis that takes into account the relation between mutual information (MI) and bit-error-rate (BER) is presented to study the convergence behavior of BMST codes. Using the modified EXIT chart analysis, we investigate the impact of various parameters on BMST code performance, thereby providing theoretical guidance for designing and implementing practical BMST codes suitable for sliding window decoding. Then, we present a performance comparison of BMST codes and SC low-density parity-check (SC-LDPC) codes on the basis of equal decoding latency. Also presented is a comparison of computational complexity. Simulation results show that, under the equal decoding latency constraint, BMST codes using the repetition code as the basic code can outperform $(3,6)$-regular SC-LDPC codes in the waterfall region but have a higher computational complexity.

preprint2014arXiv

Bounds on the ML Decoding Error Probability of RS-Coded Modulation over AWGN Channels

This paper is concerned with bounds on the maximum-likelihood (ML) decoding error probability of Reed-Solomon (RS) codes over additive white Gaussian noise (AWGN) channels. To resolve the difficulty caused by the dependence of the Euclidean distance spectrum on the way of signal mapping, we propose to use random mapping, resulting in an ensemble of RS-coded modulation (RS-CM) systems. For this ensemble of RS-CM systems, analytic bounds are derived, which can be evaluated from the known (symbol-level) Hamming distance spectrum. Also presented in this paper are simulation-based bounds, which are applicable to any specific RS-CM system and can be evaluated with the aid of a list decoding (in the Euclidean space) algorithm. The simulation-based bounds do not need the distance spectrum and are numerically tight for short RS codes in the regime where the word error rate (WER) is not too low. Numerical comparison results are relevant in at least three aspects. First, in the short code length regime, RS-CM using BPSK modulation with random mapping has a better performance than binary random linear codes. Second, RS-CM with random mapping (time varying) can have a better performance than with specific mapping. Third, numerical results show that the recently proposed Chase-type decoding algorithm is essentially the ML decoding algorithm for short RS codes.

preprint2014arXiv

Performance Comparison of LDPC Block and Spatially Coupled Codes over GF(q)

In this paper, we compare the finite-length performance of protograph-based spatially coupled low-density parity-check (SC-LDPC) codes and LDPC block codes (LDPC-BCs) over GF(q). In order to reduce computational complexity and latency, a sliding window decoder with a stopping rule based on a soft bit-error-rate (BER) estimate is used for the q-ary SC-LDPC codes. Two regimes are considered: one when the constraint length of q-ary SC-LDPC codes is equal to the block length of q-ary LDPC-BCs and the other when the two decoding latencies are equal. Simulation results confirm that, in both regimes, (3,6)-, (3,9)-, and (3,12)-regular non-binary SC-LDPC codes can significantly outperform both binary and non-binary LDPC-BCs and binary SC-LDPC codes. Finally, we present a computational complexity comparison of q-ary SC-LDPC codes and q-ary LDPC-BCs under equal decoding latency and equal decoding performance assumptions.

preprint2014arXiv

Spatial Coupling of Generator Matrix: A General Approach to Design of Good Codes at a Target BER

For any given short code (referred to as the basic code), block Markov superposition transmission (BMST) provides a simple way to obtain predictable extra coding gain by spatially coupling the generator matrix of the basic code. This paper presents a systematic design methodology for BMST systems to approach the channel capacity at any given target bit-error-rate (BER) of interest. To simplify the design, we choose the basic code as the Cartesian product of a short block code. The encoding memory is then inferred from the genie-aided lower bound according to the performance gap of the short block code to the corresponding Shannon limit at the target BER. In addition to the sliding-window decoding algorithm, we propose to perform one more decoding phase to remove residual (rare) errors. A new technique that assumes a noisy genie is proposed to upper bound the performance. Under some mild assumptions, these genie-aided bounds can be used to predict the performance of the proposed two-phase decoding algorithm in the extremely low BER region. Using the Cartesian product of a repetition code as the basic code, we construct a BMST system with an encoding memory of 30 whose performance at the BER of $10^{-15}$ can be predicted within one dB away from the Shannon limit over the binary-input additive white Gaussian noise channel (BI-AWGNC).
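The step of inferring the encoding memory from the gap can be sketched numerically. The sketch assumes the form of the genie-aided lower bound commonly quoted for BMST, in which memory $m$ shifts the basic code's performance curve by $10\log_{10}(m+1)$ dB; the abstract does not state this formula, so treat it and the function name `required_memory` as assumptions.

```python
import math


def required_memory(gap_db):
    """Smallest memory m with 10*log10(m + 1) >= gap_db (assumed bound form).

    gap_db is the distance, in dB, between the basic code's operating point
    and the Shannon limit at the target BER.
    """
    if gap_db <= 0:
        return 0                       # the basic code already meets the target
    return math.ceil(10 ** (gap_db / 10) - 1)
```

Under this assumed bound, a 6 dB gap needs memory 3, since $10\log_{10}4 \approx 6.02$ dB.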

preprint2013arXiv

Accessible Capacity of Secondary Users

A new problem formulation is presented for the Gaussian interference channels (GIFC) with two pairs of users, which are distinguished as primary users and secondary users, respectively. The primary users employ a pair of encoder and decoder that were originally designed to satisfy a given error performance requirement under the assumption that no interference exists from other users. In the scenario when the secondary users attempt to access the same medium, we are interested in the maximum transmission rate (defined as the accessible capacity) at which secondary users can communicate reliably without affecting the error performance requirement by the primary users under the constraint that the primary encoder (not the decoder) is kept unchanged. By modeling the primary encoder as a generalized trellis code (GTC), we are then able to treat the secondary link and the cross link from the secondary transmitter to the primary receiver as finite state channels (FSCs). Based on this, upper and lower bounds on the accessible capacity are derived. The impact of the error performance requirement by the primary users on the accessible capacity is analyzed by using the concept of interference margin. In the case of non-trivial interference margin, the secondary message is split into common and private parts and then encoded by superposition coding, which delivers a lower bound on the accessible capacity. For some special cases, these bounds can be computed numerically by using the BCJR algorithm. Numerical results are also provided to gain insight into the impacts of the GTC and the error performance requirement on the accessible capacity.

preprint2013arXiv

An information spectrum approach to the capacity region of GIFC

In this paper, we present a general formula for the capacity region of a general interference channel with two pairs of users. The formula shows that the capacity region is the union of a family of rectangles, where each rectangle is determined by a pair of spectral inf-mutual information rates. Although the presented formula is usually difficult to compute, it provides useful insights into interference channels. In particular, when the inputs are discrete ergodic Markov processes and the channel is stationary memoryless, the formula can be evaluated by the BCJR algorithm. The formula also suggests that the simplest inner bounds (obtained by treating the interference as noise) could be improved by taking into account the structure of the interference processes. This is verified numerically by computing the mutual information rates for Gaussian interference channels with embedded convolutional codes. Moreover, we present a coding scheme to approach the theoretical achievable rate pairs. Numerical results show that decoding gain can be achieved by considering the structure of the interference.

preprint2013arXiv

Block Markov Superposition Transmission: Construction of Big Convolutional Codes from Short Codes

A construction of big convolutional codes from short codes called block Markov superposition transmission (BMST) is proposed. The BMST is very similar to superposition block Markov encoding (SBME), which has been widely used to prove multiuser coding theorems. The encoding process of BMST can be as fast as that of the involved short code, while the decoding process can be implemented as an iterative sliding-window decoding algorithm with a tunable delay. More importantly, the performance of BMST can be simply lower-bounded in terms of the transmission memory given that the performance of the short code is available. Numerical results show that 1) the lower bounds can be matched with a moderate decoding delay in the low bit-error-rate (BER) region, implying that the iterative sliding-window decoding algorithm is near optimal; 2) BMST with repetition codes and single parity-check codes can approach the Shannon limit within 0.5 dB at BER of $10^{-5}$ for a wide range of code rates; and 3) BMST can also be applied to nonlinear codes.
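The superposition step can be pictured as follows: each transmitted block is the current short-code codeword added, over GF(2), to interleaved copies of the previous $m$ codewords. The sketch below illustrates that structure for binary blocks; the function name, the zero-block termination, and the representation of interleavers as index permutations are assumptions made for illustration, and the short-code encoder itself is left out.

```python
def bmst_encode(blocks, memory, interleavers):
    """Superimpose each block with interleaved copies of the previous `memory` blocks.

    blocks:       list of equal-length bit lists (codewords of the short code)
    memory:       encoding memory m
    interleavers: list of m permutations; interleavers[i - 1] is applied to
                  the codeword sent i blocks earlier
    Returns the transmitted blocks, including `memory` termination blocks.
    """
    n = len(blocks[0])
    zero = [0] * n
    padded = list(blocks) + [zero] * memory    # assumed zero tail to flush the memory
    out = []
    for t, v in enumerate(padded):
        c = list(v)
        for i in range(1, memory + 1):
            prev = padded[t - i] if t - i >= 0 else zero
            perm = interleavers[i - 1]
            for j in range(n):
                c[j] ^= prev[perm[j]]          # binary superposition (XOR)
        out.append(c)
    return out
```

With `memory=0` the output is just the short-code codewords, so the encoder costs little more than the short code's own encoder plus a few XORs per bit.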

preprint2013arXiv

Boundedness for Second Order Differential Equations with Jumping p-Laplacian and an Oscillating Term

In this paper, we are concerned with the boundedness of all the solutions for a kind of second order differential equations with p-Laplacian and an oscillating term $(\phi_p(x'))'+a\phi_p(x^+)-b\phi_p(x^-)=G_x(x,t)+f(t)$, where $x^+=\max(x,0)$, $x^-=\max(-x,0)$, $\phi_p(s)=|s|^{p-2}s$, $p\geq 2$, $a$ and $b$ are positive constants $(a\neq b)$, the perturbation $f(t)\in \mathcal{C}^{23}(\mathbb{R}/2\pi_p\mathbb{Z})$, the oscillating term $G\in \mathcal{C}^{21}(\mathbb{R}\times\mathbb{R}/2\pi_p\mathbb{Z})$, where $\pi_p=\frac{2\pi(p-1)^{\frac{1}{p}}}{p\sin\frac{\pi}{p}}$, and $G(x,t)$ satisfies $|D_x^iD_t^jG(x,t)|\le C,\quad 0\le i+j\le 21,$ and $|D_t^j\hat{G}|\le C,\quad 0\le j\le 21$ for some $C>0$, where $\hat{G}$ is some function satisfying $\frac{\partial \hat{G}}{\partial x}=G$.
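As a quick sanity check on the definitions in this abstract, the two quantities $\phi_p$ and $\pi_p$ can be evaluated numerically. The snippet below simply codes the formulas as written above; the function names are illustrative.

```python
import math


def phi_p(s, p):
    """The p-Laplacian nonlinearity phi_p(s) = |s|^(p-2) * s, for p >= 2."""
    return abs(s) ** (p - 2) * s


def pi_p(p):
    """pi_p = 2*pi*(p-1)^(1/p) / (p*sin(pi/p)), as defined in the abstract."""
    return 2 * math.pi * (p - 1) ** (1 / p) / (p * math.sin(math.pi / p))
```

For $p = 2$ the operator is linear, $\phi_2(s) = s$, and the formula gives $\pi_2 = \pi$, so the period $2\pi_p$ reduces to the familiar $2\pi$.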

preprint2013arXiv

Quasi-periodic solutions for p-Laplacian equations with jumping nonlinearity and unbounded potential terms

In this paper, we are concerned with the boundedness of all the solutions for a kind of second order differential equations with p-Laplacian term $(\phi_p(x'))'+a\phi_p(x^+)-b\phi_p(x^-)+f(x)=e(t)$, where $x^+=\max(x,0)$, $x^-=\max(-x,0)$, $\phi_p(s)=|s|^{p-2}s$, $p\geq 2$, $a$ and $b$ are positive constants $(a\neq b)$ satisfying $\frac{1}{a^{\frac{1}{p}}}+\frac{1}{b^{\frac{1}{p}}}=2\omega^{-1}$, where $\omega\in \mathbb{R}^+\backslash\mathbb{Q}$, the perturbation $f$ is unbounded, and $e(t)\in \mathcal{C}^{6}$ is a smooth $2\pi_p$-periodic function of $t$, where $\pi_p=\frac{2\pi(p-1)^{\frac{1}{p}}}{p\sin\frac{\pi}{p}}$.

preprint2013arXiv

Unequal Error Protection by Partial Superposition Transmission Using LDPC Codes

In this paper, we consider designing low-density parity-check (LDPC) coded modulation systems to achieve unequal error protection (UEP). We propose a new UEP approach by partial superposition transmission called UEP-by-PST. In the UEP-by-PST system, the information sequence is divided into two parts, the more important data (MID) and the less important data (LID), both of which are coded with LDPC codes. The codeword that corresponds to the MID is superimposed on the codeword that corresponds to the LID. The system performance can be analyzed by using discretized density evolution. Also proposed in this paper is a criterion from a practical point of view to compare the efficiencies of different UEP approaches. Numerical results show that, over both additive white Gaussian noise (AWGN) channels and uncorrelated Rayleigh fading channels, 1) UEP-by-PST provides higher coding gain for the MID compared with the traditional equal error protection (EEP) approach, but with negligible performance loss for the LID; 2) UEP-by-PST is more efficient with the proposed practical criterion than the UEP approach in the digital video broadcasting (DVB) system.

preprint2013arXiv

Upper Bounds On the ML Decoding Error Probability of General Codes over AWGN Channels

In this paper, a parameterized version of Gallager's first bounding technique (GFBT) is presented by introducing nested Gallager regions, to derive upper bounds on the ML decoding error probability of general codes over AWGN channels. The three well-known bounds, namely, the sphere bound (SB) of Herzberg and Poltyrev, the tangential bound (TB) of Berlekamp, and the tangential-sphere bound (TSB) of Poltyrev, are generalized to general codes without the properties of geometrical uniformity and equal energy. When applied to binary linear codes, the three generalized bounds reduce to the conventional ones. The new derivation also reveals that the SB of Herzberg and Poltyrev is equivalent to the SB of Kasami et al., which has rarely been cited in the literature.

preprint2012arXiv

A New Ensemble of Rate-Compatible LDPC Codes

In this paper, we present three approaches to improve the design of Kite codes (newly proposed rateless codes), resulting in an ensemble of rate-compatible LDPC codes with code rates varying "continuously" from 0.1 to 0.9 for additive white Gaussian noise (AWGN) channels. The new ensemble of rate-compatible LDPC codes can be constructed conveniently with an empirical formula. Simulation results show that, when applied to an incremental redundancy hybrid automatic repeat request (IR-HARQ) system, the constructed codes (with higher order modulation) perform well in a wide range of signal-to-noise-ratios (SNRs).

preprint2012arXiv

An Information-Spectrum Approach to the Capacity Region of General Interference Channel

This paper is concerned with general interference channels characterized by a sequence of transition (conditional) probabilities. We present a general formula for the capacity region of the interference channel with two pairs of users. The formula shows that the capacity region is the union of a family of rectangles, where each rectangle is determined by a pair of spectral inf-mutual information rates. Although the presented formula is usually difficult to compute, it provides useful insights into interference channels. For example, the formula suggests that the simplest inner bounds (obtained by treating the interference as noise) could be improved by taking into account the structure of the interference processes. This is verified numerically by computing the mutual information rates for Gaussian interference channels with embedded convolutional codes.

preprint2012arXiv

Joint Detection/Decoding Algorithms for Nonbinary LDPC Codes over ISI Channels

This paper is concerned with the application of nonbinary low-density parity-check (NB-LDPC) codes to binary input inter-symbol interference (ISI) channels. Two low-complexity joint detection/decoding algorithms are proposed. One is referred to as the max-log-MAP/X-EMS algorithm, which is implemented by exchanging soft messages between the max-log-MAP detector and the extended min-sum (EMS) decoder. The max-log-MAP/X-EMS algorithm is applicable to general NB-LDPC codes. The other one, referred to as the Viterbi/GMLGD algorithm, is designed in particular for majority-logic decodable NB-LDPC codes. The Viterbi/GMLGD algorithm works in an iterative manner by exchanging hard decisions between the Viterbi detector and the generalized majority-logic decoder (GMLGD). As a by-product, a variant of the original EMS algorithm is proposed, which is referred to as the μ-EMS algorithm. In the μ-EMS algorithm, the messages are truncated according to an adaptive threshold, resulting in a more efficient algorithm. Simulation results show that the max-log-MAP/X-EMS algorithm performs as well as the traditional iterative detection/decoding algorithm based on the BCJR algorithm and the QSPA, but with lower complexity. The complexity can be further reduced for majority-logic decodable NB-LDPC codes by executing the Viterbi/GMLGD algorithm with a performance degradation within one dB. Simulation results also confirm that the μ-EMS algorithm requires lower computational loads than the EMS algorithm with a fixed threshold. These algorithms provide good candidates for trade-offs between performance and complexity.

preprint2012arXiv

New Geometrical Spectra of Linear Codes with Applications to Performance Analysis

In this paper, new enumerating functions for linear codes are defined, including the triangle enumerating function and the tetrahedron enumerating function, both of which can be computed using a trellis-based algorithm over polynomial rings. The computational complexity is dominated by the complexity of the trellis. In addition, we show that these new enumerating functions can be used to improve existing performance bounds on the maximum likelihood decoding.

preprint2012arXiv

New Techniques for Upper-Bounding the ML Decoding Performance of Binary Linear Codes

In this paper, new techniques are presented to either simplify or improve most existing upper bounds on the maximum-likelihood (ML) decoding performance of binary linear codes over additive white Gaussian noise (AWGN) channels. Firstly, the recently proposed union bound using truncated weight spectrums by Ma et al. is re-derived in a detailed way based on Gallager's first bounding technique (GFBT), where the "good region" is specified by a sub-optimal list decoding algorithm. The error probability caused by the bad region can be upper-bounded by the tail probability of a binomial distribution, while the error probability caused by the good region can be upper-bounded by most existing techniques. Secondly, we propose two techniques to tighten the union bound on the error probability caused by the good region. The first technique is based on pair-wise error probabilities, which can be further tightened by employing the independence between the error events and certain components of the received random vectors. The second technique is based on triplet-wise error probabilities, which can be upper-bounded by proving that any three bipolar vectors form a non-obtuse triangle. The proposed bounds improve the conventional union bounds but have a similar complexity since they involve only the $Q$-function. The proposed bounds can also be adapted to bit-error probabilities.

preprint2012arXiv

On Parameterized Gallager's First Bounds for Binary Linear Codes over AWGN Channels

In this paper, nested Gallager regions with a single parameter are introduced to exploit Gallager's first bounding technique (GFBT). We present a necessary and sufficient condition on the optimal parameter. We also present a sufficient condition (with a simple geometrical explanation) under which the optimal parameter does not depend on the signal-to-noise ratio (SNR). With this general framework, three existing upper bounds are revisited, including the tangential bound (TB) of Berlekamp, the sphere bound (SB) of Herzberg and Poltyrev, and the tangential-sphere bound (TSB) of Poltyrev. This paper also reveals that the SB of Herzberg and Poltyrev is equivalent to the SB of Kasami et al., which was rarely cited in the literature.

preprint2012arXiv

Upper Bounds on the Capacities of Noncontrollable Finite-State Channels with/without Feedback

Noncontrollable finite-state channels (FSCs) are FSCs in which the channel inputs have no influence on the channel states, i.e., the channel states evolve freely. Since single-letter formulae for the channel capacities are rarely available for general noncontrollable FSCs, computable bounds are usually utilized to numerically bound the capacities. In this paper, we take the delayed channel state as part of the channel input and then define the directed information rate from the new channel input (including the source and the delayed channel state) sequence to the channel output sequence. With this technique, we derive a series of upper bounds on the capacities of noncontrollable FSCs with/without feedback. These upper bounds can be achieved by conditional Markov sources and computed by solving an average reward per stage stochastic control problem (ARSCP) with a compact state space and a compact action space. By showing that the ARSCP has a uniformly continuous reward function, we transform the original ARSCP into a finite-state and finite-action ARSCP that can be solved by a value iteration method. Under a mild assumption, the value iteration algorithm is convergent and delivers a near-optimal stationary policy and a numerical upper bound.

preprint2011arXiv

Serial Concatenation of RS Codes with Kite Codes: Performance Analysis, Iterative Decoding and Design

In this paper, we propose a new ensemble of rateless forward error correction (FEC) codes. The proposed codes are serially concatenated codes with Reed-Solomon (RS) codes as outer codes and Kite codes as inner codes. The inner Kite codes are a special class of prefix rateless low-density parity-check (PRLDPC) codes, which can generate potentially infinite (or as many as required) random-like parity-check bits. The employment of RS codes as outer codes not only lowers error floors but also ensures (with high probability) the correctness of successfully decoded codewords. In addition to the conventional two-stage decoding, iterative decoding between the inner code and the outer code is also implemented to improve the performance further. The performance of the Kite codes under maximum likelihood (ML) decoding is analyzed by applying a refined Divsalar bound to the ensemble weight enumerating functions (WEF). We propose a simulation-based optimization method as well as density evolution (DE) using Gaussian approximations (GA) to design the Kite codes. Numerical results along with semi-analytic bounds show that the proposed codes can approach Shannon limits with extremely low error floors. It is also shown by simulation that the proposed codes perform well within a wide range of signal-to-noise ratios (SNRs).

preprint2010arXiv

A Low-Complexity Joint Detection-Decoding Algorithm for Nonbinary LDPC-Coded Modulation Systems

In this paper, we present a low-complexity joint detection-decoding algorithm for nonbinary LDPC coded modulation systems. The algorithm combines hard-decision decoding using the message-passing strategy with the signal detector in an iterative manner. It requires low computational complexity, offers good system performance and has a fast rate of decoding convergence. Compared to the q-ary sum-product algorithm (QSPA), it provides an attractive candidate for practical applications of q-ary LDPC codes.

preprint2010arXiv

Interference Avoidance Game in the Gaussian Interference Channel: Sub-Optimal and Optimal Schemes

This paper considers a distributed interference avoidance problem employing frequency assignment in the Gaussian interference channel (IC). We divide the common channel into several subchannels, and each user chooses the subchannel with the least interference from other users as its transmit channel. This mechanism, called interference avoidance in this paper, can be modeled as a competitive game. A completely autonomous distributed iterative algorithm, called the distributed interference avoidance algorithm (DIA), is adopted to achieve the Nash equilibrium (NE) of the game. Because each user optimizes only its own objective, DIA is a sub-optimal algorithm. Therefore, by introducing an optimal compensation into the competitive game model, we develop a compensation-based game model to approximate the optimal interference avoidance problem. Moreover, an optimal algorithm called the iterative optimal interference avoidance algorithm (IOIA) is proposed to reach the optimality of the interference avoidance scheme. We analyze the implementation complexities of the two algorithms and prove the convergence of the proposed algorithms. Upper and lower bounds on the performance of the proposed algorithms are also derived. The simulation results show that IOIA does reach the optimality under the interference avoidance mechanism.

preprint2010arXiv

Using Distributed Rate-Splitting Game to Approach Rate Region Boundary of the Gaussian Interference Channel

Determining how to approach the rate region boundary of the Gaussian interference channel in practical systems is a major concern. In this paper, a distributed rate-splitting (DRS) scheme is proposed to approach the rate region boundary of the Gaussian interference channel. It is shown that the DRS scheme can be formulated as a non-cooperative game. We introduce the Stackelberg equilibrium (SE) with multiple leaders as the equilibrium point of the non-cooperative game. An iterative multiple waterlevels water-filling algorithm (IML-WFA) is then developed to efficiently reach the SE of the non-cooperative game. The existence of the SE is established for the game. Numerical examples show that the rate-tuples achieved by the DRS are very close to the boundary of the well-known HK region.

61 published item(s)

Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation

GR-Dexter Technical Report

Hand-in-the-Loop: Improving Dexterous VLA via Seamless Interventional Correction

RePose: A Real-Time 3D Human Pose Estimation and Biomechanical Analysis Framework for Rehabilitation

Test-time generative augmentation for medical image segmentation

Almost sharp wave kinetic theory of multidimensional KdV type equations with $d\ge 3$

Benchmarking of DL Libraries and Models on Mobile Devices

From Earth to Space: A First Deployment of 5G Core Network on Satellite

Hierarchical Reinforcement Learning under Mixed Observability

Imitation Learning via Differentiable Physics

Label Adversarial Learning for Skeleton-level to Pixel-level Adjustable Vessel Segmentation

Learning Latent Graph Dynamics for Visual Manipulation of Deformable Objects

On Exploring Pose Estimation as an Auxiliary Learning Task for Visible-Infrared Person Re-identification

Spectral radius and rainbow matchings of graphs

Towards Sustainable Satellite Edge Computing

Transmission of Bernoulli Sources Using Convolutional LDGM Codes

Ab Initio Particle-based Object Manipulation

Detecting and modelling real percolation and phase transitions of information on social media

HAVANA: Hierarchical and Variation-Normalized Autoencoder for Person Re-identification

Tiansuan Constellation: An Open Research Platform

Twisted-Pair Superposition Transmission

AI-Mediated Exchange Theory

Analysis on Computation-Intensive Status Update in Mobile Edge Computing

AoI-Delay Tradeoff in Mobile Edge Caching with Freshness-Aware Content Refreshing

Binary Representaion for Non-binary LDPC Code with Decoder Design

Challenges in Supporting Exploratory Search through Voice Assistants

Cooperative Service Caching and Workload Scheduling in Mobile Edge Computing

DinerDash Gym: A Benchmark for Policy Learning in High-Dimensional Action Space

Discriminative Particle Filter Reinforcement Learning for Complex Partial Observations

Linear Stability of the 2D Irrotational Circulation Flow around An Elliptical Cylinder

R3: A Reading Comprehension Benchmark Requiring Reasoning Processes

Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction

Successive Cancellation List Decoding of Semi-random Unit Memory Convolutional Codes

Systematic Convolutional Low Density Generator Matrix Code

Transmitting Extra Bits by Rotating Signal Constellations

Partially Block Markov Superposition Transmission of Gaussian Source with Nested Lattice Codes

Systematic Block Markov Superposition Transmission of Repetition Codes

Block Markov Superposition Transmission of RUN Codes

EXIT Chart Analysis of Block Markov Superposition Transmission of Short Codes

Performance Analysis of Block Markov Superposition Transmission of Short Codes

Bounds on the ML Decoding Error Probability of RS-Coded Modulation over AWGN Channels

Performance Comparison of LDPC Block and Spatially Coupled Codes over GF(q)

Spatial Coupling of Generator Matrix: A General Approach to Design of Good Codes at a Target BER

Accessible Capacity of Secondary Users

An information spectrum approach to the capacity region of GIFC

Block Markov Superposition Transmission: Construction of Big Convolutional Codes from Short Codes

Boundedness for Second Order Differential Equations with Jumping p-Laplacian and an Oscillating Term

Quasi-periodic solutions for p-Laplacian equations with jumping nonlinearity and unbounded potential terms

Unequal Error Protection by Partial Superposition Transmission Using LDPC Codes

Upper Bounds On the ML Decoding Error Probability of General Codes over AWGN Channels

A New Ensemble of Rate-Compatible LDPC Codes

An Information-Spectrum Approach to the Capacity Region of General Interference Channel

Joint Detection/Decoding Algorithms for Nonbinary LDPC Codes over ISI Channels

New Geometrical Spectra of Linear Codes with Applications to Performance Analysis

New Techniques for Upper-Bounding the ML Decoding Performance of Binary Linear Codes

On Parameterized Gallager's First Bounds for Binary Linear Codes over AWGN Channels

Upper Bounds on the Capacities of Noncontrollable Finite-State Channels with/without Feedback

Serial Concatenation of RS Codes with Kite Codes: Performance Analysis, Iterative Decoding and Design

A Low-Complexity Joint Detection-Decoding Algorithm for Nonbinary LDPC-Coded Modulation Systems

Interference Avoidance Game in the Gaussian Interference Channel: Sub-Optimal and Optimal Schemes

Using Distributed Rate-Splitting Game to Approach Rate Region Boundary of the Gaussian Interference Channel