Source author record

Xiaoming Tao

Xiaoming Tao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, eess.IV, eess.SP, Machine Learning, Artificial Intelligence, Information Theory, math.IT, Robotics

Catalog footprint

What is connected

7 works

8 topics

4 close collaborators

preprint, 2026, arXiv

Meta-learning enhanced adaptive robot control strategy for automated PCB assembly

The assembly of printed circuit boards (PCBs) is one of the standard processes in chip production, directly contributing to the quality and performance of the chips. In the automated PCB assembly process, machine vision and coordinate localization methods are commonly employed to guide the positioning of assembly units. However, occlusion or poor lighting conditions can affect the effectiveness of machine vision-based methods. Additionally, the assembly of odd-form components requires highly specialized fixtures for assembly unit positioning, leading to high costs and low flexibility, especially for multi-variety and small-batch production. Motivated by these considerations, a vision-free, model-agnostic meta-method for compensating robotic position errors is proposed, which maximizes the probability of accurate robotic positioning through interactive feedback, thereby reducing the dependency on visual feedback and mitigating the impact of occlusions or lighting variations. The proposed method endows the robot with the capability to learn and adapt to various position errors, inspired by the human instinct for grasping under uncertainties. Furthermore, it is a self-adaptive method that can accelerate the robotic positioning process as more examples are incorporated and learned. Empirical studies show that the proposed method can handle a variety of odd-form components without relying on specialized fixtures, while achieving similar assembly efficiency to highly dedicated automation equipment. As of the writing of this paper, the proposed meta-method has already been implemented in a robot-based assembly line for odd-form electronic components. Since PCB assembly involves various electronic components with different sizes, shapes, and functions, subsequent studies can focus on assembly sequence and assembly route optimization to further enhance assembly efficiency.

preprint, 2023, arXiv

Environment Semantics Aided Wireless Communications: A Case Study of mmWave Beam Prediction and Blockage Prediction

In this paper, we propose an environment semantics aided wireless communication framework to reduce the transmission latency and improve the transmission reliability, where semantic information is extracted from environment image data, selectively encoded based on its task-relevance, and then fused to make decisions for channel related tasks. As a case study, we develop an environment semantics aided network architecture for mmWave communication systems, which is composed of a semantic feature extraction network, a feature selection algorithm, a task-oriented encoder, and a decision network. With images taken from street cameras and user's identification information as the inputs, the environment semantics aided network architecture is trained to predict the optimal beam index and the blockage state for the base station. It is seen that without pilot training or the costly beam scans, the environment semantics aided network architecture can realize extremely efficient beam prediction and timely blockage prediction, thus meeting requirements for ultra-reliable and low-latency communications (URLLCs). Simulation results demonstrate that compared with existing works, the proposed environment semantics aided network architecture can reduce system overheads such as storage space and computational cost while achieving satisfactory prediction accuracy and protecting user privacy.

preprint, 2023, arXiv

Federated Multi-View Synthesizing for Metaverse

The metaverse is expected to provide immersive entertainment, education, and business applications. However, virtual reality (VR) transmission over wireless networks is data- and computation-intensive, making it critical to introduce novel solutions that meet stringent quality-of-service requirements. With recent advances in edge intelligence and deep learning, we have developed a novel multi-view synthesizing framework that can efficiently provide computation, storage, and communication resources for wireless content delivery in the metaverse. We propose a three-dimensional (3D)-aware generative model that uses collections of single-view images. These single-view images are transmitted to a group of users with overlapping fields of view, which avoids massive content transmission compared to transmitting tiles or whole 3D models. We then present a federated learning approach to guarantee an efficient learning process. The training performance can be improved by characterizing the vertical and horizontal data samples with a large latent feature space, while low-latency communication can be achieved with a reduced number of transmitted parameters during federated learning. We also propose a federated transfer learning framework to enable fast domain adaptation to different target domains. Simulation results have demonstrated the effectiveness of our proposed federated multi-view synthesizing framework for VR content delivery.

preprint, 2022, arXiv

A Robust Deep Learning Enabled Semantic Communication System for Text

With the advent of the 6G era, the concept of semantic communication has attracted increasing attention. Compared with conventional communication systems, semantic communication systems are affected not only by physical noise in the wireless communication environment, e.g., additive white Gaussian noise, but also by semantic noise due to the source and the nature of deep learning-based systems. In this paper, we elaborate on the mechanism of semantic noise. In particular, we categorize semantic noise into two categories: literal semantic noise and adversarial semantic noise. The former is caused by written errors or expression ambiguity, while the latter is caused by perturbations or attacks added to the embedding layer via the semantic channel. To prevent semantic noise from influencing semantic communication systems, we present a robust deep learning enabled semantic communication system (R-DeepSC) that leverages a calibrated self-attention mechanism and adversarial training to tackle semantic noise. Compared with baseline models that only consider physical noise for text transmission, the proposed R-DeepSC achieves remarkable performance in dealing with semantic noise under different signal-to-noise ratios.
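As a rough illustration of the physical-noise side of this abstract, the sketch below adds white Gaussian noise to a toy symbol sequence at a chosen signal-to-noise ratio. It is not taken from the paper: the function names `add_awgn` and `measured_snr_db`, the BPSK-like symbols, and the 10 dB setting are all assumptions made for the example.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Return the signal plus zero-mean Gaussian noise at the requested SNR (dB)."""
    signal_power = sum(s * s for s in signal) / len(signal)
    # SNR in dB is 10 * log10(signal_power / noise_power), so invert it here.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Estimate the empirical SNR (dB) from a clean/noisy pair."""
    signal_power = sum(s * s for s in clean) / len(clean)
    noise_power = sum((n - s) ** 2 for s, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed so the run is repeatable
clean = [rng.choice((-1.0, 1.0)) for _ in range(20000)]  # toy BPSK-like symbols
noisy = add_awgn(clean, snr_db=10.0, rng=rng)
snr_estimate = measured_snr_db(clean, noisy)
print(f"target 10.0 dB, measured {snr_estimate:.2f} dB")
```

Semantic noise, as the abstract describes it, acts on the text or its embeddings rather than on the channel symbols, so it would need a separate model that this sketch does not attempt.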

preprint, 2022, arXiv

Semantic Communications: Principles and Challenges

Semantic communication, regarded as a breakthrough beyond the Shannon paradigm, aims at the successful transmission of semantic information conveyed by the source rather than the accurate reception of each single symbol or bit regardless of its meaning. This article provides an overview of semantic communications. After a brief review of Shannon information theory, we discuss semantic communications with theory, framework, and system design enabled by deep learning. Different from the symbol/bit error rate used for measuring conventional communication systems, performance metrics for semantic communications are also discussed. The article concludes with several open questions in semantic communications.

preprint, 2022, arXiv

Towards Semantic Communications: Deep Learning-Based Image Semantic Coding

Semantic communications has received growing interest since it can remarkably reduce the amount of data to be transmitted without missing critical information. Most existing works explore semantic encoding and transmission for text and apply techniques from Natural Language Processing (NLP) to interpret the meaning of the text. In this paper, we conceive semantic communications for image data, which is much richer in semantics and more bandwidth sensitive. We propose a reinforcement learning based adaptive semantic coding (RL-ASC) approach that encodes images beyond the pixel level. Firstly, we define the semantic concept of image data, which includes the category, spatial arrangement, and visual feature, as the representation unit, and propose a convolutional semantic encoder to extract semantic concepts. Secondly, we propose an image reconstruction criterion that evolves from traditional pixel similarity to semantic similarity and perceptual performance. Thirdly, we design a novel RL-based semantic bit allocation model, whose reward is the increase in rate-semantic-perceptual performance after encoding a certain semantic concept with an adaptive quantization level. Thus, the task-related information is preserved and reconstructed properly while less important data is discarded. Finally, we propose a Generative Adversarial Nets (GANs) based semantic decoder that fuses both local and global features via an attention module. Experimental results demonstrate that the proposed RL-ASC is noise robust, can reconstruct visually pleasing and semantically consistent images, and reduces the bit cost several-fold compared to standard codecs and other deep learning-based image codecs.

preprint, 2021, arXiv

Perceptual Image Restoration with High-Quality Priori and Degradation Learning

Perceptual image restoration seeks high-fidelity images that most likely degrade to the given images. For better visual quality, previous work proposed searching for solutions within the natural image manifold by exploiting the latent space of a generative model. However, the quality of generated images is only guaranteed when the latent embedding lies close to the prior distribution. In this work, we propose to restrict the feasible region to within the prior manifold. This is accomplished with a non-parametric metric for two distributions: the Maximum Mean Discrepancy (MMD). Moreover, we model the degradation process directly as a conditional distribution. We show that our model performs well in measuring the similarity between restored and degraded images. Instead of optimizing the long-criticized pixel-wise distance over degraded images, we rely on this model to find visually pleasing images with high probability. Our simultaneous restoration and enhancement framework generalizes well to complicated real-world degradation types. The experimental results on perceptual quality and no-reference image quality assessment (NR-IQA) demonstrate the superior performance of our method.
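To make the MMD mentioned in this abstract more concrete, the sketch below computes the standard biased estimate of squared MMD between two sample sets. It is a generic illustration rather than the paper's implementation: the Gaussian kernel, its bandwidth, the helper names, and the 2-D toy samples standing in for latent codes are all assumptions.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample_gaussian(n, dim, mean, rng):
    """Draw n points from an isotropic unit-variance Gaussian (toy data)."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
prior = sample_gaussian(100, 2, 0.0, rng)  # stand-in for prior latent samples
near = sample_gaussian(100, 2, 0.0, rng)   # embeddings drawn from the same prior
far = sample_gaussian(100, 2, 3.0, rng)    # embeddings shifted away from the prior

mmd_near = mmd_squared(prior, near)
mmd_far = mmd_squared(prior, far)
print(f"near: {mmd_near:.4f}, far: {mmd_far:.4f}")
```

Samples drawn from the same distribution as the prior give a value near zero, while shifted samples give a clearly larger one, which is the property that makes MMD usable as a constraint keeping latent embeddings close to the prior.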