Source author record

Yuhang Zhang

Yuhang Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computer Vision, eess.SY, Robotics, Systems and Control, Artificial Intelligence, cond-mat.supr-con, Machine Learning, math.DS, math.NA, Numerical Analysis, physics.flu-dyn

Catalog footprint: 9 works, 11 topics, 4 close collaborators


Preprint, 2026, arXiv

Differential Barometric Altimetry for Submeter Vertical Localization and Floor Recognition Indoors

Accurate altitude estimation and reliable floor recognition are critical for mobile robot localization and navigation within complex multi-storey environments. In this paper, we present a robust, low-cost vertical estimation framework leveraging differential barometric sensing integrated within a fully ROS-compliant software package. Our system simultaneously publishes real-time altitude data from both a stationary base station and a mobile sensor, enabling precise and drift-free vertical localization. Empirical evaluations conducted in challenging scenarios, such as fully enclosed stairwells and elevators, demonstrate that our proposed barometric pipeline achieves sub-meter vertical accuracy (RMSE: 0.29 m) and perfect (100%) floor-level identification. In contrast, our results confirm that standalone height estimates, obtained solely from visual- or LiDAR-based SLAM odometry, are insufficient for reliable vertical localization. The proposed ROS-compatible barometric module thus provides a practical and cost-effective solution for robust vertical awareness in real-world robotic deployments. The implementation of our method is released as open source at https://github.com/witsir/differential-barometric.
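To make the differential idea concrete, the following is a minimal sketch and not the released package: it converts a base-station pressure and a mobile-sensor pressure into a relative height with the isothermal hypsometric equation, then maps that height to a floor index. The function names, the 293.15 K air temperature, and the 3.0 m floor height are illustrative assumptions; none of them come from the abstract.

```python
import math

R_SPECIFIC_AIR = 287.05  # J/(kg*K), specific gas constant of dry air
G0 = 9.80665             # m/s^2, standard gravity


def relative_height(p_base_pa, p_mobile_pa, temperature_k=293.15):
    """Height of the mobile sensor above the base station, in metres.

    Uses the isothermal hypsometric equation. The temperature default is
    an assumed indoor value, not a figure from the paper.
    """
    if p_base_pa <= 0 or p_mobile_pa <= 0:
        raise ValueError("pressures must be positive")
    scale_height = R_SPECIFIC_AIR * temperature_k / G0
    return scale_height * math.log(p_base_pa / p_mobile_pa)


def floor_index(height_m, floor_height_m=3.0):
    """Nearest floor relative to the base station (0 = same floor).

    floor_height_m is a hypothetical storey height; a real building
    would need its own value.
    """
    return round(height_m / floor_height_m)


if __name__ == "__main__":
    h = relative_height(101325.0, 101289.0)
    print(f"relative height: {h:.2f} m, floor offset: {floor_index(h)}")
```

Because both sensors see the same weather-driven pressure changes, taking the ratio of the two readings cancels most of the common drift, which is the usual motivation for a differential setup.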

Preprint, 2026, arXiv

VReID-XFD: Video-based Person Re-identification at Extreme Far Distance Challenge Results

Person re-identification (ReID) across aerial and ground views at extreme far distances introduces a distinct operating regime where severe resolution degradation, extreme viewpoint changes, unstable motion cues, and clothing variation jointly undermine the appearance-based assumptions of existing ReID systems. To study this regime, we introduce VReID-XFD, a video-based benchmark and community challenge for extreme far-distance (XFD) aerial-to-ground person re-identification. VReID-XFD is derived from the DetReIDX dataset and comprises 371 identities, 11,288 tracklets, and 11.75 million frames, captured across altitudes from 5.8 m to 120 m, viewing angles from oblique (30 degrees) to nadir (90 degrees), and horizontal distances up to 120 m. The benchmark supports aerial-to-aerial, aerial-to-ground, and ground-to-aerial evaluation under strict identity-disjoint splits, with rich physical metadata. The VReID-XFD-25 Challenge attracted 10 teams with hundreds of submissions. Systematic analysis reveals monotonic performance degradation with altitude and distance, a universal disadvantage of nadir views, and a trade-off between peak performance and robustness. Even the best-performing SAS-PReID method achieves only 43.93 percent mAP in the aerial-to-ground setting. The dataset, annotations, and official evaluation protocols are publicly available at https://www.it.ubi.pt/DetReIDX/.

Preprint, 2022, arXiv

A Class-wise Non-salient Region Generalized Framework for Video Semantic Segmentation

Video semantic segmentation (VSS) is beneficial for dealing with dynamic scenes due to the continuous property of the real-world environment. On the one hand, some methods alleviate the prediction inconsistency between consecutive frames. On the other hand, other methods employ the previous frame as prior information to assist in segmenting the current frame. Although these methods achieve superior performance on independent and identically distributed (i.i.d.) data, they cannot generalize well to unseen domains. Thus, we explore a new task, the video generalizable semantic segmentation (VGSS) task, which considers both continuous frames and domain generalization. In this paper, we propose a class-wise non-salient region generalized (CNSG) framework for the VGSS task. Concretely, we first define the class-wise non-salient feature, which describes features of the class-wise non-salient region that carry more generalizable information. Then, we propose a class-wise non-salient feature reasoning strategy to select and enhance the most generalized channels adaptively. Finally, we propose an inter-frame non-salient centroid alignment loss to alleviate the prediction inconsistency in the VGSS task. We also extend our video-based framework to the image-based generalizable semantic segmentation (IGSS) task. Experiments demonstrate that our CNSG framework yields significant improvement in the VGSS and IGSS tasks.

Preprint, 2022, arXiv

Learning Efficient Representations for Enhanced Object Detection on Large-scene SAR Images

It is a challenging problem to detect and recognize targets on complex large-scene Synthetic Aperture Radar (SAR) images. Recently developed deep learning algorithms can automatically learn the intrinsic features of SAR images, but still have much room for improvement on large-scene SAR images with limited data. In this paper, based on learning representations and multi-scale features of SAR images, we propose an efficient and robust deep learning based target detection method. In particular, by leveraging the adversarial autoencoder (AAE), which explicitly influences the distribution of the investigated data, the raw SAR dataset is augmented into an enhanced version with greater quantity and diversity. In addition, an auto-labeling scheme is proposed to improve labeling efficiency. Finally, by jointly training on small target chips and large-scene images, an integrated YOLO network combined with non-maximum suppression on sub-images is used to realize multi-target detection on high-resolution images. The numerical experimental results on the MSTAR dataset show that our method can realize target detection and recognition on large-scene images accurately and efficiently. The superior anti-noise performance is also confirmed by experiments.
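The non-maximum suppression step mentioned in this abstract can be illustrated with a generic greedy variant. This is a minimal sketch of standard NMS, not the authors' pipeline: the tuple layout `(x1, y1, x2, y2, score)`, the 0.5 IoU threshold, and the assumption that sub-image detections have already been shifted into full-scene coordinates are all illustrative choices.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it.

    detections: list of (x1, y1, x2, y2, score) in full-scene pixels
    (assumed to be merged from overlapping sub-images beforehand).
    """
    ordered = sorted(detections, key=lambda d: d[4], reverse=True)
    kept = []
    for det in ordered:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10, 0.9), (1, 1, 11, 11, 0.8), (50, 50, 60, 60, 0.7)]
    print(non_max_suppression(boxes))
```

When a large scene is tiled into overlapping sub-images, the same target can be detected in two neighbouring tiles; running suppression on the merged list is one common way to remove such duplicates.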

Preprint, 2022, arXiv

Observer-Based Coordinated Tracking Control for Nonlinear Multi-Agent Systems with Intermittent Communication under Heterogeneous Coupling Framework

In this article, the observer-based coordinated tracking control problem for a class of nonlinear multi-agent systems (MASs) with intermittent communication and information constraints is studied under a dynamic switching topology. First, a state observer is designed to estimate the unmeasurable state information of the system. Second, adjustable heterogeneous coupling weighting parameters are introduced in the dynamic switching topology, and a distributed coordinated tracking control protocol under the heterogeneous coupling framework is proposed. Then, a new lemma is constructed to realize the cooperative design of the observer gain, state feedback gain and heterogeneous coupling gain matrices. Furthermore, the stability of the system is proved, and the range of the communication rate is obtained. On this basis, the intermittent communication mode is extended to three time-interval cases, namely normal communication, leader-follower communication interruption and communication interruption among all agents, and the distributed coordinated tracking control method is then improved to handle these cases. Finally, simulation experiments are conducted with nonlinear MASs to verify the correctness of the proposed methods.

Preprint, 2022, arXiv

The rate of Lp-convergence for the Euler-Maruyama method of the stochastic differential equations with Markovian switching

This work deals with the Euler-Maruyama (EM) scheme for stochastic differential equations with Markovian switching (SDEwMSs). We focus on the Lp-convergence rate (p ≥ 2) of the EM method given in this paper. As far as we know, most papers use the skeleton process of the Markov chain in their continuous numerical methods. By contrast, the continuous EM method in this paper uses the Markov chain directly. To the best of our knowledge, only two papers consider the rate of Lp-convergence, and the rate obtained there is no more than 1/p (p ≥ 2). The contribution of this paper is to show that the rate of Lp-convergence of the EM method can reach 1/2. We believe that the technique used in this paper to construct the EM method can also be used to construct other methods for SDEwMSs.
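For readers unfamiliar with the setting, here is a minimal sketch of a textbook Euler-Maruyama recursion for a scalar linear SDE whose coefficients switch with a two-state Markov chain. It is not the paper's construction: the chain is sampled from exponential holding times and read off at grid points, and the model dX = mu(r) X dt + sigma(r) X dW, the function name, and all parameter names are assumptions made for illustration.

```python
import math
import random


def simulate_switching_em(x0, state0, drift, diffusion, leave_rates,
                          t_end, n_steps, seed=0):
    """Euler-Maruyama path for dX = drift[r] X dt + diffusion[r] X dW.

    r(t) is a two-state (0/1) continuous-time Markov chain that leaves
    state i at rate leave_rates[i] > 0. Returns a list of (t, x, state).
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    x, state = x0, state0
    next_jump = rng.expovariate(leave_rates[state])  # first jump time
    path = [(0.0, x, state)]
    for k in range(n_steps):
        # Brownian increment over one step: N(0, dt).
        dw = rng.gauss(0.0, math.sqrt(dt))
        # EM update using the chain state at the left end of the step.
        x = x + drift[state] * x * dt + diffusion[state] * x * dw
        t = (k + 1) * dt
        # Apply every chain jump that occurred up to the new grid time.
        while next_jump <= t:
            state = 1 - state
            next_jump += rng.expovariate(leave_rates[state])
        path.append((t, x, state))
    return path


if __name__ == "__main__":
    path = simulate_switching_em(1.0, 0, (0.1, -0.2), (0.2, 0.5),
                                 (1.0, 2.0), t_end=1.0, n_steps=1000)
    print(path[-1])
```

A strong-convergence experiment would compare such paths against a fine-grid reference driven by the same Brownian increments and chain; that comparison is omitted here to keep the sketch short.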

Preprint, 2022, arXiv

Titanium-based kagome superconductor CsTi_3Bi_5 and topological states

Since the discovery of a new family of vanadium-based kagome superconductors AV3Sb5 (A = K, Rb, and Cs) with topological band structures, extensive effort has been devoted to exploring the origin of superconducting states and the intertwined orders. Meanwhile, searching for new types of superconductors with novel physical properties and higher superconducting transition temperatures has always been a major thread in the history of superconductor research. Here we report the successful fabrication and the topological states of a titanium-based kagome metal CsTi3Bi5 (CT3B5) crystal. The as-grown CT3B5 crystal is of high quality and possesses a perfect two-dimensional kagome net of titanium. The CT3B5 crystal is superconducting, with a critical temperature Tc of ~4.8 K. First-principles calculations predict that CT3B5 has robust topological surface states, implying that CT3B5 is a Z2 topological kagome superconductor. This finding provides a new type of superconductor and a basis for exploring the origin of superconductivity and topological states in kagome superconductors.

Preprint, 2021, arXiv

Steadily Learn to Drive with Virtual Memory

Reinforcement learning has shown great potential in developing high-level autonomous driving. However, for high-dimensional tasks, current RL methods suffer from low data efficiency and oscillation in the training process. This paper proposes an algorithm called Learn to drive with Virtual Memory (LVM) to overcome these problems. LVM compresses the high-dimensional information into compact latent states and learns a latent dynamic model to summarize the agent's experience. Various imagined latent trajectories are generated as virtual memory by the latent dynamic model. The policy is learned by propagating gradients through the learned latent model along the imagined latent trajectories, which leads to high data efficiency. Furthermore, a double critic structure is designed to reduce the oscillation during the training process. The effectiveness of LVM is demonstrated on an image-input autonomous driving task, in which LVM outperforms the existing method in terms of data efficiency, learning stability, and control performance.

Preprint, 2020, arXiv

Gas-Vapor Interplay in Plasmonic Bubble Shrinkage

Understanding the shrinkage dynamics of plasmonic bubbles formed around metallic nanoparticles immersed in liquid and irradiated by a resonant light source is crucial for the use of these bubbles in numerous applications. In this paper we experimentally show and theoretically explain that a plasmonic bubble undergoes two different phases during its shrinkage: first, a rapid partial bubble shrinkage governed by vapor condensation and, second, a slow diffusion-controlled bubble dissolution. The history of the bubble formation plays an important role in the shrinkage dynamics during the first phase, as it determines the gas-vapor ratio in the bubble composition. Higher laser powers lead to more vaporous bubbles, while longer pulses and higher dissolved air concentrations lead to more gaseous bubbles. The dynamics of the second phase barely depend on the history of bubble formation, i.e., laser power and pulse duration, but depend strongly on the dissolved air concentration, which defines the concentration gradient at the bubble interface. Finally, for the bubble dissolution in the second phase, with decreasing dissolved air concentration, we observe a gradual transition from a $R(t) \propto (t_0 - t) ^{1/3}$ scaling law to a $R(t) \propto (t_0 - t) ^{1/2}$ scaling law, where $t_0$ is the lifetime of the bubble, and we theoretically explain this transition.
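One way to tell the two scaling laws apart in measured data is to fit the exponent on a log-log plot of R against (t_0 - t). The sketch below is an illustrative least-squares fit, not the authors' analysis; the function name is hypothetical, and it assumes the bubble lifetime t_0 is already known and that all samples satisfy t < t_0 and R > 0.

```python
import math


def fit_scaling_exponent(times, radii, t0):
    """Least-squares slope of log R versus log(t0 - t).

    For R(t) proportional to (t0 - t)**alpha the slope equals alpha,
    so values near 1/3 or 1/2 indicate which regime the data follow.
    """
    if len(times) != len(radii) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    if any(t >= t0 for t in times) or any(r <= 0 for r in radii):
        raise ValueError("require t < t0 and R > 0 for every sample")
    xs = [math.log(t0 - t) for t in times]
    ys = [math.log(r) for r in radii]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic radii following the 1/2 law, used only to exercise the fit.
    t0 = 2.0
    ts = [0.1 * i for i in range(1, 15)]
    rs = [3.0 * (t0 - t) ** 0.5 for t in ts]
    print(fit_scaling_exponent(ts, rs, t0))
```

With real measurements the fitted exponent is sensitive to the choice of t_0, so in practice t_0 is often treated as an additional fit parameter.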
