Source author record

Lei Zhou

Lei Zhou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

44 works

23 topics

4 close collaborators

preprint 2026 arXiv

Improving LLM Reasoning with Homophily-aware Structural and Semantic Text-Attributed Graph Compression

Large language models (LLMs) have demonstrated promising capabilities in Text-Attributed Graph (TAG) understanding. Recent studies typically focus on verbalizing the graph structures via handcrafted prompts, feeding the target node and its neighborhood context into LLMs. However, constrained by the context window, existing methods mainly resort to random sampling, often implemented by randomly dropping nodes or edges, which inevitably introduces noise and causes reasoning instability. We argue that graphs inherently contain rich structural and semantic information, and that their effective exploitation can unlock potential gains in LLM reasoning performance. To this end, we propose Homophily-aware Structural and Semantic Compression for LLMs (HS2C), a framework centered on exploiting graph homophily. Structurally, guided by the principle of Structural Entropy minimization, we perform a global hierarchical partition that decodes the graph's essential topology. This partition identifies naturally cohesive, homophilic communities, while discarding stochastic connectivity noise. Semantically, we deliver the detected structural homophily to the LLM, empowering it to perform differentiated semantic aggregation based on predefined community type. This process compresses redundant background contexts into concise community-level consensus, selectively preserving semantically homophilic information aligned with the target nodes. Extensive experiments on 10 node-level benchmarks across LLMs of varying sizes and families demonstrate that, by feeding LLMs with structurally and semantically compressed inputs, HS2C simultaneously enhances the compression rate and downstream inference accuracy, validating its superiority and scalability. Extensions to 7 diverse graph-level benchmarks further consolidate HS2C's task generalizability.
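To make the structural entropy criterion mentioned in this abstract concrete, here is a minimal sketch that scores a flat (two-level) partition of an undirected graph with the standard two-dimensional structural entropy. It is not the HS2C implementation: the abstract describes a hierarchical partition, and the function name, the toy graph, and the two candidate partitions below are all illustrative assumptions.

```python
import math
from collections import defaultdict


def structural_entropy_2d(edges, community_of):
    """Two-dimensional structural entropy of an undirected graph under a flat partition.

    edges: list of (u, v) pairs; community_of: dict mapping node -> community label.
    Lower values indicate a partition that captures more of the graph's structure.
    """
    degree = defaultdict(int)
    volume = defaultdict(int)  # sum of member degrees per community
    cut = defaultdict(int)     # number of edges leaving each community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] != community_of[v]:
            cut[community_of[u]] += 1
            cut[community_of[v]] += 1
    for node, d in degree.items():
        volume[community_of[node]] += d

    two_m = 2 * len(edges)
    entropy = 0.0
    # Uncertainty of locating a node inside its own community.
    for node, d in degree.items():
        entropy -= (d / two_m) * math.log2(d / volume[community_of[node]])
    # Uncertainty of moving between communities.
    for label, vol in volume.items():
        entropy -= (cut[label] / two_m) * math.log2(vol / two_m)
    return entropy


# Hypothetical toy graph: two triangles joined by one bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cohesive = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
scattered = {0: "a", 3: "a", 1: "b", 4: "b", 2: "c", 5: "c"}
print(structural_entropy_2d(edges, cohesive), structural_entropy_2d(edges, scattered))
```

A partition search would keep the candidate with the lower score; on this toy graph the two-triangle partition scores lower than the scattered one.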

preprint 2026 arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Autonomous systems are increasingly deployed in open and dynamic environments -- from city streets to aerial and indoor spaces -- where perception models must remain reliable under sensor noise, environmental variation, and platform shifts. However, even state-of-the-art methods often degrade under unseen conditions, highlighting the need for robust and generalizable robot sensing. The RoboSense 2025 Challenge is designed to advance robustness and adaptability in robot perception across diverse sensing scenarios. It unifies five complementary research tracks spanning language-grounded decision making, socially compliant navigation, sensor configuration generalization, cross-view and cross-modal correspondence, and cross-platform 3D perception. Together, these tasks form a comprehensive benchmark for evaluating real-world sensing reliability under domain shifts, sensor failures, and platform discrepancies. RoboSense 2025 provides standardized datasets, baseline models, and unified evaluation protocols, enabling large-scale and reproducible comparison of robust perception methods. The challenge attracted 143 teams from 85 institutions across 16 countries, reflecting broad community engagement. By consolidating insights from 23 winning solutions, this report highlights emerging methodological trends, shared design principles, and open challenges across all tracks, marking a step toward building robots that can sense reliably, act robustly, and adapt across platforms in real-world environments.

preprint 2023 arXiv

Sequential Structure and Control Co-design of Lightweight Precision Stages with Active control of flexible modes

Precision motion stages are playing a prominent role in various manufacturing equipment. The drastically increasing demand for higher throughput in integrated circuit (IC) manufacturing and inspection calls for the next-generation precision stages that have light weight and high control bandwidth simultaneously. In today's design techniques, the stage's first flexible mode is limiting its achievable control bandwidth, which enforces a trade-off between the stage's acceleration and closed-loop stiffness and thus limits the system's overall performance. To overcome this challenge, this paper proposes a new hardware design and control framework for lightweight precision motion stages with the stage's low-frequency flexible modes actively controlled. Our method proposes to minimize the resonance frequency of the controlled mode to reduce the stage's weight, and to maximize that of the uncontrolled mode to enable high control bandwidth. In addition, the proposed framework determines the placement of the actuators and sensors to maximize the controllability/observability of the stage's controlled flexible mode while minimizing that of the uncontrolled mode, which effectively simplifies the controller designs. Two case studies are used to evaluate the effectiveness of the proposed framework. Simulation results show that the stage designed using the proposed method has a weight reduction of more than 55% compared to a baseline stage design. Improvement in control bandwidth was also achieved. These results demonstrate the effectiveness of the proposed method in achieving lightweight precision positioning stages with high acceleration, bandwidth, and precision.

preprint 2022 arXiv

ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer

Generating robust and reliable correspondences across images is a fundamental task for a diversity of applications. To capture context at both global and local granularity, we propose ASpanFormer, a Transformer-based detector-free matcher that is built on hierarchical attention structure, adopting a novel attention operation which is capable of adjusting attention span in a self-adaptive manner. To achieve this goal, first, flow maps are regressed in each cross attention phase to locate the center of search region. Next, a sampling grid is generated around the center, whose size, instead of being empirically configured as fixed, is adaptively computed from a pixel uncertainty estimated along with the flow map. Finally, attention is computed across two images within derived regions, referred to as attention span. By these means, we are able to not only maintain long-range dependencies, but also enable fine-grained attention among pixels of high relevance that compensates essential locality and piece-wise smoothness in matching tasks. State-of-the-art accuracy on a wide range of evaluation benchmarks validates the strong matching capability of our method.

preprint 2022 arXiv

Automatic detection of multilevel communities: scalable and resolution-limit-free

Community structure is one of the most important features of complex networks. Modularity-based methods for community detection typically rely on heuristic algorithms to optimize a specific community quality function. Such methods are limited by two major defects: (1) the resolution limit problem, which prohibits communities of heterogeneous sizes being simultaneously detected, and (2) divergent outputs of the heuristic algorithm, which make it difficult to differentiate relevant and irrelevant results. In this paper, we propose an improved method for community detection based on a scalable community "fitness function." We introduced a new parameter to enhance its scalability, and a strict strategy to filter the outputs. Due to the scalability, on the one hand our method is free of the resolution limit problem and performs excellently on large heterogeneous networks, while on the other hand it is capable of detecting more levels of communities than previous methods in deep hierarchical networks. Moreover, our strict strategy automatically removes redundant and irrelevant results, without any artificial selection. As a result, our method neatly outputs only the stable and unique communities, which are largely interpretable by the a priori knowledge about the network, including the implanted structures within synthetic networks, or metadata for real-world networks.
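The abstract does not spell out its fitness function, so the sketch below uses the widely cited local fitness form k_in / (k_in + k_out)^alpha as a stand-in. Treat the formula choice, the function name, and the `alpha` parameter as assumptions for illustration; the paper's own "scalable" function and its additional parameter may differ.

```python
def community_fitness(edges, community, alpha=1.0):
    """Local fitness of a node set: k_in / (k_in + k_out) ** alpha.

    k_in counts internal degree (each internal edge contributes 2),
    k_out counts edges with exactly one endpoint inside the community.
    """
    members = set(community)
    k_in = 0
    k_out = 0
    for u, v in edges:
        u_inside = u in members
        v_inside = v in members
        if u_inside and v_inside:
            k_in += 2
        elif u_inside or v_inside:
            k_out += 1
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


# Hypothetical toy graph: two triangles joined by one bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(community_fitness(edges, {0, 1, 2}))  # whole triangle
print(community_fitness(edges, {0, 1}))     # partial triangle
```

In this form, raising `alpha` penalizes large communities more strongly, which is how a resolution parameter lets one method expose several hierarchical levels.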

preprint 2022 arXiv

Control Co-design of Actively Controlled Lightweight Structures for High-acceleration Precision Motion Systems

Precision motion stages are an essential part of a wide range of manufacturing equipment, and their motion performance is critical to the quality and throughput of the systems. The drastically increasing demand for higher manufacturing throughput in various processes necessitates the development of next-generation motion systems with reduced moving weight and high control bandwidth. However, reducing the moving stage's weight can lower the stage's structural resonance frequencies, making the hardware dynamics and controller design problem strongly coupled. To address this challenge, this paper proposes a new formulation of a nested hardware and control co-design (CCD) framework for precision motion stages. The proposed framework explicitly optimizes the closed-loop control bandwidth with guaranteed robustness, and explicitly considers the limits of the physical system. Two case studies, a motivating example using a lumped-parameter mechanical system and a finite-element-simulated lightweight motion stage, are used to evaluate the effectiveness of the proposed nested CCD framework. Simulation results show that the proposed nested CCD framework achieves a 42% weight reduction and a 28% bandwidth improvement compared with a sequential design baseline, which demonstrates the efficacy of the proposed approach.

preprint 2022 arXiv

Economical Precise Manipulation and Auto Eye-Hand Coordination with Binocular Visual Reinforcement Learning

Precision robotic manipulation tasks (insertion, screwing, precise picking, precise placing) are required in many scenarios. Previous methods achieved good performance on such manipulation tasks. However, such methods typically require tedious calibration or expensive sensors. 3D/RGB-D cameras and torque/force sensors add to the cost of the robotic application and may not always be economical. In this work, we aim to solve these tasks using only weakly calibrated, low-cost webcams. We propose Binocular Alignment Learning (BAL), which automatically learns the eye-hand coordination and point-alignment capabilities needed to solve the four tasks. Our work focuses on working with unknown eye-hand coordination and proposes different ways of performing eye-in-hand camera calibration automatically. The algorithm was trained in simulation, transferred through a practical sim2real pipeline, and tested on a real robot. Our method achieves competitive results at minimal cost on the four tasks.

preprint 2022 arXiv

Half a Dozen Real-World Applications of Evolutionary Multitasking, and More

Until recently, the potential to transfer evolved skills across distinct optimization problem instances (or tasks) was seldom explored in evolutionary computation. The concept of evolutionary multitasking (EMT) fills this gap. It unlocks a population's implicit parallelism to jointly solve a set of tasks, hence creating avenues for skills transfer between them. Despite it being early days, the idea of EMT has begun to show promise in a range of real-world applications. In the backdrop of recent advances, the contribution of this paper is twofold. First, a review of several application-oriented explorations of EMT in the literature is presented; the works are assimilated into half a dozen broad categories according to their respective application domains. Each of these six categories elaborates fundamental motivations to multitask, and contains a representative experimental study (referred from the literature). Second, a set of recipes is provided showing how problem formulations of general interest, those that cut across different disciplines, could be transformed in the new light of EMT. Our discussions emphasize the many practical use-cases of EMT, and are intended to spark future research towards crafting novel algorithms for real-world deployment.

preprint 2022 arXiv

Learning Prototype via Placeholder for Zero-shot Recognition

Zero-shot learning (ZSL) aims to recognize unseen classes by exploiting semantic descriptions shared between seen classes and unseen classes. Current methods show that it is effective to learn visual-semantic alignment by projecting semantic embeddings into the visual space as class prototypes. However, such a projection function is only concerned with seen classes. When applied to unseen classes, the prototypes often perform suboptimally due to domain shift. In this paper, we propose to learn prototypes via placeholders, termed LPL, to eliminate the domain shift between seen and unseen classes. Specifically, we combine seen classes to hallucinate new classes which play as placeholders of the unseen classes in the visual and semantic space. Placed between seen classes, the placeholders encourage prototypes of seen classes to be highly dispersed. And more space is spared for the insertion of well-separated unseen ones. Empirically, well-separated prototypes help counteract visual-semantic misalignment caused by domain shift. Furthermore, we exploit a novel semantic-oriented fine-tuning to guarantee the semantic reliability of placeholders. Extensive experiments on five benchmark datasets demonstrate the significant performance gain of LPL over the state-of-the-art methods. Code is available at https://github.com/zaiquanyang/LPL.

preprint 2022 arXiv

Lung Swapping Autoencoder: Learning a Disentangled Structure-texture Representation of Chest Radiographs

Well-labeled datasets of chest radiographs (CXRs) are difficult to acquire due to the high cost of annotation. Thus, it is desirable to learn a robust and transferable representation in an unsupervised manner to benefit tasks that lack labeled data. Unlike natural images, medical images have their own domain prior; e.g., we observe that many pulmonary diseases, such as COVID-19, manifest as changes in the lung tissue texture rather than the anatomical structure. Therefore, we hypothesize that studying only the texture without the influence of structure variations would be advantageous for downstream prognostic and predictive modeling tasks. In this paper, we propose a generative framework, the Lung Swapping Autoencoder (LSAE), that learns factorized representations of a CXR to disentangle the texture factor from the structure factor. Specifically, by adversarial training, the LSAE is optimized to generate a hybrid image that preserves the lung shape in one image but inherits the lung texture of another. To demonstrate the effectiveness of the disentangled texture representation, we evaluate the texture encoder $Enc^t$ in LSAE on ChestX-ray14 (N=112,120), and our own multi-institutional COVID-19 outcome prediction dataset, COVOC (N=340 (Subset-1) + 53 (Subset-2)). On both datasets, we reach or surpass the state-of-the-art by finetuning $Enc^t$ in LSAE, which is 77% smaller than a baseline Inception v3. Additionally, in semi-supervised and self-supervised settings with a similar model budget, $Enc^t$ in LSAE is also competitive with the state-of-the-art MoCo. By "re-mixing" the texture and shape factors, we generate meaningful hybrid images that can augment the training set. This data augmentation method can further improve COVOC prediction performance. The improvement is consistent even when we directly evaluate the Subset-1 trained model on Subset-2 without any fine-tuning.

preprint 2022 arXiv

Self-Sensing Hysteresis-Type Bearingless Motor

Bearingless motors use a single stator assembly to apply torque and magnetic suspension forces on the rotor, making these machines compact with frictionless operation and thus well suited to high-speed applications. One major challenge that prevents wide usage of bearingless motors is the need for air-gap position sensors, which are typically expensive. Here we present a method to estimate the radial position of a hysteresis-type bearingless motor using the inductance variation of the stator coils amplified by an injected high-frequency signal. We have carried out finite element (FE) simulations to demonstrate its feasibility, and have constructed a prototype self-sensing bearingless motor for experimental validations.

preprint 2021 arXiv

Goal-Oriented Gaze Estimation for Zero-Shot Learning

Zero-shot learning (ZSL) aims to recognize novel classes by transferring semantic knowledge from seen classes to unseen classes. Since semantic knowledge is built on attributes shared between different classes, which are highly local, a strong prior for localizing object attributes is beneficial for visual-semantic embedding. Interestingly, when recognizing unseen images, humans also automatically gaze at regions with certain semantic clues. Therefore, we introduce a novel goal-oriented gaze estimation module (GEM) to improve the discriminative attribute localization based on the class-level attributes for ZSL. We aim to predict the actual human gaze location to get the visual attention regions for recognizing a novel object guided by attribute description. Specifically, the task-dependent attention is learned with the goal-oriented GEM, and the global image features are simultaneously optimized with the regression of local attribute features. Experiments on three ZSL benchmarks, i.e., CUB, SUN and AWA2, show the superiority or competitiveness of our proposed method against the state-of-the-art ZSL methods. The ablation analysis on real gaze data CUB-VWSW also validates the benefits and accuracy of our gaze estimation module. This work implies the promising benefits of collecting human gaze datasets and of automatic gaze estimation algorithms for high-level computer vision tasks. The code is available at https://github.com/osierboy/GEM-ZSL.

preprint 2021 arXiv

Information Bottleneck Constrained Latent Bidirectional Embedding for Zero-Shot Learning

Zero-shot learning (ZSL) aims to recognize novel classes by transferring semantic knowledge from seen classes to unseen classes. Though many ZSL methods rely on a direct mapping between the visual and the semantic space, the calibration deviation and hubness problem limit the generalization capability to unseen classes. Recently emerged generative ZSL methods generate unseen image features to transform ZSL into a supervised classification problem. However, most generative models still suffer from the seen-unseen bias problem as only seen data is used for training. To address these issues, we propose a novel bidirectional embedding based generative model with a tight visual-semantic coupling constraint. We learn a unified latent space that calibrates the embedded parametric distributions of both visual and semantic spaces. Since the embedding from high-dimensional visual features contains much non-semantic information, the alignment of visual and semantic embeddings in latent space would inevitably deviate. Therefore, we introduce an information bottleneck (IB) constraint to ZSL for the first time to preserve essential attribute information during the mapping. Specifically, we utilize the uncertainty estimation and the wake-sleep procedure to alleviate the feature noises and improve model abstraction capability. In addition, our method can be easily extended to the transductive ZSL setting by generating labels for unseen images. We then introduce a robust loss to solve this label noise problem. Extensive experimental results show that our method outperforms the state-of-the-art methods in different ZSL settings on most benchmark datasets. The code will be available at https://github.com/osierboy/IBZSL.

preprint 2021 arXiv

PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency

Removing outlier correspondences is one of the critical steps for successful feature-based point cloud registration. Despite the increasing popularity of introducing deep learning methods in this field, spatial consistency, which is essentially established by a Euclidean transformation between point clouds, has received almost no individual attention in existing learning frameworks. In this paper, we present PointDSC, a novel deep neural network that explicitly incorporates spatial consistency for pruning outlier correspondences. First, we propose a nonlocal feature aggregation module, weighted by both feature and spatial coherence, for feature embedding of the input correspondences. Second, we formulate a differentiable spectral matching module, supervised by pairwise spatial compatibility, to estimate the inlier confidence of each correspondence from the embedded features. With modest computation cost, our method outperforms the state-of-the-art hand-crafted and learning-based outlier rejection approaches on several real-world datasets by a significant margin. We also show its wide applicability by combining PointDSC with different 3D local descriptors.
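The spatial consistency idea in this abstract rests on the fact that a rigid (Euclidean) transformation preserves pairwise distances. The sketch below is the classic hand-crafted version of that idea, not PointDSC's learned network: it builds a pairwise compatibility matrix and scores each correspondence by the leading eigenvector, as in conventional spectral matching. The function names, the `sigma` tolerance, and the toy points are assumptions.

```python
import math


def compatibility_matrix(src, dst, sigma=0.1):
    """Pairwise compatibility of correspondences (src[i] <-> dst[i]).

    Two correspondences are compatible when the distance between their
    source points matches the distance between their target points.
    """
    n = len(src)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            gap = abs(math.dist(src[i], src[j]) - math.dist(dst[i], dst[j]))
            matrix[i][j] = max(0.0, 1.0 - (gap / sigma) ** 2)
    return matrix


def leading_eigenvector(matrix, iterations=50):
    """Power iteration; entries act as inlier confidence scores."""
    n = len(matrix)
    vector = [1.0] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return vector
        vector = [x / norm for x in product]
    return vector


# Hypothetical data: four correspondences related by a pure translation, one outlier.
src = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
dst = [(2, 3, 4), (3, 3, 4), (2, 4, 4), (2, 3, 5), (10, -5, 7)]
scores = leading_eigenvector(compatibility_matrix(src, dst))
print(scores)
```

Correspondences that agree with the dominant rigid motion reinforce each other and receive high scores, while the outlier's score collapses toward zero.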

preprint 2020 arXiv

ASLFeat: Learning Local Features of Accurate Shape and Localization

This work focuses on mitigating two limitations in the joint learning of local feature detectors and descriptors. First, the ability to estimate the local shape (scale, orientation, etc.) of feature points is often neglected during dense feature extraction, while the shape-awareness is crucial to acquire stronger geometric invariance. Second, the localization accuracy of detected keypoints is not sufficient to reliably recover camera geometry, which has become the bottleneck in tasks such as 3D reconstruction. In this paper, we present ASLFeat, with three light-weight yet effective modifications to mitigate the above issues. First, we resort to deformable convolutional networks to densely estimate and apply local transformation. Second, we take advantage of the inherent feature hierarchy to restore spatial resolution and low-level details for accurate keypoint localization. Finally, we use a peakiness measurement to relate feature responses and derive more indicative detection scores. The effect of each modification is thoroughly studied, and the evaluation is extensively conducted across a variety of practical scenarios. State-of-the-art results are reported that demonstrate the superiority of our methods.

preprint 2020 arXiv

BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks

While deep learning has recently achieved great success on multi-view stereo (MVS), limited training data makes the trained model hard to generalize to unseen scenarios. Compared with other computer vision tasks, it is rather difficult to collect a large-scale MVS dataset as it requires expensive active scanners and a labor-intensive process to obtain ground truth 3D structures. In this paper, we introduce BlendedMVS, a novel large-scale dataset, to provide sufficient training ground truth for learning-based MVS. To create the dataset, we apply a 3D reconstruction pipeline to recover high-quality textured meshes from images of well-selected scenes. Then, we render these mesh models to color images and depth maps. To introduce the ambient lighting information during training, the rendered color images are further blended with the input images to generate the training input. Our dataset contains over 17k high-resolution images covering a variety of scenes, including cities, architectures, sculptures and small objects. Extensive experiments demonstrate that BlendedMVS endows the trained model with significantly better generalization ability compared with other MVS datasets. The dataset and pretrained models are available at https://github.com/YoYo000/BlendedMVS.

preprint 2020 arXiv

D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features

A successful point cloud registration often relies on the robust establishment of sparse matches through discriminative 3D local features. Despite the fast evolution of learning-based 3D feature descriptors, little attention has been drawn to the learning of 3D feature detectors, even less for a joint learning of the two tasks. In this paper, we leverage a 3D fully convolutional network for 3D point clouds, and propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point. In particular, we propose a keypoint selection strategy that overcomes the inherent density variations of 3D point clouds, and further propose a self-supervised detector loss guided by the on-the-fly feature matching results during training. Finally, our method achieves state-of-the-art results in both indoor and outdoor scenarios, evaluated on 3DMatch and KITTI datasets, and shows its strong generalization ability on the ETH dataset. Towards practical use, we show that by adopting a reliable feature detector, sampling a smaller number of features is sufficient to achieve accurate and fast point cloud alignment. Code release: https://github.com/XuyangBai/D3Feat

preprint 2020 arXiv

Dynamics of charged dust in the orbit of Venus

We study the dynamics of co-orbital dust in the inner solar system, i.e. the role of the solar radiation pressure, Poynting-Robertson effect, solar wind, and the interplanetary magnetic field on the location, width and stability of resonant motion of charged, and micron sized dust grains situated in the 1:1 mean motion resonance with planet Venus. We find deviations and asymmetry between $L_4$ and $L_5$ in locations of libration centers and libration width under the influence of non-gravitational effects via both analytical and numerical methods. The triangular Lagrangian points become unstable once we take into consideration solar radiation pressure, the Poynting-Robertson effect and solar wind drag. The Lorentz force could further destabilize the orbits, especially for small dust particles. We also make a comparison between the circular, elliptic restricted three-body model and a more complete model including all planets.

preprint 2020 arXiv

End-to-end Optimized Video Compression with MV-Residual Prediction

We present an end-to-end trainable framework for P-frame compression in this paper. A joint motion vector (MV) and residual prediction network MV-Residual is designed to extract the ensembled features of motion representations and residual information by treating the two successive frames as inputs. The prior probability of the latent representations is modeled by a hyperprior autoencoder and trained jointly with the MV-Residual network. Specifically, the spatially-displaced convolution is applied for video frame prediction, in which a motion kernel for each pixel is learned to generate the predicted pixel by applying the kernel at a displaced location in the source image. Finally, novel rate allocation and post-processing strategies are used to produce the final compressed bits, considering the bits constraint of the challenge. The experimental results on the validation set show that the proposed optimized framework can generate the highest MS-SSIM in the P-frame compression competition.

preprint 2020 arXiv

Joint Semantic Segmentation and Boundary Detection using Iterative Pyramid Contexts

In this paper, we present a joint multi-task learning framework for semantic segmentation and boundary detection. The critical component in the framework is the iterative pyramid context module (PCM), which couples two tasks and stores the shared latent semantics to interact between the two tasks. For semantic boundary detection, we propose the novel spatial gradient fusion to suppress nonsemantic edges. As semantic boundary detection is the dual task of semantic segmentation, we introduce a loss function with boundary consistency constraint to improve the boundary pixel accuracy for semantic segmentation. Our extensive experiments demonstrate superior performance over state-of-the-art works, not only in semantic segmentation but also in semantic boundary detection. In particular, a mean IoU score of 81.8% on Cityscapes test set is achieved without using coarse data or any external data for semantic segmentation. For semantic boundary detection, we improve over previous state-of-the-art works by 9.9% in terms of AP and 6.8% in terms of MF(ODS).
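For readers unfamiliar with the mean IoU metric quoted in this abstract, the following is a minimal sketch of how it can be computed over flattened label sequences. The function name and the toy labels are illustrative assumptions; benchmark toolkits typically accumulate intersections and unions over an entire dataset and may ignore certain labels, which this sketch does not do.

```python
def mean_iou(predicted, target, num_classes):
    """Mean intersection-over-union across classes present in either input.

    predicted and target are equal-length sequences of integer class labels,
    for example flattened per-pixel segmentation maps.
    """
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(predicted, target) if p == cls or t == cls)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical six-pixel example with three classes.
predicted = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(predicted, target, num_classes=3))
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.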

preprint 2020 arXiv

KFNet: Learning Temporal Camera Relocalization using Kalman Filtering

Temporal camera relocalization estimates the pose with respect to each video frame in sequence, as opposed to one-shot relocalization which focuses on a still image. Even though the time dependency has been taken into account, current temporal relocalization methods still generally underperform the state-of-the-art one-shot approaches in terms of accuracy. In this work, we improve the temporal relocalization method by using a network architecture that incorporates Kalman filtering (KFNet) for online camera relocalization. In particular, KFNet extends the scene coordinate regression problem to the time domain in order to recursively establish 2D and 3D correspondences for the pose determination. The network architecture design and the loss formulation are based on Kalman filtering in the context of Bayesian learning. Extensive experiments on multiple relocalization benchmarks demonstrate the high accuracy of KFNet at the top of both one-shot and temporal relocalization approaches. Our codes are released at https://github.com/zlthinker/KFNet.
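As background for the Kalman filtering that this abstract builds on, here is a textbook scalar predict-and-update step. It is only a sketch of the underlying recursion: KFNet applies the idea to scene coordinate maps with learned components, whereas the identity transition model, the function name, and the numbers below are assumptions.

```python
def kalman_step(mean, variance, measurement, measurement_variance, process_variance=0.0):
    """One predict-and-update cycle of a scalar Kalman filter.

    Assumes an identity state transition, so prediction only inflates
    the variance by the process noise.
    """
    # Predict: carry the previous estimate forward in time.
    prior_mean = mean
    prior_variance = variance + process_variance
    # Update: blend prediction and measurement by their relative certainty.
    gain = prior_variance / (prior_variance + measurement_variance)
    posterior_mean = prior_mean + gain * (measurement - prior_mean)
    posterior_variance = (1.0 - gain) * prior_variance
    return posterior_mean, posterior_variance


# Hypothetical sequence of noisy measurements of a constant quantity.
estimate, uncertainty = 0.0, 1.0
for observed in [2.0, 1.8, 2.2]:
    estimate, uncertainty = kalman_step(estimate, uncertainty, observed, 1.0)
print(estimate, uncertainty)
```

Each step shrinks the variance, which is the sense in which a temporal method can accumulate evidence across frames instead of treating every frame independently.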

preprint 2020 arXiv

Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation

In this paper, we introduce a novel network, called discriminative feature network (DFNet), to address the unsupervised video object segmentation task. To capture the inherent correlation among video frames, we learn discriminative features (D-features) from the input images that reveal feature distribution from a global perspective. The D-features are then used to establish correspondence with all features of test image under conditional random field (CRF) formulation, which is leveraged to enforce consistency between pixels. The experiments verify that DFNet outperforms state-of-the-art methods by a large margin with a mean IoU score of 83.4% and ranks first on the DAVIS-2016 leaderboard while using much fewer parameters and achieving much more efficient performance in the inference phase. We further evaluate DFNet on the FBMS dataset and the video saliency dataset ViSal, reaching a new state-of-the-art. To further demonstrate the generalizability of our framework, DFNet is also applied to the image object co-segmentation task. We perform experiments on a challenging dataset PASCAL-VOC and observe the superiority of DFNet. The thorough experiments verify that DFNet is able to capture and mine the underlying relations of images and discover the common foreground objects.

preprint 2020 arXiv

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency

Recent learning-based approaches, in which models are trained on single-view images, have shown promising results for monocular 3D face reconstruction, but they suffer from the ill-posed face pose and depth ambiguity issue. In contrast to previous works that only enforce 2D feature constraints, we propose a self-supervised training architecture by leveraging the multi-view geometry consistency, which provides reliable constraints on face pose and depth estimation. We first propose an occlusion-aware view synthesis method to apply multi-view geometry consistency to self-supervised learning. Then we design three novel loss functions for multi-view consistency, including the pixel consistency loss, the depth consistency loss, and the facial landmark-based epipolar loss. Our method is accurate and robust, especially under large variations of expressions, poses, and illumination conditions. Comprehensive experiments on the face alignment and 3D face reconstruction benchmarks have demonstrated superiority over state-of-the-art methods. Our code and model are released at https://github.com/jiaxiangshang/MGCNet.

preprint2020arXiv

Tunable Graphene Split-Ring Resonators

A split-ring resonator is a prototype of a meta-atom in metamaterials. Though noble metal-based split-ring resonators have been extensively studied, to date there is no experimental demonstration of split-ring resonators made from graphene, an emerging and intriguing plasmonic material. Here, we experimentally demonstrate graphene split-ring resonators with a deep subwavelength (about one hundredth of the excitation wavelength) magnetic dipole response in the terahertz regime. Meanwhile, the quadrupole and electric dipole are observed, depending on the incident light polarization. All modes can be tuned via chemical doping or stacking multiple graphene layers. The strong interaction with surface polar phonons of the SiO2 substrate also significantly modifies the response. Finite-element frequency domain simulations nicely reproduce the experimental results. Our study moves one step toward multi-functional graphene metamaterials, beyond simple graphene ribbon or disk arrays with electrical dipole resonances only.

preprint2019arXiv

A systematic survey of the dynamics of Uranus Trojans

We aim to locate the stability region for Uranus Trojans (UT hereafter) and find out the dynamical mechanisms responsible for the structures in the phase space. Using the spectral number as the stability indicator, we construct the dynamical maps on the (a0, i0) plane. The proper frequencies of UTs are determined precisely so that we can depict the resonance web via a semi-analytical method. Two main stability regions are found, one each for the low-inclination (0-14deg) and high-inclination regime (32-59deg). There is also an instability strip in each of them, at 9deg and 51deg respectively. All stability regions are in the tadpole regime and no stable horseshoe orbits exist for UTs. The lack of moderate-inclined UTs is caused by the nu5 and nu7 secular resonances. The fine structures in the dynamical maps are shaped by high-degree secular resonances and secondary resonances. During the planetary migration, about 36.3% and 0.4% of the pre-formed orbits survive the fast and slow migrations (with migrating time scales of 1 and 10Myr) respectively, most of which are in high inclination. Since the low-inclined UTs are more likely to survive the age of the solar system, they make up 77% of all such long-life orbits by the end of the migration, making a total fraction up to 4.06E-3 and 9.07E-5 of the original population for the fast and slow migrations, respectively. About 3.81% UTs are able to survive the age of the solar system, among which 95.5% are on low-inclined orbits with i0<7.5deg. However, the depletion of the planetary migration seems to prevent a large fraction of such orbits, especially for the slow migration model.

preprint2016arXiv

Evolution of Cooperation on Temporal Networks

The structure of social networks is a key determinant in fostering cooperation and other altruistic behavior among naturally selfish individuals. However, most real social interactions are temporal, being both finite in duration and spread out over time. This raises the question of whether stable cooperation can form despite an intrinsically fragmented social fabric. Here we develop a framework to study the evolution of cooperation on temporal networks in the setting of the classic Prisoner's Dilemma. By analyzing both real and synthetic datasets, we find that temporal networks generally facilitate the evolution of cooperation compared to their static counterparts. More interestingly, we find that the intrinsic human interactive pattern like bursty behavior impedes the evolution of cooperation. Finally, we introduce a measure to quantify the temporality present in networks and demonstrate that there is an intermediate level of temporality that boosts cooperation most. Our results open a new avenue for investigating the evolution of cooperation in more realistic structured populations.
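As a rough illustration of how Prisoner's Dilemma payoffs can be accumulated over a temporal network, the sketch below treats the network as an ordered list of snapshots, each a list of edges active in that time window. The payoff values, the snapshot representation, and the function names are assumptions made for illustration; the abstract does not specify them, and no strategy-update rule is modeled here.

```python
# Illustrative payoff values satisfying T > R > P > S (assumed, not from the paper).
R, S, T, P = 3, 0, 5, 1

def payoff(mine, theirs):
    """Payoff to a player using strategy `mine` against `theirs` ('C' or 'D')."""
    table = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    return table[(mine, theirs)]

def accumulate_payoffs(snapshots, strategy):
    """Sum each node's payoff over every edge in every snapshot, in time order."""
    totals = {node: 0 for node in strategy}
    for edges in snapshots:
        for u, v in edges:
            totals[u] += payoff(strategy[u], strategy[v])
            totals[v] += payoff(strategy[v], strategy[u])
    return totals

# Hypothetical temporal network: three snapshots over four nodes.
snapshots = [[(0, 1)], [(1, 2)], [(0, 1), (2, 3)]]
strategy = {0: "C", 1: "C", 2: "D", 3: "C"}
totals = accumulate_payoffs(snapshots, strategy)
print(totals)
```

Collapsing all snapshots into one static graph would give each edge a single interaction, so comparing the two accumulations is one simple way to see how temporality changes the payoffs that drive selection.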

preprint2016arXiv

Revealing the spin optics of conics

The ellipse and the hyperbola are two well-known curves in mathematics with numerous applications in various fields, but their properties and inherent differences in spin optics are less understood. Here, we investigate the peculiar optical spin properties of the two curves and establish a connection between their foci and the spin states of incident light. We show that the optical spin Hall effect is the intrinsic optical spin property of the ellipse, where photons with different spin states can be exactly separated to each of its two foci. A hyperbola, by contrast, exhibits an optical spin-selective effect, where only photons with one particular spin state can be accumulated at its foci. These properties are then experimentally demonstrated in the near field by arranging nanoslits in conic shapes. Based on the spin properties of the curves, we design spin-based plasmonic devices with various functionalities. Our results reveal the intrinsic optical spin properties behind conic curves and provide a route for designing spin-based plasmonic devices.

preprint2015arXiv

Analytic derivation of electrostrictive tensors and their application to optical force density calculations

Using multiple scattering theory, we derived for the first time analytical formulas for electrostrictive tensors for two dimensional metamaterial systems. The electrostrictive tensor terms are found to depend explicitly on the symmetry of the underlying lattice of the metamaterial and they also depend explicitly on the direction of a local effective wave vector. These analytical results enable us to calculate light induced body forces inside a composite system (metamaterial) using the Helmholtz stress tensor within the effective medium formalism in the sense that the fields used in the stress tensor are those obtained by solving the macroscopic Maxwell equation with the microstructure of the metamaterial replaced by an effective medium. Our results point to some fundamental questions of using an effective medium theory to determine optical force density. In particular, the fact that Helmholtz tensor carries electrostrictive terms that are explicitly symmetry dependent means that the standard effective medium parameters cannot give sufficient information to determine body force density, even though they can give the correct total force. A more challenging issue is that the electrostrictive terms are related to a local effective wave vector, and it is not always obtainable in systems with boundary reflections within the context of a standard effective medium approach.

preprint2015arXiv

Tailor the functionalities of metasurfaces: From perfect absorption to phase modulation

Metasurfaces in metal/insulator/metal configuration have recently been widely used in photonics research, with applications ranging from perfect absorption to phase modulation, but why and when such structures realize which functionalities is not yet fully understood. Here, based on a coupled-mode theory analysis, we establish a complete phase diagram in which the optical properties of such systems are fully controlled by two simple parameters (i.e., the intrinsic and radiation losses), which are in turn dictated by the geometrical/material parameters of the underlying structures. Such a phase diagram can greatly facilitate the design of appropriate metasurfaces with tailored functionalities (e.g., perfect absorption, phase modulator, electric/magnetic reflector, etc.), as demonstrated by our experiments and simulations in the terahertz regime. In particular, our experiments show that, through appropriate structural/material tuning, the device can be switched across the functionality phase boundaries, yielding dramatic changes in optical responses. Our discoveries lay a solid basis for realizing functional and tunable photonic devices with such structures.

preprint2014arXiv

Full-range Gate-controlled Terahertz Phase Modulations with Graphene Metasurfaces

Local phase control of electromagnetic waves, the basis of a diverse set of applications such as hologram imaging, polarization and wave-front manipulation, is of fundamental importance in photonic research. However, the bulky, passive phase modulators currently available remain a hurdle for photonic integration. Here we demonstrate full-range active phase modulations in the terahertz (THz) regime, realized by gate-tuned ultra-thin reflective metasurfaces based on graphene. A one-port resonator model, backed by our full-wave simulations, reveals the underlying mechanism of our extreme phase modulations, and points to general strategies for the design of tunable photonic devices. As a particular example, we demonstrate a gate-tunable THz polarization modulator based on our graphene metasurface. Our findings pave the road towards exciting photonic applications based on active phase manipulations.

preprint2014arXiv

High-performance THz metamaterial absorber

We demonstrated an ultra-broadband, polarization-insensitive and wide-angle metamaterial absorber for terahertz (THz) frequencies using arrays of truncated pyramid unit structures made of a metal-dielectric multilayer composite. In our design, each sub-layer behaves as an effective waveguide, and its lateral width is gradually modified to realize a wideband response by effectively stitching together the resonance bands of different waveguide modes. Experimentally, our five-layer sample with a total thickness of 21 μm is capable of producing a large absorptivity above 80% from 0.7 to 2.3 THz up to the maximum measurement angle of 40°. The full absorption width at half maximum (FWHM) of our device is around 127%, greater than those previously reported for THz frequencies. Our absorber design has high practical feasibility and can be easily integrated with semiconductor technology to make highly efficient THz-oriented devices.

preprint2013arXiv

A real-time QKD system based on FPGA

A real-time Quantum Key Distribution (QKD) system is developed in this paper. In the system, based on the features of the Field Programmable Gate Array (FPGA), the secure key extraction control and algorithm have been optimally designed to perform sifting, error correction and privacy amplification altogether in real time. In the QKD experiment, an information synchronization mechanism and a high-speed classical data channel are designed to ensure the steady operation of the system. Decoy states and a synchronous laser light source are used in the system, while the length of optical fiber between Alice and Bob is 20 km. With a photon repetition frequency of 20 MHz, the final key rate could reach 17 kbps. Smooth and robust operation is verified with a 6-hour continuous test and an associated encrypted voice communication test.
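The two figures quoted above (20 MHz repetition frequency, 17 kbps final key rate) imply a secure-key yield per emitted pulse. The short calculation below is a sketch of that arithmetic only; the variable names are illustrative, and it assumes "kbps" means 1000 bits per second.

```python
# Figures taken from the abstract; kbps is assumed to mean 1e3 bits/s.
pulse_rate_hz = 20e6        # photon repetition frequency
final_key_rate_bps = 17e3   # final secure key rate

# Secure key bits obtained per emitted pulse, and its reciprocal.
bits_per_pulse = final_key_rate_bps / pulse_rate_hz
pulses_per_key_bit = pulse_rate_hz / final_key_rate_bps

print(f"{bits_per_pulse:.2e} key bits per pulse")
print(f"about {pulses_per_key_bit:.0f} pulses per final key bit")
```

In other words, on the order of a thousand pulses are sent for each bit of final key after sifting, error correction and privacy amplification over the 20 km fiber.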

preprint2013arXiv

Realizing optical pulling force using chirality

We derived an analytical formula for the optical force acting on a small anisotropic chiral particle. The behavior of chiral particles is qualitatively different from achiral particles due to new chirality dependent terms which couple mechanical linear momentum and optical spin angular momentum. Such coupling induced by chirality can serve as a new mechanism to achieve optical pulling force. Our analytical predictions are verified by numerical simulations.

preprint2013arXiv

Uplink Multicell Processing with Limited Backhaul via Per-Base-Station Successive Interference Cancellation

This paper studies an uplink multicell joint processing model in which the base-stations are connected to a centralized processing server via rate-limited digital backhaul links. Unlike previous studies where the centralized processor jointly decodes all the source messages from all base-stations, this paper proposes a suboptimal achievability scheme in which the Wyner-Ziv compress-and-forward relaying technique is employed on a per-base-station basis, but successive interference cancellation (SIC) is used at the central processor to mitigate multicell interference. This results in an achievable rate region that is easily computable, in contrast to the joint processing schemes in which the rate regions can only be characterized by an exponential number of rate constraints. Under the per-base-station SIC framework, this paper further studies the impact of the limited-capacity backhaul links on the achievable rates and establishes that in order to achieve to within a constant number of bits of the maximal SIC rate with infinite-capacity backhaul, the backhaul capacity must scale logarithmically with the signal-to-interference-and-noise ratio (SINR) at each base-station. Finally, this paper studies the optimal backhaul rate allocation problem for an uplink multicell joint processing model with a total backhaul capacity constraint. The analysis reveals that the optimal strategy that maximizes the overall sum rate should also scale with the log of the SINR at each base-station.

preprint2012arXiv

Gaussian Z-Interference Channel with a Relay Link: Achievability Region and Asymptotic Sum Capacity

This paper studies a Gaussian Z-interference channel with a rate-limited digital relay link from one receiver to another. Achievable rate regions are derived based on a combination of the Han-Kobayashi common-private information splitting technique and several different relay strategies, including compress-and-forward and a partial decode-and-forward strategy, in which the interference is partially decoded then binned and forwarded through the digital link for subtraction at the other end. For the Gaussian Z-interference channel with a digital link from the interference-free receiver to the interfered receiver, the capacity region is established in the strong interference regime; an achievable rate region is established in the weak interference regime. In the weak interference regime, the partial decode-and-forward strategy is shown to be asymptotically sum-capacity achieving in the high signal-to-noise ratio and high interference-to-noise ratio limit. In this case, each relay bit asymptotically improves the sum capacity by one bit. For the Gaussian Z-interference channel with a digital link from the interfered receiver to the interference-free receiver, the capacity region is established in the strong interference regime; achievable rate regions are established in the moderately strong and weak interference regimes. In addition, the asymptotic sum capacity is established in the limit of large relay link rate. In this case, the sum capacity improvement due to the digital link is bounded by half a bit when the interference link is weaker than a certain threshold, but the sum capacity improvement becomes unbounded as the interference link becomes stronger.

preprint2012arXiv

Incremental Relaying for the Gaussian Interference Channel with a Degraded Broadcasting Relay

This paper studies incremental relay strategies for a two-user Gaussian relay-interference channel with an in-band-reception and out-of-band-transmission relay, where the link between the relay and the two receivers is modelled as a degraded broadcast channel. It is shown that generalized hash-and-forward (GHF) can achieve the capacity region of this channel to within a constant number of bits in a certain weak relay regime, where the transmitter-to-relay link gains are not unboundedly stronger than the interference links between the transmitters and the receivers. The GHF relaying strategy is ideally suited for the broadcasting relay because it can be implemented in an incremental fashion, i.e., the relay message to one receiver is a degraded version of the message to the other receiver. A generalized-degree-of-freedom (GDoF) analysis in the high signal-to-noise ratio (SNR) regime reveals that in the symmetric channel setting, each common relay bit can improve the sum rate roughly by either one bit or two bits asymptotically depending on the operating regime, and the rate gain can be interpreted as coming solely from the improvement of the common message rates, or alternatively in the very weak interference regime as solely coming from the rate improvement of the private messages. Further, this paper studies an asymmetric case in which the relay has only a single link to one of the destinations. It is shown that with only one relay-destination link, the approximate capacity region can be established for a larger regime of channel parameters. Further, from a GDoF point of view, the sum-capacity gain due to the relay can now be thought of as coming from either signal relaying only, or interference forwarding only.

preprint2012arXiv

On the Capacity of the $K$-User Cyclic Gaussian Interference Channel

This paper studies the capacity region of a $K$-user cyclic Gaussian interference channel, where the $k$th user interferes with only the $(k-1)$th user (mod $K$) in the network. Inspired by the work of Etkin, Tse and Wang, who derived a capacity region outer bound for the two-user Gaussian interference channel and proved that a simple Han-Kobayashi power splitting scheme can achieve to within one bit of the capacity region for all values of channel parameters, this paper shows that a similar strategy also achieves the capacity region of the $K$-user cyclic interference channel to within a constant gap in the weak interference regime. Specifically, for the $K$-user cyclic Gaussian interference channel, a compact representation of the Han-Kobayashi achievable rate region using Fourier-Motzkin elimination is first derived, a capacity region outer bound is then established. It is shown that the Etkin-Tse-Wang power splitting strategy gives a constant gap of at most 2 bits in the weak interference regime. For the special 3-user case, this gap can be sharpened to 1 1/2 bits by time-sharing of several different strategies. The capacity result of the $K$-user cyclic Gaussian interference channel in the strong interference regime is also given. Further, based on the capacity results, this paper studies the generalized degrees of freedom (GDoF) of the symmetric cyclic interference channel. It is shown that the GDoF of the symmetric capacity is the same as that of the classic two-user interference channel, no matter how many users are in the network.

preprint2012arXiv

On the Capacity of the K-User Cyclic Gaussian Interference Channel

This paper studies the capacity region of a $K$-user cyclic Gaussian interference channel, where the $k$th user interferes with only the $(k-1)$th user (mod $K$) in the network. Inspired by the work of Etkin, Tse and Wang, which derived a capacity region outer bound for the two-user Gaussian interference channel and proved that a simple Han-Kobayashi power splitting scheme can achieve to within one bit of the capacity region for all values of channel parameters, this paper shows that a similar strategy also achieves the capacity region for the $K$-user cyclic interference channel to within a constant gap in the weak interference regime. Specifically, a compact representation of the Han-Kobayashi achievable rate region using Fourier-Motzkin elimination is first derived, a capacity region outer bound is then established. It is shown that the Etkin-Tse-Wang power splitting strategy gives a constant gap of at most two bits (or one bit per dimension) in the weak interference regime. Finally, the capacity result of the $K$-user cyclic Gaussian interference channel in the strong interference regime is also given.

preprint2011arXiv

Capacity of the Gaussian Relay Channel with Correlated Noises to Within a Constant Gap

This paper studies the relaying strategies and the approximate capacity of the classic three-node Gaussian relay channel, but where the noises at the relay and at the destination are correlated. It is shown that the capacity of such a relay channel can be achieved to within a constant gap of $\frac{1}{2} \log_2 3 = 0.7925$ bits using a modified version of the noisy network coding strategy, where the quantization level at the relay is set in a correlation dependent way. As a corollary, this result establishes that the conventional compress-and-forward scheme also achieves to within a constant gap to the capacity. In contrast, the decode-and-forward and the single-tap amplify-and-forward relaying strategies can have an infinite gap to capacity in the regime where the noises at the relay and at the destination are highly correlated, and the gain of the relay-to-destination link goes to infinity.
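The constant gap quoted above, one half of the base-2 logarithm of 3, can be checked numerically. This is only a sanity check of the stated value, not a derivation of the bound; the variable name is illustrative.

```python
import math

# Constant gap stated in the abstract: (1/2) * log2(3) bits.
gap_bits = 0.5 * math.log2(3)

print(round(gap_bits, 4))  # 0.7925
```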

preprint2011arXiv

Experimental demonstration of counterfactual quantum communication

Based on the principles of quantum mechanics, quantum cryptography provides an intriguing way to establish secret keys between remote parties, generally relying on actual transmission of signal particles. Surprisingly, an even more striking method, named `counterfactual quantum cryptography', was recently proposed by Noh; it enables key distribution in which particles carrying secret information are seemingly not transmitted through the quantum channel. We experimentally give here a faithful implementation by following the scheme with an on-table realization. Furthermore, we report an illustration on a 1 km fiber operating at telecom wavelength to verify its feasibility for extending to long distance. For both cases, high visibilities of better than 98% are maintained with active stabilization of interferometers, while a quantum bit error rate around 5.5% is attained after the 1 km channel.

preprint2011arXiv

On Noisy Network Coding for a Gaussian Relay Chain Network with Correlated Noises

Noisy network coding, which elegantly combines the conventional compress-and-forward relaying strategy and ideas from network coding, has recently drawn much attention for its simplicity and optimality in achieving to within a constant gap of the capacity of the multisource multicast Gaussian network. The constant-gap result, however, applies only to Gaussian relay networks with independent noises. This paper investigates the application of noisy network coding to networks with correlated noises. By focusing on a four-node Gaussian relay chain network with a particular noise correlation structure, it is shown that noisy network coding can no longer achieve to within a constant gap of capacity with the choice of Gaussian inputs and Gaussian quantization. The cut-set bound of the relay chain network in this particular case, however, can be achieved to within half a bit by a simple concatenation of a correlation-aware noisy network coding strategy and a decode-and-forward scheme.

preprint2010arXiv

Room temperature one-dimensional polariton condensate in a ZnO microwire

A cavity-polariton, formed due to the strong coupling between an exciton and a cavity mode, is one of the most promising composite bosons for realizing macroscopic spontaneous coherence at high temperature. To date, most polariton quantum degeneracy experiments have been conducted in complicated two-dimensional (2D) planar microcavities. The role of dimensionality in coherent quantum degeneracy of a composite bosonic system of exciton polaritons remains mysterious. Here we report the first experimental observation of a one-dimensional (1D) polariton condensate in a ZnO microwire at room temperature. The massive occupation of the polariton ground state above a distinct pump power threshold is clearly demonstrated by using angle-resolved spectroscopy under non-resonant excitation. The power threshold is one order of magnitude lower than that of the Mott transition. Furthermore, a well-defined far field emission pattern from the polariton condensate mode is observed, manifesting the coherence build-up in the condensed polariton system.

preprint2009arXiv

Fractal plasmonic metamaterials for subwavelength imaging

We show that a metallic plate with fractal-shaped slits can be homogenized as a plasmonic metamaterial with plasmon frequency dictated by the fractal geometry. Owing to the all-dimensional subwavelength nature of the fractal pattern, our system supports both transverse-electric and transverse-magnetic surface plasmons. As a result, this structure can be employed to focus light sources with all-dimensional subwavelength resolutions and enhanced field strengths. Microwave experiments reveal that the best achievable resolution is only, and simulations demonstrate that similar effects can be realized at infrared frequencies with appropriate designs.

preprint1996arXiv

An extension of the characteristic angle method to the easy-plane spin-3/2 ferromagnet

The Characteristic Angle (CA) method [Lei Zhou and Ruibao Tao, J. Phys. A, 27, 5599] developed previously for the easy-plane spin-1 magnetic systems has been successfully extended to the spin-3/2 case. A compact form of the CA spin-3/2 operator transformation is given, then the ground state energy, the magnon dispersion relation and the spontaneous magnetization are discussed for an easy-plane spin-3/2 ferromagnet by using the CA method. Comparisons with the old theoretical methods are made at the end.

44 published item(s)

Improving LLM Reasoning with Homophily-aware Structural and Semantic Text-Attributed Graph Compression

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Sequential Structure and Control Co-design of Lightweight Precision Stages with Active control of flexible modes

ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer

Automatic detection of multilevel communities: scalable and resolution-limit-free

Control Co-design of Actively Controlled Lightweight Structures for High-acceleration Precision Motion Systems

Economical Precise Manipulation and Auto Eye-Hand Coordination with Binocular Visual Reinforcement Learning

Half a Dozen Real-World Applications of Evolutionary Multitasking, and More

Learning Prototype via Placeholder for Zero-shot Recognition

Lung Swapping Autoencoder: Learning a Disentangled Structure-texture Representation of Chest Radiographs

Self-Sensing Hysteresis-Type Bearingless Motor

Goal-Oriented Gaze Estimation for Zero-Shot Learning

Information Bottleneck Constrained Latent Bidirectional Embedding for Zero-Shot Learning

PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency

ASLFeat: Learning Local Features of Accurate Shape and Localization

BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks

D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features

Dynamics of charged dust in the orbit of Venus

End-to-end Optimized Video Compression with MV-Residual Prediction

Joint Semantic Segmentation and Boundary Detection using Iterative Pyramid Contexts

KFNet: Learning Temporal Camera Relocalization using Kalman Filtering

Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency

Tunable Graphene Split-Ring Resonators

A systematic survey of the dynamics of Uranus Trojans

Evolution of Cooperation on Temporal Networks

Revealing the spin optics of conics

Analytic derivation of electrostrictive tensors and their application to optical force density calculations

Tailor the functionalities of metasurfaces: From perfect absorption to phase modulation

Full-range Gate-controlled Terahertz Phase Modulations with Graphene Metasurfaces

High-performance THz metamaterial absorber

A real-time QKD system based on FPGA

Realizing optical pulling force using chirality

Uplink Multicell Processing with Limited Backhaul via Per-Base-Station Successive Interference Cancellation

Gaussian Z-Interference Channel with a Relay Link: Achievability Region and Asymptotic Sum Capacity

Incremental Relaying for the Gaussian Interference Channel with a Degraded Broadcasting Relay

On the Capacity of the $K$-User Cyclic Gaussian Interference Channel

On the Capacity of the K-User Cyclic Gaussian Interference Channel

Capacity of the Gaussian Relay Channel with Correlated Noises to Within a Constant Gap

Experimental demonstration of counterfactual quantum communication

On Noisy Network Coding for a Gaussian Relay Chain Network with Correlated Noises

Room temperature one-dimensional polariton condensate in a ZnO microwire

Fractal plasmonic metamaterials for subwavelength imaging

An extension of the characteristic angle method to the easy-plane spin-3/2 ferromagnet