Source author record

Zheng Zhu

Zheng Zhu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Catalog footprint

What is connected

56 works

27 topics

4 close collaborators


Preprint · 2026 · arXiv

Coordinated Pandemic Control with Large Language Model Agents as Policymaking Assistants

Effective pandemic control requires timely and coordinated policymaking across administrative regions that are intrinsically interdependent. However, human-driven responses are often fragmented and reactive, with policies formulated in isolation and adjusted only after outbreaks escalate, undermining proactive intervention and global pandemic mitigation. To address this challenge, here we propose a large language model (LLM) multi-agent policymaking framework that supports coordinated and proactive pandemic control across regions. Within our framework, each administrative region is assigned an LLM agent as an AI policymaking assistant. The agent reasons over region-specific epidemiological dynamics while communicating with other agents to account for cross-regional interdependencies. By integrating real-world data, a pandemic evolution simulator, and structured inter-agent communication, our framework enables agents to jointly explore counterfactual intervention scenarios and synthesize coordinated policy decisions through a closed-loop simulation process. We validate the proposed framework using state-level COVID-19 data from the United States between April and December 2020, together with real-world mobility records and observed policy interventions. Compared with real-world pandemic outcomes, our approach reduces cumulative infections and deaths by up to 63.7% and 40.1%, respectively, at the individual state level, and by 39.0% and 27.0%, respectively, when aggregated across states. These results demonstrate that LLM multi-agent systems can enable more effective pandemic control with coordinated policymaking...

Preprint · 2026 · arXiv

RoboTransfer: Controllable Geometry-Consistent Video Diffusion for Manipulation Policy Transfer

The goal of general-purpose robotics is to create agents that can seamlessly adapt to and operate in diverse, unstructured human environments. Imitation learning has become a key paradigm for robotic manipulation, yet collecting large-scale and diverse demonstrations is prohibitively expensive. Simulators provide a cost-effective alternative, but the sim-to-real gap remains a major obstacle to scalability. We present RoboTransfer, a diffusion-based video generation framework for synthesizing robotic data. By leveraging cross-view feature interactions and globally consistent 3D geometry, RoboTransfer ensures multi-view geometric consistency while enabling fine-grained control over scene elements, such as background editing and object replacement. Extensive experiments demonstrate that RoboTransfer produces videos with superior geometric consistency and visual fidelity. Furthermore, policies trained on this synthetic data exhibit enhanced generalization to novel, unseen scenarios. Project page: https://horizonrobotics.github.io/robot_lab/robotransfer.

Preprint · 2026 · arXiv

Spatial Multi-Task Learning for Breast Cancer Molecular Subtype Prediction from Single-Phase DCE-MRI

Accurate molecular subtype classification is essential for personalized breast cancer treatment, yet conventional immunohistochemical analysis relies on invasive biopsies and is prone to sampling bias. Although dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) enables non-invasive tumor characterization, clinical workflows typically acquire only single-phase post-contrast images to reduce scan time and contrast agent dose. In this study, we propose a spatial multi-task learning framework for breast cancer molecular subtype prediction from clinically practical single-phase DCE-MRI. The framework simultaneously predicts estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2) status, and the Ki-67 proliferation index -- biomarkers that collectively define molecular subtypes. The architecture integrates a deep feature extraction network with multi-scale spatial attention to capture intratumoral and peritumoral characteristics, together with a region-of-interest weighting module that emphasizes the tumor core, rim, and surrounding tissue. Multi-task learning exploits biological correlations among biomarkers through shared representations with task-specific prediction branches. Experiments on a dataset of 960 cases (886 internal cases split 7:1:2 for training/validation/testing, and 74 external cases evaluated via five-fold cross-validation) demonstrate that the proposed method achieves an AUC of 0.893, 0.824, and 0.857 for ER, PR, and HER2 classification, respectively, and a mean absolute error of 8.2% for Ki-67 regression, significantly outperforming radiomics and single-task deep learning baselines. These results indicate the feasibility of accurate, non-invasive molecular subtype prediction using standard imaging protocols.

Preprint · 2026 · arXiv

TokenSeg: Efficient 3D Medical Image Segmentation via Hierarchical Visual Token Compression

Three-dimensional medical image segmentation is a fundamental yet computationally demanding task due to the cubic growth of voxel processing and the redundant computation on homogeneous regions. To address these limitations, we propose TokenSeg, a boundary-aware sparse token representation framework for efficient 3D medical volume segmentation. Specifically, (1) we design a multi-scale hierarchical encoder that extracts 400 candidate tokens across four resolution levels to capture both global anatomical context and fine boundary details; (2) we introduce a boundary-aware tokenizer that combines VQ-VAE quantization with importance scoring to select 100 salient tokens, over 60% of which lie near tumor boundaries; and (3) we develop a sparse-to-dense decoder that reconstructs full-resolution masks through token reprojection, progressive upsampling, and skip connections. Extensive experiments on a 3D breast DCE-MRI dataset comprising 960 cases demonstrate that TokenSeg achieves state-of-the-art performance with 94.49% Dice and 89.61% IoU, while reducing GPU memory and inference latency by 64% and 68%, respectively. To verify the generalization capability, our evaluations on MSD cardiac and brain MRI benchmark datasets demonstrate that TokenSeg consistently delivers optimal performance across heterogeneous anatomical structures. These results highlight the effectiveness of anatomically informed sparse representation for accurate and efficient 3D medical image segmentation.
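The Dice and IoU scores quoted in this abstract are standard overlap metrics for segmentation masks. The sketch below is not taken from the TokenSeg paper; it is a minimal pure-Python illustration, with a hypothetical helper name (`dice_and_iou`) and toy masks, of how the two metrics are usually computed for one binary mask pair. For a single pair the two are linked by Dice = 2·IoU / (1 + IoU); scores averaged over many cases need not satisfy that identity exactly.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as flat lists of 0/1.

    Hypothetical helper for illustration only; real pipelines typically
    operate on 3D arrays, flattened here for simplicity.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(pred)
    target_sum = sum(target)
    union = pred_sum + target_sum - intersection
    if union == 0:
        # Both masks are empty: treated as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# Toy masks (assumed values): 3 predicted voxels, 3 true voxels, 2 shared.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.4f}, IoU={iou:.4f}")
```

In this toy case the overlap is 2 voxels out of a union of 4, so IoU is 0.5 and Dice is 2/3, which agrees with the per-mask identity above.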

Preprint · 2023 · arXiv

Detachable Novel Views Synthesis of Dynamic Scenes Using Distribution-Driven Neural Radiance Fields

Representing and synthesizing novel views in real-world dynamic scenes from casual monocular videos is a long-standing problem. Existing solutions typically approach dynamic scenes by applying geometry techniques or utilizing temporal information between several adjacent frames without considering the underlying background distribution in the entire scene or the transmittance over the ray dimension, limiting their performance on static and occlusion areas. Our approach, Distribution-Driven neural radiance fields, offers high-quality view synthesis and a 3D solution to Detach the background from the entire Dynamic scene, which is called D⁴NeRF. Specifically, it employs a neural representation to capture the scene distribution in the static background and a 6D-input NeRF to represent dynamic objects, respectively. Each ray sample is given an additional occlusion weight to indicate the transmittance lying in the static and dynamic components. We evaluate D⁴NeRF on public dynamic scenes and our urban driving scenes acquired from an autonomous-driving dataset. Extensive experiments demonstrate that our approach outperforms previous methods in rendering texture details and motion areas while also producing a clean static background. Our code will be released at https://github.com/Luciferbobo/D4NeRF.

Preprint · 2022 · arXiv

A Simple Baseline for Multi-Camera 3D Object Detection

3D object detection with surrounding cameras has been a promising direction for autonomous driving. In this paper, we present SimMOD, a Simple baseline for Multi-camera Object Detection, to solve the problem. To incorporate multi-view information as well as build upon previous efforts on monocular 3D object detection, the framework is built on sample-wise object proposals and designed to work in a two-stage manner. First, we extract multi-scale features and generate the perspective object proposals on each monocular image. Second, the multi-view proposals are aggregated and then iteratively refined with multi-view and multi-scale visual features in the DETR3D-style. The refined proposals are end-to-end decoded into the detection results. To further boost the performance, we incorporate the auxiliary branches alongside the proposal generation to enhance the feature learning. Also, we design the methods of target filtering and teacher forcing to promote the consistency of two-stage training. We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD and achieve new state-of-the-art performance. Code will be available at https://github.com/zhangyp15/SimMOD.

Preprint · 2022 · arXiv

An Efficient Training Approach for Very Large Scale Face Recognition

Face recognition has achieved significant progress in the deep learning era thanks to ultra-large-scale, well-labeled datasets. However, training on such outsize datasets is time-consuming and takes up a lot of hardware resources. Therefore, designing an efficient training approach is indispensable. The heavy computational and memory costs mainly result from the million-level dimensionality of the fully connected (FC) layer. To this end, we propose a novel training approach, termed Faster Face Classification (F2C), to reduce time and cost without sacrificing performance. This method adopts a Dynamic Class Pool (DCP) for storing and updating the identity features dynamically, which can be regarded as a substitute for the FC layer. DCP saves time and cost because it is smaller than, and independent of, the full set of face identities. We further validate the proposed F2C method across several face benchmarks and private datasets, and show comparable recognition accuracy while training faster and at lower hardware cost than state-of-the-art FC-based methods. Moreover, our method is further improved by a well-designed dual data loader, comprising identity-based and instance-based loaders, which makes updating the DCP parameters more efficient.

Preprint · 2022 · arXiv

BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View

Autonomous driving perceives its surroundings for decision making, which is one of the most complex scenarios in visual perception. The success of paradigm innovation in solving the 2D object detection task inspires us to seek an elegant, feasible, and scalable paradigm for fundamentally pushing the performance boundary in this area. To this end, we contribute the BEVDet paradigm in this paper. BEVDet performs 3D object detection in Bird-Eye-View (BEV), where most target values are defined and route planning can be handily performed. We merely reuse existing modules to build its framework but substantially develop its performance by constructing an exclusive data augmentation strategy and upgrading the Non-Maximum Suppression strategy. In the experiment, BEVDet offers an excellent trade-off between accuracy and time-efficiency. As a fast version, BEVDet-Tiny scores 31.2% mAP and 39.2% NDS on the nuScenes val set. It is comparable with FCOS3D, but requires just 11% computational budget of 215.3 GFLOPs and runs 9.2 times faster at 15.6 FPS. Another high-precision version dubbed BEVDet-Base scores 39.3% mAP and 47.2% NDS, significantly exceeding all published results. With a comparable inference speed, it surpasses FCOS3D by a large margin of +9.8% mAP and +10.0% NDS. The source code is publicly available for further research at https://github.com/HuangJunJie2017/BEVDet .

Preprint · 2022 · arXiv

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

In this paper, we present BEVerse, a unified framework for 3D perception and prediction based on multi-camera systems. Unlike existing studies focusing on the improvement of single-task approaches, BEVerse features in producing spatio-temporal Birds-Eye-View (BEV) representations from multi-camera videos and jointly reasoning about multiple tasks for vision-centric autonomous driving. Specifically, BEVerse first performs shared feature extraction and lifting to generate 4D BEV representations from multi-timestamp and multi-view images. After the ego-motion alignment, the spatio-temporal encoder is utilized for further feature extraction in BEV. Finally, multiple task decoders are attached for joint reasoning and prediction. Within the decoders, we propose the grid sampler to generate BEV features with different ranges and granularities for different tasks. Also, we design the method of iterative flow for memory-efficient future prediction. We show that the temporal information improves 3D object detection and semantic map construction, while the multi-task learning can implicitly benefit motion prediction. With extensive experiments on the nuScenes dataset, we show that the multi-task BEVerse outperforms existing single-task methods on 3D object detection, semantic map construction, and motion prediction. Compared with the sequential paradigm, BEVerse also favors in significantly improved efficiency. The code and trained models will be released at https://github.com/zhangyp15/BEVerse.

Preprint · 2022 · arXiv

CAFE: Learning to Condense Dataset by Aligning Features

Dataset condensation aims at reducing the network training effort through condensing a cumbersome training set into a compact synthetic one. State-of-the-art approaches largely rely on learning the synthetic data by matching the gradients between the real and synthetic data batches. Despite the intuitive motivation and promising results, such gradient-based methods, by nature, easily overfit to a biased set of samples that produce dominant gradients, and thus lack global supervision of data distribution. In this paper, we propose a novel scheme to Condense dataset by Aligning FEatures (CAFE), which explicitly attempts to preserve the real-feature distribution as well as the discriminant power of the resulting synthetic set, lending itself to strong generalization capability to various architectures. At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales, while accounting for the classification of real samples. Our scheme is further backed up by a novel dynamic bi-level optimization, which adaptively adjusts parameter updates to prevent over-/under-fitting. We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art: on the SVHN dataset, for example, the performance gain is up to 11%. Extensive experiments and analyses verify the effectiveness and necessity of proposed designs.

Preprint · 2022 · arXiv

Crafting Better Contrastive Views for Siamese Representation Learning

Recent self-supervised contrastive learning methods greatly benefit from the Siamese structure that aims at minimizing distances between positive pairs. For high performance Siamese representation learning, one of the keys is to design good contrastive pairs. Most previous works simply apply random sampling to make different crops of the same image, which overlooks the semantic information that may degrade the quality of views. In this work, we propose ContrastiveCrop, which could effectively generate better crops for Siamese representation learning. Firstly, a semantic-aware object localization strategy is proposed within the training process in a fully unsupervised manner. This guides us to generate contrastive views which could avoid most false positives (i.e., object vs. background). Moreover, we empirically find that views with similar appearances are trivial for the Siamese model training. Thus, a center-suppressed sampling is further designed to enlarge the variance of crops. Remarkably, our method takes a careful consideration of positive pairs for contrastive learning with negligible extra training overhead. As a plug-and-play and framework-agnostic module, ContrastiveCrop consistently improves SimCLR, MoCo, BYOL, SimSiam by 0.4% ~ 2.0% classification accuracy on CIFAR-10, CIFAR-100, Tiny ImageNet and STL-10. Superior results are also achieved on downstream detection and segmentation tasks when pre-trained on ImageNet-1K.

Preprint · 2022 · arXiv

Crafting Monocular Cues and Velocity Guidance for Self-Supervised Multi-Frame Depth Learning

Self-supervised monocular methods can efficiently learn depth information of weakly textured surfaces or reflective objects. However, the depth accuracy is limited due to the inherent ambiguity in monocular geometric modeling. In contrast, multi-frame depth estimation methods improve the depth accuracy thanks to the success of Multi-View Stereo (MVS), which directly makes use of geometric constraints. Unfortunately, MVS often suffers from texture-less regions, non-Lambertian surfaces, and moving objects, especially in real-world video sequences without known camera motion and depth supervision. Therefore, we propose MOVEDepth, which exploits the MOnocular cues and VElocity guidance to improve multi-frame Depth learning. Unlike existing methods that enforce consistency between MVS depth and monocular depth, MOVEDepth boosts multi-frame depth learning by directly addressing the inherent problems of MVS. The key of our approach is to utilize monocular depth as a geometric priority to construct MVS cost volume, and adjust depth candidates of cost volume under the guidance of predicted camera velocity. We further fuse monocular depth and MVS depth by learning uncertainty in the cost volume, which results in a robust depth estimation against ambiguity in multi-view geometry. Extensive experiments show MOVEDepth achieves state-of-the-art performance: Compared with Monodepth2 and PackNet, our method relatively improves the depth accuracy by 20% and 19.8% on the KITTI benchmark. MOVEDepth also generalizes to the more challenging DDAD benchmark, relatively outperforming ManyDepth by 7.2%. The code is available at https://github.com/JeffWang987/MOVEDepth.

Preprint · 2022 · arXiv

Decoupled Multi-task Learning with Cyclical Self-Regulation for Face Parsing

This paper probes intrinsic factors behind typical failure cases (e.g. spatial inconsistency and boundary confusion) produced by the existing state-of-the-art method in face parsing. To tackle these problems, we propose a novel Decoupled Multi-task Learning with Cyclical Self-Regulation (DML-CSR) for face parsing. Specifically, DML-CSR designs a multi-task model which comprises face parsing, binary edge, and category edge detection. These tasks only share low-level encoder weights without high-level interactions between each other, enabling to decouple auxiliary modules from the whole network at the inference stage. To address spatial inconsistency, we develop a dynamic dual graph convolutional network to capture global contextual information without using any extra pooling operation. To handle boundary confusion in both single and multiple face scenarios, we exploit binary and category edge detection to jointly obtain generic geometric structure and fine-grained semantic clues of human faces. Besides, to prevent noisy labels from degrading model generalization during training, cyclical self-regulation is proposed to self-ensemble several model instances to get a new model and the resulting model then is used to self-distill subsequent models, through alternating iterations. Experiments show that our method achieves the new state-of-the-art performance on the Helen, CelebAMask-HQ, and Lapa datasets. The source code is available at https://github.com/deepinsight/insightface/tree/master/parsing/dml_csr.

Preprint · 2022 · arXiv

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

Recent progress has shown that large-scale pre-training using contrastive image-text pairs can be a promising alternative for high-quality visual representation learning from natural language supervision. Benefiting from a broader source of supervision, this new paradigm exhibits impressive transferability to downstream classification tasks and datasets. However, the problem of transferring the knowledge learned from image-text pairs to more complex dense prediction tasks has barely been visited. In this work, we present a new framework for dense prediction by implicitly and explicitly leveraging the pre-trained knowledge from CLIP. Specifically, we convert the original image-text matching problem in CLIP to a pixel-text matching problem and use the pixel-text score maps to guide the learning of dense prediction models. By further using the contextual information from the image to prompt the language model, we are able to facilitate our model to better exploit the pre-trained knowledge. Our method is model-agnostic, which can be applied to arbitrary dense prediction systems and various pre-trained visual backbones including both CLIP models and ImageNet pre-trained models. Extensive experiments demonstrate the superior performance of our methods on semantic segmentation, object detection, and instance segmentation tasks. Code is available at https://github.com/raoyongming/DenseCLIP
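The pixel-text matching step described in this abstract can be pictured as a cosine similarity between every pixel embedding and every class text embedding. The sketch below is an illustration under stated assumptions, not the DenseCLIP implementation: the function names (`l2_normalize`, `pixel_text_score_map`) are hypothetical, the embeddings are toy values rather than CLIP outputs, and plain nested lists stand in for tensors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def pixel_text_score_map(pixel_embeddings, text_embeddings):
    """Cosine similarity between each pixel and each text embedding.

    pixel_embeddings: H x W x C nested lists (assumed image-encoder output).
    text_embeddings:  K x C nested lists (assumed text-encoder output,
                      one row per class prompt).
    Returns an H x W x K score map.
    """
    texts = [l2_normalize(t) for t in text_embeddings]
    score_map = []
    for row in pixel_embeddings:
        score_row = []
        for pixel in row:
            p = l2_normalize(pixel)
            # One similarity score per class prompt for this pixel.
            score_row.append([sum(a * b for a, b in zip(p, t)) for t in texts])
        score_map.append(score_row)
    return score_map


# Toy data (assumed): a 1 x 2 "image" with 2-dim features and 2 class prompts.
pixels = [[[1.0, 0.0], [0.0, 2.0]]]
texts = [[3.0, 0.0], [0.0, 1.0]]
scores = pixel_text_score_map(pixels, texts)
print(scores)
```

Each pixel ends up with one score per class prompt, so the result has the shape of a coarse segmentation logit map; the abstract states that such score maps are used to guide the learning of dense prediction models.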

Preprint · 2022 · arXiv

Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors

Domain Adaptation of Black-box Predictors (DABP) aims to learn a model on an unlabeled target domain supervised by a black-box predictor trained on a source domain. It does not require access to both the source-domain data and the predictor parameters, thus addressing the data privacy and portability issues of standard domain adaptation. Existing DABP approaches mostly rely on model distillation from the black-box predictor, i.e., training the model with its noisy target-domain predictions, which however inevitably introduces the confirmation bias accumulated from the prediction noises. To mitigate such bias, we propose a new method, named BETA, to incorporate knowledge distillation and noisy label learning into one coherent framework. This is enabled by a new divide-to-adapt strategy. BETA divides the target domain into an easy-to-adapt subdomain with less noise and a hard-to-adapt subdomain. Then it deploys mutually-teaching twin networks to filter the predictor errors for each other and improve them progressively, from the easy to hard subdomains. As such, BETA effectively purifies the noisy labels and reduces error accumulation. We theoretically show that the target error of BETA is minimized by decreasing the noise ratio of the subdomains. Extensive experiments demonstrate BETA outperforms existing methods on all DABP benchmarks, and is even comparable with the standard domain adaptation methods that use the source-domain data.

Preprint · 2022 · arXiv

Doped Mott Insulators in the Triangular Lattice Hubbard Model

We investigate the evolution of the Mott insulators in the triangular lattice Hubbard Model, as a function of hole doping $δ$ in both the strong and intermediate coupling limits. Using the advanced density matrix renormalization group (DMRG) method, at light hole doping $δ\lesssim 10\%$, we find a significant difference between strong and intermediate couplings. Notably, at intermediate coupling an unusual metallic state emerges, with short ranged spin correlations but long ranged spin-chirality order. Moreover, no clear Fermi surface or wave-vector is observed, this chiral metal also exhibits staggered loop current, which breaks the translational symmetry. These features disappear on increasing interaction strength or on further doping. At strong coupling, the 120 degree magnetic order of the insulating magnet persists for light doping, and produces hole pockets with a well defined Fermi surface. On further doping, $δ\approx 10\%\sim 20\%$ SDW order and coherent hole Fermi pockets are found at both strong and intermediate couplings. At even higher doping $δ\gtrsim 20\%$, the SDW order is suppressed and the spin-singlet Cooper pair correlations are simultaneously enhanced. We also briefly comment on the strong particle-hole asymmetry of the model.

Preprint · 2022 · arXiv

FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders

Face recognition, as one of the most successful applications in artificial intelligence, has been widely used in security, administration, advertising, and healthcare. However, the privacy issues of public face datasets have attracted increasing attention in recent years. Previous works simply mask most areas of faces or synthesize samples using generative models to construct privacy-preserving face datasets, which overlooks the trade-off between privacy protection and data utility. In this paper, we propose a novel framework, FaceMAE, in which face privacy and recognition performance are considered simultaneously. Firstly, randomly masked face images are used to train the reconstruction module in FaceMAE. We tailor the instance relation matching (IRM) module to minimize the distribution gap between real faces and FaceMAE reconstructed ones. During the deployment phase, we use trained FaceMAE to reconstruct images from masked faces of unseen identities without extra training. The risk of privacy leakage is measured based on face retrieval between reconstructed and original datasets. Experiments show that the identities of reconstructed images are difficult to retrieve. We also perform sufficient privacy-preserving face recognition on several public face datasets (i.e. CASIA-WebFace and WebFace260M). Compared to the previous state of the art, FaceMAE consistently reduces the error rate by at least 50% on LFW, CFP-FP and AgeDB.

Preprint · 2022 · arXiv

HFT: Lifting Perspective Representations via Hybrid Feature Transformation

Autonomous driving requires accurate and detailed Bird's Eye View (BEV) semantic segmentation for decision making, which is one of the most challenging tasks for high-level scene perception. Feature transformation from frontal view to BEV is the pivotal technology for BEV semantic segmentation. Existing works can be roughly classified into two categories, i.e., Camera model-Based Feature Transformation (CBFT) and Camera model-Free Feature Transformation (CFFT). In this paper, we empirically analyze the vital differences between CBFT and CFFT. The former transforms features based on the flat-world assumption, which may cause distortion of regions lying above the ground plane. The latter is limited in the segmentation performance due to the absence of geometric priors and time-consuming computation. In order to reap the benefits and avoid the drawbacks of CBFT and CFFT, we propose a novel framework with a Hybrid Feature Transformation module (HFT). Specifically, we decouple the feature maps produced by HFT for estimating the layout of outdoor scenes in BEV. Furthermore, we design a mutual learning scheme to augment hybrid transformation by applying feature mimicking. Notably, extensive experiments demonstrate that with negligible extra overhead, HFT achieves a relative improvement of 13.3% on the Argoverse dataset and 16.8% on the KITTI 3D Object datasets compared to the best-performing existing method. The codes are available at https://github.com/JiayuZou2020/HFT.

Preprint · 2022 · arXiv

Modeling Ride-Sourcing Matching and Pickup Processes based on Additive Gaussian Process Models

Matching and pickup processes are core features of ride-sourcing services. Previous studies have adopted abundant analytical models to depict the two processes and obtain operational insights; while the goodness of fit between models and data was dismissed. To simultaneously consider the fitness between models and data and analytically tractable formations, we propose a data-driven approach based on the additive Gaussian Process Model (AGPM) for ride-sourcing market modeling. The framework is tested based on real-world data collected in Hangzhou, China. We fit analytical models, machine learning models, and AGPMs, in which the number of matches or pickups are used as outputs and spatial, temporal, demand, and supply covariates are utilized as inputs. The results demonstrate the advantages of AGPMs in recovering the two processes in terms of estimation accuracy. Furthermore, we illustrate the modeling power of AGPM by utilizing the trained model to design and estimate idle vehicle relocation strategies.

Preprint · 2022 · arXiv

MVSTER: Epipolar Transformer for Efficient Multi-View Stereo

Learning-based Multi-View Stereo (MVS) methods warp source images into the reference camera frustum to form 3D volumes, which are fused as a cost volume to be regularized by subsequent networks. The fusing step plays a vital role in bridging 2D semantics and 3D spatial associations. However, previous methods utilize extra networks to learn 2D information as fusing cues, underusing 3D spatial correlations and bringing additional computation costs. Therefore, we present MVSTER, which leverages the proposed epipolar Transformer to learn both 2D semantics and 3D spatial associations efficiently. Specifically, the epipolar Transformer utilizes a detachable monocular depth estimator to enhance 2D semantics and uses cross-attention to construct data-dependent 3D associations along epipolar line. Additionally, MVSTER is built in a cascade structure, where entropy-regularized optimal transport is leveraged to propagate finer depth estimations in each stage. Extensive experiments show MVSTER achieves state-of-the-art reconstruction performance with significantly higher efficiency: Compared with MVSNet and CasMVSNet, our MVSTER achieves 34% and 14% relative improvements on the DTU benchmark, with 80% and 51% relative reductions in running time. MVSTER also ranks first on Tanks&Temples-Advanced among all published works. Code is released at https://github.com/JeffWang987.

Preprint · 2022 · arXiv

Predict the Rover Mobility over Soft Terrain using Articulated Wheeled Bevameter

Robot mobility is critical for mission success, especially in soft or deformable terrains, where the complex wheel-soil interaction mechanics often leads to excessive wheel slip and sinkage, causing the eventual mission failure. To improve the success rate, online mobility prediction using vision, infrared imaging, or model-based stochastic methods have been used in the literature. This paper proposes an on-board mobility prediction approach using an articulated wheeled bevameter that consists of a force-controlled arm and an instrumented bevameter (with force and vision sensors) as its end-effector. The proposed bevameter, which emulates the traditional terramechanics tests such as pressure-sinkage and shear experiments, can measure contact parameters ahead of the rover's body in real-time, and predict the slip and sinkage of supporting wheels over the probed region. Based on the predicted mobility, the rover can select a safer path in order to avoid dangerous regions such as those covered with quicksand. Compared to the literature, our proposed method can avoid the complicated terramechanics modeling and time-consuming stochastic prediction; it can also mitigate the inaccuracy issues arising in non-contact vision-based methods. We also conduct multiple experiments to validate the proposed approach.

preprint2022arXiv

Proposal for asymmetric photoemission and tunneling spectroscopies in quantum simulators of the triangular-lattice Fermi-Hubbard model

Recent realization of well-controlled quantum simulators of the triangular-lattice Fermi-Hubbard model, including triangular optical lattices loaded with ultracold fermions and heterostructures of transition-metal dichalcogenides, as well as more advanced techniques to probe them, pave the way for studying frustrated Fermi-Hubbard physics. Here, we theoretically predict asymmetric photoemission and tunneling spectroscopies for a lightly hole-doped and electron-doped triangular Mott antiferromagnet, and reveal two distinct types of magnetic polarons: a lightly renormalized quasiparticle with the same momentum as the spin background and a heavily renormalized quasiparticle with a shifted momentum and a nearly flat band, using both analytical and unbiased numerical methods. We propose that these theoretical findings be verified in frustrated optical lattices and Moiré superlattices by probing various observables including the spectral function, the density of states, the energy dispersion and the quasiparticle weight. Moreover, we reveal the asymmetric response of the spin background against charge doping, demonstrating that the interplay between the local spin and charge degrees of freedom plays a vital role in doped triangular Mott antiferromagnets.

preprint2022arXiv

Reliable Label Correction is a Good Booster When Learning with Extremely Noisy Labels

Learning with noisy labels has aroused much research interest since data annotations, especially for large-scale datasets, may be inevitably imperfect. Recent approaches recast the task as a semi-supervised learning problem by dividing training samples into clean and noisy sets. This paradigm, however, is prone to significant degeneration under heavy label noise, as the number of clean samples is too small for conventional methods to behave well. In this paper, we introduce a novel framework, termed LC-Booster, to explicitly tackle learning under extreme noise. The core idea of LC-Booster is to incorporate label correction into the sample selection, so that more purified samples, through reliable label correction, can be utilized for training, thereby alleviating confirmation bias. Experiments show that LC-Booster advances state-of-the-art results on several noisy-label benchmarks, including CIFAR-10, CIFAR-100, Clothing1M and WebVision. Remarkably, under the extreme 90% noise ratio, LC-Booster achieves 92.9% and 48.4% accuracy on CIFAR-10 and CIFAR-100, surpassing state-of-the-art methods by a large margin.

preprint2022arXiv

Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search

In this paper, we propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search. Differentiable architecture search (DARTS) acquires the optimal architectures by optimizing the architecture parameters with gradient descent, which significantly reduces the search cost. However, the magnitude of architecture parameters updated by gradient descent fails to reveal the actual operation importance to the task performance and therefore harms the effectiveness of obtained architectures. By contrast, we propose to evaluate the direct influence of operations on validation accuracy. To deal with the complex relationships between supernet components, we leverage Shapley value to quantify their marginal contributions by considering all possible combinations. Specifically, we iteratively optimize the supernet weights and update the architecture parameters by evaluating operation contributions via Shapley value, so that the optimal architectures are derived by selecting the operations that contribute significantly to the tasks. Since the exact computation of Shapley value is NP-hard, the Monte-Carlo sampling based algorithm with early truncation is employed for efficient approximation, and the momentum update mechanism is adopted to alleviate fluctuation of the sampling process. Extensive experiments on various datasets and various search spaces show that our Shapley-NAS outperforms the state-of-the-art methods by a considerable margin with light search cost. The code is available at https://github.com/Euphoria16/Shapley-NAS.git
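
The sampling idea in the abstract above can be illustrated with a minimal sketch of permutation-based Monte Carlo Shapley estimation. It is not the Shapley-NAS implementation: the function name `shapley_monte_carlo`, the `value` callback (standing in for something like validation accuracy of a coalition of operations) and the `truncation_gap` rule are all assumptions made for illustration, and the momentum update mentioned in the abstract is omitted.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Optional


def shapley_monte_carlo(
    players: List[str],
    value: Callable[[FrozenSet[str]], float],
    num_samples: int = 200,
    truncation_gap: float = 0.0,
    seed: Optional[int] = None,
) -> Dict[str, float]:
    """Estimate Shapley values by averaging marginal gains over random orderings."""
    rng = random.Random(seed)
    full_value = value(frozenset(players))
    totals = {p: 0.0 for p in players}
    for _ in range(num_samples):
        order = players[:]
        rng.shuffle(order)
        coalition: FrozenSet[str] = frozenset()
        prev = value(coalition)
        for p in order:
            # Hypothetical early-truncation rule: once the running coalition is
            # within truncation_gap of the full value, the remaining players in
            # this ordering are credited a marginal gain of zero.
            if truncation_gap > 0.0 and abs(full_value - prev) < truncation_gap:
                break
            coalition = coalition | {p}
            cur = value(coalition)
            totals[p] += cur - prev  # marginal contribution of p in this ordering
            prev = cur
    return {p: totals[p] / num_samples for p in players}


if __name__ == "__main__":
    # Toy additive game: each (hypothetical) operation adds a fixed amount.
    weights = {"conv3x3": 3.0, "skip": 1.0, "pool": 2.0}
    estimate = shapley_monte_carlo(
        list(weights), lambda s: sum(weights[p] for p in s), num_samples=50, seed=0
    )
    print(estimate)
```

In an additive game every ordering gives each player the same marginal gain, so the estimate matches the weights regardless of the sample count; for games with interactions, the estimate only converges as `num_samples` grows.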

preprint2022arXiv

Symmetric Mass Generation in the 1+1 Dimensional Chiral Fermion 3-4-5-0 Model

Lattice regularization of chiral fermions has been a long-standing problem in physics. In this work, we present the density matrix renormalization group (DMRG) simulation of the 3-4-5-0 model of (1+1)D chiral fermions with an anomaly-free chiral U(1) symmetry, which contains two left-moving and two right-moving fermions carrying U(1) charges 3,4 and 5,0, respectively. Following the Wang-Wen chiral fermion model, we realize the chiral fermions and their mirror partners on the opposite boundaries of a thin strip of a (2+1)D lattice model of a multi-layer Chern insulator, whose finite width implies the quantum system is effectively (1+1)D. By introducing two carefully designed sets of six-fermion local interactions to the mirror sector only, we demonstrate that the mirror fermions can be gapped out by the interaction beyond a critical strength without breaking the chiral U(1) symmetry, via the symmetric mass generation (SMG) mechanism. We show that the interaction-driven gapping transition is in the Berezinskii-Kosterlitz-Thouless (BKT) universality class. We determine the evolution of Luttinger parameters before the transition, which confirms that the transition happens exactly at the point when the interaction term becomes marginal. As the mirror sector is gapped after the transition, we check that the fermions in the light chiral fermion sector remain gapless, which provides the desired lattice regularization of chiral fermions.

preprint2022arXiv

WebFace260M: A Benchmark for Million-Scale Deep Face Recognition

Face benchmarks empower the research community to train and evaluate high-performance face recognition systems. In this paper, we contribute a new million-scale recognition benchmark, containing uncurated 4M identities/260M faces (WebFace260M) and cleaned 2M identities/42M faces (WebFace42M) training data, as well as an elaborately designed time-constrained evaluation protocol. Firstly, we collect 4M name lists and download 260M faces from the Internet. Then, a Cleaning Automatically utilizing Self-Training (CAST) pipeline is devised to purify the tremendous WebFace260M, which is efficient and scalable. To the best of our knowledge, the cleaned WebFace42M is the largest public face recognition training set and we expect to close the data gap between academia and industry. Referring to practical deployments, the Face Recognition Under Inference Time conStraint (FRUITS) protocol and a new test set with rich attributes are constructed. Besides, we gather a large-scale masked face sub-set for biometrics assessment under COVID-19. For a comprehensive evaluation of face matchers, three recognition tasks are performed under standard, masked and unbiased settings, respectively. Equipped with this benchmark, we delve into million-scale face recognition problems. A distributed framework is developed to train face recognition models efficiently without tampering with the performance. Enabled by WebFace42M, we reduce the failure rate on the challenging IJB-C set by 40% and rank 3rd among 430 entries on NIST-FRVT. Even 10% of the data (WebFace4M) shows superior performance compared with the public training sets. Furthermore, comprehensive baselines are established under the FRUITS-100/500/1000 milliseconds protocols. The proposed benchmark shows enormous potential on standard, masked and unbiased face recognition scenarios. Our WebFace260M website is https://www.face-benchmark.org.

preprint2022arXiv

Zero volume boundary for extension domains from Sobolev to $BV$

In this note, we prove that the boundary of a $(W^{1, p}, BV)$-extension domain is of volume zero under the assumption that the domain $\Omega$ is $1$-fat at almost every $x\in\partial\Omega$. In particular, the boundary of any planar $(W^{1, p}, BV)$-extension domain is of volume zero.

preprint2021arXiv

WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

In this paper, we contribute a new million-scale face benchmark containing noisy 4M identities/260M faces (WebFace260M) and cleaned 2M identities/42M faces (WebFace42M) training data, as well as an elaborately designed time-constrained evaluation protocol. Firstly, we collect a list of 4M names and download 260M faces from the Internet. Then, a Cleaning Automatically utilizing Self-Training (CAST) pipeline is devised to purify the tremendous WebFace260M, which is efficient and scalable. To the best of our knowledge, the cleaned WebFace42M is the largest public face recognition training set and we expect to close the data gap between academia and industry. Referring to practical scenarios, the Face Recognition Under Inference Time conStraint (FRUITS) protocol and a test set are constructed to comprehensively evaluate face matchers. Equipped with this benchmark, we delve into million-scale face recognition problems. A distributed framework is developed to train face recognition models efficiently without tampering with the performance. Empowered by WebFace42M, we achieve a 40% relative reduction in failure rate on the challenging IJB-C set, and rank 3rd among 430 entries on NIST-FRVT. Even 10% of the data (WebFace4M) shows superior performance compared with public training sets. Furthermore, comprehensive baselines are established on our rich-attribute test set under the FRUITS-100ms/500ms/1000ms protocol, including the MobileNet, EfficientNet, AttentionNet, ResNet, SENet, ResNeXt and RegNet families. The benchmark website is https://www.face-benchmark.org.

preprint2020arXiv

${\bf 2k_F}$ Density Wave Instability of Composite Fermi Liquid

We investigate the $2k_F$ density-wave instability of non-Fermi liquid states by combining exact diagonalization with renormalization group analysis. At the half-filled zeroth Landau level, we study the fate of the composite Fermi liquid in the presence of the mass anisotropy and mixed Landau level form factors. These two experimentally accessible knobs trigger a phase transition towards a unidirectional charge-density-wave state with a wavevector equal to $2k_F$ of the composite Fermi liquid. Based on exact diagonalization, we identify such a transition by examining both the energy spectra and the static structure factor of charge density-density correlations. Moreover, the renormalization group analysis reveals that gauge fluctuations render the non-Fermi liquid state unstable against density-wave orders, consistent with numerical observations. Possible experimental probes of the density-wave instability are also discussed.

preprint2020arXiv

Complex Phase Diagram of Doped XXZ Ladder: Localization and Pairing

How the ground state nature can be dramatically changed by the distinct underlying spin correlation is a central issue of doped Mott insulators. The two-leg XXZ ladder provides a prototypical spin background, which can be tuned from a long-range Néel order to a short-range "spin liquid" via the superexchange anisotropy, giving rise to a complex phase diagram at finite doping. Using the density matrix renormalization group method, we show that although the charge is always self-localized in the Néel ordered phase, a second insulating phase emerges, in which the doped holes become paired but remain localized while the transverse spin-spin correlation becomes short-ranged, making the Néel order classical. Only when the Néel order totally disappears by further reducing the anisotropy does the pairing become truly coherent, as characterized by a Luther-Emery state. In sharp contrast, the pairing is totally absent in the in-plane ferromagnetic XXZ regime, where a direct transition from the charge self-localization in the Néel ordered phase to a Fermi-gas-like state in the spin liquid phase is found. A consistent physical picture is briefly discussed.

preprint2020arXiv

Joint predictions of multi-modal ride-hailing demands: a deep multi-task multigraph learning-based approach

Ride-hailing platforms generally provide various service options to customers, such as solo ride services, shared ride services, etc. It is generally expected that demands for different service modes are correlated, and the prediction of demand for one service mode can benefit from historical observations of demands for other service modes. Moreover, an accurate joint prediction of demands for multiple service modes can help the platforms better allocate and dispatch vehicle resources. Although there is a large stream of literature on ride-hailing demand predictions for one specific service mode, little effort has been devoted to joint predictions of ride-hailing demands for multiple service modes. To address this issue, we propose a deep multi-task multi-graph learning approach, which combines two components: (1) multiple multi-graph convolutional (MGC) networks for predicting demands for different service modes, and (2) multi-task learning modules that enable knowledge sharing across multiple MGC networks. More specifically, two multi-task learning structures are established. The first one is the regularized cross-task learning, which builds cross-task connections among the inputs and outputs of multiple MGC networks. The second one is the multi-linear relationship learning, which imposes a prior tensor normal distribution on the weights of various MGC networks. Although there are no concrete bridges between different MGC networks, the weights of these networks are constrained by each other and subject to a common prior distribution. Evaluated with the for-hire-vehicle datasets in Manhattan, we show that our proposed approach outperforms the benchmark algorithms in prediction accuracy for different ride-hailing modes.

preprint2020arXiv

Magnetic Field Induced Spin Liquids in S=1 Kitaev Honeycomb Model

We investigate the ground state properties of the spin S=1 Kitaev honeycomb model under a magnetic field based on the density matrix renormalization group (DMRG) calculation. With the time reversal symmetry breaking due to the magnetic field, a gapped Kitaev spin liquid is identified for both ferromagnetic (FM) and antiferromagnetic (AFM) Kitaev couplings. The topological nature of such Kitaev spin liquid is manifested by the nearly quantized Wilson loop, degeneracy in the entanglement spectra and existence of edge modes. While the FM Kitaev spin liquid is destroyed by a weaker magnetic field $H_*^\text{FM}$, the AFM one demonstrates a robustness up to an order of magnitude larger critical field $H_*^\text{AFM}$. Moreover, an intermediate nonmagnetic phase appears only for the AFM case at larger fields, $H_*^\text{AFM} < H < H_{**}^\text{AFM}$, before the transition to a high-field polarized paramagnet. The stability of the Kitaev spin liquid against the Heisenberg interactions is also examined. Our findings may further inspire the investigation of recently proposed S=1 Kitaev materials.

preprint2020arXiv

Modeling indoor-level non-pharmaceutical interventions during the COVID-19 pandemic: a pedestrian dynamics-based microscopic simulation approach

Mathematical modeling of epidemic spreading has been widely adopted to estimate the threats of epidemic diseases (e.g., the COVID-19 pandemic) as well as to evaluate epidemic control interventions. Indoor places are considered a significant source of epidemic spreading risk, but existing widely used epidemic spreading models are usually of limited use for indoor places, since the dynamic changes in physical distance between people are ignored and the empirical features of essential and non-essential travel are not differentiated. In this paper, we introduce a pedestrian-based epidemic spreading model that is capable of modeling indoor transmission risks of diseases during people's social activities. Taking advantage of the before-and-after mobility data from the University of Maryland COVID-19 Impact Analysis Platform, we find that people tend to spend more time in grocery stores once their travel frequencies are restricted to a low level. In other words, an increase in dwell time could balance the decrease in travel frequencies and satisfy people's demand. Based on the pedestrian-based model and the empirical evidence, combined non-pharmaceutical interventions from different operational levels are evaluated. Numerical simulations show that restrictions on people's travel frequency and on the opening hours of indoor places may not be universally effective in reducing the average infection risk for each pedestrian who visits the place. Entry limitations can be a widely effective alternative, although the decision-maker needs to balance the decrease in risky contacts against the increase in queue length outside the place, which may impede people from fulfilling their travel needs.

preprint2020arXiv

The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation

Although it is a fundamental component of training and inference, data processing has not, to the best of our knowledge, been systematically considered in the human pose estimation community. In this paper, we focus on this problem and find that the devil of human pose estimation evolution is in the biased data processing. Specifically, by investigating the standard data processing in state-of-the-art approaches, mainly including coordinate system transformation and keypoint format transformation (i.e., encoding and decoding), we find that the results obtained by the common flipping strategy are unaligned with the original ones in inference. Moreover, there is a statistical error in some keypoint format transformation methods. The two problems couple together, significantly degrade the pose estimation performance and thus lay a trap for the research community. This trap has given birth to many suboptimal remedies, which are always unreported, confusing but influential. By causing failures in reproduction and unfairness in comparison, the unreported remedies seriously impede technological development. To tackle this dilemma at the source, we propose Unbiased Data Processing (UDP), which consists of two technical aspects addressing the two aforementioned problems respectively (i.e., unbiased coordinate system transformation and unbiased keypoint format transformation). As a model-agnostic approach and a superior solution, UDP successfully pushes the performance boundary of human pose estimation and offers a higher and more reliable baseline for the research community. Code is publicly available at https://github.com/HuangJunJie2017/UDP-Pose

preprint2020arXiv

Widely Tunable Quantum Phase Transition from Moore-Read to Composite Fermi Liquid in Bilayer Graphene

We develop a proposal to realise a widely tunable and clean quantum phase transition in bilayer graphene between two paradigmatic fractionalized phases of matter: the Moore-Read fractional quantum Hall state and the composite Fermi liquid metal. This transition can be realized at total fillings $ν=\pm 3+1/2$ and the critical point can be controllably accessed by tuning either the interlayer electric bias or the perpendicular magnetic field values over a wide range of parameters. We study the transition numerically within a model that contains all leading single particle corrections to the band-structure of bilayer graphene and includes the fluctuations between the $n=0$ and $n=1$ cyclotron orbitals of its zeroth Landau level to delineate the most favorable region of parameters to experimentally access this unconventional critical point. We also find evidence for a new anisotropic gapless phase stabilized near the level crossing of $n=0/1$ orbits.

preprint2019arXiv

Deformations of Bi-conformal Energy and a new Characterization of Quasiconformality

The concept of hyperelastic deformations of bi-conformal energy is developed as an extension of quasiconformality. These are homeomorphisms $h:X \to Y$ between domains $ X, Y \subset \mathbb R^n$ of the Sobolev class $W^{1,n}_{loc} (X, Y)$ whose inverse $f =h^{-1}:Y \to X$ also belongs to $W^{1,n}_{loc}(Y, X)$. Thus the paper opens new topics in Geometric Function Theory with connections to mathematical models of Nonlinear Elasticity. In seeking differences and similarities with quasiconformal mappings we examine closely the modulus of continuity of deformations of bi-conformal energy. This leads us to a new characterization of quasiconformality. Specifically, it is observed that quasiconformal mappings behave locally at every point like radial stretchings. Without going into detail, if a quasiconformal map $h$ admits a function $ϕ$ as its optimal modulus of continuity at a point $x_0$, then $f = h^{-1}$ admits the inverse function $ψ= ϕ^{-1}$ as its modulus of continuity at $y_0 = h(x_0)$. That is to say, a poor continuity of $h$ at a given point $x_0$ is always compensated by a better continuity of $f$ at $y_0$, and vice versa. Such a gain/loss property, seemingly overlooked by many authors, is actually characteristic of quasiconformal mappings. It turns out that the elastic deformations of bi-conformal energy are very different in this respect. Unexpectedly, such a map may have the same optimal modulus of continuity as its inverse deformation. In line with Hooke's Law, when trying to restore the original shape of the body (by the inverse transformation) the modulus of continuity may neither be improved nor become worse. However, examples to confirm this phenomenon are far from being obvious. We hope that our examples will eventually attract interest in materials science, particularly in mathematical models of hyperelasticity.

preprint2019arXiv

Pointwise inequalities for Sobolev functions on outward cuspidal domains

We show that the first order Sobolev spaces on cuspidal symmetric domains can be characterized via pointwise inequalities. In particular, they coincide with the Hajlasz-Sobolev spaces.

preprint2019arXiv

Predicting origin-destination ride-sourcing demand with a spatio-temporal encoder-decoder residual multi-graph convolutional network

With the rapid development of mobile-internet technologies, on-demand ride-sourcing services have become increasingly popular and largely reshaped the way people travel. Demand prediction is one of the most fundamental components in supply-demand management systems of ride-sourcing platforms. With accurate short-term prediction of origin-destination (OD) demand, the platforms can make precise and timely decisions on real-time matching, idle vehicle reallocations, ride-sharing vehicle routing, etc. Compared to zone-based demand prediction, which has been examined by many previous studies, OD-based demand prediction is more challenging. This is mainly due to the complicated spatial and temporal dependencies among the demand of different OD pairs. To overcome this challenge, we propose the Spatio-Temporal Encoder-Decoder Residual Multi-Graph Convolutional network (ST-ED-RMGC), a novel deep learning model for predicting ride-sourcing demand of various OD pairs. Firstly, the model constructs OD graphs, which utilize adjacency matrices to characterize the non-Euclidean pair-wise geographical and semantic correlations among different OD pairs. Secondly, based on the constructed graphs, a residual multi-graph convolutional (RMGC) network is designed to encode the context-aware spatial dependencies, and a long short-term memory (LSTM) network is used to encode the temporal dependencies, into a dense vector space. Finally, we reuse the RMGC networks to decode the compressed vector back to OD graphs and predict the future OD demand. Through extensive experiments on the for-hire-vehicles datasets in Manhattan, New York City, we show that our proposed deep learning framework outperforms state-of-the-art methods by a significant margin.

preprint2019arXiv

Tackling Challenges in Seebeck Coefficient Measurement of Ultra-High Resistance Samples with an AC Technique

The Seebeck coefficient is a widely studied semiconductor property. Conventional Seebeck coefficient measurements are based on DC voltage measurement. Normally this is performed on samples with low resistances, below the level of a few MΩ. Meanwhile, certain semiconductors are highly intrinsic and resistive; many examples can be found among optical and photovoltaic materials. The hybrid halide perovskites that have gained extensive attention recently are a good example. Few credible studies exist on the Seebeck coefficient of CH3NH3PbI3, for example. We report here an AC-technique-based Seebeck coefficient measurement, which makes high-quality voltage measurements on samples with resistances up to 100 GΩ. This is achieved through a specifically designed setup that enhances sample isolation and reduces meter loading. As a demonstration, we performed a Seebeck coefficient measurement of a CH3NH3PbI3 thin film in the dark and found S = +550 µV/K. This property of the material has not been successfully studied before.

preprint2016arXiv

Best-case performance of quantum annealers on native spin-glass benchmarks: How chaos can affect success probabilities

Recent tests performed on the D-Wave Two quantum annealer have revealed no clear evidence of speedup over conventional silicon-based technologies. Here, we present results from classical parallel-tempering Monte Carlo simulations combined with isoenergetic cluster moves of the archetypal benchmark problem, an Ising spin glass, on the native chip topology. Using realistic uncorrelated noise models for the D-Wave Two quantum annealer, we study the best-case resilience, i.e., the probability that the ground-state configuration is not affected by random fields and random-bond fluctuations found on the chip. We thus compute classical upper-bound success probabilities for different types of disorder used in the benchmarks and predict that an increase in the number of qubits will require either error correction schemes or a drastic reduction of the intrinsic noise found in these devices. We outline strategies to develop robust as well as hard benchmarks for quantum annealing devices and for any other computing paradigm affected by noise.

preprint2016arXiv

borealis - A generalized global update algorithm for Boolean optimization problems

Optimization problems with Boolean variables that fall into the nondeterministic polynomial (NP) class are of fundamental importance in computer science, mathematics, physics and industrial applications. Most notably, solving constraint-satisfaction problems, which are related to spin-glass-like Hamiltonians in physics, remains a difficult numerical task. As such, there has been great interest in designing efficient heuristics to solve these computationally difficult problems. Inspired by parallel tempering Monte Carlo in conjunction with the rejection-free isoenergetic cluster algorithm developed for Ising spin glasses, we present a generalized global update optimization heuristic that can be applied to different NP-complete problems with Boolean variables. The global cluster updates allow for a wide-spread sampling of phase space, thus considerably speeding up optimization. By carefully tuning the pseudo-temperature (needed to randomize the configurations) of the problem, we show that the method can efficiently tackle optimization problems with over-constraints or on topologies with a large site-percolation threshold. We illustrate the efficiency of the heuristic on paradigmatic optimization problems, such as the maximum satisfiability problem and the vertex cover problem.
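
For readers unfamiliar with parallel tempering, which the abstract above names as an inspiration, the replica-exchange step can be sketched as follows. This shows only the standard Metropolis swap rule between two inverse temperatures, not the borealis heuristic itself; the function names `swap_probability` and `attempt_swaps` are hypothetical, and treating the pseudo-temperature as an ordinary inverse temperature is an assumption.

```python
import math
import random
from typing import List


def swap_probability(beta_a: float, beta_b: float, energy_a: float, energy_b: float) -> float:
    """Metropolis probability of exchanging the configurations held at beta_a and beta_b."""
    # Ratio of Boltzmann weights after/before the exchange:
    # exp((beta_a - beta_b) * (energy_a - energy_b)), capped at 1.
    exponent = (beta_a - beta_b) * (energy_a - energy_b)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def attempt_swaps(betas: List[float], energies: List[float], rng: random.Random) -> List[int]:
    """Sweep over neighbouring temperatures once and try to exchange replicas.

    energies[r] is the current energy of replica r. The returned list `order`
    satisfies: order[k] is the replica that sits at betas[k] after the sweep.
    """
    order = list(range(len(betas)))
    for k in range(len(betas) - 1):
        a, b = order[k], order[k + 1]
        p = swap_probability(betas[k], betas[k + 1], energies[a], energies[b])
        if rng.random() < p:
            order[k], order[k + 1] = b, a
    return order


if __name__ == "__main__":
    rng = random.Random(0)
    # The colder slot (beta = 2.0) currently holds the higher-energy replica,
    # so the exchange is accepted with probability 1.
    print(attempt_swaps([2.0, 1.0], [-1.0, -3.0], rng))
```

The rule always accepts an exchange that moves the lower-energy configuration to the colder temperature and accepts the reverse only with exponentially small probability, which is what lets configurations diffuse across temperatures while each temperature keeps its own Boltzmann distribution.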

preprint2016arXiv

Evolutionary Approaches to Optimization Problems in Chimera Topologies

Chimera graphs define the topology of one of the first commercially available quantum computers. A variety of optimization problems have been mapped to this topology to evaluate the behavior of quantum enhanced optimization heuristics in relation to other optimizers, being able to efficiently solve problems classically to use them as benchmarks for quantum machines. In this paper we investigate for the first time the use of Evolutionary Algorithms (EAs) on Ising spin glass instances defined on the Chimera topology. Three genetic algorithms (GAs) and three estimation of distribution algorithms (EDAs) are evaluated over $1000$ hard instances of the Ising spin glass constructed from Sidon sets. We focus on determining whether the information about the topology of the graph can be used to improve the results of EAs and on identifying the characteristics of the Ising instances that influence the success rate of GAs and EDAs.

preprint2016arXiv

Quasiparticle collapsing in an anisotropic $t$-$J$ ladder

Quasiparticle collapsing is a central issue in the study of strongly correlated electron systems. In the one-dimensional case, the quasiparticle collapsing in a form of spin-charge separation has been well established, but the problem remains elusive in dimensions higher than one. By using density matrix renormalization group (DMRG) algorithm, we show that in an anisotropic two-leg $t$-$J$ ladder, an injected single hole behaves like a well-defined quasiparticle in the strong rung limit, but undergoes a "phase transition" with the effective mass diverging at a quantum critical point (QCP) towards the isotropic limit. After the transition, the quasiparticle collapses into a composite object of a self-localized charge (holon) and a deconfined spin-1/2 (spinon), accompanied by a substantially enhanced binding energy between two holes. A phase diagram of multi-leg ladders is further obtained, which extrapolates the QCP towards the two-dimensional limit. The underlying novel mechanism generic for any dimensions is also discussed.

preprint2016arXiv

Strengths and weaknesses of weak-strong cluster problems: A detailed overview of state-of-the-art classical heuristics vs quantum approaches

To date, a conclusive detection of quantum speedup remains elusive. Recently, a team at Google Inc. [V. S. Denchev et al., Phys. Rev. X 6, 031015 (2016)] proposed a weak-strong cluster model tailored to have tall and narrow energy barriers separating local minima, with the aim to highlight the value of finite-range tunneling. More precisely, results from quantum Monte Carlo simulations, as well as the D-Wave 2X quantum annealer, scale considerably better than state-of-the-art simulated annealing simulations. Moreover, the D-Wave 2X quantum annealer is $\sim 10^8$ times faster than simulated annealing on conventional computer hardware for problems with approximately $10^3$ variables. Here, an overview of different sequential, nontailored, as well as specialized tailored algorithms on the Google instances is given. We show that the quantum speedup is limited to sequential approaches and study the typical complexity of the benchmark problems using insights from the study of spin glasses.

preprint2015arXiv

Charge modulation as fingerprints of phase-string triggered interference

Charge order appears to be a ubiquitous phenomenon in doped Mott insulators, which is currently under intense experimental and theoretical investigation, particularly in the high-$T_c$ cuprates. This phenomenon is conventionally understood in terms of Hartree-Fock type mean field theory. Here we demonstrate a mechanism for charge modulation which is rooted in the many-particle quantum physics arising in the strong coupling limit. Specifically, we consider the problem of a single hole in a bipartite $t-J$ ladder. As a remnant of the fermion signs, the hopping hole picks up subtle phases depending on the fluctuating spins, the so-called phase string effect. We demonstrate the presence of charge modulations in the density matrix renormalization group solutions which disappear when the phase strings are switched off. This form of charge modulation can be understood analytically in a path-integral language, showing that the phase strings give rise to constructive interferences leading to self-localization. When the latter occurs, left- and right-moving propagating modes emerge inside the localization volume and their interference is responsible for the real space charge modulation.

preprint2015arXiv

Efficient Cluster Algorithm for Spin Glasses in Any Space Dimension

Spin systems with frustration and disorder are notoriously difficult to study both analytically and numerically. While the simulation of ferromagnetic statistical mechanical models benefits greatly from cluster algorithms, these accelerated dynamics methods remain elusive for generic spin-glass-like systems. Here we present a cluster algorithm for Ising spin glasses that works in any space dimension and speeds up thermalization by at least one order of magnitude at temperatures where thermalization is typically difficult. Our isoenergetic cluster moves are based on the Houdayer cluster algorithm for two-dimensional spin glasses and lead to a speedup over conventional state-of-the-art methods that increases with the system size. We illustrate the benefits of the isoenergetic cluster moves in two and three space dimensions, as well as the nonplanar chimera topology found in the D-Wave Inc.~quantum annealing machine.

preprint2015arXiv

Exact sign structure of the $t$-$J$ chain and the single hole ground state

Injecting a single hole into a one-dimensional Heisenberg spin chain is probably the simplest case of doping a Mott insulator. The motion of such a single hole will generally induce a many-body phase shift, which can be identified by an exact sign structure of the model known as the phase string. We show that the sign structure is nontrivial even in this simplest problem, which is responsible for the essential properties of Mott physics. We find that the characteristic momentum structure, the Luttinger liquid behavior, and the quantum phase interference of the hole under a periodic boundary condition can all be attributed to it. We use density matrix renormalization group (DMRG) numerical simulations to make a comparative study of the $t$-$J$ chain and a model in which the sign structure is switched off. We further show that the key DMRG results can be reproduced by a variational wave function incorporating the correct sign structure. Physical implications of the sign structure for doped Mott insulators in general are also discussed.

preprint2015arXiv

Seeking Quantum Speedup Through Spin Glasses: The Good, the Bad, and the Ugly

There has been considerable progress in the design and construction of quantum annealing devices. However, a conclusive detection of quantum speedup over traditional silicon-based machines remains elusive, despite multiple careful studies. In this work we outline strategies to design hard tunable benchmark instances based on insights from the study of spin glasses - the archetypal random benchmark problem for novel algorithms and optimization devices. We propose to complement head-to-head scaling studies that compare quantum annealing machines to state-of-the-art classical codes with an approach that compares the performance of different algorithms and/or computing architectures on different classes of computationally hard tunable spin-glass instances. The advantage of such an approach lies in having to only compare the performance hit felt by a given algorithm and/or architecture when the instance complexity is increased. Furthermore, we propose a methodology that might not directly translate into the detection of quantum speedup, but might elucidate whether quantum annealing has a "quantum advantage" over corresponding classical algorithms like simulated annealing. Our results on a 496 qubit D-Wave Two quantum annealing device are compared to recently-used state-of-the-art thermal simulated annealing codes.

preprint2015arXiv

Spin and charge modulations in a single hole doped Hubbard ladder -- verification with optical lattice experiments

We show that pronounced modulations in spin and charge densities can be induced by the insertion of a single hole in an otherwise half-filled 2-leg Hubbard ladder. Accompanying these modulations is a loosely bound structure of the doped charge with a spin-1/2, in contrast to the tightly bound case where such modulations are absent. These behaviors are caused by the interference of the Berry phases associated with a string of flipped spins (or "phase strings") left behind as a hole travels through a spin bath with a short-range anti-ferromagnetic order. The key role of the phase strings is also reflected in how the system responds to increasing spin polarization, increasing the on-site repulsion, addition of a second hole, and increasing asymmetry between intra- and inter-chain hopping. Remarkably, all these properties persist down to ladders as short as $\sim 10$ sites. They can therefore be studied in cold atom experiments using the recently developed fermion microscope.

preprint2015arXiv

The length scale measurements of the Fractional quantum Hall state on cylinder

Once the fractional quantum Hall (FQH) state for a finite-size system is put on the surface of a cylinder, the distance between the two ends with open boundary conditions can be tuned by varying the aspect ratio $γ$. It scales linearly with the system size and therefore has a larger adjustable range than that on a disk. The previous study of the quasi-hole tunneling amplitude on a disk in Ref.~\cite{Zk2011} indicates that the tunneling amplitudes have a scaling behavior as a function of the tunneling distance, and that the scaling exponents are related to the scaling dimension and the charge of the transported quasiparticles. However, the scaling behaves poorly due to the narrow range of the tunneling distance on a disk. Here we systematically study the quasiparticle tunneling amplitudes of the Laughlin state in the cylinder geometry, which shows a much better scaling behavior. In particular, there are crossover behaviors at two length scales when the two open edges are close to each other. These lengths are also reflected in the bipartite entanglement and the electron Green's function as either a singularity or a crossover. These two critical length scales of the edge-edge distance, $L_x^{c_1}$ and $L_x^{c_2}$, are found to be related to the dimension reduction and the back scattering point, respectively.

preprint2015arXiv

Variational wave function for an anisotropic single-hole-doped $t$-$J$ ladder

Based on three general guiding principles, i.e., the no-double-occupancy constraint, an accurate description of antiferromagnetism at half-filling, and the precise sign structure of the $t$-$J$ model, a new ground state wave function has been constructed recently [Weng, New J. Phys. 13, 103039 (2011)]. In this paper, we specifically study this kind of variational ground state for the one-hole-doped anisotropic two-leg $t$-$J$ ladder using the variational Monte Carlo (VMC) method. The results are then systematically compared with those recently obtained by density matrix renormalization group (DMRG) simulation. An excellent agreement is found between the VMC and DMRG results, including a "quantum critical point" at the anisotropy parameter $α=α_c\approx0.7$ (with the parameters $t/J=3$), and the emergence of charge modulation and momentum (Fermi point) reconstruction at $α>α_c$ due to the quantum interference of the sign structure. In particular, the wave function indicates that a Landau quasiparticle description remains valid at $α<α_c$ but fails at $α>α_c$ due to the breakdown of the one-to-one correspondence of momentum and translational symmetry of the hole. The explicit form of the wave function provides a direct understanding of how the many-body strong correlation effect takes place non-perturbatively in a doped Mott insulator, which sheds interesting light on the two-dimensional case where the same type of wave function was proposed to describe the cuprate superconductor.

preprint2014arXiv

Boolean decision problems with competing interactions on scale-free networks: Equilibrium and nonequilibrium behavior in an external bias

We study the equilibrium and nonequilibrium properties of Boolean decision problems with competing interactions on scale-free networks in an external bias (magnetic field). Previous studies at zero field have shown a remarkable equilibrium stability of Boolean variables (Ising spins) with competing interactions (spin glasses) on scale-free networks. When the exponent that describes the power-law decay of the connectivity of the network is strictly larger than 3, the system undergoes a spin-glass transition. However, when the exponent is equal to or less than 3, the glass phase is stable for all temperatures. First, we perform finite-temperature Monte Carlo simulations in a field to test the robustness of the spin-glass phase and show that the system has a spin-glass phase in a field, i.e., exhibits a de Almeida-Thouless line. Furthermore, we study avalanche distributions when the system is driven by a field at zero temperature to test if the system displays self-organized criticality. Numerical results suggest that avalanches (damage) can spread across the whole system with nonzero probability when the decay exponent of the interaction degree is less than or equal to 2, i.e., that Boolean decision problems on scale-free networks with competing interactions can be fragile when not in thermal equilibrium.

preprint2014arXiv

Nature of strong hole pairing in doped Mott antiferromagnets

The Cooper pairing instability in a Fermi liquid is well understood within BCS theory, but the pairing mechanism for doped Mott insulators remains elusive. Previously it has been shown by the density matrix renormalization group (DMRG) method that a single doped hole is always self-localized due to the quantum destructive interference of the phase string signs hidden in the $t$-$J$ ladders. Here we report a DMRG investigation of hole binding in the same model, where a novel pairing-glue scheme beyond the BCS realm is discovered. Specifically, we show that, in addition to spin pairing due to superexchange interaction, the strong frustration of the phase string signs on the kinetic energy gets effectively removed by pairing the charges, which results in strong binding of two holes. By contrast, if the phase string signs are switched off artificially, the pairing strength diminishes significantly even if the superexchange coupling remains the same. In the latter, unpaired holes behave like coherent quasiparticles with pairing drastically weakened, whose sole origin may be attributed to the resonating-valence-bond (RVB) pairing of spins. Such a non-BCS pairing mechanism is therefore beyond the RVB picture and may shed important light on the high-$T_c$ cuprate superconductors.

preprint2013arXiv

Self-Organized Criticality in Glassy Spin Systems Requires a Diverging Number of Neighbors

We investigate the conditions required for general spin systems with frustration and disorder to display self-organized criticality, a property which so far has been established only for the fully-connected infinite-range Sherrington-Kirkpatrick Ising spin-glass model [Phys. Rev. Lett. 83, 1034 (1999)]. Here we study both avalanche and magnetization jump distributions triggered by an external magnetic field, as well as internal field distributions in the short-range Edwards-Anderson Ising spin glass for various space dimensions between 2 and 8, as well as the fixed-connectivity mean-field Viana-Bray model. Our numerical results, obtained on systems of unprecedented size, demonstrate that self-organized criticality is recovered only in the strict limit of a diverging number of neighbors, and is not a generic property of spin-glass models in finite space dimensions.

preprint2012arXiv

Self-localization of a single hole in Mott antiferromagnets

A long-standing issue in the physics of strongly correlated electronic systems is whether the motion of a single hole in quantum antiferromagnets can be understood in terms of the quasiparticle picture. Very recently, investigations of this issue have come within experimental reach. Here we perform a large-scale density matrix renormalization group study, and provide the first unambiguous numerical evidence showing that in ladder systems, a single hole doped in the Mott antiferromagnet does not behave as a quasiparticle. Specifically, the injected hole is found to be always localized as long as the leg number is larger than one, with a vanishing quasiparticle weight and a localization length monotonically decreasing with the leg number. In addition, the single hole self-localization is insensitive to the parity (even-odd) of the leg number. Our findings may advance conceptual developments in different fields of condensed matter physics. First of all, the intriguing self-localization phenomenon is of pure strong correlation origin, free of extrinsic disorder. Therefore, it is in sharp contrast to the well-known Anderson localization and recently found many-body localization, where extrinsic disordered potentials play crucial roles. Second, they confirm the analytical predictions of the so-called phase string theory, suggesting that the phase string effect lies at the core of the physics of doped Mott antiferromagnets.

preprint2012arXiv

Strong correlation induced charge localization in antiferromagnets

The fate of an injected hole in a Mott antiferromagnet is an outstanding issue of strongly correlated physics. It provides important insights into doped Mott insulators closely related to high-temperature superconductivity in cuprates. Here, we report a systematic numerical study based on the density matrix renormalization group (DMRG). It reveals a remarkable novelty and surprise for the single hole's motion in otherwise well-understood Mott insulators. Specifically, we find that the charge of the hole is self-localized by a novel quantum interference mechanism purely of strong correlation origin, in contrast to Anderson localization due to disorder. The common belief in the quasiparticle picture is invalidated by the charge localization concomitant with spin-charge separation: the spin of the doped hole is found to remain a mobile object. Our findings unveil a new paradigm for doped Mott insulators that emerges already in the simplest single hole case.

56 published item(s)

Coordinated Pandemic Control with Large Language Model Agents as Policymaking Assistants

RoboTransfer: Controllable Geometry-Consistent Video Diffusion for Manipulation Policy Transfer

Spatial Multi-Task Learning for Breast Cancer Molecular Subtype Prediction from Single-Phase DCE-MRI

TokenSeg: Efficient 3D Medical Image Segmentation via Hierarchical Visual Token Compression

Detachable Novel Views Synthesis of Dynamic Scenes Using Distribution-Driven Neural Radiance Fields

A Simple Baseline for Multi-Camera 3D Object Detection

An Efficient Training Approach for Very Large Scale Face Recognition

BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

CAFE: Learning to Condense Dataset by Aligning Features

Crafting Better Contrastive Views for Siamese Representation Learning

Crafting Monocular Cues and Velocity Guidance for Self-Supervised Multi-Frame Depth Learning

Decoupled Multi-task Learning with Cyclical Self-Regulation for Face Parsing

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors

Doped Mott Insulators in the Triangular Lattice Hubbard Model

FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders

HFT: Lifting Perspective Representations via Hybrid Feature Transformation

Modeling Ride-Sourcing Matching and Pickup Processes based on Additive Gaussian Process Models

MVSTER: Epipolar Transformer for Efficient Multi-View Stereo

Predict the Rover Mobility over Soft Terrain using Articulated Wheeled Bevameter

Proposal for asymmetric photoemission and tunneling spectroscopies in quantum simulators of the triangular-lattice Fermi-Hubbard model

Reliable Label Correction is a Good Booster When Learning with Extremely Noisy Labels

Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search

Symmetric Mass Generation in the 1+1 Dimensional Chiral Fermion 3-4-5-0 Model

WebFace260M: A Benchmark for Million-Scale Deep Face Recognition

Zero volume boundary for extension domains from Sobolev to $BV$

WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

${\bf 2k_F}$ Density Wave Instability of Composite Fermi Liquid

Complex Phase Diagram of Doped XXZ Ladder: Localization and Pairing

Joint predictions of multi-modal ride-hailing demands: a deep multi-task multigraph learning-based approach

Magnetic Field Induced Spin Liquids in S=1 Kitaev Honeycomb Model

Modeling indoor-level non-pharmaceutical interventions during the COVID-19 pandemic: a pedestrian dynamics-based microscopic simulation approach

The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation

Widely Tunable Quantum Phase Transition from Moore-Read to Composite Fermi Liquid in Bilayer Graphene

Deformations of Bi-conformal Energy and a new Characterization of Quasiconformality

Pointwise inequalities for Sobolev functions on outward cuspidal domains

Predicting origin-destination ride-sourcing demand with a spatio-temporal encoder-decoder residual multi-graph convolutional network

Tackling Challenges in Seebeck Coefficient Measurement of Ultra-High Resistance Samples with an AC Technique

Best-case performance of quantum annealers on native spin-glass benchmarks: How chaos can affect success probabilities

borealis - A generalized global update algorithm for Boolean optimization problems

Evolutionary Approaches to Optimization Problems in Chimera Topologies

Quasiparticle collapsing in an anisotropic $t$-$J$ ladder

Strengths and weaknesses of weak-strong cluster problems: A detailed overview of state-of-the-art classical heuristics vs quantum approaches

Charge modulation as fingerprints of phase-string triggered interference

Efficient Cluster Algorithm for Spin Glasses in Any Space Dimension

Exact sign structure of the $t$-$J$ chain and the single hole ground state

Seeking Quantum Speedup Through Spin Glasses: The Good, the Bad, and the Ugly

Spin and charge modulations in a single hole doped Hubbard ladder -- verification with optical lattice experiments

The length scale measurements of the Fractional quantum Hall state on cylinder

Variational wave function for an anisotropic single-hole-doped $t$-$J$ ladder

Boolean decision problems with competing interactions on scale-free networks: Equilibrium and nonequilibrium behavior in an external bias

Nature of strong hole pairing in doped Mott antiferromagnets

Self-Organized Criticality in Glassy Spin Systems Requires a Diverging Number of Neighbors

Self-localization of a single hole in Mott antiferromagnets

Strong correlation induced charge localization in antiferromagnets