Researcher profile

Hao Shi

Hao Shi contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
18 works
0 followers
16 topics
4 close collaborators

Published work

18 published items

preprint · 2026 · arXiv

EgoEV-HandPose: Egocentric 3D Hand Pose Estimation and Gesture Recognition with Stereo Event Cameras

Egocentric 3D hand pose estimation and gesture recognition are essential for immersive augmented/virtual reality, human-computer interaction, and robotics. However, conventional frame-based cameras suffer from motion blur and limited dynamic range, while existing event-based methods are hindered by ego-motion interference, monocular depth ambiguity, and the lack of large-scale real-world stereo datasets. To overcome these limitations, we propose EgoEV-HandPose, an end-to-end framework for joint 3D bimanual pose estimation and gesture recognition from stereo event streams. Central to our approach is KeypointBEV, a flexible stereo fusion module that lifts features into a canonical bird's-eye-view space and employs an iterative reprojection-guided refinement loop to progressively resolve depth uncertainty and enforce kinematic consistency. In addition, we introduce EgoEVHands, the first large-scale real-world stereo event-camera dataset for egocentric hand perception, containing 5,419 annotated sequences with dense 3D/2D keypoints across 38 gesture classes under varying illumination. Extensive experiments demonstrate that EgoEV-HandPose achieves state-of-the-art performance with an MPJPE of 30.54mm and 86.87% Top-1 gesture recognition accuracy, significantly outperforming RGB-based stereo and prior event-camera methods, particularly in low-light and bimanual occlusion scenarios, thereby setting a new benchmark for event-based egocentric perception. The established dataset and source code will be publicly released at https://github.com/ZJUWang01/EgoEV-HandPose.
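The pose results in this list are reported as MPJPE (mean per-joint position error). The following is a minimal sketch of the standard definition, assuming predictions and ground truth are arrays of shape (frames, joints, 3) in millimetres. The function name `mpjpe` and the array layout are illustrative assumptions, not taken from the EgoEV-HandPose code.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error.

    Computes the Euclidean distance between each predicted joint and its
    ground-truth counterpart, then averages over all joints and frames.
    The result is in the same unit as the inputs (assumed millimetres).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Norm over the last axis (x, y, z) gives one distance per joint.
    per_joint_error = np.linalg.norm(pred - gt, axis=-1)
    return float(per_joint_error.mean())


# Hypothetical data: 2 frames, 21 hand joints, every joint off by (3, 4, 0) mm.
gt = np.zeros((2, 21, 3))
pred = gt + np.array([3.0, 4.0, 0.0])
error_mm = mpjpe(pred, gt)
```

Published evaluations often apply a root-joint or Procrustes alignment before computing this error. That step is omitted here.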

preprint · 2022 · arXiv

An Improved Automatic Modulation Classification Scheme Based on Adaptive Fusion Network

Due to the over-fitting problem caused by imbalanced samples, there is still room to improve the performance of data-driven automatic modulation classification (AMC) in noisy scenarios. By fully considering the signal characteristics, an AMC scheme based on an adaptive fusion network (AFNet) is proposed in this work. AFNet can extract and aggregate multi-scale spatial features of in-phase and quadrature (I/Q) signals intelligently, thus improving the feature representation capability. Moreover, a novel confidence-weighted loss function is proposed to address the imbalance issue, and it is implemented by a two-stage learning scheme. Through the two-stage learning, AFNet can focus on high-confidence samples with more valid information and extract effective representations, so as to improve the overall classification performance. In the simulations, the proposed scheme reaches an average accuracy of 62.66% over a wide range of SNRs, which outperforms other AMC models. The effects of the loss function on classification accuracy are further studied.

preprint · 2022 · arXiv

Annular Computational Imaging: Capture Clear Panoramic Images through Simple Lens

A Panoramic Annular Lens (PAL) composed of few lenses has great potential in panoramic surrounding sensing tasks for mobile and wearable devices because of its tiny size and large Field of View (FoV). However, the image quality of a tiny-volume PAL is confined by the optical limit due to the lack of lenses for aberration correction. In this paper, we propose an Annular Computational Imaging (ACI) framework to break the optical limit of light-weight PAL design. To facilitate learning-based image restoration, we introduce a wave-based simulation pipeline for panoramic imaging and tackle the synthetic-to-real gap through multiple data distributions. The proposed pipeline can be easily adapted to any PAL with design parameters and is suitable for loose-tolerance designs. Furthermore, we design the Physics Informed Image Restoration Network (PI2RNet), considering the physical priors of panoramic imaging and a single-pass physics-informed engine. At the dataset level, we create the DIVPano dataset, and extensive experiments on it illustrate that our proposed network sets the new state of the art in panoramic image restoration under spatially-variant degradation. In addition, the evaluation of the proposed ACI on a simple PAL with only 3 spherical lenses reveals the delicate balance between high-quality panoramic imaging and compact design. To the best of our knowledge, we are the first to explore Computational Imaging (CI) in PAL. Code and datasets are publicly available at https://github.com/zju-jiangqi/ACI-PI2RNet.

preprint · 2022 · arXiv

CSFlow: Learning Optical Flow via Cross Strip Correlation for Autonomous Driving

Optical flow estimation is an essential task in self-driving systems, which helps autonomous vehicles perceive temporal continuity information of surrounding scenes. The calculation of all-pair correlation plays an important role in many existing state-of-the-art optical flow estimation methods. However, the reliance on local knowledge often limits the model's accuracy under complex street scenes. In this paper, we propose a new deep network architecture for optical flow estimation in autonomous driving--CSFlow, which consists of two novel modules: Cross Strip Correlation module (CSC) and Correlation Regression Initialization module (CRI). CSC utilizes a striping operation across the target image and the attended image to encode global context into correlation volumes, while maintaining high efficiency. CRI is used to maximally exploit the global context for optical flow initialization. Our method has achieved state-of-the-art accuracy on the public autonomous driving dataset KITTI-2015. Code is publicly available at https://github.com/MasterHow/CSFlow.

preprint · 2022 · arXiv

DCAN: Diversified News Recommendation with Coverage-Attentive Networks

Self-attention-based models are widely used in news recommendation tasks. However, previous attention architectures do not constrain repeated information in the user's historical behavior, which limits the power of the hidden representation and leads to problems such as information redundancy and filter bubbles. To solve this problem, we propose a personalized news recommendation model called DCAN. It captures multi-grained user-news matching signals through news encoders and user encoders. We keep updating a coverage vector to track the history of news attention and augment the vector in four ways. We then feed the augmented coverage vector into the multi-head self-attention model to help adjust future attention, and add the coverage regulation to the loss function (CRL), which enables the recommendation system to give more weight to differentiated information. Extensive experiments on the Microsoft News Recommendation Dataset (MIND) show that our model significantly improves the diversity of news recommendations with minimal sacrifice in accuracy.

preprint · 2022 · arXiv

Efficient Human Pose Estimation via 3D Event Point Cloud

Human Pose Estimation (HPE) based on RGB images has developed rapidly thanks to deep learning. However, event-based HPE has not been fully studied, although it retains great potential for applications in extreme scenes and efficiency-critical conditions. In this paper, we are the first to estimate 2D human pose directly from a 3D event point cloud. We propose a novel representation of events, the rasterized event point cloud, which aggregates events at the same position within a small time slice. It maintains the 3D features from multiple statistical cues and significantly reduces memory consumption and computation complexity, and it proved to be efficient in our work. We then leverage the rasterized event point cloud as input to three different backbones, PointNet, DGCNN, and Point Transformer, with two linear layer decoders to predict the location of human keypoints. We find that, based on our method, PointNet achieves promising results with much faster speed, whereas Point Transformer reaches much higher accuracy, even close to previous event-frame-based methods. A comprehensive set of results demonstrates that our proposed method is consistently effective for these 3D backbone models in event-driven human pose estimation. Our method based on PointNet with 2048 input points achieves 82.46mm in MPJPE3D on the DHP19 dataset, with a latency of only 12.29ms on an NVIDIA Jetson Xavier NX edge computing platform, which makes it well suited to real-time detection with event cameras. Code is available at https://github.com/MasterHow/EventPointPose.
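The rasterization idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: events are taken to be rows of `(x, y, t, polarity)`, and the per-pixel statistics kept here (mean timestamp, summed polarity, event count) are guesses at what "multiple statistical cues" might look like, not the authors' actual feature set. The function name `rasterize_events` is hypothetical.

```python
import numpy as np


def rasterize_events(events, num_slices, t_start, t_end):
    """Merge events that share a pixel within the same time slice.

    events: array-like of rows (x, y, t, polarity), polarity in {-1, +1}.
    Returns one point per occupied (slice, y, x) cell with columns
    (x, y, mean_t, polarity_sum, count).
    """
    events = np.asarray(events, dtype=float)
    x = events[:, 0].astype(np.int64)
    y = events[:, 1].astype(np.int64)
    t = events[:, 2]
    p = events[:, 3]

    # Assign each event to one of num_slices equal-length time slices.
    slice_len = (t_end - t_start) / num_slices
    k = np.clip(((t - t_start) / slice_len).astype(np.int64), 0, num_slices - 1)

    # Events with the same (slice, y, x) key collapse into one point.
    keys = np.stack([k, y, x], axis=1)
    uniq, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)

    count = np.bincount(inverse, minlength=len(uniq))
    mean_t = np.bincount(inverse, weights=t, minlength=len(uniq)) / count
    pol_sum = np.bincount(inverse, weights=p, minlength=len(uniq))
    return np.column_stack([uniq[:, 2], uniq[:, 1], mean_t, pol_sum, count])


# Hypothetical toy stream: three events on pixel (1, 2), one on pixel (3, 4).
toy_events = [(1, 2, 0.10, 1), (1, 2, 0.20, -1), (1, 2, 0.90, 1), (3, 4, 0.15, 1)]
points = rasterize_events(toy_events, num_slices=2, t_start=0.0, t_end=1.0)
```

Four raw events become three points here, because the two early events on pixel (1, 2) fall in the same slice. On dense event streams this kind of merging is what would reduce the point count fed to a backbone such as PointNet.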

preprint · 2022 · arXiv

Language-specific Characteristic Assistance for Code-switching Speech Recognition

Dual-encoder structure successfully utilizes two language-specific encoders (LSEs) for code-switching speech recognition. Because LSEs are initialized by two pre-trained language-specific models (LSMs), the dual-encoder structure can exploit sufficient monolingual data and capture the individual language attributes. However, most existing methods have no language constraints on LSEs and underutilize language-specific knowledge of LSMs. In this paper, we propose a language-specific characteristic assistance (LSCA) method to mitigate the above problems. Specifically, during training, we introduce two language-specific losses as language constraints and generate corresponding language-specific targets for them. During decoding, we take the decoding abilities of LSMs into account by combining the output probabilities of two LSMs and the mixture model to obtain the final predictions. Experiments show that either the training or decoding method of LSCA can improve the model's performance. Furthermore, the best result can obtain up to 15.4% relative error reduction on the code-switching test set by combining the training and decoding methods of LSCA. Moreover, the system can process code-switching speech recognition tasks well without extra shared parameters or even retraining based on two pre-trained LSMs by using our method.

preprint · 2022 · arXiv

LF-VIO: A Visual-Inertial-Odometry Framework for Large Field-of-View Cameras with Negative Plane

Visual-inertial-odometry has attracted extensive attention in the field of autonomous driving and robotics. The size of the Field of View (FoV) plays an important role in Visual-Odometry (VO) and Visual-Inertial-Odometry (VIO), as a large FoV makes it possible to perceive a wide range of surrounding scene elements and features. However, when the field of the camera reaches the negative half plane, one cannot simply use [u,v,1]^T to represent the image feature points anymore. To tackle this issue, we propose LF-VIO, a real-time VIO framework for cameras with extremely large FoV. We leverage a three-dimensional vector with unit length to represent feature points, and design a series of algorithms to overcome this challenge. To address the scarcity of panoramic visual odometry datasets with ground-truth location and pose, we present the PALVIO dataset, collected with a Panoramic Annular Lens (PAL) system with an entire FoV of 360°x(40°-120°) and an IMU sensor. With a comprehensive variety of experiments, the proposed LF-VIO is verified on both the established PALVIO benchmark and a public fisheye camera dataset with a FoV of 360°x(0°-93.5°). LF-VIO outperforms state-of-the-art visual-inertial-odometry methods. Our dataset and code are made publicly available at https://github.com/flysoaryun/LF-VIO
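A small sketch of why the [u,v,1]^T form breaks down on the negative half plane, and how a unit-length vector avoids the problem. The helper names `normalized_plane` and `unit_bearing` are illustrative assumptions and do not come from the LF-VIO code. The input is a 3D point already expressed in the camera frame.

```python
import numpy as np


def normalized_plane(point):
    """Project a camera-frame point onto the z = 1 plane as [u, v, 1].

    Only defined for points in front of the camera (z > 0); a point at or
    behind the z = 0 plane has no valid representation of this form.
    """
    x, y, z = (float(c) for c in point)
    if z <= 0.0:
        raise ValueError("point lies on the negative half plane (z <= 0)")
    return np.array([x / z, y / z, 1.0])


def unit_bearing(point):
    """Represent the viewing ray as a unit-length 3D vector.

    Defined for every direction, including rays with z <= 0.
    """
    p = np.asarray(point, dtype=float)
    norm = np.linalg.norm(p)
    if norm == 0.0:
        raise ValueError("the camera centre has no bearing")
    return p / norm


# Hypothetical points: one in front of the camera, one behind the z = 0 plane.
front = normalized_plane([2.0, 4.0, 2.0])
behind = unit_bearing([0.0, 3.0, -4.0])
```

The second point would raise an error in `normalized_plane`, while `unit_bearing` returns a valid direction with a negative z component. Residuals defined on unit vectors need a different parameterization from the usual image-plane reprojection error, and this sketch does not cover it.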

preprint · 2022 · arXiv

Precision Many-Body Study of the Berezinskii-Kosterlitz-Thouless Transition and Temperature-Dependent Properties in the Two-Dimensional Fermi Gas

We perform large-scale, numerically exact calculations on the two-dimensional interacting Fermi gas with a contact attraction. Reaching much larger lattice sizes and lower temperatures than previously possible, we determine systematically the finite-temperature phase diagram of the Berezinskii-Kosterlitz-Thouless (BKT) transitions for interaction strengths ranging from BCS to crossover to BEC regimes. The evolutions of the pairing wavefunctions and the fermion and Cooper pair momentum distributions with temperature are accurately characterized. In the crossover regime, we find that the contact has a non-monotonic temperature dependence, first increasing as temperature is lowered, and then showing a slight decline below the BKT transition temperature to approach the ground-state value from above.

preprint · 2022 · arXiv

Solving 2D and 3D lattice models of correlated fermions -- combining matrix product states with mean field theory

Correlated electron states are at the root of many important phenomena including unconventional superconductivity (USC), where electron-pairing arises from repulsive interactions. Computing the properties of correlated electrons, such as the critical temperature $T_c$ for the onset of USC, efficiently and reliably from the microscopic physics with quantitative methods remains a major challenge for almost all models and materials. In this theoretical work we combine matrix product states (MPS) with static mean field (MF) to provide a solution to this challenge for quasi-one-dimensional (Q1D) systems: Two- and three-dimensional (2D/3D) materials comprised of weakly coupled correlated 1D fermions. This MPS+MF framework for the ground state and thermal equilibrium properties of Q1D fermions is developed and validated for attractive Hubbard systems first, and further enhanced via analytical field theory. We then deploy it to compute $T_c$ for superconductivity in 3D arrays of weakly coupled, doped and repulsive Hubbard ladders. The MPS+MF framework thus enables the reliable, quantitative and unbiased study of USC and high-$T_c$ superconductivity - and potentially many more correlated phases - in fermionic Q1D systems from microscopic parameters, in ways inaccessible to previous methods. It opens the possibility of designing deliberately optimized Q1D superconductors, from experiments in ultracold gases to synthesizing new materials.

preprint · 2022 · arXiv

Stripes and spin-density waves in the doped two-dimensional Hubbard model: ground state phase diagram

We determine the spin and charge orders in the ground state of the doped two-dimensional (2D) Hubbard model in its simplest form, namely with only nearest-neighbor hopping and on-site repulsion. At half-filling, the ground state is known to be an anti-ferromagnetic Mott insulator. Doping Mott insulators is believed to be relevant to the superconductivity observed in cuprates. A variety of candidates have been proposed for the ground state of the doped 2D Hubbard model. A recent work employing a combination of several state-of-the-art numerical many-body methods, established the stripe order as the ground state near $1/8$ doping at strong interactions. In this work, we apply one of these methods, the cutting-edge constrained-path auxiliary field quantum Monte Carlo method with self-consistently optimized gauge constraints, to systematically study the model as a function of doping and interaction strength. With careful finite size scaling based on large-scale computations, we map out the ground state phase diagram in terms of its spin and charge order. We find that modulated antiferromagnetic order persists from near half-filling to about $1/5$ doping. At lower interaction strengths or larger doping, these ordered states are best described as spin-density waves, with essentially delocalized holes and modest oscillations in charge correlations. When the charge correlations are stronger (large interaction or small doping), they are best described as stripe states, with the holes more localized near the node in the antiferromagnetic spin order. In both cases, we find that the wavelength in the charge correlations is consistent with so-called filled stripes in the pure Hubbard model.

preprint · 2020 · arXiv

A Pseudo-BCS Wavefunction from Density Matrix Decomposition: Application in Auxiliary-Field Quantum Monte Carlo

We present a method to construct pseudo-BCS wave functions from the one-body density matrix. The resulting many-body wave function, which can be produced for any fermion system, including those with purely repulsive interactions, has a number-projected BCS form, or antisymmetrized geminal power (AGP). Such wave functions provide a better ansatz for correlated fermion systems than a single Slater determinant, and often better than a linear combination of Slater determinants (for example from a truncated active space calculation). We describe a procedure to build such a wave function conveniently from a given reduced density matrix of the system, rather than from a mean-field solution (which gives a Slater determinant for repulsive interactions). The pseudo-BCS wave function thus obtained reproduces the density matrix or minimizes the difference between the input and resulting density matrices. One application of the pseudo-BCS wave function is in auxiliary-field quantum Monte Carlo (AFQMC) calculations as the trial wave function to control the sign/phase problem. AFQMC is often among the most accurate general methods for correlated fermion systems. We show that the pseudo-BCS form further reduces the constraint bias and leads to improved accuracy compared to the usual Slater determinant trial wave functions, using the two-dimensional Hubbard model as an example. Furthermore, the pseudo-BCS trial wave function allows a new systematically improvable self-consistent approach, with the pseudo-BCS trial wave function iteratively generated by AFQMC via the one-body density matrix.

preprint · 2020 · arXiv

Absence of superconductivity in the pure two-dimensional Hubbard model

We study the superconducting pairing correlations in the ground state of the doped Hubbard model -- in its original form without hopping beyond nearest neighbor or other perturbing parameters -- in two dimensions at intermediate to strong coupling and near optimal doping. The nature of such correlations has been a central question ever since the discovery of cuprate high-temperature superconductors. Despite unprecedented effort and tremendous progress in understanding the properties of this fundamental model, a definitive answer to whether the ground state is superconducting in the parameter regime most relevant to cuprates has proved exceedingly difficult to establish. In this work, we employ two complementary, state-of-the-art many-body computational methods, constrained path (CP) auxiliary-field quantum Monte Carlo (AFQMC) and density matrix renormalization group (DMRG) methods, deploying the most recent algorithmic advances in each. Systematic and detailed comparisons between the two methods are performed. The DMRG is extremely reliable on small width cylinders, where we use it to validate the AFQMC. The AFQMC is then used to study wide systems as well as fully periodic systems, to establish that we have reached the thermodynamic limit. The ground state is found to be non-superconducting in the moderate to strong coupling regime in the vicinity of optimal hole doping.

preprint · 2020 · arXiv

Ground-state properties of the hydrogen chain: insulator-to-metal transition, dimerization, and magnetic phases

Accurate and predictive computations of the quantum-mechanical behavior of many interacting electrons in realistic atomic environments are critical for the theoretical design of materials with desired properties, and require solving the grand-challenge problem of the many-electron Schrödinger equation. An infinite chain of equispaced hydrogen atoms is perhaps the simplest realistic model for a bulk material, embodying several central themes of modern condensed matter physics and chemistry, while retaining a connection to the paradigmatic Hubbard model. Here we report a combined application of cutting-edge computational methods to determine the properties of the hydrogen chain in its quantum-mechanical ground state. Varying the separation between the nuclei leads to a rich phase diagram, including a Mott phase with quasi long-range antiferromagnetic order, electron density dimerization with power-law correlations, an insulator-to-metal transition and an intricate set of intertwined magnetic orders.

preprint · 2020 · arXiv

Some Recent Developments in Auxiliary-Field Quantum Monte Carlo for Real Materials

The auxiliary-field quantum Monte Carlo (AFQMC) method is a general numerical method for correlated many-electron systems, which is being increasingly applied in lattice models, atoms, molecules, and solids. Here we introduce the theory and algorithm of the method specialized for real materials, and present several recent developments. We give a systematic exposition of the key steps of AFQMC, closely tracking the framework of a modern software library we are developing. The building of a Monte Carlo Hamiltonian, projecting to the ground state, sampling two-body operators, phaseless approximation, and measuring ground state properties are discussed in detail. An advanced implementation for multi-determinant trial wave functions is described which dramatically speeds up the algorithm and reduces the memory cost. We propose a self-consistent constraint for real materials, and discuss two flavors for its realization, either by coupling the AFQMC calculation to an effective independent-electron calculation, or via the natural orbitals of the computed one-body density matrix.

preprint · 2020 · arXiv

Stretching the limits of dynamic and quasi-static flow testing on limestone powders

Powders are a special class of granular matter due to the important role of cohesive forces. The flow behavior of powders depends on the flow states and stress and is therefore difficult to measure/quantify with only one experiment. In this study, the most commonly used characterization tests that cover a wide range of states are compared: (static, free surface) angle of repose, the (quasi-static, confined) ring shear steady state angle of internal friction, and the (dynamic, free surface) rotating drum flow angle are considered for free flowing, moderately and strongly cohesive limestone powders. The free flowing powder gives good agreement among all different situations (devices), while the moderately and strongly cohesive powders behave more interestingly. Starting from the flow angle in the rotating drum and going slower, one can extrapolate to the limit of zero rotation rate, but then observes that the angle of repose measured from the heap is considerably larger, possibly due to its special history. When we stretch the ring shear test to its lowest confining stress limit, the steady state angle of internal friction of the cohesive powder coincides with the flow angle (at free surface) in the zero rotation rate limit.

preprint · 2019 · arXiv

Direct comparison of many-body methods for realistic electronic Hamiltonians

A large collaboration carefully benchmarks 20 first principles many-body electronic structure methods on a test set of 7 transition metal atoms, and their ions and monoxides. Good agreement is attained between the 3 systematically converged methods, resulting in experiment-free reference values. These reference values are used to assess the accuracy of modern emerging and scalable approaches to the many-electron problem. The most accurate methods obtain energies indistinguishable from experimental results, with the agreement mainly limited by the experimental uncertainties. Comparison between methods enables a unique perspective on calculations of many-body systems of electrons.

preprint · 2019 · arXiv

Metal-Insulator and Magnetic Phase Diagram of Ca$_2$RuO$_4$ from Auxiliary Field Quantum Monte Carlo and Dynamical Mean Field Theory

Layered perovskite ruthenium oxides exhibit a striking series of metal-insulator and magnetic-nonmagnetic phase transitions easily tuned by temperature, pressure, epitaxy, and nonlinear drive. In this work, we combine results from two complementary state-of-the-art many-body methods, Auxiliary Field Quantum Monte Carlo and Dynamical Mean Field Theory, to determine the low-temperature phase diagram of Ca$_2$RuO$_4$. Both methods predict a low-temperature, pressure-driven metal-insulator transition accompanied by a ferromagnetic-antiferromagnetic transition. The properties of the ferromagnetic state vary non-monotonically with pressure and are dominated by the ruthenium $d_{xy}$ orbital, while the properties of the antiferromagnetic state are dominated by the $d_{xz}$ and $d_{yz}$ orbitals. Differences of detail in the predictions of the two methods are analyzed. This work is theoretically important as it presents the first application of the Auxiliary Field Quantum Monte Carlo method to an orbitally-degenerate system with both Mott and Hund's physics, and provides an important comparison of the Dynamical Mean Field and Auxiliary Field Quantum Monte Carlo methods.