Researcher profile

Shen Zhao

Shen Zhao contributes to research discovery and scholarly infrastructure.

Published work

7 published items

Preprint, 2024, arXiv

Explore Human Parsing Modality for Action Recognition

Multimodal action recognition methods have achieved high success using pose and RGB modalities. However, skeleton sequences lack appearance depiction and RGB images suffer from irrelevant noise due to modality limitations. To address this, we introduce the human parsing feature map as a novel modality, since it can selectively retain effective semantic features of the body parts while filtering out most irrelevant noise. We propose a new dual-branch framework called Ensemble Human Parsing and Pose Network (EPP-Net), which is the first to leverage both skeleton and human parsing modalities for action recognition. The first branch, the human pose branch, feeds robust skeletons into a graph convolutional network to model pose features, while the second branch, the human parsing branch, leverages depictive parsing feature maps to model parsing features via convolutional backbones. The two high-level features are combined through a late fusion strategy for better action recognition. Extensive experiments on the NTU RGB+D and NTU RGB+D 120 benchmarks consistently verify the effectiveness of the proposed EPP-Net, which outperforms existing action recognition methods. Our code is available at: https://github.com/liujf69/EPP-Net-Action.
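
The late fusion step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each branch outputs one score (logit) per action class and that fusion is a weighted average of per-class probabilities; the function names and the weight `alpha` are hypothetical and are not taken from the EPP-Net code.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(pose_logits, parsing_logits, alpha=0.5):
    """Fuse two branches by a weighted average of their class probabilities.

    alpha is a hypothetical weight on the pose branch; the paper's actual
    fusion rule may differ.
    """
    if len(pose_logits) != len(parsing_logits):
        raise ValueError("both branches must score the same set of classes")
    pose_probs = softmax(pose_logits)
    parsing_probs = softmax(parsing_logits)
    fused = [alpha * p + (1.0 - alpha) * q
             for p, q in zip(pose_probs, parsing_probs)]
    predicted_class = fused.index(max(fused))
    return predicted_class, fused


if __name__ == "__main__":
    label, probs = late_fusion([2.0, 1.0, 0.0], [0.0, 3.0, 0.0])
    print(label, [round(p, 3) for p in probs])
```

Because fusion happens on the outputs, each branch can be trained and evaluated separately before the scores are combined.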

Preprint, 2024, arXiv

Precision-controlled ultrafast electron microscope platforms. A case study: Multiple-order coherent phonon dynamics in 1T-TaSe$_2$ probed at 50 femtosecond - 10 femtometer scales

We report on the first detailed beam test attesting to the fundamental principle behind the development of high-current-efficiency ultrafast electron microscope systems where a radio-frequency cavity is incorporated as a condenser lens in the beam delivery system. To allow the experiment to be carried out with sufficient resolution to probe the performance at the emittance floor, a new cascade loop RF controller system is developed to reduce the RF noise floor. Temporal resolution of 50 femtoseconds in full-width-at-half-maximum and detection sensitivity better than 1% are demonstrated on exfoliated 1T-TaSe$_2$ layers where the multi-order edge-mode coherent phonon excitation is employed as the standard candle to benchmark the performance. The high temporal resolution and the significant visibility to very low dynamical contrast in diffraction signals give strong support to the working principle of the high-brightness beam delivery via phase-space manipulation in the electron microscope system.

Preprint, 2022, arXiv

Continual Object Detection via Prototypical Task Correlation Guided Gating Mechanism

Continual learning is a challenging real-world problem for constructing a mature AI system when data are provided in a streaming fashion. Despite recent progress in continual classification, research on continual object detection is impeded by the diverse sizes and numbers of objects in each image. Unlike previous works that tune the whole network for all tasks, in this work we present a simple and flexible framework for continual object detection via pRotOtypical taSk corrElaTion guided gaTing mechAnism (ROSETTA). Concretely, a unified framework is shared by all tasks while task-aware gates are introduced to automatically select sub-models for specific tasks. In this way, various knowledge can be successively memorized by storing the corresponding sub-model weights in this system. To make ROSETTA automatically determine which experience is available and useful, a prototypical task correlation guided Gating Diversity Controller (GDC) is introduced to adaptively adjust the diversity of gates for the new task based on class-specific prototypes. The GDC module computes a class-to-class correlation matrix to depict the cross-task correlation, and thereby activates more exclusive gates for the new task if a significant domain gap is observed. Comprehensive experiments on COCO-VOC, KITTI-Kitchen, class-incremental detection on VOC, and sequential learning of four tasks show that ROSETTA yields state-of-the-art performance on both task-based and class-based continual object detection.

Preprint, 2022, arXiv

Maximizing Unambiguous Velocity Range in Phase-contrast MRI with Multipoint Encoding

In phase-contrast magnetic resonance imaging (PC-MRI), the velocity of spins at a voxel is encoded in the image phase. The strength of the velocity encoding gradient offers a trade-off between the velocity-to-noise ratio (VNR) and the extent of phase aliasing. Phase differences provide invariance to an unknown background phase. Existing literature proposes processing a reduced set of phase difference equations, simplifying the phase unwrapping problem at the expense of VNR or unaliased range of velocities, or both. Here, we demonstrate that the fullest unambiguous range of velocities is a parallelepiped, which can be accessed by jointly processing all phase differences. The joint processing also maximizes the velocity-to-noise ratio. The simple understanding of the unambiguous parallelepiped provides the potential for analyzing new multi-point acquisitions for an enhanced range of unaliased velocities; two examples are given.

Preprint, 2022, arXiv

PCCT: Progressive Class-Center Triplet Loss for Imbalanced Medical Image Classification

Imbalanced training data is a significant challenge for medical image classification. In this study, we propose a novel Progressive Class-Center Triplet (PCCT) framework to alleviate the class imbalance issue, particularly for the diagnosis of rare diseases, mainly by carefully designing the triplet sampling strategy and the triplet loss formation. Specifically, the PCCT framework includes two successive stages. In the first stage, PCCT trains the diagnosis system via a class-balanced triplet loss to coarsely separate the distributions of different classes. In the second stage, the PCCT framework further improves the diagnosis system via a class-center involved triplet loss to produce a more compact distribution for each class. For the class-balanced triplet loss, triplets are sampled equally for each class at each training iteration, thus alleviating the imbalanced data issue. For the class-center involved triplet loss, the positive and negative samples in each triplet are replaced by their corresponding class centers, which pulls data representations of the same class closer to the class center. Furthermore, the class-center involved triplet loss is extended to the pair-wise ranking loss and the quadruplet loss, which demonstrates the generality of the proposed framework. Extensive experiments show that the PCCT framework works effectively for medical image classification with imbalanced training images. On two skin image datasets and one chest X-ray dataset, the proposed approach obtains mean F1 scores of 86.2, 65.2, and 90.66, respectively, over all classes and 81.4, 63.87, and 81.92 for rare classes, achieving state-of-the-art performance and outperforming the widely used methods for the class imbalance issue.
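
To make the class-center idea in this abstract concrete, the following is a minimal sketch of a triplet-style hinge loss in which the positive and negative samples are replaced by class centers. It assumes Euclidean distance, centers computed as per-class feature means, and a fixed margin; the function names and the averaging over negative centers are illustrative assumptions, not the PCCT implementation.

```python
import math


def class_centers(features, labels):
    """Return the mean feature vector of each class as {label: center}."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def center_triplet_loss(anchor, anchor_label, centers, margin=1.0):
    """Hinge loss with class centers standing in for positive/negative samples.

    The anchor is pulled toward its own class center and pushed away from
    every other class center by at least `margin` (an assumed hyperparameter).
    The result is averaged over the other classes.
    """
    d_pos = math.dist(anchor, centers[anchor_label])
    losses = [max(0.0, d_pos - math.dist(anchor, center) + margin)
              for label, center in centers.items() if label != anchor_label]
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    feats = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0]]
    labs = [0, 0, 1]
    centers = class_centers(feats, labs)
    print(center_triplet_loss([0.0, 0.0], 0, centers))  # well separated anchor
    print(center_triplet_loss([6.0, 0.0], 0, centers))  # anchor near wrong center
```

An anchor that already sits closer to its own center than to any other center by more than the margin contributes zero loss, so training effort concentrates on poorly separated samples.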

Preprint, 2022, arXiv

Venc Design and Velocity Estimation for Phase Contrast MRI

In phase-contrast magnetic resonance imaging (PC-MRI), spin velocity contributes to the phase measured at each voxel. Therefore, estimating velocity from potentially wrapped phase measurements is the task of solving a system of noisy congruence equations. We propose Phase Recovery from Multiple Wrapped Measurements (PRoM) as a fast, approximate maximum likelihood estimator of velocity from multi-coil data with possible amplitude attenuation due to dephasing. The estimator can recover the fullest possible extent of unambiguous velocities, which can greatly exceed twice the highest venc. The estimator uses all pairwise phase differences and the inherent correlations among them to minimize the estimation error. Correlations are directly estimated from multi-coil data without requiring knowledge of coil sensitivity maps, dephasing factors, or the actual per-voxel signal-to-noise ratio. Derivation of the estimator yields explicit probabilities of unwrapping errors and the probability distribution for the velocity estimate; this, in turn, allows for optimized design of the phase-encoded acquisition. These probabilities are also incorporated into spatial post-processing to further mitigate wrapping errors. Simulation, phantom, and in vivo results for three-point PC-MRI acquisitions validate the benefits of reduced estimation error, increased recovered velocity range, optimized acquisition, and fast computation. A phantom study at 1.5T demonstrates a 48.5% decrease in root mean squared error using PRoM with post-processing versus a conventional "dual-venc" technique. Simulation and 3T in vivo results likewise demonstrate the proposed benefits.
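
The "system of congruence equations" view in this abstract can be illustrated with a brute-force sketch. It is not the PRoM estimator: it assumes a noise-free phase model of pi * v / venc wrapped to one period, ignores multi-coil data and correlations, and simply searches a velocity grid for the value that best explains all wrapped phases. All function names, the grid step, and the example venc values are hypothetical.

```python
import math


def wrap(phase):
    """Wrap a phase in radians to the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def encode(velocity, venc):
    """Wrapped phase for a velocity under the assumed model pi * v / venc."""
    return wrap(math.pi * velocity / venc)


def estimate_velocity(phases, vencs, v_max, step=0.1):
    """Grid search for the velocity that minimizes squared wrapped residuals.

    Searches [-v_max, v_max]; v_max should lie inside the unambiguous range
    of the chosen vencs, otherwise several velocities fit equally well.
    """
    best_v, best_cost = 0.0, float("inf")
    n_steps = int(round(2.0 * v_max / step))
    for i in range(n_steps + 1):
        v = -v_max + i * step
        cost = sum(wrap(p - math.pi * v / venc) ** 2
                   for p, venc in zip(phases, vencs))
        if cost < best_cost:
            best_v, best_cost = v, cost
    return best_v


if __name__ == "__main__":
    vencs = [50.0, 70.0]   # hypothetical velocity encodings, cm/s
    true_v = 200.0         # larger than twice the highest venc
    phases = [encode(true_v, venc) for venc in vencs]
    print(estimate_velocity(phases, vencs, v_max=300.0))
```

With these two example vencs, the wrapped phases repeat together only every 700 cm/s, which is why a noise-free velocity beyond twice the highest venc is still identifiable within the search range.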

Preprint, 2020, arXiv

Convolutional Framework for Accelerated Magnetic Resonance Imaging

Magnetic Resonance Imaging (MRI) is a noninvasive imaging technique that provides exquisite soft-tissue contrast without using ionizing radiation. The clinical application of MRI may be limited by long data acquisition times; therefore, MR image reconstruction from highly undersampled k-space data has been an active area of research. Many works exploit rank deficiency in a Hankel data matrix to recover unobserved k-space samples; the resulting problem is non-convex, so the choice of numerical algorithm can significantly affect performance, computation, and memory. We present a simple, scalable approach called Convolutional Framework (CF). We demonstrate the feasibility and versatility of CF using measured data from 2D, 3D, and dynamic applications.