Source author record

Zhicheng Zhang

Zhicheng Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Computer Vision, Machine Learning, eess.IV, eess.SP, physics.med-ph, quant-ph, Human-Computer Interaction, Information Theory, math.IT, Multiagent Systems, Neurons and Cognition, nlin.PS, physics.optics

Catalog footprint

What is connected

14 works

13 topics

4 close collaborators

preprint, 2026, arXiv

Optimal lower bound for quantum channel tomography in away-from-boundary regime

Consider quantum channels with input dimension $d_1$, output dimension $d_2$ and Kraus rank at most $r$. Any such channel must satisfy the constraint $rd_2\geq d_1$, and the parameter regime $rd_2=d_1$ is called the boundary regime. In this paper, we show an optimal query lower bound $\Omega(rd_1d_2/\varepsilon^2)$ for quantum channel tomography to within diamond norm error $\varepsilon$ in the away-from-boundary regime $rd_2\geq 2d_1$, matching the existing upper bound $O(rd_1d_2/\varepsilon^2)$. In particular, this lower bound fully settles the query complexity for the commonly studied case of equal input and output dimensions $d_1=d_2=d$ with $r\geq 2$, in sharp contrast to the unitary case $r=1$ where Heisenberg scaling $\Theta(d^2/\varepsilon)$ is achievable.
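The constraint $rd_2\geq d_1$ mentioned in this abstract follows from a standard fact about Kraus representations. A short sketch is given below; the symbols $K_i$, $\mathcal{N}$ and $V$ are introduced here for illustration and do not appear in the abstract.

```latex
% Kraus form with r operators K_i : C^{d_1} -> C^{d_2}, and the trace-preservation condition
\mathcal{N}(X) = \sum_{i=1}^{r} K_i X K_i^\dagger,
\qquad
\sum_{i=1}^{r} K_i^\dagger K_i = I_{d_1}.

% Stack the Kraus operators into a single (r d_2) x d_1 matrix V
V = \sum_{i=1}^{r} |i\rangle \otimes K_i \in \mathbb{C}^{r d_2 \times d_1},
\qquad
V^\dagger V = \sum_{i=1}^{r} K_i^\dagger K_i = I_{d_1}.

% V has d_1 orthonormal columns living in an (r d_2)-dimensional space
d_1 = \operatorname{rank}(V) \le r d_2.
```

In the boundary regime $rd_2=d_1$ the stacked matrix $V$ is square and therefore unitary, which may help explain why that regime is treated separately.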

preprint, 2022, arXiv

An end-to-end multi-scale network for action prediction in videos

In this paper, we develop an efficient multi-scale network to predict action classes in partial videos in an end-to-end manner. Unlike most existing methods with offline feature generation, our method directly takes frames as input and further models motion evolution on two different temporal scales. This avoids the complexity of two-stage modeling and the insufficient temporal and spatial information of a single scale. Our proposed End-to-End MultiScale Network (E2EMSNet) is composed of two scales, named the segment scale and the observed global scale. The segment scale leverages temporal differences over consecutive frames to capture finer motion patterns using 2D convolutions. For the observed global scale, a Long Short-Term Memory (LSTM) network is incorporated to capture motion features of the observed frames. Our model provides a simple and efficient modeling framework with a small computational cost. E2EMSNet is evaluated on three challenging datasets: BIT, HMDB51, and UCF101. Extensive experiments demonstrate the effectiveness of our method for action prediction in videos.

preprint, 2022, arXiv

MAVIPER: Learning Decision Tree Policies for Interpretable Multi-Agent Reinforcement Learning

Many recent breakthroughs in multi-agent reinforcement learning (MARL) require the use of deep neural networks, which are challenging for human experts to interpret and understand. On the other hand, existing work on interpretable reinforcement learning (RL) has shown promise in extracting more interpretable decision tree-based policies from neural networks, but only in the single-agent setting. To fill this gap, we propose the first set of algorithms that extract interpretable decision-tree policies from neural networks trained with MARL. The first algorithm, IVIPER, extends VIPER, a recent method for single-agent interpretable RL, to the multi-agent setting. We demonstrate that IVIPER learns high-quality decision-tree policies for each agent. To better capture coordination between agents, we propose a novel centralized decision-tree training algorithm, MAVIPER. MAVIPER jointly grows the trees of each agent by predicting the behavior of the other agents using their anticipated trees, and uses resampling to focus on states that are critical for its interactions with other agents. We show that both algorithms generally outperform the baselines and that MAVIPER-trained agents achieve better-coordinated performance than IVIPER-trained agents on three different multi-agent particle-world environments.

preprint, 2022, arXiv

Noise-like Pulses from an All-Normal-Dispersion Fiber Laser with Weakened Spectrum Filtering

Noise-like pulses (NLPs) are highly sought after in many fields. Here, we experimentally and numerically investigate the generation of noise-like pulses in an all-normal-dispersion fiber laser with weak spectrum filtering. With a grating inserted as a tunable spectrum filter, the laser operates in a stable dissipative soliton state with a 3.84 ps duration. When the grating is replaced with a mirror, NLPs with a double-scale intensity autocorrelation trace are ultimately attained. Detailed numerical simulations demonstrate that, in the absence of a spectrum filter, the stable state cannot be established and a random pulse cluster forms instead. The random pulse cluster achieves dynamic stability with suitable feedback, and the NLP is ultimately generated. The NLP here evolves directly from the initial noise, and no other states occur during its evolution. These explorations could deepen the understanding of NLPs and enrich the complex dynamics of the ANDi ultrafast fiber laser.

preprint, 2022, arXiv

Patient-specific mean teacher UNet for enhancing PET image and low-dose PET reconstruction on RefleXion X1 biology-guided radiotherapy system

The RefleXion X1 is the first biology-guided radiotherapy (BgRT) system. Its dual 90-degree PET detector collects fewer pair production events compared to a full-ring diagnostic PET system. In the proposed BgRT workflow, a short scan is acquired before treatment delivery to ensure image quality and consistency. The shorter scan time, a quarter of the simulation scan time, also leads to fewer coincidence events and hence reduced image quality. In this study, we proposed a patient-specific mean teacher UNet (MT-UNet) to enhance PET image quality and low-dose PET reconstruction on RefleXion X1. PET/CT scans of nine cancer patients were acquired using RefleXion X1. Every patient had one simulation scan. Five patients had additional scans acquired during the first and the final treatment fractions. Treatment scans were acquired using the same imaging protocol as the simulation scan. For each scan, we reconstructed a full-dose image and evenly split coincidence events into four sessions to reconstruct four quarter-dose PET images. For each patient, our proposed MT-UNet was trained using quarter-dose and full-dose images of the simulation scan. For the image quality enhancement task, we applied nine trained MT-UNets to full-dose simulation PET images of the nine patients to generate enhanced images, respectively. The enhanced images were compared with the original full-dose images using CNR and SNR. For the low-dose image reconstruction task, we applied five trained MT-UNets to ten quarter-dose treatment images of five patients to predict full-dose images, respectively. The predicted and ground truth full-dose images were compared using SSIM and PSNR. We also trained and evaluated patient-specific UNets for model comparison. Our proposed patient-specific MT-UNet achieved better performance in improving the quality of RefleXion low-dose and full-dose images compared to the patient-specific UNet.

preprint, 2022, arXiv

ProCo: Prototype-aware Contrastive Learning for Long-tailed Medical Image Classification

Medical image classification has been widely adopted in medical image analysis. However, due to the difficulty of collecting and labeling data in the medical area, medical image datasets are usually highly imbalanced. To address this problem, previous works utilized class samples as a prior for re-weighting or re-sampling, but the feature representation is usually still not discriminative enough. In this paper, we adopt contrastive learning to tackle the long-tailed medical imbalance problem. Specifically, we first propose the category prototype and adversarial proto-instance to generate representative contrastive pairs. Then, a prototype recalibration strategy is proposed to address the highly imbalanced data distribution. Finally, a unified proto-loss is designed to train our framework. The overall framework, named Prototype-aware Contrastive learning (ProCo), is unified as a single-stage, end-to-end pipeline to alleviate the imbalance problem in medical image classification, which distinguishes it from existing works that follow the traditional two-stage pipeline. Extensive experiments on two highly imbalanced medical image classification datasets demonstrate that our method outperforms the existing state-of-the-art methods by a large margin.

preprint, 2022, arXiv

Quantum Algorithm for Fidelity Estimation

For two unknown mixed quantum states $\rho$ and $\sigma$ in an $N$-dimensional Hilbert space, computing their fidelity $F(\rho,\sigma)$ is a basic problem with many important applications in quantum computing and quantum information, for example verification and characterization of the outputs of a quantum computer, and design and analysis of quantum algorithms. In this paper, we propose a quantum algorithm that solves this problem in $\operatorname{poly}(\log (N), r, 1/\varepsilon)$ time, where $r$ is the lower rank of $\rho$ and $\sigma$, and $\varepsilon$ is the desired precision, provided that the purifications of $\rho$ and $\sigma$ are prepared by quantum oracles. This algorithm exhibits an exponential speedup over the best known algorithm (based on quantum state tomography) which has time complexity polynomial in $N$.
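For orientation, the quantity being estimated can be checked by hand in the simplest special case. The sketch below is a classical reference computation, not the quantum algorithm of the paper. It assumes the convention $F(\rho,\sigma)=\operatorname{tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}$ (some authors use the square of this) and assumes both states are diagonal in the same basis, in which case the fidelity reduces to a sum over eigenvalue pairs. The function name `fidelity_diagonal` is hypothetical.

```python
import math


def fidelity_diagonal(p, q):
    """Fidelity of two states that are diagonal in the same basis.

    p and q are the eigenvalue lists (probability vectors) of the two states.
    Assumed convention: F = tr sqrt(sqrt(rho) sigma sqrt(rho)), which for
    commuting states equals sum_i sqrt(p_i * q_i).
    """
    if len(p) != len(q):
        raise ValueError("both states must have the same dimension")
    if any(x < 0 for x in list(p) + list(q)):
        raise ValueError("eigenvalues must be non-negative")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
```

Identical states give a value of 1 and states with disjoint support give 0, which is a quick sanity check on the convention in use.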

preprint, 2022, arXiv

Tell Me the Evidence? Dual Visual-Linguistic Interaction for Answer Grounding

Answer grounding aims to reveal the visual evidence for visual question answering (VQA), which entails highlighting relevant positions in the image when answering questions about images. Previous attempts typically tackle this problem using pretrained object detectors, but without the flexibility for objects not in the predefined vocabulary. However, these black-box methods solely concentrate on the linguistic generation, ignoring the visual interpretability. In this paper, we propose Dual Visual-Linguistic Interaction (DaVI), a novel unified end-to-end framework with the capability for both linguistic answering and visual grounding. DaVI introduces two visual-linguistic interaction mechanisms: 1) a visual-based linguistic encoder that understands questions incorporated with visual features and produces linguistic-oriented evidence for further answer decoding, and 2) a linguistic-based visual decoder that focuses visual features on the evidence-related regions for answer grounding. With this design, our approach ranked first in the answer grounding track of the 2022 VizWiz Grand Challenge.

preprint, 2022, arXiv

Topological EEG Nonlinear Dynamics Analysis for Emotion Recognition

Emotion recognition through exploring electroencephalography (EEG) characteristics has been widely performed in recent studies. Nonlinear analysis and feature extraction methods for understanding complex dynamical phenomena are associated with the EEG patterns of different emotions. Phase space reconstruction is a typical nonlinear technique for revealing the dynamics of the brain neural system. Recently, topological data analysis (TDA) has been used to explore the properties of spaces, which provides a powerful tool for examining the phase space. In this work, we propose a topological EEG nonlinear dynamics analysis approach that uses the phase space reconstruction (PSR) technique to convert EEG time series into phase space and the persistent homology tool to explore the topological properties of the phase space. We perform the topological analysis of EEG signals in different rhythm bands to build emotion feature vectors, which show high distinguishing ability. We evaluate the approach with two well-known benchmark datasets, DEAP and DREAMER. The recognition results achieved accuracies of 99.37% and 99.35% in arousal and valence classification tasks with DEAP, and 99.96%, 99.93%, and 99.95% in arousal, valence, and dominance classification tasks with DREAMER, respectively. The performance is reported to exceed current state-of-the-art approaches on DREAMER (an improvement of 1% to 10%, depending on temporal length), while being comparable to other related works evaluated on DEAP. The proposed work is the first investigation of emotion-recognition-oriented EEG topological feature analysis, bringing novel insight into the nonlinear dynamics analysis and feature extraction of the brain neural system.
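The phase space reconstruction step mentioned in this abstract is commonly done by time-delay embedding. The sketch below shows that generic technique only; the embedding dimension `dim`, the delay `tau`, and the function name `delay_embed` are assumptions for illustration, since the abstract does not give the paper's settings. The persistent homology step is not shown.

```python
def delay_embed(series, dim, tau):
    """Time-delay embedding of a 1-D series.

    Each output point is (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).
    dim (embedding dimension) and tau (delay in samples) are illustrative
    parameters, not values taken from the paper.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_points = len(series) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("series is too short for this dim and tau")
    return [
        tuple(series[i + k * tau] for k in range(dim))
        for i in range(n_points)
    ]
```

For example, `delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2)` yields the two points `(0, 2, 4)` and `(1, 3, 5)`. The resulting point cloud is the kind of object a persistent homology tool would then take as input.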

preprint, 2021, arXiv

Adaptive Deconvolution-based stereo matching Net for Local Stereo Matching

In deep learning-based local stereo matching methods, larger image patches usually bring better stereo matching accuracy. However, it is unrealistic to increase the image patch size without restriction. Arbitrarily extending the patch size will change the local stereo matching method into a global stereo matching method, and the matching accuracy will saturate. We simplify the existing Siamese convolutional network by reducing the number of network parameters and propose an efficient CNN-based structure, namely the Adaptive Deconvolution-based disparity matching Net (ADSM net), which adds deconvolution layers to learn how to enlarge the input feature map for the following convolution layers. Experimental results on the KITTI 2012 and 2015 datasets demonstrate that the proposed method can achieve a good trade-off between accuracy and complexity.

preprint, 2020, arXiv

A Matlab Toolbox for Feature Importance Ranking

More attention is being paid to feature importance ranking (FIR), in particular when thousands of features can be extracted for intelligent diagnosis and personalized medicine. A large number of FIR approaches have been proposed, while few are integrated for comparison and real-life applications. In this study, a MATLAB toolbox is presented and a total of 30 algorithms are collected. Moreover, the toolbox is evaluated on a database of 163 ultrasound images. For each breast mass lesion, 15 features are extracted. To figure out the optimal subset of features for classification, all combinations of features are tested and a linear support vector machine is used for the malignancy prediction of lesions annotated in ultrasound images. Finally, the effectiveness of FIR is analyzed according to performance comparison. The toolbox is online (https://github.com/NicoYuCN/matFIR). In our future work, more FIR methods, feature selection methods and machine learning classifiers will be integrated.

preprint, 2020, arXiv

Deep Sinogram Completion with Image Prior for Metal Artifact Reduction in CT Images

Computed tomography (CT) has been widely used for medical diagnosis, assessment, and therapy planning and guidance. In reality, CT images may be affected adversely in the presence of metallic objects, which could lead to severe metal artifacts and influence clinical diagnosis or dose calculation in radiation therapy. In this paper, we propose a generalizable framework for metal artifact reduction (MAR) by simultaneously leveraging the advantages of image domain and sinogram domain-based MAR techniques. We formulate our framework as a sinogram completion problem and train a neural network (SinoNet) to restore the metal-affected projections. To improve the continuity of the completed projections at the boundary of metal trace and thus alleviate new artifacts in the reconstructed CT images, we train another neural network (PriorNet) to generate a good prior image to guide sinogram learning, and further design a novel residual sinogram learning strategy to effectively utilize the prior image information for better sinogram completion. The two networks are jointly trained in an end-to-end fashion with a differentiable forward projection (FP) operation so that the prior image generation and deep sinogram completion procedures can benefit from each other. Finally, the artifact-reduced CT images are reconstructed using the filtered backward projection (FBP) from the completed sinogram. Extensive experiments on simulated and real artifacts data demonstrate that our method produces superior artifact-reduced results while preserving the anatomical structures and outperforms other MAR methods.

preprint, 2020, arXiv

Elastic Net based Feature Ranking and Selection

Feature selection is important in data representation and intelligent diagnosis. Elastic net is one of the most widely used feature selectors. However, the features selected are dependent on the training data, and their weights, which are dedicated to regularized regression, are irrelevant to their importance if used for feature ranking, which degrades model interpretability and extension. In this study, an intuitive idea is applied after multiple rounds of data splitting and elastic net based feature selection: it counts how often each feature is selected and uses that frequency as an indicator of feature importance. After features are sorted according to their frequency, a linear support vector machine performs the classification in an incremental manner. Finally, a compact subset of discriminative features is selected by comparing the prediction performance. Experimental results on breast cancer data sets (BCDR-F03, WDBC, GSE 10810, and GSE 15852) suggest that the proposed framework achieves competitive or superior performance compared with elastic net, with consistent selection of fewer features. Further enhancing its consistency on high-dimension small-sample-size data sets deserves more attention in future work. The proposed framework is accessible online (https://github.com/NicoYuCN/elasticnetFR).
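The frequency-counting idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the selector is abstracted as a caller-supplied function `select` (in the paper it would be an elastic net fit on the training split), and the names `selection_frequency`, `n_splits` and `train_frac`, along with their default values, are assumptions. The incremental SVM classification step is not shown.

```python
import random
from collections import Counter


def selection_frequency(n_samples, n_features, select,
                        n_splits=100, train_frac=0.8, seed=0):
    """Rank features by how often a selector picks them across random splits.

    select(train_idx) is a hypothetical callable that returns the indices of
    the features chosen when the selector is fit on the samples in train_idx.
    Returns (ranking, counts): feature indices sorted by decreasing selection
    count (ties broken by index), and the raw counts.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    n_train = max(1, int(train_frac * n_samples))
    counts = Counter()
    for _ in range(n_splits):
        train_idx = rng.sample(indices, n_train)  # one random training split
        counts.update(set(select(train_idx)))     # count each feature once per split
    ranking = sorted(range(n_features), key=lambda j: (-counts[j], j))
    return ranking, counts
```

Features that are selected in most splits rise to the top of `ranking`, which is the ordering a downstream classifier would consume when adding features one at a time.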

preprint, 2020, arXiv

Noise2Context: Context-assisted Learning 3D Thin-layer Low Dose CT Without Clean Data

Computed tomography (CT) has played a vital role in medical diagnosis, assessment, and therapy planning. In clinical practice, concerns about increased X-ray radiation exposure attract more and more attention. To lower the X-ray radiation, low-dose CT (LDCT) is often used in certain scenarios, although it degrades CT image quality. In this paper, we propose a training method that trains denoising neural networks without any paired clean data. Specifically, we train the denoising neural network to map one noisy LDCT image to its two adjacent LDCT images simultaneously within a single 3D thin-layer low-dose CT scan. In other words, under some latent assumptions, we propose an unsupervised loss function that integrates the similarity between adjacent CT slices in 3D thin-layer low-dose CT to train the denoising neural network in an unsupervised manner. For 3D thin-slice CT scanning, the proposed virtual supervised loss function is equivalent to a supervised loss function with paired noisy and clean samples when the noise in the different slices from a single scan is uncorrelated and zero-mean. Further experiments on the Mayo LDCT dataset and a realistic pig head were carried out and demonstrated superior performance over existing unsupervised methods.
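The equivalence claimed in this abstract can be made plausible with a short derivation. The sketch below is one possible reading, not the paper's stated loss: it assumes a squared $L_2$ loss, writes noisy slice $j$ as $y_j = x_j + n_j$ with clean slice $x_j$ and noise $n_j$, and uses the slightly stronger condition that $n_{i+1}$ has zero mean conditional on $(y_i, x_{i+1})$, whereas the abstract states only that the noise is uncorrelated and zero-mean. All symbols are introduced here for illustration.

```latex
% Assumed loss: map slice i to both of its neighbours
\mathcal{L}(\theta) = \mathbb{E}\left[ \| f_\theta(y_i) - y_{i-1} \|^2 + \| f_\theta(y_i) - y_{i+1} \|^2 \right]

% Expand one term using y_{i+1} = x_{i+1} + n_{i+1}
\mathbb{E}\| f_\theta(y_i) - y_{i+1} \|^2
  = \mathbb{E}\| f_\theta(y_i) - x_{i+1} \|^2
  - 2\,\mathbb{E}\langle f_\theta(y_i) - x_{i+1},\, n_{i+1} \rangle
  + \mathbb{E}\| n_{i+1} \|^2

% The cross term vanishes when E[n_{i+1} | y_i, x_{i+1}] = 0,
% and the last term does not depend on theta, so
\arg\min_\theta \mathbb{E}\| f_\theta(y_i) - y_{i+1} \|^2
  = \arg\min_\theta \mathbb{E}\| f_\theta(y_i) - x_{i+1} \|^2
```

The same expansion applies to the $y_{i-1}$ term. If adjacent thin slices are close to each other, $x_{i\pm1} \approx x_i$, the minimizer approximately matches that of a supervised loss against the clean slice $x_i$, which is presumably among the latent assumptions the abstract refers to.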
