Source author record

Yaping Zhao

Yaping Zhao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, eess.IV, cs.CY, quant-ph, Social and Information Networks

Catalog footprint

What is connected

8 works

5 topics

4 close collaborators


preprint, 2023, arXiv

Cross-Camera Human Motion Transfer by Time Series Analysis

With advances in optical sensor technology, heterogeneous camera systems are increasingly used for high-resolution (HR) video acquisition and analysis. However, motion transfer across multiple cameras poses challenges. To address this, we propose an algorithm based on time series analysis that identifies motion seasonality and constructs an additive model to extract transferable patterns. Validated on real-world data, our algorithm demonstrates effectiveness and interpretability. Notably, it improves pose estimation in low-resolution videos by leveraging patterns derived from HR counterparts, enhancing practical utility. Code is available at: https://github.com/IndigoPurple/TSAMT
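As a rough illustration of the idea in this abstract, the sketch below finds a dominant period by autocorrelation and then fits an additive model (a mean level plus one seasonal offset per phase). It is a generic time series decomposition, not the authors' TSAMT code; the function names, the autocorrelation-based period search, and the synthetic joint-angle signal are assumptions made for illustration.

```python
import math


def autocorrelation(series, lag):
    """Biased sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var


def estimate_period(series, min_lag=2, max_lag=None):
    """Pick the lag with the highest autocorrelation as the seasonal period."""
    max_lag = max_lag or len(series) // 2
    return max(range(min_lag, max_lag + 1), key=lambda lag: autocorrelation(series, lag))


def fit_additive_model(series, period):
    """Return (level, seasonal), where seasonal[k] is the mean offset at phase k."""
    level = sum(series) / len(series)
    buckets = [[] for _ in range(period)]
    for i, x in enumerate(series):
        buckets[i % period].append(x - level)
    seasonal = [sum(b) / len(b) for b in buckets]
    return level, seasonal


def predict(level, seasonal, t):
    """Additive prediction at time index t: level plus the seasonal offset."""
    return level + seasonal[t % len(seasonal)]


# Hypothetical joint-angle signal with a 12-frame motion cycle.
angles = [10.0 + 3.0 * math.sin(2 * math.pi * i / 12) for i in range(120)]
period = estimate_period(angles)
level, seasonal = fit_additive_model(angles, period)
```

In a cross-camera setting, a pattern fitted on the HR sequence (here `level` and `seasonal`) could serve as a prior for smoothing noisy estimates from the LR sequence; how the paper performs that transfer is not specified in the abstract.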

preprint, 2023, arXiv

SASA: Saliency-Aware Self-Adaptive Snapshot Compressive Imaging

The ability of snapshot compressive imaging (SCI) systems to efficiently capture high-dimensional (HD) data depends on the advent of novel optical designs to sample the HD data as two-dimensional (2D) compressed measurements. Nonetheless, the traditional SCI scheme is fundamentally limited, due to the complete disregard for high-level information in the sampling process. To tackle this issue, in this paper, we pave the first mile toward the advanced design of adaptive coding masks for SCI. Specifically, we propose an efficient and effective algorithm to generate coding masks with the assistance of saliency detection, in a low-cost and low-power fashion. Experiments demonstrate the effectiveness and efficiency of our approach. Code is available at: https://github.com/IndigoPurple/SASA
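For readers unfamiliar with SCI, the sketch below shows the standard measurement step that coding masks take part in: each frame is multiplied elementwise by its mask and the results are summed into a single 2D snapshot. It assumes binary masks and nested Python lists for clarity; `sci_measure` and the toy data are hypothetical, and the saliency-guided mask generation proposed in the paper is not reproduced here.

```python
def sci_measure(frames, masks):
    """Compress T frames into one 2D snapshot: y[i][j] = sum_t masks[t][i][j] * frames[t][i][j]."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(m[i][j] * f[i][j] for f, m in zip(frames, masks)) for j in range(width)]
        for i in range(height)
    ]


# Two toy 2x2 frames and their (hypothetical) binary coding masks.
frames = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
masks = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 1]],
]
snapshot = sci_measure(frames, masks)
```

An adaptive scheme would choose `masks` based on scene content instead of fixing them in advance; the forward step itself stays the same.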

preprint, 2022, arXiv

Cross-Camera Deep Colorization

In this paper, we consider the color-plus-mono dual-camera system and propose an end-to-end convolutional neural network to align and fuse images from it in an efficient and cost-effective way. Our method takes cross-domain and cross-scale images as input and synthesizes HR colorization results, easing the trade-off between spatial-temporal resolution and color depth in single-camera imaging systems. In contrast to previous colorization methods, ours can adapt to color and monochrome cameras with different spatial-temporal resolutions, providing flexibility and robustness in practical applications. The key ingredient of our method is a cross-camera alignment module that generates multi-scale correspondences for cross-domain image alignment. Through extensive experiments on various datasets and multiple settings, we validate the flexibility and effectiveness of our approach. Remarkably, our method consistently achieves substantial improvements, i.e., around 10 dB PSNR gain, over state-of-the-art methods. Code is at: https://github.com/IndigoPurple/CCDC
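To put the reported PSNR figure in perspective, the sketch below computes PSNR from mean squared error using the standard definition, 10 * log10(peak^2 / MSE). Under that definition a 10 dB gain corresponds to a tenfold reduction in MSE. The function name, the flat pixel lists, and the 8-bit peak value of 255 are assumptions for illustration and are not taken from the paper's evaluation code.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


# Two hypothetical reconstructions of the same all-zero reference:
reference = [0.0] * 4
coarse = [10.0] * 4            # MSE = 100
fine = [math.sqrt(10.0)] * 4   # MSE = 10, i.e. ten times lower
gain_db = psnr(reference, fine) - psnr(reference, coarse)
```

Here `gain_db` comes out at 10 dB up to floating-point rounding, matching the tenfold drop in MSE.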

preprint, 2022, arXiv

H4M: Heterogeneous, Multi-source, Multi-modal, Multi-view and Multi-distributional Dataset for Socioeconomic Analytics in the Case of Beijing

The study of socioeconomic status has been reformed by the availability of digital records containing data on real estate, points of interest, traffic and social media trends such as micro-blogging. In this paper, we describe a heterogeneous, multi-source, multi-modal, multi-view and multi-distributional dataset named "H4M". The mixed dataset contains data on real estate transactions, points of interest, traffic patterns and micro-blogging trends from Beijing, China. The unique composition of H4M makes it an ideal test bed for methodologies and approaches aimed at studying and solving problems related to real estate, traffic, urban mobility planning, social sentiment analysis etc. The dataset is available at: https://indigopurple.github.io/H4M/index.html

preprint, 2022, arXiv

MANet: Improving Video Denoising with a Multi-Alignment Network

In video denoising, the adjacent frames often provide very useful information, but accurate alignment is needed before such information can be harnessed. In this work, we present a multi-alignment network, which generates multiple flow proposals followed by attention-based averaging. It serves to mimic the non-local mechanism, suppressing noise by averaging multiple observations. Our approach can be applied to various state-of-the-art models that are based on flow estimation. Experiments on a large-scale video dataset demonstrate that our method improves the denoising baseline model by 0.2 dB, and further reduces the parameters by 47% with model distillation. Code is available at https://github.com/IndigoPurple/MANet.
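The averaging step described in this abstract can be pictured with a small sketch: given several candidate frames that have already been aligned to a reference, each pixel is fused with softmax weights so that well-aligned candidates dominate. This is only an assumed, simplified stand-in for the network's learned attention. The squared-difference score, the `temperature` parameter, and the function name are hypothetical, and flow estimation and warping are omitted.

```python
import math


def attention_average(reference, proposals, temperature=1.0):
    """Fuse aligned proposals per pixel with softmax weights.

    `reference` is a flat list of pixel values; `proposals` is a list of
    flat lists of the same length (frames already warped to the reference).
    """
    fused = []
    for p, ref in enumerate(reference):
        # Higher score means the proposal agrees better with the reference pixel.
        scores = [-((prop[p] - ref) ** 2) / temperature for prop in proposals]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(weights)
        fused.append(sum(w * prop[p] for w, prop in zip(weights, proposals)) / total)
    return fused


# One well-aligned proposal and one badly misaligned proposal.
reference = [1.0, 2.0]
proposals = [[1.0, 2.0], [100.0, 100.0]]
fused = attention_average(reference, proposals)
```

With equally good proposals the weights become uniform and the result is a plain average, which is where the noise suppression from multiple observations comes from.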

preprint, 2022, arXiv

Revisit Dictionary Learning for Video Compressive Sensing under the Plug-and-Play Framework

Aiming at high-dimensional (HD) data acquisition and analysis, snapshot compressive imaging (SCI) obtains the 2D compressed measurement of HD data with optical imaging systems and reconstructs HD data using compressive sensing algorithms. While the Plug-and-Play (PnP) framework offers an emerging solution to SCI reconstruction, its intrinsic denoising process is still a challenging problem. Unfortunately, existing denoisers in the PnP framework either suffer from limited performance or require extensive training data. In this paper, we propose an efficient and effective shallow-learning-based algorithm for video SCI reconstruction. Revisiting dictionary learning methods, we empower the PnP framework with a new denoiser, the kernel singular value decomposition (KSVD). Benefiting from KSVD, our algorithm retains a good trade-off among quality, speed, and training difficulty. On a variety of datasets, both quantitative and qualitative evaluations of our simulation results demonstrate the effectiveness of our proposed method. In comparison to a typical baseline using total variation, our method achieves an improvement of around 2 dB in PSNR and 0.2 in SSIM. We expect that our proposed PnP-KSVD algorithm can serve as a new baseline for video SCI reconstruction.

preprint, 2020, arXiv

Zoom in to the details of human-centric videos

Presenting high-resolution (HR) human appearance is always critical for human-centric videos. However, current imaging equipment can hardly capture HR details all the time. Existing super-resolution algorithms barely mitigate the problem because they consider only universal, low-level priors of image patches. In contrast, our algorithm is biased towards human body super-resolution by taking advantage of a high-level prior defined by HR human appearance. Firstly, a motion analysis module extracts the inherent motion pattern from the HR reference video to refine the pose estimation of the low-resolution (LR) sequence. Furthermore, a human body reconstruction module maps the HR texture in the reference frames onto a 3D mesh model. Consequently, super-resolved HR human sequences are generated, conditioned on the original LR videos as well as a few HR reference frames. Experiments on an existing dataset and real-world data captured by hybrid cameras show that our approach generates superior visual quality of the human body compared with the traditional method.

preprint, 2012, arXiv

Experimental preparation of eight-partite linear and two-diamond shape cluster states for photonic qumodes

The preparation of multipartite entangled states is the prerequisite for exploring quantum information networks and quantum computation. In this letter, we present the first experimental demonstration of eight-partite spatially separated continuous-variable (CV) entangled states. The initial resource quantum states are eight squeezed states of light; through linear optical transformations of these states, two types of eight-partite cluster entangled states are prepared. The generated eight entangled photonic qumodes are spatially separated, which provides valuable quantum resources for implementing more complicated quantum information tasks.
