Source author record

Yunhua Zhang

Yunhua Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record.

Computer Vision, hep-ph, hep-ex, nucl-ex, nucl-th, physics.data-an, physics.optics

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

Preprint, 2022, arXiv

Audio-Adaptive Activity Recognition Across Video Domains

This paper addresses activity recognition under domain shift, caused for example by a change of scenery or camera viewpoint. The leading approaches reduce the shift in activity appearance through adversarial training and self-supervised learning. Unlike these vision-focused works, we leverage activity sounds for domain adaptation, as they vary less across domains and can reliably indicate which activities are not happening. We propose an audio-adaptive encoder and associated learning methods that discriminatively adjust the visual feature representation and address shifts in the semantic distribution. To further eliminate domain-specific features and include domain-invariant activity sounds for recognition, we propose an audio-infused recognizer, which models the cross-modal interaction across domains. We also introduce the new task of actor shift, with a corresponding audio-visual dataset, to challenge our method with situations where the activity appearance changes dramatically. Experiments on this dataset, EPIC-Kitchens and CharadesEgo show the effectiveness of our approach.

Preprint, 2022, arXiv

TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection

Existing RGB-D salient object detection (SOD) methods mainly rely on a symmetric two-stream CNN-based network to extract RGB and depth features separately. However, the conventional symmetric structure has two problems: first, CNNs are limited in their ability to learn global context; second, the symmetric two-stream structure ignores the inherent differences between modalities. In this paper, we propose a Transformer-based asymmetric network (TANet) to tackle both issues. We use a Transformer (PVTv2) to extract global semantic information from RGB data and design a lightweight CNN backbone (LWDepthNet) to extract spatial structure information from depth data without pre-training. The asymmetric hybrid encoder (AHE) reduces the number of model parameters and increases speed without sacrificing performance. Then, we design a cross-modal feature fusion module (CMFFM), which enhances and fuses RGB and depth features with each other. Finally, we add edge prediction as an auxiliary task and propose an edge enhancement module (EEM) to generate sharper contours. Extensive experiments demonstrate that our method achieves superior performance over 14 state-of-the-art RGB-D methods on six public datasets. Our code will be released at https://github.com/lc012463/TANet.

Preprint, 2020, arXiv

High-Performance Long-Term Tracking with Meta-Updater

Long-term visual tracking has drawn increasing attention because it is much closer to practical applications than short-term tracking. Most top-ranked long-term trackers adopt offline-trained Siamese architectures and therefore cannot benefit from the great progress of short-term trackers with online updates. However, directly introducing online-update-based trackers to the long-term problem is risky, because long-term observations are uncertain and noisy. In this work, we propose a novel offline-trained Meta-Updater to address an important but unsolved problem: is the tracker ready for updating in the current frame? The proposed meta-updater integrates geometric, discriminative, and appearance cues in a sequential manner, and then mines the sequential information with a cascaded LSTM module. Our meta-updater learns a binary output to guide the tracker's update and can be easily embedded into different trackers. This work also introduces a long-term tracking framework consisting of an online local tracker, an online verifier, a SiamRPN-based re-detector, and our meta-updater. Numerous experimental results on the VOT2018LT, VOT2019LT, OxUvALT, TLP, and LaSOT benchmarks show that our tracker performs remarkably better than other competing algorithms. Our project is available at https://github.com/Daikenan/LTMU.
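
The role of a binary update gate can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a learned cascaded LSTM over geometric, discriminative, and appearance cues, whereas the hand-written rule below is only a stand-in with the same interface. The names `FrameCues`, `should_update`, `frames_to_update`, and all thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameCues:
    """Per-frame cues. Field names are hypothetical, not from the paper."""
    confidence: float  # tracker response score in [0, 1]
    box_jump: float    # relative bounding-box change vs. the previous frame


def should_update(history: List[FrameCues], min_conf: float = 0.6,
                  max_jump: float = 0.3, window: int = 3) -> bool:
    """Stand-in for a learned meta-updater: return a binary update decision.

    Assumption: a fixed rule over the last `window` frames replaces the
    cascaded LSTM described in the abstract.
    """
    recent = history[-window:]
    if len(recent) < window:
        return False  # not enough temporal context yet
    return all(c.confidence >= min_conf and c.box_jump <= max_jump
               for c in recent)


def frames_to_update(cues: List[FrameCues]) -> List[int]:
    """Return the frame indices on which the tracker model would be updated."""
    history: List[FrameCues] = []
    updated: List[int] = []
    for index, cue in enumerate(cues):
        history.append(cue)
        if should_update(history):
            updated.append(index)
    return updated


if __name__ == "__main__":
    good = FrameCues(confidence=0.9, box_jump=0.05)
    bad = FrameCues(confidence=0.2, box_jump=0.8)  # e.g. occlusion or drift
    print(frames_to_update([good, good, good, bad, good, good, good]))
```

In this toy run a single unreliable frame blocks updates until three consecutive reliable frames have been observed again, which is the kind of behaviour a gate is meant to provide when observations are noisy.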

Preprint, 2013, arXiv

Electromagnetic Sub-Wavelength Imaging Using Signal Processing Techniques Combined With Phase Conjugation

In this paper, we show how Electromagnetics (EM) can be combined with signal processing algorithms to enhance image resolution beyond what can be realized using electromagnetic techniques alone. We discuss several signal processing techniques, including the Correlation Method (CM) and the Minimum Residual Power Search Method (MRPSM), and apply them to sub-wavelength imaging in the microwave regime by combining them with, for instance, the well-known Phase Conjugation (PC) algorithm, which has been used extensively in electromagnetics for imaging. We show that this combination can achieve sub-wavelength resolution on the order of λ0/10, even if the measurement plane is not located in the very near-field region of the source. We describe the proposed imaging algorithms in detail, study their ability to resolve sub-wavelength features, and compare their computational efficiency.
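
To give a sense of scale for a resolution of λ0/10, the short sketch below converts an operating frequency into a free-space wavelength and the corresponding resolution. The 10 GHz value is an assumed example within the microwave regime and is not taken from the paper; the function names are illustrative.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, in vacuum


def free_space_wavelength(freq_hz: float) -> float:
    """Free-space wavelength lambda_0 = c / f, in metres."""
    return SPEED_OF_LIGHT / freq_hz


def resolution(freq_hz: float, fraction: float = 10.0) -> float:
    """Resolution of lambda_0 / fraction, in metres."""
    return free_space_wavelength(freq_hz) / fraction


if __name__ == "__main__":
    freq = 10e9  # assumed example frequency (10 GHz), not from the paper
    print(f"lambda_0    = {free_space_wavelength(freq) * 1e3:.2f} mm")
    print(f"lambda_0/10 = {resolution(freq) * 1e3:.2f} mm")
```

Under this assumed frequency the free-space wavelength is about 30 mm, so a resolution on the order of λ0/10 corresponds to roughly 3 mm.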

Preprint, 2010, arXiv

Parton distribution functions and nuclear EMC effect in a statistical model

We use a new and simple statistical approach to calculate the parton distribution functions (PDFs) of the nucleon in terms of light-front kinematic variables. Analytic expressions for the x-dependent PDFs are obtained over the whole x region. We then treat the temperature T as a parameter depending on the atomic number A to explain the nuclear EMC effect in the region $x \in [0.2, 0.7]$. We give predictions for PDF ratios that differ markedly from those of other models; experiments measuring PDF ratios are therefore suggested as a way to discriminate between models.

Preprint, 2009, arXiv

Nuclear EMC Effect in a Statistical Model

A simple statistical model in terms of light-front kinematic variables, which we previously constructed to calculate the parton distribution functions (PDFs) of the nucleon, is used to explain the nuclear EMC effect in the range $x \in [0.2,~0.7]$. Here, we treat the temperature $T$ as a parameter depending on the atomic number $A$ and obtain reasonable results that agree with the experimental data. Our results show that the larger $A$ is, the lower $T$ and thus the larger the volume $V$; these features are consistent with other models. Moreover, we give predictions for the quark distribution ratios, i.e., $q^A(x) / q^D(x)$, $\bar{q}^A(x) / \bar{q}^D(x)$, and $s^A(x) / s^D(x)$, and also for the gluon ratio $g^A(x) / g^D(x)$, taking iron as an example. The predictions differ from those of other models, so experiments measuring the parton ratios of antiquarks, strange quarks, and gluons can discriminate between models.