Researcher profile

Lin Zhu

Lin Zhu contributes to research discovery and scholarly infrastructure.

Researcher; Affiliation not imported; Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging; Verification L1; Unclaimed author
11 works
0 followers
9 topics
4 close collaborators

Published work

11 published items

preprint, 2026, arXiv

Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking

Despite significant progress, RGB-based trackers remain vulnerable to challenging imaging conditions, such as low illumination and fast motion. Event cameras offer a promising alternative by asynchronously capturing pixel-wise brightness changes, providing high dynamic range and high temporal resolution. However, existing event-based trackers often neglect the intrinsic spatial sparsity and temporal density of event data, while relying on a single fixed temporal-window sampling strategy that is suboptimal under varying motion dynamics. In this paper, we propose an event sparsity-aware tracking framework that explicitly models event-density variations across multiple temporal scales. Specifically, the proposed framework progressively injects sparse, medium-density, and dense event search regions into a three-stage Vision Transformer backbone, enabling hierarchical multi-density feature learning. Furthermore, we introduce a sparsity-aware Mixture-of-Experts module to encourage expert specialization under different sparsity patterns, and design a dynamic pondering strategy to adaptively adjust the inference depth according to tracking difficulty. Extensive experiments on FE240hz, COESOT, and EventVOT demonstrate that the proposed approach achieves a favorable trade-off between tracking accuracy and computational efficiency. The source code will be released on https://github.com/Event-AHU/OpenEvTracking.

preprint, 2024, arXiv

CRSOT: Cross-Resolution Object Tracking using Unaligned Frame and Event Cameras

Existing datasets for RGB-DVS tracking are collected with the DVS346 camera, and their resolution ($346 \times 260$) is low for practical applications. In practice, only visible cameras are deployed in many systems, and newly designed neuromorphic cameras may have different resolutions. The latest neuromorphic sensors can output high-definition event streams, but it is very difficult to achieve strict alignment between events and frames in both the spatial and temporal views. Therefore, how to achieve accurate tracking with unaligned neuromorphic and visible sensors is a valuable but unresearched problem. In this work, we formally propose the task of object tracking using unaligned neuromorphic and visible cameras. We build the first unaligned frame-event dataset, CRSOT, collected with a specially built data acquisition system, which contains 1,030 high-definition RGB-Event video pairs, 304,974 video frames. In addition, we propose a novel unaligned object tracking framework that can realize robust tracking even with loosely aligned RGB-Event data. Specifically, we extract the template and search regions of RGB and Event data and feed them into a unified ViT backbone for feature embedding. Then, we propose uncertainty perception modules to encode the RGB and Event features, respectively, and a modality uncertainty fusion module to aggregate the two modalities. These three branches are jointly optimized in the training phase. Extensive experiments demonstrate that our tracker can combine the two modalities for high-performance tracking even without strict temporal and spatial alignment. The source code, dataset, and pre-trained models will be released at https://github.com/Event-AHU/Cross_Resolution_SOT.

preprint, 2024, arXiv

Revisiting Color-Event based Tracking: A Unified Network, Dataset, and Metric

Combining Color and Event cameras (also called Dynamic Vision Sensors, DVS) for robust object tracking is a newly emerging research topic. Existing color-event tracking frameworks usually contain multiple scattered modules, including feature extraction, fusion, matching, and interactive learning, which may lead to low efficiency and high computational complexity. In this paper, we propose a single-stage backbone network for Color-Event Unified Tracking (CEUTrack), which achieves the above functions simultaneously. Given the event points and RGB frames, we first transform the points into voxels and crop the template and search regions for both modalities, respectively. Then, these regions are projected into tokens and fed in parallel into the unified Transformer backbone network. The output features are fed into a tracking head for target object localization. Our proposed CEUTrack is simple, effective, and efficient, achieving over 75 FPS and new SOTA performance. To better validate the effectiveness of our model and address the data deficiency of this task, we also propose a generic and large-scale benchmark dataset for color-event tracking, termed COESOT, which contains 90 categories and 1354 video sequences. Additionally, a new evaluation metric named BOC is proposed in our evaluation toolkit to evaluate the prominence with respect to the baseline methods. We hope the newly proposed method, dataset, and evaluation metric provide a better platform for color-event-based tracking. The dataset, toolkit, and source code will be released on: https://github.com/Event-AHU/COESOT.
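
The step of transforming event points into voxels can be illustrated with a minimal sketch. The abstract does not specify the voxelization used by CEUTrack, so everything here is an assumption: the function name `events_to_voxels`, the `(x, y, t, polarity)` tuple layout, and the choice to sum signed polarities per temporal bin are all hypothetical.

```python
def events_to_voxels(events, width, height, num_bins):
    """Accumulate signed event polarities into a (bins, height, width) grid.

    `events` is a list of (x, y, t, polarity) tuples; this layout is an
    assumption made for illustration, not the paper's data format.
    """
    grid = [[[0 for _ in range(width)] for _ in range(height)]
            for _ in range(num_bins)]
    if not events:
        return grid
    t_start = min(e[2] for e in events)
    t_end = max(e[2] for e in events)
    span = (t_end - t_start) or 1.0  # avoid division by zero for a single timestamp
    for x, y, t, polarity in events:
        # Map the timestamp to a temporal bin; the latest event lands in the final bin.
        b = min(int((t - t_start) / span * num_bins), num_bins - 1)
        grid[b][y][x] += polarity
    return grid
```

A call such as `events_to_voxels(events, width=346, height=260, num_bins=5)` would return a nested list indexed as `grid[bin][y][x]`; the bin count is a free parameter of this sketch.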

preprint, 2022, arXiv

Data-Driven Fast Frequency Control using Inverter-Based Resources

We develop and test a data-driven and area-based fast frequency control scheme, which rapidly redispatches inverter-based resources to compensate for local power imbalances within the bulk power system. The approach requires no explicit system model information, relying only on historical measurement sequences for the computation of control actions. Our technical approach fuses developments in low-gain estimator design and data-driven control to provide a model-free and practical solution for fast frequency control. Theoretical results and extensive simulation scenarios on a three-area system are provided to support the approach.

preprint, 2022, arXiv

Event-based Video Reconstruction via Potential-assisted Spiking Neural Network

The neuromorphic vision sensor is a new bio-inspired imaging paradigm that reports asynchronous, continuous per-pixel brightness changes called "events" with high temporal resolution and high dynamic range. So far, event-based image reconstruction methods have been based on artificial neural networks (ANNs) or hand-crafted spatiotemporal smoothing techniques. In this paper, we implement image reconstruction with a fully spiking neural network (SNN) architecture for the first time. As bio-inspired neural networks, SNNs operate with asynchronous binary spikes distributed over time and can potentially lead to greater computational efficiency on event-driven hardware. We propose a novel Event-based Video reconstruction framework based on a fully Spiking Neural Network (EVSNN), which utilizes Leaky-Integrate-and-Fire (LIF) neurons and Membrane Potential (MP) neurons. We find that spiking neurons have the potential to store useful temporal information (memory) to complete such time-dependent tasks. Furthermore, to better utilize the temporal information, we propose a hybrid potential-assisted framework (PA-EVSNN) using the membrane potential of the spiking neuron. The proposed neuron is referred to as the Adaptive Membrane Potential (AMP) neuron, which adaptively updates the membrane potential according to the input spikes. The experimental results demonstrate that our models achieve comparable performance to ANN-based models on the IJRR, MVSEC, and HQF datasets. In terms of energy consumption, EVSNN and PA-EVSNN are 19.36$\times$ and 7.75$\times$ more computationally efficient than their ANN architectures, respectively.

preprint, 2022, arXiv

GenAD: General Representations of Multivariate Time Series for Anomaly Detection

The reliability of wireless base stations in China Mobile is of vital importance, because cell phone users are connected to the stations and the behavior of the stations is directly related to user experience. Although monitoring station behavior can be realized by anomaly detection on multivariate time series, building a general unsupervised anomaly detection model with a higher F1-score remains a challenging task due to the complex correlations and various temporal patterns of multivariate series in large-scale stations. In this paper, we propose a General representation of multivariate time series for Anomaly Detection (GenAD). First, we pre-train a general model on large-scale wireless base stations with self-supervision, which can be easily transferred to anomaly detection for a specific station with a small amount of training data. Second, we employ Multi-Correlation Attention and Time-Series Attention to represent the correlations and temporal patterns of the stations. With the above innovations, GenAD increases the F1-score by a total of 9% on real-world datasets in China Mobile, while its performance does not significantly degrade on public datasets with only 10% of the training data.

preprint, 2022, arXiv

Mirror Complementary Transformer Network for RGB-thermal Salient Object Detection

RGB-thermal salient object detection (RGB-T SOD) aims to locate the common prominent objects of an aligned visible and thermal infrared image pair and accurately segment all the pixels belonging to those objects. It is promising in challenging scenes such as nighttime and complex backgrounds because thermal images are insensitive to lighting conditions. Thus, the key problem of RGB-T SOD is to make the features from the two modalities complement and adjust each other flexibly, since it is inevitable that either modality of an RGB-T image pair may fail in challenging scenes such as extreme light conditions and thermal crossover. In this paper, we propose a novel mirror complementary Transformer network (MCNet) for RGB-T SOD. Specifically, we introduce a Transformer-based feature extraction module to effectively extract hierarchical features of RGB and thermal images. Then, through the attention-based feature interaction and serial multiscale dilated convolution (SDC) based feature fusion modules, the proposed model achieves the complementary interaction of low-level features and the semantic fusion of deep features. Finally, based on the mirror complementary structure, the salient regions of the two modalities can be accurately extracted even when one modality is invalid. To demonstrate the robustness of the proposed model under challenging real-world scenes, we build a novel RGB-T SOD dataset, VT723, based on a large public semantic segmentation RGB-T dataset used in the autonomous driving domain. Extensive experiments on benchmark and VT723 datasets show that the proposed method outperforms state-of-the-art approaches, including CNN-based and Transformer-based methods. The code and dataset will be released later at https://github.com/jxr326/SwinMCNet.

preprint, 2022, arXiv

Noise and Edge Based Dual Branch Image Manipulation Detection

Unlike ordinary computer vision tasks that focus more on the semantic content of images, the image manipulation detection task pays more attention to the subtle information of image manipulation. In this paper, the noise image extracted by the improved constrained convolution is used as the input of the model instead of the original image to obtain more subtle traces of manipulation. Meanwhile, the dual-branch network, consisting of a high-resolution branch and a context branch, is used to capture the traces of artifacts as much as possible. In general, most manipulation leaves manipulation artifacts on the manipulation edge. A specially designed manipulation edge detection module is constructed based on the dual-branch network to identify these artifacts better. The correlation between pixels in an image is closely related to their distance: the farther apart two pixels are, the weaker their correlation. We add a distance factor to the self-attention module to better describe the correlation between pixels. Experimental results on four publicly available image manipulation datasets demonstrate the effectiveness of our model.
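
The idea of adding a distance factor to self-attention can be sketched as follows. The abstract does not give the exact formulation, so this is only one plausible form, assumed for illustration: the attention logit between pixels i and j is reduced by `decay` times their Euclidean distance. The function name, the `decay` parameter, and the additive penalty are all hypothetical.

```python
import math

def distance_aware_attention(queries, keys, values, positions, decay=0.5):
    """Self-attention whose logits are penalized by pixel distance.

    Assumed form (not taken from the paper):
        logit(i, j) = q_i . k_j / sqrt(d) - decay * dist(pos_i, pos_j)
    """
    dim = len(queries[0])
    outputs = []
    for q, pos_q in zip(queries, positions):
        logits = []
        for k, pos_k in zip(keys, positions):
            score = sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            logits.append(score - decay * math.dist(pos_q, pos_k))
        # Numerically stable softmax over all key positions.
        peak = max(logits)
        weights = [math.exp(x - peak) for x in logits]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs
```

With `decay=0` the function reduces to ordinary scaled dot-product attention, which makes the effect of the distance term easy to isolate in an experiment.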

preprint, 2022, arXiv

Proposal for the search for exotic spin-spin interactions at the micrometer scale using functionalized cantilever force sensors

Spin-dependent exotic interactions can be generated by exchanging hypothetical bosons, which were introduced to solve some puzzles in physics. Many precision experiments have been performed to search for such interactions, but no confirmed observation has been made. Here, we propose new experiments to search for the exotic spin-spin interactions that can be mediated by axions or Z$^\prime$ bosons. A sensitive functionalized cantilever is utilized as a force sensor to measure the interactions between the spin-polarized electrons in a periodic magnetic source structure and a closed-loop magnetic structure integrated on the cantilever. The source is set to oscillate during data acquisition to modulate the exotic force signal to high harmonics of the oscillating frequency. This helps to suppress the spurious signals at the signal frequency. Different magnetic source structures are designed for different interaction detections. A magnetic stripe structure is designed for Z$^\prime$-mediated interaction, which is insensitive to the detection of axion-mediated interaction. This allows us to measure the coupling constants of both if we assume both exist. With the force sensitivity achievable at low temperature, the proposed experiments are expected to probe parameter space with much smaller coupling constants than the current stringent constraints from the micrometer to millimeter range. Specifically, the lower bound of the parameter space will be seven orders of magnitude lower than the stringent constraints for Z$^\prime$-mediated interaction, and an order of magnitude lower for axion-mediated interaction, at the interaction range of $10\,\mu$m.

preprint, 2022, arXiv

Temporal Up-Sampling for Asynchronous Events

The event camera is a novel bio-inspired vision sensor. When the brightness change exceeds the preset threshold, the sensor generates events asynchronously. The number of valid events directly affects the performance of event-based tasks, such as reconstruction, detection, and recognition. However, in low-brightness or slow-moving scenes, events are often sparse and accompanied by noise, which poses challenges for event-based tasks. To solve these challenges, we propose an event temporal up-sampling algorithm to generate more effective and reliable events. The main idea of our algorithm is to generate up-sampling events on the event motion trajectory. First, we estimate the event motion trajectory using a contrast maximization algorithm and then up-sample the events using temporal point processes. Experimental results show that up-sampling events can provide more effective information and improve the performance of downstream tasks, such as improving the quality of reconstructed images and increasing the accuracy of object detection.
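
The thresholding behavior described at the start of this abstract can be sketched for a single pixel. This is an idealized, commonly used event-camera model rather than the paper's up-sampling algorithm: it assumes events fire on changes in log intensity, and the function name, default threshold, and sample-based timing are hypothetical.

```python
import math

def generate_events(times, intensities, threshold=0.2):
    """Emit (timestamp, polarity) events for one pixel.

    Idealized model (an assumption): an event fires each time the log
    intensity moves by at least `threshold` from the level stored at the
    previous event. Intensities must be positive.
    """
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have equal length")
    events = []
    if not times:
        return events
    reference = math.log(intensities[0])
    for t, value in zip(times[1:], intensities[1:]):
        delta = math.log(value) - reference
        # A large jump can cross the threshold several times in one sample.
        while abs(delta) >= threshold:
            polarity = 1 if delta > 0 else -1
            events.append((t, polarity))
            reference += polarity * threshold
            delta = math.log(value) - reference
    return events
```

Under this model a slowly changing or dim pixel rarely crosses the threshold, which is consistent with the sparsity in low-brightness and slow-moving scenes that the abstract identifies as the motivation for up-sampling.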

preprint, 2020, arXiv

Smart Prediction of the Complaint Hotspot Problem in Mobile Network

In mobile networks, a complaint hotspot problem often affects service for thousands of users and leads to significant economic losses and bulk complaints. In this paper, we propose an approach to predict a customer complaint based on real-time user signalling data. By analyzing the network and the user service procedure, 30 key data fields related to user experience have been extracted from XDR data collected from the S1 interface. Furthermore, we augment these basic features with derived features for user experience evaluation, such as one-hot features, statistical features, and differential features. Considering the problem of unbalanced data, we use LightGBM as our prediction model. LightGBM has strong generalization ability and was designed to handle unbalanced data. Our experiments demonstrate the effectiveness and efficiency of this proposal. This approach has been deployed in daily routine operation to locate the scope of hot complaint problems as well as to report affected users and areas.
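
The three kinds of derived features named in this abstract can be illustrated with a small sketch. The actual 30 data fields and the derivation rules are not given, so the function name, the input layout, and the specific statistics chosen here are assumptions for illustration only.

```python
from statistics import mean, pstdev

def derive_features(series, category, categories):
    """Build one-hot, statistical, and differential features for one user.

    `series` is a hypothetical sequence of one numeric field over time;
    `category` is a hypothetical categorical field with known `categories`.
    """
    # One-hot feature: a 1 in the slot matching the category, 0 elsewhere.
    one_hot = [1 if category == c else 0 for c in categories]
    # Statistical features: summary of the numeric field over the window.
    stats = [mean(series), pstdev(series), min(series), max(series)]
    # Differential features: change between consecutive measurements.
    diffs = [b - a for a, b in zip(series, series[1:])]
    return {"one_hot": one_hot, "stats": stats, "diffs": diffs}
```

The resulting values could be flattened into one row per user and passed to a gradient-boosted classifier; the class-imbalance handling mentioned in the abstract is outside the scope of this sketch.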