Source author record

Jinpeng Chen

Jinpeng Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Computer Vision, Artificial Intelligence, eess.IV, Machine Learning, Robotics

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

Preprint, 2026, arXiv

LISA: Language-guided Interference-aware Spatial-Frequency Attention for Driver Gaze Estimation

Driver gaze estimation serves as a fundamental metric for evaluating driver attentiveness in modern monitoring systems. Beyond being vulnerable to sudden lighting changes and sensor noise, spatial-domain models struggle to disentangle authentic gaze cues from irrelevant visual attributes. In this paper, we propose LISA, a Language-guided Interference-aware Spatial-Frequency Attention framework that combines frequency-domain priors with vision-language knowledge. Observing that the amplitude spectrum remains relatively stable even under spatial perturbations, we design a dual-domain fusion mechanism. It integrates stable low-frequency semantics into high-frequency details, employing spatial attention to precisely target ocular regions. To reduce semantic ambiguity, we also introduce a training-time disentanglement strategy. Using a frozen CLIP encoder and orthogonal regularization, we explicitly separate gaze features from appearance interference. Experiments on two benchmarks show that LISA achieves state-of-the-art performance, with significantly improved robustness against occlusions and lighting variations. The code repository is available at https://github.com/Mason-bupt/LISA.

Preprint, 2026, arXiv

See Silhouettes in Motion with Neuromorphic Vision

Quasi-bimodal objects, such as text, road signs, and barcodes, play a basic yet vital role in daily visual communication. By boiling these down to clear silhouettes, binarization uses a minimal language to convey essential vision cues for maximum downstream efficiency. The catch is that frame-based imaging often struggles on mobile platforms like drones, self-driving cars, and underwater vehicles. In these dynamic scenes, rapid motion and harsh lighting can blind it, causing severe motion blur and erasing crucial details. To overcome these limits, neuromorphic vision via event cameras, featuring microsecond-level temporal resolution and high dynamic range, steps in as a natural solution. Building upon this event-driven sensing paradigm, we introduce a simple yet effective dual-modal approach that harnesses the synergy between frames and events to achieve real-time, high-frame-rate binarization on CPU-only devices. Extensive evaluations show that it achieves competitive performance against leading techniques in reducing motion blur, while delivering impressive improvements under challenging illumination. In addition, our asynchronous workflow bypasses the event scarcity that breaks traditional time-binning reconstruction, maintaining clear target shapes even at extreme kilohertz frame rates. Its binary results further serve as reliable representations that facilitate a range of downstream tasks. This work paves the way towards lightweight perception and interaction in embodied intelligence on resource-constrained edge platforms.

Preprint, 2022, arXiv

Adaptive Graph Diffusion Networks

Graph Neural Networks (GNNs) have received much attention in the graph deep learning domain. However, recent research empirically and theoretically shows that deep GNNs suffer from over-fitting and over-smoothing problems. The usual solutions either fail to address the long runtime of deep GNNs or restrict graph convolution to a single feature space. We propose the Adaptive Graph Diffusion Networks (AGDNs), which perform multi-layer generalized graph diffusion in different feature spaces with moderate complexity and runtime. Standard graph diffusion methods combine large and dense powers of the transition matrix with predefined weighting coefficients. Instead, AGDNs combine smaller multi-hop node representations with learnable and generalized weighting coefficients. We propose two scalable mechanisms of weighting coefficients to capture multi-hop information: Hop-wise Attention (HA) and Hop-wise Convolution (HC). We evaluate AGDNs on diverse, challenging Open Graph Benchmark (OGB) datasets with semi-supervised node classification and link prediction tasks. As of the date of submission (Aug 26, 2022), AGDNs achieve top-1 performance on the ogbn-arxiv, ogbn-proteins and ogbl-ddi datasets and top-3 performance on the ogbl-citation2 dataset. On similar Tesla V100 GPU cards, AGDNs outperform Reversible GNNs (RevGNNs) with 13% of the complexity and 1% of the training runtime of RevGNNs on the ogbn-proteins dataset. AGDNs also achieve comparable performance to SEAL with 36% of the training runtime and 0.2% of the inference runtime of SEAL on the ogbl-citation2 dataset.
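
The contrast the abstract draws, combining multi-hop node representations rather than dense powers of the transition matrix, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (normalize_rows, multi_hop, combine_hops) and the single softmax weight per hop are assumptions made for illustration. The paper's Hop-wise Attention and Hop-wise Convolution mechanisms are more general, and the weights would be learned during training rather than fixed.

```python
import math

def normalize_rows(adj):
    """Row-normalize an adjacency matrix into a transition matrix T."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def matmul(a, b):
    """Plain dense matrix product for small list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def multi_hop(transition, features, num_hops):
    """Return [H0, ..., HK] with Hk = T^k X.

    Each hop reuses the previous one (Hk = T @ Hk-1), so the dense
    matrix power T^k is never formed explicitly.
    """
    hops = [features]
    for _ in range(num_hops):
        hops.append(matmul(transition, hops[-1]))
    return hops

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def combine_hops(hops, hop_logits):
    """Weighted sum of hop representations with weights softmax(hop_logits).

    Assumption: one scalar weight per hop, shared by all nodes and channels.
    """
    weights = softmax(hop_logits)
    n, d = len(hops[0]), len(hops[0][0])
    return [[sum(w * h[i][j] for w, h in zip(weights, hops))
             for j in range(d)] for i in range(n)]

# Toy example: 3-node path graph with self-loops, one feature per node.
T = normalize_rows([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
hops = multi_hop(T, [[1.0], [0.0], [0.0]], num_hops=2)
out = combine_hops(hops, hop_logits=[0.0, 0.0, 0.0])  # equal weights
```

With equal logits the result is the plain average of the 0-, 1- and 2-hop representations. Making hop_logits trainable is the step that would turn the predefined coefficients of standard diffusion into learned ones.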

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint