Researcher profile

Yu Qin

Yu Qin contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 19 (Unverified), Verification L1, Unclaimed author
5 works
0 followers
4 topics
4 close collaborators

Published work

5 published items

Preprint, 2025, arXiv

Learnable Query Aggregation with KV Routing for Cross-view Geo-localisation

Cross-view geo-localisation (CVGL) aims to estimate the geographic location of a query image by matching it with images from a large-scale database. However, significant viewpoint discrepancies present considerable challenges for effective feature aggregation and alignment. To address these challenges, we propose a novel CVGL system that incorporates three key improvements. Firstly, we leverage the DINOv2 backbone with convolution adapter fine-tuning to enhance model adaptability to cross-view variations. Secondly, we propose a multi-scale channel reallocation module to strengthen the diversity and stability of spatial representations. Finally, we propose an improved aggregation module that integrates Mixture-of-Experts (MoE) routing into the feature aggregation process. Specifically, the module dynamically selects expert subspaces for the keys and values in a cross-attention framework, enabling adaptive processing of heterogeneous input domains. Extensive experiments on the University-1652 and SUES-200 datasets demonstrate that our method achieves competitive performance with fewer trained parameters.

Preprint, 2022, arXiv

Mesoscopic Collective Activity in Excitatory Neural Fields: Cross-frequency Coupling

In the brain, cross-frequency coupling has been hypothesized to result from the activity of specialized microcircuits. For example, theta-gamma coupling is assumed to be generated by specialized cell pairs (PING and ING mechanisms), or special cells (e.g., fast bursting neurons). However, this implies that the generating mechanism is uniquely specific to the brain. In fact, cross-scale coupling is a phenomenon encountered in the physics of all large, multi-scale systems: phase and amplitude correlations between components of different scales arise as a result of nonlinear interaction. Because the brain is a multi-scale system too, a similar mechanism must be active in the brain. Here, we represent brain activity as a superposition of nonlinearly interacting patterns of spatio-temporal activity (collective activity), supported by populations of neurons. Cross-frequency coupling is a direct consequence of the nonlinear interactions, and does not require specialized cells or cell pairs. It is therefore universal, and must be active in neural fields of any composition. To emphasize this, we demonstrate the phenomenon in excitatory fields. While there is no doubt that specialized cells play a role in theta-gamma coupling, our results suggest that the coupling mechanism is at the same time simpler and richer: simpler because it involves the universal principle of nonlinearity; richer because the nonlinearity of collective activity is likely modulated by specialized-cell populations in ways yet to be understood.
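
The abstract's central point, that nonlinear interaction alone creates correlated components at new frequencies, can be illustrated with a toy signal. The sketch below is not the paper's neural-field model: it assumes two sinusoids at 6 Hz and 40 Hz (illustrative stand-ins for theta and gamma), a hypothetical quadratic interaction term, and a hypothetical helper `amplitude_at`. It compares the amplitude at the sum frequency (46 Hz) before and after the nonlinearity is applied.

```python
import math


def amplitude_at(signal, freq_hz, sample_rate):
    """Amplitude of the component at freq_hz, from a single-bin Fourier sum."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n


SAMPLE_RATE = 1000           # samples per second (assumed)
F_SLOW, F_FAST = 6.0, 40.0   # illustrative "theta" and "gamma" frequencies (assumed)

# One second of samples, so every integer frequency falls on an exact Fourier bin.
times = [i / SAMPLE_RATE for i in range(SAMPLE_RATE)]

# Linear superposition: only the two original frequencies are present.
linear = [math.cos(2 * math.pi * F_SLOW * t) + math.cos(2 * math.pi * F_FAST * t)
          for t in times]

# Toy quadratic interaction (assumed form): x + 0.5 * x^2.
nonlinear = [x + 0.5 * x * x for x in linear]

linear_sideband = amplitude_at(linear, F_FAST + F_SLOW, SAMPLE_RATE)
nonlinear_sideband = amplitude_at(nonlinear, F_FAST + F_SLOW, SAMPLE_RATE)

print(f"46 Hz amplitude, linear:    {linear_sideband:.6f}")
print(f"46 Hz amplitude, nonlinear: {nonlinear_sideband:.6f}")
```

The product term in the squared signal, 2 cos(a) cos(b) = cos(a + b) + cos(a - b), is what puts energy at 46 Hz and 34 Hz. The phases of these new components are fixed by the phases of the original two, which is the sense in which the components become coupled without any specialized generator.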

Preprint, 2022, arXiv

Mesoscopic Collective Activity in Excitatory Neural Fields: Governing Equations

In this study we derive the governing equations for mesoscopic collective activity in the cortex, starting from the generic Hodgkin-Huxley equations for microscopic cell dynamics. For simplicity, and to maintain focus on the essential elements of the derivation, the discussion is confined to excitatory neural fields. The fundamental assumption of the procedure is that mesoscale processes are macroscopic with respect to cell-scale activity, and emerge as the average behavior of a large population of cells. Because of their duration, action-potential details are assumed not observable at mesoscale; the essential mesoscopic function of action potentials is to redistribute energy in the neural field. The Hodgkin-Huxley dynamical model is first reduced to a set of equations that describe subthreshold dynamics. An ensemble average over a cell population then produces a closed system of equations involving two mesoscopic state variables: the density of kinetic energy J, carried by sodium ionic currents, and the excitability H of the neural field, which could be described as the average state of gating variable h. The resulting model is represented as essentially a subthreshold process, and the dynamical role of the firing rate is naturally reassessed as describing energy transfers. The linear properties of the equations are consistent with expectations for the dynamics of excitatory neural fields: the system supports oscillations of progressive waves, with shorter waves typically having higher frequencies, propagating slower, and decaying faster. Extending the derivation to include more complex cell dynamics (e.g., other ionic channels such as calcium channels) and multiple-type, excitatory-inhibitory neural fields is straightforward, and will be presented elsewhere.

Preprint, 2022, arXiv

PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting

Crowd counting aims to learn the crowd density distributions and estimate the number of objects (e.g., persons) in images. The perspective effect, which significantly influences the distribution of data points, plays an important role in crowd counting. In this paper, we propose a novel perspective-aware approach called PANet to address the perspective problem. Based on the observation that the size of the objects varies greatly within one image due to the perspective effect, we propose the dynamic receptive fields (DRF) framework. The framework adjusts the receptive field through the dilated convolution parameters according to the input image, which helps the model extract more discriminative features for each local region. Unlike most previous works, which use Gaussian kernels to generate the density map as the supervised information, we propose the self-distilling supervision (SDS) training method. The ground-truth density maps are refined from the first training stage and the perspective information is distilled to the model in the second stage. The experimental results on the ShanghaiTech Part_A and Part_B, UCF_QNRF, and UCF_CC_50 datasets demonstrate that our proposed PANet outperforms the state-of-the-art methods by a large margin.
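
The abstract does not say how PANet chooses its dilation parameters, so the sketch below covers only the standard relation that makes dilation a useful control: how the dilation rate of a stride-1 convolution stack sets its receptive field. The function names and the two example stacks are hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span, in input pixels, covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of a stack of stride-1 convolutions.

    layers: iterable of (kernel_size, dilation) pairs, applied in order.
    """
    field = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective kernel size - 1).
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


# Hypothetical settings: larger dilation for large (near-camera) heads,
# dilation 1 for small (far-from-camera) heads.
near_camera = receptive_field([(3, 3), (3, 3), (3, 3)])
far_from_camera = receptive_field([(3, 1), (3, 1), (3, 1)])

print("near-camera receptive field:", near_camera)
print("far-from-camera receptive field:", far_from_camera)
```

Because the parameter count of a 3x3 kernel does not depend on its dilation, changing the dilation rate widens or narrows the region each output pixel sees at no extra parameter cost.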

Preprint, 2021, arXiv

Deconfounded Visual Grounding

We focus on the confounding bias between language and location in the visual grounding pipeline, where we find that the bias is the major visual reasoning bottleneck. For example, the grounding process is usually a trivial language-location association without visual reasoning, e.g., grounding any language query containing "sheep" to the nearly central regions, because most queries about sheep have ground-truth locations at the image center. First, we frame the visual grounding pipeline as a causal graph, which shows the causalities among image, query, target location and underlying confounder. Through the causal graph, we know how to break the grounding bottleneck: deconfounded visual grounding. Second, to tackle the challenge that the confounder is unobserved in general, we propose a confounder-agnostic approach called Referring Expression Deconfounder (RED) to remove the confounding bias. Third, we implement RED as a simple language attention, which can be applied in any grounding method. On popular benchmarks, RED improves various state-of-the-art grounding methods by a significant margin. Code will soon be available at: https://github.com/JianqiangH/Deconfounded_VG.