Researcher profile

Tao Cheng

Tao Cheng contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
4 topics
4 close collaborators

Published work

4 published items

Preprint · 2026 · arXiv

Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework

Geolocation, the task of identifying an image's location, requires complex reasoning and is crucial for navigation, monitoring, and cultural preservation. However, current methods often produce coarse, imprecise, and non-interpretable localization. A major challenge lies in the quality and scale of existing geolocation datasets. These datasets are typically small-scale and automatically constructed, leading to noisy data and inconsistent task difficulty, with images that either reveal answers too easily or lack sufficient clues for reliable inference. To address these challenges, we introduce a comprehensive geolocation framework with three key components: GeoComp, a large-scale dataset; GeoCoT, a novel reasoning method; and GeoEval, an evaluation metric, collectively designed to address critical challenges and drive advancements in geolocation research. At the core of this framework is GeoComp (Geolocation Competition Dataset), a large-scale dataset collected from a geolocation game platform involving 740K users over two years. It comprises 25 million entries of metadata and 3 million geo-tagged locations spanning much of the globe, with each location annotated thousands to tens of thousands of times by human users. The dataset offers diverse difficulty levels for detailed analysis and highlights key gaps in current models. Building on this dataset, we propose Geographical Chain-of-Thought (GeoCoT), a novel multi-step reasoning framework designed to enhance the reasoning capabilities of Large Vision Models (LVMs) in geolocation tasks. GeoCoT improves performance by integrating contextual and spatial cues through a multi-step process that mimics human geolocation reasoning. Finally, using the GeoEval metric, we demonstrate that GeoCoT significantly boosts geolocation accuracy by up to 25% while enhancing interpretability.

Preprint · 2026 · arXiv

Vision-Language Reasoning for Geolocalization: A Reinforcement Learning Approach

Recent advances in vision-language models have opened up new possibilities for reasoning-driven image geolocalization. However, existing approaches often rely on synthetic reasoning annotations or external image retrieval, which can limit interpretability and generalizability. In this paper, we present Geo-R, a retrieval-free framework that uncovers structured reasoning paths from existing ground-truth coordinates and optimizes geolocation accuracy via reinforcement learning. We propose the Chain of Region, a rule-based hierarchical reasoning paradigm that generates precise, interpretable supervision by mapping GPS coordinates to geographic entities (e.g., country, province, city) without relying on model-generated or synthetic labels. Building on this, we introduce a lightweight reinforcement learning strategy with coordinate-aligned rewards based on Haversine distance, enabling the model to refine predictions through spatially meaningful feedback. Our approach bridges structured geographic reasoning with direct spatial supervision, yielding improved localization accuracy, stronger generalization, and more transparent inference. Experimental results across multiple benchmarks confirm the effectiveness of Geo-R, establishing a new retrieval-free paradigm for scalable and interpretable image geolocalization. To facilitate further research and ensure reproducibility, both the model and code will be made publicly available.
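The abstract above mentions coordinate-aligned rewards based on Haversine distance. As a rough illustration only, the Haversine great-circle distance can be computed as follows. The `distance_reward` function and its `scale_km` parameter are hypothetical, chosen here to show one way a distance could be turned into a bounded reward; they are not taken from the Geo-R paper, whose actual reward design is not described on this page.

```python
import math

# Mean Earth radius in kilometres (a common convention, not from the paper).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def distance_reward(pred, target, scale_km=1000.0):
    """Hypothetical reward in (0, 1]: 1 at zero error, decaying with distance.

    pred and target are (lat, lon) tuples in degrees. The exponential form
    and the default scale are illustrative assumptions.
    """
    return math.exp(-haversine_km(*pred, *target) / scale_km)


if __name__ == "__main__":
    paris = (48.8566, 2.3522)
    london = (51.5074, -0.1278)
    print(haversine_km(*paris, *london))   # distance in km
    print(distance_reward(paris, london))  # closer guesses score higher
```

A smooth, monotonically decreasing reward of this kind gives a larger signal for predictions that land nearer the ground-truth coordinates, which is the general idea behind spatially meaningful feedback.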

Preprint · 2021 · arXiv

CyclingNet: Detecting cycling near misses from video streams in complex urban scenes with deep learning

Cycling is a promising sustainable mode for commuting and leisure in cities; however, the fear of being hit or falling limits its wider adoption as a commuting mode. In this paper, we introduce a novel method called CyclingNet for detecting cycling near misses from video streams generated by a frontal camera mounted on a bike, regardless of the camera position, the conditions of the built environment, or the visual conditions, and without any restrictions on riding behaviour. CyclingNet is a deep computer vision model based on a convolutional structure embedded with self-attention bidirectional long short-term memory (LSTM) blocks that aim to understand near misses from both sequential images of scenes and their optical flows. The model is trained on scenes of both safe rides and near misses. After 42 hours of training on a single GPU, the model shows high accuracy on the training, testing and validation sets. The model is intended to generate information that can support conclusions about cycling behaviour in cities and elsewhere, which could help planners and policy-makers better understand the safety measures required when designing infrastructure or drafting policies. As for future work, the model can be pipelined with other state-of-the-art classifiers and object detectors to understand the causality of near misses based on factors related to interactions of road users and the built and natural environments.

Preprint · 2021 · arXiv

Long-term LHC Discovery Reach for Compressed Higgsino-like Models using VBF Processes

The identity of Dark Matter (DM) is one of the most active topics in particle physics today. Supersymmetry (SUSY) is an extension of the standard model (SM) that could describe the particle nature of DM in the form of the lightest neutralino in R-parity conserving models. We focus on SUSY models that solve the hierarchy problem with small fine tuning, and where the lightest SUSY particles ($\tilde{\chi}_{1}^{0}$, $\tilde{\chi}_{1}^{\pm}$, $\tilde{\chi}_{2}^{0}$) are a triplet of higgsino-like states, such that the mass difference $\Delta m(\tilde{\chi}^{0}_{2},\tilde{\chi}^{0}_{1})$ is 2-50 GeV. We perform a feasibility study to assess the long-term discovery potential for these compressed SUSY models with higgsino-like states, using vector boson fusion (VBF) processes in the context of proton-proton collisions at $\sqrt{s} = 13$ TeV at the CERN Large Hadron Collider. Assuming an integrated luminosity of 3000 fb$^{-1}$, we find that stringent VBF requirements, combined with large missing momentum and one or two low-$p_{T}$ leptons, are effective at reducing the major SM backgrounds, leading to a 5$\sigma$ (3$\sigma$) discovery reach for $m(\tilde{\chi}^{0}_{2}) < 180$ $(260)$ GeV, and a projected 95% confidence level exclusion region that covers $m(\tilde{\chi}^{0}_{2})$ up to 385 GeV, parameter space that is currently unconstrained by other experiments.