Source author record

Tao Cheng

Tao Cheng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, Artificial Intelligence, Computation and Language, hep-ph, math.CV, math.DS, math.GT, math.QA, math.RA

Catalog footprint

What is connected

6 works

9 topics

4 close collaborators

preprint, 2026, arXiv

Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework

Geolocation, the task of identifying an image's location, requires complex reasoning and is crucial for navigation, monitoring, and cultural preservation. However, current methods often produce coarse, imprecise, and non-interpretable localization. A major challenge lies in the quality and scale of existing geolocation datasets. These datasets are typically small-scale and automatically constructed, leading to noisy data and inconsistent task difficulty, with images that either reveal answers too easily or lack sufficient clues for reliable inference. To address these challenges, we introduce a comprehensive geolocation framework with three key components: GeoComp, a large-scale dataset; GeoCoT, a novel reasoning method; and GeoEval, an evaluation metric, collectively designed to address critical challenges and drive advancements in geolocation research. At the core of this framework is GeoComp (Geolocation Competition Dataset), a large-scale dataset collected from a geolocation game platform involving 740K users over two years. It comprises 25 million entries of metadata and 3 million geo-tagged locations spanning much of the globe, with each location annotated thousands to tens of thousands of times by human users. The dataset offers diverse difficulty levels for detailed analysis and highlights key gaps in current models. Building on this dataset, we propose Geographical Chain-of-Thought (GeoCoT), a novel multi-step reasoning framework designed to enhance the reasoning capabilities of Large Vision Models (LVMs) in geolocation tasks. GeoCoT improves performance by integrating contextual and spatial cues through a multi-step process that mimics human geolocation reasoning. Finally, using the GeoEval metric, we demonstrate that GeoCoT significantly boosts geolocation accuracy by up to 25% while enhancing interpretability.

preprint, 2026, arXiv

Vision-Language Reasoning for Geolocalization: A Reinforcement Learning Approach

Recent advances in vision-language models have opened up new possibilities for reasoning-driven image geolocalization. However, existing approaches often rely on synthetic reasoning annotations or external image retrieval, which can limit interpretability and generalizability. In this paper, we present Geo-R, a retrieval-free framework that uncovers structured reasoning paths from existing ground-truth coordinates and optimizes geolocation accuracy via reinforcement learning. We propose the Chain of Region, a rule-based hierarchical reasoning paradigm that generates precise, interpretable supervision by mapping GPS coordinates to geographic entities (e.g., country, province, city) without relying on model-generated or synthetic labels. Building on this, we introduce a lightweight reinforcement learning strategy with coordinate-aligned rewards based on Haversine distance, enabling the model to refine predictions through spatially meaningful feedback. Our approach bridges structured geographic reasoning with direct spatial supervision, yielding improved localization accuracy, stronger generalization, and more transparent inference. Experimental results across multiple benchmarks confirm the effectiveness of Geo-R, establishing a new retrieval-free paradigm for scalable and interpretable image geolocalization. To facilitate further research and ensure reproducibility, both the model and code will be made publicly available.
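
The coordinate-aligned reward mentioned in this abstract can be illustrated with a minimal sketch. The Haversine formula itself is standard; the reward shape (exponential decay in distance), the function names and the `scale_km` parameter are assumptions made for illustration and are not taken from the Geo-R paper.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def distance_reward(pred, target, scale_km=750.0):
    """Hypothetical reward in (0, 1]: 1.0 for an exact hit, decaying with distance.

    The exponential form and scale_km are illustrative assumptions,
    not the reward used by Geo-R.
    """
    d = haversine_km(pred[0], pred[1], target[0], target[1])
    return math.exp(-d / scale_km)


paris = (48.8566, 2.3522)
london = (51.5074, -0.1278)
print(round(haversine_km(*paris, *london), 1))   # distance in km
print(round(distance_reward(london, paris), 3))  # reward for predicting London when the truth is Paris
```

A smooth, monotonically decreasing reward of this kind gives a policy partial credit for near misses, which is one plausible reading of "spatially meaningful feedback".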

preprint, 2021, arXiv

CyclingNet: Detecting cycling near misses from video streams in complex urban scenes with deep learning

Cycling is a promising sustainable mode for commuting and leisure in cities; however, the fear of being hit or falling limits its wider adoption as a commuting mode. In this paper, we introduce a novel method called CyclingNet for detecting cycling near misses from video streams generated by a front-mounted camera on a bike, regardless of the camera position, the conditions of the built environment, or the visual conditions, and without any restrictions on the riding behaviour. CyclingNet is a deep computer vision model based on a convolutional structure embedded with self-attention bidirectional long short-term memory (LSTM) blocks that aim to understand near misses from both sequential images of scenes and their optical flows. The model is trained on scenes of both safe rides and near misses. After 42 hours of training on a single GPU, the model shows high accuracy on the training, testing and validation sets. The model is intended to generate information from which significant conclusions about cycling behaviour in cities and elsewhere can be drawn, which could help planners and policy-makers better understand the safety measures required when designing infrastructure or drawing up policies. As for future work, the model can be pipelined with other state-of-the-art classifiers and object detectors to understand the causality of near misses based on factors related to interactions of road users and the built and natural environments.

preprint, 2021, arXiv

Long-term LHC Discovery Reach for Compressed Higgsino-like Models using VBF Processes

The identity of Dark Matter (DM) is one of the most active topics in particle physics today. Supersymmetry (SUSY) is an extension of the standard model (SM) that could describe the particle nature of DM in the form of the lightest neutralino in R-parity conserving models. We focus on SUSY models that solve the hierarchy problem with small fine tuning, and where the lightest SUSY particles ($\tilde{\chi}_{1}^{0}$, $\tilde{\chi}_{1}^{\pm}$, $\tilde{\chi}_{2}^{0}$) are a triplet of higgsino-like states, such that the mass difference $\Delta m(\tilde{\chi}^{0}_{2},\tilde{\chi}^{0}_{1})$ is 2-50 GeV. We perform a feasibility study to assess the long-term discovery potential for these compressed SUSY models with higgsino-like states, using vector boson fusion (VBF) processes in the context of proton-proton collisions at $\sqrt{s} = 13$ TeV, at the CERN Large Hadron Collider. Assuming an integrated luminosity of 3000 fb$^{-1}$, we find that stringent VBF requirements, combined with large missing momentum and one or two low-$p_{T}$ leptons, are effective at reducing the major SM backgrounds, leading to a 5$\sigma$ (3$\sigma$) discovery reach for $m(\tilde{\chi}^{0}_{2}) < 180$ $(260)$ GeV, and a projected 95\% confidence level exclusion region that covers $m(\tilde{\chi}^{0}_{2})$ up to 385 GeV, parameter space that is currently unconstrained by other experiments.

preprint, 2016, arXiv

Generalized Clifford Algebras as Algebras in Suitable Symmetric Linear Gr-Categories

By viewing Clifford algebras as algebras in some suitable symmetric Gr-categories, Albuquerque and Majid were able to give a new derivation of some well known results about Clifford algebras and to generalize them. Along the same line, Bulacu observed that Clifford algebras are weak Hopf algebras in the aforementioned categories and obtained other interesting properties. The aim of this paper is to study generalized Clifford algebras in a similar manner and extend the results of Albuquerque, Majid and Bulacu to the generalized setting. In particular, by taking full advantage of the gauge transformations in symmetric linear Gr-categories, we derive the decomposition theorem and provide categorical weak Hopf structures for generalized Clifford algebras in a conceptual and simpler manner.

preprint, 2012, arXiv

Geometrization of sub-hyperbolic semi-rational branched coverings

Given a sub-hyperbolic semi-rational branched covering which is not CLH-equivalent to a rational map, it must have a non-empty canonical Thurston obstruction. By using this canonical Thurston obstruction, we decompose this dynamical system in this paper into several sub-dynamical systems. Each of these sub-dynamical systems is either a post-critically finite type branched covering or a sub-hyperbolic semi-rational type branched covering. If a sub-dynamical system is a post-critically finite type branched covering with a hyperbolic orbifold, then it has no Thurston obstruction and is combinatorially equivalent to a unique post-critically finite rational map (up to conjugation by an automorphism of the Riemann sphere). More importantly, if a sub-dynamical system is a sub-hyperbolic semi-rational type branched covering with a hyperbolic orbifold, we prove in this paper that it has no Thurston obstruction and is CLH-equivalent to a unique geometrically finite rational map (up to conjugation by an automorphism of the Riemann sphere).