Source author record

Yiming Hu

Yiming Hu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Artificial Intelligence Computation and Language Machine Learning Neural and Evolutionary Computing physics.gen-ph physics.ins-det

Catalog footprint

What is connected

5works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

The image geolocalization task aims to predict the location where an image was taken anywhere on Earth using visual clues. Existing large vision-language model (LVLM) approaches leverage world knowledge, chain-of-thought reasoning, and agentic capabilities, but overlook a common strategy used by humans -- using maps. In this work, we first equip the model \textit{Thinking with Map} ability and formulate it as an agent-in-the-map loop. We develop a two-stage optimization scheme for it, including agentic reinforcement learning (RL) followed by parallel test-time scaling (TTS). The RL strengthens the agentic capability of model to improve sampling efficiency, and the parallel TTS enables the model to explore multiple candidate paths before making the final prediction, which is crucial for geolocalization. To evaluate our method on up-to-date and in-the-wild images, we further present MAPBench, a comprehensive geolocalization training and evaluation benchmark composed entirely of real-world images. Experimental results show that our method outperforms existing open- and closed-source models on most metrics, specifically improving Acc@500m from 8.0\% to 22.1\% compared to \textit{Gemini-3-Pro} with Google Search/Map grounded mode.

preprint2023arXiv

MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices

We present MobileVLM, a competent multimodal vision language model (MMVLM) targeted to run on mobile devices. It is an amalgamation of a myriad of architectural designs and techniques that are mobile-oriented, which comprises a set of language models at the scale of 1.4B and 2.7B parameters, trained from scratch, a multimodal vision model that is pre-trained in the CLIP fashion, cross-modality interaction via an efficient projector. We evaluate MobileVLM on several typical VLM benchmarks. Our models demonstrate on par performance compared with a few much larger models. More importantly, we measure the inference speed on both a Qualcomm Snapdragon 888 CPU and an NVIDIA Jeston Orin GPU, and we obtain state-of-the-art performance of 21.5 tokens and 65.3 tokens per second, respectively. Our code will be made available at: https://github.com/Meituan-AutoML/MobileVLM.

preprint2020arXiv

Angle-based Search Space Shrinking for Neural Architecture Search

In this work, we present a simple and general search space shrinking method, called Angle-Based search space Shrinking (ABS), for Neural Architecture Search (NAS). Our approach progressively simplifies the original search space by dropping unpromising candidates, thus can reduce difficulties for existing NAS methods to find superior architectures. In particular, we propose an angle-based metric to guide the shrinking process. We provide comprehensive evidences showing that, in weight-sharing supernet, the proposed metric is more stable and accurate than accuracy-based and magnitude-based metrics to predict the capability of child models. We also show that the angle-based metric can converge fast while training supernet, enabling us to get promising shrunk search spaces efficiently. ABS can easily apply to most of NAS approaches (e.g. SPOS, FairNAS, ProxylessNAS, DARTS and PDARTS). Comprehensive experiments show that ABS can dramatically enhance existing NAS approaches by providing a promising shrunk search space.

preprint2016arXiv

The calibration and electron energy reconstruction of the BGO ECAL of the DAMPE detector

The DArk Matter Particle Explorer (DAMPE) is a space experiment designed to search for dark matter indirectly by measuring the spectra of photons, electrons, and positrons up to 10 TeV. The BGO electromagnetic calorimeter (ECAL) is its main sub-detector for energy measurement. In this paper, the instrumentation and development of the BGO ECAL is briefly described. The calibration on the ground, including the pedestal, minimum ionizing particle (MIP) peak, dynode ratio, and attenuation length with the cosmic rays and beam particles is discussed in detail. Also, the energy reconstruction results of the electrons from the beam test are presented.

preprint2012arXiv

Error Analysis of Ia Supernova and Query on Cosmic Dark Energy

Some serious faults in error analysis of observations for SNIa have been found. Redoing the same error analysis of SNIa, by our idea, it is found that the average total observational error of SNIa is obviously greater than $0.55^m$, so we can't decide whether the universe is accelerating expansion or not.