Source author record

Guang Tan

Guang Tan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision; Artificial Intelligence; Machine Learning; Networking and Internet Architecture; Other Computer Science

Catalog footprint

What is connected

4 works

5 topics

4 close collaborators

Preprint, 2026, arXiv

COVR: Collaborative Optimization of VLMs and RL Agent for Visual-Based Control

Visual reinforcement learning (RL) suffers from poor sample efficiency due to high-dimensional observations in complex tasks. While existing works have shown that vision-language models (VLMs) can assist RL, they often focus on knowledge distillation from the VLM to RL, overlooking the potential of RL-generated interaction data to enhance the VLM. To address this, we propose COVR, a collaborative optimization framework that enables the mutual enhancement of the VLM and RL policies. Specifically, COVR fine-tunes the VLM with RL-generated data to enhance the semantic reasoning ability consistent with the target task, and uses the enhanced VLM to further guide policy learning via action priors. To improve fine-tuning efficiency, we introduce two key modules: (1) an Exploration-Driven Dynamic Filter module that preserves valuable exploration samples using adaptive thresholds based on the degree of exploration, and (2) a Return-Aware Adaptive Loss Weight module that improves the stability of training by quantifying the inconsistency of sampling actions via return signals of RL. We further design a progressive fine-tuning strategy to reduce resource consumption. Extensive experiments show that COVR achieves strong performance across various challenging visual control tasks.

Preprint, 2026, arXiv

Deflickering Vision-Based Occupancy Networks through Lightweight Spatio-Temporal Correlation

Vision-based occupancy networks (VONs) provide an end-to-end solution for reconstructing 3D environments in autonomous driving. However, existing methods often suffer from temporal inconsistencies, manifesting as flickering effects that degrade temporal coherence and adversely affect downstream decision-making. While recent approaches incorporate historical information to alleviate this issue, they often incur high computational costs and may introduce misaligned or redundant features that interfere with object detection. We propose OccLinker, a novel plugin framework that can be easily integrated into existing VONs to improve performance. Our method efficiently consolidates historical static and motion cues, learns sparse latent correlations with current features through a dual cross-attention mechanism, and generates correction occupancy components to refine the base network predictions. In addition, we introduce a new temporal consistency metric to quantitatively measure flickering effects. Extensive experiments on two benchmark datasets demonstrate that our method achieves superior performance with minimal computational overhead while effectively reducing flickering artifacts.

Preprint, 2015, arXiv

Modeling and Improving the Energy Performance of GPS Receivers for Mobile Applications

Integrated GPS receivers have become a basic module in today's mobile devices. While serving as the cornerstone for location-based services, GPS modules have a serious battery drain problem due to their high computation load. This paper aims to reveal the impact of key software parameters on hardware energy consumption by establishing an energy model for a standard GPS receiver architecture as found in both academic and industrial designs. In particular, our measurements show that the receiver's energy consumption is in large part linear in the number of tracked satellites. This leads to the design of a selective tracking algorithm that provides similar positioning accuracy (around 12 m) with a subset of selected satellites, which translates to an energy saving of 20.9-23.1% on the Namuru board.
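
The linear relationship described in this abstract can be illustrated with a small sketch. It assumes a model of the form energy = fixed cost + per-satellite cost × tracked satellites. The function names, units and coefficient values below are hypothetical, chosen only to show the arithmetic, and are not taken from the paper or its measurements.

```python
def receiver_energy(num_tracked, base_mj, per_satellite_mj):
    """Assumed linear model: a fixed cost plus a cost per tracked satellite."""
    if num_tracked < 0:
        raise ValueError("num_tracked must be non-negative")
    return base_mj + per_satellite_mj * num_tracked


def energy_saving(full_set, subset, base_mj, per_satellite_mj):
    """Fraction of energy saved by tracking `subset` instead of `full_set` satellites."""
    full = receiver_energy(full_set, base_mj, per_satellite_mj)
    reduced = receiver_energy(subset, base_mj, per_satellite_mj)
    return (full - reduced) / full


# Hypothetical coefficients (not measured values from the paper).
saving = energy_saving(full_set=10, subset=6, base_mj=40.0, per_satellite_mj=10.0)
print(f"Estimated saving: {saving:.1%}")
```

Under such a model, the saving depends on how large the fixed cost is relative to the per-satellite cost: the larger the fixed share, the less there is to gain from tracking fewer satellites.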

Preprint, 2014, arXiv

LIPS: A Light Intensity Based Positioning System For Indoor Environments

This paper presents LIPS, a Light Intensity based Positioning System for indoor environments. The system uses off-the-shelf LED lamps as signal sources and light sensors as signal receivers. The design is inspired by the observation that a light sensor has deterministic sensitivity to both the distance and the incident angle of the light signal, an under-utilized feature of the photodiodes now widely found on mobile devices. We develop a stable and accurate light intensity model to capture this phenomenon, based on which a new positioning principle, Multi-Face Light Positioning (MFLP), is established that uses three collocated sensors to uniquely determine the receiver's position, assuming merely a single source of light. We have implemented a prototype on both dedicated embedded systems and smartphones. Experimental results show average positioning accuracy within 0.4 meters across different environments, with high stability against interference from obstacles, ambient light, temperature variation, etc.