Researcher profile

Changhao Chen

Changhao Chen contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
14 works
0 followers
9 topics
4 close collaborators

Published work

14 published items

preprint · 2026 · arXiv

Efficient Feature-Free Initialization for Monocular Visual-Inertial Systems Using a Feed-Forward 3D Model

Fast and reliable initialization is critical for monocular visual-inertial navigation systems (VINS), as it establishes the starting conditions for subsequent state estimation. Despite steady progress, most existing methods heavily rely on visual feature correspondences and require 3-4 seconds of sensory data for successful initialization, which limits their applicability and efficiency. With the advent of feed-forward 3D models that can directly predict point clouds from images, we revisit the visual-inertial initialization problem from a concise perspective. In this work, we propose a feature-free initialization framework that leverages up-to-scale point clouds predicted by a feed-forward 3D model, thereby obviating the need for visual feature tracking and estimation. This design substantially reduces system complexity and improves the reliability of initialization. Experiments on public datasets demonstrate that the proposed feature-free initialization method achieves the highest success rate, exceeding 90%, and significantly reduces the data duration required for successful initialization, typically to under 1.2 s. We further validate our method on a self-collected dataset covering various indoor and outdoor scenarios, demonstrating robust performance, particularly in visually degraded environments where existing methods often fail. The code and dataset are available at https://github.com/Yuantai-Z/FF-VIO-Init.

preprint · 2026 · arXiv

FreeOcc: Training-Free Embodied Open-Vocabulary Occupancy Prediction

Existing learning-based occupancy prediction methods rely on large-scale 3D annotations and generalize poorly across environments. We present FreeOcc, a training-free framework for open-vocabulary occupancy prediction from monocular or RGB-D sequences. Unlike prior approaches that require voxel-level supervision and ground-truth camera poses, FreeOcc operates without 3D annotations, pose ground truth, or any learning stage. FreeOcc incrementally builds a globally consistent occupancy map via a four-layer pipeline: a SLAM backbone estimates poses and sparse geometry; a geometrically consistent Gaussian update constructs dense 3D Gaussian maps; open-vocabulary semantics from off-the-shelf vision-language models are associated with Gaussian primitives; and a probabilistic Gaussian-to-occupancy projection produces dense voxel occupancy. Despite being entirely training-free and pose-agnostic, FreeOcc achieves over $2\times$ improvements in IoU and mIoU on EmbodiedOcc-ScanNet compared to prior self-supervised methods. We further introduce ReplicaOcc, a benchmark for indoor open-vocabulary occupancy prediction, and show that FreeOcc transfers zero-shot to novel environments, substantially outperforming both supervised and self-supervised baselines. Project page: https://the-masses.github.io/freeocc-web/.

preprint · 2022 · arXiv

Learning Selective Sensor Fusion for States Estimation

Autonomous vehicles and mobile robotic systems are typically equipped with multiple sensors to provide redundancy. By integrating the observations from different sensors, these mobile agents are able to perceive the environment and estimate system states, e.g., locations and orientations. Although deep learning approaches for multimodal odometry estimation and localization have gained traction, they rarely focus on the issue of robust sensor fusion - a necessary consideration to deal with noisy or incomplete sensor observations in the real world. Moreover, current deep odometry models suffer from a lack of interpretability. To this end, we propose SelectFusion, an end-to-end selective sensor fusion module which can be applied to useful pairs of sensor modalities such as monocular images and inertial measurements, depth images and LIDAR point clouds. Our model is a uniform framework that is not restricted to a specific modality or task. During prediction, the network is able to assess the reliability of the latent features from different sensor modalities and estimate trajectory both at scale and global pose. In particular, we propose two fusion modules - a deterministic soft fusion and a stochastic hard fusion, and offer a comprehensive study of the new strategies compared to trivial direct fusion. We extensively evaluate all fusion strategies on both public datasets and progressively degraded datasets that present synthetic occlusions, noisy and missing data and time misalignment between sensors, and we investigate the effectiveness of the different fusion strategies in attending to the most reliable features, which in itself provides insights into the operation of the various models.
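The abstract above contrasts a deterministic soft fusion with a stochastic hard fusion but does not spell out either mechanism. The sketch below is one plausible reading, not the SelectFusion implementation: soft fusion rescales each latent feature by a continuous weight, and hard fusion keeps or drops each feature by sampling. The function names, the per-feature `logits` input (which a learned layer would presumably produce), and the Bernoulli sampling are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_fusion(features, logits):
    """Deterministic reweighting: scale each feature by a weight in (0, 1).

    `logits` is a hypothetical per-feature reliability score.
    """
    return [f * sigmoid(z) for f, z in zip(features, logits)]


def hard_fusion(features, logits, rng):
    """Stochastic selection: keep each feature with probability sigmoid(logit).

    Dropped features are set to 0.0. Bernoulli sampling is an assumption here;
    the abstract only says that the hard fusion is stochastic.
    """
    return [f if rng.random() < sigmoid(z) else 0.0
            for f, z in zip(features, logits)]


if __name__ == "__main__":
    # Concatenated visual and inertial features (made-up values).
    feats = [0.8, -1.2, 0.3, 2.0]
    scores = [3.0, -3.0, 0.0, 1.0]
    print(soft_fusion(feats, scores))
    print(hard_fusion(feats, scores, random.Random(0)))
```

In both variants the mask doubles as a rough interpretability signal: a feature with a low weight, or one that is rarely kept, is being treated as unreliable.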

preprint · 2020 · arXiv

A Survey on Deep Learning for Localization and Mapping: Towards the Age of Spatial Machine Intelligence

Deep learning based localization and mapping has recently attracted significant attention. Instead of creating hand-designed algorithms through exploitation of physical models or geometric theories, deep learning based solutions provide an alternative to solve the problem in a data-driven way. Benefiting from ever-increasing volumes of data and computational power, these methods are fast evolving into a new area that offers accurate and robust systems to track motion and estimate scenes and their structure for real-world applications. In this work, we provide a comprehensive survey, and propose a new taxonomy for localization and mapping using deep learning. We also discuss the limitations of current models, and indicate possible future directions. A wide range of topics are covered, from learning odometry estimation, mapping, to global localization and simultaneous localization and mapping (SLAM). We revisit the problem of perceiving self-motion and scene understanding with on-board sensors, and show how to solve it by integrating these modules into a prospective spatial machine intelligence system (SMIS). It is our hope that this work can connect emerging works from robotics, computer vision and machine learning communities, and serve as a guide for future researchers to apply deep learning to tackle localization and mapping problems.

preprint · 2020 · arXiv

Deep Learning based Pedestrian Inertial Navigation: Methods, Dataset and On-Device Inference

Modern inertial measurement units (IMUs) are small, cheap, energy efficient, and widely employed in smart devices and mobile robots. Exploiting inertial data for accurate and reliable pedestrian navigation is a key component for emerging Internet-of-Things applications and services. Recently, there has been a growing interest in applying deep neural networks (DNNs) to motion sensing and location estimation. However, the lack of sufficient labelled data for training and evaluating architecture benchmarks has limited the adoption of DNNs in IMU-based tasks. In this paper, we present and release the Oxford Inertial Odometry Dataset (OxIOD), a first-of-its-kind public dataset for deep learning based inertial navigation research, with fine-grained ground-truth on all sequences. Furthermore, to enable more efficient inference at the edge, we propose a novel lightweight framework to learn and reconstruct pedestrian trajectories from raw IMU data. Extensive experiments show the effectiveness of our dataset and methods in achieving accurate data-driven pedestrian inertial navigation on resource-constrained devices.

preprint · 2020 · arXiv

DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network

Odometry is of key importance for localization in the absence of a map. There is considerable work in the area of visual odometry (VO), and recent advances in deep learning have brought novel approaches to VO, which directly learn salient features from raw images. These learning-based approaches have led to more accurate and robust VO systems. However, they have not been well applied to point cloud data yet. In this work, we investigate how to exploit deep learning to estimate point cloud odometry (PCO), which may serve as a critical component in point cloud-based downstream tasks or learning-based systems. Specifically, we propose a novel end-to-end deep parallel neural network called DeepPCO, which can estimate the 6-DOF poses using consecutive point clouds. It consists of two parallel sub-networks to estimate 3-D translation and orientation respectively rather than a single neural network. We validate our approach on KITTI Visual Odometry/SLAM benchmark dataset with different baselines. Experiments demonstrate that the proposed approach achieves good performance in terms of pose accuracy.

preprint · 2020 · arXiv

DeepTIO: A Deep Thermal-Inertial Odometry with Visual Hallucination

Visual odometry shows excellent performance in a wide range of environments. However, in visually-denied scenarios (e.g., heavy smoke or darkness), pose estimates degrade or even fail. Thermal cameras are commonly used for perception and inspection when the environment has low visibility. However, their use in odometry estimation is hampered by the lack of robust visual features. In part, this is a result of the sensor measuring the ambient temperature profile rather than scene appearance and geometry. To overcome this issue, we propose a Deep Neural Network model for thermal-inertial odometry (DeepTIO) by incorporating a visual hallucination network to provide the thermal network with complementary information. The hallucination network is taught to predict fake visual features from thermal images by using Huber loss. We also employ selective fusion to attentively fuse the features from three different modalities, i.e., thermal, hallucination, and inertial features. Extensive experiments are performed on hand-held and mobile robot data in benign and smoke-filled environments, showing the efficacy of the proposed model.
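The abstract states that the hallucination network is trained with a Huber loss. As a reminder of what that loss computes, here is a minimal sketch of the standard Huber loss over two feature vectors. The threshold `delta=1.0`, the flat list representation of the features, and the mean reduction are assumptions for illustration, not details taken from the paper.

```python
def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss between two equal-length sequences of floats.

    Residuals up to `delta` are penalised quadratically, larger ones linearly,
    which makes the loss less sensitive to outliers than squared error.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for p, t in zip(pred, target):
        r = abs(p - t)
        if r <= delta:
            total += 0.5 * r * r                 # quadratic region
        else:
            total += delta * (r - 0.5 * delta)   # linear region
    return total / len(pred)


if __name__ == "__main__":
    # Hypothetical hallucinated features versus reference visual features.
    hallucinated = [0.1, 0.9, 2.5]
    reference = [0.0, 1.0, 0.0]
    print(huber_loss(hallucinated, reference))
```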

preprint · 2020 · arXiv

Hybrid bounds on two-parametric family Weyl sums along smooth curves

We obtain a new bound on Weyl sums with degree $k\ge 2$ polynomials of the form $(τx+c) ω(n)+xn$, $n=1, 2, \ldots$, with fixed $ω(T) \in \mathbb{Z}[T]$ and $τ\in \mathbb{R}$, which holds for almost all $c\in [0,1)$ and all $x\in [0,1)$. We improve and generalise some recent results of M. B. Erdogan and G. Shakan (2019), whose work also shows links between this question and some classical partial differential equations. We extend this to more general settings of families of polynomials $xn+y ω(n)$ for all $(x,y)\in [0,1)^2$ with $f(x,y)=z$ for a set of $z \in [0,1)$ of full Lebesgue measure, provided that $f$ is some Hölder function.

preprint · 2020 · arXiv

On Large Values of Weyl Sums

A special case of the Menshov--Rademacher theorem implies that for almost all polynomials $x_1Z+\ldots +x_d Z^{d} \in {\mathbb R}[Z]$ of degree $d$ the Weyl sums satisfy the upper bound $$ \left| \sum_{n=1}^{N}\exp\left(2πi \left(x_1 n+\ldots +x_d n^{d}\right)\right) \right| \leqslant N^{1/2+o(1)}, \qquad N\to \infty. $$ Here we investigate the exceptional sets of coefficients $(x_1, \ldots, x_d)$ with large values of Weyl sums for infinitely many $N$, and show that in terms of the Baire categories and Hausdorff dimension they are quite massive, in particular of positive Hausdorff dimension in any fixed cube inside of $[0,1]^d$. We also use a different technique to give similar results for sums with just one monomial $xn^d$. We apply these results to show that the set of poorly distributed modulo one polynomials is rather massive as well.
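To make the displayed sum concrete, the following sketch evaluates a Weyl sum numerically and prints its modulus next to $N^{1/2}$ for randomly drawn coefficients. It is only an illustration of the definition: the function name `weyl_sum`, the random sampling, and the chosen values of $N$ are assumptions, and floating-point evaluation is reliable only for moderate $N$ and $d$.

```python
import cmath
import math
import random


def weyl_sum(coeffs, N):
    """Return sum_{n=1}^{N} exp(2*pi*i*(x_1 n + ... + x_d n^d)).

    `coeffs` holds (x_1, ..., x_d); the constant term is omitted, as in the
    displayed formula.
    """
    total = 0j
    for n in range(1, N + 1):
        # Polynomial phase, reduced modulo 1 to limit floating-point error.
        phase = sum(x * n ** (k + 1) for k, x in enumerate(coeffs)) % 1.0
        total += cmath.exp(2j * math.pi * phase)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    coeffs = tuple(rng.random() for _ in range(3))  # a "typical" point of [0,1)^3
    for N in (100, 1000, 10000):
        print(N, abs(weyl_sum(coeffs, N)), math.sqrt(N))
```

For comparison, all-zero coefficients give the trivial value $N$, the largest modulus the sum can take.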

preprint · 2020 · arXiv

Restricted mean value theorems and metric theory of restricted Weyl sums

We study an apparently new question about the behaviour of Weyl sums on a subset $\mathcal{X}\subseteq [0,1)^d$ with a natural measure $μ$ on $\mathcal{X}$. For certain measure spaces $(\mathcal{X}, μ)$ we obtain non-trivial bounds for the mean values of the Weyl sums, and for $μ$-almost all points of $\mathcal{X}$ the Weyl sums satisfy the square root cancellation law. Moreover we characterise the size of the exceptional sets in terms of Hausdorff dimension. Finally, we derive variants of the Vinogradov mean value theorem averaging over measure spaces $(\mathcal{X}, μ)$. We obtain general results, which we refine for some special spaces $\mathcal{X}$ such as spheres, moment curves and line segments.

preprint · 2020 · arXiv

See Through Smoke: Robust Indoor Mapping with Low-cost mmWave Radar

This paper presents the design, implementation and evaluation of milliMap, a single-chip millimetre wave (mmWave) radar based indoor mapping system targeted towards low-visibility environments to assist in emergency response. A unique feature of milliMap is that it only leverages a low-cost, off-the-shelf mmWave radar, but can reconstruct a dense grid map with accuracy comparable to lidar, as well as providing semantic annotations of objects on the map. milliMap makes two key technical contributions. First, it autonomously overcomes the sparsity and multi-path noise of mmWave signals by combining cross-modal supervision from a co-located lidar during training and the strong geometric priors of indoor spaces. Second, it takes the spectral response of mmWave reflections as features to robustly identify different types of objects, e.g., doors and walls. Extensive experiments in different indoor environments show that milliMap can achieve a map reconstruction error less than 0.2 m and classify key semantics with an accuracy around 90%, whilst operating through dense smoke.

preprint · 2020 · arXiv

Self-similar sets with super-exponential close cylinders

S. Baker (2019) and B. Bárány and A. Käenmäki (2019) independently showed that there exist iterated function systems without exact overlaps that have super-exponentially close cylinders at all small levels. We adapt the method of S. Baker and obtain further examples of this type. We prove that for any algebraic number $β\ge 2$ there exist real numbers $s, t$ such that the iterated function system $$ \left \{\frac{x}β, \frac{x+1}β, \frac{x+s}β, \frac{x+t}β\right \} $$ satisfies the above property.

preprint · 2020 · arXiv

Threshold functions for substructures in random subsets of finite vector spaces

The study of substructures in random objects has a long history, beginning with Erdős and Rényi's work on subgraphs of random graphs. We study the existence of certain substructures in random subsets of vector spaces over finite fields. First we provide a general framework which can be applied to establish coarse threshold results and prove a limiting Poisson distribution at the threshold scale. To illustrate our framework we apply our results to $k$-term arithmetic progressions, sums, right triangles, parallelograms and affine planes. We also find coarse thresholds for the property that a random subset of a finite vector space is sum-free, or is a Sidon set.