Researcher profile

Jiaxin Li

Jiaxin Li contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
11 works
0 followers
9 topics
4 close collaborators

Published work

11 published item(s)

Preprint, 2026, arXiv

Floor Plan-Guided Visual Navigation Incorporating Depth and Directional Cues

Current visual navigation strategies mainly follow an exploration-first and then goal-directed navigation paradigm. This exploratory phase inevitably compromises the overall efficiency of navigation. Recent studies propose leveraging floor plans alongside RGB inputs to guide agents, aiming for rapid navigation without prior exploration or mapping. Key issues persist despite early successes. The modal gap and content misalignment between floor plans and RGB images necessitate an efficient approach to extract the most salient and complementary features from both for reliable navigation. Here, we propose GlocDiff, a novel framework that employs a diffusion-based policy to continuously predict future waypoints. This policy is conditioned on two complementary information streams: (1) local depth cues derived from the current RGB observation, and (2) global directional guidance extracted from the floor plan. The former handles immediate navigation safety by capturing surrounding geometry, while the latter ensures goal-directed efficiency by offering definitive directional cues. Extensive evaluations on the FloNa benchmark demonstrate that GlocDiff achieves superior efficiency and effectiveness. Furthermore, its successful deployment in real-world scenarios underscores its strong potential for broad practical application.

Preprint, 2026, arXiv

InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields

Existing depth estimation methods are fundamentally limited to predicting depth on discrete image grids. Such representations restrict their scalability to arbitrary output resolutions and hinder the recovery of geometric detail. This paper introduces InfiniDepth, which represents depth as neural implicit fields. Through a simple yet effective local implicit decoder, we can query depth at continuous 2D coordinates, enabling arbitrary-resolution and fine-grained depth estimation. To better assess our method's capabilities, we curate a high-quality 4K synthetic benchmark from five different games, spanning diverse scenes with rich geometric and appearance details. Extensive experiments demonstrate that InfiniDepth achieves state-of-the-art performance on both synthetic and real-world benchmarks across relative and metric depth estimation tasks, particularly excelling in fine-detail regions. It also benefits the task of novel view synthesis under large viewpoint shifts, producing high-quality results with fewer holes and artifacts.

Preprint, 2026, arXiv

The Great March 100: 100 Detail-oriented Tasks for Evaluating Embodied AI Agents

Recently, with the rapid development of robot learning and imitation learning, numerous datasets and methods have emerged. However, these datasets and their task designs often lack systematic consideration and principles. This raises important questions: Do the current datasets and task designs truly advance the capabilities of robotic agents? Do evaluations on a few common tasks accurately reflect the differentiated performance of various methods proposed by different teams and evaluated on different tasks? To address these issues, we introduce the Great March 100 (GM-100) as the first step towards a robot learning Olympics. GM-100 consists of 100 carefully designed tasks that cover a wide range of interactions and long-tail behaviors, aiming to provide a diverse and challenging set of tasks to comprehensively evaluate the capabilities of robotic agents and promote diversity and complexity in robot dataset task designs. These tasks are developed through systematic analysis and expansion of existing task designs, combined with insights from human-object interaction primitives and object affordances. We collect a large amount of trajectory data on different robotic platforms and evaluate several baseline models. Experimental results demonstrate that the GM-100 tasks are 1) feasible to execute and 2) sufficiently challenging to effectively differentiate the performance of current VLA models. Our data and code are available at https://rhos.ai/research/gm-100.

Preprint, 2022, arXiv

Label-Only Membership Inference Attack against Node-Level Graph Neural Networks

Graph Neural Networks (GNNs), inspired by Convolutional Neural Networks (CNNs), aggregate the message of nodes' neighbors and structure information to acquire expressive representations of nodes for node classification, graph classification, and link prediction. Previous studies have indicated that GNNs are vulnerable to Membership Inference Attacks (MIAs), which infer whether a node is in the training data of GNNs and leak the node's private information, like the patient's disease history. The implementation of previous MIAs takes advantage of the models' probability output, which is infeasible if GNNs only provide the prediction label (label-only) for the input. In this paper, we propose a label-only MIA against GNNs for node classification with the help of GNNs' flexible prediction mechanism, e.g., obtaining the prediction label of one node even when neighbors' information is unavailable. Our attacking method achieves around 60% accuracy, precision, and Area Under the Curve (AUC) for most datasets and GNN models, some of which are competitive or even better than state-of-the-art probability-based MIAs implemented under our environment and settings. Additionally, we analyze the influence of the sampling method, model selection approach, and overfitting level on the attack performance of our label-only MIA. All three factors have an impact on the attack performance. Then, we consider scenarios where assumptions about the adversary's additional dataset (shadow dataset) and extra information about the target model are relaxed. Even in those scenarios, our label-only MIA achieves a better attack performance in most cases. Finally, we explore the effectiveness of possible defenses, including Dropout, Regularization, Normalization, and Jumping knowledge. None of those four defenses prevent our attack completely.

Preprint, 2022, arXiv

LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

In this paper, we propose LiDAR Distillation to bridge the domain gap induced by different LiDAR beams for 3D object detection. In many real-world applications, the LiDAR sensors used by mass-produced robots and vehicles usually have fewer beams than those used in large-scale public datasets. Moreover, as LiDARs are upgraded to other product models with different beam counts, it becomes challenging to utilize the labeled data captured by previous versions' high-resolution sensors. Despite the recent progress on domain adaptive 3D detection, most methods struggle to eliminate the beam-induced domain gap. We find that it is essential to align the point cloud density of the source domain with that of the target domain during the training process. Inspired by this discovery, we propose a progressive framework to mitigate the beam-induced domain shift. In each iteration, we first generate low-beam pseudo LiDAR by downsampling the high-beam point clouds. Then the teacher-student framework is employed to distill rich information from the data with more beams. Extensive experiments on Waymo, nuScenes and KITTI datasets with three different LiDAR-based detectors demonstrate the effectiveness of our LiDAR Distillation. Notably, our approach adds no computation cost at inference.
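The beam-downsampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes every point already carries an integer beam (ring) index, which the abstract does not state, and the function name `downsample_beams` and the `keep_every` parameter are hypothetical.

```python
def downsample_beams(points, keep_every=2):
    """Return a lower-beam pseudo point cloud.

    points: list of (x, y, z, beam) tuples, where `beam` is an integer
            ring index (an assumption; some datasets do not provide one).
    keep_every: keep beams whose index is a multiple of this value, so
                keep_every=2 turns a 64-beam scan into a 32-beam one.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be a positive integer")
    return [p for p in points if p[3] % keep_every == 0]


# Toy 64-beam scan with three points per beam.
scan = [(float(i), 0.0, 0.1 * beam, beam) for beam in range(64) for i in range(3)]
pseudo_32 = downsample_beams(scan, keep_every=2)
pseudo_16 = downsample_beams(scan, keep_every=4)
```

Applying the function with increasing `keep_every` values yields progressively sparser clouds, which is one way to picture the progressive framework the abstract describes.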

Preprint, 2022, arXiv

SimpleTrack: Rethinking and Improving the JDE Approach for Multi-Object Tracking

Joint detection and embedding (JDE) based methods usually estimate bounding boxes and embedding features of objects with a single network in Multi-Object Tracking (MOT). In the tracking stage, JDE-based methods fuse the target motion information and appearance information by applying the same rule, which could fail when the target is briefly lost or blocked. To overcome this problem, we propose a new association matrix, the Embedding and GIoU (EG) matrix, which combines embedding cosine distance and GIoU distance of objects. To further improve the performance of data association, we develop a simple, effective tracker named SimpleTrack, which designs a bottom-up fusion method for Re-identity and proposes a new tracking strategy based on our EG matrix. The experimental results indicate that SimpleTrack has powerful data association capability, e.g., 61.6 HOTA and 76.3 IDF1 on MOT17. In addition, we apply the EG matrix to 5 different state-of-the-art JDE-based methods and achieve significant improvements in IDF1, HOTA and IDsw metrics, and increase the tracking speed of these methods by about 20%.
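To make the idea of an association matrix built from embedding cosine distance and GIoU distance concrete, here is a small sketch. The abstract does not say how the two distances are combined, so the weighted sum, the `weight` parameter, the rescaling of GIoU distance to [0, 1], and the name `eg_matrix` are all assumptions for illustration, not the paper's formulation.

```python
import math


def giou(box_a, box_b):
    """Generalized IoU of two (x1, y1, x2, y2) boxes; the value lies in [-1, 1]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Intersection rectangle (zero area when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    # Smallest axis-aligned box that encloses both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (hull - union) / hull


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def eg_matrix(track_boxes, track_embs, det_boxes, det_embs, weight=0.5):
    """Cost matrix with rows = tracks, columns = detections.

    Assumed combination: weight * cosine distance + (1 - weight) * GIoU distance,
    with GIoU distance rescaled from [0, 2] to [0, 1].
    """
    matrix = []
    for t_box, t_emb in zip(track_boxes, track_embs):
        row = []
        for d_box, d_emb in zip(det_boxes, det_embs):
            giou_dist = (1.0 - giou(t_box, d_box)) / 2.0
            row.append(weight * cosine_distance(t_emb, d_emb)
                       + (1.0 - weight) * giou_dist)
        matrix.append(row)
    return matrix


costs = eg_matrix(
    track_boxes=[(0.0, 0.0, 1.0, 1.0)],
    track_embs=[(1.0, 0.0)],
    det_boxes=[(0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0)],
    det_embs=[(1.0, 0.0), (0.0, 1.0)],
)
```

Because GIoU stays informative for non-overlapping boxes, the second detection still receives a finite, graded cost rather than the flat maximum that plain IoU would give. A matrix like this would typically be passed to an assignment solver to match tracks to detections.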

Preprint, 2022, arXiv

Tensor Decompositions for Hyperspectral Data Processing in Remote Sensing: A Comprehensive Review

Owing to the rapid development of sensor technology, hyperspectral (HS) remote sensing (RS) imaging provides a significant amount of spatial and spectral information for observing and analyzing the Earth's surface from distant data acquisition platforms, such as aircraft, spacecraft, and satellites. Recent advances in HS RS techniques offer opportunities to realize the full potential of various applications, while posing new challenges for efficiently processing and analyzing the enormous volume of acquired HS data. Because it preserves the inherent 3-D structure of HS data, tensor decomposition has attracted widespread attention and research in HS data processing tasks over the past decades. In this article, we present a comprehensive overview of tensor decomposition, contextualized within five broad topics in HS data processing: HS restoration, compressed sensing, anomaly detection, super-resolution, and spectral unmixing. For each topic, we elaborate on the remarkable achievements of tensor decomposition models for HS RS, describing the existing methodologies and presenting representative experimental results. The remaining challenges and follow-up research directions are then outlined and discussed from the perspective of real HS RS practice and of tensor decomposition merged with advanced priors and even with deep neural networks. For beginners, this article summarizes different tensor decomposition-based HS data processing methods and categorizes them into classes ranging from simple adoptions to complex combinations with other priors. We also expect this survey to point out new investigations and development trends for experienced researchers who already have some understanding of tensor decomposition and HS RS.

Preprint, 2021, arXiv

Reciprocity of thermal diffusion in time-modulated systems

The reciprocity principle governs the symmetry in transmission of electromagnetic and acoustic waves, as well as the diffusion of heat between two points in space, with important consequences for thermal management and energy harvesting. There has been significant recent interest in materials with time-modulated properties, which have been shown to efficiently break reciprocity for light, sound, and even charge diffusion. Quite surprisingly, here we show that, from a practical point of view, time modulation cannot generally be used to break reciprocity for thermal diffusion. We establish a theoretical framework to accurately describe the behavior of diffusive processes under time modulation, and prove that thermal reciprocity in dynamic materials is generally preserved by the continuity equation, unless some external bias or special material is considered. We then experimentally demonstrate reciprocal heat transfer in a time-modulated device. Our findings correct previous misconceptions regarding reciprocity breaking for thermal diffusion, revealing the generality of symmetry constraints in heat transfer, and clarifying its differences from other transport processes with respect to the principles of reciprocity and microscopic reversibility.

Preprint, 2020, arXiv

Diffusive non-reciprocity and thermal diode

Wave propagation and diffusion in linear materials preserve local reciprocity in terms of a symmetric Green's function. For wave propagation, the relation between the fields entering and leaving a system is more relevant than the detailed information about the fields inside it. In such cases, the global reciprocity of the scattering off a system through several ports is more important, which is defined as the symmetric transmission between the scattering channels. When a two-port system supports non-reciprocal (electromagnetic, acoustic) wave propagation, it is by definition an (optical, phonon) diode. However, to date no concrete definition or discussion has been given of the global reciprocity of diffusive processes through a multiple-port system. It thus remains unclear what the differences and relations are between the three concepts, namely local non-reciprocity, global non-reciprocity, and the diode effect in diffusion. Here, we provide a theoretical analysis of the frequency-domain Green's function and define the global reciprocity of heat diffusion through a two-port system, which has a different setup from that of a thermal diode. We further prove the equivalence between a heat transfer system with broken steady-state global reciprocity and a thermal diode, assuming no temperature-dependent heat generation. We also discuss the validity of some typical mechanisms for breaking diffusive reciprocity and making a thermal diode. Our results set a general background for future studies on symmetric and asymmetric diffusive processes.
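The statement that linear diffusion has a symmetric Green's function can be checked numerically on a toy model. The sketch below is only a discrete, steady-state (zero-frequency) analogue and is not taken from the paper: the 1-D chain of nodes, the link conductances, the per-node loss conductances to ambient, and all function names are assumptions chosen for illustration.

```python
def conduction_matrix(link_k, loss_g):
    """Symmetric conductance matrix of a 1-D chain.

    link_k[i] is the conductance between node i and node i + 1;
    loss_g[i] is a loss conductance from node i to ambient, which keeps
    the matrix invertible. Both are hypothetical parameters.
    """
    n = len(loss_g)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] += loss_g[i]
    for i, k in enumerate(link_k):
        a[i][i] += k
        a[i + 1][i + 1] += k
        a[i][i + 1] -= k
        a[i + 1][i] -= k
    return a


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def response(a, source, probe):
    """Temperature rise at `probe` for a unit heat input at `source`."""
    q = [0.0] * len(a)
    q[source] = 1.0
    return solve(a, q)[probe]


# Deliberately non-uniform chain: the geometry is asymmetric, the response is not.
chain = conduction_matrix(link_k=[1.0, 5.0, 0.2], loss_g=[0.1, 0.3, 0.2, 0.4])
forward = response(chain, source=0, probe=3)
backward = response(chain, source=3, probe=0)
```

Here `forward` and `backward` agree to rounding error even though the chain is spatially asymmetric, because the conductance matrix, and therefore its inverse, is symmetric. Breaking this symmetry would require something outside the linear, passive model, such as temperature-dependent conductances.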