Source author record

Haotian Li

Haotian Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Human-Computer Interaction, Artificial Intelligence, Machine Learning, Computation and Language, Computer Vision, cond-mat.mes-hall, cond-mat.mtrl-sci, cond-mat.str-el, cs.CY, eess.SP, Robotics

Catalog footprint

What is connected

15 works

11 topics

4 close collaborators

preprint, 2026, arXiv

Does Theory of Mind Improvement Really Benefit Human-AI Interactions? Empirical Findings from Interactive Evaluations

Improving the Theory of Mind (ToM) capability of Large Language Models (LLMs) is crucial for effective social interactions between these AI models and humans. However, existing benchmarks often measure ToM capability improvement through story-reading, multiple-choice questions from a third-person perspective, while ignoring the first-person, dynamic, and open-ended nature of human-AI (HAI) interactions. To directly examine how ToM improvement techniques benefit HAI interactions, we first proposed a new paradigm of interactive ToM evaluation with both perspective and metric shifts. Next, following the paradigm, we conducted a systematic study of four representative ToM enhancement techniques using four real-world datasets and a user study, covering both goal-oriented tasks (e.g., coding, math) and experience-oriented tasks (e.g., counseling). Our findings reveal that improvements on static benchmarks do not always translate to better performance in dynamic HAI interactions. This paper offers critical insights into ToM evaluation, showing the necessity of interaction-based assessments in developing next-generation, socially aware LLMs for HAI symbiosis.

preprint, 2026, arXiv

Efficient Swept Volume-Based Trajectory Generation for Arbitrary-Shaped Ground Robot Navigation

Navigating an arbitrary-shaped ground robot safely in cluttered environments remains a challenging problem. Existing trajectory planners that account for the robot's physical geometry suffer from intractable runtimes. To achieve both computational efficiency and Continuous Collision Avoidance (CCA) in arbitrary-shaped ground robot planning, we proposed a novel coarse-to-fine navigation framework that significantly accelerates planning. In the first stage, a sampling-based method selectively generates distinct topological paths that guarantee a minimum inflated margin. In the second stage, a geometry-aware front-end strategy is designed to discretize these topologies into full-state robot motion sequences while concurrently partitioning the paths into SE(2) sub-problems and simpler R^2 sub-problems for back-end optimization. In the final stage, an SVSDF-based optimizer generates trajectories tailored to these sub-problems and seamlessly splices them into a continuous final motion plan. Extensive benchmark comparisons show that the proposed method is one to several orders of magnitude faster than cutting-edge methods in runtime while maintaining a high planning success rate and ensuring CCA.

preprint, 2026, arXiv

Multi-agent AI systems outperform human teams in creativity

Although artificial intelligence (AI) now matches or exceeds human performance across numerous cognitive tasks, creativity remains a highly contested frontier. As AI systems based on large language models (LLMs) are increasingly adopted in research and innovation, it is essential to understand and augment their creativity. Here we demonstrate that multi-agent LLM teams not only surpass single agents, but also substantially outperform human teams in creativity (Cohen's d=1.50) across 4,541 multi-agent LLM ideas and 341 human-team ideas on six diverse problem-solving tasks. This advantage is driven by novelty while maintaining comparable usefulness. To investigate the generative processes in both groups, we represent conversations as paths through semantic space using neural language model representations. Both LLM and human teams produce more creative ideas when conversations range widely rather than staying centered on a single theme (low global coherence). However, the additional patterns that predict creativity differ: LLM teams benefit from efficient exploration (high semantic spread, shorter paths), while human teams benefit from maintaining smooth conversational flow (high local coherence, frequent pivots). Additionally, we identify model choice and discussion structure as orthogonal design levers that together explain 26.8% of variance in LLM conversational dynamics, paving the way for systematic approaches to developing multi-agent systems with augmented creative capabilities.
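The effect size quoted in this abstract, Cohen's d, is a standardized mean difference. As a minimal sketch of how such a value is commonly computed (using the pooled sample standard deviation; the abstract does not say which variant the authors used, and the scores below are made-up illustrative numbers, not data from the paper):

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical creativity ratings, for illustration only.
llm_scores = [4, 5, 6, 7, 8]
human_scores = [2, 3, 4, 5, 6]
d = cohens_d(llm_scores, human_scores)
```

A positive d means the first group scores higher on average, measured in units of the pooled standard deviation.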

preprint, 2022, arXiv

A microstructure estimation Transformer inspired by sparse representation for diffusion MRI

Diffusion magnetic resonance imaging (dMRI) is an important tool in characterizing tissue microstructure based on biophysical models, which are complex and highly non-linear. Resolving microstructures with optimization techniques is prone to estimation errors and requires dense sampling in the q-space. Deep learning based approaches have been proposed to overcome these limitations. Motivated by the superior performance of the Transformer, in this work, we present a learning-based framework based on the Transformer, namely, a Microstructure Estimation Transformer with Sparse Coding (METSC), for dMRI-based microstructure estimation with downsampled q-space data. To take advantage of the Transformer while addressing its limitation in large training data requirements, we explicitly introduce an inductive bias (model bias) into the Transformer using a sparse coding technique to facilitate the training process. Thus, the METSC is composed of three stages: an embedding stage, a sparse representation stage, and a mapping stage. The embedding stage is a Transformer-based structure that encodes the signal to ensure the voxel is represented effectively. In the sparse representation stage, a dictionary is constructed by solving a sparse reconstruction problem that unfolds the Iterative Hard Thresholding (IHT) process. The mapping stage is essentially a decoder that computes the microstructural parameters from the output of the second stage, based on the weighted sum of normalized dictionary coefficients where the weights are also learned. We tested our framework on two dMRI models with downsampled q-space data, including the intravoxel incoherent motion (IVIM) model and the neurite orientation dispersion and density imaging (NODDI) model. The proposed method achieved up to 11.25-fold acceleration in scan time and outperformed the other state-of-the-art learning-based methods.
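For readers unfamiliar with Iterative Hard Thresholding, the following is a minimal sketch of the classical algorithm with a fixed dictionary. It is not the METSC network: the abstract describes an unfolded version of IHT with a learned dictionary, whereas the function names, the identity dictionary, and the step size here are illustrative assumptions.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero the rest."""
    keep = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k]
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def iht(A, y, k, step=1.0, iterations=50):
    """Approximately minimize ||y - A x||^2 subject to x having at most k nonzeros.

    A is a list of rows. The fixed step assumes the spectral norm of A is
    at most 1, which holds for the toy dictionary used below.
    """
    rows, cols = len(A), len(A[0])
    x = [0.0] * cols
    for _ in range(iterations):
        # Residual r = y - A x.
        r = [y[i] - sum(A[i][j] * x[j] for j in range(cols)) for i in range(rows)]
        # Gradient step x + step * A^T r, then projection onto k-sparse vectors.
        g = [x[j] + step * sum(A[i][j] * r[i] for i in range(rows)) for j in range(cols)]
        x = hard_threshold(g, k)
    return x


# Toy example: an identity dictionary and a signal with one dominant atom.
A = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
y = [0.2, 3.0, -0.1, 0.0]
x_hat = iht(A, y, k=1)
```

With k=1 the small entries of y are treated as noise and only the dominant coefficient survives, which is the sparsity prior the abstract refers to as a model bias.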

preprint, 2022, arXiv

An original model for multi-target learning of logical rules for knowledge graph reasoning

Large-scale knowledge graphs provide structured representations of human knowledge. However, as it is impossible to collect all knowledge, knowledge graphs are usually incomplete. Reasoning based on existing facts paves a way to discover missing facts. In this paper, we study the problem of learning logical rules for reasoning on knowledge graphs for completing missing factual triplets. Learning logical rules equips a model with strong interpretability as well as the ability to generalize to similar tasks. We propose a model able to fully use training data which also considers multi-target scenarios. In addition, considering the deficiency in evaluating the performance of models and the quality of mined rules, we further propose two novel indicators to help with the problem. Experimental results empirically demonstrate that our model outperforms state-of-the-art methods on five benchmark datasets. The results also prove the effectiveness of the indicators.
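To make the idea of completing missing triplets with a logical rule concrete, here is a small sketch that applies one hand-written chain rule to a toy knowledge graph. It only illustrates rule application; it does not show how the paper's model learns rules, and the entities, relation names, and the `apply_chain_rule` helper are all hypothetical.

```python
def apply_chain_rule(triples, body_1, body_2, head):
    """Infer head(x, z) whenever body_1(x, y) and body_2(y, z) both hold."""
    known = set(triples)
    inferred = set()
    for (x, r1, y) in known:
        if r1 != body_1:
            continue
        for (y2, r2, z) in known:
            if r2 == body_2 and y2 == y:
                candidate = (x, head, z)
                # Only report facts that are missing from the graph.
                if candidate not in known:
                    inferred.add(candidate)
    return inferred


# Toy (head entity, relation, tail entity) triples, for illustration only.
facts = [
    ("alice", "born_in", "paris"),
    ("paris", "city_of", "france"),
    ("bob", "born_in", "kyoto"),
    ("kyoto", "city_of", "japan"),
    ("bob", "nationality", "japan"),
]

# Rule: born_in(x, y) AND city_of(y, z) => nationality(x, z)
new_facts = apply_chain_rule(facts, "born_in", "city_of", "nationality")
```

Because the rule is an explicit, human-readable pattern, each inferred triplet can be traced back to the facts that produced it, which is the interpretability benefit the abstract mentions.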

preprint, 2022, arXiv

ComputableViz: Mathematical Operators as a Formalism for Visualization Processing and Analysis

Data visualizations are created and shared on the web at an unprecedented speed, raising new needs and questions for processing and analyzing visualizations after they have been generated and digitized. However, existing formalisms focus on operating on a single visualization instead of multiple visualizations, making it challenging to perform analysis tasks such as sorting and clustering visualizations. Through a systematic analysis of previous work, we abstract visualization-related tasks into mathematical operators such as union and propose a design space of visualization operations. We realize the design by developing ComputableViz, a library that supports operations on multiple visualization specifications. To demonstrate its usefulness and extensibility, we present multiple usage scenarios concerning processing and analyzing visualization, such as generating visualization embeddings and automatically making visualizations accessible. We conclude by discussing research opportunities and challenges for managing and exploiting the massive visualizations on the web.

preprint, 2022, arXiv

Structure-aware Visualization Retrieval

With the wide usage of data visualizations, a huge number of Scalable Vector Graphic (SVG)-based visualizations have been created and shared online. Accordingly, there has been an increasing interest in exploring how to retrieve perceptually similar visualizations from a large corpus, since it can benefit various downstream applications such as visualization recommendation. Existing methods mainly focus on the visual appearance of visualizations by regarding them as bitmap images. However, the structural information intrinsically existing in SVG-based visualizations is ignored. Such structural information can delineate the spatial and hierarchical relationship among visual elements, and characterize visualizations thoroughly from a new perspective. This paper presents a structure-aware method to advance the performance of visualization retrieval by collectively considering both the visual and structural information. We extensively evaluated our approach through quantitative comparisons, a user study and case studies. The results demonstrate the effectiveness of our approach and its advantages over existing methods.

preprint, 2022, arXiv

Who Will Support My Project? Interactive Search of Potential Crowdfunding Investors Through InSearch

Crowdfunding provides project founders with a convenient way to reach online investors. However, it is challenging for founders to find the most promising investors and successfully raise money for their projects on crowdfunding platforms. A few machine learning based methods have been proposed to predict investors' interest in a specific crowdfunding project, but they fail to provide project founders with detailed explanations for these recommendations, thereby leading to an erosion of trust in predicted investors. To help crowdfunding founders find truly interested investors, we conducted semi-structured interviews with four crowdfunding experts and present inSearch, a visual analytic system. inSearch allows founders to search for investors interactively on crowdfunding platforms. It supports an effective overview of potential investors by leveraging a Graph Neural Network to model investor preferences. In addition, it enables interactive exploration and comparison of the temporal evolution of different investors' investment details.

preprint, 2021, arXiv

A Visual Analytics Approach to Facilitate the Proctoring of Online Exams

Online exams have become widely used to evaluate students' performance in mastering knowledge in recent years, especially during the pandemic of COVID-19. However, it is challenging to conduct proctoring for online exams due to the lack of face-to-face interaction. Also, prior research has shown that online exams are more vulnerable to various cheating behaviors, which can damage their credibility. This paper presents a novel visual analytics approach to facilitate the proctoring of online exams by analyzing the exam video records and mouse movement data of each student. Specifically, we detect and visualize suspected head and mouse movements of students in three levels of detail, which provides course instructors and teachers with convenient, efficient and reliable proctoring for online exams. Our extensive evaluations, including usage scenarios, a carefully-designed user study and expert interviews, demonstrate the effectiveness and usability of our approach.

preprint, 2021, arXiv

Deep Colormap Extraction from Visualizations

This work presents a new approach based on deep learning to automatically extract colormaps from visualizations. After summarizing colors in an input visualization image as a Lab color histogram, we pass the histogram to a pre-trained deep neural network, which learns to predict the colormap that produces the visualization. To train the network, we create a new dataset of 64K visualizations that cover a wide variety of data distributions, chart types, and colormaps. The network adopts an atrous spatial pyramid pooling module to capture color features at multiple scales in the input color histograms. We then classify the predicted colormap as discrete or continuous and refine the predicted colormap based on its color histogram. Quantitative comparisons to existing methods show the superior performance of our approach on both synthetic and real-world visualizations. We further demonstrate the utility of our method with two use cases, i.e., color transfer and color remapping.

preprint, 2021, arXiv

Topology Density Map for Urban Data Visualization and Analysis

The density map is an effective visualization technique for depicting the scalar field distribution in 2D space. Conventional methods for constructing density maps are mainly based on Euclidean distance, limiting their applicability in urban analysis, which must consider road networks and urban traffic. In this work, we propose a new method named Topology Density Map, targeting accurate and intuitive density maps in the context of urban environments. Based on the various constraints of road connections and traffic conditions, the method first constructs a directed acyclic graph (DAG) that propagates nonlinear scalar fields along 1D road networks. Next, the method extends the scalar fields to a 2D space by identifying key intersecting points in the DAG, dividing the underlying territory into planar regions using a weighted Voronoi diagram, and calculating the scalar fields for every point. Two case studies demonstrate that the Topology Density Map supplies accurate information to users and provides an intuitive visualization for decision making. An interview with domain experts demonstrates the feasibility, usability, and effectiveness of our method.
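The contrast between Euclidean and road-network distance can be sketched with a shortest-path search over a directed road graph followed by a decaying kernel. This is only an illustration of the general idea, not the Topology Density Map method: the Gaussian-style kernel, the `bandwidth` parameter, and the toy road graph are assumptions, and the weighted Voronoi extension to 2D is not shown.

```python
import heapq
from math import exp


def network_distances(graph, source):
    """Dijkstra shortest-path lengths from source over a directed, weighted graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, length in graph.get(node, []):
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def network_density(graph, source, bandwidth):
    """Scalar value at each reachable node, decaying with road distance."""
    return {
        node: exp(-((d / bandwidth) ** 2))
        for node, d in network_distances(graph, source).items()
    }


# Hypothetical one-way streets: A -> B -> C, each segment of length 1.
roads = {"A": [("B", 1.0)], "B": [("C", 1.0)], "C": []}
density_from_a = network_density(roads, "A", bandwidth=2.0)
density_from_c = network_density(roads, "C", bandwidth=2.0)
```

Because the streets are one-way, a source at C contributes nothing at A even though the two points may be close in Euclidean terms, which is the kind of effect a purely Euclidean density map cannot represent.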

preprint, 2020, arXiv

Maximizing spin-orbit torque efficiency of Ta(O)/Py via modulating oxygen-induced interface orbital hybridization

Spin-orbit torques due to interfacial Rashba and spin Hall effects have been widely considered as a potentially more efficient approach than the conventional spin-transfer torque to control the magnetization of ferromagnets. We report a comprehensive study of spin-orbit torque efficiency in Ta(O)/Ni81Fe19 bilayers by tuning low-level oxidation of β-phase tantalum, and find that the spin Hall angle θDL increases from ~ -0.18 for pure Ta/Py to a maximum value of ~ -0.30 for Ta(O)/Py with 7.8% oxidation. Furthermore, we distinguish the efficiency of the spin-orbit torque generated by the bulk spin Hall effect and by the interfacial Rashba effect, respectively, via a series of Py/Cu(0-2 nm)/Ta(O) control experiments. The latter shows a more than twofold enhancement, even more significant than that of the former, at the optimum oxidation level. Our results indicate that the 65% enhancement of the efficiency should be related to the modulation of the interfacial Rashba-like spin-orbit torque due to oxygen-induced orbital hybridization across the interface. Our results suggest that the modulation of interfacial coupling via oxygen-induced orbital hybridization can be an alternative method to boost the charge-spin conversion rate.

preprint, 2020, arXiv

Peer-inspired Student Performance Prediction in Interactive Online Question Pools with Graph Neural Network

Student performance prediction is critical to online education. It can benefit many downstream tasks on online learning platforms, such as estimating dropout rates, facilitating strategic intervention, and enabling adaptive online learning. Interactive online question pools provide students with interesting interactive questions to practice their knowledge in online education. However, little research has been done on student performance prediction in interactive online question pools. Existing work on student performance prediction targets online learning platforms with a predefined course curriculum and accurate knowledge labels, such as MOOC platforms, but it is not able to fully model the knowledge evolution of students in interactive online question pools. In this paper, we propose a novel approach using Graph Neural Networks (GNNs) to achieve better student performance prediction in interactive online question pools. Specifically, we model the relationship between students and questions using student interactions to construct the student-interaction-question network and further present a new GNN model, called R^2GCN, which intrinsically works for heterogeneous networks, to achieve generalizable student performance prediction in interactive online question pools. We evaluate the effectiveness of our approach on a real-world dataset consisting of 104,113 mouse trajectories generated in the problem-solving process of over 4000 students on 1631 questions. The experiment results show that our approach can achieve a much higher accuracy of student performance prediction than both traditional machine learning approaches and GNN models.

preprint, 2020, arXiv

Predicting Student Performance in Interactive Online Question Pools Using Mouse Interaction Features

Modeling student learning and further predicting performance is a well-established task in online learning and is crucial to personalized education, which recommends different learning resources to different students based on their needs. Interactive online question pools (e.g., educational game platforms), an important component of online education, have become increasingly popular in recent years. However, most existing work on student performance prediction targets online learning platforms with a well-structured curriculum, predefined question order and accurate knowledge tags provided by domain experts. It remains unclear how to conduct student performance prediction in interactive online question pools without such well-organized question orders or knowledge tags by experts. In this paper, we propose a novel approach to boost student performance prediction in interactive online question pools by further considering student interaction features and the similarity between questions. Specifically, we introduce new features (e.g., think time, first attempt, and first drag-and-drop) based on student mouse movement trajectories to delineate students' problem-solving details. In addition, a heterogeneous information network is applied to integrate students' historical problem-solving information on similar questions, enhancing student performance predictions on a new question. We evaluate the proposed approach on the dataset from a real-world interactive question pool using four typical machine learning models.

preprint, 2020, arXiv

TradAO: A Visual Analytics System for Trading Algorithm Optimization

With the wide application of algorithmic trading, it has become critical for traders to build a winning trading algorithm to beat the market. However, due to the lack of efficient tools, traders mainly rely on their memory to manually compare the algorithm instances of a trading algorithm and further select the best trading algorithm instance for real trading deployment. We work closely with industry practitioners to discover and consolidate user requirements and develop an interactive visual analytics system for trading algorithm optimization. Structured expert interviews are conducted to evaluate TradAO and a representative case study is documented to illustrate the system's effectiveness. To the best of our knowledge, previous financial data visual analyses have mainly aimed to assist investment managers in investment portfolio analysis but have neglected the needs of traders in developing trading algorithms for portfolio execution. TradAO is the first visual analytics system that assists users in comprehensively exploring the performance of a trading algorithm with different parameter settings.
