Source author record

Yi Han

Yi Han appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Computer Vision Information Retrieval Machine Learning physics.med-ph Robotics Social and Information Networks Biological Physics Computation and Language Cryptography and Security Graphics math.PR

Catalog footprint

What is connected

13works

12topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Adaptive Causal Coordination Detection for Social Media: A Memory-Guided Framework with Semi-Supervised Learning

Detecting coordinated inauthentic behavior on social media remains a critical and persistent challenge, as most existing approaches rely on superficial correlation analysis, employ static parameter settings, and demand extensive and labor-intensive manual annotation. To address these limitations systematically, we propose the Adaptive Causal Coordination Detection (ACCD) framework. ACCD adopts a three-stage, progressive architecture that leverages a memory-guided adaptive mechanism to dynamically learn and retain optimal detection configurations for diverse coordination scenarios. Specifically, in the first stage, ACCD introduces an adaptive Convergent Cross Mapping (CCM) technique to deeply identify genuine causal relationships between accounts. The second stage integrates active learning with uncertainty sampling within a semi-supervised classification scheme, significantly reducing the burden of manual labeling. The third stage deploys an automated validation module driven by historical detection experience, enabling self-verification and optimization of the detection outcomes. We conduct a comprehensive evaluation using real-world datasets, including the Twitter IRA dataset, Reddit coordination traces, and several widely-adopted bot detection benchmarks. Experimental results demonstrate that ACCD achieves an F1-score of 87.3\% in coordinated attack detection, representing a 15.2\% improvement over the strongest existing baseline. Furthermore, the system reduces manual annotation requirements by 68\% and achieves a 2.8x speedup in processing through hierarchical clustering optimization. In summary, ACCD provides a more accurate, efficient, and highly automated end-to-end solution for identifying coordinated behavior on social platforms, offering substantial practical value and promising potential for broad application.

preprint2026arXiv

CARLA-Round: A Multi-Factor Simulation Dataset for Roundabout Trajectory Prediction

Accurate trajectory prediction of vehicles at roundabouts is critical for reducing traffic accidents, yet it remains highly challenging due to their circular road geometry, continuous merging and yielding interactions, and absence of traffic signals. Developing accurate prediction algorithms relies on reliable, multimodal, and realistic datasets; however, such datasets for roundabout scenarios are scarce, as real-world data collection is often limited by incomplete observations and entangled factors that are difficult to isolate. We present CARLA-Round, a systematically designed simulation dataset for roundabout trajectory prediction. The dataset varies weather conditions (five types) and traffic density levels (spanning Level-of-Service A-E) in a structured manner, resulting in 25 controlled scenarios. Each scenario incorporates realistic mixtures of driving behaviors and provides explicit annotations that are largely absent from existing datasets. Unlike randomly sampled simulation data, this structured design enables precise analysis of how different conditions influence trajectory prediction performance. Validation experiments using standard baselines (LSTM, GCN, GRU+GCN) reveal traffic density dominates prediction difficulty with strong monotonic effects, while weather shows non-linear impacts. The best model achieves 0.312m ADE on real-world rounD dataset, demonstrating effective sim-to-real transfer. This systematic approach quantifies factor impacts impossible to isolate in confounded real-world datasets. Our CARLA-Round dataset is available at https://github.com/Rebecca689/CARLA-Round.

preprint2026arXiv

Herculean: An Agentic Benchmark for Financial Intelligence

As AI agents improve, the central question is no longer whether they can solve isolated well-defined financial tasks, but whether they can reliably carry out financial professional work. Existing financial benchmarks offer only a partial view of this ability, as they primarily evaluate static competencies such as question answering, retrieval, summarization, and classification. We introduce Herculean, the first skilled benchmark for agentic financial intelligence spanning four representative workflows, including Trading, Hedging, Market Insights, and Auditing. Each workflow is instantiated as a standardized MCP-based skill environment with its own tools, interaction dynamics, constraints, and success criteria, enabling consistent end-to-end assessment of heterogeneous agent systems. Across frontier agents, we find agents perform relatively well on Trading and Market Insights, but struggle substantially on Hedging and Auditing, where long-horizon coordination, state consistency, and structured verification are critical. Overall, our results point to a key gap in current agents in turning financial reasoning into dependable workflow execution in high-stakes financial workflows.

preprint2026arXiv

RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics

Spatial referring is a fundamental capability of embodied robots to interact with the 3D physical world. However, even with the powerful pretrained vision language models (VLMs), recent approaches are still not qualified to accurately understand the complex 3D scenes and dynamically reason about the instruction-indicated locations for interaction. To this end, we propose RoboRefer, a 3D-aware VLM that can first achieve precise spatial understanding by integrating a disentangled but dedicated depth encoder via supervised fine-tuning (SFT). Moreover, RoboRefer advances generalized multi-step spatial reasoning via reinforcement fine-tuning (RFT), with metric-sensitive process reward functions tailored for spatial referring tasks. To support SFT and RFT training, we introduce RefSpatial, a large-scale dataset of 20M QA pairs (2x prior), covering 31 spatial relations (vs. 15 prior) and supporting complex reasoning processes (up to 5 steps). In addition, we introduce RefSpatial-Bench, a challenging benchmark filling the gap in evaluating spatial referring with multi-step reasoning. Experiments show that SFT-trained RoboRefer achieves state-of-the-art spatial understanding, with an average success rate of 89.6%. RFT-trained RoboRefer further outperforms all other baselines by a large margin, even surpassing Gemini-2.5-Pro by 17.4% in average accuracy on RefSpatial-Bench. Notably, RoboRefer can be integrated with various control policies to execute long-horizon, dynamic tasks across diverse robots (e,g., UR5, G1 humanoid) in cluttered real-world scenes.

preprint2026arXiv

RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics

Spatial tracing, as a fundamental embodied interaction ability for robots, is inherently challenging as it requires multi-step metric-grounded reasoning compounded with complex spatial referring and real-world metric measurement. However, existing methods struggle with this compositional task. To this end, we propose RoboTracer, a 3D-aware VLM that first achieves both 3D spatial referring and measuring via a universal spatial encoder and a regression-supervised decoder to enhance scale awareness during supervised fine-tuning (SFT). Moreover, RoboTracer advances multi-step metric-grounded reasoning via reinforcement fine-tuning (RFT) with metric-sensitive process rewards, supervising key intermediate perceptual cues to accurately generate spatial traces. To support SFT and RFT training, we introduce TraceSpatial, a large-scale dataset of 30M QA pairs, spanning outdoor/indoor/tabletop scenes and supporting complex reasoning processes (up to 9 steps). We further present TraceSpatial-Bench, a challenging benchmark filling the gap to evaluate spatial tracing. Experimental results show that RoboTracer surpasses baselines in spatial understanding, measuring, and referring, with an average success rate of 79.1%, and also achieves SOTA performance on TraceSpatial-Bench by a large margin, exceeding Gemini-2.5-Pro by 36% accuracy. Notably, RoboTracer can be integrated with various control policies to execute long-horizon, dynamic tasks across diverse robots (UR5, G1 humanoid) in cluttered real-world scenes. See the project page at https://zhoues.github.io/RoboTracer.

preprint2022arXiv

Smoothness of the density for McKean-Vlasov SDEs with measurable kernel

Consider the McKean-Vlasov SDE $$ dX_t=\langle b(X_t-\cdot),μ_t\rangle dt+dW_t,\quad μ_t=\operatorname{Law}(X_t), $$ where $W$ is the $n$-dimensional Brownian motion and $b:\mathbb{R}^d\to\mathbb{R}^d$ is a measurable function. First assuming $b\in L^\infty$, we prove that the law $μ_t$ of $X_t$ has a density $p_t$ with respect to the Lebesgue measure, which is continuously differentiable with gradient being $γ$-Hölder continuous for each $γ\in(0,1)$. Assume further that $b\in \mathcal{C}_b^1$, we prove that the density $p_t$ is infinitely differentiable. In the regularization by noise perspective, this shows McKean-Vlasov SDEs tend to have a smoother density function than SDEs without density dependence, under the same regularity assumption of the coefficients. We observe similar phenomenon for singular interaction kernels satisfying Krylov's integrability condition, for distributional kernels $b\in B_{\infty,\infty}^α$, $α\in(-1,0)$, and for processes driven by an $α$-stable noise for $α\in(1,2)$.

preprint2021arXiv

Example-based Real-time Clothing Synthesis for Virtual Agents

We present a real-time cloth animation method for dressing virtual humans of various shapes and poses. Our approach formulates the clothing deformation as a high-dimensional function of body shape parameters and pose parameters. In order to accelerate the computation, our formulation factorizes the clothing deformation into two independent components: the deformation introduced by body pose variation (Clothing Pose Model) and the deformation from body shape variation (Clothing Shape Model). Furthermore, we sample and cluster the poses spanning the entire pose space and use those clusters to efficiently calculate the anchoring points. We also introduce a sensitivity-based distance measurement to both find nearby anchoring points and evaluate their contributions to the final animation. Given a query shape and pose of the virtual agent, we synthesize the resulting clothing deformation by blending the Taylor expansion results of nearby anchoring points. Compared to previous methods, our approach is general and able to add the shape dimension to any clothing pose model. %and therefore it is more general. Furthermore, we can animate clothing represented with tens of thousands of vertices at 50+ FPS on a CPU. Moreover, our example database is more representative and can be generated in parallel, and thereby saves the training time. We also conduct a user evaluation and show that our method can improve a user's perception of dressed virtual agents in an immersive virtual environment compared to a conventional linear blend skinning method.

preprint2020arXiv

Adversarial Reinforcement Learning under Partial Observability in Autonomous Computer Network Defence

Recent studies have demonstrated that reinforcement learning (RL) agents are susceptible to adversarial manipulation, similar to vulnerabilities previously demonstrated in the supervised learning setting. While most existing work studies the problem in the context of computer vision or console games, this paper focuses on reinforcement learning in autonomous cyber defence under partial observability. We demonstrate that under the black-box setting, where the attacker has no direct access to the target RL model, causative attacks---attacks that target the training process---can poison RL agents even if the attacker only has partial observability of the environment. In addition, we propose an inversion defence method that aims to apply the opposite perturbation to that which an attacker might use to generate their adversarial samples. Our experimental results illustrate that the countermeasure can effectively reduce the impact of the causative attack, while not significantly affecting the training process in non-attack scenarios.

preprint2020arXiv

E-commerce Recommendation with Weighted Expected Utility

Different from shopping at retail stores, consumers on e-commerce platforms usually cannot touch or try products before purchasing, which means that they have to make decisions when they are uncertain about the outcome (e.g., satisfaction level) of purchasing a product. To study people's preferences, economics researchers have proposed the hypothesis of Expected Utility (EU) that models the subject value associated with an individual's choice as the statistical expectations of that individual's valuations of the outcomes of this choice. Despite its success in studies of game theory and decision theory, the effectiveness of EU, however, is mostly unknown in e-commerce recommendation systems. Previous research on e-commerce recommendation interprets the utility of purchase decisions either as a function of the consumed quantity of the product or as the gain of sellers/buyers in the monetary sense. As most consumers just purchase one unit of a product at a time and most alternatives have similar prices, such modeling of purchase utility is likely to be inaccurate in practice. In this paper, we interpret purchase utility as the satisfaction level a consumer gets from a product and propose a recommendation framework using EU to model consumers' behavioral patterns. We assume that consumer estimates the expected utilities of all the alternatives and choose products with maximum expected utility for each purchase. To deal with the potential psychological biases of each consumer, we introduce the usage of Probability Weight Function (PWF) and design our algorithm based on Weighted Expected Utility (WEU). Empirical study on real-world e-commerce datasets shows that our proposed ranking-based recommendation framework achieves statistically significant improvement against both classical Collaborative Filtering/Latent Factor Models and state-of-the-art deep models in top-K recommendation.

preprint2020arXiv

Graph Neural Networks with Continual Learning for Fake News Detection from Social Media

Although significant effort has been applied to fact-checking, the prevalence of fake news over social media, which has profound impact on justice, public trust and our society, remains a serious problem. In this work, we focus on propagation-based fake news detection, as recent studies have demonstrated that fake news and real news spread differently online. Specifically, considering the capability of graph neural networks (GNNs) in dealing with non-Euclidean data, we use GNNs to differentiate between the propagation patterns of fake and real news on social media. In particular, we concentrate on two questions: (1) Without relying on any text information, e.g., tweet content, replies and user descriptions, how accurately can GNNs identify fake news? Machine learning models are known to be vulnerable to adversarial attacks, and avoiding the dependence on text-based features can make the model less susceptible to the manipulation of advanced fake news fabricators. (2) How to deal with new, unseen data? In other words, how does a GNN trained on a given dataset perform on a new and potentially vastly different dataset? If it achieves unsatisfactory performance, how do we solve the problem without re-training the model on the entire data from scratch? We study the above questions on two datasets with thousands of labelled news items, and our results show that: (1) GNNs can achieve comparable or superior performance without any text information to state-of-the-art methods. (2) GNNs trained on a given dataset may perform poorly on new, unseen data, and direct incremental training cannot solve the problem---this issue has not been addressed in the previous work that applies GNNs for fake news detection. In order to solve the problem, we propose a method that achieves balanced performance on both existing and new datasets, by using techniques from continual learning to train GNNs incrementally.

preprint2020arXiv

Image Analysis Enhanced Event Detection from Geo-tagged Tweet Streams

Events detected from social media streams often include early signs of accidents, crimes or disasters. Therefore, they can be used by related parties for timely and efficient response. Although significant progress has been made on event detection from tweet streams, most existing methods have not considered the posted images in tweets, which provide richer information than the text, and potentially can be a reliable indicator of whether an event occurs or not. In this paper, we design an event detection algorithm that combines textual, statistical and image information, following an unsupervised machine learning approach. Specifically, the algorithm starts with semantic and statistical analyses to obtain a list of tweet clusters, each of which corresponds to an event candidate, and then performs image analysis to separate events from non-events---a convolutional autoencoder is trained for each cluster as an anomaly detector, where a part of the images are used as the training data and the remaining images are used as the test instances. Our experiments on multiple datasets verify that when an event occurs, the mean reconstruction errors of the training and test images are much closer, compared with the case where the candidate is a non-event cluster. Based on this finding, the algorithm rejects a candidate if the difference is larger than a threshold. Experimental results over millions of tweets demonstrate that this image analysis enhanced approach can significantly increase the precision with minimum impact on the recall.

preprint2015arXiv

Signal Improvement and Contrast Enhancement in Magnetic Resonance Imaging

This thesis reports advances in magnetic resonance imaging (MRI), with the ultimate goal of improving signal and contrast in biomedical applications. More specifically, novel MRI pulse sequences have been designed to characterize microstructure, enhance signal and contrast in tissue, and image functional processes. In this thesis, rat brain and red bone marrow images are acquired using iMQCs (intermolecular multiple quantum coherences) between intermediate separated spins. As an important application, iMQCs images in different directions can be used for anisotropy mapping and tissue microstructure analysis. At the same time, the simulations prove that the dipolar field from the overall shape only has small contributions to the experimental iMQC signal. Besides magnitude of iMQCs, phase of iMQCs should be studied as well. The phase anisotropy maps built by our method can clearly show susceptibility information in rat brain. It may provide meaningful diagnostic information. To deeply study susceptibility, the modified-crazed sequence is developed. Combining phase data of modified-crazed images and phase data of iMQCs images is very promising to construct microstructure maps. Obviously, the phase image in all above techniques needs to be highly-contrasted and clear. To achieve the goal, algorithm tools from Susceptibility-Weighted Imaging (SWI) and Susceptibility Tensor Imaging (STI) stands out superb useful and creative in our system.

preprint2013arXiv

Anisotropy mapping in rat brains using Intermolecular Multiple Quantum Coherence Effects

This document reports an unconventional and rapidly developing approach to magnetic resonance imaging (MRI) using intermolecular multiple-quantum coherences (iMQCs). Rat brain images are acquired using iMQCs. We detect iMQCs between spins that are 10 μm to 500 μm apart. The interaction between spins is dependent on different directions. We can choose the directions on physical Z, Y and X axis by choosing correlation gradients along those directions. As an important application, iMQCs can be used for anisotropy mapping. In the rat brains, we investigate tissue microstructure. We simulated images expected from rat brains without microstructure. We compare those with experimental results to prove that the dipolar field from the overall shape only has small contributions to the experimental iMQC signal. Because of the underlying low signal to noise ratio (SNR) in iMQCs, this anisotropy mapping method still has comparatively large potentials to grow. The ultimate goal of my project is to develop creative and effective methods of tissue microstructure anisotropy mapping. Recently we found that combining phase data of iMQCs images with phase data of modified-crazed images is very promising to construct microstructure maps. Some information and initial results are shown in this document.

Yi Han

What is connected

Connect this record

See the researcher in context

Building this map preview

13 published item(s)

Adaptive Causal Coordination Detection for Social Media: A Memory-Guided Framework with Semi-Supervised Learning

CARLA-Round: A Multi-Factor Simulation Dataset for Roundabout Trajectory Prediction

Herculean: An Agentic Benchmark for Financial Intelligence

RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics

RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics

Smoothness of the density for McKean-Vlasov SDEs with measurable kernel

Example-based Real-time Clothing Synthesis for Virtual Agents

Adversarial Reinforcement Learning under Partial Observability in Autonomous Computer Network Defence

E-commerce Recommendation with Weighted Expected Utility

Graph Neural Networks with Continual Learning for Fake News Detection from Social Media

Image Analysis Enhanced Event Detection from Geo-tagged Tweet Streams

Signal Improvement and Contrast Enhancement in Magnetic Resonance Imaging

Anisotropy mapping in rat brains using Intermolecular Multiple Quantum Coherence Effects