Source author record

Xiaoyu Hu

Xiaoyu Hu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Methodology, Computation and Language, math-ph, math.MP, math.PR, physics.med-ph

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators

preprint, 2026, arXiv

Fin-Bias: Comprehensive Evaluation for LLM Decision-Making under human bias in Finance Domain

Large language models (LLMs) are increasingly deployed in financial contexts, raising critical concerns about reliability, alignment, and susceptibility to adversarial manipulation. While prior finance-related benchmarks assess LLMs' capabilities in stock trading, they are often restricted to small samples and fail to demonstrate LLM susceptibility to context with potential human bias. We introduce Fin-Bias (financial herding under long and uncertain financial context), a benchmark for evaluating LLM investment decision-making when faced with uncertainty and possibly human-biased opinions. Fin-Bias includes 8,868 long firm-specific analyst reports spanning various industries, each covering firm aspects summarized and analyzed by sophisticated analysts and carrying an investment rating (Bullish/Neutral/Bearish). We present LLMs with these reports with the analyst's investment rating, without it, or with a 'fake' rating, and collect the investment ratings the LLMs generate. Our results reveal that LLMs tend to herd toward the explicit bias in the context. We also develop a method to detect potential human opinions, which can encourage LLMs to think independently; some models even exceed human performance in predicting future stock returns.
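One way to quantify the herding described above is to count how often a model abandons its own rating in favor of the rating shown in the context. The sketch below is only an illustration under our own assumptions: the function name `herding_rate`, the three input lists, and the toy ratings are all hypothetical and are not taken from the Fin-Bias benchmark or its code.

```python
from typing import Sequence


def herding_rate(baseline: Sequence[str], cued: Sequence[str], cue: Sequence[str]) -> float:
    """Fraction of reports where the model switched to the rating shown in context.

    baseline[i]: model rating when report i is shown without any analyst rating
    cued[i]:     model rating when report i is shown together with cue[i]
    cue[i]:      the (real or fake) rating inserted into the context
    Only reports whose baseline disagrees with the cue can reveal herding,
    so the rate is computed over those reports alone.
    """
    if not (len(baseline) == len(cued) == len(cue)):
        raise ValueError("all three sequences must have the same length")
    eligible = [(c, q) for b, c, q in zip(baseline, cued, cue) if b != q]
    if not eligible:
        return 0.0
    switched = sum(1 for c, q in eligible if c == q)
    return switched / len(eligible)


# Made-up ratings, purely for illustration (hypothetical data).
baseline = ["Bullish", "Neutral", "Bearish", "Bullish"]
cue = ["Bearish", "Neutral", "Bullish", "Bearish"]
cued = ["Bearish", "Neutral", "Bullish", "Bullish"]
rate = herding_rate(baseline, cued, cue)
print(f"herding rate: {rate:.2f}")
```

In this toy input, three reports have a baseline that disagrees with the cue, and the model adopts the cue in two of them. Restricting the denominator to disagreeing reports avoids counting agreement that would have happened without any cue.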

preprint, 2026, arXiv

Human-like AI-based Auto-Field-in-Field Whole-Brain Radiotherapy Treatment Planning With Conversation Large Language Model Feedback

Whole-brain radiotherapy (WBRT) is a common treatment due to its simplicity and effectiveness. While automated Field-in-Field (Auto-FiF) functions assist WBRT planning in modern treatment planning systems, optimal plan generation still requires manual steps, including patient-specific hyperparameter definition and plan refinement based on quality feedback. This study introduces an automated WBRT planning pipeline that integrates a deep learning (DL) Hyperparameter Prediction model for patient-specific parameter generation and a large language model (LLM)-based conversational interface for interactive plan refinement. The Hyperparameter Prediction module was trained on 55 WBRT cases using geometric features of the clinical target volume (CTV) and organs at risk (OARs) to determine optimal Auto-FiF settings in the RayStation treatment planning system. Plans were generated under the predicted hyperparameters. For cases in which the generated plan was suboptimal, quality feedback via voice input was captured by a Conversation module, transcribed using Whisper, and interpreted by GPT-4o to adjust planning settings. Plan quality was evaluated in 15 independent cases using clinical metrics and expert review, and model explainability was supported through analysis of feature importance. Fourteen of 15 DL-generated plans were clinically acceptable. When normalized to the same CTV D95% as the clinical plans, the DL-generated and clinical plans showed no statistically significant differences in doses to the eyes, lenses, or CTV dose metrics D1% and D99%. The DL-based planning required under 1 minute of computation and achieved total workflow execution in approximately 7 minutes with a single mouse click, compared to 15 minutes for manual planning. In cases requiring adjustment, the Conversation module successfully improved dose conformity and hotspot reduction.

preprint, 2022, arXiv

Dynamic Principal Component Analysis in High Dimensions

Principal component analysis is a versatile dimensionality-reduction tool with wide applications in statistics and machine learning. It is particularly useful for modeling data in high-dimensional scenarios where the number of variables $p$ is comparable to, or much larger than, the sample size $n$. Despite an extensive literature on this topic, researchers have focused on modeling static principal eigenvectors, which are not suitable for stochastic processes that are dynamic in nature. To characterize the change over the entire course of high-dimensional data collection, we propose a unified framework to directly estimate dynamic eigenvectors of covariance matrices. Specifically, we formulate an optimization problem by combining local linear smoothing and a regularization penalty with an orthogonality constraint, which can be effectively solved by manifold optimization algorithms. We show that our method is suitable for high-dimensional data observed under both common and irregular designs, and theoretical properties of the estimators are investigated under $\ell_q$ ($0 \leq q \leq 1$) sparsity. Extensive experiments demonstrate the effectiveness of the proposed method in both simulated and real data examples.
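To make the idea of a time-varying eigenvector concrete, here is a deliberately simplified sketch. It is not the estimator proposed in the paper: it uses local-constant Gaussian kernel weighting instead of local linear smoothing, applies no sparsity penalty, and finds the leading eigenvector by plain power iteration rather than manifold optimization. All function names, the bandwidth, and the toy data are our own assumptions.

```python
import math


def local_covariance(times, samples, t0, bandwidth):
    """Gaussian-kernel-weighted covariance of the samples around time t0."""
    p = len(samples[0])
    weights = [math.exp(-0.5 * ((t - t0) / bandwidth) ** 2) for t in times]
    total = sum(weights)
    mean = [sum(w * x[j] for w, x in zip(weights, samples)) / total for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for w, x in zip(weights, samples):
        d = [x[j] - mean[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                cov[a][b] += w * d[a] * d[b] / total
    return cov


def leading_eigenvector(matrix, iterations=500):
    """Power iteration on a symmetric matrix; returns a unit-norm vector."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


# Toy data (hypothetical): 2-D observations whose dominant direction rotates
# from the first axis at t = 0 to the second axis at t = 1.
n = 200
times = [i / (n - 1) for i in range(n)]
samples = []
for i, t in enumerate(times):
    angle = 0.5 * math.pi * t
    sign = 1.0 if i % 2 == 0 else -1.0
    samples.append([sign * math.cos(angle), sign * math.sin(angle)])

v_start = leading_eigenvector(local_covariance(times, samples, 0.0, 0.1))
v_end = leading_eigenvector(local_covariance(times, samples, 1.0, 0.1))
print("near t=0:", v_start)  # dominated by the first coordinate
print("near t=1:", v_end)    # dominated by the second coordinate
```

A single static PCA on the pooled toy data would return one compromise direction. The local estimate tracks the rotation, which is the behavior a dynamic eigenvector estimator is meant to capture.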

preprint, 2021, arXiv

Sparse Functional Principal Component Analysis in High Dimensions

Functional principal component analysis (FPCA) is a fundamental tool and has attracted increasing attention in recent decades, while existing methods are restricted to data with a single or finite number of random functions (much smaller than the sample size $n$). In this work, we focus on high-dimensional functional processes where the number of random functions $p$ is comparable to, or even much larger than, $n$. Such data are ubiquitous in various fields such as neuroimaging analysis, and cannot be properly modeled by existing methods. We propose a new algorithm, called sparse FPCA, which is able to model principal eigenfunctions effectively under sensible sparsity regimes. While sparsity assumptions are standard in multivariate statistics, they have not been investigated in the complex context where not only is $p$ large, but also each variable itself is an intrinsically infinite-dimensional process. The sparsity structure motivates a thresholding rule that is easy to compute without nonparametric smoothing by exploiting the relationship between univariate orthonormal basis expansions and multivariate Karhunen-Loève (K-L) representations. We investigate the theoretical properties of the resulting estimators, and illustrate the performance with simulated and real data examples.

preprint, 2010, arXiv

Thick points of the Gaussian free field

Let $U\subseteq\mathbf{C}$ be a bounded domain with smooth boundary and let $F$ be an instance of the continuum Gaussian free field on $U$ with respect to the Dirichlet inner product $\int_U\nabla f(x)\cdot \nabla g(x)\,dx$. The set $T(a;U)$ of $a$-thick points of $F$ consists of those $z\in U$ such that the average of $F$ on a disk of radius $r$ centered at $z$ has growth $\sqrt{a/\pi}\log\frac{1}{r}$ as $r\to 0$. We show that for each $0\leq a\leq2$ the Hausdorff dimension of $T(a;U)$ is almost surely $2-a$, that $\nu_{2-a}(T(a;U))=\infty$ when $0<a\leq2$ and $\nu_2(T(0;U))=\nu_2(U)$ almost surely, where $\nu_\alpha$ is the Hausdorff-$\alpha$ measure, and that $T(a;U)$ is almost surely empty when $a>2$. Furthermore, we prove that $T(a;U)$ is invariant under conformal transformations in an appropriate sense. The notion of a thick point is connected to the Liouville quantum gravity measure with parameter $\gamma$ given formally by $\Gamma(dz)=e^{\sqrt{2\pi}\gamma F(z)}\,dz$ considered by Duplantier and Sheffield.
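For readers who prefer a displayed form, the definition and main dimension result above can be written as follows. This is a restatement under two assumptions of ours: $F_r(z)$ is our own shorthand for the average of $F$ over the disk of radius $r$ centered at $z$, and "has growth" is read as a limit of the ratio.

$$
T(a;U)=\Big\{\, z\in U \;:\; \lim_{r\to 0}\frac{F_r(z)}{\log\frac{1}{r}}=\sqrt{a/\pi} \,\Big\},
\qquad
\dim_{\mathcal H} T(a;U)=2-a \ \text{ almost surely for } 0\leq a\leq 2 .
$$

At $a=0$ the set has full dimension $2$, and the dimension decreases linearly to $0$ at $a=2$, consistent with $T(a;U)$ being almost surely empty for $a>2$.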