Source author record

Yuqiang Li

Yuqiang Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

math.PR, Machine Learning, math.ST, Statistics Theory, Artificial Intelligence, astro-ph.GA, astro-ph.IM, astro-ph.SR, gr-qc, math.OC

Catalog footprint

What is connected

13 works

10 topics

4 close collaborators

preprint 2026 arXiv

Improved Model-based Reinforcement Learning with Smooth Kernels

For continuous state-action space scenarios, classical reinforcement learning (RL) theory predominantly focuses on low-rank Markov decision processes (MDPs), which provide sample-efficient guarantees at the expense of restrictive structural assumptions. Kernel smoothing model-based approaches offer a promising alternative paradigm that instead leverages the smoothness of the MDP and employs non-parametric kernel smoothing estimates of transition dynamics. This paper proposes a new kernel-smoothing model-based approach for online reinforcement learning in finite-horizon settings under Lipschitz continuity assumptions on the MDP. By incorporating a Bernstein-style exploration bonus into the kernel smoothing framework, our method achieves a regret bound which improves upon the state-of-the-art regret bound in its dependence on the horizon. The theoretical advancement relies on a delicate analysis of the synergy between Bernstein-style bonuses and kernel smoothing, where a new tight Bernstein-type concentration inequality for martingales may be of independent interest.

preprint 2026 arXiv

Pessimistic Risk-Aware Policy Learning in Contextual Bandits

We study risk-aware offline policy learning, aiming to learn a decision rule from logged data that is optimal under general risk criteria. This problem is crucial in high-stakes domains where online interaction is infeasible and adverse outcomes must be carefully controlled. However, existing literature on offline contextual bandits either centers on expected-reward criteria or restricts risk considerations to policy evaluation instead of optimization. In this work, we propose a unified distributional framework for optimizing Lipschitz-continuous risk functionals, a broad class of risk measures encompassing mean-variance, entropic risk, and conditional value-at-risk, among others. By developing novel empirical concentration inequalities for importance sampling-based distributional estimators, our analysis derives data-dependent suboptimality bounds with an $\tilde{\mathcal{O}}(1/\sqrt{n})$ rate, without relying on restrictive uniform overlap assumptions. This rate is minimax optimal and matches that of risk-neutral offline policy optimization, indicating that optimizing general Lipschitz risk criteria incurs no additional statistical cost relative to the expected-reward criterion.

preprint 2026 arXiv

Unleashing LLMs in Bayesian Optimization: Preference-Guided Framework for Scientific Discovery

Scientific discovery is increasingly constrained by costly experiments and limited resources, underscoring the need for efficient optimization in AI for science. Bayesian Optimization (BO), though widely adopted for balancing exploration and exploitation, often exhibits slow cold-start performance and poor scalability in high-dimensional settings, limiting its applicability in real-world scientific problems. To overcome these challenges, we propose LLM-Guided Bayesian Optimization (LGBO), the first LLM preference-guided BO framework that continuously integrates the semantic reasoning of large language models (LLMs) into the optimization loop. Unlike prior works that use LLMs only for warm-start initialization or candidate generation, LGBO introduces a region-lifted preference mechanism that embeds LLM-driven preferences into every iteration, shifting the surrogate mean in a stable and controllable way. Theoretically, we prove that LGBO does not perform significantly worse than standard BO in the worst case, while achieving significantly faster convergence when preferences align with the objective. Empirically, LGBO consistently outperforms existing methods across diverse dry benchmarks in physics, chemistry, biology, and materials science. Most notably, in a new wet-lab optimization of Fe-Cr battery electrolytes, LGBO attains 90% of the best observed value within 6 iterations, whereas standard BO and existing LLM-augmented baselines require more than 10. Together, these results suggest that LGBO offers a promising direction for integrating LLMs into scientific optimization workflows.

preprint 2023 arXiv

Large and moderate deviations of weak record numbers in random walks

Record numbers are basic statistics in random walks, whose deviation principles are not very clear so far. In this paper, the asymptotic probabilities of large and moderate deviations for numbers of weak records in right continuous or left continuous random walks are proved.

preprint 2022 arXiv

Deuterated ammonia in Galactic massive star-forming regions

We present sensitive observations of NH2D at 110.153599 GHz toward 50 Galactic massive star-forming regions with the IRAM 30-m telescope. The NH2D transition is detected toward 36 objects, yielding a detection rate of 72%. Column densities of NH2D, HC3N and C18O for each source are derived by assuming local thermal equilibrium conditions with a fixed excitation temperature. The deuterium ratio of NH3, defined as the abundance ratio of NH2D to NH3, is also obtained for 19 sources using NH3 data from the literature. The range of deuterium fractionation tends to be large in the late-stage star-forming regions in this work, with values ranging from 0.0006 to 0.043. The highest deuterium ratio of NH3 is 0.043 in G081.75+00.78 (DR21). We also find that the deuterium ratio of NH3 increases with Galactocentric distance and decreases with line width.

preprint 2020 arXiv

The TianQin project: current progress on science and technology

TianQin is a planned space-based gravitational wave (GW) observatory consisting of three Earth-orbiting satellites with an orbital radius of about $10^5~{\rm km}$. The satellites will form an equilateral triangle constellation, the plane of which is nearly perpendicular to the ecliptic plane. TianQin aims to detect GWs between $10^{-4}~{\rm Hz}$ and $1~{\rm Hz}$ that can be generated by a wide variety of important astrophysical and cosmological sources, including the inspiral of Galactic ultra-compact binaries, the inspiral of stellar-mass black hole binaries, extreme mass ratio inspirals, the merger of massive black hole binaries, and possibly the energetic processes in the very early universe or exotic sources such as cosmic strings. In order to start science operations around 2035, a roadmap called the 0123 plan is being used to bring the key technologies of TianQin to maturity, supported by the construction of a series of research facilities on the ground. Two major projects of the 0123 plan are being carried out. In this process, the team has created a new-generation $17~{\rm cm}$ single-body hollow corner-cube retro-reflector, which was launched with the QueQiao satellite on 21 May 2018; a new laser ranging station equipped with a $1.2~{\rm m}$ telescope has been constructed, and the station has successfully ranged to all five retro-reflectors on the Moon; and the TianQin-1 experimental satellite was launched on 20 December 2019, and the first-round results show that the satellite has exceeded all of its mission requirements.

preprint 2015 arXiv

Exact moduli of continuity for operator-scaling Gaussian random fields

Let $X=\{X(t),t\in\mathrm{R}^N\}$ be a centered real-valued operator-scaling Gaussian random field with stationary increments, introduced by Biermé, Meerschaert and Scheffler (Stochastic Process. Appl. 117 (2007) 312-332). We prove that $X$ satisfies a form of strong local nondeterminism and establish its exact uniform and local moduli of continuity. The main results are expressed in terms of the quasi-metric $τ_E$ associated with the scaling exponent of $X$. Examples are provided to illustrate the subtle changes of the regularity properties.

preprint 2012 arXiv

Approximations of fractional Brownian motion

Approximations of fractional Brownian motion using Poisson processes whose parameter sets have the same dimensions as the approximated processes have been studied in the literature. In this paper, a special approximation to the one-parameter fractional Brownian motion is constructed using a two-parameter Poisson process. The proof involves the tightness and identification of finite-dimensional distributions.

preprint 2012 arXiv

Fluctuation limits of strongly degenerate branching systems

Functional limit theorems for scaled fluctuations of occupation time processes of a sequence of critical branching particle systems in $\R^d$ with anisotropic space motions and strongly degenerated splitting abilities are proved in the cases of critical and intermediate dimensions. The results show that the limit processes are constant measure-valued Wiener processes with degenerated temporal and simple spatial structures.

preprint 2012 arXiv

Riemann-Liouville processes arising from Branching particle systems

It is proved in this paper that Riemann-Liouville processes can arise from the temporal structures of the scaled occupation time fluctuation limits of the site-dependent $(d,\alpha,\sigma(x))$-branching particle systems in the case of $1=d<\alpha<2$ and $\int_{\R}\sigma(x)\,dx<\infty$.

preprint 2011 arXiv

Functional ergodic theorems of site-dependent branching Brownian motions in R

In this paper, we study the functional ergodic limits of site-dependent branching Brownian motions in R. The results show that the limiting processes are non-degenerate if and only if the variance functions of the branching laws are integrable. When the functions are integrable, although the limiting processes vary according to the integrals, they are always positive, infinitely divisible and self-similar, and their marginal distributions are determined by a class of 1/2-fractional integral equations. As a byproduct, the unique non-negative solutions of the integral equations are explicitly expressed via the Lévy measure of the corresponding limiting processes.

preprint 2011 arXiv

Multivariate Operator-Self-Similar Random Fields

Multivariate random fields whose distributions are invariant under operator-scalings in both time-domain and state space are studied. Such random fields are called operator-self-similar random fields and their scaling operators are characterized. Two classes of operator-self-similar stable random fields $X=\{X(t), t \in \R^d\}$ with values in $\R^m$ are constructed by utilizing homogeneous functions and stochastic integral representations.

preprint 2011 arXiv

Occupation Time Fluctuations of Weakly Degenerate Branching Systems

We establish limit theorems for re-scaled occupation time fluctuations of a sequence of branching particle systems in $\R^d$ with anisotropic space motion and weakly degenerate splitting ability. In the case of large dimensions, our limit processes lead to a new class of operator-scaling Gaussian random fields with non-stationary increments. In the intermediate and critical dimensions, the limit processes have spatial structures analogous to (but more complicated than) those arising from the critical branching particle system without degeneration considered by Bojdecki et al. (Stochastic Process. Appl. 2006). Due to the weakly degenerate branching ability, temporal structures of the limit processes in all three cases are different from those obtained by Bojdecki et al. (Stochastic Process. Appl. 2006).

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint