Source author record

Stephen S. -T. Yau

Stephen S. -T. Yau appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

math.AP; hep-th; math.NA; math.OC; Artificial Intelligence; Computational Engineering, Finance, and Science; Information Theory; math.AG; math.CO; math.CV; math.DG; math.IT; Quantitative Methods

Catalog footprint

What is connected

11 works

13 topics

4 close collaborators


preprint, 2026, arXiv

AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning

Reinforcement learning (RL) has substantially improved the ability of large language model (LLM) agents to interact with environments and solve multi-turn tasks. However, effective agentic RL remains challenging: sparse outcome-only rewards provide limited guidance for assigning credit to individual steps within long interaction trajectories. Existing approaches often introduce dense intermediate supervision, such as process reward models or auxiliary self-supervised signals, which increases supervision and tuning complexity and may limit generalization across tasks and domains. We present AEM, a supervision-free credit assignment method that adaptively modulates entropy dynamics during RL training to improve the exploration-exploitation trade-off. Since in agentic RL the environment is typically affected by a complete response, rather than an individual token, our analysis lifts entropy dynamics from the token level to the response level, aligning uncertainty estimation with the effective action granularity of LLM agents and reducing sensitivity to token-level sampling noise. We further show that entropy drift under natural-gradient updates is governed by the interaction between the sampled-response advantage and its relative surprisal. Motivated by this result, AEM derives a practical response-level uncertainty proxy and uses it to rescale advantages, leveraging the evolving balance between positive and negative samples to naturally transition from exploration to exploitation. Extensive experiments on ALFWorld, WebShop, and SWE-bench-Verified with models ranging from 1.5B to 32B demonstrate that AEM consistently improves strong RL baselines, including a +1.4% gain when integrated into a state-of-the-art software-engineering RL training framework.

preprint, 2016, arXiv

4d N=2 SCFT and singularity theory Part II: Complete intersection

We classify three dimensional isolated weighted homogeneous rational complete intersection singularities, which define many new four dimensional N=2 superconformal field theories. We also determine the mini-versal deformation of these singularities, and therefore solve the Coulomb branch spectrum and Seiberg-Witten solution.

preprint, 2016, arXiv

4d N=2 SCFT from Complete Intersection Singularity

Detailed studies of four dimensional N=2 superconformal field theories (SCFT) defined by isolated complete intersection singularities are performed: we compute the Coulomb branch spectrum, Seiberg-Witten solutions and central charges. Most of our theories have exactly marginal deformations and we identify the weakly coupled gauge theory descriptions for many of them, which involve (affine) D and E shaped quiver gauge theories and theories formed from Argyres-Douglas matters. These investigations provide strong evidence for the singularity approach in classifying 4d N=2 SCFTs.

preprint, 2014, arXiv

Complete Real Time Solution of the General Nonlinear Filtering Problem without Memory

It is well known that the nonlinear filtering problem has important applications in both military and civil industries. The central problem of nonlinear filtering is to solve the Duncan-Mortensen-Zakai (DMZ) equation in real time and in a memoryless manner. In this paper, we extend the algorithm developed previously by S.-T. Yau and the second author to the most general setting of nonlinear filtering, where the drift term and the observation term depend explicitly on time, and the variance of the noises may be a matrix of functions of both time and the states. To preserve the off-line virtue of the algorithm, the necessary modifications are illustrated clearly. Moreover, it is shown rigorously that the approximate solution obtained by the algorithm converges to the real solution in the $L^1$ sense, and a precise error estimate is given. Finally, numerical simulations support the feasibility and efficiency of our algorithm.

preprint, 2014, arXiv

Hermite spectral method to 1D forward Kolmogorov equation and its application to nonlinear filtering problems

In this paper, we investigate the Hermite spectral method (HSM) to numerically solve the forward Kolmogorov equation (FKE). A useful guideline for choosing the scaling factor of the generalized Hermite functions is given, which greatly improves the resolution of the HSM. The convergence rate of the HSM for the FKE is analyzed in a suitable function space and verified by numerical simulation. As an important application, and our primary motivation for studying the HSM for the FKE, we work on the implementation of the nonlinear filtering (NLF) problem with a real-time algorithm developed in [17]. The HSM for the FKE serves as the off-line computation in this algorithm. The translating factor of the generalized Hermite functions and the moving-window technique are introduced to deal with the drifting of the posterior conditional density function of the states in the on-line experiments. Two numerical experiments on NLF problems are carried out to illustrate the feasibility of our algorithm. Moreover, our algorithm surpasses the particle filter as a real-time solver for NLF.
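The scaling and translating factors mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common convention for generalized Hermite functions, namely the orthonormal Hermite functions evaluated at `alpha * (x - beta)` and multiplied by `sqrt(alpha)`; the paper's own normalization may differ. The names `generalized_hermite`, `inner_product`, `alpha` (scaling factor) and `beta` (translating factor) are illustrative choices, not taken from the source.

```python
import math

def generalized_hermite(n_max, x, alpha=1.0, beta=0.0):
    """Values of orders 0..n_max at x (assumed convention, see lead-in)."""
    y = alpha * (x - beta)
    # Order 0 is the normalized Gaussian; higher orders follow the
    # three-term recurrence for orthonormal Hermite functions.
    psi = [math.pi ** -0.25 * math.exp(-0.5 * y * y)]
    if n_max >= 1:
        psi.append(math.sqrt(2.0) * y * psi[0])
    for n in range(1, n_max):
        psi.append(math.sqrt(2.0 / (n + 1)) * y * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    scale = math.sqrt(alpha)  # keeps unit L2 norm after rescaling x
    return [scale * p for p in psi]

def inner_product(m, n, alpha=1.0, beta=0.0, half_width=12.0, points=4001):
    """Trapezoidal approximation of the L2 inner product of orders m and n."""
    a = beta - half_width / alpha
    h = 2.0 * half_width / alpha / (points - 1)
    total = 0.0
    for i in range(points):
        vals = generalized_hermite(max(m, n), a + i * h, alpha, beta)
        w = 0.5 if i in (0, points - 1) else 1.0
        total += w * vals[m] * vals[n]
    return total * h
```

Under this convention the family stays orthonormal for any `alpha > 0` and any `beta`, so changing the two factors only moves and stretches the region where the basis resolves a function well.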

preprint, 2014, arXiv

Hermite spectral method with hyperbolic cross approximations to high-dimensional parabolic PDEs

It is well known that the sparse grid algorithm is widely accepted as an efficient tool for overcoming the "curse of dimensionality" to some degree. In this note, we first give the error estimate of hyperbolic cross (HC) approximations with generalized Hermite functions. Exponential convergence is shown for both regular and optimized hyperbolic cross approximations. Moreover, the error estimate of the Hermite spectral method with HC approximations for high-dimensional linear parabolic PDEs is investigated in properly weighted Korobov spaces. The numerical results verify the exponential convergence of this approach.
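To make the idea of a hyperbolic cross concrete, here is a minimal sketch that enumerates a regular hyperbolic cross index set. It assumes the standard textbook definition, keeping a multi-index when the product of `max(1, n_j)` over all coordinates is at most the level; the abstract does not spell out its definition, and the optimized variant is not covered. The function name `hyperbolic_cross` is illustrative.

```python
from itertools import product

def hyperbolic_cross(dim, level):
    """Multi-indices n in {0..level}^dim with prod(max(1, n_j)) <= level."""
    keep = []
    for idx in product(range(level + 1), repeat=dim):
        weight = 1
        for n_j in idx:
            weight *= max(1, n_j)
        if weight <= level:
            keep.append(idx)
    return keep
```

For `dim=2` and `level=4` this keeps 17 of the 25 indices in the full tensor grid; the saving grows quickly with the dimension, which is the point of the construction.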

preprint, 2014, arXiv

On Classification of Toric Surface Codes of Low Dimension

This work is a natural continuation of our previous work \cite{yz}. In this paper, we give a complete classification of toric surface codes of dimension less than or equal to 6, except for a special pair, $C_{P_6^{(4)}}$ and $C_{P_6^{(5)}}$ over $\mathbb{F}_8$. We also give an example, $C_{P_6^{(5)}}$ and $C_{P_6^{(6)}}$ over $\mathbb{F}_7$, to illustrate that two monomially equivalent toric codes can be constructed from two lattice non-equivalent polygons.

preprint, 2014, arXiv

On the quenching behavior of the MEMS with fringing field

The singular parabolic problem $u_t-\triangle u=λ{\frac{1+δ|\nabla u|^2}{(1-u)^2}}$ on a bounded domain $Ω$ of $\mathbb{R}^n$ with Dirichlet boundary condition models a microelectromechanical systems (MEMS) device with fringing field. In this paper, we focus on the quenching behavior of the solution to this equation. We first show that there exists a critical value $λ_δ^*>0$ such that if $0<λ<λ_δ^*$, all solutions exist globally, while for $λ>λ_δ^*$, all solutions quench in finite time. The estimate of the quenching time in terms of large voltage $λ$ is investigated. Furthermore, the quenching set is a compact subset of $Ω$, provided $Ω$ is a convex bounded domain in $\mathbb{R}^n$. In particular, if the domain $Ω$ is radially symmetric, then the origin is the only quenching point. We not only derive a one-sided estimate of the quenching rate, but also study the refined asymptotic behavior of the finite-time quenching solution.
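The dichotomy between global existence and finite-time quenching can be observed numerically. The following is a minimal sketch, not the paper's method: an explicit finite-difference scheme for the one-dimensional version of the equation on the unit interval, with zero Dirichlet boundary values and zero initial data. The grid size, time step, `delta=0.1`, the stopping threshold 0.99 and the function name `quench_time` are all assumptions made for illustration.

```python
def quench_time(lam, delta=0.1, n=50, t_max=1.0, threshold=0.99):
    """Time at which max(u) first reaches `threshold`, or None if it never does."""
    dx = 1.0 / n
    dt = 0.4 * dx * dx          # within the explicit stability limit dx^2 / 2
    u = [0.0] * (n + 1)         # u = 0 initially and at both end points
    t = 0.0
    while t < t_max:
        if max(u) >= threshold:
            return t            # numerical proxy for quenching (u -> 1)
        new = u[:]
        for i in range(1, n):
            lap = (u[i + 1] - 2.0 * u[i] + u[i - 1]) / (dx * dx)
            grad = (u[i + 1] - u[i - 1]) / (2.0 * dx)
            source = lam * (1.0 + delta * grad * grad) / (1.0 - u[i]) ** 2
            new[i] = u[i] + dt * (lap + source)
        u = new
        t += dt
    return None
```

With these settings a large voltage such as `lam=10.0` reaches the threshold early in the run, while a small one such as `lam=0.2` stays well below it up to `t_max`, which is consistent with the existence of a critical value separating the two regimes.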

preprint, 2014, arXiv

Time-dependent Hermite-Galerkin spectral method and its applications

A time-dependent Hermite-Galerkin spectral method (THGSM) is investigated in this paper for nonlinear convection-diffusion equations in unbounded domains. A time-dependent scaling factor and translating factor are introduced in the definition of the generalized Hermite functions (GHF). As a consequence, the THGSM based on these GHF has many advantages, not only in theoretical proofs, but also in numerical implementations. The stability and spectral convergence of our proposed method are established in this paper. The Korteweg-de Vries-Burgers (KdVB) equation and its special cases, including the heat equation and the Burgers' equation, are solved numerically by our method as examples. The numerical results are presented, and they surpass the existing methods in accuracy. Our theoretical proof of the spectral convergence is supported by the numerical results.

preprint, 2013, arXiv

Denoising the 3-Base Periodicity Walks of DNA Sequences in Gene Finding

A nonlinear Tracking-Differentiator is a one-input, two-output system that can generate a smooth approximation of measured signals and obtain the derivatives of the signals. The nonlinear Tracking-Differentiator is explored to denoise and generate the derivatives of the walks of the 3-periodicity of DNA sequences. An improved algorithm for gene finding is presented using the nonlinear Tracking-Differentiator. The gene finding algorithm employs the 3-base periodicity of coding regions. The 3-base periodicity DNA walks are denoised and tracked using the nonlinear Tracking-Differentiator. Case studies demonstrate that the nonlinear Tracking-Differentiator is an effective method to improve the accuracy of the gene finding algorithm.
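For readers unfamiliar with 3-base periodicity, the sketch below shows one widely used way to measure it: the Fourier power of the four base-indicator sequences at frequency 1/3. This is a swapped-in illustration of the periodicity signal only; it is not the DNA-walk construction or the Tracking-Differentiator described in the abstract, and the function name `period3_power` is hypothetical.

```python
import cmath

def period3_power(seq):
    """Sum over A, C, G, T of |DFT of the base indicator at frequency 1/3|^2."""
    power = 0.0
    for base in "ACGT":
        # Each occurrence of `base` at position k contributes exp(-2*pi*i*k/3).
        coeff = sum(cmath.exp(-2j * cmath.pi * k / 3)
                    for k, s in enumerate(seq.upper()) if s == base)
        power += abs(coeff) ** 2
    return power
```

A sequence that repeats a codon pattern, such as `"ATG" * 20`, yields a large value, whereas a sequence with period 4 and length divisible by 12, such as `"ACGT" * 15`, yields essentially zero.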

preprint, 1998, arXiv

Counterexample to boundary regularity of a strongly pseudoconvex CR submanifold: An addendum to the paper of Harvey-Lawson

The purpose of this paper is to give a counterexample to Theorem 10.4 in [Ann. of Math. 102 (1975), 223-290]. In the Harvey-Lawson paper, a global result is claimed, but only a local result is proven. This theorem has had a big impact on CR geometry for almost a quarter of a century because one can use the theory of isolated singularities to study the theory of CR manifolds and vice versa.
