Researcher profile

SiYuan Ma

SiYuan Ma contributes to research discovery and scholarly infrastructure.

Researcher - Affiliation not imported - Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
6 topics
4 close collaborators

Published work

7 published items

preprint - 2026 - arXiv

Large Language Models as Amortized Pareto-Front Generators for Constrained Bi-Objective Convex Optimization

Generating feasible Pareto fronts for constrained bi-objective continuous optimization is central to multi-criteria decision-making. Existing methods usually rely on iterative scalarization, evolutionary search, or problem-specific solvers, requiring repeated optimization for each instance. We introduce DIPS, an end-to-end framework that fine-tunes large language models as amortized Pareto-front generators for constrained bi-objective convex optimization. Given a textual problem description, DIPS directly outputs an ordered set of feasible continuous decision vectors approximating the Pareto front. To make continuous optimization compatible with autoregressive language modeling, DIPS combines a compact discretization scheme, Numerically Grounded Token Initialization for new numerical tokens, and Three-Phase Curriculum Optimization, which progressively aligns structural validity, feasibility, and Pareto-front quality. Across five families of constrained bi-objective convex problems, a fine-tuned 7B-parameter model achieves normalized hypervolume ratios of 95.29% to 98.18% relative to reference fronts. With vLLM-accelerated inference, DIPS solves one instance in as little as 0.16 seconds and outperforms general-purpose and reasoning LLM baselines under the evaluated setting. These results suggest that LLMs can serve as effective amortized generators for continuous Pareto-front approximation.
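The hypervolume ratio quoted in this abstract can be illustrated with a minimal sketch. It assumes both objectives are minimized and that the ratio is the hypervolume of the generated front divided by that of the reference front, measured against a shared reference point. The abstract does not give the paper's normalization or reference point, so the function names `pareto_filter`, `hypervolume_2d`, and `hypervolume_ratio` and the sample points are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (f1, f2) objective values, both minimized


def pareto_filter(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing f1."""
    front: List[Point] = []
    best_f2 = float("inf")
    for f1, f2 in sorted(set(points)):
        # After sorting by f1, a point is non-dominated only if it
        # strictly improves on the best f2 seen so far.
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded by the reference point."""
    front = [p for p in pareto_filter(points) if p[0] < ref[0] and p[1] < ref[1]]
    area = 0.0
    prev_f2 = ref[1]
    for f1, f2 in front:
        # Add the horizontal slab between the previous f2 level and this one.
        area += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area


def hypervolume_ratio(approx: List[Point], reference: List[Point], ref: Point) -> float:
    """Assumed normalization: HV(approx) / HV(reference)."""
    return hypervolume_2d(approx, ref) / hypervolume_2d(reference, ref)


# Illustrative data only, not results from the paper.
reference_front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
generated_front = [(1.5, 3.0), (3.0, 1.5)]
ratio = hypervolume_ratio(generated_front, reference_front, ref=(4.0, 4.0))
```

With this sample data the reference front covers an area of 6 and the generated front an area of 4, so the ratio is 2/3. A ratio close to 1 means the generated front dominates nearly as much of the objective space as the reference front.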

preprint - 2022 - arXiv

Nonlinear radiation gauge for near Kerr spacetimes

In this paper, we introduce and explore the properties of a new gauge choice for the vacuum Einstein equation inspired by the ingoing and outgoing radiation gauges (IRG, ORG) for the linearized vacuum Einstein equation introduced by Chrzanowski in his work on metric reconstruction [DOI:10.1103/PhysRevD.11.2042] on the Kerr background. It has been shown by Price, Shankar and Whiting [arXiv:gr-qc/0611070] that the IRG/ORG are consistent gauges for the linearized vacuum Einstein equation on Petrov type II backgrounds. In [arXiv:1903.03859], the ORG was used in proving linearized stability for the Kerr spacetime, and the new non-linear radiation gauge introduced here is a direct generalization of that gauge condition, and is intended to be used to study the stability of Kerr black holes under the evolution generated by the vacuum Einstein equation.

preprint - 2022 - arXiv

Sharp decay estimates for massless Dirac fields on a Schwarzschild background

We consider the explicit asymptotic profile of massless Dirac fields on a Schwarzschild background. First, we prove for the spin $s=\pm \frac{1}{2}$ components of the Dirac field a uniform bound of a positive definite energy and an integrated local energy decay estimate from a symmetric hyperbolic wave system. Based on these estimates, we further show that these components have globally pointwise decay $f v^{-3/2-s}\tau^{-5/2+s}$ as both an upper and a lower bound outside the black hole, with function $f$ finite and explicitly expressed in terms of the initial data and the coordinates. This establishes the validity of the conjectured Price's law for massless Dirac fields outside a Schwarzschild black hole.
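As a quick check of the exponents in the decay rate stated in this abstract, substituting the two spin weights gives the following (a worked substitution only, not a formula quoted from the paper):

$$
s = +\tfrac{1}{2}:\quad f\,v^{-3/2-1/2}\,\tau^{-5/2+1/2} = f\,v^{-2}\,\tau^{-2},
\qquad
s = -\tfrac{1}{2}:\quad f\,v^{-3/2+1/2}\,\tau^{-5/2-1/2} = f\,v^{-1}\,\tau^{-3}.
$$

In both cases the powers of $v$ and $\tau$ sum to $-4$, since $(-3/2-s)+(-5/2+s)=-4$ for any $s$.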

preprint - 2020 - arXiv

Almost Price's law in Schwarzschild and decay estimates in Kerr for Maxwell field

We consider in this work the asymptotics of a Maxwell field in Schwarzschild and Kerr spacetimes. In any subextremal Kerr spacetime, we show energy and pointwise decay estimates for all components under an assumption of a basic energy and Morawetz estimate for spin $\pm 1$ components. If restricted to slowly rotating Kerr, we utilize the basic energy and Morawetz estimates proven in an earlier work to further improve these decay estimates such that the total power of decay for all components of Maxwell field is $-7/2$. In the end, depending on if the Newman--Penrose constant vanishes or not, we prove almost sharp Price's law decay $\tau^{-5+}$ (or $\tau^{-4+}$) for Maxwell field and $\tau^{-\ell -4+}$ (or $\tau^{-\ell -3+}$) for any $\ell$ mode of the field towards a static solution on a Schwarzschild background. All estimates are uniform in the exterior of the black hole.

preprint - 2020 - arXiv

Uniform energy bound and Morawetz estimate for extreme components of spin fields in the exterior of a slowly rotating Kerr black hole I: Maxwell field

This first part of the series treats the Maxwell equations in the exterior of a slowly rotating Kerr black hole. By performing a first-order differential operator on each extreme Newman-Penrose (N-P) scalar in a Kinnersley tetrad, the resulting equation and the Teukolsky master equation for the extreme N-P component are both in the form of an inhomogeneous "spin-weighted Fackerell-Ipser equation" (SWFIE) and constitute a weakly coupled system. We first prove energy estimate and integrated local energy decay (Morawetz) estimate for this type of inhomogeneous SWFIE following the method in (Dafermos and Rodnianski in Decay for solutions of the wave equation on Kerr exterior spacetimes I-II: the cases $|a|\ll M$ or axisymmetry, 2010, arXiv:1010.5132), and then utilize these estimates to achieve both a uniform bound of a positive definite energy and a Morawetz estimate for the coupled system of each extreme N-P component. The same type of estimates for the regular extreme N-P components defined in the regular Hawking-Hartle tetrad is also proved. The hierarchy here is generalized in our second part (Ma in Uniform energy bound and Morawetz estimates for extreme components of spin fields in the exterior of a slowly rotating Kerr black hole II: linearized gravity, 2017, arXiv:1708.07385) of this series to treat the extreme components of linearized gravity.

preprint - 2020 - arXiv

Uniform energy bound and Morawetz estimate for extreme components of spin fields in the exterior of a slowly rotating Kerr black hole II: linearized gravity

This second part of the series treats spin $\pm2$ components (or extreme components), that satisfy the Teukolsky master equation, of the linearized gravity in the exterior of a slowly rotating Kerr black hole. For each of these two components, after performing a first-order differential operator once and twice, the resulting equations together with the Teukolsky master equation itself constitute a linear spin-weighted wave system. An energy and Morawetz estimate for spin $\pm 2$ components is proved by treating this system. This is a first step in a joint work (Andersson et al. in Stability for linearized gravity on the Kerr spacetime, arXiv:1903.03859, 2019) in addressing the linear stability of slowly rotating Kerr metrics.

preprint - 2019 - arXiv

Reconciling modern machine learning practice and the bias-variance trade-off

Breakthroughs in machine learning are rapidly changing science and society, yet our fundamental understanding of this technology has lagged far behind. Indeed, one of the central tenets of the field, the bias-variance trade-off, appears to be at odds with the observed behavior of methods used in the modern machine learning practice. The bias-variance trade-off implies that a model should balance under-fitting and over-fitting: rich enough to express underlying structure in data, simple enough to avoid fitting spurious patterns. However, in the modern practice, very rich models such as neural networks are trained to exactly fit (i.e., interpolate) the data. Classically, such models would be considered over-fit, and yet they often obtain high accuracy on test data. This apparent contradiction has raised questions about the mathematical foundations of machine learning and their relevance to practitioners. In this paper, we reconcile the classical understanding and the modern practice within a unified performance curve. This "double descent" curve subsumes the textbook U-shaped bias-variance trade-off curve by showing how increasing model capacity beyond the point of interpolation results in improved performance. We provide evidence for the existence and ubiquity of double descent for a wide spectrum of models and datasets, and we posit a mechanism for its emergence. This connection between the performance and the structure of machine learning models delineates the limits of classical analyses, and has implications for both the theory and practice of machine learning.
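The interpolation regime described in this abstract can be illustrated with a small sketch. It is not a reproduction of the paper's experiments. It assumes a synthetic linear model with Gaussian features and fits minimum-norm least squares on the first `p` features for several values of `p`. The function name `min_norm_sweep`, the data-generating process, and all parameter values are hypothetical choices for illustration.

```python
import numpy as np


def min_norm_sweep(n_train=20, n_test=500, n_features=100,
                   widths=(5, 10, 20, 40, 100), noise=0.5, seed=0):
    """Fit minimum-norm least squares using the first p features, for each p.

    Returns a dict mapping p to (train_mse, test_mse).
    """
    rng = np.random.default_rng(seed)
    # Synthetic ground truth: every feature carries a little signal.
    beta = rng.standard_normal(n_features) / np.sqrt(n_features)
    X_tr = rng.standard_normal((n_train, n_features))
    X_te = rng.standard_normal((n_test, n_features))
    y_tr = X_tr @ beta + noise * rng.standard_normal(n_train)
    y_te = X_te @ beta  # noiseless test targets

    errors = {}
    for p in widths:
        # The pseudoinverse gives ordinary least squares when p < n_train and
        # the minimum-norm interpolating solution when p > n_train.
        coef = np.linalg.pinv(X_tr[:, :p]) @ y_tr
        train_mse = float(np.mean((X_tr[:, :p] @ coef - y_tr) ** 2))
        test_mse = float(np.mean((X_te[:, :p] @ coef - y_te) ** 2))
        errors[p] = (train_mse, test_mse)
    return errors


errors = min_norm_sweep()
for p, (train_mse, test_mse) in errors.items():
    print(f"p={p:4d}  train MSE={train_mse:.3e}  test MSE={test_mse:.3e}")
```

Once `p` exceeds the number of training points, the fit reproduces the noisy training labels up to floating-point error, which is the interpolation the abstract refers to. Whether the test error in this toy setup traces a double descent curve depends on the seed and noise level, so the sketch makes no claim about the shape of the curve.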