Source author record

Molei Tao

Molei Tao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

math.NA; math.DS; physics.comp-ph; Machine Learning; Numerical Analysis; math.OC; astro-ph.EP; physics.class-ph; Artificial Intelligence; Biological Physics; Biomolecules; Computation; Computational Engineering, Finance, and Science; Computer Science and Game Theory; Computer Vision; math.AP; math.CA; physics.plasm-ph

Catalog footprint

What is connected

22 works

18 topics

4 close collaborators


preprint, 2026, arXiv

NoiseRater: Meta-Learned Noise Valuation for Diffusion Model Training

Diffusion models have achieved remarkable success across a wide range of generative tasks, yet their training paradigm largely treats injected noise as uniformly informative. In this work, we challenge this assumption and introduce NoiseRater, a meta-learning framework for instance-level noise valuation in diffusion model training. We propose a parametric noise rater that assigns importance scores to individual noise realizations conditioned on data and timestep, enabling adaptive reweighting of the training objective. The rater is trained via bilevel optimization to improve downstream validation performance after inner-loop diffusion updates. To enable efficient deployment, we further design a decoupled two-stage pipeline that transitions from soft weighting during meta-training to hard noise selection during standard training. Extensive experiments on FFHQ and ImageNet demonstrate that not all noise samples contribute equally, and that prioritizing informative noise improves both training efficiency and generation quality. Our results establish noise valuation as a complementary and previously underexplored axis for improving diffusion model training. Our code is available at: https://anonymous.4open.science/r/NoiseRater-DEB116.

preprint, 2024, arXiv

Automated construction of effective potential via algorithmic implicit bias

We introduce a novel approach for decomposing and learning every scale of a given multiscale objective function in $\mathbb{R}^d$, where $d\ge 1$. This approach leverages a recently demonstrated implicit bias of the optimization method of gradient descent by Kong and Tao, which enables the automatic generation of data that nearly follow a Gibbs distribution with an effective potential at any desired scale. One application of this automated effective potential modeling is to construct reduced-order models. For instance, a deterministic surrogate Hamiltonian model can be developed to substantially soften the stiffness that bottlenecks the simulation, while maintaining the accuracy of phase portraits at the scale of interest. Similarly, a stochastic surrogate model can be constructed at a desired scale, such that both its equilibrium and out-of-equilibrium behaviors (characterized by the auto-correlation function and mean path) align with those of a damped mechanical system with the original multiscale function being its potential. The robustness and efficiency of our proposed approach in multi-dimensional scenarios have been demonstrated through a series of numerical experiments. A by-product of our development is a method for anisotropic noise estimation and calibration. More precisely, the Langevin model of a stochastic mechanical system may not have isotropic noise in practice, and we provide a systematic algorithm to quantify its covariance matrix without directly measuring the noise. In this case, the system may not admit a closed-form expression of its invariant distribution either, but with this tool, we can design the friction matrix appropriately to calibrate the system so that its invariant distribution has a closed-form Gibbs expression.

preprint, 2022, arXiv

Alternating Mirror Descent for Constrained Min-Max Games

In this paper we study two-player bilinear zero-sum games with constrained strategy spaces. An instance of natural occurrences of such constraints is when mixed strategies are used, which correspond to a probability simplex constraint. We propose and analyze the alternating mirror descent algorithm, in which each player takes turns to take action following the mirror descent algorithm for constrained optimization. We interpret alternating mirror descent as an alternating discretization of a skew-gradient flow in the dual space, and use tools from convex optimization and modified energy function to establish an $O(K^{-2/3})$ bound on its average regret after $K$ iterations. This quantitatively verifies the algorithm's better behavior than the simultaneous version of mirror descent algorithm, which is known to diverge and yields an $O(K^{-1/2})$ average regret bound. In the special case of an unconstrained setting, our results recover the behavior of alternating gradient descent algorithm for zero-sum games which was studied in (Bailey et al., COLT 2020).
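The contrast between alternating and simultaneous updates can be seen in a minimal sketch of the unconstrained special case mentioned at the end of the abstract, where mirror descent reduces to gradient descent. The toy game f(x, y) = x * y, the step size, the iteration count, and the function names below are illustrative assumptions, not taken from the paper, and the sketch tracks the norm of the last iterate rather than the average regret that the paper bounds.

```python
import math


def simultaneous_gd(x, y, eta, steps):
    """Both players update from the same old iterate on f(x, y) = x * y."""
    for _ in range(steps):
        # x descends along df/dx = y, y ascends along df/dy = x (old x).
        x, y = x - eta * y, y + eta * x
    return x, y


def alternating_gd(x, y, eta, steps):
    """The min player moves first; the max player responds to the new x."""
    for _ in range(steps):
        x = x - eta * y
        y = y + eta * x  # uses the freshly updated x
    return x, y


# Hypothetical settings chosen only for illustration.
sim_norm = math.hypot(*simultaneous_gd(1.0, 0.0, 0.1, 1000))
alt_norm = math.hypot(*alternating_gd(1.0, 0.0, 0.1, 1000))
print(sim_norm, alt_norm)
```

In this toy setting the simultaneous iterates spiral outward because each step multiplies the squared norm by 1 + eta^2, while the alternating iterates conserve the quadratic x^2 + y^2 - eta * x * y and therefore stay on a bounded orbit.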

preprint, 2022, arXiv

Hessian-Free High-Resolution Nesterov Acceleration for Sampling

Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous time limit (noiseless kinetic Langevin) when a finite step-size is employed (Shi et al., 2021). This work explores the sampling counterpart of this phenomenon and proposes a diffusion process, whose discretizations can yield accelerated gradient-based MCMC methods. More precisely, we reformulate the optimizer of NAG for strongly convex functions (NAG-SC) as a Hessian-Free High-Resolution ODE, change its high-resolution coefficient to a hyperparameter, inject appropriate noise, and discretize the resulting diffusion process. The acceleration effect of the new hyperparameter is quantified and it is not an artificial one created by time-rescaling. Instead, acceleration beyond underdamped Langevin in $W_2$ distance is quantitatively established for log-strongly-concave-and-smooth targets, at both the continuous dynamics level and the discrete algorithm level. Empirical experiments in both log-strongly-concave and multi-modal cases also numerically demonstrate this acceleration.

preprint, 2022, arXiv

Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect

Recent empirical advances show that training deep models with large learning rate often improves generalization performance. However, theoretical justifications on the benefits of large learning rate are highly limited, due to challenges in analysis. In this paper, we consider using Gradient Descent (GD) with a large learning rate on a homogeneous matrix factorization problem, i.e., $\min_{X, Y} \|A - XY^\top\|_{\sf F}^2$. We prove a convergence theory for constant large learning rates well beyond $2/L$, where $L$ is the largest eigenvalue of Hessian at the initialization. Moreover, we rigorously establish an implicit bias of GD induced by such a large learning rate, termed 'balancing', meaning that magnitudes of $X$ and $Y$ at the limit of GD iterations will be close even if their initialization is significantly unbalanced. Numerical experiments are provided to support our theory.

preprint, 2022, arXiv

Low Spin-Axis Variations of Circumbinary Planets

Having a massive moon has been considered a primary mechanism for stabilizing planetary obliquity, our Earth being an example. This is, however, not always consistent with the exoplanetary cases. This article details the discovery of an alternative mechanism, namely that planets orbiting around binary stars tend to have low spin-axis variations. This is because the large quadrupole potential of the stellar binary could speed up the planetary orbital precession, and detune the system out of secular spin-orbit resonances. Consequently, habitable zone planets around the stellar binaries in low inclination orbits hold higher potential for regular seasonal changes compared to their single star analogues.

preprint, 2021, arXiv

Accurate and Efficient Simulations of Hamiltonian Mechanical Systems with Discontinuous Potentials

This article considers Hamiltonian mechanical systems with potential functions admitting jump discontinuities. The focus is on accurate and efficient numerical approximations of their solutions, which will be defined via the laws of reflection and refraction. Despite the success of symplectic integrators for smooth mechanical systems, their construction for the discontinuous ones is nontrivial, and numerical convergence order can be impaired too. Several rather-usable numerical methods are proposed, including: a first-order symplectic integrator for general problems, a third-order symplectic integrator for problems with only one linear interface, arbitrarily high-order reversible integrators for general problems (no longer symplectic), and an adaptive time-stepping version of the previous high-order method. Interestingly, whether symplecticity leads to favorable long time performance is no longer clear due to discontinuity, as traditional Hamiltonian backward error analysis does not apply any more. Therefore, at this stage, our recommended default method is the last one. Numerical evidence on the order of convergence, long time performance, momentum map conservation, and consistency with the computationally-expensive penalty method is supplied. A complex problem, namely the Sauteed Mushroom, is also proposed and numerically investigated, for which multiple bifurcations between trapped and ergodic dynamics are observed.

preprint, 2020, arXiv

Variational Optimization on Lie Groups, with Examples of Leading (Generalized) Eigenvalue Problems

The article considers smooth optimization of functions on Lie groups. By generalizing NAG variational principle in vector space (Wibisono et al., 2016) to Lie groups, continuous Lie-NAG dynamics which are guaranteed to converge to a local optimum are obtained. They correspond to momentum versions of gradient flow on Lie groups. A particular case of $\mathsf{SO}(n)$ is then studied in detail, with objective functions corresponding to leading Generalized EigenValue problems: the Lie-NAG dynamics are first made explicit in coordinates, and then discretized in a structure-preserving fashion, resulting in optimization algorithms with faithful energy behavior (due to conformal symplecticity) and exactly remaining on the Lie group. Stochastic gradient versions are also investigated. Numerical experiments on both synthetic data and a practical problem (LDA for MNIST) demonstrate the effectiveness of the proposed methods as optimization algorithms (not as a classification method).

preprint, 2019, arXiv

Space-Time Phononic Crystals with Anomalous Topological Edge States

It is well known that an interface created by two topologically distinct structures could host nontrivial edge states that are immune to defects. In this letter, we introduce a one-dimensional space-time phononic crystal and study the associated anomalous topological edge states. A space-decoupled time modulation is assumed. While preserving the key topological feature of the system, such a modulation also duplicates the edge state mode across the spectrum, both inside and outside the band gap. It is shown that, in contrast to conventional topological edge states which are excited by frequencies in the Bragg regime, the time-modulation-induced frequency conversion can be leveraged to access topological edge states at a deep subwavelength scale where the entire phononic crystal size is merely 1/5.1 of the wavelength. This remarkable feature could open a new route for designing miniature devices that are based on topological physics.

preprint, 2016, arXiv

Explicit high-order symplectic integrators for charged particles in general electromagnetic fields

This article considers non-relativistic charged particle dynamics in both static and non-static electromagnetic fields, which are governed by nonseparable, possibly time-dependent Hamiltonians. For the first time, explicit symplectic integrators of arbitrarily high order are constructed for accurate and efficient simulations of such mechanical systems. Performance superior to the standard non-symplectic Runge-Kutta method is demonstrated on two examples: the first is on the confined motion of a particle in a static toroidal magnetic field used in a tokamak; the second is on how time-periodic perturbations to a magnetic field inject energy into a particle via parametric resonance at a specific frequency.

preprint, 2016, arXiv

Explicit symplectic approximation of nonseparable Hamiltonians: algorithm and long time performance

Explicit symplectic integrators have been important tools for accurate and efficient approximations of mechanical systems with separable Hamiltonians. For the first time, the article proposes for arbitrary Hamiltonians similar integrators, which are explicit, of any even order, symplectic in an extended phase space, and with pleasant long time properties. They are based on a mechanical restraint that binds two copies of phase space together. Using backward error analysis, KAM theory, and additional multiscale analysis, an error bound of $\mathcal{O}(Tδ^l ω)$ is established for integrable systems, where $T$, $δ$, $l$ and $ω$ are respectively the (long) simulation time, step size, integrator order, and some binding constant. For non-integrable systems with positive Lyapunov exponents, such an error bound is generally impossible, but satisfactory statistical behaviors were observed in a numerical experiment with a nonlinear Schrödinger equation.
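The idea of binding two copies of phase space can be sketched as follows. The abstract does not spell out the splitting, so everything concrete here is an assumption made for illustration: the extended Hamiltonian H(q, y) + H(x, p) + omega * (|q - x|^2 + |p - y|^2) / 2, the second-order Strang composition of its three exactly solvable parts, the test Hamiltonian H(q, p) = (q^2 + 1)(p^2 + 1) / 2, and all function names and parameter values.

```python
import math


def H(q, p):
    """Illustrative nonseparable Hamiltonian (an assumed test case)."""
    return 0.5 * (q * q + 1.0) * (p * p + 1.0)


def dH_dq(q, p):
    return q * (p * p + 1.0)


def dH_dp(q, p):
    return p * (q * q + 1.0)


def flow_a(s, dt):
    """Exact flow of H(q, y): q and y are frozen, p and x move."""
    q, p, x, y = s
    return (q, p - dt * dH_dq(q, y), x + dt * dH_dp(q, y), y)


def flow_b(s, dt):
    """Exact flow of H(x, p): x and p are frozen, q and y move."""
    q, p, x, y = s
    return (q + dt * dH_dp(x, p), p, x, y - dt * dH_dq(x, p))


def flow_c(s, dt, omega):
    """Exact flow of the binding term: rotates the differences (q-x, p-y)."""
    q, p, x, y = s
    c, sn = math.cos(2.0 * omega * dt), math.sin(2.0 * omega * dt)
    u, v = q - x, p - y
    ur, vr = c * u + sn * v, -sn * u + c * v
    return (0.5 * (q + x + ur), 0.5 * (p + y + vr),
            0.5 * (q + x - ur), 0.5 * (p + y - vr))


def step(s, dt, omega):
    """One symmetric (Strang) composition of the three explicit flows."""
    s = flow_a(s, 0.5 * dt)
    s = flow_b(s, 0.5 * dt)
    s = flow_c(s, dt, omega)
    s = flow_b(s, 0.5 * dt)
    return flow_a(s, 0.5 * dt)


def integrate(q0, p0, dt, steps, omega):
    """Start both copies at the same point and take `steps` steps."""
    s = (q0, p0, q0, p0)
    for _ in range(steps):
        s = step(s, dt, omega)
    return s


q, p, x, y = integrate(1.0, 0.0, 0.001, 5000, 20.0)
print(H(q, p) - H(1.0, 0.0), q - x, p - y)
```

Each sub-flow is explicit because it only moves variables that its own Hamiltonian does not depend on, which is what makes the composed method explicit even though H itself is nonseparable. The binding constant plays the role of $ω$ in the abstract; the value 20.0 is an arbitrary choice for this sketch.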

preprint, 2016, arXiv

Uncovering Circumbinary Planetary Architectural Properties from Selection Biases

The new discoveries of circumbinary planetary systems shed light on the understanding of planetary system formation. Learning the architectural properties of these systems is essential for constraining the different formation mechanisms. We first revisit the stability limit of circumbinary planets. Next, we focus on eclipsing stellar binaries and obtain an analytical expression for the transit probability in a realistic setting, where finite observation period and planetary orbital precession are included. Then, understanding of the architectural properties of the currently observed transiting systems is refined, based on Bayesian analysis and a series of hypothesis tests. We find 1) it is not a selection bias that the innermost planets reside near the stability limit for eight of the nine observed systems, and this is consistent with a log uniform distribution of the planetary semi-major axis; 2) it is not a selection bias that the planetary and stellar orbits are nearly coplanar ($\lesssim 3^\circ$), and this together with previous studies may imply an occurrence rate of circumbinary planets similar to that of single star systems; 3) the dominance of observed circumbinary systems with only one transiting planet may be caused by selection effects; 4) formation mechanisms involving Lidov-Kozai oscillations, which may produce misalignment and large separation between planet and stellar binaries, are consistent with the lack of transiting circumbinary planets around short-period stellar binaries, in agreement with previous studies. As a consequence of 4), eclipse timing variations may better suit the detection of planets in such configurations.

preprint, 2015, arXiv

Convex Optimal Uncertainty Quantification

Optimal uncertainty quantification (OUQ) is a framework for numerical extreme-case analysis of stochastic systems with imperfect knowledge of the underlying probability distribution. This paper presents sufficient conditions under which an OUQ problem can be reformulated as a finite-dimensional convex optimization problem, for which efficient numerical solutions can be obtained. The sufficient conditions include that the objective function is piecewise concave and the constraints are piecewise convex. In particular, we show that piecewise concave objective functions may appear in applications where the objective is defined by the optimal value of a parameterized linear program.

preprint, 2015, arXiv

Temporal homogenization of linear ODEs, with applications to parametric super-resonance and energy harvest

We consider the temporal homogenization of linear ODEs of the form $\dot{x}=Ax+εP(t)x+f(t)$, where $P(t)$ is periodic and $ε$ is small. Using a 2-scale expansion approach, we obtain the long-time approximation $x(t)\approx \exp(At) \left( Ω(t)+\int_0^t \exp(-A τ) f(τ) \, dτ\right)$, where $Ω$ solves the cell problem $\dot{Ω}=εB Ω+ εF(t)$ with an effective matrix $B$ and an explicitly-known $F(t)$. We provide a necessary and sufficient condition for the accuracy of the approximation (over a $\mathcal{O}(ε^{-1})$ time-scale), and show how $B$ can be computed (at a cost independent of $ε$). As a direct application, we investigate the possibility of using RLC circuits to harvest the energy contained in small scale oscillations of ambient electromagnetic fields (such as Schumann resonances). Although an RLC circuit parametrically coupled to the field may achieve such energy extraction via parametric resonance, its resistance $R$ needs to be smaller than a threshold $κ$ proportional to the fluctuations of the field, thereby limiting practical applications. We show that if $n$ RLC circuits are appropriately coupled via mutual capacitances or inductances, then energy extraction can be achieved when the resistance of each circuit is smaller than $nκ$. Hence, if the resistance of each circuit has a non-zero fixed value, energy extraction can be made possible through the coupling of a sufficiently large number $n$ of circuits ($n\approx 1000$ for the first mode of Schumann resonances and contemporary values of capacitances, inductances and resistances). The theory is also applied to the control of the oscillation amplitude of a (damped) oscillator.

preprint, 2014, arXiv

Variational and linearly-implicit integrators, with applications

We show that symplectic and linearly-implicit integrators proposed by [Zhang and Skeel, 1997] are variational linearizations of Newmark methods. When used in conjunction with penalty methods (i.e., methods that replace constraints by stiff potentials), these integrators permit coarse time-stepping of holonomically constrained mechanical systems and bypass the resolution of nonlinear systems. Although penalty methods are widely employed, an explicit link to Lagrange multiplier approaches appears to be lacking; such a link is now provided (in the context of two-scale flow convergence [Tao, Owhadi and Marsden, 2010]). The variational formulation also allows efficient simulations of mechanical systems on Lie groups.

preprint, 2012, arXiv

Control of a Model of DNA Division via Parametric Resonance

We study the internal resonance, energy transfer, activation mechanism, and control of a model of DNA division via parametric resonance. While the system is robust to noise, this study shows that it is sensitive to specific fine scale modes and frequencies that could be targeted by low intensity electro-magnetic fields for triggering and controlling the division. The DNA model is a chain of pendula in a Morse potential. While the (possibly parametrically excited) system has a large number of degrees of freedom and a large number of intrinsic time scales, global and slow variables can be identified by (i) first reducing its dynamic to two modes exchanging energy between each other and (ii) averaging the dynamic of the reduced system with respect to the phase of the fastest mode. Surprisingly the global and slow dynamic of the system remains Hamiltonian (despite the parametric excitation) and the study of its associated effective potential shows how parametric excitation can turn the unstable open state into a stable one. Numerical experiments support the accuracy of the time-averaged reduced Hamiltonian in capturing the global and slow dynamic of the full system.

preprint, 2011, arXiv

From efficient symplectic exponentiation of matrices to symplectic integration of high-dimensional Hamiltonian systems with slowly varying quadratic stiff potentials

We present a multiscale integrator for Hamiltonian systems with slowly varying quadratic stiff potentials that uses coarse timesteps (analogous to what the impulse method uses for constant quadratic stiff potentials). This method is based on the highly non-trivial introduction of two efficient symplectic schemes for exponentiations of matrices that only require O(n) matrix multiplication operations at each coarse time step for a preset small number n. The proposed integrator is shown to be (i) uniformly convergent on positions; (ii) symplectic in both slow and fast variables; (iii) well adapted to high dimensional systems. Our framework also provides a general method for iteratively exponentiating a slowly varying sequence of (possibly high dimensional) matrices in an efficient way.

preprint, 2011, arXiv

Space-time FLAVORS: finite difference, multisymplectic, and pseudospectral integrators for multiscale PDEs

We present a new class of integrators for stiff PDEs. These integrators are generalizations of FLow AVeraging integratORS (FLAVORS) for stiff ODEs and SDEs introduced in [Tao, Owhadi and Marsden 2010] with the following properties: (i) Multiscale: they are based on flow averaging and have a computational cost determined by mesoscopic steps in space and time instead of microscopic steps in space and time; (ii) Versatile: the method is based on averaging the flows of the given PDEs (which may have hidden slow and fast processes). This bypasses the need for identifying explicitly (or numerically) the slow variables or reduced effective PDEs; (iii) Nonintrusive: A pre-existing numerical scheme resolving the microscopic time scale can be used as a black box and easily turned into one of the integrators in this paper by turning the large coefficients on over a microscopic timescale and off during a mesoscopic timescale; (iv) Convergent over two scales: strongly over slow processes and in the sense of measures over fast ones; (v) Structure-preserving: for stiff Hamiltonian PDEs (possibly on manifolds), they can be made to be multi-symplectic, symmetry-preserving (symmetries are group actions that leave the system invariant) in all variables and variational.

preprint, 2011, arXiv

Variational integrators for electric circuits

In this contribution, we develop a variational integrator for the simulation of (stochastic and multiscale) electric circuits. When considering the dynamics of an electrical circuit, one is faced with three special situations: (1) the system involves external (control) forcing through external (controlled) voltage sources and resistors; (2) the system is constrained via the Kirchhoff current (KCL) and voltage laws (KVL); (3) the Lagrangian is degenerate. Based on a geometric setting, an appropriate variational formulation is presented to model the circuit from which the equations of motion are derived. A time-discrete variational formulation provides an iteration scheme for the simulation of the electric circuit. Depending on the discretization, the intrinsic degeneracy of the system can be canceled for the discrete variational scheme. In this way, a variational integrator is constructed that gains several advantages compared to standard integration tools for circuits; in particular, a comparison to BDF methods (which are usually the method of choice for the simulation of electric circuits) shows that even for simple LCR circuits, a better energy behavior and frequency spectrum preservation can be observed using the developed variational integrator.

preprint, 2010, arXiv

Non-intrusive and structure preserving multiscale integration of stiff ODEs, SDEs and Hamiltonian systems with hidden slow dynamics via flow averaging

We introduce a new class of integrators for stiff ODEs as well as SDEs. These integrators are (i) Multiscale: they are based on flow averaging and so do not fully resolve the fast variables and have a computational cost determined by slow variables; (ii) Versatile: the method is based on averaging the flows of the given dynamical system (which may have hidden slow and fast processes) instead of averaging the instantaneous drift of assumed separated slow and fast processes. This bypasses the need for identifying explicitly (or numerically) the slow or fast variables; (iii) Nonintrusive: a pre-existing numerical scheme resolving the microscopic time scale can be used as a black box and easily turned into one of the integrators in this paper by turning the large coefficients on over a microscopic timescale and off during a mesoscopic timescale; (iv) Convergent over two scales: strongly over slow processes and in the sense of measures over fast ones. We introduce the related notion of two-scale flow convergence and analyze the convergence of these integrators under the induced topology; (v) Structure preserving: for stiff Hamiltonian systems (possibly on manifolds), they can be made to be symplectic, time-reversible, and symmetry preserving (symmetries are group actions that leave the system invariant) in all variables. They are explicit and applicable to arbitrary stiff potentials (that need not be quadratic). Their application to the Fermi-Pasta-Ulam problems shows accuracy and stability over four orders of magnitude of time scales. For stiff Langevin equations, they are symmetry preserving, time-reversible and Boltzmann-Gibbs reversible, quasi-symplectic on all variables and conformally symplectic with isotropic friction.

preprint, 2010, arXiv

Structure preserving Stochastic Impulse Methods for stiff Langevin systems with a uniform global error of order 1 or 1/2 on position

Impulse methods are generalized to a family of integrators for Langevin systems with quadratic stiff potentials and arbitrary soft potentials. Uniform error bounds (independent of the stiff parameters) are obtained on integrated positions, allowing for coarse integration steps. The resulting integrators are explicit and structure preserving (quasi-symplectic for Langevin systems).

preprint, 2010, arXiv

Temperature and Friction Accelerated Sampling of Boltzmann-Gibbs Distribution

This paper is concerned with tuning friction and temperature in Langevin dynamics for fast sampling from the canonical ensemble. We show that near-optimal acceleration is achieved by choosing friction so that the local quadratic approximation of the Hamiltonian is a critically damped oscillator. The system is also over-heated and cooled down to its final temperature. The performances of different cooling schedules are analyzed as functions of total simulation time.
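The role of critical damping can be illustrated with a small sketch on the noiseless scalar oscillator x'' + gamma * x' + omega^2 * x = 0, used here as an assumed stand-in for the local quadratic approximation; the paper's analysis concerns Langevin dynamics with noise and a cooling schedule, which this sketch does not model. The function name and the sample values are hypothetical.

```python
import cmath


def decay_rate(gamma, omega):
    """Slowest exponential decay rate of x'' + gamma x' + omega^2 x = 0.

    The characteristic roots are (-gamma +/- sqrt(gamma^2 - 4 omega^2)) / 2;
    the slowest mode is the root with the largest real part.
    """
    disc = cmath.sqrt(gamma * gamma - 4.0 * omega * omega)
    roots = ((-gamma + disc) / 2.0, (-gamma - disc) / 2.0)
    return min(-r.real for r in roots)


omega = 1.0
under = decay_rate(1.0, omega)     # underdamped: gamma < 2 * omega
critical = decay_rate(2.0, omega)  # critically damped: gamma = 2 * omega
over = decay_rate(4.0, omega)      # overdamped: gamma > 2 * omega
print(under, critical, over)
```

For this toy oscillator the slowest mode decays fastest at gamma = 2 * omega: smaller friction leaves oscillations that decay at rate gamma / 2, and larger friction creates a slow overdamped mode.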
