Researcher profile

Min Dai

Min Dai contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
9 works
0 followers
12 topics
4 close collaborators

Published work

9 published item(s)

preprint, 2026, arXiv

Walk the PLANC: Physics-Guided RL for Agile Humanoid Locomotion on Constrained Footholds

Bipedal humanoid robots must precisely coordinate balance, timing, and contact decisions when locomoting on constrained footholds such as stepping stones, beams, and planks -- even minor errors can lead to catastrophic failure. Classical optimization and control pipelines handle these constraints well but depend on highly accurate mathematical representations of terrain geometry, making them prone to error when perception is noisy or incomplete. Meanwhile, reinforcement learning has shown strong resilience to disturbances and modeling errors, yet end-to-end policies rarely discover the precise foothold placement and step sequencing required for discontinuous terrain. These contrasting limitations motivate approaches that guide learning with physics-based structure rather than relying purely on reward shaping. In this work, we introduce a locomotion framework in which a reduced-order stepping planner supplies dynamically consistent motion targets that steer the RL training process via Control Lyapunov Function (CLF) rewards. This combination of structured footstep planning and data-driven adaptation produces accurate, agile, and hardware-validated stepping-stone locomotion on a humanoid robot, substantially improving reliability compared to conventional model-free reinforcement-learning baselines.

preprint, 2023, arXiv

Learning effective dynamics from data-driven stochastic systems

Multiscale stochastic dynamical systems have been widely applied to a variety of scientific and engineering problems because they can describe complex phenomena in many real-world applications. This work investigates the effective dynamics of slow-fast stochastic dynamical systems. Given short-term observation data from an unknown slow-fast stochastic system, we propose a novel algorithm, including a neural network called Auto-SDE, to learn the invariant slow manifold. Our approach captures the evolutionary nature of a series of time-dependent autoencoder neural networks with the loss constructed from a discretized stochastic differential equation. Numerical experiments under various evaluation metrics also validate that the algorithm is accurate, stable and effective.

preprint, 2022, arXiv

Learning Controller Gains on Bipedal Walking Robots via User Preferences

Experimental demonstration of complex robotic behaviors relies heavily on finding the correct controller gains. This painstaking process is often completed by a domain expert, requiring deep knowledge of the relationship between parameter values and the resulting behavior of the system. Even when such knowledge is possessed, it can take significant effort to navigate the nonintuitive landscape of possible parameter combinations. In this work, we explore the extent to which preference-based learning can be used to optimize controller gains online by repeatedly querying the user for their preferences. This general methodology is applied to two variants of control Lyapunov function based nonlinear controllers framed as quadratic programs, which provide theoretical guarantees but are challenging to realize in practice. These controllers are successfully demonstrated both on the planar underactuated biped, AMBER, and on the 3D underactuated biped, Cassie. We experimentally evaluate the performance of the learned controllers and show that the proposed method is repeatably able to learn gains that yield stable and robust locomotion.

preprint, 2022, arXiv

Physical Properties of 29 sdB+dM Eclipsing Binaries in Zwicky Transient Facility

The development of large-scale time-domain surveys provides an opportunity to study the physical properties as well as the evolutionary scenario of B-type subdwarfs (sdB) and M-type dwarfs (dM). Here, we obtained 33 sdB+dM eclipsing binaries based on the Zwicky Transient Facility (ZTF) light curves and $Gaia$ early data release 3 (EDR3) parallaxes. By using the PHOEBE code for light curve analysis, we obtain probability distributions for parameters of 29 sdB+dM. $R_1$, $R_2$, and $i$ are well determined, and the average uncertainty of mass ratio $q$ is 0.08. Our parameters are in good agreement with previous works if a typical mass of sdB is assumed. Based on parameters of 29 sdB+dM, we find that both the mass ratio $q$ and the companion's radius $R_2$ decrease with the shortening of the orbital period. For the three sdB+dMs with orbital periods less than 0.075 days, their companions are all brown dwarfs. The masses and radii of the companions satisfy the mass--radius relation for low-mass stars and brown dwarfs. Companions with radii between $0.12R_\odot$ and $0.15R_\odot$ seem to be missing in the observations. As more short-period sdB+dM eclipsing binaries are discovered and classified in the future with ZTF and $Gaia$, we will have more information to constrain the evolutionary ending of sdB+dM.

preprint, 2020, arXiv

Maximum Likelihood Estimation of Stochastic Differential Equations with Random Effects Driven by Fractional Brownian Motion

Stochastic differential equations and stochastic dynamics are good models for describing stochastic phenomena in the real world. In this paper, we study $N$ independent real-valued stochastic processes $X_i(t)$ determined by stochastic differential equations whose drift term depends on some random effects. We obtain a Girsanov-type formula for the stochastic differential equation driven by fractional Brownian motion through a kernel transformation. Under some assumptions on the random effects, we estimate the parameters by maximum likelihood and give some numerical simulations for discrete observations. Results show that, for different values of $H$, the parameter estimator gets closer to the true value as the amount of data increases.

preprint, 2020, arXiv

Memory-Gated Recurrent Networks

The essence of multivariate sequential learning is how to extract dependencies in data. These data sets, such as hourly medical records in intensive care units and multi-frequency phonetic time series, oftentimes exhibit not only strong serial dependencies in the individual components (the "marginal" memory) but also non-negligible memories in the cross-sectional dependencies (the "joint" memory). Because of the multivariate complexity in the evolution of the joint distribution that underlies the data generating process, we take a data-driven approach and construct a novel recurrent network architecture, termed Memory-Gated Recurrent Networks (mGRN), with gates explicitly regulating two distinct types of memories: the marginal memory and the joint memory. Through a combination of comprehensive simulation studies and empirical experiments on a range of public datasets, we show that our proposed mGRN architecture consistently outperforms state-of-the-art architectures targeting multivariate time series.

preprint, 2020, arXiv

Real-Time Fault Detection and Process Control Based on Multi-channel Sensor Data Fusion

Sensor signals acquired in industrial processes contain rich information that can be analyzed to facilitate effective monitoring of the process, early detection of system anomalies, quick diagnosis of fault root causes, and intelligent system design and control. In many mechatronic systems, multiple signals are acquired by different sensor channels (i.e., multi-channel data), which can be represented by high-order arrays (tensorial data). The multi-channel data has a high-dimensional and complex cross-correlation structure. It is crucial to develop a method that considers the interrelationships between different sensor channels. This paper proposes a new process monitoring approach based on uncorrelated multilinear discriminant analysis that can effectively model the multi-channel data to achieve superior monitoring and fault diagnosis performance compared to other competing methods. The proposed method is applied directly to the high-dimensional tensorial data. Features are extracted and combined with multivariate control charts to monitor multi-channel data. The effectiveness of the proposed method in quick detection of process changes is demonstrated with both a simulation and a real-world case study.

preprint, 2009, arXiv

Continuous-Time Markowitz's Model with Transaction Costs

A continuous-time Markowitz's mean-variance portfolio selection problem is studied in a market with one stock, one bond, and proportional transaction costs. This is a singular stochastic control problem, inherently in a finite time horizon. With a series of transformations, the problem is turned into a so-called double obstacle problem, a well-studied problem in the physics and partial differential equation literature, featuring two time-varying free boundaries. The two boundaries, which define the buy, sell, and no-trade regions, are proved to be smooth in time. This in turn characterizes the optimal strategy, via a Skorokhod problem, as one that tries to keep a certain adjusted bond-stock position within the no-trade region. Several features of the optimal strategy are revealed that are remarkably different from its no-transaction-cost counterpart. It is shown that there exists a critical length in time, which is dependent on the stock excess return as well as the transaction fees but independent of the investment target and the stock volatility, so that an expected terminal return may not be achievable if the planning horizon is shorter than that critical length (while in the absence of transaction costs any expected return can be reached in an arbitrary period of time). It is further demonstrated that anyone following the optimal strategy should not buy the stock beyond the point when the time to maturity is shorter than the aforementioned critical length. Moreover, the investor would be less likely to buy the stock and more likely to sell the stock as the maturity date gets closer. These features, while consistent with the widely accepted investment wisdom, suggest that the planning horizon is an integral part of the investment opportunities.

preprint, 2009, arXiv

Optimal Redeeming Strategy of Stock Loans

A stock loan is a loan, secured by a stock, which gives the borrower the right to redeem the stock at any time before or on the loan maturity. The way dividends are distributed has a significant effect on the pricing of the stock loan and the optimal redeeming strategy adopted by the borrower. We present the pricing models subject to various ways of dividend distribution. Since closed-form price formulas are generally not available, we provide a thorough analysis to examine the optimal redeeming strategy. Numerical results are presented as well.