Researcher profile

Michael Y. Hu

Michael Y. Hu contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified | Verification L1 | Unclaimed author
4 works
0 followers
4 topics
4 close collaborators

Published work

4 published items

preprint | 2026 | arXiv

Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time

Data mixing decides how to combine different sources or types of data and is a consequential problem throughout language model training. In pretraining, data composition is a key determinant of model quality; in continual learning and adaptation, it governs what is retained and acquired. Yet existing data mixing methods address only one phase of this lifecycle at a time: some require smaller proxy models tied to a single training phase, others assume a fixed domain set, and continual learning lacks principled guidance altogether. We argue that data mixing is fundamentally an online decision making problem -- one that recurs throughout training and demands a single, unified solution. We introduce OP-Mix (On-Policy Mix), a data mixing algorithm that operates across the entire language model training lifecycle. Our main insight is that candidate data mixtures can be cheaply simulated by interpolating between low-rank adapters trained directly on the current model, eliminating separate proxy models and ensuring the search is always grounded in the model's actual learning dynamics. Across pretraining, continual midtraining, and continual instruction tuning, OP-Mix consistently finds near-optimal mixtures while using a fraction of the compute of the baselines. In pretraining, OP-Mix improves upon training without mixing by 6.3% in average perplexity. For continual learning, OP-Mix matches the performance of both retraining and on-policy distillation while using 66% and 95% less overall compute, respectively. OP-Mix suggests a different view of language model training: not a sequence of distinct phases, but a single continuous process of learning from data.
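The abstract's central idea, simulating candidate mixtures by interpolating between low-rank adapters, can be illustrated with a toy sketch. Everything below is an assumption for illustration only: the abstract does not specify the interpolation rule, the search procedure, or any function names, so `lora_delta`, `interpolate_deltas`, `simplex_grid`, and `best_mixture` are hypothetical, and the grid search over weights stands in for whatever search OP-Mix actually uses.

```python
from itertools import product


def matmul(a, b):
    """Plain-Python matrix product for small nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(B, A):
    """Low-rank weight update delta_W = B @ A for one data source (hypothetical helper)."""
    return matmul(B, A)


def interpolate_deltas(deltas, weights):
    """Weighted sum of per-source updates; assumes the weights sum to 1."""
    rows, cols = len(deltas[0]), len(deltas[0][0])
    return [
        [sum(w * d[r][c] for w, d in zip(weights, deltas)) for c in range(cols)]
        for r in range(rows)
    ]


def simplex_grid(n_sources, steps):
    """Yield all weight vectors with entries k/steps that sum to 1."""
    for combo in product(range(steps + 1), repeat=n_sources):
        if sum(combo) == steps:
            yield tuple(k / steps for k in combo)


def best_mixture(base_W, deltas, loss_fn, steps=4):
    """Score each candidate mixture on the interpolated weights; return (weights, loss)."""
    best = None
    for w in simplex_grid(len(deltas), steps):
        delta = interpolate_deltas(deltas, w)
        W = [[b + d for b, d in zip(b_row, d_row)] for b_row, d_row in zip(base_W, delta)]
        score = loss_fn(W)
        if best is None or score < best[1]:
            best = (w, score)
    return best
```

In this sketch, `loss_fn` would be a validation loss evaluated on the interpolated weights, so no candidate mixture requires a separate training run or proxy model.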

preprint | 2026 | arXiv

Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting

Language model (LM) agents deployed in novel environments often exhibit poor sample efficiency when learning from sequential interactions. This significantly hinders the usefulness of such agents in environments where interaction is costly (for example, when they interact with humans or reset physical systems). While a number of existing LM agent architectures incorporate various mechanisms for experience storage and reflection, they make limited use of LMs' abilities to directly generate or reason about full counterfactual trajectories. We introduce ECHO (Experience Consolidation via Hindsight Optimization), a prompting framework that adapts hindsight experience replay from reinforcement learning for language model agents. ECHO generates optimized trajectories for alternative goals that could have been achieved during failed attempts, effectively creating synthetic positive examples from unsuccessful interactions. Our approach consists of two components: a hindsight rule that uses the language model itself to identify relevant subgoals and generate optimized trajectories, and an update rule that maintains compressed trajectory representations in memory. We evaluate ECHO on stateful versions of XMiniGrid, a text-based navigation and planning benchmark, and PeopleJoinQA, a collaborative information-gathering enterprise simulation. Across both domains, ECHO outperforms vanilla language agent baselines by up to 80%; in XMiniGrid, it also outperforms a number of sophisticated agent architectures including Reflexion and AWM, demonstrating faster adaptation to novel environments through more effective utilization of past experiences.
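The two components described above, a hindsight rule and a memory update rule, can be sketched in a few lines. This is an illustrative assumption, not the ECHO implementation: in the framework described, the language model itself identifies the achieved subgoals and rewrites the trajectory, whereas here the achieved goals are passed in directly. The function name `hindsight_update` and the "keep the shorter trajectory" criterion are hypothetical stand-ins for the compressed representations the abstract mentions.

```python
def hindsight_update(memory, trajectory, achieved_goals):
    """Relabel a (possibly failed) trajectory with goals it did achieve.

    memory:         dict mapping goal -> stored trajectory (list of steps)
    trajectory:     list of (observation, action) steps from one episode
    achieved_goals: dict mapping goal -> index of the step where it was reached
                    (assumed here; in the described framework an LM proposes these)
    """
    for goal, step in achieved_goals.items():
        # Hindsight rule (simplified): the prefix up to the goal is a positive example.
        candidate = trajectory[: step + 1]
        # Update rule (assumed): keep only the shortest known trajectory per goal.
        if goal not in memory or len(candidate) < len(memory[goal]):
            memory[goal] = candidate
    return memory
```

Called after each episode, this turns an unsuccessful attempt at one goal into stored examples for other goals reached along the way, which is the sense in which failed interactions yield synthetic positive examples.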

preprint | 2020 | arXiv

Interface-related magnetic and vibrational properties in Fe/MgO heterostructures from nuclear resonant spectroscopy and first-principles calculations

We combine $^{57}$Fe Mössbauer spectroscopy and $^{57}$Fe nuclear resonant inelastic x-ray scattering (NRIXS) in nanoscale polycrystalline [bcc-$^{57}$Fe/MgO] multilayers with various Fe layer thicknesses and layer-resolved density-functional-theory (DFT) based first-principles calculations of a (001)-oriented [Fe(8 ML)/MgO(8 ML)](001) heterostructure to unravel the interface-related atomic vibrational properties of a multilayer system. In theory and experiment, we consistently observe enhanced hyperfine magnetic fields compared to bulk, which are associated with the Fe/MgO interface layers. NRIXS and DFT both reveal a strong reduction of the longitudinal acoustic (LA) phonon peak in combination with an enhancement of the low-energy vibrational density of states (VDOS), suggesting that the presence of interfaces and the associated increase in the layer-resolved magnetic moments result in drastic changes in the Fe-partial VDOS. From the experimental and calculated VDOS, vibrational thermodynamic properties have been determined as a function of Fe thickness and are found to be in excellent agreement.

preprint | 2019 | arXiv

Influence of hydrogenation on the vibrational density of states of magnetocaloric $\mathrm{LaFe}_\mathrm{11.4}\mathrm{Si}_\mathrm{1.6}\mathrm{H}_{1.6}$

We report on the impact of magnetoelastic coupling on the magnetocaloric properties of LaFe$_{11.4}$Si$_{1.6}$H$_{1.6}$ in terms of the vibrational density of states, which we determined with $^{57}$Fe nuclear resonant inelastic X-ray scattering measurements and with density-functional-theory based first-principles calculations in the ferromagnetic low-temperature and paramagnetic high-temperature phases. In experiments and calculations, we observe pronounced differences in the shape of the Fe-partial VDOS between non-hydrogenated and hydrogenated samples. This shows that hydrogen not only shifts the temperature of the first-order phase transition, but also significantly affects the elastic response of the Fe subsystem. In turn, the anomalous redshift of the Fe VDOS, observed by going to the low-volume PM phase, survives hydrogenation. As a consequence, the change in the Fe-specific vibrational entropy $\Delta S_\mathrm{lat}$ across the phase transition has the same sign as the magnetic and electronic contributions. DFT calculations show that the same mechanism, which is a consequence of the itinerant electron metamagnetism associated with the Fe subsystem, is effective in both the hydrogenated and the hydrogen-free compounds. Although reduced by 50 % as compared to the hydrogen-free system, the measured change $\Delta S_\mathrm{lat}$ of $3.2\pm1.9$ J/(kg K) across the FM to PM transition contributes significantly and cooperatively, at 35 %, to the total isothermal entropy change $\Delta S_\mathrm{iso}$. Hydrogenation is observed to induce an overall blueshift of the Fe VDOS with respect to the H-free compound; this effect, together with the enhanced Debye temperature observed, is a fingerprint of the hardening of the Fe sublattice by hydrogen incorporation. In addition, the mean Debye velocity of sound of LaFe$_{11.4}$Si$_{1.6}$H$_{1.6}$ was determined from the NRIXS and the DFT data.