Researcher profile

Minsu Park

Minsu Park contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified | Verification L1 | Unclaimed author
3 works
0 followers
6 topics
4 close collaborators

Published work

3 published items

preprint, 2026, arXiv

When Correct Isn't Usable: Improving Structured Output Reliability in Small Language Models

Deployed language models must produce outputs that are both correct and format-compliant. We study this structured-output reliability gap using two mathematical benchmarks -- GSM8K and MATH -- as a controlled testbed: ground truth is unambiguous and the output contract is strict (JSON with required fields). We evaluate three 7-9B models under five prompting strategies and report output accuracy -- the joint event of mathematical correctness and valid JSON structure -- as the primary metric. A systematic format failure emerges: NAIVE prompting (no system prompt) achieves up to 85% task accuracy on GSM8K but 0% output accuracy across all models and datasets. REFERENCE prompting (a minimal hand-written JSON format prompt) fares little better, yielding 0% output accuracy for two of four models tested. Constrained decoding enforces syntactic validity but incurs 3.6x-8.2x latency overhead and in several settings degrades task performance substantially. To overcome this limitation, we developed AloLab, an iterative system-prompt optimizer (meta-agent: Claude Sonnet 4.5) requiring only black-box API access to the target model; it reaches 84-87% output accuracy on GSM8K and 34-40% on MATH across five independent runs per model, with 29/30 paired McNemar comparisons against the best static prompt significant at p < 0.05, at near-NAIVE inference latency and without model fine-tuning. The same format failure extends to GPT-4o (OpenAI, 2024), a proprietary closed-source model: REFERENCE achieves 0% output accuracy due to systematic markdown-fence wrapping, while AloLab reaches 95.2% [94.8, 95.6]. An ablation replacing the Sonnet 4.5 meta-agent with Claude 3 Haiku reduces mean output accuracy to 61.0% and increases run-to-run standard deviation from <1 pp to 21.8 pp, confirming that meta-agent capability is a primary driver of optimization quality.
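
The distinction the abstract draws between task accuracy and output accuracy can be illustrated with a minimal sketch. This is not the paper's evaluation code: the required field name `answer`, the last-number heuristic for lenient task scoring, and the sample strings are all assumptions made for illustration.

```python
import json
import re

# Hypothetical output contract: a JSON object containing these fields.
REQUIRED_FIELDS = ("answer",)


def parse_strict(raw):
    """Return the parsed object if raw is a JSON object with all required fields, else None."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None  # e.g. free text or a markdown-fenced payload
    if not isinstance(obj, dict):
        return None
    if any(field not in obj for field in REQUIRED_FIELDS):
        return None
    return obj


def task_correct(raw, expected):
    """Lenient check (an assumption): the last number in the raw text equals the expected answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", raw)
    return bool(numbers) and numbers[-1] == expected


def output_correct(raw, expected):
    """Joint event: the output parses under the contract AND carries the right answer."""
    obj = parse_strict(raw)
    return obj is not None and str(obj["answer"]).strip() == expected


def rates(samples):
    """Return (task accuracy, output accuracy) over (raw, expected) pairs."""
    n = len(samples)
    task = sum(task_correct(raw, exp) for raw, exp in samples) / n
    output = sum(output_correct(raw, exp) for raw, exp in samples) / n
    return task, output


samples = [
    ('{"answer": "18"}', "18"),                # correct and compliant
    ('```json\n{"answer": "18"}\n```', "18"),  # correct, but fence-wrapped
    ("The answer is 18.", "18"),               # correct, but free text
    ('{"answer": "7"}', "18"),                 # compliant, but wrong
]
task_acc, output_acc = rates(samples)
```

Under these assumptions the fence-wrapped and free-text responses count toward task accuracy but not output accuracy, which is the kind of gap the abstract reports for NAIVE and REFERENCE prompting.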

preprint, 2022, arXiv

The Atacama Cosmology Telescope: The Persistence of Neutrino Self-Interaction in Cosmological Measurements

We use data from the Atacama Cosmology Telescope (ACT) DR4 to search for the presence of neutrino self-interaction in the cosmic microwave background. Consistent with prior works, the posterior distributions we find are bimodal, with one mode consistent with $Λ$CDM and one where neutrinos strongly self-interact. By combining ACT data with large-scale information from WMAP, we find that a delayed onset of neutrino free streaming caused by significantly strong neutrino self-interaction is compatible with these data at the $2-3σ$ level. As seen in the past, the preference shifts to $Λ$CDM with the inclusion of Planck data. We determine that the preference for strong neutrino self-interaction is largely driven by angular scales corresponding to $700 \lesssim \ell \lesssim 1000$ in the ACT E-mode polarization data. This region is expected to be key to discriminate between neutrino self-interacting modes and will soon be probed with more sensitive data.

preprint, 2021, arXiv

What does a cosmological experiment really measure? Covariant posterior decomposition with normalizing flows

We present methods to rigorously extract parameter combinations that are constrained by data from posterior distributions. The standard approach uses linear methods that apply to Gaussian distributions. We show the limitations of the linear methods for current surveys, and develop non-linear methods that can be used with non-Gaussian distributions, and are independent of the parameter basis. These are made possible by the use of machine-learning models, normalizing flows, to learn posterior distributions from their samples. These models allow us to obtain the local covariance of the posterior at all positions in parameter space and use its inverse, the Fisher matrix, as a local metric over parameter space. The posterior distribution can then be non-linearly decomposed into the leading constrained parameter combinations via parallel transport in the metric space. We test our methods on two non-Gaussian, benchmark examples, and then apply them to the parameter posteriors of the Dark Energy Survey and Planck CMB lensing. We illustrate how our method automatically learns the survey-specific, best constrained effective amplitude parameter $S_8$ for cosmic shear alone, cosmic shear and galaxy clustering, and CMB lensing. We also identify constrained parameter combinations in the full parameter space, and as an application we estimate the Hubble constant, $H_0$, from large-structure data alone.
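
For orientation, the linear Gaussian baseline that the abstract contrasts against can be sketched in a few lines: invert a posterior covariance to get a Fisher matrix and take its leading eigenvector as the best constrained parameter combination. This is only a plain principal-component reading of that baseline under a two-parameter Gaussian assumption, not the paper's non-linear, flow-based method; the function names and the example covariance are hypothetical.

```python
import math


def invert_symmetric_2x2(cov):
    """Invert a symmetric, positive-definite 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    return [[d / det, -b / det], [-b / det, a / det]]


def best_constrained_direction(cov):
    """Return (largest Fisher eigenvalue, unit eigenvector) for a 2x2 covariance.

    The Fisher matrix is taken to be the inverse covariance; its largest
    eigenvalue corresponds to the smallest-variance parameter combination.
    """
    fisher = invert_symmetric_2x2(cov)
    (a, b), (_, d) = fisher
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    # Principal-axis angle of a symmetric 2x2 matrix (larger eigenvalue).
    theta = 0.5 * math.atan2(2 * b, a - d)
    return mean + radius, (math.cos(theta), math.sin(theta))


# Hypothetical posterior covariance with strongly correlated parameters.
cov = [[1.0, 0.9], [0.9, 1.0]]
eigenvalue, direction = best_constrained_direction(cov)
variance_along_direction = 1.0 / eigenvalue
```

With this covariance the best constrained combination lies along the difference of the two parameters, with variance 0.1 compared with 1.0 for either parameter alone. A global decomposition like this is only meaningful for Gaussian posteriors, which is the limitation the abstract says motivates using a local Fisher metric instead.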