Source author record

Minsu Park

Minsu Park appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

astro-ph.CO, Artificial Intelligence, astro-ph.IM, Computation and Language, hep-ph, Machine Learning

Catalog footprint

What is connected

3 works

6 topics

4 close collaborators

Preprint · 2026 · arXiv

When Correct Isn't Usable: Improving Structured Output Reliability in Small Language Models

Deployed language models must produce outputs that are both correct and format-compliant. We study this structured-output reliability gap using two mathematical benchmarks -- GSM8K and MATH -- as a controlled testbed: ground truth is unambiguous and the output contract is strict (JSON with required fields). We evaluate three 7-9B models under five prompting strategies and report output accuracy -- the joint event of mathematical correctness and valid JSON structure -- as the primary metric. A systematic format failure emerges: NAIVE prompting (no system prompt) achieves up to 85% task accuracy on GSM8K but 0% output accuracy across all models and datasets. REFERENCE prompting (a minimal hand-written JSON format prompt) fares little better, yielding 0% output accuracy for two of four models tested. Constrained decoding enforces syntactic validity but incurs 3.6x-8.2x latency overhead and in several settings degrades task performance substantially. To overcome this limitation, we developed AloLab, an iterative system-prompt optimizer (meta-agent: Claude Sonnet 4.5) requiring only black-box API access to the target model; it reaches 84-87% output accuracy on GSM8K and 34-40% on MATH across five independent runs per model, with 29/30 paired McNemar comparisons against the best static prompt significant at p < 0.05, at near-NAIVE inference latency and without model fine-tuning. The same format failure extends to GPT-4o (OpenAI, 2024), a proprietary closed-source model: REFERENCE achieves 0% output accuracy due to systematic markdown-fence wrapping, while AloLab reaches 95.2% [94.8, 95.6]. An ablation replacing the Sonnet 4.5 meta-agent with Claude 3 Haiku reduces mean output accuracy to 61.0% and increases run-to-run standard deviation from <1 pp to 21.8 pp, confirming that meta-agent capability is a primary driver of optimization quality.
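The output-accuracy metric described above (an answer counts only if it is both correct and delivered as valid JSON with the required fields) can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the single required field `answer`, the string-equality check, and the function names are assumptions made for the example.

```python
import json

# Hypothetical output contract: a JSON object with one required field.
REQUIRED_FIELDS = ("answer",)

def parse_strict(raw):
    """Return the parsed object if raw is a JSON object with all required fields, else None."""
    try:
        obj = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(obj, dict) or any(key not in obj for key in REQUIRED_FIELDS):
        return None
    return obj

def output_accuracy(raw_outputs, gold_answers):
    """Fraction of outputs that are both format-compliant and correct (the joint event)."""
    hits = 0
    for raw, gold in zip(raw_outputs, gold_answers):
        obj = parse_strict(raw)
        if obj is not None and str(obj["answer"]).strip() == str(gold).strip():
            hits += 1
    return hits / len(gold_answers)

outputs = [
    '{"answer": "42"}',                # correct and valid JSON
    '```json\n{"answer": "42"}\n```',  # correct, but wrapped in a markdown fence
    'The answer is 42.',               # correct, but free text
    '{"answer": "41"}',                # valid JSON, wrong answer
]
gold = ["42", "42", "42", "42"]
score = output_accuracy(outputs, gold)
print(score)  # 0.25
```

In this toy set three of the four outputs contain the right number, yet only one satisfies the contract. The second output shows the markdown-fence wrapping failure mentioned for GPT-4o: a strict parser rejects it even though the enclosed JSON is well formed.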

Preprint · 2022 · arXiv

The Atacama Cosmology Telescope: The Persistence of Neutrino Self-Interaction in Cosmological Measurements

We use data from the Atacama Cosmology Telescope (ACT) DR4 to search for the presence of neutrino self-interaction in the cosmic microwave background. Consistent with prior works, the posterior distributions we find are bimodal, with one mode consistent with $\Lambda$CDM and one where neutrinos strongly self-interact. By combining ACT data with large-scale information from WMAP, we find that a delayed onset of neutrino free streaming caused by significantly strong neutrino self-interaction is compatible with these data at the $2$-$3\sigma$ level. As seen in the past, the preference shifts to $\Lambda$CDM with the inclusion of Planck data. We determine that the preference for strong neutrino self-interaction is largely driven by angular scales corresponding to $700 \lesssim \ell \lesssim 1000$ in the ACT E-mode polarization data. This region is expected to be key to discriminate between neutrino self-interacting modes and will soon be probed with more sensitive data.

Preprint · 2021 · arXiv

What does a cosmological experiment really measure? Covariant posterior decomposition with normalizing flows

We present methods to rigorously extract parameter combinations that are constrained by data from posterior distributions. The standard approach uses linear methods that apply to Gaussian distributions. We show the limitations of the linear methods for current surveys, and develop non-linear methods that can be used with non-Gaussian distributions, and are independent of the parameter basis. These are made possible by the use of machine-learning models, normalizing flows, to learn posterior distributions from their samples. These models allow us to obtain the local covariance of the posterior at all positions in parameter space and use its inverse, the Fisher matrix, as a local metric over parameter space. The posterior distribution can then be non-linearly decomposed into the leading constrained parameter combinations via parallel transport in the metric space. We test our methods on two non-Gaussian benchmark examples, and then apply them to the parameter posteriors of the Dark Energy Survey and Planck CMB lensing. We illustrate how our method automatically learns the survey-specific, best constrained effective amplitude parameter $S_8$ for cosmic shear alone, cosmic shear and galaxy clustering, and CMB lensing. We also identify constrained parameter combinations in the full parameter space, and as an application we estimate the Hubble constant, $H_0$, from large-scale structure data alone.
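For orientation, the linear baseline that the abstract contrasts against can be sketched in a few lines. This is a minimal illustration under a Gaussian assumption, not the authors' method or code: it treats the inverse of the global sample covariance as a single Fisher matrix and ranks its eigen-directions by how tightly they are constrained. The function name and the toy covariance are hypothetical.

```python
import numpy as np

def linear_constrained_modes(samples):
    """Eigen-decompose the inverse sample covariance (Gaussian Fisher matrix).

    Returns eigenvalues in descending order and the matching eigenvectors
    as columns; the first column is the best-constrained direction.
    """
    cov = np.cov(samples, rowvar=False)
    fisher = np.linalg.inv(cov)
    eigvals, eigvecs = np.linalg.eigh(fisher)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Toy Gaussian "posterior" with two strongly correlated parameters.
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.9],
                     [0.9, 1.0]])
samples = rng.multivariate_normal([0.0, 0.0], true_cov, size=20000)

eigvals, eigvecs = linear_constrained_modes(samples)
best = eigvecs[:, 0]  # approximately (1, -1) / sqrt(2), up to overall sign
print(eigvals, best)
```

Here the difference of the two parameters is well constrained while their sum is not. A single global decomposition like this depends on the chosen parameter basis and on the posterior being close to Gaussian, which is the limitation the abstract says the local, flow-based Fisher metric is meant to address.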