Source author record

Sungwoo Kang

Sungwoo Kang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

cond-mat.mtrl-sci, physics.comp-ph, Artificial Intelligence, Computer Vision, cond-mat.dis-nn, cs.CY, eess.IV, q-fin.CP, Quantitative Methods

Catalog footprint

What is connected

5 works

9 topics

4 close collaborators

Preprint, 2026, arXiv

Diagnosing Korean-Language LLM Political Bias via Census-Grounded Agent Simulation

Large language models (LLMs) exhibit systematic political biases in voter simulations, but their underlying mechanisms and cross-lingual generalizations remain poorly understood. We introduce Dynamo-K, a census-grounded simulation framework evaluating Korean-language LLM political behavior across four models on six Korean elections (2017-2025). Using this framework, we identify three systematic failure modes: (1) progressive bias in moderate agents, where explicit mitigation reduces Mean Absolute Error (MAE) by 5.2 times; (2) model-dependent third-party salience collapse, distinguishing between salience failure and decision bias; and (3) regional polarization collapse, where models bidirectionally under-predict historical party strongholds. To address these failures, we demonstrate that scenario reframing recovers 62% of 2017 MAE by restoring third-party visibility. Furthermore, we introduce a learned reweighting adapter that successfully calibrates opposing-valence models without relying on candidate names at train or test time. Validating our diagnostic framework, Dynamo-K accurately predicts 3/3 presidential winners - including a 2.1%p MAE on the highly contested 0.73%p-margin 2022 race - and correctly identifies the dominant party in a held-out local election. The pipeline is open-source and provides a scalable, cost-effective method for diagnosing LLM political behavior.

Preprint, 2026, arXiv

The Color-Clinical Decoupling: Why Perceptual Calibration Fails Clinical Biomarkers in Smartphone Dermatology

Smartphone-based tele-dermatology assumes that colorimetric calibration ensures clinical reliability, yet this remains untested for underrepresented skin phototypes. We investigated whether standard calibration translates to reliable clinical biomarkers using 43,425 images from 965 Korean subjects (Fitzpatrick III-IV) across DSLR, tablet, and smartphone devices. While Linear Color Correction Matrix (CCM) normalization reduced color error by 67-77% -- achieving near-clinical accuracy (Delta E < 2.3) -- this success did not translate to biomarker reliability. We identify a phenomenon termed "color-clinical decoupling": despite perceptual accuracy, the Individual Typology Angle (ITA) showed poor inter-device agreement (ICC = 0.40), while the Melanin Index achieved good agreement (ICC = 0.77). This decoupling is driven by the ITA formula's sensitivity to b* channel noise and is further compounded by anatomical variance. Facial region accounts for 25.2% of color variance -- 3.6x greater than device effects (7.0%) -- challenging the efficacy of single-patch calibration. Our results demonstrate that current colorimetric standards are insufficient for clinical-grade biomarker extraction, necessitating region-aware protocols for mobile dermatology.
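The sensitivity of ITA to b* noise can be illustrated with a minimal sketch. It assumes the commonly used definition ITA = arctan((L* - 50) / b*) x 180/pi and the CIE76 color difference; the abstract does not say which Delta E formula the study used, and the L*, a*, b* values below are hypothetical, not taken from the paper.

```python
import math

def ita_degrees(l_star, b_star):
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    Assumes the common definition arctan((L* - 50) / b*) * 180 / pi.
    atan2 gives the same result as arctan for positive b*.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))

def delta_e76(lab1, lab2):
    """CIE76 color difference (Euclidean distance in CIELAB)."""
    return math.dist(lab1, lab2)

# Hypothetical readings of one skin patch on two devices:
# identical L* and a*, with b* differing by 2 units.
device_a = (60.0, 12.0, 15.0)
device_b = (60.0, 12.0, 17.0)

color_error = delta_e76(device_a, device_b)
ita_shift = ita_degrees(device_a[0], device_a[2]) - ita_degrees(device_b[0], device_b[2])
print(f"Delta E = {color_error:.2f}, ITA shift = {ita_shift:.2f} degrees")
```

In this toy case the color error stays below the Delta E < 2.3 level quoted in the abstract, yet ITA moves by more than 3 degrees, which is the kind of decoupling the abstract describes.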

Preprint, 2026, arXiv

The Limits of Complexity: Why Feature Engineering Beats Deep Learning in Investor Flow Prediction

The application of machine learning to financial prediction has accelerated dramatically, yet the conditions under which complex models outperform simple alternatives remain poorly understood. This paper investigates whether advanced signal processing and deep learning techniques can extract predictive value from investor order flows beyond what simple feature engineering achieves. Using a comprehensive dataset of 2.79 million observations spanning 2,439 Korean equities from 2020-2024, we apply three methodologies: Independent Component Analysis (ICA) to recover latent market drivers, Wavelet Coherence analysis to characterize multi-scale correlation structure, and Long Short-Term Memory (LSTM) networks with attention mechanisms for non-linear prediction. Our results reveal a striking finding: a parsimonious linear model using market capitalization-normalized flows ("Matched Filter" preprocessing) achieves a Sharpe ratio of 1.30 and cumulative return of 272.6%, while the full ICA-Wavelet-LSTM pipeline generates a Sharpe ratio of only 0.07 with a cumulative return of -5.1%. The raw LSTM model collapsed to predicting the unconditional mean, achieving a hit rate of 47.5%, worse than random. We conclude that in low signal-to-noise financial environments, domain-specific feature engineering yields substantially higher marginal returns than algorithmic complexity. These findings establish important boundary conditions for the application of deep learning to financial prediction.
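For readers unfamiliar with the metrics quoted above, here is a minimal sketch of how they are typically computed. The definitions are assumptions, since the abstract does not spell them out: the normalized flow is taken as net flow divided by market capitalization, the Sharpe ratio is annualized from daily returns with 252 trading days and a zero risk-free rate, and the hit rate is the share of predictions whose sign matches the realized return. All function names and numbers are hypothetical.

```python
import math
import statistics

def normalize_flow(net_flow, market_cap):
    """Net investor flow scaled by market capitalization (assumed definition)."""
    return net_flow / market_cap

def annualized_sharpe(daily_returns, periods_per_year=252):
    """Mean over sample standard deviation, scaled to a yearly horizon.

    Assumes a zero risk-free rate and 252 trading days per year.
    """
    mean = statistics.fmean(daily_returns)
    std = statistics.stdev(daily_returns)
    return mean / std * math.sqrt(periods_per_year)

def hit_rate(predicted, realized):
    """Fraction of predictions whose sign agrees with the realized return."""
    hits = sum(1 for p, r in zip(predicted, realized) if (p > 0) == (r > 0))
    return hits / len(predicted)

# Hypothetical toy inputs, not data from the paper.
flow_signal = normalize_flow(net_flow=5e9, market_cap=1e12)
sharpe = annualized_sharpe([0.01, -0.01, 0.02, 0.0])
hits = hit_rate([1.0, -1.0, 1.0, -1.0], [0.5, 0.2, -0.1, -0.3])
print(flow_signal, round(sharpe, 2), hits)
```

Dividing by market capitalization makes flows comparable across large and small stocks, which is one plausible reading of why such a simple feature carries signal.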

Preprint, 2021, arXiv

Accelerated identification of equilibrium structures of multicomponent inorganic crystals using machine learning potentials

The discovery of new multicomponent inorganic compounds can provide direct solutions to many scientific and engineering challenges, yet the vast size of the uncharted material space dwarfs current synthesis throughput. While the computational crystal structure prediction is expected to mitigate this frustration, the NP-hardness and steep costs of density functional theory (DFT) calculations prohibit material exploration at scale. Herein, we introduce SPINNER, a highly efficient and reliable structure-prediction framework based on exhaustive random searches and evolutionary algorithms, which is completely free from empiricism. Empowered by accurate neural network potentials, the program can navigate the configuration space faster than DFT by more than 10^2-fold. In blind tests on 60 ternary compositions diversely selected from the experimental database, SPINNER successfully identifies experimental (or theoretically more stable) phases for ~80% of materials within 5000 generations, entailing up to half a million structure evaluations for each composition. When benchmarked against previous data mining or DFT-based evolutionary predictions, SPINNER identifies more stable phases in the majority of cases. By developing a reliable and fast structure-prediction framework, this work opens the door to large-scale, unbounded computational exploration of undiscovered inorganic crystals.

Preprint, 2020, arXiv

Training machine-learning potentials for crystal structure prediction using disordered structures

Prediction of the stable crystal structure for multinary (ternary or higher) compounds with unexplored compositions demands fast and accurate evaluation of free energies in exploring the vast configurational space. The machine-learning potential such as the neural network potential (NNP) is poised to meet this requirement, but a dearth of information on the crystal structure poses a challenge in choosing training sets. Herein we propose constructing the training set from density-functional-theory (DFT) based dynamical trajectories of liquid and quenched amorphous phases, which does not require any preceding information on material structures except for the chemical composition. To demonstrate suitability of the trained NNP in the crystal structure prediction, we compare NNP and DFT energies for Ba2AgSi3, Mg2SiO4, LiAlCl4, and InTe2O5F over experimental phases as well as low-energy crystal structures that are generated theoretically. For every material, we find strong correlations between DFT and NNP energies, ensuring that the NNPs can properly rank energies among low-energy crystalline structures. We also find that the evolutionary search using the NNPs can identify low-energy metastable phases more efficiently than the DFT-based approach. By proposing a way to develop reliable machine-learning potentials for the crystal structure prediction, this work will pave the way to identifying unexplored multinary phases efficiently.
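One way to check whether a surrogate potential ranks candidate structures in the same order as DFT is a rank correlation. The sketch below uses Spearman's coefficient as an illustrative choice; the abstract does not say which correlation measure the authors report, and the energies are hypothetical values in eV per atom. The closed-form expression used here assumes no tied values.

```python
def ranks(values):
    """Rank positions (0 = lowest value); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical energies of four candidate structures (eV/atom).
dft_energies = [-5.10, -5.05, -4.98, -4.90]
nnp_energies = [-5.08, -5.06, -4.95, -4.97]

rho = spearman(dft_energies, nnp_energies)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near 1 means the potential orders low-energy structures much as DFT does, even if the absolute energies differ.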
