Researcher profile

Sungwon Kim

Sungwon Kim contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
9 topics
4 close collaborators

Published work

5 published items

Preprint, 2023, arXiv

Ultrafast X-ray imaging of the light-induced phase transition in VO2

Using light to control transient phases in quantum materials is an emerging route to engineer new properties and functionality, with both thermal and non-thermal phases observed out of equilibrium. Transient phases are expected to be heterogeneous, either through photo-generated domain growth or through the generation of topological defects, and this impacts the dynamics of the system. However, this nanoscale heterogeneity has not been directly observed. Here we use time- and spectrally resolved coherent X-ray imaging to track the prototypical light-induced insulator-to-metal phase transition in vanadium dioxide on the nanoscale with femtosecond time resolution. We show that the early-time dynamics are independent of the initial spatial heterogeneity and observe a 200 fs switch to the metallic phase. A heterogeneous response emerges only after hundreds of picoseconds. Through spectroscopic imaging, we reveal that the transient metallic phase is a highly orthorhombically strained rutile metallic phase, an interpretation that is in contrast to those based on spatially averaged probes. Our results demonstrate the critical importance of spatially and spectrally resolved measurements for understanding and interpreting the transient phases of quantum materials.

Preprint, 2022, arXiv

Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data

We propose Guided-TTS 2, a diffusion-based generative model for high-quality adaptive TTS using untranscribed data. Guided-TTS 2 combines a speaker-conditional diffusion model with a speaker-dependent phoneme classifier for adaptive text-to-speech. We train the speaker-conditional diffusion model on large-scale untranscribed datasets for a classifier-free guidance method and further fine-tune the diffusion model on the reference speech of the target speaker for adaptation, which takes only 40 seconds. We demonstrate that Guided-TTS 2 shows comparable performance to high-quality single-speaker TTS baselines in terms of speech quality and speaker similarity with only ten seconds of untranscribed data. We further show that Guided-TTS 2 outperforms adaptive TTS baselines on multi-speaker datasets even in a zero-shot adaptation setting. Guided-TTS 2 can adapt to a wide range of voices using only untranscribed speech, which enables adaptive TTS with the voice of non-human characters such as Gollum in "The Lord of the Rings".

Preprint, 2022, arXiv

Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance

We propose Guided-TTS, a high-quality text-to-speech (TTS) model that uses classifier guidance and therefore does not require any transcript of the target speaker. Guided-TTS combines an unconditional diffusion probabilistic model with a separately trained phoneme classifier for classifier guidance. Our unconditional diffusion model learns to generate speech without any context from untranscribed speech data. For TTS synthesis, we guide the generative process of the diffusion model with a phoneme classifier trained on a large-scale speech recognition dataset. We present a norm-based scaling method that reduces the pronunciation errors of classifier guidance in Guided-TTS. We show that Guided-TTS achieves performance comparable to that of the state-of-the-art TTS model, Grad-TTS, without any transcript for LJSpeech. We further demonstrate that Guided-TTS performs well on diverse datasets, including a long-form untranscribed dataset.
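
As a rough illustration of the classifier guidance described in this abstract, the sketch below adds a scaled classifier gradient to an unconditional score. The abstract does not give the formula for its norm-based scaling, so the rule used here (rescaling the classifier gradient to the norm of the unconditional score) is only an assumed reading, and the names `guided_score`, `l2_norm`, `scale`, and `norm_based` are hypothetical.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def guided_score(uncond_score, classifier_grad, scale=1.0, norm_based=True, eps=1e-12):
    """Combine an unconditional score with a classifier gradient.

    uncond_score:    estimate of grad_x log p(x) from the diffusion model
    classifier_grad: estimate of grad_x log p(y | x) from the classifier
    scale:           guidance strength (hypothetical parameter)
    norm_based:      ASSUMPTION: rescale the classifier gradient so its norm
                     matches the norm of the unconditional score
    """
    if norm_based:
        factor = scale * l2_norm(uncond_score) / (l2_norm(classifier_grad) + eps)
    else:
        factor = scale
    return [s + factor * g for s, g in zip(uncond_score, classifier_grad)]


# Toy usage with made-up numbers: a small classifier gradient is
# amplified when norm-based scaling is switched on.
score = [3.0, 4.0]
grad = [0.0, 0.1]
plain = guided_score(score, grad, scale=1.0, norm_based=False)
scaled = guided_score(score, grad, scale=1.0, norm_based=True)
```

With plain guidance the gradient barely moves the score, while the norm-based variant lets the classifier term contribute at the same magnitude as the unconditional term, regardless of how small the raw gradient is.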

Preprint, 2022, arXiv

Perception Prioritized Training of Diffusion Models

Diffusion models learn to restore noisy data, which is corrupted with different levels of noise, by optimizing the weighted sum of the corresponding loss terms, i.e., denoising score matching loss. In this paper, we show that restoring data corrupted with certain noise levels offers a proper pretext task for the model to learn rich visual concepts. We propose to prioritize such noise levels over other levels during training, by redesigning the weighting scheme of the objective function. We show that our simple redesign of the weighting scheme significantly improves the performance of diffusion models regardless of the datasets, architectures, and sampling strategies.
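
To make the idea of reweighting per-noise-level loss terms concrete, here is a minimal sketch. The abstract does not state the weighting function, so the form used below (a weight that shrinks as the signal-to-noise ratio grows, controlled by hypothetical parameters `k` and `gamma`) is an illustrative assumption and not necessarily the scheme from the paper. The helper names are likewise hypothetical.

```python
def snr(alpha_bar):
    """Signal-to-noise ratio for a noise level with cumulative signal
    fraction alpha_bar (0 < alpha_bar < 1)."""
    return alpha_bar / (1.0 - alpha_bar)


def reweight(snr_value, k=1.0, gamma=1.0):
    """ASSUMED weighting: down-weight noise levels with high SNR."""
    return 1.0 / (k + snr_value) ** gamma


def weighted_loss(per_level_losses, alpha_bars, k=1.0, gamma=1.0):
    """Weighted sum of per-noise-level denoising losses."""
    weights = [reweight(snr(a), k, gamma) for a in alpha_bars]
    return sum(w * loss for w, loss in zip(weights, per_level_losses))


# Toy usage with made-up losses at two noise levels.
total = weighted_loss([1.0, 1.0], alpha_bars=[0.5, 0.75])
```

Setting `gamma=0` in this sketch recovers the unweighted sum, which makes it easy to compare a prioritized objective against a uniform baseline.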

Preprint, 2020, arXiv

Generative Adversarial Networks for Crystal Structure Prediction

The constant demand for new functional materials calls for efficient strategies to accelerate materials design and discovery. In addressing this challenge, machine learning generative models can offer promising opportunities since they allow for the continuous navigation of chemical space via low-dimensional latent spaces. In this work, we employ a crystal representation that is inversion-free with a low memory requirement based on unit cell information and fractional atomic coordinates, and build a generative adversarial network (GAN) for crystal structures. The proposed model is then applied to the Mg-Mn-O ternary inorganic materials system to generate novel structures with application as potential water-splitting photoanodes, and combined with the evaluation of their photoanode properties for high-throughput virtual screening (HTVS). The generative-HTVS system that we built predicts 23 new crystal structures with a reasonable predicted stability and bandgap. These findings suggest that the proposed generative model can be an effective way to explore hidden portions of the chemical space, an area that is usually unreachable when conventional substitution-based discovery is employed.