Source author record

Sungwon Kim

Sungwon Kim appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Artificial Intelligence, cond-mat.mtrl-sci, eess.AS, Sound, Computer Vision, cond-mat.mes-hall, cond-mat.str-el, Machine Learning, physics.comp-ph

Catalog footprint

What is connected

5 works

9 topics

4 close collaborators

preprint, 2023, arXiv

Ultrafast X-ray imaging of the light-induced phase transition in VO2

Using light to control transient phases in quantum materials is an emerging route to engineer new properties and functionality, with both thermal and non-thermal phases observed out of equilibrium. Transient phases are expected to be heterogeneous, either through photo-generated domain growth or by generating topological defects, and this impacts the dynamics of the system. However, this nanoscale heterogeneity has not been directly observed. Here we use time- and spectrally resolved coherent X-ray imaging to track the prototypical light-induced insulator-to-metal phase transition in vanadium dioxide on the nanoscale with femtosecond time resolution. We show that the early-time dynamics are independent of the initial spatial heterogeneity and observe a 200 fs switch to the metallic phase. A heterogeneous response emerges only after hundreds of picoseconds. Through spectroscopic imaging, we reveal that the transient metallic phase is a highly orthorhombically strained rutile metallic phase, an interpretation that is in contrast to those based on spatially averaged probes. Our results demonstrate the critical importance of spatially and spectrally resolved measurements for understanding and interpreting the transient phases of quantum materials.

preprint, 2022, arXiv

Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data

We propose Guided-TTS 2, a diffusion-based generative model for high-quality adaptive TTS using untranscribed data. Guided-TTS 2 combines a speaker-conditional diffusion model with a speaker-dependent phoneme classifier for adaptive text-to-speech. We train the speaker-conditional diffusion model on large-scale untranscribed datasets for a classifier-free guidance method and further fine-tune the diffusion model on the reference speech of the target speaker for adaptation, which only takes 40 seconds. We demonstrate that Guided-TTS 2 shows comparable performance to high-quality single-speaker TTS baselines in terms of speech quality and speaker similarity with only ten seconds of untranscribed data. We further show that Guided-TTS 2 outperforms adaptive TTS baselines on multi-speaker datasets even with a zero-shot adaptation setting. Guided-TTS 2 can adapt to a wide range of voices using only untranscribed speech, which enables adaptive TTS with the voice of non-human characters such as Gollum in "The Lord of the Rings".

preprint, 2022, arXiv

Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance

We propose Guided-TTS, a high-quality text-to-speech (TTS) model that uses classifier guidance and does not require any transcript of the target speaker. Guided-TTS combines an unconditional diffusion probabilistic model with a separately trained phoneme classifier for classifier guidance. Our unconditional diffusion model learns to generate speech without any context from untranscribed speech data. For TTS synthesis, we guide the generative process of the diffusion model with a phoneme classifier trained on a large-scale speech recognition dataset. We present a norm-based scaling method that reduces the pronunciation errors of classifier guidance in Guided-TTS. We show that Guided-TTS achieves a performance comparable to that of the state-of-the-art TTS model, Grad-TTS, without any transcript for LJSpeech. We further demonstrate that Guided-TTS performs well on diverse datasets including a long-form untranscribed dataset.

preprint, 2022, arXiv

Perception Prioritized Training of Diffusion Models

Diffusion models learn to restore noisy data, which is corrupted with different levels of noise, by optimizing the weighted sum of the corresponding loss terms, i.e., denoising score matching loss. In this paper, we show that restoring data corrupted with certain noise levels offers a proper pretext task for the model to learn rich visual concepts. We propose to prioritize such noise levels over other levels during training, by redesigning the weighting scheme of the objective function. We show that our simple redesign of the weighting scheme significantly improves the performance of diffusion models regardless of the datasets, architectures, and sampling strategies.

preprint, 2020, arXiv

Generative Adversarial Networks for Crystal Structure Prediction

The constant demand for new functional materials calls for efficient strategies to accelerate materials design and discovery. In addressing this challenge, machine learning generative models can offer promising opportunities since they allow for the continuous navigation of chemical space via low dimensional latent spaces. In this work, we employ a crystal representation that is inversion-free with a low memory requirement based on unit cell information and fractional atomic coordinates, and build a generative adversarial network (GAN) for crystal structures. The proposed model is then applied to the Mg-Mn-O ternary inorganic materials system to generate novel structures with application as potential water-splitting photoanodes, and combined with the evaluation of their photoanode properties for high-throughput virtual screening (HTVS). The generative-HTVS system that we built predicts 23 new crystal structures with a reasonable predicted stability and bandgap. These findings suggest that the proposed generative model can be an effective way to explore hidden portions of the chemical space, an area that is usually unreachable when conventional substitution-based discovery is employed.
