Source author record

Ji-Hoon Kim

Ji-Hoon Kim appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

astro-ph.GA; astro-ph.CO; Artificial Intelligence; Machine Learning; astro-ph.IM; eess.AS; astro-ph.HE; Computation and Language; Distributed, Parallel, and Cluster Computing; Hardware Architecture; Sound

Catalog footprint

What is connected

12 works

11 topics

4 close collaborators


preprint, 2026, arXiv

Probing Cross-modal Information Hubs in Audio-Visual LLMs

Audio-visual large language models (AVLLMs) have recently emerged as a powerful architecture capable of jointly reasoning over audio, visual, and textual modalities. In AVLLMs, the bidirectional interaction between audio and video modalities introduces intricate processing dynamics, necessitating a deeper understanding of their internal mechanisms. However, unlike extensively studied text-only or large vision language models, the internal workings of AVLLMs remain largely unexplored. In this paper, we focus on cross-modal information flow between audio and visual modalities in AVLLMs, investigating where information derived from one modality is encoded within the token representations of the other modality. Through an analysis of multiple recent AVLLMs, we uncover two common findings. First, AVLLMs primarily encode integrated audio-visual information in sink tokens. Second, sink tokens do not uniformly hold cross-modal information. Instead, a distinct subset of sink tokens, which we term cross-modal sink tokens, specializes in storing such information. Based on these findings, we further propose a simple training-free hallucination mitigation method by encouraging reliance on integrated cross-modal information within cross-modal sink tokens. Our code is available at https://github.com/kaistmm/crossmodal-hub.

preprint, 2024, arXiv

Let There Be Sound: Reconstructing High Quality Speech from Silent Videos

The goal of this work is to reconstruct high quality speech from lip motions alone, a task also known as lip-to-speech. A key challenge of lip-to-speech systems is the one-to-many mapping caused by (1) the existence of homophenes and (2) multiple speech variations, resulting in mispronounced and over-smoothed speech. In this paper, we propose a novel lip-to-speech system that significantly improves the generation quality by alleviating the one-to-many mapping problem from multiple perspectives. Specifically, we incorporate (1) self-supervised speech representations to disambiguate homophenes, and (2) acoustic variance information to model diverse speech styles. Additionally, to better solve the aforementioned problem, we employ a flow-based post-net which captures and refines the details of the generated speech. We perform extensive experiments on two datasets, and demonstrate that our method achieves generation quality close to that of real human utterances, outperforming existing methods in terms of speech naturalness and intelligibility by a large margin. Synthesised samples are available at our demo page: https://mm.kaist.ac.kr/projects/LTBS.

preprint, 2022, arXiv

Accelerating Large-Scale Graph-based Nearest Neighbor Search on a Computational Storage Platform

K-nearest neighbor search is one of the fundamental tasks in various applications, and the hierarchical navigable small world (HNSW) graph has recently drawn attention in large-scale cloud services, as it easily scales up the database while offering fast search. Meanwhile, computational storage devices (CSDs), which combine programmable logic and storage modules on a single board, have become popular as a way to address the data bandwidth bottleneck of modern computing systems. In this paper, we propose a computational storage platform based on the SmartSSD CSD that can accelerate a large-scale graph-based nearest neighbor search algorithm. To this end, we modify the algorithm to make it more amenable to hardware and implement two types of accelerators using HLS- and RTL-based methodologies with various optimization methods. In addition, we scale up the proposed platform to 4 SmartSSDs and apply graph parallelism to boost the system performance further. As a result, the proposed computational storage platform achieves a throughput of 75.59 queries per second for the SIFT1B dataset at 258.66 W power dissipation, which is 12.83x and 17.91x faster and 10.43x and 24.33x more energy efficient than the conventional CPU-based and GPU-based server platforms, respectively. With multi-terabyte storage and custom acceleration capability, we believe that the proposed computational storage platform is a promising solution for cost-sensitive cloud datacenters.
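
The headline figures in this abstract can be cross-checked with a little arithmetic. The Python sketch below derives the platform's queries per joule and the baseline throughput and power implied by the quoted ratios. It assumes that "faster" is a throughput ratio and that "energy efficient" means queries per joule; the function name and the derived baseline values are illustrative and are not reported in the abstract.

```python
def implied_baseline(platform_qps, platform_watts, speedup, efficiency_gain):
    """Back out a baseline's throughput and power from quoted ratios.

    Assumption (not stated in the abstract):
      speedup         = platform_qps / baseline_qps
      efficiency_gain = platform queries-per-joule / baseline queries-per-joule
    """
    platform_qpj = platform_qps / platform_watts   # queries per joule
    baseline_qps = platform_qps / speedup
    baseline_qpj = platform_qpj / efficiency_gain
    baseline_watts = baseline_qps / baseline_qpj
    return baseline_qps, baseline_watts


# Figures quoted in the abstract.
CSD_QPS, CSD_WATTS = 75.59, 258.66

csd_queries_per_joule = CSD_QPS / CSD_WATTS
cpu_qps, cpu_watts = implied_baseline(CSD_QPS, CSD_WATTS, 12.83, 10.43)
gpu_qps, gpu_watts = implied_baseline(CSD_QPS, CSD_WATTS, 17.91, 24.33)

print(f"CSD platform: {csd_queries_per_joule:.3f} queries/J")
print(f"Implied CPU baseline: {cpu_qps:.2f} QPS at {cpu_watts:.0f} W")
print(f"Implied GPU baseline: {gpu_qps:.2f} QPS at {gpu_watts:.0f} W")
```

Under these assumptions the implied baseline power is simply the platform power scaled by the ratio of efficiency gain to speedup, so the check needs no information beyond the numbers already quoted.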

preprint, 2022, arXiv

Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training?

In Neural Architecture Search (NAS), reducing the cost of architecture evaluation remains one of the most crucial challenges. Among a plethora of efforts to bypass training of each candidate architecture to convergence for evaluation, the Neural Tangent Kernel (NTK) is emerging as a promising theoretical framework that can be utilized to estimate the performance of a neural architecture at initialization. In this work, we revisit several at-initialization metrics that can be derived from the NTK and reveal their key shortcomings. Then, through the empirical analysis of the time evolution of NTK, we deduce that modern neural architectures exhibit highly non-linear characteristics, making the NTK-based metrics incapable of reliably estimating the performance of an architecture without some amount of training. To take such non-linear characteristics into account, we introduce Label-Gradient Alignment (LGA), a novel NTK-based metric whose inherent formulation allows it to capture the large amount of non-linear advantage present in modern neural architectures. With a minimal amount of training, LGA obtains a meaningful level of rank correlation with the post-training test accuracy of an architecture. Lastly, we demonstrate that LGA, complemented with a few epochs of training, successfully guides existing search algorithms to achieve competitive search performance with significantly less search cost. The code is available at: https://github.com/nutellamok/DemystifyingNTK.

preprint, 2022, arXiv

Two-Step Question Retrieval for Open-Domain QA

The retriever-reader pipeline has shown promising performance in open-domain QA but suffers from a very slow inference speed. Recently proposed question retrieval models tackle this problem by indexing question-answer pairs and searching for similar questions. These models have shown a significant increase in inference speed, but at the cost of lower QA performance compared to the retriever-reader models. This paper proposes a two-step question retrieval model, SQuID (Sequential Question-Indexed Dense retrieval), and distant supervision for training. SQuID uses two bi-encoders for question retrieval. The first-step retriever selects top-k similar questions, and the second-step retriever finds the most similar question from the top-k questions. We evaluate the performance and the computational efficiency of SQuID. The results show that SQuID significantly increases the performance of existing question retrieval models with a negligible loss in inference speed.
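
The two-step flow described in this abstract (select the top-k questions with one encoder, then pick the best of those with a second encoder) can be illustrated with a minimal Python sketch. This is not the SQuID implementation: the hand-written vectors stand in for the outputs of two trained bi-encoders, cosine similarity is an assumed scoring function, and `two_step_retrieve` and its parameters are hypothetical names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def two_step_retrieve(query_vec_1, query_vec_2, index_1, index_2, answers, k):
    """Return (index, answer) of the indexed question chosen in two steps.

    index_1 / index_2 hold embeddings of the same indexed questions
    produced by the first-step and second-step encoder, respectively.
    """
    # Step 1: keep the k indexed questions most similar under encoder 1.
    ranked = sorted(range(len(index_1)),
                    key=lambda i: cosine(query_vec_1, index_1[i]),
                    reverse=True)
    candidates = ranked[:k]
    # Step 2: among those k, pick the most similar under encoder 2.
    best = max(candidates, key=lambda i: cosine(query_vec_2, index_2[i]))
    return best, answers[best]


# Toy stand-ins for encoder outputs (illustrative values only).
index_1 = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
index_2 = [[0.2, 0.8], [1.0, 0.0], [0.5, 0.5]]
answers = ["answer 0", "answer 1", "answer 2"]

result = two_step_retrieve([1.0, 0.05], [1.0, 0.0], index_1, index_2, answers, k=2)
print(result)
```

In this toy setup the first step alone would return question 0, while the second step promotes question 1, which shows how the second encoder only has to score k candidates rather than the whole index.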

preprint, 2020, arXiv

Dark Matter Deficient Galaxies Produced Via High-velocity Galaxy Collisions In High-resolution Numerical Simulations

The recent discovery of diffuse dwarf galaxies that are deficient in dark matter appears to challenge the current paradigm of structure formation in our Universe. We describe numerical experiments to determine whether the so-called dark matter deficient galaxies (DMDGs) could be produced when two gas-rich, dwarf-sized galaxies collide with a high relative velocity of $\sim 300\,{\rm km\,s^{-1}}$. Using idealized high-resolution simulations with both mesh-based and particle-based gravito-hydrodynamics codes, we find that DMDGs can form as high-velocity galaxy collisions separate dark matter from the warm disk gas, which is subsequently compressed by shock and tidal interaction to form stars. Then, using the large simulated universe IllustrisTNG, we discover a number of high-velocity galaxy collision events in which DMDGs are expected to form. However, we did not find evidence that these types of collisions actually produced DMDGs in the TNG100-1 run. We argue that the resolution of the numerical experiment is critical to realize the "collision-induced" DMDG formation scenario. Our results demonstrate one of many routes in which galaxies could form with unconventional dark matter fractions.

preprint, 2020, arXiv

Self-consistent proto-globular cluster formation in cosmological simulations of high-redshift galaxies

We report the formation of bound star clusters in a sample of high-resolution cosmological zoom-in simulations of z>5 galaxies from the FIRE project. We find that bound clusters preferentially form in high-pressure clouds with gas surface densities over 10^4 Msun pc^-2, where the cloud-scale star formation efficiency is near unity and young stars born in these regions are gravitationally bound at birth. These high-pressure clouds are compressed by feedback-driven winds and/or collisions of smaller clouds/gas streams in highly gas-rich, turbulent environments. The newly formed clusters follow a power-law mass function of dN/dM~M^-2. The cluster formation efficiency is similar across galaxies with stellar masses of ~10^7-10^10 Msun at z>5. The age spread of cluster stars is typically a few Myrs and increases with cluster mass. The metallicity dispersion of cluster members is ~0.08 dex in [Z/H] and does not depend on cluster mass significantly. Our findings support the scenario that present-day old globular clusters (GCs) were formed during relatively normal star formation in high-redshift galaxies. Simulations with a stricter/looser star formation model form a factor of a few more/fewer bound clusters per stellar mass formed, while the shape of the mass function is unchanged. Simulations with a lower local star formation efficiency form more stars in bound clusters. The simulated clusters are larger than observed GCs due to finite resolution. Our simulations are among the first cosmological simulations that form bound clusters self-consistently in a wide range of high-redshift galaxies.
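
To make the quoted power-law mass function dN/dM ~ M^-2 concrete, the Python sketch below draws masses from that distribution by inverse-transform sampling and counts how many fall in each decade of mass. The mass limits (10^3 to 10^6 Msun), the sample size, and the function name are assumptions chosen for illustration; the abstract does not state them.

```python
import random


def sample_power_law_m2(n, m_min, m_max, rng):
    """Draw n masses from dN/dM proportional to M**-2 on [m_min, m_max].

    For this slope the cumulative distribution is linear in 1/M, so a
    uniform draw u maps to 1/M = 1/m_min - u * (1/m_min - 1/m_max).
    """
    inv_min, inv_max = 1.0 / m_min, 1.0 / m_max
    return [1.0 / (inv_min - rng.random() * (inv_min - inv_max))
            for _ in range(n)]


rng = random.Random(0)                    # fixed seed for repeatability
masses = sample_power_law_m2(100_000, 1e3, 1e6, rng)   # hypothetical limits

# Count clusters per decade of mass.
low_decade = sum(1 for m in masses if 1e3 <= m < 1e4)
mid_decade = sum(1 for m in masses if 1e4 <= m < 1e5)
print(f"10^3-10^4 Msun: {low_decade}, 10^4-10^5 Msun: {mid_decade}")
print(f"ratio: {low_decade / mid_decade:.1f}")
```

With an M^-2 slope each decade of mass holds roughly ten times fewer clusters than the decade below it, which also means each decade contains roughly the same total mass.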

preprint, 2020, arXiv

The AGORA high-resolution galaxy simulations comparison project: Public data release

As part of the AGORA High-resolution Galaxy Simulations Comparison Project (Kim et al. 2014, 2016) we have generated a suite of isolated Milky Way-mass galaxy simulations using 9 state-of-the-art gravito-hydrodynamics codes widely used in the numerical galaxy formation community. In these simulations we adopted identical galactic disk initial conditions, and common physics models (e.g., radiative cooling and ultraviolet background by a standardized package). Subgrid physics models such as Jeans pressure floor, star formation, supernova feedback energy, and metal production were carefully constrained. Here we release the simulation data to be freely used by the community. In this release we include the disk snapshots at 0 and 500 Myr of evolution for each code as used in Kim et al. (2016), from simulations with and without star formation and feedback. We encourage any member of the numerical galaxy formation community to make use of these resources for their research, for example to compare their own simulations with the AGORA galaxies using the common yt analysis scripts used to obtain the plots shown in our papers, which are also available in this release.

preprint, 2019, arXiv

High-redshift Galaxy Formation with Self-consistently Modeled Stars and Massive Black Holes: Stellar Feedback and Quasar Growth

As the computational resolution of modern cosmological simulations comes ever closer to resolving individual star-forming clumps in a galaxy, the need for "resolution-appropriate" physics for a galaxy-scale simulation has never been greater. To this end, we introduce a self-consistent numerical framework that includes explicit treatments of feedback from star-forming molecular clouds (SFMCs) and massive black holes (MBHs). In addition to the thermal supernova feedback from SFMC particles, photoionizing radiation from both SFMCs and MBHs is tracked through full 3-dimensional ray tracing. A mechanical feedback channel from MBHs is also considered. Using our framework, we perform a state-of-the-art cosmological simulation of a quasar-host galaxy at z~7.5 for ~25 Myr with all relevant galactic components such as dark matter, gas, SFMCs, and an embedded MBH seed of ~> 1e6 Msun. We find that feedback from SFMCs and an accreting MBH suppresses runaway star formation locally in the galactic core region. Newly included radiation feedback from SFMCs, combined with feedback from the MBH, helps the MBH grow faster by retaining gas that eventually accretes on to the MBH. Our experiment demonstrates that previously undiscussed types of interplay between gas, SFMCs, and an MBH may hold important clues about the growth and feedback of quasars and their host galaxies in the high-redshift Universe.

preprint, 2016, arXiv

Reconciling dwarf galaxies with LCDM cosmology: Simulating a realistic population of satellites around a Milky Way-mass galaxy

Low-mass "dwarf" galaxies represent the most significant challenges to the cold dark matter (CDM) model of cosmological structure formation. Because these faint galaxies are (best) observed within the Local Group (LG) of the Milky Way (MW) and Andromeda (M31), understanding their formation in such an environment is critical. We present first results from the Latte Project: the Milky Way on FIRE (Feedback in Realistic Environments). This simulation models the formation of a MW-mass galaxy to z = 0 within LCDM cosmology, including dark matter, gas, and stars at unprecedented resolution: baryon particle mass of 7070 Msun with gas kernel/softening that adapts down to 1 pc (with a median of 25 - 60 pc at z = 0). Latte was simulated using the GIZMO code with a mesh-free method for accurate hydrodynamics and the FIRE-2 model for star formation and explicit feedback within a multi-phase interstellar medium. For the first time, Latte self-consistently resolves the spatial scales corresponding to half-light radii of dwarf galaxies that form around a MW-mass host down to Mstar > 10^5 Msun. Latte's population of dwarf galaxies agrees with the LG across a broad range of properties: (1) distributions of stellar masses and stellar velocity dispersions (dynamical masses), including their joint relation; (2) the mass-metallicity relation; and (3) a diverse range of star-formation histories, including their mass dependence. Thus, Latte produces a realistic population of dwarf galaxies at Mstar > 10^5 Msun that does not suffer from the "missing satellites" or "too big to fail" problems of small-scale structure formation. We conclude that baryonic physics can reconcile observed dwarf galaxies with standard LCDM cosmology.

preprint, 2013, arXiv

Enzo: An Adaptive Mesh Refinement Code for Astrophysics

This paper describes the open-source code Enzo, which uses block-structured adaptive mesh refinement to provide high spatial and temporal resolution for modeling astrophysical fluid flows. The code is Cartesian, can be run in 1, 2, and 3 dimensions, and supports a wide variety of physics including hydrodynamics, ideal and non-ideal magnetohydrodynamics, N-body dynamics (and, more broadly, self-gravity of fluids and particles), primordial gas chemistry, optically-thin radiative cooling of primordial and metal-enriched plasmas (as well as some optically-thick cooling models), radiation transport, cosmological expansion, and models for star formation and feedback in a cosmological context. In addition to explaining the algorithms implemented, we present solutions for a wide range of test problems, demonstrate the code's parallel performance, and discuss the Enzo collaboration's code development methodology.

preprint, 2013, arXiv

How Does the Surface Density and Size of Disk Galaxies Measured in Hydrodynamic Simulations Correlate with the Halo Spin Parameter?

Late-type low surface brightness galaxies (LSBs) are faint disk galaxies with central maximum stellar surface densities below 100 Msun/pc^2. The currently favored scenario for their origin is that LSBs have formed in fast-rotating halos with large angular momenta. We present the first numerical evidence for this scenario using a suite of self-consistent hydrodynamic simulations of a 2.3e11 Msun galactic halo, in which we investigate the correlations between the disk stellar/gas surface densities and the spin parameter of its host halo. A clear anti-correlation between the surface densities and the halo spin parameter, lambda, is found. That is, as the halo spin parameter increases, the disk cutoff radius at which the stellar surface density drops below 0.1 Msun/pc^2 monotonically increases, while the average stellar surface density of the disk within that radius decreases. The ratio of the average stellar surface density for the case of lambda=0.03 to that for the case of lambda=0.14 reaches more than 15. We demonstrate that the result is robust against variations in the baryon fraction, confirming that the angular momentum of the host halo is an important driver for the formation of LSBs.
