Source author record

Taehyeong Kim

Taehyeong Kim appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Artificial Intelligence, Computer Vision, Machine Learning, math.DS, math.NT, Computation and Language, physics.plasm-ph

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

Preprint, 2026, arXiv

Mi:dm 2.0 Korea-centric Bilingual Language Models

We introduce Mi:dm 2.0, a bilingual large language model (LLM) specifically engineered to advance Korea-centric AI. This model goes beyond Korean text processing by integrating the values, reasoning patterns, and commonsense knowledge inherent to Korean society, enabling nuanced understanding of cultural contexts, emotional subtleties, and real-world scenarios to generate reliable and culturally appropriate responses. To address limitations of existing LLMs, often caused by insufficient or low-quality Korean data and lack of cultural alignment, Mi:dm 2.0 emphasizes robust data quality through a comprehensive pipeline that includes proprietary data cleansing, high-quality synthetic data generation, strategic data mixing with curriculum learning, and a custom Korean-optimized tokenizer to improve efficiency and coverage. To realize this vision, we offer two complementary configurations: Mi:dm 2.0 Base (11.5B parameters), built with a depth-up scaling strategy for general-purpose use, and Mi:dm 2.0 Mini (2.3B parameters), optimized for resource-constrained environments and specialized tasks. Mi:dm 2.0 achieves state-of-the-art performance on Korean-specific benchmarks, with top-tier zero-shot results on KMMLU and strong internal evaluation results across language, humanities, and social science tasks. The Mi:dm 2.0 lineup is released under the MIT license to support extensive research and commercial use. By offering accessible and high-performance Korea-centric LLMs, KT aims to accelerate AI adoption across Korean industries, public services, and education, strengthen the Korean AI developer community, and lay the groundwork for the broader vision of K-intelligence. Our models are available at https://huggingface.co/K-intelligence. For technical inquiries, please contact midm-llm@kt.com.

Preprint, 2022, arXiv

Cross-Modal Alignment Learning of Vision-Language Conceptual Systems

Human infants learn the names of objects and develop their own conceptual systems without explicit supervision. In this study, we propose methods for learning aligned vision-language conceptual systems inspired by infants' word learning mechanisms. The proposed model learns the associations of visual objects and words online and gradually constructs cross-modal relational graph networks. We also propose an aligned cross-modal representation learning method that learns semantic representations of visual objects and words in a self-supervised manner based on the cross-modal relational graph networks. This allows entities of different modalities that share the same conceptual meaning to have similar semantic representation vectors. We quantitatively and qualitatively evaluate our method, including object-to-word mapping and zero-shot learning tasks, showing that the proposed model significantly outperforms the baselines and that each conceptual system is topologically aligned.

Preprint, 2022, arXiv

Dimension estimates for badly approximable affine forms

For given $ε>0$ and $b\in\mathbb{R}^m$, we say that a real $m\times n$ matrix $A$ is $ε$-badly approximable for the target $b$ if $$\liminf_{q\in\mathbb{Z}^n, \|q\|\to\infty} \|q\|^n \langle Aq-b \rangle^m \geq ε,$$ where $\langle \cdot \rangle$ denotes the distance from the nearest integral point. In this article, we obtain upper bounds for the Hausdorff dimensions of the set of $ε$-badly approximable matrices for fixed target $b$ and the set of $ε$-badly approximable targets for fixed matrix $A$. Moreover, we give an equivalent Diophantine condition of $A$ for which the set of $ε$-badly approximable targets for fixed $A$ has full Hausdorff dimension for some $ε>0$. The upper bounds are established by effectivizing entropy rigidity in homogeneous dynamics, which is of independent interest. For the $A$-fixed case, our method also works for the weighted setting where the supremum norms are replaced by certain weighted quasinorms.
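The quantity inside the liminf above can be evaluated for individual integer vectors, which may help make the definition concrete. The following is a minimal sketch, not taken from the paper: it assumes the supremum norm for $\|q\|$ and the supremum-norm distance to $\mathbb{Z}^m$ for $\langle\cdot\rangle$, and the function names are hypothetical. A finite search of this kind cannot decide whether a matrix is $ε$-badly approximable, since the definition is asymptotic; it only shows what is being measured.

```python
from itertools import product
from math import inf


def dist_to_nearest_integer_point(v):
    """Sup-norm distance from the vector v to the nearest point of Z^m (assumed norm)."""
    return max(abs(x - round(x)) for x in v)


def approximation_quantity(A, b, q):
    """Return ||q||^n * <Aq - b>^m for an m x n matrix A given as a list of rows."""
    m, n = len(A), len(A[0])
    norm_q = max(abs(c) for c in q)  # supremum norm of q
    Aq_minus_b = [sum(A[i][j] * q[j] for j in range(n)) - b[i] for i in range(m)]
    return norm_q ** n * dist_to_nearest_integer_point(Aq_minus_b) ** m


def min_over_shell(A, b, lo, hi):
    """Minimum of the quantity over integer q with lo <= ||q|| <= hi (brute force)."""
    n = len(A[0])
    best = inf
    for q in product(range(-hi, hi + 1), repeat=n):
        if lo <= max(abs(c) for c in q) <= hi:
            best = min(best, approximation_quantity(A, b, q))
    return best
```

For instance, with the toy inputs `A = [[0.5]]` and `b = [0.25]`, calling `min_over_shell(A, b, 1, 3)` scans all integers $q$ with $1 \le |q| \le 3$. Comparing such minima over growing shells gives a rough numerical feel for the liminf, nothing more.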

Preprint, 2022, arXiv

Hausdorff measure of sets of Dirichlet non-improvable affine forms

For a decreasing real-valued function $ψ$, a pair $(A,\mathbf{b})$ of a real $m\times n$ matrix $A$ and $\mathbf{b}\in\mathbb{R}^m$ is said to be $ψ$-Dirichlet improvable if the system $$\|A\mathbf{q}+\mathbf{b}-\mathbf{p}\|^m < ψ(T)\quad\text{and}\quad\|\mathbf{q}\|^n < T$$ has a solution $\mathbf{p}\in\mathbb{Z}^m$, $\mathbf{q}\in\mathbb{Z}^n$ for all sufficiently large $T$, where $\|\cdot\|$ denotes the supremum norm. Kleinbock and Wadleigh (2019) established an integrability criterion for the Lebesgue measure of the $ψ$-Dirichlet non-improvable set. In this paper, we prove a similar criterion for the Hausdorff measure of the $ψ$-Dirichlet non-improvable set. We also extend this result to the singly metric case in which $\mathbf{b}$ is fixed. As an application, we compute the Hausdorff dimension of the set of pairs $(A,\mathbf{b})$ with uniform Diophantine exponents $\widehat{w}(A,\mathbf{b})\leq w$.
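For a single value of $T$, the system in the definition above can be checked by brute force. The sketch below is illustrative only and is not the paper's method: the function name `has_solution` is hypothetical, and it takes the system exactly as stated in the abstract (so $\mathbf{q}=0$ is not excluded). For each candidate $\mathbf{q}$ the best $\mathbf{p}$ is the coordinatewise nearest integer vector to $A\mathbf{q}+\mathbf{b}$, so only $\mathbf{q}$ needs to be enumerated. Since $ψ$-Dirichlet improvability requires a solution for all sufficiently large $T$, no finite check of this kind establishes it.

```python
from itertools import product
from math import ceil


def has_solution(A, b, psi, T):
    """Check whether ||Aq + b - p||^m < psi(T) and ||q||^n < T has an integer solution.

    A is an m x n matrix given as a list of rows, b a list of length m,
    psi a Python callable, and ||.|| the supremum norm.
    """
    m, n = len(A), len(A[0])
    bound = ceil(T ** (1.0 / n))  # every admissible q has coordinates within this bound
    for q in product(range(-bound, bound + 1), repeat=n):
        norm_q = max(abs(c) for c in q)
        if norm_q ** n >= T:
            continue  # violates ||q||^n < T
        v = [sum(A[i][j] * q[j] for j in range(n)) + b[i] for i in range(m)]
        # The optimal p rounds each coordinate of Aq + b to the nearest integer.
        dist = max(abs(x - round(x)) for x in v)
        if dist ** m < psi(T):
            return True
    return False
```

As a toy case, take `A = [[0.5]]`, `b = [0.25]` and `psi = lambda T: 1.0 / T`. Every value $0.5q + 0.25$ lies at distance exactly $0.25$ from the integers, so the system is solvable at $T = 2$ (where $ψ(T) = 0.5$) but not at $T = 8$ (where $ψ(T) = 0.125$).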

Preprint, 2021, arXiv

Catalytic effect of plasma in lowering the reduction temperature of $Fe_{2}O_{3}$

Atmospheric pressure plasma (APP) generates highly reactive species that are useful for surface activation. We demonstrate fast regeneration of iron oxides, which are popular catalysts in various industrial processes, using microwave-driven argon APP under ambient conditions. The surface treatment of hematite powder by the APP with a small portion of hydrogen (0.5%) lowers the oxide's reduction temperature. A near-infrared laser is used for localized heating to control the surface temperature. Controlled experiments without plasma confirm the catalytic effect of the plasma. Raman, XRD, SEM, and XPS analyses show that the plasma treatment changed the chemical state of the hematite to that of magnetite without sintering.

Preprint, 2020, arXiv

Label Propagation Adaptive Resonance Theory for Semi-supervised Continuous Learning

Semi-supervised learning and continuous learning are fundamental paradigms for human-level intelligence. To deal with real-world problems where labels are rarely given and the opportunity to access the same data is limited, it is necessary to apply these two paradigms jointly. In this paper, we propose Label Propagation Adaptive Resonance Theory (LPART) for semi-supervised continuous learning. LPART uses an online label propagation mechanism to perform classification and gradually improves its accuracy as the observed data accumulates. We evaluated the proposed model on visual (MNIST, SVHN, CIFAR-10) and audio (NSynth) datasets by adjusting the ratio of labeled to unlabeled data. The accuracies are much higher when both labeled and unlabeled data are used, demonstrating the significant advantage of LPART in environments where data labels are scarce.
