Researcher profile

Youngjae Kim

Youngjae Kim contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 15 (Unverified) · Verification L1 · Unclaimed author
3 works
0 followers
7 topics
4 close collaborators

Published work

3 published items

Preprint · 2026 · arXiv

DUAL-BLADE: Dual-Path NVMe-Direct KV-Cache Offloading for Edge LLM Inference

The increasing deployment of Large Language Model (LLM) inference on edge AI systems demands efficient execution under tight memory budgets. A key challenge arises from Key-Value (KV) caches, which often exceed available device memory. Although NVMe-based offloading offers scalable capacity, existing file-based designs rely heavily on the kernel page cache, leading to cache thrashing, unpredictable latency, and high software overhead under memory pressure. We present DUAL-BLADE, a dual-path KV residency framework that dynamically assigns KV tensors to either a page-cache path or an NVMe-direct path based on runtime memory availability. The NVMe-direct path bypasses the filesystem by mapping KV tensors to contiguous logical block address (LBA) regions, enabling low-overhead direct storage access. DUAL-BLADE further incorporates adaptive pipeline parallelism to overlap storage I/O with GPU DMA, improving inference throughput. Our evaluation shows that DUAL-BLADE substantially mitigates I/O bottlenecks, reducing prefill and decode latency by up to 33.1% and 42.4%, respectively, while improving SSD utilization by 2.2x across diverse memory budgets.
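The dual-path idea in this abstract can be illustrated with a minimal sketch. It is not DUAL-BLADE's implementation: the names `choose_path`, `LbaAllocator`, `Extent`, the 4 KiB block size, and the reserve threshold are all hypothetical, and no real storage I/O is performed. It only shows how a tensor might be routed by available host memory and how an NVMe-direct tensor could be mapped to a contiguous LBA range.

```python
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed logical block size in bytes (hypothetical)


@dataclass
class Extent:
    """A contiguous run of logical blocks on the device."""
    start_lba: int
    num_blocks: int


class LbaAllocator:
    """Hypothetical bump allocator that hands out contiguous LBA ranges."""

    def __init__(self, base_lba: int, total_blocks: int):
        self.next_lba = base_lba
        self.end_lba = base_lba + total_blocks

    def allocate(self, nbytes: int) -> Extent:
        blocks = -(-nbytes // BLOCK_SIZE)  # round up to whole blocks
        if self.next_lba + blocks > self.end_lba:
            raise MemoryError("LBA region exhausted")
        extent = Extent(self.next_lba, blocks)
        self.next_lba += blocks
        return extent


def choose_path(tensor_bytes: int, free_host_bytes: int, reserve_bytes: int) -> str:
    """Pick a residency path from current memory availability.

    Assumed rule: use the page-cache path only if caching the tensor
    still leaves at least `reserve_bytes` of host memory free.
    """
    if free_host_bytes - tensor_bytes >= reserve_bytes:
        return "page_cache"
    return "nvme_direct"
```

For example, with 8 MiB free and a 2 MiB reserve, a 1 MiB tensor would take the page-cache path, while the same tensor with only 2 MiB free would be sent to the NVMe-direct path and given its own extent from the allocator.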

Preprint · 2020 · arXiv

Optimizing Placement of Heap Memory Objects in Energy-Constrained Hybrid Memory Systems

Main memory (DRAM) significantly impacts the power and energy utilization of the overall server system. Non-Volatile Memory (NVM) devices, such as Phase Change Memory and Spin-Transfer Torque RAM, are suitable candidates for main memory to reduce energy consumption. However, NVM access latencies are higher than DRAM's, and NVM writes are more energy-intensive than DRAM writes. Thus, Hybrid Main Memory Systems (HMMS) employing both DRAM and NVM have been proposed to reduce the overall energy consumption of main memory while optimizing the performance of NVM. This paper proposes eMap, an optimal heap memory object placement planner for HMMS. eMap considers object-level access patterns and energy consumption at the application level and provides an ideal placement strategy for each object to improve performance and energy utilization. eMap comprises two modules, eMPlan and eMDyn. eMPlan is a static placement planner that provides one-time placement policies for memory objects to meet the energy budget. eMDyn is a runtime placement planner that responds to changes in the energy constraint at runtime and reshuffles memory objects, taking into account access patterns as well as the migration cost in terms of energy and performance. The evaluation shows that our proposed solution satisfies both the energy constraint and the performance requirement. We compare our methodology with the state-of-the-art memory object classification and allocation (MOCA) framework. Our extensive evaluation shows that eMPlan meets the energy constraint at 4.17 times lower cost and reduces energy consumption by up to 14% at the same performance. eMDyn also satisfies the performance and energy requirements while accounting for the migration cost in terms of time and energy.
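To make the notion of a static, budget-driven placement concrete, here is a small greedy sketch. It is not eMPlan's algorithm, which the abstract does not specify: the `HeapObject` fields, the per-object energy and slowdown estimates, and the greedy ordering are all assumptions made for illustration. Every object starts in DRAM, and objects are moved to NVM in order of least slowdown per unit of energy saved until the budget is met.

```python
from dataclasses import dataclass


@dataclass
class HeapObject:
    name: str
    dram_energy: float   # assumed energy cost if placed in DRAM
    nvm_energy: float    # assumed energy cost if placed in NVM
    nvm_slowdown: float  # assumed extra run time if placed in NVM


def plan_placement(objects, energy_budget):
    """Greedy one-time placement under an energy budget (illustrative only)."""
    placement = {o.name: "DRAM" for o in objects}
    total = sum(o.dram_energy for o in objects)

    # Only objects that actually save energy in NVM are worth moving.
    candidates = [o for o in objects if o.dram_energy > o.nvm_energy]
    # Cheapest performance penalty per unit of energy saved goes first.
    candidates.sort(key=lambda o: o.nvm_slowdown / (o.dram_energy - o.nvm_energy))

    for o in candidates:
        if total <= energy_budget:
            break
        placement[o.name] = "NVM"
        total -= o.dram_energy - o.nvm_energy

    if total > energy_budget:
        raise ValueError("energy budget cannot be met by any placement tried")
    return placement, total
```

A greedy ratio ordering is simple to reason about but not guaranteed optimal; an exact planner would typically treat this as a knapsack-style optimization.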

Preprint · 2020 · arXiv

SGX-SSD: A Policy-based Versioning SSD with Intel SGX

This paper demonstrates that SSDs that perform device-level versioning can be exposed to data tampering attacks when the retention time of data is shorter than the malware's dwell time. To address this threat, we propose SGX-SSD, an SGX-based versioning SSD that selectively preserves file history according to a given policy. The proposed system uses Intel SGX to implement a version policy management system that is safe from high-privileged malware. Based on the policy, only the necessary data is preserved in the SSD, which prevents lower-priority files from wasting space and ensures the integrity of important files.
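Policy-based selective versioning can be sketched in a few lines. This is a toy in-memory model, not SGX-SSD's design: the `VersionStore` class, the glob-style policy patterns, and the file paths are hypothetical, and the SGX enclave that protects the policy from privileged malware is not modeled at all. It only shows the selective behavior: old versions are retained for files matching the policy and discarded for everything else.

```python
import fnmatch
from dataclasses import dataclass, field


@dataclass
class VersionStore:
    """Toy store that keeps history only for policy-protected paths."""
    protected_patterns: list
    current: dict = field(default_factory=dict)
    history: dict = field(default_factory=dict)

    def is_protected(self, path: str) -> bool:
        # A path is protected if it matches any glob pattern in the policy.
        return any(fnmatch.fnmatch(path, p) for p in self.protected_patterns)

    def write(self, path: str, data: bytes) -> None:
        # Preserve the overwritten version only when the policy asks for it.
        if path in self.current and self.is_protected(path):
            self.history.setdefault(path, []).append(self.current[path])
        self.current[path] = data


store = VersionStore(protected_patterns=["/docs/*"])
store.write("/docs/report.txt", b"v1")
store.write("/docs/report.txt", b"v2")  # b"v1" is kept in history
store.write("/tmp/cache.bin", b"x")
store.write("/tmp/cache.bin", b"y")     # b"x" is dropped, no history kept
```

Keeping history only for protected paths is what lets retention for important files outlast a long malware dwell time without spending space on low-priority data.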