Source author record

Kangwook Lee

Kangwook Lee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; Artificial Intelligence; Information Theory; math.IT; Computer Vision; Cryptography and Security; Distributed, Parallel, and Cluster Computing; Networking and Internet Architecture; Applications; cond-mat.mtrl-sci; cs.CY; math.OC; Performance; Robotics

Catalog footprint

What is connected

13 works

14 topics

4 close collaborators

preprint · 2026 · arXiv

Fine-tuning a vision-language model for fracture-surface morphology recognition

Vision-language models (VLMs) have shown strong potential for scientific image understanding, but general-purpose models often lack the domain-specific visual knowledge required for reliable materials characterization. In this work, we fine-tuned an open-source VLM (Qwen3-VL-32B-Instruct) for fracture-surface image analysis using a curated dataset of 13,168 open-source, literature-mined fracture-surface images. Morphology annotations were generated by GPT-5.2-Reasoning (high) from both the images and relevant excerpts of their source papers, and the dataset was further enriched with targeted manual collection and rotation-based augmentation. The resulting specialist model outperforms flagship proprietary multimodal models on a benchmark of 100 manually annotated images. It achieves a precision of 0.92, compared to 0.35 for the base Qwen3-VL-32B-Instruct, 0.58 for GPT-5.5-Reasoning (high), and 0.78 for Gemini 3.1 Pro-Reasoning (high). Dataset ablations show that manual collection of rare-feature images and augmentation via image rotation are both beneficial to improve recognition of less common fracture morphology features. We further discuss integrated use of the fine-tuned model with proprietary models to combine fracture-specific visual accuracy with broader multimodal reasoning for autonomous fractography. Although focused on fracture-surface images, this work demonstrates how VLMs can be adapted through targeted collection and fine-tuning on novel feature images to recognize those features and support downstream decision-making in autonomous microscopy workflows.

preprint · 2026 · arXiv

RLDX-1 Technical Report

While Vision-Language-Action models (VLAs) have shown remarkable progress toward human-like generalist robotic policies through the versatile intelligence (i.e. broad scene understanding and language-conditioned generalization) inherited from pre-trained Vision-Language Models, they still struggle with complex real-world tasks requiring broader functional capabilities (e.g. motion awareness, long-term memory, and physical sensing). To address this, we introduce RLDX-1, a general-purpose robotic policy for dexterous manipulation built on the Multi-Stream Action Transformer (MSAT), an architecture that unifies these capabilities by integrating heterogeneous modalities through modality-specific streams with cross-modal joint self-attention. RLDX-1 further combines this architecture with system-level design choices, including data synthesis for rare manipulation scenarios, learning procedures specialized for human-like manipulation, and inference optimizations for real-time deployment. Through empirical evaluation, we show that RLDX-1 consistently outperforms recent frontier VLAs (e.g. $\pi_{0.5}$ and GR00T N1.6) across both simulation benchmarks and real-world tasks that require broad functional capabilities beyond general versatility. In particular, RLDX-1 shows superiority in ALLEX humanoid tasks by achieving success rates of 86.8% while $\pi_{0.5}$ and GR00T N1.6 achieve around 40%, highlighting the ability of RLDX-1 to control a high-DoF humanoid robot under diverse functional demands. Together, these results position RLDX-1 as a promising step toward reliable VLAs for complex, contact-rich, and dynamic real-world dexterous manipulation.

preprint · 2022 · arXiv

Breaking Fair Binary Classification with Optimal Flipping Attacks

Minimizing risk with fairness constraints is one of the popular approaches to learning a fair classifier. Recent works showed that this approach yields an unfair classifier if the training set is corrupted. In this work, we study the minimum amount of data corruption required for a successful flipping attack. First, we find lower/upper bounds on this quantity and show that these bounds are tight when the target model is the unique unconstrained risk minimizer. Second, we propose a computationally efficient data poisoning attack algorithm that can compromise the performance of fair learning algorithms.

preprint · 2022 · arXiv

GenLabel: Mixup Relabeling using Generative Models

Mixup is a data augmentation method that generates new data points by mixing a pair of input data. While mixup generally improves the prediction performance, it sometimes degrades the performance. In this paper, we first identify the main causes of this phenomenon by theoretically and empirically analyzing the mixup algorithm. To resolve this, we propose GenLabel, a simple yet effective relabeling algorithm designed for mixup. In particular, GenLabel helps the mixup algorithm correctly label mixup samples by learning the class-conditional data distribution using generative models. Via extensive theoretical and empirical analysis, we show that mixup, when used together with GenLabel, can effectively resolve the aforementioned phenomenon, improving the generalization performance and the adversarial robustness.
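As a rough illustration of the mixing step this abstract refers to, the sketch below implements standard mixup on plain Python lists: a convex combination of two inputs and of their one-hot labels. It is not GenLabel, whose relabeling relies on generative models; the function names `mixup_pair` and `sample_lambda` and the Beta(alpha, alpha) mixing weight are assumptions taken from the usual mixup formulation, not from this record.

```python
import random


def mixup_pair(x1, y1, x2, y2, lam):
    """Mix two feature vectors and their one-hot labels with weight lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Convex combination of the inputs.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    # Standard mixup applies the same weights to the labels;
    # this linear label is the part a relabeling method would replace.
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def sample_lambda(alpha=1.0, rng=random):
    """Draw a mixing weight from Beta(alpha, alpha) (assumed, as in standard mixup)."""
    return rng.betavariate(alpha, alpha)


# Example: mix a class-0 point with a class-1 point using a fixed weight.
x_mix, y_mix = mixup_pair([0.0, 2.0], [1.0, 0.0], [2.0, 0.0], [0.0, 1.0], 0.25)
print(x_mix, y_mix)  # [1.5, 0.5] [0.25, 0.75]
```

The mixed label is simply the interpolation of the two original labels, which is the step the abstract says can mislabel mixup samples.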

preprint · 2022 · arXiv

Improved Input Reprogramming for GAN Conditioning

We study the GAN conditioning problem, whose goal is to convert a pretrained unconditional GAN into a conditional GAN using labeled data. We first identify and analyze three approaches to this problem -- conditional GAN training from scratch, fine-tuning, and input reprogramming. Our analysis reveals that when the amount of labeled data is small, input reprogramming performs the best. Motivated by real-world scenarios with scarce labeled data, we focus on the input reprogramming approach and carefully analyze the existing algorithm. After identifying a few critical issues of the previous input reprogramming approach, we propose a new algorithm called InRep+. Our algorithm InRep+ addresses the existing issues with the novel uses of invertible neural networks and Positive-Unlabeled (PU) learning. Via extensive experiments, we show that InRep+ outperforms all existing methods, particularly when label information is scarce, noisy, and/or imbalanced. For instance, for the task of conditioning a CIFAR10 GAN with 1% labeled data, InRep+ achieves an average Intra-FID of 76.24, whereas the second-best method achieves 114.51.

preprint · 2022 · arXiv

Rare Gems: Finding Lottery Tickets at Initialization

Large neural networks can be pruned to a small fraction of their original size, with little loss in accuracy, by following a time-consuming "train, prune, re-train" approach. Frankle & Carbin conjecture that we can avoid this by training "lottery tickets", i.e., special sparse subnetworks found at initialization, that can be trained to high accuracy. However, a subsequent line of work by Frankle et al. and Su et al. presents concrete evidence that current algorithms for finding trainable networks at initialization fail simple baseline comparisons, e.g., against training random sparse subnetworks. Finding lottery tickets that train to better accuracy compared to simple baselines remains an open problem. In this work, we resolve this open problem by proposing Gem-Miner, which finds lottery tickets at initialization that beat current baselines. Gem-Miner finds lottery tickets trainable to accuracy competitive with or better than Iterative Magnitude Pruning (IMP), and does so up to $19\times$ faster.

preprint · 2020 · arXiv

Attack of the Tails: Yes, You Really Can Backdoor Federated Learning

Due to its decentralized nature, Federated Learning (FL) lends itself to adversarial attacks in the form of backdoors during training. The goal of a backdoor is to corrupt the performance of the trained model on specific sub-tasks (e.g., by classifying green cars as frogs). A range of FL backdoor attacks have been introduced in the literature, but also methods to defend against them, and it is currently an open question whether FL systems can be tailored to be robust against backdoors. In this work, we provide evidence to the contrary. We first establish that, in the general case, robustness to backdoors implies model robustness to adversarial examples, a major open problem in itself. Furthermore, detecting the presence of a backdoor in an FL model is unlikely assuming first order oracles or polynomial time. We couple our theoretical results with a new family of backdoor attacks, which we refer to as edge-case backdoors. An edge-case backdoor forces a model to misclassify on seemingly easy inputs that are however unlikely to be part of the training or test data, i.e., they live on the tail of the input distribution. We explain how these edge-case backdoors can lead to unsavory failures and may have serious repercussions on fairness, and exhibit that with careful tuning at the side of the adversary, one can insert them across a range of machine learning tasks (e.g., image classification, OCR, text prediction, sentiment analysis).

preprint · 2020 · arXiv

FR-Train: A Mutual Information-Based Approach to Fair and Robust Training

Trustworthy AI is a critical issue in machine learning where, in addition to training a model that is accurate, one must consider both fair and robust training in the presence of data bias and poisoning. However, the existing model fairness techniques mistakenly view poisoned data as an additional bias to be fixed, resulting in severe performance degradation. To address this problem, we propose FR-Train, which holistically performs fair and robust model training. We provide a mutual information-based interpretation of an existing adversarial training-based fairness-only method, and apply this idea to architect an additional discriminator that can identify poisoned data using a clean validation set and reduce its influence. In our experiments, FR-Train shows almost no decrease in fairness and accuracy in the presence of data poisoning by both mitigating the bias and defending against poisoning. We also demonstrate how to construct clean validation sets using crowdsourcing, and release new benchmark datasets.

preprint · 2016 · arXiv

Fast and Robust Compressive Phase Retrieval with Sparse-Graph Codes

In this paper, we tackle the compressive phase retrieval problem in the presence of noise. The noisy compressive phase retrieval problem is to recover a $K$-sparse complex signal $s \in \mathbb{C}^n$ from a set of $m$ noisy quadratic measurements: $y_i = |a_i^H s|^2 + w_i$, where $a_i^H \in \mathbb{C}^n$ is the $i$th row of the measurement matrix $A \in \mathbb{C}^{m\times n}$, and $w_i$ is the additive noise in the $i$th measurement. We consider the regime where $K = \beta n^\delta$, with constants $\beta > 0$ and $\delta \in (0,1)$. We use the architecture of the PhaseCode algorithm, and robustify it using two schemes: the almost-linear scheme and the sublinear scheme. We prove that with high probability, the almost-linear scheme recovers $s$ with sample complexity $\Theta(K \log(n))$ and computational complexity $\Theta(n \log(n))$, and the sublinear scheme recovers $s$ with sample complexity $\Theta(K\log^3(n))$ and computational complexity $\Theta(K\log^3(n))$. To the best of our knowledge, this is the first scheme that achieves sublinear computational complexity for the compressive phase retrieval problem. Finally, we provide simulation results that support our theoretical contributions.
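The measurement model in this abstract can be made concrete with a minimal sketch. It only computes the quadratic measurements from the formula quoted above and shows why the phase is lost; it does not implement PhaseCode or either recovery scheme. The function name `quadratic_measurements` and the tiny 3-by-3 matrix are hypothetical choices for illustration, and each row of `A` is assumed to already hold $a_i^H$.

```python
def quadratic_measurements(A, s, noise=None):
    """Return y_i = |a_i^H s|^2 + w_i for each row of A.

    A     : list of rows; row i is assumed to hold a_i^H already
    s     : complex signal as a list of length n
    noise : optional list of additive noise terms w_i (defaults to zero)
    """
    if noise is None:
        noise = [0.0] * len(A)
    return [
        abs(sum(a * x for a, x in zip(row, s))) ** 2 + w
        for row, w in zip(A, noise)
    ]


# A 1-sparse signal (K = 1, n = 3) and a toy measurement matrix.
A = [[1, 0, 0], [0, 1, 0], [1, 1, 1]]
s = [3 + 4j, 0, 0]
print(quadratic_measurements(A, s))  # [25.0, 0.0, 25.0]

# Multiplying s by a unit-modulus constant leaves every measurement unchanged,
# so s can only be recovered up to a global phase.
rotated = [1j * v for v in s]
print(quadratic_measurements(A, rotated) == quadratic_measurements(A, s))  # True
```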

preprint · 2016 · arXiv

Predicting Long-term Outcomes of Educational Interventions Using the Evolutionary Causal Matrices and Markov Chain Based on Educational Neuroscience

We developed a prediction model based on the evolutionary causal matrices (ECM) and the Markov Chain to predict long-term influences of educational interventions on adolescents' development. Particularly, we created a computational model predicting longitudinal influences of different types of stories of moral exemplars on adolescents' voluntary service participation. We tested whether the developed prediction model can properly predict a long-term longitudinal trend of change in voluntary service participation rate by comparing prediction results and surveyed data. Furthermore, we examined which type of intervention would most effectively promote service engagement and what is the minimum required frequency of intervention to produce a large effect. We discussed the implications of the developed prediction model in educational interventions based on educational neuroscience.

preprint · 2015 · arXiv

SAFFRON: A Fast, Efficient, and Robust Framework for Group Testing based on Sparse-Graph Codes

Group testing tackles the problem of identifying a population of $K$ defective items from a set of $n$ items by pooling groups of items efficiently in order to cut down the number of tests needed. The result of a test for a group of items is positive if any of the items in the group is defective and negative otherwise. The goal is to judiciously group subsets of items such that defective items can be reliably recovered using the minimum number of tests, while also having a low-complexity decoding procedure. We describe SAFFRON (Sparse-grAph codes Framework For gROup testiNg), a non-adaptive group testing paradigm that recovers at least a $(1-\epsilon)$-fraction (for any arbitrarily small $\epsilon > 0$) of $K$ defective items with high probability with $m=6C(\epsilon)K\log_2{n}$ tests, where $C(\epsilon)$ is a precisely characterized constant that depends only on $\epsilon$. For instance, it can provably recover at least $(1-10^{-6})K$ defective items with $m \simeq 68 K \log_2{n}$ tests. The computational complexity of the decoding algorithm of SAFFRON is $\mathcal{O}(K\log n)$, which is order-optimal. Further, we robustify SAFFRON such that it can reliably recover the set of $K$ defective items even in the presence of erroneous or noisy test results. We also propose Singleton-Only-SAFFRON, a variant of SAFFRON, that recovers all the $K$ defective items with $m=2e(1+\alpha)K\log K \log_2 n$ tests with probability $1-\mathcal{O}\left(\frac{1}{K^\alpha}\right)$, where $\alpha>0$ is a constant. By leveraging powerful design and analysis tools from modern sparse-graph coding theory, SAFFRON is the first approach to reliable, large-scale probabilistic group testing that offers both precisely characterizable number of tests needed (down to the constants) together with order-optimal decoding complexity.
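Two pieces of this abstract are easy to sketch: the rule that a pooled test is positive when any pooled item is defective, and the stated test count $m = 6C(\epsilon)K\log_2 n$. The code below is only an illustration under assumptions: the pools are an arbitrary hand-picked example rather than SAFFRON's sparse-graph design, no decoder is included, and the names `run_pooled_tests` and `saffron_test_count` are hypothetical.

```python
import math


def run_pooled_tests(pools, defective):
    """Return one boolean per pool: True if the pool contains any defective item."""
    defective = set(defective)
    return [any(item in defective for item in pool) for pool in pools]


def saffron_test_count(k, n, c_eps):
    """Evaluate m = 6 * C(eps) * K * log2(n), the test count quoted in the abstract.

    c_eps is the constant C(eps); its value must be supplied by the caller.
    """
    return 6.0 * c_eps * k * math.log2(n)


# Four items, item 3 defective, three hand-picked pools (not a SAFFRON design).
print(run_pooled_tests([[0, 1], [2, 3], [1, 3]], {3}))  # [False, True, True]

# With C(eps) = 1 as a placeholder, K = 10 and n = 1024 give 6 * 10 * 10 tests.
print(saffron_test_count(10, 1024, 1.0))  # 600.0
```

For the abstract's example of $m \simeq 68K\log_2 n$ at $\epsilon = 10^{-6}$, the implied constant is roughly $C(\epsilon) \approx 68/6$.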

preprint · 2013 · arXiv

The MDS Queue: Analysing the Latency Performance of Erasure Codes

In order to scale economically, data centers are increasingly evolving their data storage methods from the use of simple data replication to the use of more powerful erasure codes, which provide the same level of reliability as replication but at a significantly lower storage cost. In particular, it is well known that Maximum-Distance-Separable (MDS) codes, such as Reed-Solomon codes, provide the maximum storage efficiency. While the use of codes for providing improved reliability in archival storage systems, where the data is less frequently accessed (or so-called "cold data"), is well understood, the role of codes in the storage of more frequently accessed and active "hot data", where latency is the key metric, is less clear. In this paper, we study data storage systems based on MDS codes through the lens of queueing theory, and term this the "MDS queue." We analytically characterize the (average) latency performance of MDS queues, for which we present insightful scheduling policies that form upper and lower bounds to performance, and are observed to be quite tight. Extensive simulations are also provided and used to validate our theoretical analysis. We also employ the framework of the MDS queue to analyse different methods of performing so-called degraded reads (reading of partial data) in distributed data storage.

preprint · 2013 · arXiv

When Do Redundant Requests Reduce Latency ?

Several systems possess the flexibility to serve requests in more than one way. For instance, a distributed storage system storing multiple replicas of the data can serve a request from any of the multiple servers that store the requested data, or a computational task may be performed in a compute-cluster by any one of multiple processors. In such systems, the latency of serving the requests may potentially be reduced by sending "redundant requests": a request may be sent to more servers than needed, and it is deemed served when the requisite number of servers complete service. Such a mechanism trades off the possibility of faster execution of at least one copy of the request with the increase in the delay due to an increased load on the system. Due to this tradeoff, it is unclear when redundant requests may actually help. Several recent works empirically evaluate the latency performance of redundant requests in diverse settings. This work aims at an analytical study of the latency performance of redundant requests, with the primary goals of characterizing under what scenarios sending redundant requests will help (and under what scenarios they will not help), as well as designing optimal redundant-requesting policies. We first present a model that captures the key features of such systems. We show that when service times are i.i.d. memoryless or "heavier", and when the additional copies of already-completed jobs can be removed instantly, redundant requests reduce the average latency. On the other hand, when service times are "lighter" or when service times are memoryless and removal of jobs is not instantaneous, then not having any redundancy in the requests is optimal under high loads. Our results hold for arbitrary arrival processes.
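One ingredient of the tradeoff described in this abstract can be checked numerically: when service times are i.i.d. memoryless (exponential with rate `rate`), the first of `copies` redundant copies finishes after an exponential time with rate `copies * rate`, so its mean is `1 / (copies * rate)`. The sketch below estimates that mean by simulation. It deliberately ignores queueing, arrivals, and the extra load that redundancy adds, so it does not reproduce the paper's model or results; the function name and parameters are hypothetical.

```python
import random


def mean_first_completion(rate, copies, trials=20000, seed=0):
    """Estimate the mean time until the fastest of `copies` i.i.d.
    exponential(rate) service times completes (no queueing modeled)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The request is served as soon as any one copy finishes.
        total += min(rng.expovariate(rate) for _ in range(copies))
    return total / trials


for copies in (1, 2, 4):
    estimate = mean_first_completion(rate=1.0, copies=copies)
    print(copies, round(estimate, 3), "theory:", 1.0 / copies)
```

The estimates should track `1 / copies` closely, which is the speed-up side of the tradeoff; the cost side, the added load on the system, is exactly what this sketch leaves out.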
