Source author record

Jungwoo Lee

Jungwoo Lee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Information Theory, math.IT, Machine Learning, Artificial Intelligence, Computer Vision

Catalog footprint

What is connected

9 works

5 topics

4 close collaborators

preprint, 2026, arXiv

Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning

Offline reinforcement learning (RL) agents often fail when deployed, as the gap between training datasets and real environments leads to unsafe behavior. To address this, we present SAS (Self-Alignment for Safety), a transformer-based framework that enables test-time adaptation in offline safe RL without retraining. In SAS, the main mechanism is self-alignment: at test time, the pretrained agent generates several imagined trajectories and selects those satisfying the Lyapunov condition. These feasible segments are then recycled as in-context prompts, allowing the agent to realign its behavior toward safety while avoiding parameter updates. In effect, SAS turns Lyapunov-guided imagination into control-invariant prompts, and its transformer architecture admits a hierarchical RL interpretation where prompting functions as Bayesian inference over latent skills. Across Safety Gymnasium and MuJoCo benchmarks, SAS consistently reduces cost and failure while maintaining or improving return.
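
The selection step described in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical Lyapunov candidate `lyapunov_value` and a decrease condition of the form V(s_{t+1}) <= (1 - alpha) * V(s_t); the function, the rate `alpha`, and all names below are illustrative assumptions, since the abstract does not state the exact condition SAS uses.

```python
from typing import Callable, List, Sequence

State = Sequence[float]
Trajectory = List[State]


def lyapunov_value(state: State) -> float:
    """Hypothetical Lyapunov candidate: squared distance from the origin."""
    return sum(x * x for x in state)


def satisfies_decrease(traj: Trajectory,
                       v: Callable[[State], float],
                       alpha: float = 0.1) -> bool:
    """True if V(s_{t+1}) <= (1 - alpha) * V(s_t) at every step (assumed condition)."""
    return all(v(nxt) <= (1.0 - alpha) * v(cur)
               for cur, nxt in zip(traj, traj[1:]))


def select_feasible(trajs: List[Trajectory],
                    v: Callable[[State], float] = lyapunov_value,
                    alpha: float = 0.1) -> List[Trajectory]:
    """Keep only imagined trajectories that pass the decrease check."""
    return [t for t in trajs if satisfies_decrease(t, v, alpha)]


# Two toy imagined trajectories in a 2-D state space.
imagined = [
    [(1.0, 1.0), (0.8, 0.8), (0.5, 0.5)],  # contracts toward the origin
    [(1.0, 1.0), (1.2, 1.0), (1.5, 1.1)],  # drifts away from the origin
]
prompts = select_feasible(imagined)  # candidates to reuse as in-context prompts
```

In this toy example only the first trajectory passes the check; in the framework described above, the surviving segments would then be fed back to the pretrained transformer as prompts rather than used for parameter updates.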

preprint, 2024, arXiv

SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning

Alleviating overestimation bias is a critical challenge for deep reinforcement learning to achieve successful performance on more complex tasks or offline datasets containing out-of-distribution data. In order to overcome overestimation bias, ensemble methods for Q-learning have been investigated to exploit the diversity of multiple Q-functions. Since network initialization has been the predominant approach to promote diversity in Q-functions, heuristically designed diversity injection methods have been studied in the literature. However, previous studies have not attempted to approach guaranteed independence over an ensemble from a theoretical perspective. By introducing a novel regularization loss for Q-ensemble independence based on random matrix theory, we propose spiked Wishart Q-ensemble independence regularization (SPQR) for reinforcement learning. Specifically, we modify the intractable hypothesis testing criterion for the Q-ensemble independence into a tractable KL divergence between the spectral distribution of the Q-ensemble and the target Wigner's semicircle distribution. We implement SPQR in several online and offline ensemble Q-learning algorithms. In the experiments, SPQR outperforms the baseline algorithms in both online and offline RL benchmarks.
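
For reference, the two standard objects named in this abstract can be written out as follows. This is the textbook form of Wigner's semicircle density with scale parameter $\sigma$ and of the KL divergence between an empirical spectral density $p$ and that target; the direction of the divergence and the normalization actually used by SPQR are not stated in the abstract, so both are assumptions here.

```latex
\rho_{\mathrm{sc}}(x) = \frac{1}{2\pi\sigma^{2}} \sqrt{4\sigma^{2} - x^{2}},
\qquad |x| \le 2\sigma,

D_{\mathrm{KL}}\!\left(p \,\|\, \rho_{\mathrm{sc}}\right)
  = \int_{-2\sigma}^{2\sigma} p(x) \, \log \frac{p(x)}{\rho_{\mathrm{sc}}(x)} \, dx .
```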

preprint, 2022, arXiv

Large Loss Matters in Weakly Supervised Multi-Label Classification

The weakly supervised multi-label classification (WSML) task, which is to learn a multi-label classifier from partially observed labels per image, is becoming increasingly important because of the high cost of full annotation. In this work, we first regard unobserved labels as negative labels, casting the WSML task as noisy multi-label classification. From this point of view, we empirically observe that the memorization effect, which was first discovered in a noisy multi-class setting, also occurs in a multi-label setting. That is, the model first learns the representation of clean labels and then starts memorizing noisy labels. Based on this finding, we propose novel methods for WSML which reject or correct the large-loss samples to prevent the model from memorizing the noisy labels. Without heavy and complex components, our proposed methods outperform previous state-of-the-art WSML methods on several partial-label settings, including the Pascal VOC 2012, MS COCO, NUS-WIDE, CUB, and OpenImages V3 datasets. Various analyses also show that our methodology works well, validating that treating large loss properly matters in weakly supervised multi-label classification. Our code is available at https://github.com/snucml/LargeLossMatters.
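
The rejection idea in this abstract can be sketched in a few lines. This is a simplified illustration, not the authors' implementation (which is in the repository linked above): it assumes binary cross-entropy as the per-label loss, marks unobserved labels with `None`, and uses a hypothetical fixed `reject_frac` for the share of assumed-negative labels to drop.

```python
import math
from typing import List, Optional


def bce(prob: float, target: int) -> float:
    """Binary cross-entropy for one label, clamped for numerical safety."""
    eps = 1e-12
    p = min(max(prob, eps), 1.0 - eps)
    return -math.log(p) if target == 1 else -math.log(1.0 - p)


def losses_with_rejection(probs: List[float],
                          labels: List[Optional[int]],
                          reject_frac: float) -> List[float]:
    """Per-label losses with unobserved labels (None) assumed negative.

    The largest-loss fraction of the assumed negatives is zeroed out, so
    those likely-mislabeled entries do not contribute to the total loss.
    """
    losses = [bce(p, 0 if y is None else y) for p, y in zip(probs, labels)]
    unobserved = [i for i, y in enumerate(labels) if y is None]
    n_reject = int(len(unobserved) * reject_frac)
    ranked = sorted(unobserved, key=lambda i: losses[i], reverse=True)
    for i in ranked[:n_reject]:
        losses[i] = 0.0
    return losses


# One image, four classes: class 0 is observed positive, the rest unobserved.
probs = [0.9, 0.1, 0.95, 0.2]
labels = [1, None, None, None]
out = losses_with_rejection(probs, labels, reject_frac=0.34)
```

Here the model is confident that class 2 is present even though its label is unobserved, so treating it as negative yields the largest loss and that entry is the one rejected.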

preprint, 2020, arXiv

Information-Theoretic Privacy in Federated Submodel learning

We consider information-theoretic privacy in federated submodel learning, where a global server has multiple submodels. Compared to the privacy considered in conventional federated submodel learning, where secure aggregation is adopted to ensure privacy, information-theoretic privacy provides stronger protection of the submodel selection made by the local machine. We propose an achievable scheme that partially adopts the conventional private information retrieval (PIR) scheme that achieves the minimum amount of download. With respect to computation and communication overhead, we compare the achievable scheme with a naive approach for federated submodel learning with information-theoretic privacy.

preprint, 2020, arXiv

REST: Performance Improvement of a Black Box Model via RL-based Spatial Transformation

In recent years, deep neural networks (DNNs) have become a highly active area of research and have shown remarkable achievements on a variety of computer vision tasks. DNNs, however, are known to often make overconfident yet incorrect predictions on out-of-distribution samples, which can be a major obstacle to real-world deployment because the training dataset is always limited compared to diverse real-world samples. Thus, it is fundamental to provide guarantees of robustness to the distribution shift between training and test time when we construct DNN models in practice. Moreover, in many cases, deep learning models are deployed as black boxes whose performance has already been optimized for a training dataset, so changing the black box itself can lead to performance degradation. Here we study robustness to geometric transformations in the specific setting where a black-box image classifier is given. We propose an additional learner, the REinforcement Spatial Transform learner (REST), that transforms the warped input data into samples regarded as in-distribution by the black-box model. Our work aims to improve robustness by adding a REST module in front of any black box and training only the REST module, without retraining the original black-box model in an end-to-end manner; that is, we try to convert real-world data into the training distribution for which the black-box model performs best. We use a confidence score obtained from the black-box model to determine whether the transformed input is drawn from the in-distribution. We empirically show that our method has an advantage in generalization to geometric transformations and in sample efficiency.

preprint, 2016, arXiv

Linear Degrees of Freedom for $K$-user MISO Interference Channels with Blind Interference Alignment

In this paper, we characterize the degrees of freedom (DoF) for $K$-user $M \times 1$ multiple-input single-output interference channels with reconfigurable antennas which have multiple preset modes at the receivers, assuming linear coding strategies in the absence of channel state information at the transmitters, i.e., blind interference alignment. Our linear DoF converse builds on the lemma that if a set of transmit symbols is aligned at their common unintended receivers, those symbols must have independent signal subspaces at their corresponding receivers. This lemma arises from the inherent feature that the channel state's changing patterns on the links towards the same receiver are always identical, assuming that the coherence time of the channel is long enough. We derive an upper bound for the linear sum DoF and propose an achievable scheme that exactly achieves the linear sum DoF upper bound when both $\frac{n^{*}}{M}=R_{1}$ and $\frac{MK}{n^{*}}=R_{2}$ are integers. For the other cases, where either $R_{1}$ or $R_{2}$ is not an integer, we only give some guidelines on how the interfering signals are aligned at the receivers to achieve the upper bound. As an extension, we also show the linear sum DoF upper bound for downlink/uplink cellular networks.

preprint, 2016, arXiv

Topological Interference Management with Reconfigurable Antennas

We study the symmetric degrees-of-freedom (DoF) of partially connected interference networks under linear coding strategies at transmitters without channel state information beyond topology. We assume that the receivers are equipped with reconfigurable antennas that can switch among their preset modes. In such a network setting, we characterize the class of network topologies in which half linear symmetric DoF is achievable. Moreover, we derive a general upper bound on the linear symmetric DoF for arbitrary network topologies. We also show that this upper bound is tight if the transmitters have at most two co-interferers.

preprint, 2015, arXiv

Grouping Based Blind Interference Alignment for $K$-user MISO Interference Channels

We propose a blind interference alignment (BIA) scheme based on staggered antenna switching that does not rely on the ideal channel assumption. Contrary to the ideal assumption that channels remain constant during the BIA symbol extension period, when the coherence time of the channel is relatively short, channel coefficients may change during a given symbol extension period. To perform BIA perfectly under a realistic channel assumption, we propose a grouping-based supersymbol structure for $K$-user interference channels that can adjust the supersymbol length to a given coherence time. It is proved that the supersymbol length can be reduced significantly by an appropriate grouping. Furthermore, it is also shown that the grouping-based supersymbol achieves higher degrees of freedom than the conventional method for a given coherence time.

preprint, 2015, arXiv

Retrospective Interference Alignment for Two-Cell Uplink MIMO Cellular Networks with Delayed CSIT

In this paper, we propose a new retrospective interference alignment scheme for two-cell multiple-input multiple-output (MIMO) interfering multiple access channels (IMAC) with delayed channel state information at the transmitters (CSIT). It is shown that having delayed CSIT can strictly increase the sum-DoF compared to the case of no CSIT. The key idea is to align multiple interfering signals from adjacent cells onto a low-dimensional subspace over time by fully exploiting the previously received signals as side information with outdated CSIT in a distributed manner. Remarkably, we show that retrospective interference alignment can achieve the optimal sum-DoF in the two-cell two-user scenario by providing a new outer bound.
