Source author record

Sangwoo Park

Sangwoo Park appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Information Theory, math.IT, Machine Learning, Artificial Intelligence, eess.SP, Robotics, Computation and Language, Neurons and Cognition

Catalog footprint

What is connected

15 works

8 topics

4 close collaborators

preprint 2026 arXiv

Adaptive Selection of LoRA Components in Privacy-Preserving Federated Learning

Differentially private federated fine-tuning of large models with LoRA suffers from aggregation error caused by LoRA's multiplicative structure, which is further amplified by DP noise and degrades both stability and accuracy. Existing remedies apply a single update mode uniformly across all layers and all communication rounds (or alternate them on a fixed schedule), ignoring both the structural asymmetry between the two LoRA factors and the round-wise dynamics of training. We propose AS-LoRA, an adaptive framework defined by three axes: (i) layer-wise freedom, in which each layer independently selects its active component, (ii) round-wise adaptivity, in which the selection updates over communication rounds, and (iii) a curvature-aware score derived from a second-order approximation of the loss. Theoretically, AS-LoRA eliminates the reconstruction-error floor of layer-tied schedules, accelerates convergence, implicitly biases solutions toward flatter minima, and incurs no additional privacy cost. Across GLUE, SQuAD, CIFAR-100, and Tiny-ImageNet under strict DP budgets and non-IID partitions, AS-LoRA improves over the federated LoRA baselines by up to $+7.5$ pp on GLUE and, for example, $+12.5$ pp on MNLI-mm, while matching or exceeding SVD-based aggregation methods at $33\text{--}180\times$ lower aggregation cost and with negligible communication overhead. Code for the proposed method is available at https://anonymous.4open.science/r/as_lora-F75F/.
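The aggregation error mentioned in this abstract can be illustrated with a small numerical sketch. It is not the AS-LoRA method: the shapes, the random factors, and the variable names are assumptions chosen for illustration. It shows only that averaging the two LoRA factors separately differs from averaging their products, and that the mismatch disappears when one factor is shared by all clients.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r, n_clients = 8, 6, 2, 4  # hypothetical layer and rank sizes

# Hypothetical per-client LoRA factors; client i's update is B[i] @ A[i].
B = rng.normal(size=(n_clients, d, r))
A = rng.normal(size=(n_clients, r, k))

# Ideal aggregate: average of the full low-rank products.
ideal = np.mean(B @ A, axis=0)

# Naive aggregate: average each factor separately, then multiply.
naive = B.mean(axis=0) @ A.mean(axis=0)
aggregation_error = np.linalg.norm(ideal - naive)

# If every client uses the same A (only B is active), the product is
# linear in B and the two aggregates coincide.
A_shared = np.broadcast_to(A[0], A.shape)
ideal_frozen = np.mean(B @ A_shared, axis=0)
naive_frozen = B.mean(axis=0) @ A[0]
frozen_error = np.linalg.norm(ideal_frozen - naive_frozen)

print(f"both factors active: error = {aggregation_error:.3f}")
print(f"one factor shared:   error = {frozen_error:.2e}")
```

Keeping one factor fixed removes the mismatch because the update becomes linear in the remaining factor. This is why the choice of active component per layer and per round matters in this setting.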

preprint 2026 arXiv

Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR

Reinforcement learning with verifiable rewards (RLVR) has emerged as a scalable paradigm for improving the reasoning capabilities of large language models. However, its effectiveness is fundamentally limited by exploration: the policy can only improve on trajectories it has already sampled. While increasing the number of rollouts alleviates this issue, such brute-force scaling is computationally expensive, and existing approaches that modify the optimization objective provide limited control over what is explored. In this work, we propose NudgeRL, a framework for structured and diversity-driven exploration in RLVR. Our approach introduces Strategy Nudging, which conditions each rollout on lightweight, strategy-level contexts to induce diverse reasoning trajectories without relying on expensive oracle supervision. To effectively learn from such structured exploration, we further propose a unified objective, which decomposes the reward signal into inter- and intra-context components and incorporates a distillation objective to transfer discovered behaviors back to the base policy. Empirically, NudgeRL outperforms standard GRPO with up to 8 times larger rollout budgets, while outperforming an oracle-guided RL baseline on average across five challenging math benchmarks. These results demonstrate that structured, context-driven exploration can serve as an efficient and scalable alternative to both brute-force rollout scaling and feasibility-oriented methods based on privileged information. Our code is available at https://github.com/tally0818/NudgeRL.
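The abstract does not spell out how the reward signal is split into inter- and intra-context components. One plausible reading, shown here only as an assumption and not as the NudgeRL objective, is a mean-centering decomposition: each rollout's centered reward is the sum of a between-context term and a within-context term. The function name `decompose_rewards` and the toy reward matrix are hypothetical.

```python
import numpy as np

def decompose_rewards(rewards):
    """Split centered rewards into between- and within-context parts.

    rewards: array of shape (n_contexts, rollouts_per_context), assuming
    the same number of rollouts for every strategy context.
    """
    rewards = np.asarray(rewards, dtype=float)
    global_mean = rewards.mean()
    context_mean = rewards.mean(axis=1, keepdims=True)
    # Inter-context: how much better a context is than the overall average.
    inter = np.broadcast_to(context_mean - global_mean, rewards.shape)
    # Intra-context: how much better a rollout is than its own context.
    intra = rewards - context_mean
    return inter, intra

# Toy example: two contexts, three rollouts each, binary verifiable rewards.
rewards = np.array([[1.0, 0.0, 1.0],
                    [0.0, 0.0, 0.0]])
inter, intra = decompose_rewards(rewards)
print(inter)
print(intra)
```

By construction the two parts add up to the globally centered reward, and the intra-context part sums to zero within each context.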

preprint 2026 arXiv

PREPING: Building Agent Memory without Tasks

Agent memory is typically constructed either offline from curated demonstrations or online from post-deployment interactions. However, regardless of how it is built, an agent faces a cold-start gap when first introduced to a new environment without any task-specific experience available. In this paper, we study pre-task memory construction: whether an agent can build procedural memory before observing any target-environment tasks, using only self-generated synthetic practice. Yet, synthetic interaction alone is insufficient, as without controlling what to practice and what to store, synthetic tasks become redundant, infeasible, and ultimately uninformative, and memory further degrades quickly due to unfiltered trajectories. To overcome this, we present Preping, a proposer-guided memory construction framework. At its core is proposer memory, a structured control state that shapes future practice. A Proposer generates synthetic tasks conditioned on this state, a Solver executes them, and a Validator determines which trajectories are eligible for memory insertion while also providing feedback to guide future proposals. Experiments on AppWorld, BFCL v3, and MCP-Universe show that Preping substantially improves over a no-memory baseline and achieves performance competitive with strong playbook-based methods built from offline or online experience, with deployment cost $2.99\times$ lower on AppWorld and $2.23\times$ lower on BFCL v3 than online memory construction. Further analyses reveal that the main benefit does not come from synthetic volume alone, but from proposer-side control over feasibility, redundancy, and coverage, combined with selective memory updates.

preprint 2022 arXiv

Adaptive Semi-Supervised Intent Inferral to Control a Powered Hand Orthosis for Stroke

In order to provide therapy in a functional context, controls for wearable robotic orthoses need to be robust and intuitive. We have previously introduced an intuitive, user-driven, EMG-based method to operate a robotic hand orthosis, but the process of training a control that is robust to concept drift (changes in the input signal) places a substantial burden on the user. In this paper, we explore semi-supervised learning as a paradigm for controlling a powered hand orthosis for stroke subjects. To the best of our knowledge, this is the first use of semi-supervised learning for an orthotic application. Specifically, we propose a disagreement-based semi-supervision algorithm for handling intrasession concept drift based on multimodal ipsilateral sensing. We evaluate the performance of our algorithm on data collected from five stroke subjects. Our results show that the proposed algorithm helps the device adapt to intrasession drift using unlabeled data and reduces the training burden placed on the user. We also validate the feasibility of our proposed algorithm with a functional task; in these experiments, two subjects successfully completed multiple instances of a pick-and-handover task.

preprint 2022 arXiv

Bayesian Active Meta-Learning for Few Pilot Demodulation and Equalization

Two of the main principles underlying the life cycle of an artificial intelligence (AI) module in communication networks are adaptation and monitoring. Adaptation refers to the need to adjust the operation of an AI module depending on the current conditions, while monitoring requires measures of the reliability of an AI module's decisions. Classical frequentist learning methods for the design of AI modules fall short on both counts of adaptation and monitoring, catering to one-off training and providing overconfident decisions. This paper proposes a solution to address both challenges by integrating meta-learning with Bayesian learning. As a specific use case, the problems of demodulation and equalization over a fading channel based on the availability of few pilots are studied. Meta-learning processes pilot information from multiple frames in order to extract useful shared properties of effective demodulators across frames. The resulting trained demodulators are demonstrated, via experiments, to offer better calibrated soft decisions, at the computational cost of running an ensemble of networks at run time. The capacity to quantify uncertainty in the model parameter space is further leveraged by extending Bayesian meta-learning to an active setting. In it, the designer can select in a sequential fashion channel conditions under which to generate data for meta-learning from a channel simulator. Bayesian active meta-learning is seen in experiments to significantly reduce the number of frames required to obtain an efficient adaptation procedure for new frames.

preprint 2022 arXiv

Design of Spiral-Cable Forearm Exoskeleton to Assist Supination for Hemiparetic Stroke Subjects

We present the development of a cable-based passive forearm exoskeleton that is designed to assist supination for hemiparetic stroke survivors. Our device uniquely provides torque sufficient for counteracting spasticity within a below-elbow apparatus. The mechanism consists of a spiral single-tendon routing embedded in a rigid forearm brace and terminated at the hand and upper-forearm. A spool with an internal releasable-ratchet mechanism allows the user to manually retract the tendon and rotate the hand to counteract involuntary pronation synergies due to stroke. We characterize the mechanism with benchtop testing and five healthy subjects, and perform a preliminary assessment of the exoskeleton with a single chronic stroke subject having minimal supination ability. The mechanism can be integrated into an existing active hand-opening orthosis to enable supination support during grasping tasks, and also allows for a future actuated supination strategy.

preprint 2022 arXiv

Predicting Flat-Fading Channels via Meta-Learned Closed-Form Linear Filters and Equilibrium Propagation

Predicting fading channels is a classical problem with a vast array of applications, including as an enabler of artificial intelligence (AI)-based proactive resource allocation for cellular networks. Under the assumption that the fading channel follows a stationary complex Gaussian process, as for Rayleigh and Rician fading models, the optimal predictor is linear, and it can be directly computed from the Doppler spectrum via standard linear minimum mean squared error (LMMSE) estimation. However, in practice, the Doppler spectrum is unknown, and the predictor has only access to a limited time series of estimated channels. This paper proposes to leverage meta-learning in order to mitigate the requirements in terms of training data for channel fading prediction. Specifically, it first develops an offline low-complexity solution based on linear filtering via a meta-trained quadratic regularization. Then, an online method is proposed based on gradient descent and equilibrium propagation (EP). Numerical results demonstrate the advantages of the proposed approach, showing its capacity to approach the genie-aided LMMSE solution with a small number of training data points.
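The genie-aided LMMSE baseline mentioned in this abstract can be sketched for a known Doppler spectrum. This is a minimal sketch, not the paper's meta-learned predictor. It assumes a unit-power channel with the Jakes autocorrelation $J_0(2\pi f_D \tau)$, a one-step prediction horizon, and white estimation noise; the function names and parameter values are hypothetical.

```python
import numpy as np

def jakes_autocorr(lags, doppler_norm):
    """Autocorrelation J0(2*pi*fD*lag), assuming the Jakes Doppler spectrum.

    J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) d(theta) is evaluated
    with a midpoint rule so that only NumPy is needed.
    """
    theta = (np.arange(2000) + 0.5) * np.pi / 2000
    x = 2 * np.pi * doppler_norm * np.asarray(lags, dtype=float)
    return np.cos(np.outer(x, np.sin(theta))).mean(axis=1)

def lmmse_predictor(n_taps, doppler_norm, noise_var=0.0):
    """One-step linear predictor from the n_taps most recent noisy samples."""
    lags = np.arange(n_taps)
    r = jakes_autocorr(lags, doppler_norm)
    # Covariance of the observed past samples (Toeplitz plus noise).
    R = r[np.abs(lags[:, None] - lags[None, :])] + noise_var * np.eye(n_taps)
    # Cross-correlation between the next sample and each past sample.
    p = jakes_autocorr(lags + 1, doppler_norm)
    w = np.linalg.solve(R, p)   # Wiener-Hopf solution
    mse = 1.0 - p @ w           # prediction MSE for a unit-power channel
    return w, mse

w, mse = lmmse_predictor(n_taps=4, doppler_norm=0.05, noise_var=0.01)
_, mse8 = lmmse_predictor(n_taps=8, doppler_norm=0.05, noise_var=0.01)
print("filter taps:", np.round(w, 3))
print(f"MSE with 4 taps: {mse:.4f}, with 8 taps: {mse8:.4f}")
```

In practice the autocorrelation is unknown and must be learned from a limited time series of estimated channels. That is the gap the abstract's meta-learning approach targets.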

preprint 2022 arXiv

Predicting Multi-Antenna Frequency-Selective Channels via Meta-Learned Linear Filters based on Long-Short Term Channel Decomposition

An efficient data-driven prediction strategy for multi-antenna frequency-selective channels must operate based on a small number of pilot symbols. This paper proposes novel channel prediction algorithms that address this goal by integrating transfer and meta-learning with a reduced-rank parametrization of the channel. The proposed methods optimize linear predictors by utilizing data from previous frames, which are generally characterized by distinct propagation characteristics, in order to enable fast training on the time slots of the current frame. The proposed predictors rely on a novel long-short-term decomposition (LSTD) of the linear prediction model that leverages the disaggregation of the channel into long-term space-time signatures and fading amplitudes. We first develop predictors for single-antenna frequency-flat channels based on transfer/meta-learned quadratic regularization. Then, we introduce transfer and meta-learning algorithms for LSTD-based prediction models that build on equilibrium propagation (EP) and alternating least squares (ALS). Numerical results under the 3GPP 5G standard channel model demonstrate the impact of transfer and meta-learning on reducing the number of pilots for channel prediction, as well as the merits of the proposed LSTD parametrization.

preprint 2022 arXiv

Robust Bayesian Learning for Reliable Wireless AI: Framework and Applications

This work takes a critical look at the application of conventional machine learning methods to wireless communication problems through the lens of reliability and robustness. Deep learning techniques adopt a frequentist framework, and are known to provide poorly calibrated decisions that do not reproduce the true uncertainty caused by limitations in the size of the training data. Bayesian learning, while in principle capable of addressing this shortcoming, is in practice impaired by model misspecification and by the presence of outliers. Both problems are pervasive in wireless communication settings, in which the capacity of machine learning models is subject to resource constraints and training data is affected by noise and interference. In this context, we explore the application of the framework of robust Bayesian learning. After a tutorial-style introduction to robust Bayesian learning, we showcase the merits of robust Bayesian learning on several important wireless communication problems in terms of accuracy, calibration, and robustness to outliers and misspecification.

preprint 2022 arXiv

Thumb Stabilization and Assistance in a Robotic Hand Orthosis for Post-Stroke Hemiparesis

We propose a dual-cable method of stabilizing the thumb in the context of a hand orthosis designed for individuals with upper extremity hemiparesis after stroke. This cable network adds opposition/reposition capabilities to the thumb, and increases the likelihood of forming a hand pose that can successfully manipulate objects. In addition to a passive-thumb version (where both cables are of fixed length), our approach also allows for a single-actuator active-thumb version (where the extension cable is actuated while the abductor remains passive), which allows a range of motion intended to facilitate creating and maintaining grasps. We performed experiments with five chronic stroke survivors consisting of unimanual resistive-pull tasks and bimanual twisting tasks with simulated real-world objects; these explored the effects of thumb assistance on grasp stability and functional range of motion. Our results show that both active- and passive-thumb versions achieved similar performance in terms of improving grasp force generation over a no-device baseline, but active thumb stabilization enabled users to maintain grasps for longer durations.

preprint 2020 arXiv

User-Driven Functional Movement Training with a Wearable Hand Robot after Stroke

We studied the performance of a robotic orthosis designed to assist the paretic hand after stroke. It is wearable and fully user-controlled, serving two possible roles: as a therapeutic tool that facilitates device-mediated hand exercises to recover neuromuscular function or as an assistive device for use in everyday activities to aid functional use of the hand. We present the clinical outcomes of a pilot study designed as a feasibility test for these hypotheses. Eleven chronic stroke (> 2 years) patients with moderate muscle tone (Modified Ashworth Scale less than or equal to 2 in upper extremity) engaged in a month-long training protocol using the orthosis. Individuals were evaluated using standardized outcome measures, both with and without orthosis assistance. Fugl-Meyer post-intervention scores without robotic assistance showed improvement focused specifically at the distal joints of the upper limb, suggesting the use of the orthosis as a rehabilitative device for the hand. Action Research Arm Test scores post-intervention with robotic assistance showed that the device may serve an assistive role in grasping tasks. These results highlight the potential for wearable and user-driven robotic hand orthoses to extend the use and training of the affected upper limb after stroke.

preprint 2016 arXiv

A Unifying Variational Perspective on Some Fundamental Information Theoretic Inequalities

This paper proposes a unifying variational approach for proving and extending some fundamental information theoretic inequalities. Fundamental information theory results such as maximization of differential entropy, minimization of Fisher information (Cramér-Rao inequality), worst additive noise lemma, entropy power inequality (EPI), and extremal entropy inequality (EEI) are interpreted as functional problems and proved within the framework of calculus of variations. Several applications and possible extensions of the proposed results are briefly mentioned.
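As a textbook illustration of the variational viewpoint (not the paper's own proof), the maximization of differential entropy under a second-moment constraint can be written as a functional problem. The scalar setting, the constraint $\int x^2 f\,dx = \sigma^2$, and the multipliers $\lambda_0, \lambda_1$ are assumptions made for this sketch.

```latex
\begin{aligned}
&\max_{f}\; h(f) = -\int f(x)\,\log f(x)\,dx
\quad\text{s.t.}\quad \int f(x)\,dx = 1,\;\; \int x^{2} f(x)\,dx = \sigma^{2},\\
&\mathcal{L}[f] = -\int f\log f\,dx
 + \lambda_{0}\Big(\int f\,dx - 1\Big)
 + \lambda_{1}\Big(\int x^{2} f\,dx - \sigma^{2}\Big),\\
&\frac{\delta\mathcal{L}}{\delta f}
 = -\log f(x) - 1 + \lambda_{0} + \lambda_{1}x^{2} = 0
 \;\Longrightarrow\; f(x) = e^{\lambda_{0}-1+\lambda_{1}x^{2}},\\
&\text{constraints} \;\Longrightarrow\; \lambda_{1} = -\tfrac{1}{2\sigma^{2}},\quad
 f(x) = \tfrac{1}{\sqrt{2\pi\sigma^{2}}}\,e^{-x^{2}/(2\sigma^{2})},\quad
 h(f) = \tfrac{1}{2}\log\!\big(2\pi e\,\sigma^{2}\big).
\end{aligned}
```

Because $h$ is concave in $f$ and the constraints are linear, the stationary point is the maximizer, which recovers the Gaussian as the maximum-entropy density.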

preprint 2012 arXiv

An Alternative Proof of an Extremal Entropy Inequality

This paper first focuses on deriving an alternative approach for proving an extremal entropy inequality (EEI), originally presented in [11]. The proposed approach does not rely on the channel enhancement technique, and has the advantage that it yields an explicit description of the optimal solution as opposed to the implicit approach of [11]. Compared with the proofs in [11], the proposed alternative proof is also simpler, more direct, more information-theoretic, and has the additional advantage that it offers a new perspective for establishing novel as well as known challenging results such as the capacity of the vector Gaussian broadcast channel, the lower bound of the achievable rate for distributed source coding with a single quadratic distortion constraint, and the secrecy capacity of the Gaussian wire-tap channel. The second part of this paper is devoted to some novel applications of the proposed mathematical results. The proposed mathematical techniques are further exploited to obtain a simpler proof of the EEI without using the entropy power inequality (EPI), to build the optimal solution for a special class of broadcasting channels with private messages, and to obtain a mutual information-based performance bound for the mean square-error of a linear Bayesian estimator of a Gaussian source embedded in an additive noise channel.

preprint 2012 arXiv

Gaussian Assumption: the Least Favorable but the Most Useful

This paper focuses on three contributions. First, a connection between the result proposed by Stoica and Babu and the recent information theoretic results, the worst additive noise lemma and the isoperimetric inequality for entropies, is illustrated. Second, information theoretic and estimation theoretic justifications for the fact that the Gaussian assumption leads to the largest Cramér-Rao lower bound (CRLB) are presented. Third, a slight extension of this result to the more general framework of correlated observations is shown.

preprint 2012 arXiv

On the equivalence between Stein and de Bruijn identities

This paper focuses on proving the equivalence between Stein's identity and de Bruijn's identity. Given some conditions, we prove that Stein's identity is equivalent to de Bruijn's identity. In addition, some extensions of de Bruijn's identity are presented. For arbitrary but fixed input and noise distributions, there exist relations between the first derivative of the differential entropy and the posterior mean. Moreover, the second derivative of the differential entropy is related to the Fisher information for arbitrary input and noise distributions. Several applications are presented to support the usefulness of the developed results in this paper.
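For reference, the two identities named in this abstract are commonly stated as follows in their standard scalar forms. These are the textbook versions, given under the usual regularity assumptions, and may differ from the exact conditions and generalizations used in the paper. Here $h$ denotes differential entropy and $J$ Fisher information.

```latex
\begin{aligned}
&\text{Stein: } Z\sim\mathcal{N}(\mu,\sigma^{2})
 \;\Longrightarrow\;
 \mathbb{E}\big[(Z-\mu)\,g(Z)\big] = \sigma^{2}\,\mathbb{E}\big[g'(Z)\big],\\
&\text{de Bruijn: } Y_t = X + \sqrt{t}\,Z,\;\; Z\sim\mathcal{N}(0,1)\ \text{independent of } X
 \;\Longrightarrow\;
 \frac{d}{dt}\,h(Y_t) = \frac{1}{2}\,J(Y_t),\\
&J(Y) = \int \frac{\big(p_Y'(y)\big)^{2}}{p_Y(y)}\,dy .
\end{aligned}
```

A quick consistency check: if $X\sim\mathcal{N}(0,s)$, then $h(Y_t)=\tfrac{1}{2}\log\big(2\pi e\,(s+t)\big)$, whose derivative $\tfrac{1}{2(s+t)}$ equals $\tfrac{1}{2}J(Y_t)$ since $J(Y_t)=\tfrac{1}{s+t}$.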

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint