Researcher profile

Kejun Li

Kejun Li contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2025arXiv

CLF-RL: Control Lyapunov Function Guided Reinforcement Learning

Reinforcement learning (RL) has shown promise in generating robust locomotion policies for bipedal robots, but often suffers from tedious reward design and sensitivity to poorly shaped objectives. In this work, we propose a structured reward shaping framework that leverages model-based trajectory generation and control Lyapunov functions (CLFs) to guide policy learning. We explore two model-based planners for generating reference trajectories: a reduced-order linear inverted pendulum (LIP) model for velocity-conditioned motion planning, and a precomputed gait library based on hybrid zero dynamics (HZD) using full-order dynamics. These planners define desired end-effector and joint trajectories, which are used to construct CLF-based rewards that penalize tracking error and encourage rapid convergence. This formulation provides meaningful intermediate rewards, and is straightforward to implement once a reference is available. Both the reference trajectories and CLF shaping are used only during training, resulting in a lightweight policy at deployment. We validate our method both in simulation and through extensive real-world experiments on a Unitree G1 robot. CLF-RL demonstrates significantly improved robustness relative to the baseline RL policy and better performance than a classic tracking reward RL formulation.

preprint2022arXiv

Natural Multicontact Walking for Robotic Assistive Devices via Musculoskeletal Models and Hybrid Zero Dynamics

Generating stable walking gaits that yield natural locomotion when executed on robotic-assistive devices is a challenging task that often requires hand-tuning by domain experts. This paper presents an alternative methodology, where we propose the addition of musculoskeletal models directly into the gait generation process to intuitively shape the resulting behavior. In particular, we construct a multi-domain hybrid system model that combines the system dynamics with muscle models to represent natural multicontact walking. Provably stable walking gaits can then be generated for this model via the hybrid zero dynamics (HZD) method. We experimentally apply our integrated framework towards achieving multicontact locomotion on a dual-actuated transfemoral prosthesis, AMPRO3, for two subjects. The results demonstrate that enforcing muscle model constraints produces gaits that yield natural locomotion (as analyzed via comparison to motion capture data and electromyography). Moreover, gaits generated with our framework were strongly preferred by the non-disabled prosthetic users as compared to gaits generated with the nominal HZD method, even with the use of systematic tuning methods. We conclude that the novel approach of combining robotic walking methods (specifically HZD) with muscle models successfully generates anthropomorphic robotic-assisted locomotion.

preprint2022arXiv

Nuclear spin polarization and control in a van der Waals material

Van der Waals layered materials are a focus of materials research as they support strong quantum effects and can easily form heterostructures. Electron spins in van der Waals materials played crucial roles in many recent breakthroughs, including topological insulators, two-dimensional (2D) magnets, and spin liquids. However, nuclear spins in van der Waals materials remain an unexplored quantum resource. Here we report the first demonstration of optical polarization and coherent control of nuclear spins in a van der Waals material at room temperature. We use negatively-charged boron vacancy ($V_B^-$) spin defects in hexagonal boron nitride to polarize nearby nitrogen nuclear spins. Remarkably, we observe the Rabi frequency of nuclear spins at the excited-state level anti-crossing of $V_B^-$ defects to be 350 times larger than that of an isolated nucleus, and demonstrate fast coherent control of nuclear spins. We also detect strong electron-mediated nuclear-nuclear spin coupling that is 5 orders of magnitude larger than the direct nuclear spin dipolar coupling, enabling multi-qubit operations. Nitrogen nuclear spins in a triangle lattice will be suitable for large-scale quantum simulation. Our work opens a new frontier with nuclear spins in van der Waals materials for quantum information science and technology.

preprint2022arXiv

POLAR: Preference Optimization and Learning Algorithms for Robotics

Parameter tuning for robotic systems is a time-consuming and challenging task that often relies on domain expertise of the human operator. Moreover, existing learning methods are not well suited for parameter tuning for many reasons including: the absence of a clear numerical metric for `good robotic behavior'; limited data due to the reliance on real-world experimental data; and the large search space of parameter combinations. In this work, we present an open-source MATLAB Preference Optimization and Learning Algorithms for Robotics toolbox (POLAR) for systematically exploring high-dimensional parameter spaces using human-in-the-loop preference-based learning. This aim of this toolbox is to systematically and efficiently accomplish one of two objectives: 1) to optimize robotic behaviors for human operator preference; 2) to learn the operator's underlying preference landscape to better understand the relationship between adjustable parameters and operator preference. The POLAR toolbox achieves these objectives using only subjective feedback mechanisms (pairwise preferences, coactive feedback, and ordinal labels) to infer a Bayesian posterior over the underlying reward function dictating the user's preferences. We demonstrate the performance of the toolbox in simulation and present various applications of human-in-the-loop preference-based learning.

preprint2022arXiv

Safety-Aware Preference-Based Learning for Safety-Critical Control

Bringing dynamic robots into the wild requires a tenuous balance between performance and safety. Yet controllers designed to provide robust safety guarantees often result in conservative behavior, and tuning these controllers to find the ideal trade-off between performance and safety typically requires domain expertise or a carefully constructed reward function. This work presents a design paradigm for systematically achieving behaviors that balance performance and robust safety by integrating safety-aware Preference-Based Learning (PBL) with Control Barrier Functions (CBFs). Fusing these concepts -- safety-aware learning and safety-critical control -- gives a robust means to achieve safe behaviors on complex robotic systems in practice. We demonstrate the capability of this design paradigm to achieve safe and performant perception-based autonomous operation of a quadrupedal robot both in simulation and experimentally on hardware.

preprint2021arXiv

ROIAL: Region of Interest Active Learning for Characterizing Exoskeleton Gait Preference Landscapes

Characterizing what types of exoskeleton gaits are comfortable for users, and understanding the science of walking more generally, require recovering a user's utility landscape. Learning these landscapes is challenging, as walking trajectories are defined by numerous gait parameters, data collection from human trials is expensive, and user safety and comfort must be ensured. This work proposes the Region of Interest Active Learning (ROIAL) framework, which actively learns each user's underlying utility function over a region of interest that ensures safety and comfort. ROIAL learns from ordinal and preference feedback, which are more reliable feedback mechanisms than absolute numerical scores. The algorithm's performance is evaluated both in simulation and experimentally for three non-disabled subjects walking inside of a lower-body exoskeleton. ROIAL learns Bayesian posteriors that predict each exoskeleton user's utility landscape across four exoskeleton gait parameters. The algorithm discovers both commonalities and discrepancies across users' gait preferences and identifies the gait parameters that most influenced user feedback. These results demonstrate the feasibility of recovering gait utility landscapes from limited human trials.