Source author record

Fangzhou Yu

Fangzhou Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Robotics, eess.SP, Information Theory, math.IT

Catalog footprint

What is connected

3 works

4 topics

4 close collaborators

preprint, 2022, arXiv

Dynamic Bipedal Maneuvers through Sim-to-Real Reinforcement Learning

For legged robots to match the athletic capabilities of humans and animals, they must not only produce robust periodic walking and running, but also seamlessly switch between nominal locomotion gaits and more specialized transient maneuvers. Despite recent advances in the control of bipedal robots, there has been little focus on producing highly dynamic behaviors. Recent work using reinforcement learning to produce control policies for legged robots has demonstrated success in producing robust walking behaviors. However, these learned policies have difficulty expressing many different behaviors on a single network. Inspired by conventional optimization-based control techniques for legged robots, this work applies a recurrent policy to execute four-step, 90-degree turns, trained using reference data generated from optimized single rigid-body model trajectories. We present a novel training framework using epilogue terminal rewards for learning specific behaviors from pre-computed trajectory data and demonstrate a successful transfer to hardware on the bipedal robot Cassie.

preprint, 2022, arXiv

Optimizing Bipedal Maneuvers of Single Rigid-Body Models for Reinforcement Learning

In this work, we propose a method to generate reduced-order model reference trajectories for general classes of highly dynamic maneuvers for bipedal robots for use in sim-to-real reinforcement learning. Our approach is to utilize a single rigid-body model (SRBM) to optimize libraries of trajectories offline to be used as expert references in the reward function of a learned policy. This method translates the model's dynamically rich rotational and translational behavior to a full-order robot model and successfully transfers to real hardware. The SRBM's simplicity allows for fast iteration and refinement of behaviors, while the robustness of learning-based controllers allows for highly dynamic motions to be transferred to hardware. Within this work we introduce a set of transferability constraints that adapt the SRBM dynamics to actual bipedal robot hardware, our framework for creating optimal trajectories for a variety of highly dynamic maneuvers, and our approach to integrating reference trajectories into a high-speed running reinforcement learning policy. We validate our methods on the bipedal robot Cassie, on which we demonstrate highly dynamic grounded running gaits at up to 3.0 m/s.
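As a rough illustration of what a single rigid-body model computes, the sketch below evaluates the textbook Newton-Euler equations for one body driven by contact forces at the feet. It is not the paper's formulation: the function name, the diagonal inertia, the assumption that body axes are aligned with world axes, and all numeric values are hypothetical choices made for this example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as sequences."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def srbm_accelerations(mass, inertia_diag, com, omega, foot_positions,
                       foot_forces, gravity=(0.0, 0.0, -9.81)):
    """Linear and angular acceleration of a single rigid body.

    Assumptions (illustrative, not from the abstract): point contacts,
    a diagonal inertia tensor, and body axes aligned with world axes.
    """
    total_force = [0.0, 0.0, 0.0]
    torque = [0.0, 0.0, 0.0]
    for p, f in zip(foot_positions, foot_forces):
        # Each contact force acts on the lever arm from the COM to the foot.
        lever = [p[i] - com[i] for i in range(3)]
        tau = cross(lever, f)
        for i in range(3):
            total_force[i] += f[i]
            torque[i] += tau[i]

    # Newton: m * a = sum(f) + m * g
    lin_acc = tuple(total_force[i] / mass + gravity[i] for i in range(3))
    # Euler: I * omega_dot = torque - omega x (I * omega)
    i_omega = [inertia_diag[i] * omega[i] for i in range(3)]
    gyro = cross(omega, i_omega)
    ang_acc = tuple((torque[i] - gyro[i]) / inertia_diag[i] for i in range(3))
    return lin_acc, ang_acc


# Hypothetical values: a 30 kg body held up by one foot directly below the COM.
lin_acc, ang_acc = srbm_accelerations(
    mass=30.0,
    inertia_diag=(1.0, 1.0, 0.5),
    com=(0.0, 0.0, 1.0),
    omega=(0.0, 0.0, 0.0),
    foot_positions=[(0.0, 0.0, 0.0)],
    foot_forces=[(0.0, 0.0, 30.0 * 9.81)],
)
```

With the foot directly under the center of mass and the force balancing gravity, both accelerations come out at approximately zero; moving the foot sideways introduces a net torque and therefore a nonzero angular acceleration.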

preprint, 2021, arXiv

Heterogeneous Millimeter Wave Wireless Power Transfer With Poisson Cluster Processes

In this paper, we analyze the energy coverage performance of heterogeneous millimeter wave (mmWave) wireless power transfer (WPT) networks, where macro base stations (MBSs) are distributed according to a Poisson point process (PPP), the locations of power beacons (PBs) are modeled as a $k$-tier Poisson cluster process (PCP), and energy users (EUs) are clustered around the centers of PB clusters. Moreover, the cosine antenna gain model is adopted instead of the prevalent flat-top gain model, which is simpler to work with in derivations but less accurate. Based on the generalized exponential distribution approximation, we propose a new technique for deriving the energy coverage probability of randomly deployed mmWave WPT networks. Specifically, taking the Thomas cluster process (TCP) as an example, we derive the energy coverage probabilities under two PB association strategies, namely random PB association and nearest PB association. Through Monte Carlo simulations, our theoretical results are verified, and the impact of system parameters, such as the array antenna size, the energy threshold, and the average number of PBs in a cluster, is also investigated.
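To make the two association strategies concrete, here is a minimal Monte Carlo sketch of a Thomas cluster process: cluster centers form a Poisson point process, and each center has a Poisson number of PBs scattered around it with isotropic Gaussian offsets. The simplifications are assumptions for illustration only and are not taken from the paper: a single tier, a square window, a user placed exactly at each cluster center, and hypothetical parameter values. No antenna gain or energy model is included.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw a Poisson variate with Knuth's method (suitable for small means)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_thomas_cluster(rng, parent_density, side, mean_pbs, sigma):
    """Sample a Thomas cluster process on a side x side square.

    Returns a list of (center, pbs) pairs, where pbs holds the PB
    coordinates scattered around that center.
    """
    n_parents = sample_poisson(rng, parent_density * side * side)
    clusters = []
    for _ in range(n_parents):
        cx, cy = rng.uniform(0.0, side), rng.uniform(0.0, side)
        n_pbs = sample_poisson(rng, mean_pbs)
        pbs = [(rng.gauss(cx, sigma), rng.gauss(cy, sigma)) for _ in range(n_pbs)]
        clusters.append(((cx, cy), pbs))
    return clusters


def association_distances(rng, clusters):
    """Distance from a user at each cluster center to its serving PB.

    Returns two lists: distances under random PB association and under
    nearest PB association. Clusters without any PB are skipped.
    """
    random_d, nearest_d = [], []
    for (cx, cy), pbs in clusters:
        if not pbs:
            continue
        dists = [math.hypot(x - cx, y - cy) for x, y in pbs]
        random_d.append(rng.choice(dists))
        nearest_d.append(min(dists))
    return random_d, nearest_d


# Hypothetical parameters chosen only to keep the example small.
rng = random.Random(0)
clusters = sample_thomas_cluster(rng, parent_density=0.2, side=10.0,
                                 mean_pbs=5.0, sigma=1.0)
random_d, nearest_d = association_distances(rng, clusters)
```

In every cluster the nearest-association distance is at most the random-association distance, which is the geometric reason the two strategies yield different energy coverage probabilities.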