Researcher profile

Yong Cao

Yong Cao contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
4 topics
4 close collaborators

Published work

4 published items

preprint, 2026, arXiv

FrankenMotion: Part-level Human Motion Generation and Composition

Human motion generation from text prompts has made remarkable progress in recent years. However, existing methods primarily rely on either sequence-level or action-level descriptions due to the absence of fine-grained, part-level motion annotations. This limits their controllability over individual body parts. In this work, we construct a high-quality motion dataset with atomic, temporally-aware part-level text annotations, leveraging the reasoning capabilities of large language models (LLMs). Unlike prior datasets that either provide synchronized part captions with fixed time segments or rely solely on global sequence labels, our dataset captures asynchronous and semantically distinct part movements at fine temporal resolution. Based on this dataset, we introduce a diffusion-based part-aware motion generation framework, namely FrankenMotion, where each body part is guided by its own temporally-structured textual prompt. This is, to our knowledge, the first work to provide atomic, temporally-aware part-level motion annotations and have a model that allows motion generation with both spatial (body part) and temporal (atomic action) control. Experiments demonstrate that FrankenMotion outperforms all previous baseline models adapted and retrained for our setting, and our model can compose motions unseen during training. Our code and dataset will be publicly available upon publication.

preprint, 2024, arXiv

MLPs Compass: What is learned when MLPs are combined with PLMs?

While Transformer-based pre-trained language models (PLMs) and their variants exhibit strong semantic representation capabilities, the information gain derived from the additional components of PLMs remains an open question in this field. Motivated by recent efforts showing that multilayer perceptron (MLP) modules achieve robust structural capture capabilities, even outperforming graph neural networks (GNNs), this paper aims to quantify whether simple MLPs can further enhance the already potent ability of PLMs to capture linguistic information. Specifically, we design a simple yet effective probing framework containing MLP components based on the BERT structure and conduct extensive experiments encompassing 10 probing tasks spanning three distinct linguistic levels. The experimental results demonstrate that MLPs can indeed enhance the comprehension of linguistic structure by PLMs. Our research provides interpretable and valuable insights into crafting PLM variants that use MLPs for tasks that emphasize diverse linguistic structures.

preprint, 2022, arXiv

Reynolds number effects on the bistable flows over a wavy circular cylinder

The wake of a wavy cylinder has been shown to exhibit bistability. Depending on the initial condition, the final state of the wake can develop into either a steady flow (state I) or periodic shedding (state II). In this paper, we perform direct numerical simulations to reveal the Reynolds number effects on these two wake states. With increasing Reynolds number, the steady vortical structures in the state I wake sway back and forth in the spanwise direction, resulting in low-frequency fluctuations in drag forces, but not in lift. For state II, the increase in Reynolds number is associated with the emergence of another spectral peak in the lift coefficient. The secondary frequency is associated with highly three-dimensional vortical structures in the wake. For both states, the wakes transition to turbulent flows at higher Reynolds numbers, with the development of small-scale vortices. We further study streamwise gust flows over the wavy cylinder. The time-varying inflow velocity results in a wide range of instantaneous Reynolds numbers, spanning from the absolutely unstable flow regime to the bistable regime. Depending on the period of the inflow velocity variation, the wake perturbations that grow in the absolutely unstable flow regime can be damped out in the state I wake, or grow large enough to trigger the transition to state II, resulting in loss of flow control efficacy. The above analyses reveal novel flow physics of the bistable states at unexplored Reynolds numbers, and showcase the complex transition behavior between the two states in unsteady flows. The insights gained from this study improve the understanding of the wake dynamics of the wavy cylinder.

preprint, 2020, arXiv

Deep Active Learning for Remote Sensing Object Detection

Recently, CNN object detectors have achieved high accuracy on remote sensing images but require huge labor and time costs for annotation. In this paper, we propose a new uncertainty-based active learning method that selects more informative images for annotation, so that the detector can still reach high performance with a fraction of the training images. Our method not only analyzes objects' classification uncertainty to find the least confident objects but also considers their regression uncertainty to identify outliers. In addition, we introduce two extra weights to overcome two difficulties in remote sensing datasets: class imbalance and variation in the number of objects per image. We evaluate our active learning algorithm on the DOTA dataset with CenterNet as the object detector. We achieve performance on par with full supervision using only half of the images. We even surpass full supervision with 55% of the images and augmented weights on the least confident images.
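
The classification-uncertainty idea in the abstract above can be illustrated with a generic least-confidence selection step. This is a minimal sketch, not the paper's method: it ignores regression uncertainty and the two extra weights. The function names, the image names, the scores and the choice to give images without detections an uncertainty of 0.0 are all assumptions made for illustration.

```python
from typing import Dict, List


def image_uncertainty(detection_scores: List[float]) -> float:
    """Least-confidence score for one image.

    Uses 1 minus the confidence of the least confident detection.
    Assumption for this sketch: an image with no detections scores 0.0.
    """
    if not detection_scores:
        return 0.0
    return 1.0 - min(detection_scores)


def select_for_annotation(pool: Dict[str, List[float]], budget: int) -> List[str]:
    """Return the `budget` most uncertain images from the unlabeled pool.

    Ties are broken by image name so the result is deterministic.
    """
    ranked = sorted(pool, key=lambda name: (-image_uncertainty(pool[name]), name))
    return ranked[:budget]


# Hypothetical detector confidences per unlabeled image.
pool = {
    "img_a": [0.95, 0.90],
    "img_b": [0.55, 0.99],
    "img_c": [0.70],
    "img_d": [],
}
selected = select_for_annotation(pool, budget=2)
print(selected)
```

In a full active learning loop, the selected images would be annotated, added to the training set, and the detector retrained before the next selection round.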