Source author record

Yong Cao

Yong Cao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Computation and Language, Graphics, Machine Learning, physics.flu-dyn

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

preprint · 2026 · arXiv

FrankenMotion: Part-level Human Motion Generation and Composition

Human motion generation from text prompts has made remarkable progress in recent years. However, existing methods primarily rely on either sequence-level or action-level descriptions due to the absence of fine-grained, part-level motion annotations. This limits their controllability over individual body parts. In this work, we construct a high-quality motion dataset with atomic, temporally-aware part-level text annotations, leveraging the reasoning capabilities of large language models (LLMs). Unlike prior datasets that either provide synchronized part captions with fixed time segments or rely solely on global sequence labels, our dataset captures asynchronous and semantically distinct part movements at fine temporal resolution. Based on this dataset, we introduce a diffusion-based part-aware motion generation framework, namely FrankenMotion, where each body part is guided by its own temporally-structured textual prompt. This is, to our knowledge, the first work to provide atomic, temporally-aware part-level motion annotations and have a model that allows motion generation with both spatial (body part) and temporal (atomic action) control. Experiments demonstrate that FrankenMotion outperforms all previous baseline models adapted and retrained for our setting, and our model can compose motions unseen during training. Our code and dataset will be publicly available upon publication.

preprint · 2024 · arXiv

MLPs Compass: What is learned when MLPs are combined with PLMs?

While Transformer-based pre-trained language models (PLMs) and their variants exhibit strong semantic representation capabilities, how to understand the information gain derived from the additional components of PLMs remains an open question in this field. Motivated by recent work showing that multilayer perceptron (MLP) modules achieve robust structural capture capabilities, even outperforming Graph Neural Networks (GNNs), this paper aims to quantify whether simple MLPs can further enhance the already potent ability of PLMs to capture linguistic information. Specifically, we design a simple yet effective probing framework containing MLP components based on the BERT structure and conduct extensive experiments encompassing 10 probing tasks spanning three distinct linguistic levels. The experimental results demonstrate that MLPs can indeed enhance the comprehension of linguistic structure by PLMs. Our research provides interpretable and valuable insights into crafting variations of PLMs utilizing MLPs for tasks that emphasize diverse linguistic structures.

preprint · 2022 · arXiv

Reynolds number effects on the bistable flows over a wavy circular cylinder

The wake of a wavy cylinder has been shown to exhibit bistability. Depending on the initial condition, the final state of the wake can develop into either a steady flow (state I) or periodic shedding (state II). In this paper, we perform direct numerical simulations to reveal the Reynolds number effects on these two wake states. With increasing Reynolds number, the steady vortical structures in the state I wake sway back and forth in the spanwise direction, resulting in low-frequency fluctuations in drag forces, but not in lift. For state II, the increase in Reynolds number is associated with the emergence of another spectral peak in the lift coefficient. The secondary frequency is associated with highly three-dimensional vortical structures in the wake. For both states, the wakes transition to turbulent flows at higher Reynolds numbers, with the development of small-scale vortices. We further study streamwise gust flows over the wavy cylinder. The time-varying inflow velocity results in a wide range of instantaneous Reynolds numbers, spanning from the absolutely unstable flow regime to the bistable regime. Depending on the period of the inflow velocity variation, the wake perturbations that grow in the absolutely unstable flow regime can either be damped out in the state I wake or grow large enough to trigger the transition to state II, resulting in a loss of flow control efficacy. The above analyses reveal novel flow physics of the bistable states at unexplored Reynolds numbers and showcase the complex transition behavior between the two states in unsteady flows. The insights gained from this study improve the understanding of the wake dynamics of the wavy cylinder.
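
To make the notion of an instantaneous Reynolds number concrete, here is a minimal sketch in Python. It uses the standard definition Re = U D / ν applied to a time-varying inflow velocity. The sinusoidal gust profile, the function name, and all parameter names and values are illustrative assumptions; the abstract does not state the form of the inflow variation or the length scale used.

```python
import math


def instantaneous_reynolds(t, u_mean, amplitude, period, diameter, nu):
    """Reynolds number at time t for a hypothetical sinusoidal streamwise gust.

    Assumed inflow (not taken from the paper):
        U(t) = u_mean * (1 + amplitude * sin(2*pi*t / period))
    Standard definition: Re(t) = U(t) * diameter / nu
    """
    u = u_mean * (1.0 + amplitude * math.sin(2.0 * math.pi * t / period))
    return u * diameter / nu


# Sample one gust period to see the range of Re that the wake experiences.
samples = [
    instantaneous_reynolds(t=k * 0.25, u_mean=1.0, amplitude=0.5,
                           period=1.0, diameter=1.0, nu=0.01)
    for k in range(5)
]
print([round(re) for re in samples])
```

With these assumed values the Reynolds number oscillates around its mean of 100; whether such a range crosses a regime boundary depends on thresholds that the abstract does not give.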

preprint · 2020 · arXiv

Deep Active Learning for Remote Sensing Object Detection

Recently, CNN object detectors have achieved high accuracy on remote sensing images but require huge labor and time costs for annotation. In this paper, we propose a new uncertainty-based active learning method that selects the most informative images for annotation, so that the detector can still reach high performance with a fraction of the training images. Our method not only analyzes objects' classification uncertainty to find the least confident objects but also considers their regression uncertainty to flag outliers. In addition, we introduce two extra weights to overcome two difficulties in remote sensing datasets: class imbalance and variation in the number of objects per image. We evaluate our active learning algorithm on the DOTA dataset with CenterNet as the object detector. We achieve the same level of performance as full supervision with only half of the images. We even surpass full supervision with 55% of the images and augmented weights on the least confident images.
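
The general idea of least-confidence selection can be sketched as follows. This is a generic illustration, not the paper's method: the scoring formula, the per-class weights, the averaging over detections, and every name below are assumptions, and the regression-uncertainty term mentioned in the abstract is omitted entirely.

```python
from typing import Dict, List, Tuple

# One detection: (predicted class label, classifier confidence in [0, 1]).
Detection = Tuple[str, float]


def image_uncertainty(detections: List[Detection],
                      class_weight: Dict[str, float]) -> float:
    """Mean class-weighted least-confidence score for one image.

    Each detection contributes weight * (1 - confidence). Averaging over
    detections keeps images with many objects from dominating (a hypothetical
    stand-in for an object-count weight).
    """
    if not detections:
        return 0.0
    total = sum(class_weight.get(label, 1.0) * (1.0 - conf)
                for label, conf in detections)
    return total / len(detections)


def select_for_annotation(pool: Dict[str, List[Detection]],
                          class_weight: Dict[str, float],
                          budget: int) -> List[str]:
    """Return the `budget` image names with the highest uncertainty score."""
    ranked = sorted(pool,
                    key=lambda name: image_uncertainty(pool[name], class_weight),
                    reverse=True)
    return ranked[:budget]


# Hypothetical unlabeled pool; "ship" is treated as a rare class and upweighted.
pool = {
    "img_a": [("plane", 0.9)],
    "img_b": [("ship", 0.5), ("ship", 0.6)],
    "img_c": [],
}
chosen = select_for_annotation(pool, class_weight={"ship": 2.0}, budget=2)
print(chosen)
```

In an active learning loop, the chosen images would be sent for annotation, added to the training set, and the detector retrained before the next round of scoring.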

preprint · 2012 · arXiv

Efficient and Effective Volume Visualization with Enhanced Isosurface Rendering

Compared with full volume rendering, isosurface rendering has several well-recognized advantages in efficiency and accuracy. However, standard isosurface rendering has some limitations in effectiveness. First, it uses a monotone-colored approach and can only visualize the geometric features of an isosurface. Its inability to illustrate the material properties and the internal structures behind an isosurface has been a major limitation of this method in applications. Another limitation of isosurface rendering is the difficulty of revealing physically meaningful structures that are hidden in one or multiple isosurfaces. As such, the application requirements of extracting and recombining structures of interest cannot be met effectively with isosurface rendering. In this work, we develop an enhanced isosurface rendering technique to improve the effectiveness while maintaining the performance efficiency of standard isosurface rendering. First, an isosurface color enhancement method is proposed to illustrate the neighborhood density and to reveal some of the internal structures. Second, we extend the structure extraction capability of isosurface rendering by enabling explicit scene exploration within a 3D view, using surface peeling, voxel selection, isosurface segmentation, and multi-surface-structure visualization. Our experiments show that the color enhancement not only improves the visual fidelity of the rendering, but also reveals the internal structures without a significant increase in computational cost. Explicit scene exploration is also demonstrated to be a powerful tool in some application scenarios, such as displaying multiple abdominal organs.