Source author record

Yu Shang

Yu Shang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

gr-qc, Computer Vision, Information Retrieval, Machine Learning, Neural and Evolutionary Computing, Robotics

Catalog footprint

6 works, 6 topics, 4 close collaborators

Preprint, 2026, arXiv

WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation

Aerial vision-language navigation (VLN) requires agents to follow natural-language instructions through closed-loop perception and action in 3D environments. We argue that aerial VLN can be formulated as a prediction-driven world-action problem: the agent should anticipate latent world evolution and act according to the predicted consequences. To this end, we propose WorldVLN, the first autoregressive world action model (WAM) for aerial VLN. Unlike full-sequence video-generation world models that generate an entire visual clip, WorldVLN adapts a latent autoregressive video backbone to predict short-horizon world-state transitions and directly decodes them into executable waypoint actions. After each action segment is executed, newly received observations are encoded back into the autoregressive context, enabling closed-loop world-action prediction. We further introduce a two-stage training framework that first grounds the video prior in instruction-conditioned navigation dynamics and then develops Action-aware GRPO, the first reinforcement learning method tailored to autoregressive WAMs, to optimize waypoint decisions through their downstream rollout consequences. On public outdoor and indoor benchmarks, WorldVLN consistently outperforms existing Vision-Language-Action baselines with success-rate gains of more than 12% and larger advantages on challenging cases. It further transfers zero-shot to real drone deployment, suggesting that WorldVLN offers a promising route for spatial action tasks. Demos and code are available at https://embodiedcity.github.io/WorldVLN/.
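The abstract does not spell out Action-aware GRPO, so the following is only a minimal sketch of the group-relative advantage used in standard GRPO, on which that method presumably builds. The function name `group_relative_advantages` and the reward values are hypothetical; each reward stands in for the outcome of one rollout sampled for the same instruction.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style advantage: normalise each rollout's reward
    by the mean and standard deviation of its own sampling group."""
    rewards = np.asarray(rewards, dtype=float)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Hypothetical rewards for five rollouts of one navigation instruction.
rollout_rewards = [1.0, 0.0, 0.0, 1.0, 0.5]
advantages = group_relative_advantages(rollout_rewards)
```

Rollouts that beat the group average receive a positive advantage and the rest a negative one, so no separate value network is needed to form the baseline.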

Preprint, 2021, arXiv

Genetic Meta-Structure Search for Recommendation on Heterogeneous Information Network

In the past decade, the heterogeneous information network (HIN) has become an important methodology for modern recommender systems. To fully leverage its power, manually designed network templates, i.e., meta-structures, are introduced to filter out semantic-aware information. Hand-crafted meta-structures rely on intense expert knowledge, which makes their design both laborious and data-dependent. Moreover, the number of meta-structures grows exponentially with their size and the number of node types, which prohibits brute-force search. To address these challenges, we propose Genetic Meta-Structure Search (GEMS) to automatically optimize meta-structure designs for recommendation on HINs. Specifically, GEMS adopts a parallel genetic algorithm to search for meaningful meta-structures for recommendation, and designs dedicated rules and a meta-structure predictor to efficiently explore the search space. Finally, we propose an attention-based multi-view graph convolutional network module to dynamically fuse information from different meta-structures. Extensive experiments on three real-world datasets suggest the effectiveness of GEMS, which consistently outperforms all baseline methods in HIN recommendation. Compared with a simplified GEMS that utilizes hand-crafted meta-paths, GEMS achieves a performance gain of over 6% on most evaluation metrics. More importantly, we conduct an in-depth analysis of the identified meta-structures, which sheds light on HIN-based recommender system design.
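As a rough illustration of what attention-based fusion across meta-structure views can look like, here is a minimal NumPy sketch. It is not the GEMS module: the scoring function, the `query` vector, and the array shapes are assumptions made for the example, and the per-view embeddings are random stand-ins for graph-convolution outputs.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_fuse(view_embeddings, query):
    """Fuse per-view node embeddings with per-node attention weights.

    view_embeddings: (n_views, n_nodes, dim), one slice per meta-structure.
    query: (dim,) hypothetical attention vector (learned in a real model).
    """
    scores = np.tanh(view_embeddings) @ query   # (n_views, n_nodes)
    weights = softmax(scores, axis=0)           # normalise across views
    fused = np.sum(weights[..., None] * view_embeddings, axis=0)
    return fused, weights

rng = np.random.default_rng(0)
views = rng.normal(size=(3, 5, 8))   # 3 meta-structures, 5 nodes, dim 8
query = rng.normal(size=8)
fused, weights = attention_fuse(views, query)
```

Because the weights are computed per node, different nodes can rely on different meta-structures, which is one plausible reading of "dynamically fuse" in the abstract.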

Preprint, 2012, arXiv

EMRI data analysis with a phenomenological waveform

Extreme mass ratio inspirals (EMRIs), the capture and inspiral of a compact stellar-mass object into a massive black hole (MBH), are among the most interesting sources for gravitational wave astronomy. Detecting these sources and accurately estimating the binary parameters is very challenging, primarily because of the large number of secondary maxima on the likelihood surface. Search algorithms based on matched filtering require computing the gravitational waveform hundreds of thousands of times, which is currently not feasible with the most accurate (faithful) EMRI models. Here we propose a phenomenological template family that covers a large range of the EMRI parameter space. We use these phenomenological templates to detect the signal in simulated data and then, assuming a particular EMRI model, estimate the physical parameters of the binary. We thus separate the detection problem, which is solved in a model-independent way, from parameter estimation. For the latter, we need to adopt a model for the inspiral in order to map the phenomenological parameters onto the physical parameters characterizing EMRIs.
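To make the cost argument concrete, the following toy sketch runs a matched-filter search over a one-parameter template bank. It assumes white Gaussian noise and a plain sinusoid in place of an EMRI waveform; the function names, the frequency grid, and the signal amplitude are all illustrative choices, not taken from the paper.

```python
import numpy as np

def matched_filter_snr(data, template, sigma):
    """Matched-filter SNR of `template` in `data`, assuming white
    Gaussian noise with standard deviation `sigma`."""
    norm = np.sqrt(np.dot(template, template))
    return float(np.dot(data, template) / (sigma * norm))

def best_template(data, t, freqs, sigma):
    """Grid search over a bank of sinusoid templates, one per frequency.
    Every grid point costs one waveform evaluation."""
    snrs = [matched_filter_snr(data, np.sin(2 * np.pi * f * t), sigma)
            for f in freqs]
    i = int(np.argmax(snrs))
    return float(freqs[i]), snrs[i]

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 4000, endpoint=False)
sigma = 1.0
true_f = 3.0
data = 0.5 * np.sin(2 * np.pi * true_f * t) + rng.normal(0.0, sigma, t.size)

freqs = np.arange(1.0, 5.01, 0.1)   # 41 templates for a single parameter
f_hat, snr_hat = best_template(data, t, freqs, sigma)
```

Even this one-dimensional bank needs 41 waveform evaluations. A grid over many parameters grows multiplicatively, which is why cheap phenomenological templates are attractive when the faithful waveform is expensive to compute.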

Preprint, 2010, arXiv

The Mock LISA Data Challenges: from Challenge 3 to Challenge 4

The Mock LISA Data Challenges are a program to demonstrate LISA data-analysis capabilities and to encourage their development. Each round of challenges consists of one or more datasets containing simulated instrument noise and gravitational waves from sources of undisclosed parameters. Participants analyze the datasets and report best-fit solutions for the source parameters. Here we present the results of the third challenge, issued in April 2008, which demonstrated the positive recovery of signals from chirping Galactic binaries, from spinning supermassive black-hole binaries (with optimal SNRs between ~10 and 2000), from simultaneous extreme-mass-ratio inspirals (SNRs of 10-50), from cosmic-string-cusp bursts (SNRs of 10-100), and from a relatively loud isotropic background with Omega_gw(f) ~ 10^-11, slightly below the LISA instrument noise.

Preprint, 2010, arXiv

The search for spinning black hole binaries in mock LISA data using a genetic algorithm

Coalescing massive black hole binaries are the strongest and probably the most important gravitational wave sources in the LISA band. Spin and orbital precession add complexity to the waveform and make the likelihood surface richer in structure than in the non-spinning case. We introduce an extended multimodal genetic algorithm that utilizes the properties of the signal and the detector response function to analyze the data from the third round of the Mock LISA Data Challenge (MLDC 3.2). The performance of this method is comparable to, if not better than, that of existing algorithms. We have found all five sources present in MLDC 3.2 and recovered the coalescence time, chirp mass, mass ratio and sky location with reasonable accuracy. As for the orbital angular momentum and the two spins of the black holes, we have found a large number of widely separated modes in the parameter space with similar maximum likelihood values.

Preprint, 2009, arXiv

The search for black hole binaries using a genetic algorithm

In this work we use a genetic algorithm to search for the gravitational wave signal from inspiralling massive black hole binaries in simulated LISA data. We consider a single signal in Gaussian instrumental noise. This is a first step in preparation for the analysis of the third round of the Mock LISA Data Challenge. We have extended a genetic algorithm by utilizing the properties of the signal and the detector response function. The performance of this method is comparable to, if not better than, that of existing algorithms.
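The general idea of a genetic-algorithm search on a multimodal likelihood surface can be sketched as follows. This is a plain textbook genetic algorithm (tournament selection, uniform crossover, Gaussian mutation, elitism) on a toy sinusoid in white noise, not the extended algorithm described in the abstract; all names, bounds, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def make_fitness(t, data):
    """Fitness = negative mean squared residual, which is proportional to
    the Gaussian log-likelihood up to a constant for white noise."""
    def fitness(params):
        f, phase = params
        model = np.sin(2 * np.pi * f * t + phase)
        return -float(np.mean((data - model) ** 2))
    return fitness

def genetic_search(fitness, bounds, rng, pop_size=200, generations=60,
                   mutation_scale=0.02, elite=2):
    lo, hi = np.array(bounds, dtype=float).T
    pop = rng.uniform(lo, hi, size=(pop_size, lo.size))
    history = []
    for _ in range(generations):
        scores = np.array([fitness(p) for p in pop])
        order = np.argsort(scores)[::-1]          # fittest first
        pop, scores = pop[order], scores[order]
        history.append(float(scores[0]))
        children = [pop[i].copy() for i in range(elite)]   # elitism
        while len(children) < pop_size:
            # Binary tournaments: pop is sorted, so lower index = fitter.
            a = pop[min(rng.integers(pop_size, size=2))]
            b = pop[min(rng.integers(pop_size, size=2))]
            mask = rng.random(lo.size) < 0.5      # uniform crossover
            child = np.where(mask, a, b)
            child = child + rng.normal(0.0, mutation_scale * (hi - lo))
            children.append(np.clip(child, lo, hi))
        pop = np.array(children)
    scores = np.array([fitness(p) for p in pop])
    best = int(np.argmax(scores))
    history.append(float(scores[best]))
    return pop[best], history

rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 400, endpoint=False)
data = np.sin(2 * np.pi * 3.0 * t + 1.0) + rng.normal(0.0, 0.5, t.size)
bounds = [(1.0, 5.0), (0.0, 2 * np.pi)]   # frequency (Hz), phase (rad)
best_params, history = genetic_search(make_fitness(t, data), bounds, rng)
```

Elitism copies the fittest individuals unchanged into the next generation, so the best fitness recorded in `history` never decreases. The secondary maxima of this toy surface come from the side lobes of the sinusoid overlap, a much simpler structure than the precession-induced modes mentioned for spinning binaries.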