Source author record

Yingjie Zhu

Yingjie Zhu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Artificial Intelligence; astro-ph.SR; Computation and Language; Distributed, Parallel, and Cluster Computing; Machine Learning

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

Preprint, 2026, arXiv

ASRU: Activation Steering Meets Reinforcement Unlearning for Multimodal Large Language Models

Multimodal large language models (MLLMs) may memorize sensitive cross-modal information during pretraining, making machine unlearning (MU) crucial. Existing methods typically evaluate unlearning effectiveness based on output deviations, while overlooking the generation quality after unlearning. This can easily lead to hallucinated or rigid responses, thereby affecting the usability and safety of the unlearned model. To address this issue, we propose ASRU, a controllable multimodal unlearning framework that incorporates generation quality as a core evaluation objective. ASRU first induces initial refusal behavior through activation redirection, and then optimizes fine-grained refusal boundaries using a customized reward function, thereby achieving a better trade-off between target knowledge unlearning and model utility. Experiments on Qwen3-VL show that, on average, ASRU significantly improves unlearning effectiveness (+24.6%) and generation quality (5.8x) while effectively preserving model utility, using only a small amount of retained supervision data.

Preprint, 2026, arXiv

RelayGR: Scaling Long-Sequence Generative Recommendation via Cross-Stage Relay-Race Inference

Real-time recommender systems execute multi-stage cascades (retrieval, pre-processing, fine-grained ranking) under strict tail-latency SLOs, leaving only tens of milliseconds for ranking. Generative recommendation (GR) models can improve quality by consuming long user-behavior sequences, but in production their online sequence length is tightly capped by the ranking-stage P99 budget. We observe that the majority of GR tokens encode user behaviors that are independent of the item candidates, suggesting an opportunity to pre-infer a user-behavior prefix once and reuse it during ranking rather than recomputing it on the critical path. Realizing this idea at industrial scale is non-trivial: the prefix cache must survive across multiple pipeline stages before the final ranking instance is determined, the user population implies cache footprints far beyond a single device, and indiscriminate pre-inference would overload shared resources under high QPS. We present RelayGR, a production system that enables in-HBM relay-race inference for GR. RelayGR selectively pre-infers long-term user prefixes, keeps their KV caches resident in HBM over the request lifecycle, and ensures the subsequent ranking can consume them without remote fetches. RelayGR combines three techniques: 1) a sequence-aware trigger that admits only at-risk requests under a bounded cache footprint and pre-inference load, 2) an affinity-aware router that co-locates cache production and consumption by routing both the auxiliary pre-infer signal and the ranking request to the same instance, and 3) a memory-aware expander that uses server-local DRAM to capture short-term cross-request reuse while avoiding redundant reloads. We implement RelayGR on Huawei Ascend NPUs and evaluate it with real queries. Under a fixed P99 SLO, RelayGR supports up to 1.5× longer sequences and improves SLO-compliant throughput by up to 3.6×.
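As a rough illustration of the first two techniques named in the abstract, the sketch below shows one way a trigger and an affinity-aware router could be expressed. It is not RelayGR's implementation: the function names, the hash-based routing rule, and the thresholds `min_len` and `max_cached` are all hypothetical, chosen only to show how a pre-infer signal and a later ranking request for the same user can land on the same instance.

```python
import hashlib


def pick_instance(user_id: str, instances: list[str]) -> str:
    """Deterministically map a user to one instance (hypothetical rule).

    Because the mapping depends only on the user id, the pre-infer signal
    and the later ranking request resolve to the same instance.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return instances[int.from_bytes(digest[:8], "big") % len(instances)]


def should_pre_infer(seq_len: int, cached_users: int, *,
                     min_len: int = 2048, max_cached: int = 1000) -> bool:
    """Admit only long sequences while the cache budget has room.

    min_len and max_cached are made-up thresholds for illustration.
    """
    return seq_len >= min_len and cached_users < max_cached


if __name__ == "__main__":
    instances = ["npu-0", "npu-1", "npu-2"]  # hypothetical instance names
    if should_pre_infer(seq_len=4096, cached_users=10):
        producer = pick_instance("user-42", instances)  # pre-infer signal
        consumer = pick_instance("user-42", instances)  # ranking request
        print(producer == consumer)
```

A plain modulo hash reshuffles most users whenever the instance list changes; a production router would likely need something more stable, which this sketch does not attempt.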

Preprint, 2022, arXiv

Can we detect coronal mass ejections through asymmetries of Sun-as-a-star extreme-ultraviolet spectral line profiles?

Coronal mass ejections (CMEs) are the largest-scale eruptive phenomena in the solar system. Associated with enormous plasma ejections and energy release, CMEs have an important impact on the solar-terrestrial environment. Accurate predictions of the arrival times of CMEs at the Earth depend on precise measurements of their three-dimensional velocities, which can be achieved using simultaneous line-of-sight (LOS) and plane-of-sky (POS) observations. Besides the POS information from routine coronagraph and extreme ultraviolet (EUV) imaging observations, spectroscopic observations could unveil the physical properties of CMEs including their LOS velocities. We propose that spectral line asymmetries measured by Sun-as-a-star spectrographs can be used for routine detections of CMEs and estimations of their LOS velocities during their early propagation phases. Such observations can also provide important clues for the detection of CMEs on other solar-like stars. However, few studies have concentrated on whether we can detect CME signals and accurately diagnose CME properties through Sun-as-a-star spectral observations. In this work, we constructed a geometric CME model and derived the analytical expressions for full-disk integrated EUV line profiles during CMEs. For different CME properties and instrumental configurations, full-disk integrated line profiles were synthesized. We further evaluated the detectability and diagnostic potential of CMEs from the synthetic line profiles. Our investigations provide important constraints on the future design of Sun-as-a-star spectrographs for CME detections through EUV line asymmetries.
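To make the idea of a line-profile asymmetry concrete, the toy example below adds a weak, blueshifted Gaussian component to a symmetric Gaussian line and compares the red and blue wings. It is not the paper's geometric CME model or its analytical expressions: the two-Gaussian profile, the simplified red-blue measure, and every numerical value (widths, amplitudes, the -300 km/s shift, the 150 to 450 km/s wing window) are assumptions chosen purely for illustration. Negative velocities denote motion toward the observer.

```python
import math


def gaussian(v: float, center: float, width: float, amplitude: float) -> float:
    """Gaussian line component; width is the 1/e half-width in km/s."""
    return amplitude * math.exp(-((v - center) / width) ** 2)


def red_blue_asymmetry(velocities, intensities, v_min, v_max):
    """Red-wing minus blue-wing summed intensity, divided by the peak.

    Both wings cover the same offsets |v| in [v_min, v_max] around v = 0.
    A negative value means excess emission in the blue wing.
    """
    pairs = list(zip(velocities, intensities))
    blue = sum(i for v, i in pairs if -v_max <= v <= -v_min)
    red = sum(i for v, i in pairs if v_min <= v <= v_max)
    return (red - blue) / max(intensities)


if __name__ == "__main__":
    # Velocity grid from -600 to 600 km/s in 10 km/s steps (illustrative).
    grid = [10 * k for k in range(-60, 61)]
    # Symmetric background line from the quiet full disk.
    quiet = [gaussian(v, 0.0, 50.0, 1.0) for v in grid]
    # Add a weak component shifted by -300 km/s to mimic ejected plasma.
    with_cme = [q + gaussian(v, -300.0, 80.0, 0.05)
                for v, q in zip(grid, quiet)]
    print(red_blue_asymmetry(grid, quiet, 150, 450))
    print(red_blue_asymmetry(grid, with_cme, 150, 450))
```

In this toy setup the symmetric profile gives an asymmetry close to zero, while the profile with the shifted component gives a negative value, reflecting the blue-wing enhancement.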