Researcher profile

Yue Yu

Yue Yu contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
8 works
0 followers
10 topics
4 close collaborators

Published work

8 published items

Preprint, 2026, arXiv

AdaptEval: A Benchmark for Evaluating Large Language Models on Code Snippet Adaptation

Recent advancements in large language models (LLMs) have automated various software engineering tasks, with benchmarks emerging to evaluate their capabilities. However, for adaptation, a critical activity during code reuse, there is no benchmark to assess LLMs' performance, leaving their practical utility in this area unclear. To fill this gap, we propose AdaptEval, a benchmark designed to evaluate LLMs on code snippet adaptation. Unlike existing benchmarks, AdaptEval incorporates the following three distinctive features: First, Practical Context. Tasks in AdaptEval are derived from developers' practices, preserving rich contextual information from Stack Overflow and GitHub communities. Second, Multi-granularity Annotation. Each task is annotated with requirements at both task and adaptation levels, supporting the evaluation of LLMs across diverse adaptation scenarios. Third, Fine-grained Evaluation. AdaptEval includes a two-tier testing framework combining adaptation-level and function-level tests, which enables evaluating LLMs' performance across various individual adaptations. Based on AdaptEval, we conduct the first empirical study to evaluate six instruction-tuned LLMs and especially three reasoning LLMs on code snippet adaptation. Experimental results demonstrate that AdaptEval enables the assessment of LLMs' adaptation capabilities from various perspectives. It also provides critical insights into their current limitations, particularly their struggle to follow explicit instructions. We hope AdaptEval can facilitate further investigation and enhancement of LLMs' capabilities in code snippet adaptation, supporting their real-world applications.

Preprint, 2026, arXiv

Coding in a Bubble? Evaluating LLMs in Resolving Context Adaptation Bugs During Code Adaptation

Code adaptation is a fundamental but challenging task in software development, requiring developers to modify existing code for new contexts. A key challenge is to resolve Context Adaptation Bugs (CtxBugs), which occur when code correct in its original context violates constraints in the target environment. Unlike isolated bugs, CtxBugs cannot be resolved through local fixes and require cross-context reasoning to identify semantic mismatches. Overlooking them may lead to critical failures in adaptation. Although Large Language Models (LLMs) show great potential in automating code-related tasks, their ability to resolve CtxBugs remains a significant and unexplored obstacle to their practical use in code adaptation. To bridge this gap, we propose CtxBugGen, a novel framework for generating CtxBugs to evaluate LLMs. Its core idea is to leverage LLMs' tendency to generate plausible but context-free code when contextual constraints are absent. The framework generates CtxBugs through a four-step process to ensure their relevance and validity: (1) Adaptation Task Selection, (2) Task-specific Perturbation, (3) LLM-based Variant Generation, and (4) CtxBugs Identification. Based on the benchmark constructed by CtxBugGen, we conduct an empirical study with four state-of-the-art LLMs. Our results reveal their unsatisfactory performance in CtxBug resolution. The best performing LLM, Kimi-K2, achieves 55.93% on Pass@1 and resolves just 52.47% of CtxBugs. The presence of CtxBugs degrades LLMs' adaptation performance by up to 30%. Failure analysis indicates that LLMs often overlook CtxBugs and replicate them in their outputs. Our study highlights a critical weakness in LLMs' cross-context reasoning and emphasizes the need for new methods to enhance their context awareness for reliable code adaptation.
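The abstract reports Pass@1 but does not say how it was computed. As a rough illustration only, the sketch below uses the unbiased pass@k estimator that is common in code-generation evaluation, 1 - C(n-c, k) / C(n, k), where n samples are drawn per task and c of them pass the tests. The function names and the per-task (n, c) input format are assumptions made for this example, not details taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n samples, c of which pass.

    Assumes the commonly used unbiased estimator; the paper's exact
    procedure is not stated in the abstract.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; results holds hypothetical (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to the fraction of passing samples, c / n, averaged over tasks.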

Preprint, 2026, arXiv

From Synthetic to Real: Toward Identity-Consistent Makeup Transfer with Synthetic and Real Data

Makeup transfer aims to apply the makeup style of a reference portrait to a source portrait while preserving identity and background. Early methods formulate this task as unsupervised image-to-image translation, relying on surrogate objectives and often yielding limited performance. Recent diffusion- and flow-based approaches instead exploit synthetic data for supervised training, leading to significant improvements. However, these methods still face two critical challenges: synthetic supervision frequently fails to faithfully preserve identity, and the domain gap between synthetic and real data limits generalization, resulting in degraded performance in complex real-world scenarios. To address these issues, this paper first proposes ConsistentBeauty, a novel data curation pipeline that ensures makeup fidelity and strict identity consistency within the synthesized data. Second, we propose RealBeauty, a synthetic-to-real post-training framework. Beyond supervised learning on curated synthetic data, we further adapt the model to real-world scenarios through reinforcement learning and design novel verifiable rewards tailored to the makeup transfer task. It allows the model to further benefit from real makeup patterns beyond synthetic supervision. In addition, we establish a new diverse benchmark for makeup transfer, covering a wide range of skin tones, ages, genders, poses, and makeup styles, thereby enabling a more comprehensive evaluation of model performance under diverse real-world conditions. Extensive experiments show that our method achieves state-of-the-art performance on multiple benchmarks and demonstrates clear advantages in identity preservation and performance on complex real-world cases.

Preprint, 2026, arXiv

MiMo-V2-Flash Technical Report

We present MiMo-V2-Flash, a Mixture-of-Experts (MoE) model with 309B total parameters and 15B active parameters, designed for fast, strong reasoning and agentic capabilities. MiMo-V2-Flash adopts a hybrid attention architecture that interleaves Sliding Window Attention (SWA) with global attention, with a 128-token sliding window under a 5:1 hybrid ratio. The model is pre-trained on 27 trillion tokens with Multi-Token Prediction (MTP), employing a native 32k context length and subsequently extended to 256k. To efficiently scale post-training compute, MiMo-V2-Flash introduces a novel Multi-Teacher On-Policy Distillation (MOPD) paradigm. In this framework, domain-specialized teachers (e.g., trained via large-scale reinforcement learning) provide dense and token-level reward, enabling the student model to perfectly master teacher expertise. MiMo-V2-Flash rivals top-tier open-weight models such as DeepSeek-V3.2 and Kimi-K2, despite using only 1/2 and 1/3 of their total parameters, respectively. During inference, by repurposing MTP as a draft model for speculative decoding, MiMo-V2-Flash achieves up to 3.6 acceptance length and 2.6x decoding speedup with three MTP layers. We open-source both the model weights and the three-layer MTP weights to foster open research and community collaboration.
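To make the hybrid attention description more concrete, here is a minimal sketch of how a 5:1 interleaving and a 128-token causal sliding window could be expressed. It rests on assumptions the abstract does not confirm: that "5:1" means five sliding-window layers followed by one global layer, and that the window counts the current token. The function names are hypothetical.

```python
from typing import Optional


def layer_pattern(num_layers: int, swa_per_global: int = 5) -> list[str]:
    """Label each layer 'swa' or 'global'.

    Assumed ordering: swa_per_global sliding-window layers, then one
    global-attention layer, repeated.
    """
    period = swa_per_global + 1
    return [
        "global" if (i + 1) % period == 0 else "swa"
        for i in range(num_layers)
    ]


def can_attend(query_pos: int, key_pos: int, window: Optional[int] = None) -> bool:
    """Causal visibility check for a single query/key pair.

    window=None models global causal attention. Otherwise the query sees
    only the most recent `window` positions, itself included (an assumption).
    """
    if key_pos > query_pos:
        return False  # causal: no attention to future tokens
    if window is None:
        return True
    return query_pos - key_pos < window
```

Under these assumptions, a query at position 200 in a sliding-window layer with `window=128` sees keys 73 through 200, while a global layer sees keys 0 through 200.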

Preprint, 2026, arXiv

Model-Agnostic and Uncertainty-Aware Dimensionality Reduction in Supervised Learning

Dimension reduction is a fundamental tool for analyzing high-dimensional data in supervised learning. Traditional methods for estimating intrinsic order often prioritize model-specific structural assumptions over predictive utility. This paper introduces predictive order determination (POD), a model-agnostic framework that determines the minimal predictively sufficient dimension by directly evaluating out-of-sample predictiveness. POD quantifies uncertainty via error bounds for over- and underestimation and achieves consistency under mild conditions. By unifying dimension reduction with predictive performance, POD applies flexibly across diverse reduction tasks and supervised learners. Simulations and real-data analyses show that POD delivers accurate, uncertainty-aware order estimates, making it a versatile component for prediction-centric pipelines.

Preprint, 2026, arXiv

Self-Creative Text-to-Object Generation using Semantic-Aware Spatial Weighting

Instilling creativity in text-to-image (T2I) generation presents a significant challenge, as it requires synthesized images to exhibit not only visual novelty and surprise, but also artistic value. Current T2I models, however, are largely optimized for literal text-image alignment with their data distribution, and their noise prediction networks constrain the generation to high-probability regions, consequently generating outputs that lack authentic creativity. To address this, we propose a Self-Creative Diffusion (SCDiff) model for meaningful T2I generations featuring two core modules: a learnable spatial weighting (LSW) module and a visual-semantic mixing loss (VSML). The LSW module designs a parametric Kaiser-Bessel window to reinforce central image features, fostering novel and surprising generation. The VSML module introduces a dual loss function: a similarity loss constrains the new image to align with its textual description, while a diversity loss maximizes its distinction from the original image, enhancing both semantic value and visual novelty. Extensive experiments demonstrate that our model substantially improves creativity, semantic alignment, and visual coherence, offering a simple yet powerful framework for generating creative objects.
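The abstract does not describe how the LSW module parameterises or applies its Kaiser-Bessel window. As a hedged illustration of the general idea, the sketch below builds a 2D weight map from the standard 1D Kaiser window, w[n] = I0(beta * sqrt(1 - r^2)) / I0(beta) with r running from -1 to 1, so that weights peak at the image centre. The separable outer-product construction and the function names are assumptions for this example; in a learnable setting beta would be a trained parameter.

```python
import math


def bessel_i0(x: float, terms: int = 50) -> float:
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total = 0.0
    term = 1.0  # k = 0 term
    for k in range(terms):
        if k > 0:
            term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def kaiser_window(length: int, beta: float) -> list[float]:
    """Standard 1D Kaiser window of the given length and shape parameter beta."""
    if length == 1:
        return [1.0]
    denom = bessel_i0(beta)
    window = []
    for n in range(length):
        r = 2.0 * n / (length - 1) - 1.0  # position mapped to [-1, 1]
        arg = beta * math.sqrt(max(0.0, 1.0 - r * r))
        window.append(bessel_i0(arg) / denom)
    return window


def spatial_weights(height: int, width: int, beta: float) -> list[list[float]]:
    """Separable 2D weight map (assumed construction): outer product of two 1D windows."""
    wy = kaiser_window(height, beta)
    wx = kaiser_window(width, beta)
    return [[a * b for b in wx] for a in wy]
```

A larger beta concentrates weight more tightly around the centre, while beta = 0 yields a uniform map.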

Preprint, 2026, arXiv

Uncooled low-noise thin-film optomechanical resonator for thermal sensing on lithium niobate

Optomechanical transduction harnesses the interaction between optical fields and mechanical motion to achieve sensitive measurement of weak mechanical quantities with inherently low noise. Lithium niobate combines low optical loss, strong piezoelectricity, high intrinsic fQ_m factor, and low thermal conductivity, making it promising for exploring optomechanical platforms targeting thermal sensing applications. Here, we developed an integrated optomechanical platform on thin-film lithium niobate with precisely engineered optical, mechanical, and thermal fields within a compact 40 μm by 40 μm footprint. The platform integrates suspended microring resonators with ultrathin central membranes, reducing mechanical stiffness and effective mass while maintaining a high optical quality factor Q_o of 1e6 and mechanical quality factor Q_m of 1117, which increases to 5.1e4 after oscillation. The design suppresses thermal dissipation into the silicon substrate and enhances thermal sensitivity, achieving a temperature coefficient of frequency of -124 ppm/K and a noise-equivalent power of 6.2 nW/sqrt(Hz) at 10 kHz at room temperature. This compact and scalable platform opens up new opportunities for high-sensitivity thermal sensing, supports heterogeneous integration with infrared absorbers for uncooled infrared detection, and enables fully integrated, all-optical on-chip readout, paving the way toward large-format, low-noise infrared sensing arrays.
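For a sense of scale, the reported temperature coefficient of frequency (TCF) can be turned into a frequency shift using the usual first-order relation, delta_f = f0 * TCF * delta_T. The sketch below assumes that linear model holds over the temperature change considered. The resonance frequency passed in is a hypothetical input, since the abstract does not state the device's mechanical frequency.

```python
def frequency_shift_hz(
    f0_hz: float,
    delta_t_kelvin: float,
    tcf_ppm_per_k: float = -124.0,
) -> float:
    """First-order frequency shift for a temperature change.

    tcf_ppm_per_k defaults to the value reported in the abstract;
    f0_hz is a hypothetical resonance frequency supplied by the caller.
    """
    return f0_hz * tcf_ppm_per_k * 1e-6 * delta_t_kelvin
```

For example, a hypothetical 1 MHz resonance would shift by about -124 Hz per kelvin under this model, and by about -0.124 Hz for a 1 mK change.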

Preprint, 2024, arXiv

Order by projection in single-band Hubbard model: a DMRG study

In a Fermi system near or at half-filling, a specific superconducting pairing channel, if not explicitly included in the Hamiltonian, can be boosted by suppressing a competing pairing channel; this is exemplified by the enhancement of extended $s$-wave correlations upon suppressing $s$-wave Cooper pairing. This phenomenon, originally found by the use of generalized uncertainty relations, is referred to as "order by projection". The case of zero on-site Coulomb interaction in the thermodynamic limit confirms this mechanism through the analytical solution. In this study, we go further and systematically investigate this mechanism for a strongly correlated fermionic Hubbard model, now with finite on-site interaction, on a square lattice with an extended set of hopping parameters. We explore the behaviors of different pairing channels when one of them is suppressed, utilizing density matrix renormalization group calculations. Our findings provide numerical evidence supporting the existence of "order by projection" in the strongly correlated system we studied. We also investigate the effect of the strength of Hubbard $U$, next-nearest neighbor $t'$, hole-doping, as well as finite-size scaling approaching the thermodynamic limit.