Source author record

Shuai Lu

Shuai Lu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.AP, math.ST, Software Engineering, Statistics Theory, Computation and Language, Computer Vision, math.NA, Numerical Analysis, Artificial Intelligence, Computer Science and Game Theory, cond-mat.mtrl-sci, cond-mat.supr-con, math.OC, Programming Languages

Catalog footprint

What is connected

14 works

14 topics

4 close collaborators

preprint, 2026, arXiv

A Systematic Post-Train Framework for Video Generation

While large-scale video diffusion models have demonstrated impressive capabilities in generating high-resolution and semantically rich content, a significant gap remains between their pretraining performance and real-world deployment requirements due to critical issues such as prompt sensitivity, temporal inconsistency, and prohibitive inference costs. To bridge this gap, we propose a comprehensive post-training framework that systematically aligns pretrained models with user intentions through four synergistic stages: we first employ Supervised Fine-Tuning (SFT) to transform the base model into a stable instruction-following policy, followed by a Reinforcement Learning from Human Feedback (RLHF) stage that utilizes a novel Group Relative Policy Optimization (GRPO) method tailored for video diffusion to enhance perceptual quality and temporal coherence; subsequently, we integrate Prompt Enhancement via a specialized language model to refine user inputs, and finally address system efficiency through Inference Optimization. Together, these components provide a systematic approach to improving visual quality, temporal coherence, and instruction following, while preserving the controllability learned during pretraining. The result is a practical blueprint for building scalable post-training pipelines that are stable, adaptable, and effective in real-world deployment. Extensive experiments demonstrate that this unified pipeline effectively mitigates common artifacts and significantly improves controllability and visual aesthetics while adhering to strict sampling cost constraints.
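As a rough illustration of the group-relative idea behind GRPO, the sketch below standardizes the rewards of several samples generated for the same prompt. It shows only the generic advantage computation; the video-diffusion-specific variant mentioned in the abstract is not described here, and the function name, the `eps` constant, and the reward values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples for the same prompt.

    Hypothetical helper: each advantage is the reward minus the group mean,
    divided by the group standard deviation (plus eps to avoid dividing by 0).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Made-up reward scores for four videos sampled from one prompt.
    print(group_relative_advantages([0.2, 0.5, 0.9, 0.4]))
```

Samples that score above the group mean receive a positive advantage and the rest a negative one, so no separate value network is needed to form a baseline.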

preprint, 2026, arXiv

ExtraVAR: Stage-Aware RoPE Remapping for Resolution Extrapolation in Visual Autoregressive Models

Visual Autoregressive (VAR) models have emerged as a strong alternative to diffusion for image synthesis, yet their fixed training resolution prevents direct generation at higher resolutions. Naively transferring training-free extrapolation methods from LLMs or diffusion models to VAR yields three characteristic failure modes: global repetition, local repetition, and detail degradation. We trace them to a unified band-stage mismatch: VAR generates images in a coarse-to-fine, scale-wise process where each stage is driven by a distinct dominant RoPE frequency band, and each failure mode emerges when the dominant band of a particular stage is disrupted. Building on this insight, we propose Stage-Aware RoPE Remapping, a training-free strategy that assigns each frequency band a stage-specific remapping rule, jointly suppressing all three failure modes. We further observe that attention becomes systematically dispersed as the image resolution increases. Existing methods typically depend on predefined attention scaling factors, which are neither adaptive to the target resolution nor capable of faithfully capturing the actual extent of attention dispersion. We therefore propose Entropy-Driven Adaptive Attention Calibration, which quantifies dispersion via a resolution-invariant normalized entropy and yields a closed-form per-head scaling factor that realigns the extrapolated-resolution attention entropy with its training-resolution counterpart. Extensive experiments show that our method consistently outperforms prior resolution-extrapolation methods in both structural coherence and fine-detail fidelity. Our code is available at https://github.com/feihongyan1/ExtraVAR.
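To make the notion of a normalized attention entropy concrete, here is a minimal sketch that divides the Shannon entropy of one attention row by the logarithm of the number of keys, so that uniform attention scores 1 at any sequence length. This particular normalization, the helper names, and the example logits are assumptions; the closed-form per-head scaling factor from the abstract is not reproduced.

```python
import math


def softmax(logits, scale=1.0):
    """Softmax over scaled logits, shifted by the max for numerical stability."""
    scaled = [x * scale for x in logits]
    top = max(scaled)
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(N); assumed normalization, in [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(n)


if __name__ == "__main__":
    logits = [0.0, 1.0, 2.0, 3.0]  # made-up attention logits for one query
    for scale in (1.0, 2.0):
        print(scale, normalized_entropy(softmax(logits, scale)))
```

Raising the logit scale sharpens the distribution and lowers the normalized entropy, which is the lever an entropy-matching calibration would adjust.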

preprint, 2023, arXiv

Increasing stability of a linearized inverse boundary value problem for a nonlinear Schrödinger equation on transversally anisotropic manifolds

We consider the problem of recovering a nonlinear potential function in a nonlinear Schrödinger equation on transversally anisotropic manifolds from the linearized Dirichlet-to-Neumann map at a large wavenumber. By calibrating the complex geometric optics (CGO) solutions according to the wavenumber, we prove the increasing stability of recovering the coefficient of a cubic term as the wavenumber becomes large.

preprint, 2022, arXiv

Discrimination-Based Double Auction for Maximizing Social Welfare in the Electricity and Heating Market Considering Privacy Preservation

This paper proposes a double-sided auction mechanism with price discrimination for social welfare (SW) maximization in the electricity and heating market. In this mechanism, energy service providers (ESPs) submit offers and load aggregators (LAs) submit bids to an energy trading center (ETC) to maximize their utility; in turn, the selfless ETC as an auctioneer leverages discriminatory price weights to regulate the behaviors of ESPs and LAs, which combines the individual benefits of each stakeholder with the overall social welfare to achieve the global optimum. Nash games are employed to describe the interactions between players with the same market role. Theoretically, we first prove the existence and uniqueness of the Nash equilibrium; then, considering the requirement of game players to preserve privacy, a distributed algorithm based on the alternating direction method of multipliers is developed to implement distributed bidding, and an analytical target cascading algorithm is applied to reach the balance of demand and supply. We validated the proposed mechanism using case studies on a city-level distribution system. The results indicated that the achieved SW improved by 4%-15% compared with other mechanisms, and also verified the effectiveness of the distributed algorithm.

preprint, 2022, arXiv

Increasing stability in the linearized inverse Schrödinger potential problem with power type nonlinearities

We consider increasing stability in the inverse Schrödinger potential problem with power type nonlinearities at a large wavenumber. Two linearization approaches, with respect to small boundary data and small potential function, are proposed and their performance on the inverse Schrödinger potential problem is investigated. It can be observed that higher order linearization for small boundary data can provide an increasing stability for an arbitrary power type nonlinearity term if the wavenumber is chosen large. Meanwhile, linearization with respect to the potential function leads to increasing stability for a quadratic nonlinearity term, which highlights the advantage of nonlinearity in solving the inverse Schrödinger potential problem. Noticing that both linearization approaches can be numerically approximated, we provide several reconstruction algorithms for the quadratic and general power type nonlinearity terms, where one of these algorithms is designed based on boundary measurements of multiple wavenumbers. Several numerical examples shed light on the efficiency of our proposed algorithms.

preprint, 2022, arXiv

Learning to Recommend Method Names with Global Context

In programming, the names of program entities, especially methods, are the most intuitive cue for understanding the functionality of the code. To ensure the readability and maintainability of programs, methods should be named properly. Specifically, the names should be meaningful and consistent with other names used in related contexts in their codebase. In recent years, many automated approaches have been proposed to suggest consistent names for methods, among which neural machine translation (NMT) based models are widely used and have achieved state-of-the-art results. However, these NMT-based models mainly focus on extracting code-specific features from the method body or the surrounding methods; the project-specific context and documentation of the target method are ignored. We conduct a statistical analysis to explore the relationship between the method names and their contexts. Based on the statistical results, we propose GTNM, a Global Transformer-based Neural Model for method name suggestion, which considers the local context, the project-specific context, and the documentation of the method simultaneously. Experimental results on Java methods show that our model can outperform the state-of-the-art results by a large margin on method name suggestion, demonstrating the effectiveness of our proposed model.

preprint, 2022, arXiv

On a Dynamic Variant of the Iteratively Regularized Gauss-Newton Method with Sequential Data

For numerous parameter and state estimation problems, assimilating new data as they become available can help produce accurate and fast inference of unknown quantities. While most existing algorithms for solving this kind of ill-posed inverse problem can only be used with a single instance of the observed data, in this work we propose a new framework that enables existing algorithms to invert multiple instances of data in a sequential fashion. Specifically, we work with the well-known iteratively regularized Gauss-Newton method (IRGNM), a variational methodology for solving nonlinear inverse problems. We develop a theory of convergence analysis for a proposed dynamic IRGNM algorithm in the presence of Gaussian white noise. We combine this algorithm with the classical IRGNM to deliver a practical (hybrid) algorithm that can invert data sequentially while producing fast estimates. Our work includes the proof of well-definedness of the proposed iterative scheme, as well as various error bounds that rely on standard assumptions for nonlinear inverse problems. We use several numerical experiments to verify our theoretical findings, and to highlight the benefits of incorporating sequential data. The context of the numerical experiments comprises various parameter identification problems including a Darcy flow elliptic PDE example, and that of electrical impedance tomography.

preprint, 2022, arXiv

ReACC: A Retrieval-Augmented Code Completion Framework

Code completion, which aims to predict the following code token(s) according to the code context, can improve the productivity of software development. Recent work has shown that statistical language modeling with transformers can greatly improve performance on the code completion task by learning from large-scale source code datasets. However, current approaches focus only on code context within the file or project, i.e. internal context. Our distinction is utilizing "external" context, inspired by the human behavior of copying from related code snippets when writing code. Specifically, we propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval. We adopt a stage-wise training approach that combines a source code retriever and an auto-regressive language model for programming languages. We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.

preprint, 2022, arXiv

UniXcoder: Unified Cross-Modal Pre-training for Code Representation

Pre-trained models for programming languages have recently demonstrated great success on code intelligence. To support both code-related understanding and generation tasks, recent works attempt to pre-train unified encoder-decoder models. However, such an encoder-decoder framework is sub-optimal for auto-regressive tasks, especially code completion, which requires a decoder-only manner for efficient inference. In this paper, we present UniXcoder, a unified cross-modal pre-trained model for programming languages. The model utilizes mask attention matrices with prefix adapters to control the behavior of the model and leverages cross-modal contents like AST and code comment to enhance code representation. To encode AST that is represented as a tree in parallel, we propose a one-to-one mapping method to transform AST into a sequence structure that retains all structural information from the tree. Furthermore, we propose to utilize multi-modal contents to learn representations of code fragments with contrastive learning, and then align representations among programming languages using a cross-modal generation task. We evaluate UniXcoder on five code-related tasks over nine datasets. To further evaluate the performance of code fragment representation, we also construct a dataset for a new task, called zero-shot code-to-code search. Results show that our model achieves state-of-the-art performance on most tasks, and analysis reveals that comment and AST can both enhance UniXcoder.

preprint, 2020, arXiv

A linearised inverse conductivity problem for the Maxwell system at a high frequency

We consider a linearised inverse conductivity problem for electromagnetic waves in a three dimensional bounded domain at a high time-harmonic frequency. Increasing stability bounds for the conductivity coefficient in the full Maxwell system and in a simplified transverse electric mode are derived. These bounds contain a Lipschitz term with a factor growing polynomially in terms of the frequency, a Hölder term, and a logarithmic term which decays with respect to the frequency as a power. To validate this increasing stability numerically, we propose a reconstruction algorithm aiming at the recovery of sufficiently many Fourier modes of the conductivity. Numerical evidence sheds light on the influence of the growing frequency and confirms the improved resolution at higher frequencies.

preprint, 2020, arXiv

Direct Observation of One-Dimensional Peierls-Type Charge Density Wave in Twin Boundaries of Monolayer MoTe$_2$

One-dimensional (1D) metallic mirror-twin boundaries (MTBs) in monolayer transition metal dichalcogenides (TMDCs) exhibit a periodic charge modulation and provide an ideal platform for exploring collective electron behavior in the confined system. The underlying mechanism of the charge modulation and how the electrons travel in 1D structures remain controversial. Here, for the first time, we observed atomic-scale structures of the charge distribution within one period in MTB of monolayer MoTe$_2$ by using scanning tunneling microscopy/spectroscopy (STM/STS). The coexisting apparent periodic lattice distortions and U-shaped energy gap clearly demonstrate a Peierls-type charge density wave (CDW). Equidistant quantized energy levels with varied periodicity are further discovered outside the CDW gap along the metallic MTB. Density functional theory (DFT) calculations are in good agreement with the gapped electronic structures and reveal they originate mainly from Mo 4d orbital. Our work presents hallmark evidence of the 1D Peierls-type CDW on the metallic MTBs and offers opportunities to study the underlying physics of 1D charge modulation.

preprint, 2020, arXiv

On the asymptotical regularization for linear inverse problems in presence of white noise

We interpret steady linear statistical inverse problems as artificial dynamic systems with white noise and introduce a stochastic differential equation (SDE) system where the inverse of the ending time $T$ naturally plays the role of the squared noise level. The time-continuous framework then allows us to apply classical methods from data assimilation, namely the Kalman-Bucy filter and 3DVAR, and to analyze their behaviour as a regularization method for the original problem. Such treatment offers some connections to the famous asymptotical regularization method, which has not yet been analyzed in the context of random noise. We derive error bounds for both methods in terms of the mean-squared error under standard assumptions and discuss commonalities and differences between both approaches. If an additional tuning parameter $\alpha$ for the initial covariance is chosen appropriately in terms of the ending time $T$, one of the proposed methods gains order optimality. Our results extend theoretical findings in the discrete setting given in the recent paper Iglesias et al. (2017). Numerical examples confirm our theoretical results.

preprint, 2018, arXiv

Linearized inverse Schrödinger potential problem at a large wavenumber

We investigate recovery of the (Schrödinger) potential function from many boundary measurements at a large wavenumber. By considering such a linearized form, we obtain a Hölder type stability which is a big improvement over a logarithmic stability in low wavenumbers. Furthermore we extend the discussion to the linearized inverse Schrödinger potential problem with attenuation, where an exponential dependence of the attenuation constant is traced in the stability estimate. Based on the linearized problem, a reconstruction algorithm is proposed aiming at the recovery of the Fourier modes of the potential function. By choosing the large wavenumber appropriately, we verify the efficiency of the proposed algorithm by several numerical examples.

preprint, 2015, arXiv

Filter Based Methods For Statistical Linear Inverse Problems

Ill-posed inverse problems are ubiquitous in applications. Understanding of algorithms for their solution has been greatly enhanced by a deep understanding of the linear inverse problem. In the applied communities ensemble-based filtering methods have recently been used to solve inverse problems by introducing an artificial dynamical system. This opens up the possibility of using a range of other filtering methods, such as 3DVAR and Kalman based methods, to solve inverse problems, again by introducing an artificial dynamical system. The aim of this paper is to analyze such methods in the context of the ill-posed linear inverse problem. Statistical linear inverse problems are studied in the sense that the observational noise is assumed to be derived via realization of a Gaussian random variable. We investigate the asymptotic behavior of filter based methods for these inverse problems. Rigorous convergence rates are established for 3DVAR and for the Kalman filters, including minimax rates in some instances. Blowup of 3DVAR and a variant of its basic form is also presented, and optimality of the Kalman filter is discussed. These analyses reveal a close connection between (iterative) regularization schemes in deterministic inverse problems and filter based methods in data assimilation. Numerical experiments are presented to illustrate the theory.
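The link between filtering and regularization can be sketched in a toy setting. Assuming a diagonal forward operator with entries `a[k]`, a static artificial dynamic `u_{n+1} = u_n`, and the same data `y` reused at every step, the Kalman mean after `n` steps coincides with a Tikhonov-type estimate whose regularization parameter shrinks like `1/n`. The function names, the diagonal simplification, and all numerical values below are illustrative assumptions, not the setup analyzed in the paper.

```python
def kalman_diagonal(a, y, m0, c0, gamma, n_steps):
    """Kalman updates for the static model u_{n+1} = u_n, y = a*u + noise.

    Each diagonal mode k is filtered independently; gamma is the
    observational noise variance and (m0, c0) the prior mean and variance.
    """
    m, c = list(m0), list(c0)
    for _ in range(n_steps):
        for k in range(len(a)):
            gain = c[k] * a[k] / (a[k] * a[k] * c[k] + gamma)
            m[k] = m[k] + gain * (y[k] - a[k] * m[k])
            c[k] = (1.0 - gain * a[k]) * c[k]
    return m, c


def tikhonov_diagonal(a, y, m0, c0, gamma, n_steps):
    """Closed-form Tikhonov-type estimate with alpha = gamma / (n * c0)."""
    estimate = []
    for k in range(len(a)):
        alpha = gamma / (n_steps * c0[k])
        estimate.append((a[k] * y[k] + alpha * m0[k]) / (a[k] * a[k] + alpha))
    return estimate


if __name__ == "__main__":
    a = [1.0, 0.5, 0.25]    # decaying singular values (made up)
    y = [1.0, -1.0, 0.125]  # noise-free data for the truth [1, -2, 0.5]
    prior_mean, prior_var = [0.0] * 3, [1.0] * 3
    print(kalman_diagonal(a, y, prior_mean, prior_var, 0.01, 50)[0])
    print(tikhonov_diagonal(a, y, prior_mean, prior_var, 0.01, 50))
```

In this sketch the iteration count plays the role of an inverse regularization parameter, which is one simple way to see why filter-based methods behave like iterative regularization schemes.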
