Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
24 works
0 followers
17 topics
4 close collaborators

Published work

24 published items

preprint · 2026 · arXiv

Controllable Flow Matching for Online Reinforcement Learning

Model-based reinforcement learning (MBRL) typically relies on modeling environment dynamics for data efficiency. However, due to the accumulation of model errors over long-horizon rollouts, such methods often face challenges in maintaining modeling stability. To address this, we propose CtrlFlow, a trajectory-level synthetic method using conditional flow matching (CFM), which directly models the distribution of trajectories from initial states to high-return terminal states without explicitly modeling the environment transition function. Our method ensures optimal trajectory sampling by minimizing the control energy governed by the non-linear Controllability Gramian Matrix, while the generated diverse trajectory data significantly enhances the robustness and cross-task generalization of policy learning. In online settings, CtrlFlow demonstrates better performance on common MuJoCo benchmark tasks than dynamics models and achieves superior sample efficiency compared to standard MBRL methods.
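
For orientation, the generic conditional flow matching objective with a straight-line interpolation path can be written as below. This is the standard textbook form, given as a sketch only: the symbols $v_\theta$, $p_0$, $p_1$ and the linear path are assumptions introduced here, and the abstract does not say that CtrlFlow uses exactly this path or loss (its trajectory-level conditioning and control-energy term are not reproduced).

```latex
% Linear interpolation between a source sample x_0 and a data sample x_1
x_t = (1 - t)\,x_0 + t\,x_1, \qquad t \in [0, 1]

% Regress the learned velocity field v_\theta onto the path velocity x_1 - x_0
\mathcal{L}_{\mathrm{CFM}}(\theta)
  = \mathbb{E}_{t \sim \mathcal{U}[0,1],\; x_0 \sim p_0,\; x_1 \sim p_1}
    \left\| v_\theta(t, x_t) - (x_1 - x_0) \right\|^2

% Sampling: integrate the learned ODE from t = 0 to t = 1
\frac{\mathrm{d}x}{\mathrm{d}t} = v_\theta(t, x), \qquad x(0) = x_0
```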

preprint · 2026 · arXiv

Dual-Latent Collaborative Decoding for Fidelity-Perception Balanced Image Compression

Learned image compression (LIC) increasingly requires reconstructions that balance distortion fidelity and perceptual realism across a wide range of bitrates. However, most existing methods still rely on a single compressed latent representation to simultaneously carry structural details, semantic cues, and perceptual priors, requiring the same latent representation to serve multiple, potentially conflicting roles. This tension becomes evident across different latent paradigms: scalar-quantized (SQ) continuous latents provide rate-scalable fidelity but tend to lose perceptual details at low rates, while vector-quantized (VQ) discrete tokens preserve compact semantic cues but suffer from limited structural fidelity and bitrate scalability. To address this issue, we propose Mixture of Decoder Experts (MoDE), a dual-latent collaborative decoding framework that decomposes reconstruction responsibilities across complementary latent paradigms. Specifically, MoDE treats the SQ branch as a fidelity-oriented expert and the VQ branch as a perception-oriented expert, and coordinates them through two decoder-side modules: Expert-Specific Enhancement (ESE), which preserves branch-specific expert references, and Cross-Expert Modulation (CEM), which enables selective complementary transfer during reconstruction. The resulting framework supports selective cross-latent collaboration under a shared dual-stream bitstream and enables both fidelity-anchored and perception-anchored decoding. Extensive experiments demonstrate that MoDE achieves a more favorable fidelity-perception balance than representative distortion-oriented, perception-oriented, generative, and dual-latent baselines across a wide bitrate range, highlighting decoder-side expert collaboration as an effective design for wide-range fidelity-perception balanced LIC.

preprint · 2026 · arXiv

Honesty-Aware Multi-Agent Framework for High-Fidelity Synthetic Data Generation in Digital Psychiatric Intake Doctor-Patient Interactions

Data scarcity and unreliable self-reporting -- such as concealment or exaggeration -- pose fundamental challenges to psychiatric intake and assessment. We propose a multi-agent synthesis framework that explicitly models patient deception to generate high-fidelity, publicly releasable synthetic psychiatric intake records. Starting from DAIC-WOZ interviews, we construct enriched patient profiles and simulate a four-role workflow: a Patient completes self-rated scales and participates in a semi-structured interview under a topic-dependent honesty state; an Assessor selects instruments based on demographics and chief complaints; an Evaluator conducts the interview grounded in rater-administered scales, tracks suspicion, and completes ratings; and a Diagnostician integrates all evidence into a diagnostic summary. Each case links the patient profile, self-rated and rater-administered responses, interview transcript, diagnostic summary, and honesty state. We validate the framework through four complementary evaluations: diagnostic consistency and severity grading, chain-of-thought ablations, human evaluation of clinical realism and dishonesty modeling, and LLM-based comparative evaluation. The resulting corpus spans multiple disorders and severity levels, enabling controlled study of dishonesty-aware psychiatric assessment and the training and evaluation of adaptive dialogue agents.

preprint · 2026 · arXiv

Learning to Align Generative Appearance Priors for Fine-grained Image Retrieval

Fine-grained image retrieval (FGIR) typically relies on supervision from seen categories to learn discriminative embeddings for retrieving unseen categories. However, such supervision often biases retrieval models toward the semantics of seen categories rather than the underlying appearance characteristics that generalize across categories, thereby limiting retrieval performance on unseen categories. To tackle this, we propose GAPan, a Generative Appearance Prior alignment network that reformulates the learning objective from category prediction toward appearance modeling. Technically, GAPan treats retrieval features with an invertible density model based on normalizing flows. In the forward direction, the flow maps all instance features into a latent density space, where each seen category is modeled by a class-conditional Gaussian prior and optimized via exact likelihood estimation. This formulation preserves richer appearance details by leveraging the invertible property of the flows. In the reverse direction, samples from the high-density regions of these learned priors are mapped back to the feature space to produce appearance-aware anchors that reflect intra-category variation. These anchors supervise a prior-driven alignment objective that aligns retrieval embeddings with category-specific appearance distributions, thereby improving generalization to unseen categories. Evaluations demonstrate that our GAPan achieves state-of-the-art performance on both widely-used fine- and coarse-grained benchmarks.

preprint · 2026 · arXiv

ThinkDrive: Chain-of-Thought Guided Progressive Reinforcement Learning Fine-Tuning for Autonomous Driving

With the rapid advancement of large language model (LLM) technologies, their application in the domain of autonomous driving has become increasingly widespread. However, existing methods suffer from unstructured reasoning, poor generalization, and misalignment with human driving intent. While Chain-of-Thought (CoT) reasoning enhances decision transparency, conventional supervised fine-tuning (SFT) fails to fully exploit its potential, and reinforcement learning (RL) approaches face instability and suboptimal reasoning depth. We propose ThinkDrive, a CoT-guided progressive RL fine-tuning framework for autonomous driving that synergizes explicit reasoning with difficulty-aware adaptive policy optimization. Our method employs a two-stage training strategy. First, we perform SFT using CoT explanations. Then, we apply progressive RL with a difficulty-aware adaptive policy optimizer that dynamically adjusts learning intensity based on sample complexity. We evaluate our approach on a public dataset. The results show that ThinkDrive outperforms strong RL baselines by 1.45%, 1.95%, and 1.01% on exam, easy-exam, and accuracy, respectively. Moreover, a 2B-parameter model trained with our method surpasses the much larger GPT-4o by 3.28% on the exam metric.

preprint · 2026 · arXiv

Towards Universal Physical Adversarial Attacks via a Joint Multi-Objective and Multi-Model Optimization Framework

Physical adversarial attacks often overfit single surrogate models and optimization objectives. While ensemble attacks can mitigate this, existing methods struggle with severe gradient conflicts within restricted physical texture spaces, significantly degrading cross-model transferability. To bridge this gap, this paper proposes a Joint Multi-Objective and Multi-Model Optimization Framework (JMOF) that leverages quantitative similarity analysis to select the optimal surrogate model ensemble. Within JMOF, a dual-level mechanism jointly suppresses prediction outputs and flattens intermediate feature distributions, balancing attack efficiency with deep generalization. Additionally, an Orthogonal Gradient Alignment (OGA) strategy resolves cross-model gradient conflicts, transforming mutually repulsive gradients into synergistic optimization directions. Extensive simulated and real-world experiments demonstrate that JMOF outperforms state-of-the-art baselines against diverse black-box detectors. Crucially, JMOF exhibits substantial cross-vision-task generalization, generating attacks capable of simultaneously deceiving object detection and semantic segmentation or monocular depth estimation models. This research advances the generalization limits of physical adversarial attacks, providing a robust framework for evaluating visual AI vulnerabilities in real-world deployments.

preprint · 2025 · arXiv

The Role of Population III Star Tidal Disruption Events in Black Hole Growth at the Cosmic Dawn

The discovery of supermassive black holes (SMBHs) at high redshifts has intensified efforts to understand their early formation and rapid growth during the cosmic dawn. Using a semi-analytical cosmological framework, we investigate the role of tidal disruption events (TDEs) involving Population III (Pop-III) stars in driving the growth of heavy seed black holes (10^4-10^6 solar mass). Our results indicate that Pop-III TDEs significantly accelerate the growth of relatively lighter massive black holes (~ 10^4-10^5 solar mass), allowing them to increase their mass by roughly an order of magnitude within the first 10 Myr. Cosmological evolution modeling further supports that such Pop-III TDE-driven growth scenarios are consistent with the formation pathways of observed luminous high-redshift quasars originating from seed black holes at 10<z<15. We also discuss the future observational probes of these early-stage growth processes that future facilities, including space-based gravitational wave observatories and infrared telescopes like JWST, could potentially detect. These findings provide a clear observational framework to test the critical role of Pop-III star interactions in the rapid buildup of SMBHs during the earliest epochs.

preprint · 2022 · arXiv

Debiasing Neural Retrieval via In-batch Balancing Regularization

People frequently interact with information retrieval (IR) systems; however, IR models exhibit biases and discrimination towards various demographics. In-processing fair ranking methods provide a trade-off between accuracy and fairness by adding a fairness-related regularization term to the loss function. However, there have not been intuitive objective functions that depend on the click probability and user engagement to directly optimize towards this. In this work, we propose the In-Batch Balancing Regularization (IBBR) to mitigate the ranking disparity among subgroups. In particular, we develop a differentiable normed Pairwise Ranking Fairness (nPRF) and leverage the T-statistics on top of nPRF over subgroups as a regularization to improve fairness. Empirical results with the BERT-based neural rankers on the MS MARCO Passage Retrieval dataset with the human-annotated non-gendered queries benchmark \citep{rekabsaz2020neural} show that our IBBR method with nPRF achieves significantly less bias with minimal degradation in ranking performance compared with the baseline.
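
The idea of penalizing a t-statistic between subgroup scores can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not specify which t-statistic IBBR uses or how nPRF is computed, so the sketch uses Welch's two-sample t-statistic on two hypothetical lists of per-subgroup fairness scores, and the names `welch_t`, `regularized_loss` and `weight` are invented. It is plain Python and not differentiable, unlike the method described.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's two-sample t-statistic (unequal variances) between two score lists."""
    mean_a = statistics.fmean(group_a)
    mean_b = statistics.fmean(group_b)
    # Sample variances (n - 1 in the denominator).
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


def regularized_loss(ranking_loss, group_a, group_b, weight=0.1):
    """Add a penalty that grows with the disparity between the two subgroups.

    `weight` is a hypothetical trade-off coefficient between accuracy and fairness.
    """
    return ranking_loss + weight * abs(welch_t(group_a, group_b))


if __name__ == "__main__":
    scores_a = [1.0, 2.0, 3.0]  # hypothetical in-batch scores for subgroup A
    scores_b = [2.0, 3.0, 4.0]  # hypothetical in-batch scores for subgroup B
    print(welch_t(scores_a, scores_b))
    print(regularized_loss(1.0, scores_a, scores_b))
```

A larger absolute t-statistic means the two subgroups' average scores differ more relative to their spread, so the penalty pushes the model towards balanced scores within each batch.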

preprint · 2022 · arXiv

DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization

Large-scale pre-trained sequence-to-sequence models like BART and T5 achieve state-of-the-art performance on many generative NLP tasks. However, such models pose a great challenge in resource-constrained scenarios owing to their large memory requirements and high latency. To alleviate this issue, we propose to jointly distill and quantize the model, where knowledge is transferred from the full-precision teacher model to the quantized and distilled low-precision student model. Empirical analyses show that, despite the challenging nature of generative tasks, we were able to achieve a 16.5x model footprint compression ratio with little performance drop relative to the full-precision counterparts on multiple summarization and QA datasets. We further pushed the limit of compression ratio to 27.7x and presented the performance-efficiency trade-off for generative tasks using pre-trained models. To the best of our knowledge, this is the first work aiming to effectively distill and quantize sequence-to-sequence pre-trained models for language generation tasks.

preprint · 2022 · arXiv

Knowing Where and What: Unified Word Block Pretraining for Document Understanding

Due to the complex layouts of documents, it is challenging to extract information from documents. Most previous studies develop multimodal pre-trained models in a self-supervised way. In this paper, we focus on the embedding learning of word blocks containing text and layout information, and propose UTel, a language model with Unified TExt and Layout pre-training. Specifically, we propose two pre-training tasks: Surrounding Word Prediction (SWP) for the layout learning, and Contrastive learning of Word Embeddings (CWE) for identifying different word blocks. Moreover, we replace the commonly used 1D position embedding with a 1D clipped relative position embedding. In this way, the joint training of Masked Layout-Language Modeling (MLLM) and two newly proposed tasks enables the interaction between semantic and spatial features in a unified way. Additionally, the proposed UTel can process arbitrary-length sequences by removing the 1D position embedding, while maintaining competitive performance. Extensive experimental results show UTel learns better joint representations and achieves better performance than previous methods on various downstream tasks, though requiring no image modality. Code is available at https://github.com/taosong2019/UTel.

preprint · 2022 · arXiv

ShowFace: Coordinated Face Inpainting with Memory-Disentangled Refinement Networks

Face inpainting aims to complete the corrupted regions of the face images, which requires coordination between the completed areas and the non-corrupted areas. Recently, memory-oriented methods have shown great promise in generation-related tasks by introducing an external memory module to improve image coordination. However, such methods still have limitations in restoring the consistency and continuity of specific facial semantic parts. In this paper, we propose the coarse-to-fine Memory-Disentangled Refinement Networks (MDRNets) for coordinated face inpainting, in which two collaborative modules are integrated, Disentangled Memory Module (DMM) and Mask-Region Enhanced Module (MREM). Specifically, the DMM establishes a group of disentangled memory blocks to store the semantic-decoupled face representations, which could provide the most relevant information to refine the semantic-level coordination. The MREM involves a masked correlation mining mechanism to enhance the feature relationships into the corrupted regions, which could also make up for the correlation loss caused by memory disentanglement. Furthermore, to better improve the inter-coordination between the corrupted and non-corrupted regions and enhance the intra-coordination in corrupted regions, we design InCo2 Loss, a pair of similarity based losses to constrain the feature consistency. Finally, extensive experiments conducted on CelebA-HQ and FFHQ datasets demonstrate the superiority of our MDRNets compared with previous state-of-the-art methods.

preprint · 2022 · arXiv

Towards Differential Relational Privacy and its use in Question Answering

Memorization of the relation between entities in a dataset can lead to privacy issues when using a trained model for question answering. We introduce Relational Memorization (RM) to understand, quantify and control this phenomenon. While bounding general memorization can have detrimental effects on the performance of a trained model, bounding RM does not prevent effective learning. The difference is most pronounced when the data distribution is long-tailed, with many queries having only few training examples: Impeding general memorization prevents effective learning, while impeding only relational memorization still allows learning general properties of the underlying concepts. We formalize the notion of Relational Privacy (RP) and, inspired by Differential Privacy (DP), we provide a possible definition of Differential Relational Privacy (DrP). These notions can be used to describe and compute bounds on the amount of RM in a trained model. We illustrate Relational Privacy concepts in experiments with large-scale models for Question Answering.

preprint · 2021 · arXiv

Back to Prior Knowledge: Joint Event Causality Extraction via Convolutional Semantic Infusion

Joint event and causality extraction is a challenging yet essential task in information retrieval and data mining. Recently, pre-trained language models (e.g., BERT) yield state-of-the-art results and dominate in a variety of NLP tasks. However, these models are incapable of imposing external knowledge in domain-specific extraction. Considering the prior knowledge of frequent n-grams that represent cause/effect events may benefit both event and causality extraction, in this paper, we propose convolutional knowledge infusion for frequent n-grams with different windows of length within a joint extraction framework. Knowledge infusion during convolutional filter initialization not only helps the model capture both intra-event (i.e., features in an event cluster) and inter-event (i.e., associations across event clusters) features but also boosts training convergence. Experimental results on the benchmark datasets show that our model significantly outperforms the strong BERT+CSNN baseline.

preprint · 2021 · arXiv

Fast Outage Analysis of Large-scale Production Clouds with Service Correlation Mining

Cloud-based services are surging into popularity in recent years. However, outages, i.e., severe incidents that always impact multiple services, can dramatically affect user experience and incur severe economic losses. Locating the root-cause service, i.e., the service that contains the root cause of the outage, is a crucial step to mitigate the impact of the outage. In current industrial practice, this is generally performed in a bootstrap manner and largely depends on human efforts: the service that directly causes the outage is identified first, and the suspected root cause is traced back manually from service to service during diagnosis until the actual root cause is found. Unfortunately, production cloud systems typically contain a large number of interdependent services. Such a manual root cause analysis is often time-consuming and labor-intensive. In this work, we propose COT, the first outage triage approach that considers the global view of service correlations. COT mines the correlations among services from outage diagnosis data. After learning from historical outages, COT can infer the root cause of emerging ones accurately. We implement COT and evaluate it on a real-world dataset containing one year of data collected from Microsoft Azure, one of the representative cloud computing platforms in the world. Our experimental results show that COT can reach a triage accuracy of 82.1%~83.5%, which outperforms the state-of-the-art triage approach by 28.0%~29.7%.

preprint · 2020 · arXiv

Adversarial Bipartite Graph Learning for Video Domain Adaptation

Domain adaptation techniques, which focus on adapting models between distributionally different domains, are rarely explored in the video recognition area due to the significant spatial and temporal shifts across the source (i.e. training) and target (i.e. test) domains. As such, recent works on visual domain adaptation which leverage adversarial learning to unify the source and target video representations and strengthen the feature transferability are not highly effective on videos. To overcome this limitation, in this paper, we learn a domain-agnostic video classifier instead of learning domain-invariant representations, and propose an Adversarial Bipartite Graph (ABG) learning framework which directly models the source-target interactions with a network topology of the bipartite graph. Specifically, the source and target frames are sampled as heterogeneous vertexes while the edges connecting two types of nodes measure the affinity among them. Through message-passing, each vertex aggregates the features from its heterogeneous neighbors, forcing the features coming from the same class to be mixed evenly. Explicitly exposing the video classifier to such cross-domain representations at the training and test stages makes our model less biased to the labeled source data, which in turn results in better generalization on the target domain. To further enhance the model capacity and test the robustness of the proposed architecture on difficult transfer tasks, we extend our model to work in a semi-supervised setting using an additional video-level bipartite graph. Extensive experiments conducted on four benchmarks evidence the effectiveness of the proposed approach over the SOTA methods on the task of video recognition.

preprint · 2020 · arXiv

Bayesian Update with Importance Sampling: Required Sample Size

Importance sampling is used to approximate Bayes' rule in many computational approaches to Bayesian inverse problems, data assimilation and machine learning. This paper reviews and further investigates the required sample size for importance sampling in terms of the $\chi^2$-divergence between target and proposal. We develop general abstract theory and illustrate through numerous examples the roles that dimension, noise-level and other model parameters play in approximating the Bayesian update with importance sampling. Our examples also facilitate a new direct comparison of standard and optimal proposals for particle filtering.
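
The link between sample size and the $\chi^2$-divergence can be seen in a small numerical sketch. It relies on the standard identity $\rho = E_q[w^2] / E_q[w]^2 = 1 + \chi^2(\text{target} \,\|\, \text{proposal})$ for importance weights $w$, and on the usual effective sample size $(\sum w)^2 / \sum w^2$, which equals $N / \hat\rho$ for the sample estimate $\hat\rho$. The Gaussian target and proposal, and all function names, are assumptions chosen for illustration and are not taken from the paper.

```python
import math
import random


def effective_sample_size(weights):
    """ESS = (sum w)^2 / sum w^2 for unnormalized importance weights."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


def rho_estimate(weights):
    """Sample estimate of rho = E_q[w^2] / E_q[w]^2 = 1 + chi^2(target || proposal)."""
    n = len(weights)
    mean_w = sum(weights) / n
    mean_w_sq = sum(w * w for w in weights) / n
    return mean_w_sq / (mean_w * mean_w)


def gaussian_weights(n, shift, seed=0):
    """Weights for target N(shift, 1) and proposal N(0, 1), sampling from the proposal."""
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # log w(x) = log N(x; shift, 1) - log N(x; 0, 1) = shift * x - shift^2 / 2
    return [math.exp(shift * x - 0.5 * shift * shift) for x in samples]


if __name__ == "__main__":
    n = 20000
    for shift in (0.25, 0.5, 1.0):
        weights = gaussian_weights(n, shift)
        # For this Gaussian pair the exact value is rho = exp(shift^2).
        print(shift, rho_estimate(weights), math.exp(shift * shift),
              effective_sample_size(weights))
```

As the target moves away from the proposal, $\rho$ grows and the effective sample size shrinks by the same factor, so roughly $\rho$ times more samples are needed for comparable accuracy.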

preprint · 2020 · arXiv

Capacity Performance of Relay Beamformings for MIMO Multi-Relay Networks with Imperfect R-D CSI at Relays

In this paper, we consider a dual-hop Multiple Input Multiple Output (MIMO) wireless relay network in the presence of imperfect channel state information (CSI), in which a source-destination pair both equipped with multiple antennas communicates through a large number of half-duplex amplify-and-forward (AF) relay terminals. We investigate the performance of three linear beamforming schemes when the CSI of the relay-to-destination (R-D) link is not perfect at the relay nodes. The three efficient linear beamforming schemes are based on the matched-filter (MF), zero-forcing (ZF) precoding and regularized zero-forcing (RZF) precoding techniques, which utilize the CSI of both S-D channel and R-D channel at the relay nodes. By modeling the R-D CSI error at the relay nodes as independent complex Gaussian random variables, we derive the ergodic capacities of the three beamformers in terms of instantaneous SNR. Using the law of large numbers, we obtain the asymptotic capacities, upon which the optimized MF-RZF is derived. Simulation results show that the asymptotic capacities match the respective ergodic capacities very well. Analysis and simulation results demonstrate that the optimized MF-RZF outperforms MF and MF-ZF for any power of R-D CSI error.

preprint · 2020 · arXiv

Clustering and Power Optimization for NOMA Multi-Objective Problems

This paper considers uplink multiple access (MA) transmissions, where the MA technique is adaptively selected between Non Orthogonal Multiple Access (NOMA) and Orthogonal Multiple Access (OMA). Two types of users, namely Internet of Things (IoT) and enhanced mobile broadband (eMBB) coexist with different metrics to be optimized, energy efficiency (EE) for IoT and spectral efficiency (SE) for eMBB. The corresponding multi-objective power allocation problems aiming at maximizing a weighted sum of EE and SE are solved for both NOMA and OMA. Based on the identification of the best MA strategy, a clustering algorithm is then proposed to maximize the multi-objective metric per cluster as well as NOMA use. The proposed clustering, power allocation and MA selection algorithm is shown to outperform other clustering solutions and non-adaptive MA techniques.

preprint · 2020 · arXiv

Distributed Motion Control for Multiple Connected Surface Vessels

We propose a scalable cooperative control approach which coordinates a group of rigidly connected autonomous surface vessels to track desired trajectories in a planar water environment as a single floating modular structure. Our approach leverages the implicit information of the structure's motion for force and torque allocation without explicit communication among the robots. In our system, a leader robot steers the entire group by adjusting its force and torque according to the structure's deviation from the desired trajectory, while follower robots run distributed consensus-based controllers to match their inputs to amplify the leader's intent using only onboard sensors as feedback. To cope with the complex and highly coupled system dynamics in the water, the leader robot employs a nonlinear model predictive controller (NMPC), where we experimentally estimated the dynamics model of the floating modular structure in order to achieve superior performance for leader-following control. Our method has a wide range of potential applications in transporting humans and goods in many of today's existing waterways. We conducted trajectory and orientation tracking experiments in hardware with three custom-built autonomous modular robotic boats, called Roboat, which are capable of holonomic motions and onboard state estimation. Simulation results with up to 65 robots also prove the scalability of our proposed approach.

preprint · 2020 · arXiv

Efficient Beamforming for MIMO Relaying Broadcast Channel with Imperfect Channel Estimation

We consider a multiple-input multiple-output (MIMO) relaying broadcast channel in downlink cellular networks, where the base station and the relay stations are both equipped with multiple antennas, and each user terminal has only a single antenna. In practical scenarios, channel estimation is imperfect at the receivers. Aiming at maximizing the SINR at each user, we develop two robust linear beamforming schemes, one for the single-relay case and one for the multi-relay case. The two proposed schemes are based on singular value decomposition (SVD), minimum mean square error (MMSE) and regularized zero-forcing (RZF). Simulation results show that the proposed schemes outperform the conventional schemes with imperfect channel estimation.

preprint · 2020 · arXiv

Pattern formation and exotic order in driven-dissipative Bose-Hubbard systems

Modern experimental platforms such as superconducting-circuit arrays call for the exploration of bosonic tight-binding models in unconventional situations with no counterpart in real materials. Here we investigate one such situation, in which excitations are driven and damped by pairs, leading to pattern formation and exotic bosonic states emerging from a non-equilibrium quantum many-body system. Focusing on a two-dimensional driven-dissipative Bose-Hubbard model, we find that its steady states are characterized by the condensation of bosons around momenta lying on a "Bose surface", a bosonic analogue of the Fermi surface in solid-state systems. The interplay between instabilities generated by the driving, the nonlinear dissipative mode-coupling, and the underlying lattice effect allows the system to equilibrate into an exotic superfluid state of bosons condensed on a closed ring in momentum space instead of discrete points. Such an unconventional state with a spatially uniform density distribution goes beyond the traditional scope of pattern formation, and thus has no counterpart in the classical literature. In addition, it is a state connected to several open problems in modern condensed-matter physics, and here we provide the means to stabilize it, opening the way to its experimental study. Moreover, we also provide a concrete experimental implementation of our model in currently available superconducting-circuit arrays. We also investigate the relaxation spectrum around the condensate, which shows a characteristic purely diffusive behavior.

preprint · 2020 · arXiv

Progressive Graph Learning for Open-Set Domain Adaptation

Domain shift is a fundamental problem in visual recognition which typically arises when the source and target data follow different distributions. The existing domain adaptation approaches which tackle this problem work in the closed-set setting with the assumption that the source and the target data share exactly the same classes of objects. In this paper, we tackle a more realistic problem of open-set domain shift where the target data contains additional classes that are not present in the source data. More specifically, we introduce an end-to-end Progressive Graph Learning (PGL) framework where a graph neural network with episodic training is integrated to suppress underlying conditional shift and adversarial learning is adopted to close the gap between the source and target distributions. Compared to the existing open-set adaptation approaches, our approach is guaranteed to achieve a tighter upper bound of the target error. Extensive experiments on three standard open-set benchmarks evidence that our approach significantly outperforms the state of the art in open-set domain adaptation.

preprint · 2020 · arXiv

Regularized Zero-Forcing for Multiantenna Broadcast Channels with User Selection

A multiantenna multiuser broadcast channel with transmitter beamforming and user selection is considered. Different from the conventional works, we consider imperfect channel state information (CSI) which is a practical scenario for multiuser broadcast channels. We propose a robust regularized zero-forcing (RRZF) beamforming at the base station. Then we show that the RRZF outperforms zero-forcing (ZF) and regularized ZF (RZF) beamforming even as the number of users grows to infinity. Simulation results validate the advantage of the proposed robust RZF beamforming.
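
For reference, the conventional regularized zero-forcing precoder that the robust variant builds on is usually written as below. This is the standard textbook form, stated as a sketch: the symbols $\hat{\mathbf{H}}$ (the $K \times M$ estimated channel for $K$ single-antenna users and $M$ transmit antennas), $\alpha$, $\beta$ and $P$ are notation assumed here, and the abstract does not give the RRZF expression itself, so it is not reproduced.

```latex
% Conventional RZF precoder built from the estimated channel \hat{H}
\mathbf{W}_{\mathrm{RZF}}
  = \beta\, \hat{\mathbf{H}}^{H}
    \left( \hat{\mathbf{H}} \hat{\mathbf{H}}^{H} + \alpha \mathbf{I}_K \right)^{-1},
  \qquad \alpha > 0

% \beta normalizes the transmit power to the budget P
\beta = \sqrt{ \frac{P}{\operatorname{tr}\!\left(
    \hat{\mathbf{H}}^{H}
    ( \hat{\mathbf{H}} \hat{\mathbf{H}}^{H} + \alpha \mathbf{I}_K )^{-2}
    \hat{\mathbf{H}} \right)} }

% Setting \alpha = 0 recovers zero-forcing (when \hat{H}\hat{H}^H is invertible)
\mathbf{W}_{\mathrm{ZF}}
  = \beta\, \hat{\mathbf{H}}^{H}
    \left( \hat{\mathbf{H}} \hat{\mathbf{H}}^{H} \right)^{-1}
```

The regularization term $\alpha \mathbf{I}_K$ trades residual inter-user interference against noise amplification, which matters most when the channel matrix is ill-conditioned or only imperfectly known.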

preprint · 2020 · arXiv

Relay Beamforming Design with SIC Detection for MIMO Multi-Relay Networks with Imperfect CSI

In this paper, we consider a dual-hop Multiple Input Multiple Output (MIMO) wireless multi-relay network, in which a source-destination pair both equipped with multiple antennas communicates through multiple half-duplex amplify-and-forward (AF) relay terminals which are also equipped with multiple antennas. Since perfect channel state information (CSI) is difficult to obtain in practical multi-relay networks, we consider imperfect CSI for all channels. We focus on maximizing the signal-to-interference-plus-noise ratio (SINR) at the destination. We propose a novel robust linear beamforming scheme at the relays, based on the QR decomposition filter at the destination node which performs successive interference cancellation (SIC). Using the law of large numbers, we obtain the asymptotic rate in the presence of imperfect CSI, upon which the proposed relay beamforming is optimized. Simulation results show that the asymptotic rate matches the ergodic rate well. Analysis and simulation results demonstrate that the proposed beamforming outperforms the conventional beamforming schemes for any power of CSI errors and SNR regions.