Researcher profile

Shun Zhang

Shun Zhang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
20 works
0 followers
11 topics
4 close collaborators

Published work

20 published item(s)

preprint · 2026 · arXiv

NoRIN: Backbone-Adaptive Reversible Normalization for Time-Series Forecasting

Reversible instance normalization (RevIN) and its successors (Dish-TS, SAN, FAN) have become the de facto plug-in for time-series forecasting, yet the map they apply to each data point is strictly affine, $x \mapsto ax+b$, so they cannot reshape the underlying distribution -- heavy tails remain heavy and skewness remains uncorrected. We propose NoRIN, a non-linear reversible normalization based on the arcsinh-form Johnson $S_U$ transform with two shape parameters $(δ,\varepsilon)$ that control tailedness and skewness; the linear $Z$-score used by RevIN is recovered only in the limit $δ\to \infty$. Training $(δ,\varepsilon)$ jointly with the backbone via gradient descent reliably pushes them toward this linear limit within a few epochs -- a phenomenon we name the degeneration problem: the forecasting loss is locally indifferent to shape, and the high-capacity backbone compensates for any monotone reparameterization of its input. NoRIN escapes the degeneration by decoupling shape selection from gradient training: $(δ,\varepsilon)$ are initialized by a closed-form Slifker-Shapiro quantile fit and refined by Bayesian optimization on the validation objective, while the inner training loop is identical to standard RevIN-style training. Across six representative backbones x five real-world datasets x three prediction horizons (90 configurations), decoupled shape optimization recovers $(δ^\star,\varepsilon^\star)$ that sit systematically far from the linear limit, with values that vary in a backbone-dependent way. This empirically supports the central thesis: different backbones genuinely require different normalization parameters to reach their best performance.
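
To make the contrast with an affine map concrete, here is a minimal sketch of a reversible arcsinh-type normalization. The parameterization below is an assumption chosen for illustration, not the formula from the paper: it takes two shape parameters named `delta` and `eps`, is exactly invertible, and reduces to the plain Z-score as `delta` grows. The function names `norin_like_forward` and `norin_like_inverse` are hypothetical.

```python
import math
from statistics import mean, pstdev


def norin_like_forward(window, delta, eps):
    """Normalize one window; returns (values, (mu, sigma)) so the map can be undone.

    Illustrative parameterization (an assumption, not the paper's formula):
        u = (x - mu) / sigma
        y = delta * sqrt(1 + eps^2) * (asinh(u / delta + eps) - asinh(eps))
    As delta -> infinity, y -> u, i.e. the affine Z-score.
    """
    mu = mean(window)
    sigma = pstdev(window) or 1.0  # guard against constant windows
    s = math.sqrt(1.0 + eps * eps)
    offset = math.asinh(eps)
    y = [delta * s * (math.asinh((x - mu) / sigma / delta + eps) - offset)
         for x in window]
    return y, (mu, sigma)


def norin_like_inverse(y, stats, delta, eps):
    """Exact inverse of norin_like_forward for the same (delta, eps)."""
    mu, sigma = stats
    s = math.sqrt(1.0 + eps * eps)
    offset = math.asinh(eps)
    return [mu + sigma * delta * (math.sinh(v / (delta * s) + offset) - eps)
            for v in y]
```

With a small `delta` the map compresses large deviations, which an affine normalization cannot do; with a very large `delta` it is numerically indistinguishable from the Z-score, which mirrors the linear limit described in the abstract.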

preprint · 2026 · arXiv

Toward Stable Semi-Supervised Remote Sensing Segmentation via Co-Guidance and Co-Fusion

Semi-supervised remote sensing (RS) image semantic segmentation offers a promising solution to alleviate the burden of exhaustive annotation, yet it fundamentally struggles with pseudo-label drift, a phenomenon where confirmation bias leads to the accumulation of errors during training. In this work, we propose Co2S, a stable semi-supervised RS segmentation framework that synergistically fuses priors from vision-language models and self-supervised models. Specifically, we construct a heterogeneous dual-student architecture comprising two distinct ViT-based vision foundation models initialized with pretrained CLIP and DINOv3 to mitigate error accumulation and pseudo-label drift. To effectively incorporate these distinct priors, an explicit-implicit semantic co-guidance mechanism is introduced that utilizes text embeddings and learnable queries to provide explicit and implicit class-level guidance, respectively, thereby jointly enhancing semantic consistency. Furthermore, a global-local feature collaborative fusion strategy is developed to effectively fuse the global contextual information captured by CLIP with the local details produced by DINOv3, enabling the model to generate highly precise segmentation results. Extensive experiments on six popular datasets demonstrate the superiority of the proposed method, which consistently achieves leading performance across various partition protocols and diverse scenarios. Project page is available at https://xavierjiezou.github.io/Co2S/.

preprint · 2022 · arXiv

Battling Gibbs Phenomenon: On Finite Element Approximations of Discontinuous Solutions of PDEs

In this paper, we want to clarify the Gibbs phenomenon when continuous and discontinuous finite elements are used to approximate discontinuous or nearly discontinuous PDE solutions from the approximation point of view. For a simple step function, we explicitly compute its continuous and discontinuous piecewise constant or linear projections on discontinuity matched or non-matched meshes. For the simple discontinuity-aligned mesh case, piecewise discontinuous approximations are always good. For the general non-matched case, we explain that the piecewise discontinuous constant approximation combined with adaptive mesh refinements is a good choice to achieve accuracy without overshoots. For discontinuous piecewise linear approximations, non-trivial overshoots will be observed unless the mesh is matched with discontinuity. For continuous piecewise linear approximations, the computation is based on a "far-away assumption", and non-trivial overshoots will always be observed under regular meshes. We calculate the explicit overshoot values for several typical cases. Numerical tests are conducted for a singularly-perturbed reaction-diffusion equation and linear hyperbolic equations to verify our findings in the paper. Also, we discuss the $L^1$-minimization-based methods and do not recommend such methods due to their similar behavior to $L^2$-based methods and more complicated implementations.
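
The overshoot on a non-matched element can be reproduced with a small calculation. The sketch below is an illustration under simple assumptions, not the paper's computation: it takes a single reference element $[0,1]$, a step function that jumps from 0 to 1 at an interior point `s`, and computes the $L^2$ projection onto constants and onto linear polynomials. The function names are hypothetical.

```python
def project_step_p0(s):
    """L2 projection of f(x) = 1 for x > s (else 0) onto constants on [0, 1]."""
    return 1.0 - s  # the element average of f, always between 0 and 1


def project_step_p1(s):
    """L2 projection of the same step onto a + b*x on [0, 1]; returns (a, b)."""
    # Normal equations M [a, b]^T = r in the monomial basis {1, x}.
    m00, m01, m11 = 1.0, 0.5, 1.0 / 3.0
    r0 = 1.0 - s               # integral of f over [0, 1]
    r1 = (1.0 - s * s) / 2.0   # integral of x * f over [0, 1]
    det = m00 * m11 - m01 * m01
    a = (r0 * m11 - m01 * r1) / det
    b = (m00 * r1 - m01 * r0) / det
    return a, b


if __name__ == "__main__":
    a, b = project_step_p1(0.5)
    print("P0 value:", project_step_p0(0.5))
    print("P1 endpoint values:", a, a + b)
```

For a jump in the middle of the element, the constant projection stays inside the range of the step, while the linear projection leaves it at both endpoints. When the jump sits on the element boundary (`s = 0`), the linear projection reproduces the constant 1 exactly.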

preprint · 2022 · arXiv

Computer Vision-Aided Reconfigurable Intelligent Surface-Based Beam Tracking: Prototyping and Experimental Results

In this paper, we propose a novel computer vision-based approach to aid Reconfigurable Intelligent Surface (RIS) for dynamic beam tracking and then implement the corresponding prototype verification system. A camera is attached at the RIS to obtain the visual information about the surrounding environment, with which RIS identifies the desired reflected beam direction and then adjusts the reflection coefficients according to the pre-designed codebook. Compared to the conventional approaches that utilize channel estimation or beam sweeping to obtain the reflection coefficients, the proposed one not only saves beam training overhead but also eliminates the requirement for extra feedback links. We build a 20-by-20 RIS running at 5.4 GHz and develop a high-speed control board to ensure the real-time refresh of the reflection coefficients. Meanwhile we implement an independent peer-to-peer communication system to simulate the communication between the base station and the user equipment. The vision-aided RIS prototype system is tested in two mobile scenarios: RIS works in near-field conditions as a passive array antenna of the base station; RIS works in far-field conditions to assist the communication between the base station and the user equipment. The experimental results show that RIS can quickly adjust the reflection coefficients for dynamic beam tracking with the help of visual information.

preprint · 2022 · arXiv

Efficient reversible data hiding via two layers of double-peak embedding

Reversible data hiding continues to attract significant attention in recent years. In particular, an increasing number of authors focus on the higher significant bit (HSB) plane of an image which can yield more redundant space. On the other hand, the lower significant bit planes are often ignored for embedding in existing schemes due to their harm to the embedding rate. This paper proposes an efficient reversible data hiding scheme via a double-peak two-layer embedding (DTLE) strategy with prediction error expansion. The higher six-bit planes of the image are assigned as the HSB plane, and double prediction error peaks are applied in either embedding layer. This makes fuller use of the redundancy space of images compared with the one error peak strategy. Moreover, we carry out the median-edge detector pre-processing for complex images to reduce the size of the auxiliary information. A series of experimental results show that our DTLE approach achieves up to 83% higher embedding rate on real-world datasets while guaranteeing better image quality.
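
For readers unfamiliar with prediction error expansion, the following sketch shows the generic idea on a one-dimensional pixel sequence. It is not the DTLE scheme: it uses a hypothetical previous-pixel predictor, embeds one bit per pixel by the classic expansion $e' = 2e + b$, and ignores overflow handling, bit planes, and peak selection.

```python
def pee_embed(pixels, bits):
    """Embed one bit per pixel (after the first) by expanding prediction errors.

    Assumes len(bits) <= len(pixels) - 1 and ignores pixel-range overflow.
    """
    marked = [pixels[0]]                  # first pixel is the unmodified anchor
    for i, bit in enumerate(bits, start=1):
        pred = pixels[i - 1]              # predictor: previous original pixel
        err = pixels[i] - pred
        marked.append(pred + 2 * err + bit)
    marked.extend(pixels[len(bits) + 1:])  # pixels beyond the payload stay as-is
    return marked


def pee_extract(marked, n_bits):
    """Recover both the payload bits and the original pixels."""
    pixels = [marked[0]]
    bits = []
    for i in range(1, n_bits + 1):
        pred = pixels[i - 1]              # same predictor, on recovered pixels
        expanded = marked[i] - pred       # equals 2 * err + bit
        bits.append(expanded % 2)
        pixels.append(pred + expanded // 2)  # floor division undoes the expansion
    pixels.extend(marked[n_bits + 1:])
    return pixels, bits
```

Because Python's `%` and `//` use floor semantics, the extraction also works for negative prediction errors. Schemes such as the one in the abstract restrict embedding to selected error values (peaks) to limit distortion, which this sketch does not attempt.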

preprint · 2022 · arXiv

Joint Channel Estimation and Data Detection for Hybrid RIS aided Millimeter Wave OTFS Systems

For high-mobility communication scenarios, the recently emerged orthogonal time frequency space (OTFS) modulation introduces a new delay-Doppler domain signal space and can provide better communication performance than traditional orthogonal frequency division multiplexing systems. This article focuses on joint channel estimation and data detection (JCEDD) for hybrid reconfigurable intelligent surface (HRIS) aided millimeter wave (mmWave) OTFS systems. Firstly, a new transmission structure is designed. Within the pilot durations of the designed structure, partial HRIS elements are alternately activated. The time domain channel model is then presented. Secondly, the received signal models for both the HRIS over the time domain and the base station over the delay-Doppler domain are studied. Thirdly, by utilizing channel parameters acquired at the HRIS, an HRIS beamforming design strategy is proposed. For the OTFS transmission, we propose a JCEDD scheme over the delay-Doppler domain. In this scheme, a message passing (MP) algorithm is designed to simultaneously obtain the equivalent channel gain and the data symbols. In addition, the channel parameters, i.e., the Doppler shift, the channel sparsity, and the channel variance, are updated through an expectation-maximization (EM) algorithm. By iteratively executing the MP and EM algorithms, both the channel and the unknown data symbols can be accurately acquired. Finally, simulation results are provided to validate the effectiveness of our proposed JCEDD scheme.

preprint · 2022 · arXiv

Least-Squares Methods with Nonconforming Finite Elements for General Second-Order Elliptic Equations

In this paper, we study least-squares finite element methods (LSFEM) for general second-order elliptic equations with nonconforming finite element approximations. The equation may be indefinite. For the two-field potential-flux div LSFEM with Crouzeix-Raviart (CR) element approximation, we present three proofs of the discrete solvability under the condition that the mesh size is small enough. One of the proofs is based on the coerciveness of the original bilinear form. The other two are based on the minimal assumption of the uniqueness of the solution of the second-order elliptic equation. A counterexample shows that the div least-squares functional does not have norm equivalence in the sum space of $H^1$ and CR finite element spaces. Thus it cannot be used as an a posteriori error estimator. Several versions of reliable and efficient error estimators are proposed for the method. We also propose a three-field potential-flux-intensity div-curl least-squares method with general nonconforming finite element approximations. The norm equivalence in the abstract nonconforming piecewise $H^1$-space is established for the three-field formulation on the minimal assumption of the uniqueness of the solution of the second-order elliptic equation. The three-field div-curl nonconforming formulation thus has no restriction on the mesh size, and the least-squares functional can be used as the built-in a posteriori error estimator. Under some restrictive conditions, we also discuss a potential-flux div-curl least-squares method.

preprint · 2022 · arXiv

Prompting Decision Transformer for Few-Shot Policy Generalization

Humans can leverage prior experience and learn novel tasks from a handful of demonstrations. In contrast to offline meta-reinforcement learning, which aims to achieve quick adaptation through better algorithm design, we investigate the effect of architecture inductive bias on the few-shot learning capability. We propose a Prompt-based Decision Transformer (Prompt-DT), which leverages the sequential modeling ability of the Transformer architecture and the prompt framework to achieve few-shot adaptation in offline RL. We design the trajectory prompt, which contains segments of the few-shot demonstrations, and encodes task-specific information to guide policy generation. Our experiments in five MuJoCo control benchmarks show that Prompt-DT is a strong few-shot learner without any extra finetuning on unseen target tasks. Prompt-DT outperforms its variants and strong meta offline RL baselines by a large margin with a trajectory prompt containing only a few timesteps. Prompt-DT is also robust to prompt length changes and can generalize to out-of-distribution (OOD) environments.
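
The trajectory prompt can be pictured as a prefix of demonstration tokens placed before the agent's recent history. The sketch below is a simplified illustration, not the Prompt-DT implementation: it assumes each step is a `(reward, state, action)` tuple, uses the Decision Transformer convention of `(return-to-go, state, action)` tokens, and, for brevity, computes returns-to-go within each piece rather than over a full episode. The names `returns_to_go` and `build_input` are hypothetical.

```python
def returns_to_go(rewards):
    """Suffix sums of the rewards: rtg[t] = rewards[t] + rewards[t+1] + ..."""
    rtg, total = [], 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    return rtg[::-1]


def build_input(prompt_segments, recent_history):
    """Concatenate prompt segments and recent history into one token list.

    Each piece is a list of (reward, state, action) steps; each output token
    is a (return_to_go, state, action) tuple. Prompt tokens come first, so a
    sequence model can condition on them when predicting the next action.
    """
    tokens = []
    for piece in list(prompt_segments) + [recent_history]:
        rtg = returns_to_go([reward for reward, _, _ in piece])
        for g, (_, state, action) in zip(rtg, piece):
            tokens.append((g, state, action))
    return tokens
```

Switching tasks then amounts to swapping the prompt segments while the model weights stay fixed, which is the sense in which no extra finetuning is needed.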

preprint · 2022 · arXiv

Several Proofs of Coerciveness of First-Order System Least-Squares Methods for General Second-Order Elliptic PDEs

In this paper, we present proofs of the coerciveness of first-order system least-squares methods for general (possibly indefinite) second-order linear elliptic PDEs under a minimal uniqueness assumption. For general linear second-order elliptic PDEs, the uniqueness, existence, and well-posedness are equivalent due to the compactness of the operator and the Fredholm alternative. Thus only a minimal uniqueness assumption is needed: the homogeneous equation has a unique zero solution. The coerciveness of the standard variational problem is not required. The paper's main contribution is our first proof, which is a straightforward and short proof using the inf-sup stability of the standard variational formulation. The proof can potentially be applied to other equations or settings once the stability of the standard formulation is available. We also present two other proofs for the least-squares methods of general second-order linear elliptic PDEs. The second proof is based on a lemma introduced in the discontinuous Petrov-Galerkin method, and the third proof is based on various stability analyses of the decomposed problems. As an application, we also discuss least-squares finite element methods for problems with a nonsingular $H^{-1}$ right-hand side.

preprint · 2021 · arXiv

Deep Learning based Antenna Selection and CSI Extrapolation in Massive MIMO Systems

A critical bottleneck of massive multiple-input multiple-output (MIMO) systems is the huge training overhead caused by downlink transmission, such as channel estimation, downlink beamforming and covariance observation. In this paper, we propose to use the channel state information (CSI) of a small number of antennas to extrapolate the CSI of the other antennas and reduce the training overhead. Specifically, we design a deep neural network, called the antenna domain extrapolation network (ADEN), that can exploit the correlation function among antennas. We then propose a deep learning (DL) based antenna selection network (ASN) that can select a limited number of antennas to optimize the extrapolation, which is conventionally a combinatorial optimization problem and is difficult to solve. We carefully design a constrained degradation algorithm to generate a differentiable approximation of the discrete antenna selection vector such that the back-propagation of the neural network can be guaranteed. Numerical results show that the proposed ADEN outperforms the traditional fully connected one, and the antenna selection scheme learned by ASN is much better than the naive uniform selection.

preprint · 2021 · arXiv

Deep Learning based Channel Extrapolation for Large-Scale Antenna Systems: Opportunities, Challenges and Solutions

With the depletion of spectrum, wireless communication systems turn to exploiting large antenna arrays to achieve degrees of freedom in the space domain, such as millimeter wave massive multiple-input multiple-output (MIMO), reconfigurable intelligent surface assisted communications and cell-free massive MIMO. In these systems, acquiring accurate channel state information (CSI) is difficult and becomes a bottleneck of the communication links. In this article, we introduce the concept of channel extrapolation, which relies on a small portion of channel parameters to infer the remaining channel parameters. Since the substance of channel extrapolation is a mapping from one parameter subspace to another, we can resort to deep learning (DL), a powerful learning architecture, to approximate such a mapping function. Specifically, we first analyze the requirements, conditions and challenges for channel extrapolation. Then, we present three typical extrapolations over the antenna dimension, the frequency dimension, and the physical terminal, respectively. We also illustrate their respective principles, design challenges and DL strategies. It will be seen that channel extrapolation could greatly reduce the transmission overhead and subsequently enhance the performance gains compared with the traditional strategies. In the end, we provide several potential research directions on channel extrapolation for future intelligent communications systems.

preprint · 2020 · arXiv

A New Path Division Multiple Access for the Massive MIMO-OTFS Networks

This paper focuses on a new path division multiple access (PDMA) for both uplink (UL) and downlink (DL) massive multiple-input multiple-output networks over a high mobility scenario, where the orthogonal time frequency space (OTFS) is adopted. First, the 3D UL channel model and the received signal model in the angle-delay-Doppler domain are studied. Secondly, the 3D-Newtonized orthogonal matching pursuit algorithm is utilized for the extraction of the UL channel parameters, including channel gains, directions of arrival, delays, and Doppler frequencies, over the antenna-time-frequency domain. Thirdly, we carefully analyze energy dispersion and power leakage of the 3D angle-delay-Doppler channels. Then, along UL, we design a path scheduling algorithm to properly assign angle-domain resources at user sides and to ensure that the observation regions for different users do not overlap over the 3D cubic area, i.e., the angle-delay-Doppler domain. After scheduling, different users can map their respective data to the scheduled delay-Doppler domain grids, and simultaneously send the data to the base station (BS) without inter-user interference in the same OTFS block. Correspondingly, the signals at desired grids within the 3D resource space of the BS are separately collected to implement the 3D channel estimation and maximal ratio combining-based data detection over the angle-delay-Doppler domain. Then, we construct a low complexity beamforming scheme over the angle-delay-Doppler domain to achieve inter-user interference free DL communication. Simulation results are provided to demonstrate the validity of our proposed unified UL/DL PDMA scheme.

preprint · 2020 · arXiv

Deep Learning Based Antenna Selection for Channel Extrapolation in FDD Massive MIMO

In massive multiple-input multiple-output (MIMO) systems, the large number of antennas would bring a great challenge for the acquisition of the accurate channel state information, especially in the frequency division duplex mode. To overcome the bottleneck of the limited number of radio links in hybrid beamforming, we utilize the neural networks (NNs) to capture the inherent connection between the uplink and downlink channel data sets and extrapolate the downlink channels from a subset of the uplink channel state information. We study the antenna subset selection problem in order to achieve the best channel extrapolation and decrease the data size of NNs. The probabilistic sampling theory is utilized to approximate the discrete antenna selection as a continuous and differentiable function, which makes the back propagation of the deep learning feasible. Then, we design the proper off-line training strategy to optimize both the antenna selection pattern and the extrapolation NNs. Finally, numerical results are presented to verify the effectiveness of our proposed massive MIMO channel extrapolation algorithm.
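
One common way to make a discrete selection differentiable is the Gumbel-softmax relaxation, sketched below. Whether the paper's probabilistic sampling takes exactly this form is not stated in the abstract, so treat the technique, the function name `relaxed_selection`, and the `temperature` parameter as assumptions for illustration.

```python
import math
import random


def relaxed_selection(logits, temperature, rng):
    """Draw a soft one-hot vector over antennas via Gumbel-softmax.

    logits: one learnable score per candidate antenna.
    temperature: smaller values push the output closer to a hard one-hot pick.
    rng: a random.Random instance, passed in so results are reproducible.
    """
    perturbed = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        perturbed.append((logit + gumbel) / temperature)
    top = max(perturbed)                       # subtract max for stability
    weights = [math.exp(p - top) for p in perturbed]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    scores = [0.2, 1.5, -0.3, 0.9]             # hypothetical antenna scores
    print(relaxed_selection(scores, 0.5, random.Random(0)))
```

The output is a smooth function of the logits for a fixed noise draw, so gradients can flow back to the selection scores during training; at inference time one would typically take the arg-max as the selected antenna.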

preprint · 2020 · arXiv

Deep Learning Optimized Sparse Antenna Activation for Reconfigurable Intelligent Surface Assisted Communication

To capture the communications gain of the massive radiating elements with low power cost, the conventional reconfigurable intelligent surface (RIS) usually works in passive mode. However, due to the cascaded channel structure and the lack of signal processing ability, it is difficult for RIS to obtain the individual channel state information and optimize the beamforming vector. In this paper, we add signal processing units for a few antennas at RIS to partially acquire the channels. To solve the crucial active antenna selection problem, we construct an active antenna selection network that utilizes the probabilistic sampling theory to select the optimal locations of these active antennas. With this active antenna selection network, we further design two deep learning (DL) based schemes, i.e., the channel extrapolation scheme and the beam searching scheme, to enable the RIS communication system. The former utilizes the selection network and a convolutional neural network to extrapolate the full channels from the partial channels received by the active RIS antennas, while the latter adopts a fully-connected neural network to achieve the direct mapping between the partial channels and the optimal beamforming vector with maximal transmission rate. Simulation results are provided to demonstrate the effectiveness of the designed DL-based schemes.

preprint · 2020 · arXiv

Differentially Private Combinatorial Cloud Auction

Cloud service providers typically provide different types of virtual machines (VMs) to cloud users with various requirements. Thanks to its effectiveness and fairness, auction has been widely applied in this heterogeneous resource allocation. Recently, several strategy-proof combinatorial cloud auction mechanisms have been proposed. However, they fail to protect the bid privacy of users from being inferred from the auction results. In this paper, we design a differentially private combinatorial cloud auction mechanism (DPCA) to address this privacy issue. Technically, we employ the exponential mechanism to compute a clearing unit price vector with a probability proportional to the corresponding revenue. We further improve the mechanism to reduce the running time while maintaining high revenues, by computing a single clearing unit price, or a subgroup of clearing unit prices at a time, resulting in the improved mechanisms DPCA-S and its generalized version DPCA-M, respectively. We theoretically prove that our mechanisms can guarantee differential privacy, approximate truthfulness and high revenue. Extensive experimental results demonstrate that DPCA can generate near-optimal revenues at the price of relatively high time complexity, while the improved mechanisms achieve a tunable trade-off between auction revenue and running time.
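
The standard exponential mechanism selects a candidate with probability proportional to $\exp(\epsilon \cdot \text{score} / (2\Delta))$, where $\Delta$ is the sensitivity of the score. The sketch below applies that textbook form to a list of candidate clearing prices scored by revenue. It is not the DPCA mechanism itself: the candidate prices, revenues, and sensitivity value are hypothetical inputs, and how the paper computes revenue for a price vector is not reproduced here.

```python
import math
import random


def exponential_mechanism_probs(revenues, epsilon, sensitivity):
    """Selection probabilities proportional to exp(epsilon * revenue / (2 * sensitivity))."""
    scale = epsilon / (2.0 * sensitivity)
    top = max(revenues)  # shift by the max so exp() cannot overflow
    weights = [math.exp(scale * (r - top)) for r in revenues]
    total = sum(weights)
    return [w / total for w in weights]


def sample_price(prices, revenues, epsilon, sensitivity, rng):
    """Draw one clearing price according to the exponential mechanism."""
    probs = exponential_mechanism_probs(revenues, epsilon, sensitivity)
    return rng.choices(prices, weights=probs, k=1)[0]


if __name__ == "__main__":
    prices = [1.0, 2.0, 3.0]          # hypothetical candidate unit prices
    revenues = [10.0, 20.0, 30.0]     # hypothetical revenue at each price
    print(exponential_mechanism_probs(revenues, epsilon=1.0, sensitivity=10.0))
    print(sample_price(prices, revenues, 1.0, 10.0, random.Random(0)))
```

A larger `epsilon` concentrates the distribution on the highest-revenue price (less privacy, more revenue), while a smaller one flattens it toward uniform.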

preprint · 2020 · arXiv

Generalized Prager-Synge Inequality and Equilibrated Error Estimators for Discontinuous Elements

The well-known Prager-Synge identity is valid in $H^1(Ω)$ and serves as a foundation for developing equilibrated a posteriori error estimators for continuous elements. In this paper, we introduce a new inequality, which may be regarded as a generalization of the Prager-Synge identity, that is valid for piecewise $H^1(Ω)$ functions for diffusion problems. The inequality is proved to be an identity in two dimensions. For nonconforming finite element approximation of arbitrary odd order, we propose a fully explicit approach that recovers an equilibrated flux in $H(div; Ω)$ through a local element-wise scheme and that recovers a gradient in $H(curl;Ω)$ through a simple averaging technique over edges. The resulting error estimator is then proved to be globally reliable and locally efficient. Moreover, the reliability and efficiency constants are independent of the jump of the diffusion coefficient regardless of its distribution.
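
For reference, the classical Prager-Synge identity can be written out for a model problem. The setting below (the Poisson equation with unit diffusion coefficient and homogeneous Dirichlet data) is an assumption chosen for brevity; the paper treats more general diffusion problems and piecewise $H^1$ functions.

```latex
% Model problem (an assumption for illustration):
%   -\Delta u = f in \Omega, with u \in H_0^1(\Omega).
% For any v \in H_0^1(\Omega) and any \sigma \in H(\mathrm{div};\Omega)
% satisfying the equilibrium condition -\nabla\cdot\sigma = f:
\[
  \|\nabla (u - v)\|_{0}^{2} + \|\nabla u - \sigma\|_{0}^{2}
  = \|\nabla v - \sigma\|_{0}^{2}.
\]
% The identity holds because the cross term vanishes: by the weak form of the
% problem and integration by parts (u - v has zero trace),
\[
  (\nabla (u - v),\, \nabla u - \sigma)
  = (f,\, u - v) + (\nabla\cdot\sigma,\, u - v)
  = 0 .
\]
```

Since the right-hand side is computable from $v$ and $\sigma$ alone, it gives a guaranteed upper bound on the energy error of a conforming approximation $v$, which is what makes equilibrated flux recovery attractive.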

preprint · 2020 · arXiv

Graph Neural Network based Channel Tracking for Massive MIMO Networks

In this paper, we resort to the graph neural network (GNN) and propose a new channel tracking method for massive multiple-input multiple-output networks under the high mobility scenario. We first utilize a small number of pilots to achieve the initial channel estimation. Then, we represent the obtained channel data in the form of graphs and describe the channel spatial correlation by the weights along the edges of the graph. Furthermore, we introduce the computation steps of the main unit for the GNN and design a GNN-based channel tracking framework, which includes an encoder, a core network and a decoder. Simulation results corroborate that our proposed GNN-based scheme can achieve better performance than approaches based on feedforward neural networks.

preprint · 2020 · arXiv

Primal-Dual Reduced Basis Methods for Convex Minimization Variational Problems: Robust True Solution A Posteriori Error Certification and Adaptive Greedy Algorithms

In this paper, with the parametric symmetric coercive elliptic boundary value problem as an example of the primal-dual variational problems satisfying the strong duality, we develop primal-dual reduced basis methods (PD-RBM) with robust true error certifications and discuss three versions of greedy algorithms to balance the finite element error, the exact reduced basis error, and the adaptive mesh refinements. For a class of convex minimization variational problems that have corresponding dual problems satisfying the strong duality, the primal-dual gap between the primal and dual functionals can be used as an a posteriori error estimator. This primal-dual gap error estimator is robust with respect to the parameters of the problem, and it can be used for both mesh refinements of finite element methods and the true RB error certification. With the help of the integration by parts formula, the primal-dual variational theory is developed for the symmetric coercive elliptic boundary value problems with non-homogeneous boundary conditions by both the conjugate function and Lagrangian theories. A generalized Prager-Synge identity, which is the primal-dual gap error representation for this specific problem, is developed. RBMs for both the primal and dual problems with robust error estimates are developed. The dual variational problem can often be viewed as a constrained optimization problem. In the paper, different from the standard saddle-point finite element approximation, the dual RBM is treated as a Galerkin projection by constructing RB spaces satisfying the homogeneous constraint. Inspired by the greedy algorithm with spatio-parameter adaptivity of [Yano:18], adaptive balanced greedy algorithms with primal-dual finite element and reduced basis error estimators are discussed. Numerical tests are presented to test the PD-RBM with adaptive balanced greedy algorithms.
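
The reason the primal-dual gap works as an error estimator can be stated in two lines. The notation below is an assumption introduced for illustration: $J$ is the primal functional minimized by $u$, $J^{*}$ is the dual functional maximized by $\sigma$, and strong duality means $J(u) = J^{*}(\sigma)$.

```latex
% For any admissible primal approximation v and dual approximation \tau,
% strong duality J(u) = J^{*}(\sigma) lets the gap be split into two parts:
\[
  J(v) - J^{*}(\tau)
  = \bigl( J(v) - J(u) \bigr) + \bigl( J^{*}(\sigma) - J^{*}(\tau) \bigr).
\]
% Both brackets are nonnegative, because u minimizes J and \sigma maximizes J^{*}.
% Hence the computable gap bounds each individual error from above:
\[
  0 \le J(v) - J(u) \le J(v) - J^{*}(\tau),
  \qquad
  0 \le J^{*}(\sigma) - J^{*}(\tau) \le J(v) - J^{*}(\tau).
\]
```

The gap involves only the computed approximations $v$ and $\tau$, with no unknown constants, which is consistent with the parameter robustness claimed in the abstract.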

preprint · 2020 · arXiv

Uplink-aided High Mobility Downlink Channel Estimation over Massive MIMO-OTFS System

Although it is often used in orthogonal frequency division multiplexing (OFDM) systems, the application of massive multiple-input multiple-output (MIMO) over the orthogonal time frequency space (OTFS) modulation could suffer from enormous training overhead in high mobility scenarios. In this paper, we propose an uplink-aided high mobility downlink channel estimation scheme for massive MIMO-OTFS networks. Specifically, we first formulate the time domain massive MIMO-OTFS signal model along the uplink and adopt the expectation maximization based variational Bayesian (EM-VB) framework to recover the uplink channel parameters, including the angle, the delay, the Doppler frequency, and the channel gain for each physical scattering path. Correspondingly, with the help of fast Bayesian inference, a low-complexity approach is constructed to overcome the bottleneck of the EM-VB. Then, we fully exploit the angle, delay and Doppler reciprocity between the uplink and the downlink and reconstruct the angles, the delays, and the Doppler frequencies for the downlink massive channels at the base station. Furthermore, we examine the downlink massive MIMO channel estimation over the delay-Doppler-angle domain. The channel dispersion of the OTFS over the delay-Doppler domain is carefully analyzed. Various numerical examples are presented to confirm the validity and robustness of the proposed scheme.