Source author record

Sheng Chen

Sheng Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

40 works

26 topics

4 close collaborators


preprint · 2026 · arXiv

Mosaic: Unlocking Long-Context Inference for Diffusion LLMs via Global Memory Planning and Dynamic Peak Taming

Diffusion-based large language models (dLLMs) have emerged as a promising paradigm, utilizing simultaneous denoising to enable global planning and iterative refinement. While these capabilities are particularly advantageous for long-context generation, deploying such models faces a prohibitive memory capacity barrier stemming from severe system inefficiencies. We identify that existing inference systems are ill-suited for this paradigm: unlike autoregressive models constrained by the cumulative KV-cache, dLLMs are bottlenecked by transient activations recomputed at every step. Furthermore, general-purpose memory reuse mechanisms lack the global visibility to adapt to dLLMs' dynamic memory peaks, which toggle between logits and FFNs. To address these mismatches, we propose Mosaic, a memory-efficient inference system that shifts from local, static management to a global, dynamic paradigm. Mosaic integrates a mask-only logits kernel to eliminate redundancy, a lazy chunking optimizer driven by an online heuristic search to adaptively mitigate dynamic peaks, and a global memory manager to resolve fragmentation via virtual addressing. Extensive evaluations demonstrate that Mosaic achieves an average 2.71$\times$ reduction in the memory peak-to-average ratio and increases the maximum inference sequence length supportable on identical hardware by 15.89-32.98$\times$. This scalability is achieved without compromising accuracy and speed, and in fact reducing latency by 4.12%-23.26%.

preprint · 2026 · arXiv

VLingNav: Embodied Navigation with Adaptive Reasoning and Visual-Assisted Linguistic Memory

VLA models have shown promising potential in embodied navigation by unifying perception and planning while inheriting the strong generalization abilities of large VLMs. However, most existing VLA models rely on reactive mappings directly from observations to actions, lacking the explicit reasoning capabilities and persistent memory required for complex, long-horizon navigation tasks. To address these challenges, we propose VLingNav, a VLA model for embodied navigation grounded in linguistic-driven cognition. First, inspired by the dual-process theory of human cognition, we introduce an adaptive chain-of-thought mechanism, which dynamically triggers explicit reasoning only when necessary, enabling the agent to fluidly switch between fast, intuitive execution and slow, deliberate planning. Second, to handle long-horizon spatial dependencies, we develop a visual-assisted linguistic memory module that constructs a persistent, cross-modal semantic memory, enabling the agent to recall past observations to prevent repetitive exploration and infer movement trends for dynamic environments. For the training recipe, we construct Nav-AdaCoT-2.9M, the largest embodied navigation dataset with reasoning annotations to date, enriched with adaptive CoT annotations that induce a reasoning paradigm capable of adjusting both when to think and what to think about. Moreover, we incorporate an online expert-guided reinforcement learning stage, enabling the model to surpass pure imitation learning and to acquire more robust, self-explored navigation behaviors. Extensive experiments demonstrate that VLingNav achieves state-of-the-art performance across a wide range of embodied navigation benchmarks. Notably, VLingNav transfers to real-world robotic platforms in a zero-shot manner, executing various navigation tasks and demonstrating strong cross-domain and cross-task generalization.

preprint · 2023 · arXiv

Joint Beamforming Design for Dual-Functional MIMO Radar and Communication Systems Guaranteeing Physical Layer Security

The dual-functional radar and communication (DFRC) technique constitutes a promising next-generation wireless solution, due to its benefits in terms of power consumption, physical hardware, and spectrum exploitation. In this paper, we propose sophisticated beamforming designs for multi-user DFRC systems by additionally taking the physical layer security (PLS) into account. We show that appropriately designed radar waveforms can also act as the traditional artificial noise conceived for drowning out the eavesdropping channel and for attaining increased design degrees of freedom (DoF). The joint beamforming design is formulated as a non-convex optimization problem for striking a compelling trade-off amongst the conflicting design objectives of radar transmit beampattern, communication quality of service (QoS), and the PLS level. Then, we propose a semidefinite relaxation (SDR)-based algorithm and a reduced-complexity version to tackle the non-convexity, where the globally optimal solutions are found. Moreover, a robust beamforming method is also developed for considering realistic imperfect channel state information (CSI) knowledge. Finally, simulation results are provided for corroborating our theoretical results and show the proposed methods' superiority.

preprint · 2022 · arXiv

A Novel DeBERTa-based Model for Financial Question Answering Task

As a rising star in the field of natural language processing, question answering systems (Q&A systems) are widely used in all walks of life. Compared with other scenarios, the application in financial scenarios has strong requirements for the traceability and interpretability of the Q&A systems. In addition, since the demand for artificial intelligence technology has gradually shifted from the initial computational intelligence to cognitive intelligence, this research mainly focuses on the financial numerical reasoning dataset FinQA. In the shared task, the objective is to generate the reasoning program and the final answer according to the given financial report containing text and tables. We use a method based on the DeBERTa pre-trained language model, with additional optimization methods on this basis, including multi-model fusion and training set combination. We finally obtain an execution accuracy of 68.99 and a program accuracy of 64.53, ranking No. 4 in the 2022 FinQA Challenge.

preprint · 2022 · arXiv

Arithmetic purity of strong approximation for complete toric varieties

In this article, we establish the arithmetic purity of strong approximation for smooth loci of weighted projective spaces. By using this result and the descent method, we also prove that the arithmetic purity of strong approximation with Brauer-Manin obstruction holds for any smooth and complete toric variety.

preprint · 2022 · arXiv

AutoFAS: Automatic Feature and Architecture Selection for Pre-Ranking System

Industrial search and recommendation systems mostly follow the classic multi-stage information retrieval paradigm: matching, pre-ranking, ranking, and re-ranking stages. To account for system efficiency, simple vector-product based models are commonly deployed in the pre-ranking stage. Recent works consider distilling the high knowledge of large ranking models to small pre-ranking models for better effectiveness. However, two major challenges in pre-ranking system still exist: (i) without explicitly modeling the performance gain versus computation cost, the predefined latency constraint in the pre-ranking stage inevitably leads to suboptimal solutions; (ii) transferring the ranking teacher's knowledge to a pre-ranking student with a predetermined handcrafted architecture still suffers from the loss of model performance. In this work, a novel framework AutoFAS is proposed which jointly optimizes the efficiency and effectiveness of the pre-ranking model: (i) AutoFAS for the first time simultaneously selects the most valuable features and network architectures using Neural Architecture Search (NAS) technique; (ii) equipped with ranking model guided reward during NAS procedure, AutoFAS can select the best pre-ranking architecture for a given ranking teacher without any computation overhead. Experimental results in our real world search system show AutoFAS consistently outperforms the previous state-of-the-art (SOTA) approaches at a lower computing cost. Notably, our model has been adopted in the pre-ranking module in the search system of Meituan, bringing significant improvements.

preprint · 2022 · arXiv

CausalMTA: Eliminating the User Confounding Bias for Causal Multi-touch Attribution

Multi-touch attribution (MTA), aiming to estimate the contribution of each advertisement touchpoint in conversion journeys, is essential for budget allocation and automated advertising. Existing methods first train a model to predict the conversion probability of the advertisement journeys with historical data and calculate the attribution of each touchpoint using counterfactual predictions. An assumption of these works is that the conversion prediction model is unbiased, i.e., it can give accurate predictions on any randomly assigned journey, including both the factual and counterfactual ones. Nevertheless, this assumption does not always hold, as the exposed advertisements are recommended according to user preferences. This confounding bias of users would lead to an out-of-distribution (OOD) problem in the counterfactual prediction and cause concept drift in attribution. In this paper, we define the causal MTA task and propose CausalMTA to eliminate the influence of user preferences. It systematically eliminates the confounding bias from both static and dynamic preferences to learn the conversion prediction model using historical data. We also provide a theoretical analysis to prove CausalMTA can learn an unbiased prediction model with sufficient data. Extensive experiments on both public datasets and the impression data in an e-commerce company show that CausalMTA not only achieves better prediction performance than the state-of-the-art method but also generates meaningful attribution credits across different advertising channels.

preprint · 2022 · arXiv

CLIP2TV: Align, Match and Distill for Video-Text Retrieval

Modern video-text retrieval frameworks basically consist of three parts: a video encoder, a text encoder and a similarity head. With the success of both visual and textual representation learning, transformer-based encoders and fusion methods have also been adopted in the field of video-text retrieval. In this report, we present CLIP2TV, aiming at exploring where the critical elements lie in transformer-based methods. To achieve this, we first revisit some recent works on multi-modal learning, then introduce some techniques into video-text retrieval, and finally evaluate them through extensive experiments in different configurations. Notably, CLIP2TV achieves 52.9@R1 on the MSR-VTT dataset, outperforming the previous SOTA result by 4.1%.

preprint · 2022 · arXiv

Contrastive Information Transfer for Pre-Ranking Systems

Real-world search and recommender systems usually adopt a multi-stage ranking architecture, including matching, pre-ranking, ranking, and re-ranking. Previous works mainly focus on the ranking stage while very few focus on the pre-ranking stage. In this paper, we focus on the information transfer from the ranking stage to the pre-ranking stage. We propose a new Contrastive Information Transfer (CIT) framework to transfer useful information from the ranking model to the pre-ranking model. We train the pre-ranking model to distinguish the positive pair of representations from a set of positive and negative pairs with a contrastive objective. As a consequence, the pre-ranking model can make full use of the rich information in the ranking model's representations. The CIT framework also has the advantage of alleviating selection bias and improving the performance of recall metrics, which is crucial for pre-ranking models. We conduct extensive experiments including offline datasets and online A/B testing. Experimental results show that CIT achieves better results than competitive models. In addition, a strict online A/B test at one of the world's largest e-commerce platforms shows that the proposed model achieves a 0.63% improvement in CTR and a 1.64% improvement in VBR. The proposed model has now been deployed online and serves the main traffic of this system, contributing remarkable business growth.
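The contrastive objective described in this abstract can be illustrated with a generic InfoNCE-style loss. This is only a sketch under assumptions: the abstract does not give the exact loss used by CIT, and the function names, the dot-product similarity and the `temperature` parameter below are illustrative choices, not taken from the paper.

```python
import math


def dot(a, b):
    """Dot-product similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (an assumed stand-in for the paper's objective).

    anchor:    pre-ranking representation of an item (hypothetical)
    positive:  ranking-model representation of the same item (hypothetical)
    negatives: ranking-model representations of other items (hypothetical)
    """
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of picking the positive pair.
    return log_denom - logits[0]


if __name__ == "__main__":
    anchor = [1.0, 0.0]
    hard = info_nce(anchor, [1.0, 0.0], [[1.0, 0.0]] * 3)  # negatives identical to the positive
    easy = info_nce(anchor, [1.0, 0.0], [[0.0, 1.0]] * 3)  # negatives orthogonal to the anchor
    print(hard, easy)
```

When every candidate is equally similar to the anchor, the loss equals the logarithm of the number of candidates; it shrinks as the positive pair becomes easier to tell apart from the negatives.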

preprint · 2022 · arXiv

Integrated Sensing and Communication with mmWave Massive MIMO: A Compressed Sampling Perspective

Integrated sensing and communication (ISAC) has opened up numerous game-changing opportunities for realizing future wireless systems. In this paper, we propose an ISAC processing framework relying on millimeter-wave (mmWave) massive multiple-input multiple-output (MIMO) systems. Specifically, we provide a compressed sampling (CS) perspective to facilitate ISAC processing, which can not only recover the high-dimensional channel state information or/and radar imaging information, but also significantly reduce pilot overhead. First, an energy-efficient widely spaced array (WSA) architecture is tailored for the radar receiver, which enhances the angular resolution of radar sensing at the cost of angular ambiguity. Then, we propose an ISAC frame structure for time-varying ISAC systems considering different timescales. The pilot waveforms are judiciously designed by taking into account both CS theories and hardware constraints induced by hybrid beamforming (HBF) architecture. Next, we design the dedicated dictionary for WSA that serves as a building block for formulating the ISAC processing as sparse signal recovery problems. The orthogonal matching pursuit with support refinement (OMP-SR) algorithm is proposed to effectively solve the problems in the existence of the angular ambiguity. We also provide a framework for estimating the Doppler frequencies during payload data transmission to guarantee communication performances. Simulation results demonstrate the good performances of both communications and radar sensing under the proposed ISAC framework.

preprint · 2022 · arXiv

KSSOLV 2.0: An efficient MATLAB toolbox for solving the Kohn-Sham equations with plane-wave basis set

KSSOLV (Kohn-Sham Solver) is a MATLAB toolbox for performing Kohn-Sham density functional theory (DFT) calculations with a plane-wave basis set. KSSOLV 2.0 preserves the design features of the original KSSOLV software to allow users and developers to easily set up a problem and perform ground-state calculations as well as to prototype and test new algorithms. Furthermore, it includes new functionalities such as new iterative diagonalization algorithms, k-point sampling for electron band structures, geometry optimization and advanced algorithms for performing DFT calculations with local, semi-local, and hybrid exchange-correlation functionals. It can be used to study the electronic structures of both molecules and solids. We describe these new capabilities in this work through a few use cases. We also demonstrate the numerical accuracy and computational efficiency of KSSOLV on a variety of examples.

preprint · 2022 · arXiv

Mobility Support for Millimeter Wave Communications: Opportunities and Challenges

Millimeter-wave (mmWave) communication technology offers a potential and promising solution to support 5G and B5G wireless networks in dynamic scenarios and applications. However, mobility introduces many challenges as well as opportunities to mmWave applications. To address these problems, we conduct a survey of the opportunities and technologies to support mmWave communications in mobile scenarios. Firstly, we summarize the mobile scenarios where mmWave communications are exploited, including indoor wireless local area network (WLAN) or wireless personal area network (WPAN), cellular access, vehicle-to-everything (V2X), high speed train (HST), unmanned aerial vehicle (UAV), and the new space-air-ground-sea communication scenarios. Then, to address users' mobility impact on the system performance in different application scenarios, we introduce several representative mobility models in mmWave systems, including human mobility, vehicular mobility, high speed train mobility and ship mobility. Next we survey the key challenges and existing solutions to mmWave applications, such as channel modeling, channel estimation, anti-blockage, and capacity improvement. Lastly, we discuss the open issues concerning mobility-aware mmWave communications that deserve further investigation. In particular, we highlight future heterogeneous mobile networks, dynamic resource management, artificial intelligence (AI) for mobility and integration of geographical information, deployment of large intelligent surface and reconfigurable antenna technology, and finally, the evolution to Terahertz (THz) communications.

preprint · 2022 · arXiv

Multiple-Objective Packet Routing Optimization for Aeronautical ad-hoc Networks

Providing Internet service above the clouds is of ever-increasing interest and in this context aeronautical ad-hoc networking (AANET) constitutes a promising solution. However, the optimization of packet routing in large ad hoc networks is quite challenging. In this paper, we develop a discrete $ε$ multi-objective genetic algorithm ($ε$-DMOGA) for jointly optimizing the end-to-end latency, the end-to-end spectral efficiency (SE), and the path expiration time (PET) that specifies how long the routing path can be relied on without re-optimizing the path. More specifically, a distance-based adaptive coding and modulation (ACM) scheme specifically designed for aeronautical communications is exploited for quantifying each link's achievable SE. Furthermore, the queueing delay at each node is also incorporated into the multiple-objective optimization metric. Our $ε$-DMOGA assisted multiple-objective routing optimization is validated by real historical flight data collected over the Australian airspace on two selected representative dates.

preprint · 2022 · arXiv

Sampling Is All You Need on Modeling Long-Term User Behaviors for CTR Prediction

Rich user behavior data has been proven to be of great value for Click-Through Rate (CTR) prediction applications, especially in industrial recommender, search, or advertising systems. However, it's non-trivial for real-world systems to make full use of long-term user behaviors due to the strict requirements of online serving time. Most previous works adopt the retrieval-based strategy, where a small number of user behaviors are retrieved first for subsequent attention. However, the retrieval-based methods are sub-optimal and would cause more or less information losses, and it's difficult to balance the effectiveness and efficiency of the retrieval algorithm. In this paper, we propose SDIM (Sampling-based Deep Interest Modeling), a simple yet effective sampling-based end-to-end approach for modeling long-term user behaviors. We sample from multiple hash functions to generate hash signatures of the candidate item and each item in the user behavior sequence, and obtain the user interest by directly gathering behavior items associated with the candidate item with the same hash signature. We show theoretically and experimentally that the proposed method performs on par with standard attention-based models on modeling long-term user behaviors, while being sizable times faster. We also introduce the deployment of SDIM in our system. Specifically, we decouple the behavior sequence hashing, which is the most time-consuming part, from the CTR model by designing a separate module named BSE (behavior Sequence Encoding). BSE is latency-free for the CTR server, enabling us to model extremely long user behaviors. Both offline and online experiments are conducted to demonstrate the effectiveness of SDIM. SDIM now has been deployed online in the search system of Meituan APP.
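The hash-and-gather step described in this abstract can be sketched as follows. This is a simplified illustration under assumptions, not the deployed SDIM: it uses sign-of-random-projection (SimHash-style) signatures, averages the matched behavior embeddings, and every name and parameter (`make_hyperplanes`, `num_groups`, `width`) is hypothetical.

```python
import random


def make_hyperplanes(num_groups, width, dim, seed=0):
    """Sample `num_groups` independent hash functions, each built from `width` random hyperplanes."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(width)]
            for _ in range(num_groups)]


def signature(vec, planes):
    """Hash signature: one sign bit per random hyperplane."""
    return tuple(sum(p_i * v_i for p_i, v_i in zip(p, vec)) >= 0.0 for p in planes)


def sampled_interest(candidate, behaviors, hyperplanes):
    """Gather behavior items sharing the candidate's signature, averaged over hash functions."""
    dim = len(candidate)
    total = [0.0] * dim
    for planes in hyperplanes:
        sig = signature(candidate, planes)
        matched = [b for b in behaviors if signature(b, planes) == sig]
        if matched:
            for d in range(dim):
                # Mean pooling of matched items is an assumption of this sketch.
                total[d] += sum(b[d] for b in matched) / len(matched)
    return [t / len(hyperplanes) for t in total]


if __name__ == "__main__":
    planes = make_hyperplanes(num_groups=4, width=3, dim=2)
    # The first behavior points the same way as the candidate and always collides;
    # the second points the opposite way and never does.
    print(sampled_interest([1.0, 0.0], [[2.0, 0.0], [-1.0, 0.0]], planes))
```

Because the signatures of the behavior sequence do not depend on the candidate item, they can be computed ahead of time, which is consistent with the decoupled encoding module the abstract mentions.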

preprint · 2021 · arXiv

Structure Parameter Optimized Kernel Based Online Prediction with a Generalized Optimization Strategy for Nonstationary Time Series

In this paper, sparsification techniques aided online prediction algorithms in a reproducing kernel Hilbert space are studied for nonstationary time series. The online prediction algorithms as usual consist of the selection of kernel structure parameters and the kernel weight vector updating. For structure parameters, the kernel dictionary is selected by some sparsification techniques with online selective modeling criteria, and moreover the kernel covariance matrix is intermittently optimized in the light of the covariance matrix adaptation evolution strategy (CMA-ES). Optimizing the real symmetric covariance matrix can not only improve the kernel structure's flexibility by the cross relatedness of the input variables, but also partly alleviate the prediction uncertainty caused by the kernel dictionary selection for nonstationary time series. In order to sufficiently capture the underlying dynamic characteristics in prediction-error time series, a generalized optimization strategy is designed to construct the kernel dictionary sequentially in multiple kernel connection modes. The generalized optimization strategy provides a more self-contained way to construct the entire kernel connections, which enhances the ability to adaptively track the changing dynamic characteristics. Numerical simulations have demonstrated that the proposed approach has superior prediction performance for nonstationary time series.

preprint · 2020 · arXiv

Compressive Sensing Based Massive Access for IoT Relying on Media Modulation Aided Machine Type Communications

A fundamental challenge of the large-scale Internet-of-Things lies in how to support massive machine-type communications (mMTC). This letter proposes a media modulation based mMTC solution for increasing the throughput, where a massive multi-input multi-output based base station (BS) is used for enhancing the detection performance. For such a mMTC scenario, the reliable active device detection and data decoding pose a serious challenge. By leveraging the sparsity of the uplink access signals of mMTC received at the BS, a compressive sensing based massive access solution is proposed for tackling this challenge. Specifically, we propose a block sparsity adaptive matching pursuit algorithm for detecting the active devices, whereby the block-sparsity of the uplink access signals exhibited across the successive time slots and the structured sparsity of media modulated symbols are exploited for enhancing the detection performance. Moreover, a successive interference cancellation based structured subspace pursuit algorithm is conceived for data demodulation of the active devices, whereby the structured sparsity of media modulation based symbols found in each time slot is exploited for improving the detection performance. Finally, our simulation results verify the superiority of the proposed scheme over state-of-the-art solutions.

preprint · 2020 · arXiv

Hybrid Transceiver Optimization for Multi-Hop Communications

Multi-hop communication with the aid of large-scale antenna arrays will play a vital role in future emergence communication systems. In this paper, we investigate amplify-and-forward based and multiple-input multiple-output assisted multi-hop communication, in which all nodes employ hybrid transceivers. Moreover, channel errors are taken into account in our hybrid transceiver design. Based on the matrix-monotonic optimization framework, the optimal structures of the robust hybrid transceivers are derived. By utilizing these optimal structures, the optimizations of analog transceivers and digital transceivers can be separated without loss of optimality. This fact greatly simplifies the joint optimization of analog and digital transceivers. Since the optimization of analog transceivers under unit-modulus constraints is non-convex, a projection type algorithm is proposed for analog transceiver optimization to overcome this difficulty. Based on the derived analog transceivers, the optimal digital transceivers can then be derived using matrix-monotonic optimization. The numerical results obtained demonstrate the performance advantages of the proposed hybrid transceiver designs over other existing solutions.

preprint · 2020 · arXiv

Log orthogonal functions: approximation properties and applications

We present two new classes of orthogonal functions, log orthogonal functions (LOFs) and generalized log orthogonal functions (GLOFs), which are constructed by applying a $\log$ mapping to Laguerre polynomials. We develop basic approximation theory for these new orthogonal functions and apply them to solve several typical fractional differential equations whose solutions exhibit weak singularities. Our error analysis and numerical results show that our methods based on the new orthogonal functions are particularly suitable for functions which have weak singularities at one endpoint, and can lead to exponential convergence rate, as opposed to low algebraic rates if usual orthogonal polynomials are used.
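As a worked illustration of why a $\log$ mapping of Laguerre polynomials yields orthogonal functions, consider the simplest unweighted case. This is a sketch under assumptions: the paper's actual LOF and GLOF definitions may carry extra parameters and weights that the abstract does not state. Let $L_n$ denote the standard Laguerre polynomials, which are orthonormal on $(0,\infty)$ with weight $e^{-y}$. Substituting $y=-\log x$, so that $x=e^{-y}$ and $dx=-e^{-y}\,dy$, maps $(0,1)$ onto $(0,\infty)$ and gives

$$
\int_0^1 L_m(-\log x)\,L_n(-\log x)\,dx
= \int_0^\infty L_m(y)\,L_n(y)\,e^{-y}\,dy
= \delta_{mn}.
$$

The functions $L_n(-\log x)$ are therefore orthonormal on $(0,1)$, and each is a polynomial in $\log x$, so they carry a logarithmic singularity at the endpoint $x=0$.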

preprint · 2020 · arXiv

Matrix-Monotonic Optimization Part II: Multi-Variable Optimization

In contrast to Part I of this treatise [1] that focuses on the optimization problems associated with single matrix variables, in this paper, we investigate the application of the matrix-monotonic optimization framework in the optimization problems associated with multiple matrix variables. It is revealed that matrix-monotonic optimization still works even for multiple matrix-variate based optimization problems, provided that certain conditions are satisfied. Using this framework, the optimal structures of the matrix variables can be derived and the associated multiple matrix-variate optimization problems can be substantially simplified. In this paper, several specific examples are given, which are essentially open problems. Firstly, we investigate multi-user multiple-input multiple-output (MU-MIMO) uplink communications under various power constraints. Using the proposed framework, the optimal structures of the precoding matrices at each user under various power constraints can be derived. Secondly, we consider the optimization of the signal compression matrices at each sensor under various power constraints in distributed sensor networks. Finally, we investigate the transceiver optimization for multi-hop amplify-and-forward (AF) MIMO relaying networks with imperfect channel state information (CSI) under various power constraints. At the end of this paper, several simulation results are given to demonstrate the accuracy of the proposed theoretical results.

preprint · 2020 · arXiv

Quantum Criticism: A Tagged News Corpus Analysed for Sentiment and Named Entities

In this research, we continuously collect data from the RSS feeds of traditional news sources. We apply several pre-trained implementations of named entity recognition (NER) tools, quantifying the success of each implementation. We also perform sentiment analysis of each news article at the document, paragraph and sentence level, with the goal of creating a corpus of tagged news articles that is made available to the public through a web interface. Finally, we show how the data in this corpus could be used to identify bias in news reporting.

preprint · 2020 · arXiv

Research on a New Convolutional Neural Network Model Combined with Random Edges Adding

Improving the accuracy of convolutional neural network models and speeding up their convergence is a persistent and difficult problem. Based on the idea of small-world networks, a random edge adding algorithm is proposed to improve the performance of convolutional neural network models. This algorithm takes the convolutional neural network model as a benchmark, and randomizes backwards and cross-layer connections with probability p to form a new convolutional neural network model. The proposed idea can optimize the cross-layer connectivity by changing the topological structure of the convolutional neural network, and provides a new idea for the improvement of the model. The simulation results based on the Fashion-MNIST and CIFAR-10 data sets show that the model recognition accuracy and training convergence speed are greatly improved by random edge adding reconstructed models with a probability p = 0.1.

preprint · 2020 · arXiv

Towards Playing Full MOBA Games with Deep Reinforcement Learning

MOBA games, e.g., Honor of Kings, League of Legends, and Dota 2, pose grand challenges to AI systems such as multi-agent, enormous state-action space, complex action control, etc. Developing AI for playing MOBA games has raised much attention accordingly. However, existing work falls short in handling the raw game complexity caused by the explosion of agent combinations, i.e., lineups, when expanding the hero pool in case that OpenAI's Dota AI limits the play to a pool of only 17 heroes. As a result, full MOBA games without restrictions are far from being mastered by any existing AI system. In this paper, we propose a MOBA AI learning paradigm that methodologically enables playing full MOBA games with deep reinforcement learning. Specifically, we develop a combination of novel and existing learning techniques, including curriculum self-play learning, policy distillation, off-policy adaption, multi-head value estimation, and Monte-Carlo tree-search, in training and playing a large pool of heroes, meanwhile addressing the scalability issue skillfully. Tested on Honor of Kings, a popular MOBA game, we show how to build superhuman AI agents that can defeat top esports players. The superiority of our AI is demonstrated by the first large-scale performance test of MOBA AI agent in the literature.

preprint · 2018 · arXiv

On non-feasible edge sets in matching-covered graphs

Let $G=(V,E)$ be a matching-covered graph and $X$ be an edge set of $G$. $X$ is said to be feasible if there exist two perfect matchings $M_1$ and $M_2$ in $G$ such that $|M_1\cap X|\not \equiv|M_2\cap X|\ (\mbox{mod } 2)$. For any $V_0\subseteq V$, $X$ is said to be switching-equivalent to $X\oplus \nabla_G(V_0)$, where $\nabla_G(V_0)$ is the set of edges in $G$ each of which has exactly one end in $V_0$ and $A \oplus B$ is the symmetric difference of two sets $A$ and $B$. Lukot'ka and Rollová showed that when $G$ is regular and bipartite, $X$ is non-feasible if and only if $X$ is switching-equivalent to $\emptyset$. This article extends Lukot'ka and Rollová's result by showing that this conclusion holds as long as $G$ is matching-covered and bipartite. This article also studies matching-covered graphs $G$ whose non-feasible edge sets are switching-equivalent to $\emptyset$ or $E$ and partially characterizes these matching-covered graphs in terms of their ear decompositions. Another aim of this article is to construct infinitely many $r$-connected and $r$-regular graphs of class 1 containing non-feasible edge sets not switching-equivalent to either $\emptyset$ or $E$ for an arbitrary integer $r$ with $r\ge 3$, which provides negative answers to problems asked by Lukot'ka and Rollová and by He et al., respectively.
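The definitions of feasibility and of the cut $\nabla_G(V_0)$ can be checked by brute force on a small graph. The sketch below is illustrative only: the helper names are hypothetical, the enumeration is exponential, and the 4-cycle is chosen simply because it is bipartite and matching-covered.

```python
def perfect_matchings(vertices, edges):
    """Yield every perfect matching as a frozenset of edges (brute force)."""
    vertices = sorted(vertices)
    if not vertices:
        yield frozenset()
        return
    v = vertices[0]
    for e in edges:
        if v in e:
            (u,) = e - {v}  # the other endpoint of edge e
            rest = [w for w in vertices if w not in e]
            rest_edges = [f for f in edges if v not in f and u not in f]
            for m in perfect_matchings(rest, rest_edges):
                yield m | {e}


def is_feasible(vertices, edges, X):
    """X is feasible if two perfect matchings meet X in different parities."""
    parities = {len(m & X) % 2 for m in perfect_matchings(vertices, edges)}
    return len(parities) == 2


def cut(edges, V0):
    """Edges with exactly one end in V0, i.e. the set written nabla_G(V0) above."""
    return frozenset(e for e in edges if len(e & V0) == 1)


if __name__ == "__main__":
    V = [0, 1, 2, 3]
    E = [frozenset(p) for p in [(0, 1), (1, 2), (2, 3), (3, 0)]]  # the 4-cycle
    print(is_feasible(V, E, cut(E, frozenset({0}))))        # a cut: switching-equivalent to the empty set
    print(is_feasible(V, E, frozenset({frozenset({0, 1})})))  # a single edge
```

On the 4-cycle, the cut around vertex 0 meets both perfect matchings in exactly one edge, so it is non-feasible, while the single edge $\{0,1\}$ lies in one perfect matching and not the other, so it is feasible. This agrees with the bipartite characterization stated in the abstract.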

preprint · 2016 · arXiv

Alternating Estimation for Structured High-Dimensional Multi-Response Models

We consider learning high-dimensional multi-response linear models with structured parameters. By exploiting the noise correlations among responses, we propose an alternating estimation (AltEst) procedure to estimate the model parameters based on the generalized Dantzig selector. Under suitable sample size and resampling assumptions, we show that the error of the estimates generated by AltEst, with high probability, converges linearly to certain minimum achievable level, which can be tersely expressed by a few geometric measures, such as Gaussian width of sets related to the parameter structure. To the best of our knowledge, this is the first non-asymptotic statistical guarantee for such AltEst-type algorithm applied to estimation problem with general structures.

preprint2016arXiv

Compressive Sensing Based Multi-User Detector for the Large-Scale SM-MIMO Uplink

Conventional spatial modulation (SM) is typically considered for transmission in the downlink of small-scale MIMO systems, where a single one of a set of antenna elements (AEs) is activated for implicitly conveying extra bits. By contrast, inspired by the compelling benefits of large-scale MIMO (LS-MIMO) systems, here we propose an LS-SM-MIMO scheme for the uplink (UL), where each user having multiple AEs but only a single radio frequency (RF) chain invokes SM for increasing the UL-throughput. At the same time, by relying on hundreds of AEs but a small number of RF chains, the base station (BS) can simultaneously serve multiple users whilst reducing the power consumption. Due to the large number of AEs of the UL-users and the comparably small number of RF chains at the BS, the UL multi-user signal detection becomes a challenging large-scale under-determined problem. To solve this problem, we propose a joint SM transmission scheme and a carefully designed structured compressive sensing (SCS)-based multi-user detector (MUD) to be used at the users and BS, respectively. Additionally, the cyclic-prefix single-carrier (CPSC) is used to combat the multipath channels, and a simple receive AE selection is used for improved performance over correlated Rayleigh-fading MIMO channels. We demonstrate that the aggregate SM signal consisting of SM signals of multiple UL-users in one CPSC block exhibits distributed sparsity. Moreover, due to the joint SM transmission scheme, aggregate SM signals in the same transmission group exhibit group sparsity. By exploiting these intrinsically sparse features, the proposed SCS-based MUD can reliably detect the resultant SM signals with low complexity. Simulation results demonstrate that the proposed SCS-based MUD achieves a better signal detection performance than its counterparts even with higher UL-throughput.

preprint2016arXiv

Non-equilibrium relaxation in a stochastic lattice Lotka-Volterra model

We employ Monte Carlo simulations to study a stochastic Lotka-Volterra model on a two-dimensional square lattice with periodic boundary conditions. If the (local) prey carrying capacity is finite, there exists an extinction threshold for the predator population that separates a stable active two-species coexistence phase from an inactive state wherein only prey survive. Holding all other rates fixed, we investigate the non-equilibrium relaxation of the predator density in the vicinity of the critical predation rate. As expected, we observe critical slowing-down, i.e., a power law dependence of the relaxation time on the predation rate, and algebraic decay of the predator density at the extinction critical point. The numerically determined critical exponents are in accord with the established values of the directed percolation universality class. Following a sudden predation rate change to its critical value, one finds critical aging for the predator density autocorrelation function that is also governed by universal scaling exponents. This aging scaling signature of the active-to-absorbing state phase transition emerges at significantly earlier times than the stationary critical power laws, and could thus serve as an advanced indicator of the (predator) population's proximity to its extinction threshold.

preprint2016arXiv

Structured Matrix Recovery via the Generalized Dantzig Selector

In recent years, structured matrix recovery problems have gained considerable attention for their real-world applications, such as recommender systems and computer vision. Much of the existing work has focused on matrices with low-rank structure, and limited progress has been made on matrices with other types of structure. In this paper we present non-asymptotic analysis for estimation of generally structured matrices via the generalized Dantzig selector under generic sub-Gaussian measurements. We show that the estimation error can always be succinctly expressed in terms of a few geometric measures of suitable sets which only depend on the structure of the underlying true matrix. In addition, we derive the general bounds on these geometric measures for structures characterized by unitarily invariant norms, which is a large family covering most matrix norms of practical interest. Examples are provided to illustrate the utility of our theoretical development.

preprint2015arXiv

A Survey of directed graphs invariants

In this paper, various kinds of invariants of directed graphs are summarized. In the first topic, the invariant w(G) for a directed graph G is introduced, which is primarily defined by S. Chen and X.M. Chen to solve a problem of weak connectedness of the tensor product of two directed graphs. Further, we present our recent studies on the invariant w(G) from a categorical view. In the second topic, homology theory on directed graphs is introduced, and we also cast the definition in a categorical view. The third topic mainly focuses on Laplacians on graphs, including traditional work and the latest results on the 1-Laplacian by K.C. Chang. Finally, zeta functions and graded graphs are introduced, including Bratteli-Vershik diagrams, dual graded graphs and differential posets, with some applications in dynamical systems.

preprint2015arXiv

Computer simulation of random loose packings of micro-particles in presence of adhesion and friction

With a novel 3D discrete-element method specially developed with adhesive contact mechanics, random loose packings of uniform spherical micron-sized particles are fully investigated. The results show that large velocity, large size or weak adhesion can produce a relatively dense packing when other parameters are fixed, and these combined effects can be characterized by a dimensionless adhesion parameter ($Ad=\omega/(2\rho_p U_0^2 R)$). Four regimes are identified based on the value of $Ad$: RCP regime with $Ad<\sim 0.01$; RLP regime with $\sim 0.01<Ad<1$; adhesion regime with $1<Ad<20$ and an asymptotic regime with $Ad>20$. Force distribution of these adhesive loose packings follows $P(f)\sim f^{\theta}$ for small forces and $P(f)\sim e^{-\beta f}$ for big forces, respectively, which shares a similar form with that in packings without adhesion but results in distinct exponents of $\theta=0.879$, $\beta=0.839$. A local mechanical equilibrium analysis shows that adhesion enhances both sliding and rolling resistance so that fewer neighbours are needed to satisfy the force and torque balance.
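
The four regimes quoted in this abstract amount to a threshold lookup on $Ad$. The sketch below is an illustration only, not the authors' code. It reads the formula as $Ad=\omega/(2\rho_p U_0^2 R)$, and the function names, the argument names and the treatment of values exactly on a boundary are assumptions, since the abstract gives the thresholds only approximately and does not define the symbols.

```python
def adhesion_parameter(omega, rho_p, u0, radius):
    """Dimensionless Ad = omega / (2 * rho_p * u0**2 * radius).

    Argument names mirror the symbols in the abstract; their physical
    meaning and units are not stated there and are assumed consistent.
    """
    return omega / (2.0 * rho_p * u0 ** 2 * radius)

def packing_regime(ad):
    """Map Ad to the regime named in the abstract.

    Boundary values (0.01, 1, 20) are assigned to the upper regime here;
    that tie-breaking rule is a choice made for this sketch.
    """
    if ad < 0.01:
        return "RCP"
    if ad < 1.0:
        return "RLP"
    if ad < 20.0:
        return "adhesion"
    return "asymptotic"
```

For instance, `packing_regime(adhesion_parameter(2.0, 1.0, 1.0, 1.0))` evaluates $Ad=1$ and, under the tie-breaking rule above, reports the adhesion regime.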

preprint2015arXiv

Effect of long-range repulsive Coulomb interactions on packing structure of adhesive particles

The packing of charged micron-sized particles was investigated using discrete element simulations based on an adhesive contact dynamic model. The formation process and the final obtained structures of ballistic packings are studied to show the effect of interparticle Coulomb force. It was found that increasing the charge on particles causes a remarkable decrease of the packing volume fraction ϕ and the average coordination number Z, indicating a looser and chainlike structure. Force-scaling analysis shows that the long-range Coulomb interaction changes packing structures through its influence on particle inertia before they are bonded into the force networks. Once contact networks are formed, the expansion effect caused by repulsive Coulomb forces is dominated by short-range adhesion. Based on abundant results from simulations, a dimensionless adhesion parameter Ad*, which combines the effects of the particle inertia, the short-range adhesion and the long-range Coulomb interaction, is proposed and successfully scales the packing results for micron-sized particles within the recently derived adhesive loose packing (ALP) regime. The structural properties of our packings follow well the recent theoretical prediction which is described by an ensemble approach based on a coarse-grained volume function, indicating some kind of universality in the low packing density regime of the phase diagram regardless of adhesion or particle charge. Based on the comprehensive consideration of the complicated inter-particle interactions, our findings provide insight into the roles of short-range adhesion and repulsive Coulomb force during packing formation and should be useful for further design of packings.

preprint2015arXiv

Estimation with Norm Regularization

Analysis of non-asymptotic estimation error and structured statistical recovery based on norm regularized regression, such as Lasso, needs to consider four aspects: the norm, the loss function, the design matrix, and the noise model. This paper presents generalizations of such estimation error analysis on all four aspects compared to the existing literature. We characterize the restricted error set where the estimation error vector lies, establish relations between error sets for the constrained and regularized problems, and present an estimation error bound applicable to any norm. Precise characterizations of the bound are presented for isotropic as well as anisotropic subGaussian design matrices, subGaussian noise models, and convex loss functions, including least squares and generalized linear models. Generic chaining and associated results play an important role in the analysis. A key result from the analysis is that the sample complexity of all such estimators depends on the Gaussian width of a spherical cap corresponding to the restricted error set. Further, once the number of samples $n$ crosses the required sample complexity, the estimation error decreases as $\frac{c}{\sqrt{n}}$, where $c$ depends on the Gaussian width of the unit norm ball.

preprint2015arXiv

Generalized Dantzig Selector: Application to the k-support norm

We propose a Generalized Dantzig Selector (GDS) for linear models, in which any norm encoding the parameter structure can be leveraged for estimation. We investigate both computational and statistical aspects of the GDS. Based on the conjugate proximal operator, a flexible inexact ADMM framework is designed for solving GDS, and non-asymptotic high-probability bounds are established on the estimation error, which rely on the Gaussian widths of the unit norm ball and of a suitable set encompassing the estimation error. Further, we consider a non-trivial example of the GDS using the $k$-support norm. We derive an efficient method to compute the proximal operator for the $k$-support norm since existing methods are inapplicable in this setting. For statistical analysis, we provide upper bounds for the Gaussian widths needed in the GDS analysis, yielding the first statistical recovery guarantee for estimation with the $k$-support norm. The experimental results confirm our theoretical analysis.

preprint2015arXiv

l1-norm Penalized Orthogonal Forward Regression

An l1-norm penalized orthogonal forward regression (l1-POFR) algorithm is proposed based on the concept of leave-one-out mean square error (LOOMSE). Firstly, a new l1-norm penalized cost function is defined in the constructed orthogonal space, and each orthogonal basis is associated with an individually tunable regularization parameter. Secondly, due to orthogonal computation, the LOOMSE can be analytically computed without actually splitting the data set, and moreover a closed form of the optimal regularization parameter in terms of minimal LOOMSE is derived. Thirdly, a lower bound for regularization parameters is proposed, which can be used for robust LOOMSE estimation by adaptively detecting and removing regressors to an inactive set so that the computational cost of the algorithm is significantly reduced. Illustrative examples are included to demonstrate the effectiveness of this new l1-POFR approach.

preprint2015arXiv

Priori-Information Aided Iterative Hard Threshold: A Low-Complexity High-Accuracy Compressive Sensing Based Channel Estimation for TDS-OFDM

This paper develops a low-complexity channel estimation (CE) scheme based on compressive sensing (CS) for time-domain synchronous (TDS) orthogonal frequency-division multiplexing (OFDM) to overcome the performance loss under doubly selective fading channels. Specifically, an overlap-add method of the time-domain training sequence is first proposed to obtain the coarse estimates of the channel length, path delays and path gains of the wireless channel, by exploiting the channel's temporal correlation to improve the robustness of the coarse CE under the severe fading channel with long delay spread. We then propose the priori-information aided (PA) iterative hard threshold (IHT) algorithm, which utilizes the priori information of the acquired coarse estimate for the wireless channel and therefore is capable of obtaining an accurate channel estimate of the doubly selective fading channel. Compared with the classical IHT algorithm, whose convergence requires the $l_2$ norm of the measurement matrix to be less than 1, the proposed PA-IHT algorithm exploits the priori information acquired to remove such a limitation as well as to reduce the number of required iterations. Compared with the existing CS-based CE method for TDS-OFDM, the proposed PA-IHT algorithm significantly reduces the computational complexity of CE as well as enhances the CE accuracy. Simulation results demonstrate that, without sacrificing spectral efficiency and changing the current TDS-OFDM signal structure, the proposed scheme performs better than the existing CE schemes for TDS-OFDM in various scenarios, especially under severely doubly selective fading channels.

preprint2015arXiv

Soft Pilot Reuse and Multi-Cell Block Diagonalization Precoding for Massive MIMO Systems

The users at the cell edge of a massive multiple-input multiple-output (MIMO) system suffer from severe pilot contamination, which leads to poor quality of service (QoS). In order to enhance the QoS for these edge users, soft pilot reuse (SPR) combined with multi-cell block diagonalization (MBD) precoding is proposed. Specifically, the users are divided into two groups according to their large-scale fading coefficients, referred to as the center users, who only suffer from modest pilot contamination, and the edge users, who suffer from severe pilot contamination. Based on this distinction, the SPR scheme is proposed for improving the QoS for the edge users, whereby a cell-center pilot group is reused for all cell-center users in all cells, while a cell-edge pilot group is applied for the edge users in the adjacent cells. By extending the classical block diagonalization precoding to a multi-cell scenario, the MBD precoding scheme projects the downlink transmit signal onto the null space of the subspace spanned by the inter-cell channels of the edge users in adjacent cells. Thus, the inter-cell interference contaminating the edge users' signals in the adjacent cells can be efficiently mitigated and hence the QoS of these edge users can be further enhanced. Our theoretical analysis and simulation results demonstrate that both the uplink and downlink rates of the edge users are significantly improved, albeit at the cost of a slightly decreased rate for center users.

preprint2015arXiv

Super Hom-Gel'fand-Dorfman bialgebras and Hom-Lie conformal superalgebras

The purpose of this paper is to introduce and study super Hom-Gel'fand-Dorfman bialgebras and Hom-Lie conformal superalgebras. In this paper, we provide different ways for constructing super Hom-Gel'fand-Dorfman bialgebras and obtain some infinite-dimensional Hom-Lie superalgebras from affinization of super Hom-Gel'fand-Dorfman bialgebras. Also, we give a general construction of Hom-Lie conformal superalgebras from Hom-Lie superalgebras and establish equivalence of quadratic Hom-Lie conformal superalgebras and super Hom-Gel'fand-Dorfman bialgebras. Finally, we characterize one-dimensional central extensions of quadratic Hom-Lie conformal superalgebras by using certain bilinear forms of super Hom-Gel'fand-Dorfman bialgebras.

preprint2014arXiv

Cross-Layer Software-Defined 5G Network

In the past few decades, the world has witnessed a rapid growth in mobile communication and reaped great benefits from it. Even though the fourth generation (4G) mobile communication system is just being deployed worldwide, proliferating mobile demands call for newer wireless communication technologies with even better performance. Consequently, the fifth generation (5G) system is already emerging in the research field. However, simply evolving the current mobile networks can hardly meet such great expectations, because over the years the infrastructures have generally become ossified, closed, and vertically constructed. Aiming to establish a new paradigm for 5G mobile networks, in this article, we propose a cross-layer software-defined 5G network architecture. By jointly considering the network layer and the physical layer, we establish the two software-defined programmable components, the control plane and the cloud computing pool, which enable effective control of the mobile network from the global perspective and benefit technological innovations. Specifically, by the cross-layer design for software-defining, the logically centralized and programmable control plane abstracts the control functions from the network layer down to the physical layer, through which we achieve fine-grained control of the mobile network, while the cloud computing pool provides powerful computing capability to implement the baseband data processing of multiple heterogeneous networks. We discuss the main challenges of our architecture, including the fine-grained control strategies, network virtualization, and programmability. The architecture significantly benefits the convergence towards heterogeneous networks and it enables much more controllable, programmable and evolvable mobile networks.

preprint2014arXiv

Generalized Jacobi Functions and Their Applications to Fractional Differential Equations

In this paper, we consider spectral approximation of fractional differential equations (FDEs). A main ingredient of our approach is to define a new class of generalized Jacobi functions (GJFs), which is intrinsically related to fractional calculus, and can serve as natural basis functions for properly designed spectral methods for FDEs. We establish spectral approximation results for these GJFs in weighted Sobolev spaces involving fractional derivatives. We construct efficient GJF-Petrov-Galerkin methods for a class of prototypical fractional initial value problems (FIVPs) and fractional boundary value problems (FBVPs) of general order, and show that with an appropriate choice of the parameters in GJFs, the resulting linear systems can be sparse and well-conditioned. Moreover, we derive error estimates with convergence rate only depending on the smoothness of data, so that truly spectral accuracy can be attained if the data are smooth enough. The ideas and results presented in this paper will be useful for dealing with more general FDEs associated with Riemann-Liouville or Caputo fractional derivatives.

preprint2010arXiv

Generalized Ehrhart polynomials

Let $P$ be a polytope with rational vertices. A classical theorem of Ehrhart states that the number of lattice points in the dilations $P(n) = nP$ is a quasi-polynomial in $n$. We generalize this theorem by allowing the vertices of $P(n)$ to be arbitrary rational functions in $n$. In this case we prove that the number of lattice points in $P(n)$ is a quasi-polynomial for $n$ sufficiently large. Our work was motivated by a conjecture of Ehrhart on the number of solutions to parametrized linear Diophantine equations whose coefficients are polynomials in $n$, and we explain how these two problems are related.
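
The classical statement quoted at the start of this abstract can be seen in the smallest possible case. The sketch below is an illustration chosen here, not an example from the paper: it takes the one-dimensional rational polytope $P=[0,1/2]$ and counts the integers in $nP$. The function name `count_lattice_points` and the variable `counts` are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor

def count_lattice_points(a, b, n):
    """Count the integers in the dilated segment [n*a, n*b].

    a and b are Fractions so that the endpoints are computed exactly.
    """
    low, high = n * a, n * b
    return max(0, floor(high) - ceil(low) + 1)

# Lattice-point counts of n * [0, 1/2] for n = 1, ..., 6.
counts = [count_lattice_points(Fraction(0), Fraction(1, 2), n) for n in range(1, 7)]
```

Here the count equals $n/2+1$ for even $n$ and $n/2+1/2$ for odd $n$, so it is a quasi-polynomial of period 2 rather than a single polynomial.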

preprint2010arXiv

Radiation-driven Implosion in the Cepheus B Molecular Cloud

We analyze large-scale mapping observations of the molecular lines in the 12CO (J=2-1), 12CO (J=3-2), 13CO (J=2-1), and 13CO (J=3-2) transition emissions toward the Cepheus B molecular cloud with the KOSMA 3m-telescope. The integrated intensity map of the 12CO (J=2-1) transition shows a structure with a compact core and a compact ridge extended in the north-west of the core. The cloud is surrounded by an optically bright rim, where the radiation-driven implosion (RDI) may greatly change the gas properties. The intensities of the CO (J=3-2) transition are higher than those of the CO (J=2-1) transition along the rim area. We find characteristic RDI structure in position-velocity diagrams. Non-LTE large velocity gradient (LVG) model analysis shows that the density and temperature at the edge are higher than those in the center. Our results provide evidence that RDI is taking place in the Cepheus B molecular cloud.

40 published item(s)

Mosaic: Unlocking Long-Context Inference for Diffusion LLMs via Global Memory Planning and Dynamic Peak Taming

VLingNav: Embodied Navigation with Adaptive Reasoning and Visual-Assisted Linguistic Memory

Joint Beamforming Design for Dual-Functional MIMO Radar and Communication Systems Guaranteeing Physical Layer Security

A Novel DeBERTa-based Model for Financial Question Answering Task

Arithmetic purity of strong approximation for complete toric varieties

AutoFAS: Automatic Feature and Architecture Selection for Pre-Ranking System

CausalMTA: Eliminating the User Confounding Bias for Causal Multi-touch Attribution

CLIP2TV: Align, Match and Distill for Video-Text Retrieval

Contrastive Information Transfer for Pre-Ranking Systems

Integrated Sensing and Communication with mmWave Massive MIMO: A Compressed Sampling Perspective

KSSOLV 2.0: An efficient MATLAB toolbox for solving the Kohn-Sham equations with plane-wave basis set

Mobility Support for Millimeter Wave Communications: Opportunities and Challenges

Multiple-Objective Packet Routing Optimization for Aeronautical ad-hoc Networks

Sampling Is All You Need on Modeling Long-Term User Behaviors for CTR Prediction

Structure Parameter Optimized Kernel Based Online Prediction with a Generalized Optimization Strategy for Nonstationary Time Series

Compressive Sensing Based Massive Access for IoT Relying on Media Modulation Aided Machine Type Communications

Hybrid Transceiver Optimization for Multi-Hop Communications

Log orthogonal functions: approximation properties and applications

Matrix-Monotonic Optimization Part II: Multi-Variable Optimization

Quantum Criticism: A Tagged News Corpus Analysed for Sentiment and Named Entities

Research on a New Convolutional Neural Network Model Combined with Random Edges Adding

Towards Playing Full MOBA Games with Deep Reinforcement Learning

On non-feasible edge sets in matching-covered graphs

Alternating Estimation for Structured High-Dimensional Multi-Response Models

Compressive Sensing Based Multi-User Detector for the Large-Scale SM-MIMO Uplink

Non-equilibrium relaxation in a stochastic lattice Lotka-Volterra model

Structured Matrix Recovery via the Generalized Dantzig Selector

A Survey of directed graphs invariants

Computer simulation of random loose packings of micro-particles in presence of adhesion and friction

Effect of long-range repulsive Coulomb interactions on packing structure of adhesive particles

Estimation with Norm Regularization

Generalized Dantzig Selector: Application to the k-support norm

l1-norm Penalized Orthogonal Forward Regression

Priori-Information Aided Iterative Hard Threshold: A Low-Complexity High-Accuracy Compressive Sensing Based Channel Estimation for TDS-OFDM

Soft Pilot Reuse and Multi-Cell Block Diagonalization Precoding for Massive MIMO Systems

Super Hom-Gel'fand-Dorfman bialgebras and Hom-Lie conformal superalgebras

Cross-Layer Software-Defined 5G Network

Generalized Jacobi Functions and Their Applications to Fractional Differential Equations

Generalized Ehrhart polynomials

Radiation-driven Implosion in the Cepheus B Molecular Cloud