Source author record

Lu Wei

Lu Wei appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Information Theory, math.IT, math-ph, math.MP, Networking and Internet Architecture, quant-ph, Machine Learning, hep-th, math.ST, Statistics Theory, Computation and Language, cond-mat.stat-mech, cond-mat.str-el, eess.SP, Performance, physics.comp-ph

Catalog footprint

21 works

16 topics

4 close collaborators


preprint, 2026, arXiv

How well can off-the-shelf LLMs elucidate molecular structures from mass spectra using chain-of-thought reasoning?

Mass spectrometry (MS) is a powerful analytical technique for identifying small molecules, yet determining complete molecular structures directly from tandem mass spectra (MS/MS) remains a long-standing challenge due to complex fragmentation patterns and the vast diversity of chemical space. Recent progress in large language models (LLMs) has shown promise for reasoning-intensive scientific tasks, but their capability for chemical interpretation is still unclear. In this work, we introduce a Chain-of-Thought (CoT) prompting framework and benchmark that evaluate how LLMs reason about mass spectral data to predict molecular structures. We formalize expert chemists' reasoning steps, such as double bond equivalent (DBE) analysis, neutral loss identification, and fragment assembly, into structured prompts and assess multiple state-of-the-art LLMs (Claude-3.5-Sonnet, GPT-4o-mini, and Llama-3 series) in a zero-shot setting using the MassSpecGym dataset. Our evaluation across metrics of SMILES validity, formula consistency, and structural similarity reveals that while LLMs can produce syntactically valid and partially plausible structures, they fail to achieve chemical accuracy or link reasoning to correct molecular predictions. These findings highlight both the interpretive potential and the current limitations of LLM-based reasoning for molecular elucidation, providing a foundation for future work that combines domain knowledge and reinforcement learning to achieve chemically grounded AI reasoning.
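As a rough illustration of the double bond equivalent (DBE) step named in the abstract, the sketch below applies the standard textbook formula DBE = (2C + 2 + N - H - X) / 2 for compounds of C, H, N, O and halogens X. The function name and argument layout are hypothetical and are not taken from the paper or its prompts.

```python
def double_bond_equivalent(c, h, n=0, x=0):
    """Rings plus pi bonds implied by a molecular formula.

    Assumes the textbook rule DBE = (2C + 2 + N - H - X) / 2, where X is
    the number of halogen atoms. Divalent atoms such as O and S do not
    change the count, so they are not arguments here.
    """
    return (2 * c + 2 + n - h - x) / 2


print(double_bond_equivalent(c=6, h=6))        # benzene, C6H6 -> 4.0
print(double_bond_equivalent(c=8, h=10, n=4))  # caffeine, C8H10N4O2 -> 6.0
```

A DBE of 4 for benzene corresponds to one ring plus three double bonds, which is the kind of constraint a chemist would use to narrow candidate structures before looking at fragments.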

preprint, 2022, arXiv

Modeling and Analysis of Intermittent Federated Learning Over Cellular-Connected UAV Networks

Federated learning (FL) is a promising distributed learning technique particularly suitable for wireless learning scenarios since it can accomplish a learning task without raw data transportation so as to preserve data privacy and lower network resource consumption. However, current works on FL over wireless networks do not study in depth the fundamental performance of FL over wireless networks that suffer from communication outage due to channel impairment and network interference. To accurately exploit the performance of FL over wireless networks, this paper proposes a novel intermittent FL model over a cellular-connected unmanned aerial vehicle (UAV) network, which characterizes communication outage from UAVs (clients) to their server and data heterogeneity among the datasets at UAVs. We propose an analytically tractable framework to derive the uplink outage probability and use it to devise a simulation-based approach so as to evaluate the performance of the proposed intermittent FL model. Our findings reveal how the intermittent FL model is impacted by uplink communication outage and UAV deployment. Extensive numerical simulations are provided to show the consistency between the simulated and analytical performances of the proposed intermittent FL model.

preprint, 2022, arXiv

Spatio-Temporal Federated Learning for Massive Wireless Edge Networks

This paper presents a novel approach to conduct highly efficient federated learning (FL) over a massive wireless edge network, where an edge server and numerous mobile devices (clients) jointly learn a global model without transporting the huge amount of data collected by the mobile devices to the edge server. The proposed FL approach is referred to as spatio-temporal FL (STFL), which jointly exploits the spatial and temporal correlations between the learning updates from different mobile devices scheduled to join STFL in various training epochs. The STFL model not only represents the realistic intermittent learning behavior from the edge server to the mobile devices due to data delivery outage, but also features a mechanism of compensating for lost learning updates in order to mitigate the impacts of intermittent learning. An analytical framework of STFL is proposed and employed to study the learning capability of STFL via its convergence performance. In particular, we have assessed the impact of data delivery outage, intermittent learning mitigation, and statistical heterogeneity of datasets on the convergence performance of STFL. The results provide crucial insights into the design and analysis of STFL-based wireless networks.

preprint, 2022, arXiv

Toward Ubiquitous and Flexible Coverage of UAV-IRS-Assisted NOMA Networks

This paper studies how to achieve a high and flexible coverage performance of a large-scale cellular network that enables unmanned aerial vehicles (UAVs) for non-orthogonal multiple access (NOMA) transmission to simultaneously serve multiple users. The considered cellular network consists of a tier of base stations and a tier of UAVs. Each UAV is mounted with an intelligent reflecting surface (IRS) in order to serve as an aerial IRS reflecting signals between a base station and a user in the network. All the UAVs in the network are deployed based on a newly proposed three-dimensional (3D) point process that leads to a tractable and accurate analysis of the association statistics, which is traditionally difficult to analyze due to the mobility of UAVs. In light of this, we are able to analyze the downlink coverage of UAV-IRS-assisted NOMA transmission for two users and derive the corresponding coverage probabilities. Our coverage analyses shed light on the optimal allocations of transmit power between NOMA users and UAVs to accomplish the goal of ubiquitous and flexible NOMA transmission. We also conduct numerical simulations to validate our coverage analytical results while demonstrating the improved coverage performance achieved by aerial IRSs.

preprint, 2021, arXiv

Second-order statistics of fermionic Gaussian states

We study the statistical behavior of entanglement in quantum bipartite systems over fermionic Gaussian states as measured by von Neumann entropy and entanglement capacity. The focus is on the variance of von Neumann entropy and the mean entanglement capacity that belong to the so-defined second-order statistics. The main results are the exact yet explicit formulas of the two considered second-order statistics for fixed subsystem dimension differences. We also conjecture the exact variance of von Neumann entropy valid for arbitrary subsystem dimensions. Based on the obtained results, we analytically study the numerically observed phenomena of Gaussianity of von Neumann entropy and linear growth of average capacity.

preprint, 2020, arXiv

Entanglement Area Law for Shallow and Deep Quantum Neural Network States

A study of the artificial neural network representation of quantum many-body states is presented. The locality and entanglement properties of states for shallow and deep quantum neural networks are investigated in detail. By introducing the notion of local quasi-product states, for which the locally connected shallow feed-forward neural network states and restricted Boltzmann machine states are special cases, we show that Rényi entanglement entropies of all these states obey the entanglement area law. Besides, we also investigate the entanglement features of deep Boltzmann machine states and show that locality constraints imposed on the neural networks make the states obey the entanglement area law. Finally, as an application, we apply the notion of Rényi entanglement entropy to understanding the power of neural networks and show that image classification problems which can be efficiently solved must obey the area law.

preprint, 2020, arXiv

Exact variance of von Neumann entanglement entropy over the Bures-Hall measure

The Bures-Hall distance metric between quantum states is a unique measure that satisfies various useful properties for quantum information processing. In this work, we study the statistical behavior of quantum entanglement over the Bures-Hall ensemble as measured by von Neumann entropy. The average von Neumann entropy over such an ensemble has been recently obtained, whereas the main result of this work is an explicit expression of the corresponding variance that specifies the fluctuation around its average. The starting point of the calculations is the connection between correlation functions of the Bures-Hall ensemble and those of the Cauchy-Laguerre ensemble. The derived variance formula, together with the known mean formula, leads to a simple but accurate Gaussian approximation to the distribution of von Neumann entropy of finite-size systems. This Gaussian approximation is also conjectured to be the limiting distribution for large dimensional systems.

preprint, 2020, arXiv

Proof of Sarkar-Kumar's Conjectures on Average Entanglement Entropies over the Bures-Hall Ensemble

Sarkar and Kumar recently conjectured [J. Phys. A: Math. Theor. $\textbf{52}$, 295203 (2019)] that for a bipartite system of Hilbert dimension $mn$, the mean values of quantum purity and von Neumann entropy of a subsystem of dimension $m\leq n$ over the Bures-Hall measure are given by \begin{equation*} \frac{2n(2n+m)-m^{2}+1}{2n(2mn-m^2+2)} \end{equation*} and \begin{equation*} \psi_{0}\left(mn-\frac{m^2}{2}+1\right)-\psi_{0}\left(n+\frac{1}{2}\right), \end{equation*} respectively, where $\psi_{0}(\cdot)$ is the digamma function. We prove the above conjectured formulas in this work. A key ingredient of the proofs is Forrester and Kieburg's discovery on the connection between the Bures-Hall ensemble and the Cauchy-Laguerre biorthogonal ensemble studied by Bertola, Gekhtman, and Szmigielski.
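The two formulas quoted in this abstract are easy to evaluate numerically. The sketch below is a minimal illustration, not code from the paper: the function names are hypothetical, and the digamma function is computed only at positive integers and half-integers (the only arguments the formulas need) using the standard identities $\psi_0(k) = -\gamma + \sum_{j=1}^{k-1} 1/j$ and $\psi_0(k+\tfrac12) = -\gamma - 2\ln 2 + 2\sum_{j=1}^{k} 1/(2j-1)$. The Euler constant $\gamma$ cancels in the difference, so it is dropped.

```python
from fractions import Fraction
import math


def mean_purity(m, n):
    """Mean purity over the Bures-Hall measure, as quoted in the abstract (m <= n)."""
    num = 2 * n * (2 * n + m) - m**2 + 1
    den = 2 * n * (2 * m * n - m**2 + 2)
    return Fraction(num, den)


def digamma_plus_gamma(x):
    """psi_0(x) + Euler gamma for a positive integer or half-integer x."""
    x = Fraction(x)
    if x <= 0:
        raise ValueError("argument must be positive")
    if x.denominator == 1:
        # psi_0(k) + gamma equals the harmonic number H_{k-1}
        return sum(1.0 / j for j in range(1, x.numerator))
    if x.denominator == 2:
        k = (x.numerator - 1) // 2  # x = k + 1/2
        return -2.0 * math.log(2.0) + 2.0 * sum(1.0 / (2 * j - 1) for j in range(1, k + 1))
    raise ValueError("only integers and half-integers are supported")


def mean_entropy(m, n):
    """Mean von Neumann entropy over the Bures-Hall measure, as quoted in the abstract."""
    a = Fraction(m * n + 1) - Fraction(m * m, 2)
    b = Fraction(2 * n + 1, 2)
    return digamma_plus_gamma(a) - digamma_plus_gamma(b)


print(mean_purity(2, 2))    # 7/8
print(mean_entropy(1, 1))   # 0.0
print(mean_entropy(2, 2))
```

For $m = n = 1$ the subsystem is trivial, so the formulas give purity 1 and entropy 0, which serves as a quick sanity check.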

preprint, 2019, arXiv

Skewness of von Neumann entanglement entropy

We study quantum bipartite systems in a random pure state, where von Neumann entropy is considered as a measure of the entanglement. Expressions of the first and second exact cumulants of von Neumann entropy, relevant respectively to the average and fluctuation behavior, are known in the literature. The focus of this paper is on its skewness that specifies the degree of asymmetry of the distribution. Computing the skewness requires additionally the third cumulant, an exact formula of which is the main result of this work. In proving the main result, we obtain as a byproduct various summation identities involving polygamma and related functions. The derived third cumulant also leads to an improved approximation to the distribution of von Neumann entropy.

preprint, 2018, arXiv

On the exact variance of Tsallis entropy in a random pure state

Tsallis entropy is a useful one-parameter generalization of the standard von Neumann entropy in information theory. We study the variance of Tsallis entropy of bipartite quantum systems in a random pure state. The main result is an exact variance formula of Tsallis entropy that involves finite sums of some terminating hypergeometric functions. In the special cases of quadratic entropy and small subsystem dimensions, the main result is further simplified to explicit variance expressions. As a byproduct, we find an independent proof of the recently proved variance formula of von Neumann entropy based on the derived moment relation to the Tsallis entropy.
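To make the "one-parameter generalization" concrete, the sketch below evaluates the Tsallis entropy of a spectrum of eigenvalues using the standard definition $T_q = (1 - \sum_i \lambda_i^q)/(q-1)$, which reduces to the von Neumann entropy as $q \to 1$ and to one minus the purity at $q = 2$ (the quadratic entropy mentioned in the abstract). The function names are hypothetical, and the code illustrates the definition only, not the variance formula derived in the paper.

```python
import math


def von_neumann_entropy(eigs):
    """-sum(p log p) over the nonzero eigenvalues of a density matrix."""
    return -sum(p * math.log(p) for p in eigs if p > 0)


def tsallis_entropy(eigs, q):
    """Tsallis entropy (1 - sum(p^q)) / (q - 1); the q = 1 case is the von Neumann limit."""
    if q <= 0:
        raise ValueError("q must be positive")
    if abs(q - 1.0) < 1e-12:
        return von_neumann_entropy(eigs)
    return (1.0 - sum(p**q for p in eigs if p > 0)) / (q - 1.0)


spectrum = [0.5, 0.5]  # maximally mixed qubit
print(tsallis_entropy(spectrum, 2))         # 0.5, i.e. 1 - purity
print(tsallis_entropy(spectrum, 1 + 1e-6))  # close to log(2)
print(von_neumann_entropy(spectrum))        # log(2)
```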

preprint, 2016, arXiv

Asymptotic Matrix Variate von-Mises Fisher and Bingham Distributions with Applications

Probability distributions on the Stiefel manifold such as the von-Mises Fisher and Bingham distributions find diverse applications in signal processing and other applied sciences. Use of these statistical models in practice is complicated by the difficulties in numerical evaluation of their normalization constants. In this letter, we derive asymptotic approximations to the normalization constants via recent results in random matrix theory. The derived approximations take simple forms and are reasonably accurate in regimes of practical interest. As an application, we show that the proposed analytical results lead to a remarkable reduction of the sampling complexity compared to existing simulation-based approaches.

preprint, 2015, arXiv

From Random Matrix Theory to Coding Theory: Volume of a Metric Ball in Unitary Group

Volume estimates of metric balls in manifolds find diverse applications in information and coding theory. In this paper, some new results for the volume of a metric ball in the unitary group are derived via various tools from random matrix theory. The first result is an integral representation of the exact volume, which involves a Toeplitz determinant of Bessel functions. The connection to matrix-variate hypergeometric functions and Szegő's strong limit theorem lead independently from the finite size formula to an asymptotic one. The convergence of the limiting formula is exceptionally fast due to an underlying mock-Gaussian behavior. The proposed volume estimate enables simple but accurate analytical evaluation of coding-theoretic bounds of unitary codes. In particular, the Gilbert-Varshamov lower bound and the Hamming upper bound on cardinality as well as the resulting bounds on code rate and minimum distance are derived. Moreover, bounds on the scaling law of code rate are found. Lastly, a closed-form bound on diversity sum relevant to unitary space-time codes is obtained, which was only computed numerically in the literature.

preprint, 2015, arXiv

Volume of Metric Balls in High-Dimensional Complex Grassmann Manifolds

Volume of metric balls relates to rate-distortion theory and packing bounds on codes. In this paper, the volume of balls in complex Grassmann manifolds is evaluated for an arbitrary radius. The ball is defined as a set of hyperplanes of a fixed dimension with reference to a center of possibly different dimension, and a generalized chordal distance for unequal dimensional subspaces is used. First, the volume is reduced to a one-dimensional integral representation. The overall problem boils down to evaluating a determinant of a matrix of the same size as the subspace dimensionality. Interpreting this determinant as a characteristic function of the Jacobi ensemble, an asymptotic analysis is carried out. The obtained asymptotic volume is moreover refined using moment-matching techniques to provide a tighter approximation in finite-size regimes. Lastly, the pertinence of the derived results is shown by rate-distortion analysis of source coding on Grassmann manifolds.

preprint, 2014, arXiv

On the Outage Capacity of Orthogonal Space-time Block Codes Over Multi-cluster Scattering MIMO Channels

The multi-cluster scattering MIMO channel is a useful model for pico-cellular MIMO networks. In this paper, orthogonal space-time block coded transmission over such a channel is considered, where the effective channel equals the product of n complex Gaussian matrices. A simple and accurate closed-form approximation to the channel outage capacity has been derived in this setting. The result is valid for an arbitrary number of clusters n-1 of scatterers and an arbitrary antenna configuration. Numerical results are provided to study the relative outage performance between the multi-cluster and the Rayleigh-fading MIMO channels for which n=1.

preprint, 2013, arXiv

Network Coding for Energy-Efficient Distributed Storage System in Wireless Sensor Networks

A network coding-based scheme is proposed to improve the energy efficiency of distributed storage systems in WSNs (wireless sensor networks), which mainly focuses on two problems: firstly, consideration is given to effective distributed storage technology in WSNs; secondly, we address how to repair the data in failed storage nodes with fewer resources. For the first problem, we propose a method to obtain a sparse generator matrix to construct network codes, and this sparse generator matrix is proven to be the sparsest. Benefiting from the sparse generator matrix, the energy consumption required to implement distributed storage is reduced. For the second problem, we designed a network coding-based iterative repair method, which adequately utilizes the idea of re-encoding at intermediate nodes from network coding theory. Benefiting from the re-encoding, the energy consumption required by data repair is significantly reduced. Moreover, we provide an explicit lower bound of field size required by this scheme, which implies that this scheme can work over a very small field and the required computation overhead of coding is very low. The simulation result verifies that our scheme reduces the total energy consumption required to implement a distributed storage system in WSNs and also balances the energy consumption of the network.

preprint, 2013, arXiv

Singular value correlation functions for products of Wishart random matrices

Consider the product of $M$ quadratic random matrices with complex elements and no further symmetry, where all matrix elements of each factor have a Gaussian distribution. This generalises the classical Wishart-Laguerre Gaussian Unitary Ensemble with $M=1$. In this paper we first compute the joint probability distribution for the singular values of the product matrix when the matrix size $N$ and the number $M$ are fixed but arbitrary. This leads to a determinantal point process which can be realised in two different ways. First, it can be written as a one-matrix singular value model with a non-standard Jacobian, or second, for $M\geq2$, as a two-matrix singular value model with a set of auxiliary singular values and a weight proportional to the Meijer $G$-function. For both formulations we determine all singular value correlation functions in terms of the kernels of biorthogonal polynomials which we explicitly construct. They are given in terms of hypergeometric and Meijer $G$-functions, generalising the Laguerre polynomials. Our investigation was motivated by applications in telecommunication of multi-layered scattering MIMO channels. We present the ergodic mutual information for finite-$N$ for such a channel model with $M-1$ layers of scatterers as an example.

preprint, 2012, arXiv

A Blind Time-Reversal Detector in the Presence of Channel Correlation

A blind target detector using the time reversal transmission is proposed in the presence of channel correlation. We calculate the exact moments of the test statistics involved. The derived moments are used to construct an accurate approximative Likelihood Ratio Test (LRT) based on multivariate Edgeworth expansion. Performance gain over an existing detector is observed in scenarios with channel correlation and relatively strong target signal.

preprint, 2012, arXiv

Approximation to Distribution of Product of Random Variables Using Orthogonal Polynomials for Lognormal Density

We derive a closed-form expression for the orthogonal polynomials associated with the general lognormal density. The result can be utilized to construct easily computable approximations for the probability density function of a product of random variables, when the considered variates are either independent or correlated. As an example, we have calculated the approximative distribution for the product of Nakagami-m variables. Simulations indicate that the accuracy of the proposed approximation is good with small cross-correlations under light fading conditions.

preprint, 2012, arXiv

Locally Best Invariant Test for Multiple Primary User Spectrum Sensing

We consider multi-antenna cooperative spectrum sensing in cognitive radio networks, when there may be multiple primary users. A noise-uncertainty-free detector that is optimal in the low signal to noise ratio regime is analyzed in such a scenario. Specifically, we derive the exact moments of the test statistics involved, which lead to simple and accurate analytical formulae for the false alarm probability and the decision threshold. Simulations are provided to examine the accuracy of the derived results, and to compare with other detectors in realistic sensing scenarios.

preprint, 2012, arXiv

On the Exact Distribution of the Scaled Largest Eigenvalue

In this paper we study the distribution of the scaled largest eigenvalue of complex Wishart matrices, which has diverse applications both in statistics and wireless communications. Exact expressions, valid for any matrix dimensions, have been derived for the probability density function and the cumulative distribution function. The derived results involve only finite sums of polynomials. These results are obtained by taking advantage of properties of the Mellin transform for products of independent random variables.

preprint, 2012, arXiv

Spectrum Sensing in the Presence of Multiple Primary Users

We consider multi-antenna cooperative spectrum sensing in cognitive radio networks, when there may be multiple primary users. A detector based on the spherical test is analyzed in such a scenario. Based on the moments of the distributions involved, simple and accurate analytical formulae for the key performance metrics of the detector are derived. The false alarm and the detection probabilities, as well as the detection threshold and the Receiver Operating Characteristic, are available in closed form. Simulations are provided to verify the accuracy of the derived results, and to compare with other detectors in realistic sensing scenarios.
