Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
29 works
0 followers
16 topics
4 close collaborators

Published work

29 published item(s)

preprint 2026 arXiv

Efficient Preference Poisoning Attack on Offline RLHF

Offline Reinforcement Learning from Human Feedback (RLHF) pipelines such as Direct Preference Optimization (DPO) train on a pre-collected preference dataset, which makes them vulnerable to preference poisoning attacks. We study label-flip attacks against log-linear DPO. We first illustrate that flipping one preference label induces a parameter-independent shift in the DPO gradient. Using this key property, we can then convert the targeted poisoning problem into a structured binary sparse approximation problem. To solve this problem, we develop two attack methods: Binary-Aware Lattice Attack (BAL-A) and Binary Matching Pursuit Attack (BMP-A). BAL-A embeds the binary flip selection problem into a binary-aware lattice and applies Lenstra-Lenstra-Lovász reduction and Babai's nearest plane algorithm; we provide sufficient conditions that enforce binary coefficients and recover the minimum-flip objective. BMP-A adapts binary matching pursuit to our non-normalized gradient dictionary and yields coherence-based recovery guarantees and robustness (impossibility) certificates for $K$-flip budgets. Experiments on synthetic dictionaries and the Stanford Human Preferences dataset validate the theory and highlight how dictionary geometry governs attack success.
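The parameter-independent shift mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the per-example loss is -log sigmoid(beta * <theta, d>), where d is the feature difference between the preferred and rejected responses, and it drops the reference-policy offset for simplicity. The names dpo_grad and flip_shift are hypothetical.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def dpo_grad(theta, d, beta, flipped=False):
    """Gradient w.r.t. theta of -log sigmoid(s * beta * <theta, d>).

    d is the feature difference phi(x, y_w) - phi(x, y_l) (an assumed
    log-linear parameterization); s = -1 when the preference label is flipped.
    """
    s = -1.0 if flipped else 1.0
    margin = s * beta * sum(t * x for t, x in zip(theta, d))
    coeff = -s * beta * sigmoid(-margin)
    return [coeff * x for x in d]

def flip_shift(theta, d, beta):
    """Change in the per-example gradient caused by flipping the label."""
    clean = dpo_grad(theta, d, beta)
    poisoned = dpo_grad(theta, d, beta, flipped=True)
    return [p - c for p, c in zip(poisoned, clean)]

d = [1.0, -2.0, 0.5]
beta = 0.3
# Two very different parameter vectors give the same shift, namely beta * d.
print(flip_shift([0.0, 0.0, 0.0], d, beta))
print(flip_shift([4.0, -1.0, 7.0], d, beta))
```

Under these assumptions the shift equals beta * d for every theta, because sigmoid(u) + sigmoid(-u) = 1. That is why each candidate flip can be treated as a fixed vector in a dictionary.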

preprint 2021 arXiv

Distributed Dual Coordinate Ascent in General Tree Networks and Communication Network Effect on Synchronous Machine Learning

Because of the large size of data and the limited storage volume of a single computer or server, data are often stored in a distributed manner. Thus, performing large-scale machine learning operations on distributed datasets through communication networks is often required. In this paper, we study the convergence rate of distributed dual coordinate ascent for distributed machine learning problems in a general tree-structured network. Since a tree network model can be understood as the generalization of a star network model, our algorithm can be thought of as the generalization of distributed dual coordinate ascent in a star network model. First, we provide the convergence rate of distributed dual coordinate ascent over a general tree network in a recursive manner and analyze the network effect on the convergence rate. Second, by considering network communication delays, we optimize the distributed dual coordinate ascent algorithm to maximize its convergence speed. From our analytical result, we can choose the optimal number of local iterations depending on the communication delay severity to achieve the fastest convergence speed. In numerical experiments, we consider machine learning scenarios over communication networks, where local workers cannot directly reach a central node due to constraints in communication, and demonstrate the usability of our distributed dual coordinate ascent algorithm in tree networks. Additionally, we show that adapting the number of local and global iterations to network communication delays in the distributed dual coordinate ascent algorithm can improve its convergence speed.

preprint 2020 arXiv

Derivation of Information-Theoretically Optimal Adversarial Attacks with Applications to Robust Machine Learning

We consider the theoretical problem of designing an optimal adversarial attack on a decision system that maximally degrades the achievable performance of the system as measured by the mutual information between the degraded signal and the label of interest. This problem is motivated by the existence of adversarial examples for machine learning classifiers. By adopting an information theoretic perspective, we seek to identify conditions under which adversarial vulnerability is unavoidable, i.e., even optimally designed classifiers will be vulnerable to small adversarial perturbations. We present derivations of the optimal adversarial attacks for discrete and continuous signals of interest, i.e., finding the optimal perturbation distributions to minimize the mutual information between the degraded signal and a signal following a continuous or discrete distribution. In addition, we show that it is much harder to achieve adversarial attacks for minimizing mutual information when multiple redundant copies of the input signal are available. This provides additional support to the recently proposed "feature compression" hypothesis as an explanation for the adversarial vulnerability of deep learning classifiers. We also report on results from computational experiments to illustrate our theoretical results.

preprint 2020 arXiv

Do Deep Minds Think Alike? Selective Adversarial Attacks for Fine-Grained Manipulation of Multiple Deep Neural Networks

Recent works have demonstrated the existence of {\it adversarial examples} targeting a single machine learning system. In this paper we ask a simple but fundamental question of "selective fooling": given {\it multiple} machine learning systems assigned to solve the same classification problem and taking the same input signal, is it possible to construct a perturbation to the input signal that manipulates the outputs of these {\it multiple} machine learning systems {\it simultaneously} in arbitrary pre-defined ways? For example, is it possible to selectively fool a set of "enemy" machine learning systems without fooling the other "friend" machine learning systems? The answer to this question depends on the extent to which these different machine learning systems "think alike". We formulate the problem of "selective fooling" as a novel optimization problem, and report on a series of experiments on the MNIST dataset. Our preliminary findings from these experiments show that it is in fact very easy to selectively manipulate multiple MNIST classifiers simultaneously, even when the classifiers are identical in their architectures, training algorithms and training datasets except for random initialization during training. This suggests that two nominally equivalent machine learning systems do not in fact "think alike" at all, and opens the possibility for many novel applications and deeper understandings of the working principles of deep neural networks.

preprint 2020 arXiv

Error Correction Codes for COVID-19 Virus and Antibody Testing: Using Pooled Testing to Increase Test Reliability

We consider a novel method to increase the reliability of COVID-19 virus or antibody tests by using specially designed pooled tests. Instead of testing nasal swab or blood samples from individual persons, we propose to test mixtures of samples from many individuals. The pooled sample testing method proposed in this paper serves a different purpose from conventional pooling: increasing test reliability and providing accurate diagnoses even if the tests themselves are not very accurate. Our method uses ideas from compressed sensing and error-correction coding to correct for a certain number of errors in the test results. The intuition is that when each individual's sample is part of many pooled sample mixtures, the test results from all of the sample mixtures contain redundant information about each individual's diagnosis, which can be exploited to automatically correct for wrong test results in exactly the same way that error correction codes correct errors introduced in noisy communication channels. While such redundancy can also be achieved by simply testing each individual's sample multiple times, we present simulations and theoretical arguments that show that our method is significantly more efficient in increasing diagnostic accuracy. In contrast to group testing and compressed sensing which aim to reduce the number of required tests, this proposed error correction code idea purposefully uses pooled testing to increase test accuracy, and works not only in the "undersampling" regime, but also in the "oversampling" regime, where the number of tests is bigger than the number of subjects. The results in this paper run against traditional beliefs that, "even though pooled testing increased test capacity, pooled testings were less reliable than testing individuals separately."
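The error-correction intuition can be illustrated with a toy example. This sketch is not the construction from the paper: it assumes four individuals, seven pooled tests, at most one infected person, and at most one wrong test result. The pool memberships are taken from the rows of a systematic Hamming(7,4) generator matrix, and the names POOLS and decode are hypothetical.

```python
# Pool membership per individual: entry t is 1 if the sample goes into test t.
# Any two patterns (and each pattern versus all-zero) differ in >= 3 tests.
POOLS = {
    "A": (1, 0, 0, 0, 1, 1, 0),
    "B": (0, 1, 0, 0, 1, 0, 1),
    "C": (0, 0, 1, 0, 0, 1, 1),
    "D": (0, 0, 0, 1, 1, 1, 1),
}

def hamming(u, v):
    """Number of positions where two result vectors disagree."""
    return sum(a != b for a, b in zip(u, v))

def decode(observed):
    """Return the infected individual (or None) whose expected pattern of
    positive tests is closest to the observed, possibly erroneous, results."""
    candidates = {None: (0,) * 7}
    candidates.update(POOLS)
    return min(candidates, key=lambda who: hamming(candidates[who], observed))

# Individual "C" is infected, and test 0 reports a false positive.
noisy = (1, 0, 1, 0, 0, 1, 1)
print(decode(noisy))
```

Because every pair of expected patterns differs in at least three tests, a single false positive or false negative still leaves the true pattern as the unique closest candidate.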

preprint 2020 arXiv

Low-Cost and High-Throughput Testing of COVID-19 Viruses and Antibodies via Compressed Sensing: System Concepts and Computational Experiments

Coronavirus disease 2019 (COVID-19) is an ongoing pandemic infectious disease outbreak that has significantly harmed and threatened the health and lives of millions or even billions of people. COVID-19 has also significantly harmed the social and economic activities of many countries. With no approved vaccine available at this moment, extensive testing of people for COVID-19 viruses is essential for disease diagnosis, virus spread confinement, contact tracing, and determining the right conditions for people to return to normal economic activities. Identifying people who have antibodies for COVID-19 can also help select persons who are suitable for undertaking certain essential activities or returning to the workforce. However, the throughput of current testing technologies for COVID-19 viruses and antibodies is often quite limited and is not sufficient for dealing with the anticipated fast oscillating waves of COVID-19 spread affecting a significant portion of the earth's population. In this paper, we propose to use compressed sensing (group testing can be seen as a special case of compressed sensing when it is applied to COVID-19 detection) to achieve high-throughput rapid testing of COVID-19 viruses and antibodies, which can potentially provide a speedup of tenfold or more compared with current testing technologies. The proposed compressed sensing system for high-throughput testing can utilize expander graph based compressed sensing matrices developed by us \cite{Weiyuexpander2007}.

preprint 2020 arXiv

Optimal Pooling Matrix Design for Group Testing with Dilution (Row Degree) Constraints

In this paper, we consider the problem of designing an optimal pooling matrix for group testing (for example, for COVID-19 virus testing) with the constraint that no more than $r>0$ samples can be pooled together, which we call the "dilution constraint". This problem translates to designing a matrix with elements being either 0 or 1 that has no more than $r$ '1's in each row and has a certain performance guarantee of identifying anomalous elements. We explicitly give pooling matrix designs that satisfy the dilution constraint and have performance guarantees of identifying anomalous elements, and prove their optimality in saving the largest number of tests, namely showing that the designed matrices have the largest width-to-height ratio among all constraint-satisfying 0-1 matrices.
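The two requirements on a pooling matrix can be expressed as simple checks. This is a minimal sketch under stated assumptions: rows are tests, columns are samples, and the "performance guarantee" is taken to be the simplest possible one (a single anomalous sample in a noiseless setting), which may be weaker than the guarantees studied in the paper. The function names are hypothetical.

```python
def satisfies_dilution(matrix, r):
    """True if every test (row) pools at most r samples."""
    return all(sum(row) <= r for row in matrix)

def identifies_single_anomaly(matrix):
    """True if all columns are nonzero and pairwise distinct, so a single
    anomalous sample produces a unique, nonzero pattern of positive tests."""
    columns = list(zip(*matrix))
    return all(any(col) for col in columns) and len(set(columns)) == len(columns)

# Two tests cover three samples with at most r = 2 samples per pool.
design = [
    [1, 1, 0],
    [0, 1, 1],
]
print(satisfies_dilution(design, r=2), identifies_single_anomaly(design))
```

In this toy design the width-to-height ratio is 3/2: three samples are resolved with two tests while respecting the row-degree limit.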

preprint 2013 arXiv

Compressed Hypothesis Testing: to Mix or Not to Mix?

In this paper, we study the hypothesis testing problem of, among $n$ random variables, determining $k$ random variables which have different probability distributions from the remaining $(n-k)$ random variables. Instead of using separate measurements of each individual random variable, we propose to use mixed measurements which are functions of multiple random variables. It is demonstrated that $O({\displaystyle \frac{k \log(n)}{\min_{P_i, P_j} C(P_i, P_j)}})$ observations are sufficient for correctly identifying the $k$ anomalous random variables with high probability, where $C(P_i, P_j)$ is the Chernoff information between two possible distributions $P_i$ and $P_j$ for the proposed mixed observations. We characterized the Chernoff information respectively under fixed time-invariant mixed observations, random time-varying mixed observations, and deterministic time-varying mixed observations; in our derivations, we introduced the \emph{inner and outer conditional Chernoff information} for time-varying measurements. It is demonstrated that mixed observations can strictly improve the error exponent of hypothesis testing, over separate observations of individual random variables. We also characterized the optimal mixed observations maximizing the error exponent, and derived an explicit construction of the optimal mixed observations for the case of Gaussian random variables. These results imply that mixed observations of random variables can reduce the number of required samples in hypothesis testing applications. Compared with compressed sensing problems, this paper considers random variables which are allowed to dramatically change values in different measurements.
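The Chernoff information that appears in the sample-complexity bound can be computed numerically for discrete distributions. The sketch below uses the standard definition $C(P, Q) = -\min_{0 \le \lambda \le 1} \log \sum_x P(x)^{\lambda} Q(x)^{1-\lambda}$ with a plain grid search over $\lambda$; the grid size and the function name are illustrative choices, not taken from the paper.

```python
import math

def chernoff_information(p, q, grid=1000):
    """Chernoff information between two discrete distributions given as
    equal-length probability lists, via grid search over lambda in [0, 1]."""
    best = 0.0
    for i in range(grid + 1):
        lam = i / grid
        s = sum((a ** lam) * (b ** (1.0 - lam))
                for a, b in zip(p, q) if a > 0 and b > 0)
        if s == 0.0:
            return math.inf  # disjoint supports: perfectly distinguishable
        best = min(best, math.log(s))
    return -best

fair = [0.5, 0.5]
biased = [0.9, 0.1]
print(chernoff_information(fair, biased))
```

A larger value means the two hypotheses are easier to tell apart, which shrinks the number of observations in the bound. Identical distributions give a value of zero.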

preprint 2013 arXiv

Guarantees of Total Variation Minimization for Signal Recovery

In this paper, we consider using total variation minimization to recover signals whose gradients have a sparse support, from a small number of measurements. We establish the proof for the performance guarantee of total variation (TV) minimization in recovering \emph{one-dimensional} signals with sparse gradient support. This partially answers the open problem of proving the fidelity of total variation minimization in such a setting \cite{TVMulti}. In particular, we have shown that the recoverable gradient sparsity can grow linearly with the signal dimension when TV minimization is used. Recoverable sparsity thresholds of TV minimization are explicitly computed for 1-dimensional signals by using the Grassmann angle framework. We also extend our results to TV minimization for multidimensional signals. Stability of recovering the signal itself using 1-D TV minimization has also been established through a property called "almost Euclidean property for 1-dimensional TV norm". We further give a lower bound on the number of random Gaussian measurements for recovering 1-dimensional signal vectors with $N$ elements and $K$-sparse gradients. Interestingly, the number of needed measurements is lower bounded by $\Omega((NK)^{\frac{1}{2}})$, rather than the $O(K\log(N/K))$ bound frequently appearing in recovering $K$-sparse signal vectors.

preprint 2013 arXiv

Off-The-Grid Spectral Compressed Sensing With Prior Information

Recent research in off-the-grid compressed sensing (CS) has demonstrated that, under certain conditions, one can successfully recover a spectrally sparse signal from a few time-domain samples even though the dictionary is continuous. In this paper, we extend off-the-grid CS to applications where some prior information about the spectrally sparse signal is known. We specifically consider cases where a few contributing frequencies or poles, but not their amplitudes or phases, are known a priori. Our results show that equipping off-the-grid CS with the known-poles algorithm can increase the probability of recovering all the frequency components.

preprint 2013 arXiv

Outliers and Random Noises in System Identification: a Compressed Sensing Approach

In this paper, we consider robust system identification under sparse outliers and random noises. In this problem, system parameters are observed through a Toeplitz matrix. All observations are subject to random noises and a few are corrupted with outliers. We reduce this problem of system identification to a sparse error correcting problem using a Toeplitz structured real-numbered coding matrix. We prove the performance guarantee of Toeplitz structured matrix in sparse error correction. Thresholds on the percentage of correctable errors for Toeplitz structured matrices are established. When both outliers and observation noise are present, we have shown that the estimation error goes to 0 asymptotically as long as the probability density function for observation noise is not "vanishing" around 0. No probabilistic assumptions are imposed on the outliers.
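The measurement model in this abstract can be sketched in a few lines. The abstract only states that the parameters are observed through a Toeplitz matrix; the sketch below additionally assumes a finite impulse response system, so that each output is a sliding inner product of the input sequence with the parameter vector. Random noise is omitted, and the names toeplitz_rows and observe are hypothetical.

```python
def toeplitz_rows(u, n_params):
    """Rows [u[t], u[t-1], ..., u[t-n_params+1]] of the Toeplitz matrix
    built from the input sequence u (constant along each diagonal)."""
    return [[u[t - j] for j in range(n_params)]
            for t in range(n_params - 1, len(u))]

def observe(u, h, outliers=None):
    """Outputs y = T h, with a few entries corrupted by sparse outliers.

    outliers maps an output index to the gross error added at that index.
    """
    y = [sum(a * b for a, b in zip(row, h)) for row in toeplitz_rows(u, len(h))]
    for idx, err in (outliers or {}).items():
        y[idx] += err
    return y

u = [1.0, 2.0, 3.0, 4.0]   # known input sequence
h = [1.0, 10.0]            # unknown system parameters to identify
print(observe(u, h))                       # clean outputs
print(observe(u, h, outliers={1: 100.0}))  # one observation grossly corrupted
```

Recovering h from such outputs, when only a few entries carry large errors, is the sparse error correction problem the abstract refers to.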

preprint 2013 arXiv

Precise Semidefinite Programming Formulation of Atomic Norm Minimization for Recovering d-Dimensional ($d\geq 2$) Off-the-Grid Frequencies

Recent research in off-the-grid compressed sensing (CS) has demonstrated that, under certain conditions, one can successfully recover a spectrally sparse signal from a few time-domain samples even though the dictionary is continuous. In particular, atomic norm minimization was proposed in \cite{tang2012csotg} to recover $1$-dimensional spectrally sparse signals. However, in spite of existing research efforts \cite{chi2013compressive}, it was still an open problem how to formulate an equivalent positive semidefinite program for atomic norm minimization in recovering signals with $d$-dimensional ($d\geq 2$) off-the-grid frequencies. In this paper, we settle this problem by proposing equivalent semidefinite programming formulations of atomic norm minimization to recover signals with $d$-dimensional ($d\geq 2$) off-the-grid frequencies.

preprint 2013 arXiv

Precisely Verifying the Null Space Conditions in Compressed Sensing: A Sandwiching Algorithm

In this paper, we propose new efficient algorithms to verify the null space condition in compressed sensing (CS). Given an $(n-m) \times n$ ($m>0$) CS matrix $A$ and a positive $k$, we are interested in computing $\displaystyle \alpha_k = \max_{\{z: Az=0,z\neq 0\}}\max_{\{K: |K|\leq k\}} \frac{\|z_K \|_{1}}{\|z\|_{1}}$, where $K$ represents subsets of $\{1,2,...,n\}$, and $|K|$ is the cardinality of $K$. In particular, we are interested in finding the maximum $k$ such that $\alpha_k < \frac{1}{2}$. However, computing $\alpha_k$ is known to be extremely challenging. In this paper, we first propose a series of new polynomial-time algorithms to compute upper bounds on $\alpha_k$. Based on these new polynomial-time algorithms, we further design a new sandwiching algorithm, to compute the \emph{exact} $\alpha_k$ with greatly reduced complexity. When needed, this new sandwiching algorithm also achieves a smooth tradeoff between computational complexity and result accuracy. Empirical results show the performance improvements of our algorithm over existing known methods; and our algorithm outputs precise values of $\alpha_k$, with much lower complexity than exhaustive search.
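The quantity $\alpha_k$ is easy to evaluate in one special case, which helps build intuition for the definition. The sketch below assumes $m=1$, so the null space of $A$ is spanned by a single vector $z$; the ratio is then scale-invariant and the best index set $K$ simply collects the $k$ largest magnitudes of $z$. This is not the sandwiching algorithm from the paper, and the function names are hypothetical.

```python
def alpha_k(z, k):
    """alpha_k when the null space of A is spanned by the single vector z:
    the sum of the k largest |z_i| divided by the l1 norm of z."""
    mags = sorted((abs(v) for v in z), reverse=True)
    return sum(mags[:k]) / sum(mags)

def max_recoverable_k(z):
    """Largest k with alpha_k < 1/2 (0 if even k = 1 fails)."""
    k = 0
    while k < len(z) and alpha_k(z, k + 1) < 0.5:
        k += 1
    return k

z = [1.0, -1.0, 1.0, -1.0, 1.0]   # a well-spread null vector
print(alpha_k(z, 2), max_recoverable_k(z))
```

For higher-dimensional null spaces the outer maximization over $z$ no longer collapses to a single vector, which is where the computational difficulty described in the abstract arises.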

preprint 2013 arXiv

Quickest Search Over Multiple Sequences with Mixed Observations

The problem of sequentially finding an independent and identically distributed (i.i.d.) sequence that is drawn from a probability distribution $F_1$ by searching over multiple sequences, some of which are drawn from $F_1$ and the others of which are drawn from a different distribution $F_0$, is considered. The sensor is allowed to take one observation at a time. It has been shown in a recent work that if each observation comes from one sequence, Cumulative Sum (CUSUM) test is optimal. In this paper, we propose a new approach in which each observation can be a linear combination of samples from multiple sequences. The test has two stages. In the first stage, namely scanning stage, one takes a linear combination of a pair of sequences with the hope of scanning through sequences that are unlikely to be generated from $F_1$ and quickly identifying a pair of sequences such that at least one of them is highly likely to be generated by $F_1$. In the second stage, namely refinement stage, one examines the pair identified from the first stage more closely and picks one sequence to be the final sequence. The problem under this setup belongs to a class of multiple stopping time problems. In particular, it is an ordered two concatenated Markov stopping time problem. We obtain the optimal solution using the tools from the multiple stopping time theory. Numerical simulation results show that this search strategy can significantly reduce the searching time, especially when $F_{1}$ is rare.

preprint 2013 arXiv

Sparse Recovery from Nonlinear Measurements with Applications in Bad Data Detection for Power Networks

In this paper, we consider the problem of sparse recovery from nonlinear measurements, which has applications in state estimation and bad data detection for power networks. An iterative mixed $\ell_1$ and $\ell_2$ convex program is used to estimate the true state by locally linearizing the nonlinear measurements. When the measurements are linear, through using the almost Euclidean property for a linear subspace, we derive a new performance bound for the state estimation error under sparse bad data and additive observation noise. As a byproduct, in this paper we provide sharp bounds on the almost Euclidean property of a linear subspace, using the "escape-through-the-mesh" theorem from geometric functional analysis. When the measurements are nonlinear, we give conditions under which the solution of the iterative algorithm converges to the true state even though the locally linearized measurements may not be the actual nonlinear measurements. We numerically evaluate our iterative convex programming approach to perform bad data detections in nonlinear electrical power networks problems. We are able to use semidefinite programming to verify the conditions for convergence of the proposed iterative sparse recovery algorithms from nonlinear measurements.

preprint 2013 arXiv

Universally Elevating the Phase Transition Performance of Compressed Sensing: Non-Isometric Matrices are Not Necessarily Bad Matrices

In compressed sensing problems, $\ell_1$ minimization or Basis Pursuit was known to have the best provable phase transition performance of recoverable sparsity among polynomial-time algorithms. It is of great theoretical and practical interest to find alternative polynomial-time algorithms which perform better than $\ell_1$ minimization. \cite{Icassp reweighted l_1}, \cite{Isit reweighted l_1}, \cite{XuScaingLaw} and \cite{iterativereweightedjournal} have shown that a two-stage re-weighted $\ell_1$ minimization algorithm can boost the phase transition performance for signals whose nonzero elements follow an amplitude probability density function (pdf) $f(\cdot)$ whose $t$-th derivative $f^{t}(0) \neq 0$ for some integer $t \geq 0$. However, for signals whose nonzero elements are strictly suspended from zero in distribution (for example, constant-modulus, only taking values `$+d$' or `$-d$' for some nonzero real number $d$), no polynomial-time signal recovery algorithms were known to provide better phase transition performance than plain $\ell_1$ minimization, especially for dense sensing matrices. In this paper, we show that a polynomial-time algorithm can universally elevate the phase-transition performance of compressed sensing, compared with $\ell_1$ minimization, even for signals with constant-modulus nonzero elements. Contrary to the conventional wisdom that compressed sensing matrices should be isometric, we show that non-isometric matrices are not necessarily bad sensing matrices. In this paper, we also provide a framework for recovering sparse signals when sensing matrices are not isometric.

preprint 2012 arXiv

Matrix Design for Optimal Sensing

We design optimal $2 \times N$ ($2 <N$) matrices, with unit columns, so that the maximum condition number of all the submatrices comprising 3 columns is minimized. The problem has two applications. When estimating a 2-dimensional signal by using only three of $N$ observations at a given time, this minimizes the worst-case achievable estimation error. It also captures the problem of optimum sensor placement for monitoring a source located in a plane, when only a minimum number of required sensors are active at any given time. For arbitrary $N\geq3$, we derive the optimal matrices which minimize the maximum condition number of all the submatrices of three columns. Surprisingly, a uniform distribution of the columns is \emph{not} the optimal design for odd $N\geq 7$.
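The objective in this abstract, the worst condition number over all 3-column submatrices of a $2 \times N$ matrix with unit columns, can be evaluated directly for small $N$. In this sketch each unit column is parameterized by an angle, and the condition number is obtained from the eigenvalues of the $2 \times 2$ Gram matrix. The function names are hypothetical, and reading a "uniform" design as equally spaced angles in $[0, \pi)$ is an assumption.

```python
import itertools
import math

def cond_2xk(angles):
    """Condition number of the 2 x k matrix with unit columns (cos a, sin a)."""
    # Entries of the 2 x 2 Gram matrix B B^T = [[a, b], [b, c]].
    a = sum(math.cos(t) ** 2 for t in angles)
    b = sum(math.cos(t) * math.sin(t) for t in angles)
    c = sum(math.sin(t) ** 2 for t in angles)
    half_gap = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    lam_max = (a + c) / 2.0 + half_gap
    lam_min = (a + c) / 2.0 - half_gap
    return math.inf if lam_min <= 0.0 else math.sqrt(lam_max / lam_min)

def worst_case_cond(angles, k=3):
    """Maximum condition number over all submatrices made of k columns."""
    return max(cond_2xk(sub) for sub in itertools.combinations(angles, k))

n = 7
uniform = [i * math.pi / n for i in range(n)]
print(worst_case_cond(uniform))
```

A candidate design for a given $N$ can be compared against the equally spaced one by passing its angles to worst_case_cond. The brute-force search over column triples is only practical for small $N$.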

preprint 2012 arXiv

On the Mixing Time of Markov Chain Monte Carlo for Integer Least-Square Problems

In this paper, we study the mixing time of Markov Chain Monte Carlo (MCMC) for integer least-square (LS) optimization problems. It is found that the mixing time of MCMC for integer LS problems depends on the structure of the underlying lattice. More specifically, the mixing time of MCMC is closely related to whether there is a local minimum in the lattice structure. For some lattices, the mixing time of the Markov chain is independent of the signal-to-noise ratio ($SNR$) and grows polynomially in the problem dimension; while for some lattices, the mixing time grows unboundedly as $SNR$ grows. Both theoretical and empirical results suggest that to ensure fast mixing, the temperature for MCMC should often grow positively as the $SNR$ increases. We also derive the probability that there exist local minima in an integer least-square problem, which can be as high as $1/3-\frac{1}{\sqrt{5}}+\frac{2\arctan(\sqrt{5/3})}{\sqrt{5}\pi}$.

preprint 2012 arXiv

Sensing with Optimal Matrices

We consider the problem of designing optimal $M \times N$ ($M \leq N$) sensing matrices which minimize the maximum condition number of all the submatrices of $K$ columns. Such matrices minimize the worst-case estimation errors when only $K$ sensors out of $N$ sensors are available for sensing at a given time. For M=2 and matrices with unit-normed columns, this problem is equivalent to the problem of maximizing the minimum singular value among all the submatrices of $K$ columns. For M=2, we are able to give a closed form formula for the condition number of the submatrices. When M=2 and K=3, for an arbitrary $N\geq3$, we derive the optimal matrices which minimize the maximum condition number of all the submatrices of $K$ columns. Surprisingly, a uniformly distributed design is often \emph{not} the optimal design minimizing the maximum condition number.

preprint 2012 arXiv

Toeplitz Matrix Based Sparse Error Correction in System Identification: Outliers and Random Noises

In this paper, we consider robust system identification under sparse outliers and random noises. In our problem, system parameters are observed through a Toeplitz matrix. All observations are subject to random noises and a few are corrupted with outliers. We reduce this problem of system identification to a sparse error correcting problem using a Toeplitz structured real-numbered coding matrix. We prove the performance guarantee of Toeplitz structured matrix in sparse error correction. Thresholds on the percentage of correctable errors for Toeplitz structured matrices are also established. When both outliers and observation noise are present, we have shown that the estimation error goes to 0 asymptotically as long as the probability density function for observation noise is not "vanishing" around 0.

preprint 2011 arXiv

Improving the Thresholds of Sparse Recovery: An Analysis of a Two-Step Reweighted Basis Pursuit Algorithm

It is well known that $\ell_1$ minimization can be used to recover sufficiently sparse unknown signals from compressed linear measurements. In fact, exact thresholds on the sparsity, as a function of the ratio between the system dimensions, so that with high probability almost all sparse signals can be recovered from i.i.d. Gaussian measurements, have been computed and are referred to as "weak thresholds" \cite{D}. In this paper, we introduce a reweighted $\ell_1$ recovery algorithm composed of two steps: a standard $\ell_1$ minimization step to identify a set of entries where the signal is likely to reside, and a weighted $\ell_1$ minimization step where entries outside this set are penalized. For signals where the non-sparse component entries are independent and identically drawn from certain classes of distributions (including most well known continuous distributions), we prove a \emph{strict} improvement in the weak recovery threshold. Our analysis suggests that the level of improvement in the weak threshold depends on the behavior of the distribution at the origin. Numerical simulations verify the distribution dependence of the threshold improvement very well, and suggest that in the case of i.i.d. Gaussian nonzero entries, the improvement can be quite impressive (over 20% in the example we consider).

preprint 2011 arXiv

On State Estimation with Bad Data Detection

In this paper, we consider the problem of state estimation through observations possibly corrupted with both bad data and additive observation noises. A mixed $\ell_1$ and $\ell_2$ convex programming is used to separate both sparse bad data and additive noises from the observations. Through using the almost Euclidean property for a linear subspace, we derive a new performance bound for the state estimation error under sparse bad data and additive observation noises. Our main contribution is to provide sharp bounds on the almost Euclidean property of a linear subspace, using the "escape-through-a-mesh" theorem from geometric functional analysis. We also propose and numerically evaluate an iterative convex programming approach to performing bad data detections in nonlinear electrical power networks problems.

preprint 2011 arXiv

On the Scaling Law for Compressive Sensing and its Applications

$\ell_1$ minimization can be used to recover sufficiently sparse unknown signals from compressed linear measurements. In fact, exact thresholds on the sparsity (the size of the support set), under which with high probability a sparse signal can be recovered from i.i.d. Gaussian measurements, have been computed and are referred to as "weak thresholds" \cite{D}. It was also known that there is a tradeoff between the sparsity and the $\ell_1$ minimization recovery stability. In this paper, we give a \emph{closed-form} characterization for this tradeoff which we call the scaling law for compressive sensing recovery stability. In a nutshell, we are able to show that as the sparsity backs off $\varpi$ ($0<\varpi<1$) from the weak threshold of $\ell_1$ recovery, the parameter for the recovery stability will scale as $\frac{1}{\sqrt{1-\varpi}}$. Our result is based on a careful analysis through the Grassmann angle framework for the Gaussian measurement matrix. We will further discuss how this scaling law helps in analyzing the iterative reweighted $\ell_1$ minimization algorithms. If the nonzero elements over the signal support follow an amplitude probability density function (pdf) $f(\cdot)$ whose $t$-th derivative $f^{t}(0) \neq 0$ for some integer $t \geq 0$, then a certain iterative reweighted $\ell_1$ minimization algorithm can be analytically shown to lift the phase transition thresholds (weak thresholds) of the plain $\ell_1$ minimization algorithm.

preprint 2011 arXiv

Sparse Recovery with Graph Constraints: Fundamental Limits and Measurement Construction

This paper addresses the problem of sparse recovery with graph constraints in the sense that we can take additive measurements over nodes only if they induce a connected subgraph. We provide explicit measurement constructions for several special graphs. A general measurement construction algorithm is also proposed and evaluated. For any given graph $G$ with $n$ nodes, we derive order optimal upper bounds of the minimum number of measurements needed to recover any $k$-sparse vector over $G$ ($M^G_{k,n}$). Our study suggests that $M^G_{k,n}$ may serve as a graph connectivity metric.
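The graph constraint on measurements can be made concrete with a small check. This sketch assumes the graph is given as an adjacency dictionary and that a measurement is the sum of the signal values on a chosen node set, allowed only when that set induces a connected subgraph. The function names are hypothetical and the code does not reproduce the measurement constructions of the paper.

```python
from collections import deque

def induces_connected_subgraph(adj, nodes):
    """True if the subgraph induced by `nodes` is connected (BFS restricted
    to the chosen nodes)."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == nodes

def measure(adj, x, nodes):
    """Additive measurement over `nodes`, rejected if the constraint fails."""
    if not induces_connected_subgraph(adj, nodes):
        raise ValueError("node set does not induce a connected subgraph")
    return sum(x[v] for v in nodes)

# Line graph 0 - 1 - 2 - 3 carrying a 1-sparse signal on node 2.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x = {0: 0.0, 1: 0.0, 2: 5.0, 3: 0.0}
print(measure(line, x, [1, 2, 3]))
```

On the line graph only contiguous segments are feasible measurements, which hints at why sparsely connected graphs need more measurements than well connected ones.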

preprint 2010 arXiv

Analyzing Weighted $\ell_1$ Minimization for Sparse Recovery with Nonuniform Sparse Models (The results of this paper were presented in part at the International Symposium on Information Theory, ISIT 2009.)

In this paper we introduce a nonuniform sparsity model and analyze the performance of an optimized weighted $\ell_1$ minimization over that sparsity model. In particular, we focus on a model where the entries of the unknown vector fall into two sets, with entries of each set having a specific probability of being nonzero. We propose a weighted $\ell_1$ minimization recovery algorithm and analyze its performance using a Grassmann angle approach. We compute explicitly the relationship between the system parameters (the weights, the number of measurements, the size of the two sets, the probabilities of being nonzero) so that when i.i.d. random Gaussian measurement matrices are used, the weighted $\ell_1$ minimization recovers a randomly selected signal drawn from the considered sparsity model with overwhelming probability as the problem dimension increases. This allows us to compute the optimal weights. We demonstrate through rigorous analysis and simulations that for the case when the support of the signal can be divided into two different subclasses with unequal sparsity fractions, the optimal weighted $\ell_1$ minimization outperforms the regular $\ell_1$ minimization substantially. We also generalize the results to an arbitrary number of classes.
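For orientation, a weighted $\ell_1$ program over two sets can be written as follows. This is a sketch of the standard form, not a quotation from the paper: the symbols $K_1$, $K_2$ (the two index sets), $w_1$, $w_2$ (their weights), $A$ (the measurement matrix) and $y$ (the measurements) are notation assumed here for illustration.

```latex
\min_{x \in \mathbb{R}^n} \; w_1 \sum_{i \in K_1} |x_i| \;+\; w_2 \sum_{i \in K_2} |x_i|
\quad \text{subject to} \quad A x = y,
\qquad w_1, w_2 > 0 .
```

Setting $w_1 = w_2$ recovers plain $\ell_1$ minimization, so the weights are what let the program exploit the different nonzero probabilities of the two sets.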

preprint 2010 arXiv

Compressive Sensing over Graphs

In this paper, motivated by network inference and tomography applications, we study the problem of compressive sensing for sparse signal vectors over graphs. In particular, we are interested in recovering sparse vectors representing the properties of the edges of a graph. Unlike existing compressive sensing results, the collective additive measurements we are allowed to take must follow connected paths over the underlying graph. For a sufficiently connected graph with $n$ nodes, it is shown that, using $O(k \log(n))$ path measurements, we are able to recover any $k$-sparse link vector (with no more than $k$ nonzero elements), even though the measurements have to follow the graph path constraints. We further show that the computationally efficient $\ell_1$ minimization can provide theoretical guarantees for inferring such $k$-sparse vectors with $O(k \log(n))$ path measurements from the graph.

preprint 2010 arXiv

Compressive Sensing over the Grassmann Manifold: a Unified Geometric Framework

$\ell_1$ minimization is often used for finding the sparse solutions of an under-determined linear system. In this paper we focus on finding sharp performance bounds on recovering approximately sparse signals using $\ell_1$ minimization, possibly under noisy measurements. While the restricted isometry property is powerful for the analysis of recovering approximately sparse signals with noisy measurements, the known bounds on the achievable sparsity level can be quite loose. (The "sparsity" in this paper means the size of the set of nonzero or significant elements in a signal vector.) The neighborly polytope analysis, which yields sharp bounds for ideally sparse signals, cannot be readily generalized to approximately sparse signals. Starting from a necessary and sufficient condition, the "balancedness" property of linear subspaces, for achieving a certain signal recovery accuracy, we give a unified \emph{null space Grassmann angle}-based geometric framework for analyzing the performance of $\ell_1$ minimization. By investigating the "balancedness" property, this unified framework characterizes sharp quantitative tradeoffs between the considered sparsity and the recovery accuracy of the $\ell_{1}$ optimization. As a consequence, this generalizes the neighborly polytope result for ideally sparse signals. Besides the robustness in the "strong" sense for \emph{all} sparse signals, we also discuss the notions of "weak" and "sectional" robustness. Our results concern fundamental properties of linear subspaces and so may be of independent mathematical interest.

preprint 2010 arXiv

Improved Sparse Recovery Thresholds with Two-Step Reweighted $\ell_1$ Minimization

It is well known that $\ell_1$ minimization can be used to recover sufficiently sparse unknown signals from compressed linear measurements. In fact, exact thresholds on the sparsity, as a function of the ratio between the system dimensions, so that with high probability almost all sparse signals can be recovered from i.i.d. Gaussian measurements, have been computed and are referred to as "weak thresholds" \cite{D}. In this paper, we introduce a reweighted $\ell_1$ recovery algorithm composed of two steps: a standard $\ell_1$ minimization step to identify a set of entries where the signal is likely to reside, and a weighted $\ell_1$ minimization step where entries outside this set are penalized. For signals where the non-sparse component has i.i.d. Gaussian entries, we prove a "strict" improvement in the weak recovery threshold. Simulations suggest that the improvement can be quite impressive: over 20% in the example we consider.
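The two steps described in this abstract can be sketched as a pair of optimization problems. The notation is assumed for illustration and is not taken from the paper: $L$ denotes the set identified in the first step, $L^c$ its complement, and $\omega > 1$ the penalty weight. Choosing $L$ as the indices of the $k$ largest entries of the first-step solution is one plausible rule, stated here as an assumption.

```latex
% Step 1: standard \ell_1 minimization
\hat{x} = \arg\min_{x} \|x\|_1 \quad \text{subject to} \quad A x = y,
\qquad L = \{\text{indices of the } k \text{ largest } |\hat{x}_i|\} \quad \text{(assumed selection rule)}

% Step 2: weighted \ell_1 minimization penalizing entries outside L
x^{\star} = \arg\min_{x} \|x_{L}\|_1 + \omega \, \|x_{L^c}\|_1 \quad \text{subject to} \quad A x = y,
\qquad \omega > 1 .
```

With $\omega = 1$ the second step reduces to the first, so any gain comes from penalizing the entries that the first step judged unlikely to carry signal.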

preprint 2010 arXiv

On the Performance of Sparse Recovery via L_p-minimization (0<=p<=1)

It is known that a high-dimensional sparse vector x* in R^n can be recovered from low-dimensional measurements y = A^{m*n} x* (m<n). In this paper, we investigate the recovery ability of l_p-minimization (0<=p<=1) as p varies, where l_p-minimization returns a vector with the least l_p "norm" among all the vectors x satisfying Ax=y. Besides analyzing the performance of strong recovery, where l_p-minimization needs to recover all the sparse vectors up to a certain sparsity, we also for the first time analyze the performance of "weak" recovery of l_p-minimization (0<=p<1), where the aim is to recover all the sparse vectors on one support with a fixed sign pattern. When m/n goes to 1, we provide sharp thresholds of the sparsity ratio that differentiate the success and failure of l_p-minimization. For strong recovery, the threshold strictly decreases from 0.5 to 0.239 as p increases from 0 to 1. Surprisingly, for weak recovery, the threshold is 2/3 for all p in [0,1), while the threshold is 1 for l_1-minimization. We also explicitly demonstrate that l_p-minimization (p<1) can return a denser solution than l_1-minimization. For any m/n<1, we provide bounds on the sparsity ratio for strong recovery and weak recovery, respectively, below which l_p-minimization succeeds with overwhelming probability. Our bound for strong recovery improves on the existing bounds when m/n is large. Regarding the recovery threshold, l_p-minimization has a higher threshold with smaller p for strong recovery; the threshold is the same for all p for sectional recovery; and l_1-minimization can outperform l_p-minimization for weak recovery. These are in contrast to the traditional wisdom that l_p-minimization has better sparse recovery ability than l_1-minimization since it is closer to l_0-minimization. We provide an intuitive explanation for our findings and use numerical examples to illustrate the theoretical predictions.
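As a small illustration of the l_p "norm" objective mentioned in this abstract, the sketch below evaluates sum_i |x_i|^p for two candidate vectors that satisfy the same constraint Ax=y. The function name `lp_cost`, the 1-by-4 matrix and both candidates are hypothetical choices made for this example; it illustrates only the objective and the traditional intuition the abstract refers to, not the paper's thresholds or its counterexamples.

```python
def lp_cost(x, p):
    """Return sum_i |x_i|^p; for p = 0 this counts the nonzero entries."""
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")
    if p == 0:
        return float(sum(1 for v in x if v != 0))
    return float(sum(abs(v) ** p for v in x))


# Hypothetical toy system: A = [[1, 1, 1, 1]] and y = [1], so m = 1 < n = 4.
A_row = [1.0, 1.0, 1.0, 1.0]
y = 1.0

# Two feasible candidates: one sparse, one dense.
sparse_candidate = [1.0, 0.0, 0.0, 0.0]
dense_candidate = [0.25, 0.25, 0.25, 0.25]

# Check that both candidates satisfy Ax = y.
for cand in (sparse_candidate, dense_candidate):
    assert abs(sum(a * v for a, v in zip(A_row, cand)) - y) < 1e-12

for p in (0, 0.5, 1):
    print(p, lp_cost(sparse_candidate, p), lp_cost(dense_candidate, p))
```

In this toy case the two candidates tie under p = 1, while any p < 1 assigns the sparse candidate the smaller cost. The abstract's point is that this intuition does not always carry over to recovery thresholds.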