Source author record

Shuchin Aeron

Shuchin Aeron appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

33 works

20 topics

4 close collaborators

preprint · 2026 · arXiv

Optimal Representations for Generalized Contrastive Learning with Imbalanced Datasets

In this paper, we provide a computable characterization of the geometry of optimal representations in Contrastive Learning (CL) when the classes are imbalanced. When classes are balanced and the representation dimension is greater than the number of classes, it is well-known that the optimal representations exhibit Neural Collapse (NC), i.e., representations from the same class collapse to their class means and the class means form an Equiangular Tight Frame (ETF). For imbalanced classes and a large, generalized family of CL losses, we prove that the optimal representations of all samples from the same class collapse to their class means and their geometry exhibits an angular symmetry structure that is determined by the relative class proportions. In general, we show that the geometry can be determined by solving a convex optimization problem. Exploiting this symmetry structure, we analytically investigate a special case where class imbalance is extreme and prove that CL exhibits a phenomenon called Minority Collapse (MC) where all samples from the minority classes (classes with small probabilities) collapse into a single vector, whenever the class imbalance exceeds a threshold, which in turn depends on the regularity properties of the CL loss used and on the number of negative samples. Numerical results are provided to illustrate these phenomena and corroborate the theoretical results. We conclude by identifying a number of open problems.
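As a small illustration of the balanced-case geometry mentioned in this abstract, the sketch below builds a simplex Equiangular Tight Frame for K classes and lets one check that the class means have unit norm and equal pairwise inner products of -1/(K-1). The helper names are hypothetical, and the construction (centered, rescaled standard basis vectors in K dimensions) is one standard choice rather than code from the paper; it does not cover the imbalanced case the paper analyzes.

```python
import math


def simplex_etf(num_classes):
    """Return num_classes unit vectors forming a simplex ETF (illustrative helper)."""
    k = num_classes
    if k < 2:
        raise ValueError("need at least two classes")
    scale = math.sqrt(k / (k - 1))
    # Row i is the i-th standard basis vector, centered and rescaled to unit norm.
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    means = simplex_etf(4)
    # Squared norms of the class means, then one pairwise inner product.
    print([round(dot(m, m), 6) for m in means])
    print(round(dot(means[0], means[1]), 6))
```

Every pair of distinct class means has the same inner product, which is the equiangular property the abstract refers to.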

preprint · 2022 · arXiv

Conditional entropy minimization principle for learning domain invariant representation features

Invariance-principle-based methods, such as Invariant Risk Minimization (IRM), have recently emerged as promising approaches for Domain Generalization (DG). Despite promising theory, such approaches fail in common classification tasks due to the mixing of true invariant features and spurious invariant features. To address this, we propose a framework based on the conditional entropy minimization (CEM) principle to filter out the spurious invariant features, leading to a new algorithm with a better generalization capability. We show that our proposed approach is closely related to the well-known Information Bottleneck (IB) framework and prove that under certain assumptions, entropy minimization can exactly recover the true invariant features. Our approach provides competitive classification accuracy compared to recent theoretically-principled state-of-the-art alternatives across several DG datasets.

preprint · 2022 · arXiv

Easy Variational Inference for Categorical Models via an Independent Binary Approximation

We pursue tractable Bayesian analysis of generalized linear models (GLMs) for categorical data. Thus far, GLMs are difficult to scale to more than a few dozen categories due to non-conjugacy or strong posterior dependencies when using conjugate auxiliary variable methods. We define a new class of GLMs for categorical data called categorical-from-binary (CB) models. Each CB model has a likelihood that is bounded by the product of binary likelihoods, suggesting a natural posterior approximation. This approximation makes inference straightforward and fast; using well-known auxiliary variables for probit or logistic regression, the product of binary models admits conjugate closed-form variational inference that is embarrassingly parallel across categories and invariant to category ordering. Moreover, an independent binary model simultaneously approximates multiple CB models. Bayesian model averaging over these can improve the quality of the approximation for any given dataset. We show that our approach scales to thousands of categories, outperforming posterior estimation competitors like Automatic Differentiation Variational Inference (ADVI) and No U-Turn Sampling (NUTS) in the time required to achieve fixed prediction quality.

preprint · 2022 · arXiv

Joint covariate-alignment and concept-alignment: a framework for domain generalization

In this paper, we propose a novel domain generalization (DG) framework based on a new upper bound to the risk on the unseen domain. In particular, our framework proposes to jointly minimize both the covariate shift and the concept shift between the seen domains for better performance on the unseen domain. While the proposed approach can be implemented via an arbitrary combination of covariate-alignment and concept-alignment modules, in this work we use well-established approaches for distributional alignment, namely Maximum Mean Discrepancy (MMD) and covariance Alignment (CORAL), and use an Invariant Risk Minimization (IRM)-based approach for concept alignment. Our numerical results show that the proposed methods perform as well as or better than the state-of-the-art for domain generalization on several data sets.
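To make the Maximum Mean Discrepancy term in this abstract concrete, here is a minimal sketch of the biased (V-statistic) estimate of squared MMD between two samples with a Gaussian kernel. The function names and the bandwidth default are assumptions made for illustration; the paper's actual alignment modules and training loop are not reproduced here.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD: E k(x,x') + E k(y,y') - 2 E k(x,y)."""

    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    domain_a = [(0.0,), (0.1,), (-0.1,)]
    domain_b = [(3.0,), (3.1,), (2.9,)]
    # Identical samples give zero; well-separated samples give a large value.
    print(mmd_squared(domain_a, domain_a))
    print(mmd_squared(domain_a, domain_b))
```

A covariate-alignment module in this spirit would add such a discrepancy between feature samples from two seen domains to the training loss.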

preprint · 2022 · arXiv

Measure Estimation in the Barycentric Coding Model

This paper considers the problem of measure estimation under the barycentric coding model (BCM), in which an unknown measure is assumed to belong to the set of Wasserstein-2 barycenters of a finite set of known measures. Estimating a measure under this model is equivalent to estimating the unknown barycentric coordinates. We provide novel geometrical, statistical, and computational insights for measure estimation under the BCM, consisting of three main results. Our first main result leverages the Riemannian geometry of Wasserstein-2 space to provide a procedure for recovering the barycentric coordinates as the solution to a quadratic optimization problem assuming access to the true reference measures. The essential geometric insight is that the parameters of this quadratic problem are determined by inner products between the optimal displacement maps from the given measure to the reference measures defining the BCM. Our second main result then establishes an algorithm for solving for the coordinates in the BCM when all the measures are observed empirically via i.i.d. samples. We prove precise rates of convergence for this algorithm -- determined by the smoothness of the underlying measures and their dimensionality -- thereby guaranteeing its statistical consistency. Finally, we demonstrate the utility of the BCM and associated estimation procedures in three application areas: (i) covariance estimation for Gaussian measures; (ii) image processing; and (iii) natural language processing.

preprint · 2022 · arXiv

r-local sensing: Improved algorithm and applications

The unlabeled sensing problem is to solve a noisy linear system of equations under unknown permutation of the measurements. We study a particular case of the problem where the permutations are restricted to be r-local, i.e. the permutation matrix is block diagonal with r x r blocks. Assuming a Gaussian measurement matrix, we argue that the r-local permutation model is more challenging compared to a recent sparse permutation model. We propose a proximal alternating minimization algorithm for the general unlabeled sensing problem that provably converges to a first order stationary point. Applied to the r-local model, we show that the resulting algorithm is efficient. We validate the algorithm on synthetic and real datasets. We also formulate the 1-d unassigned distance geometry problem as an unlabeled sensing problem with a structured measurement matrix.
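The r-local permutation model in this abstract can be pictured with a short sketch: the measurements are shuffled only within consecutive blocks of size r, which corresponds to a block-diagonal permutation matrix with r x r blocks. The helper names and the seeded random generator are assumptions for illustration, and the sketch only generates and applies such a permutation; it does not implement the proximal alternating minimization algorithm.

```python
import random


def r_local_permutation(n, r, seed=0):
    """Return a permutation of range(n) that shuffles indices only within blocks of size r."""
    if n % r != 0:
        raise ValueError("n must be a multiple of r in this sketch")
    rng = random.Random(seed)
    perm = []
    for start in range(0, n, r):
        block = list(range(start, start + r))
        rng.shuffle(block)  # shuffle stays inside the current block
        perm.extend(block)
    return perm


def permute(values, perm):
    """Apply the permutation: output position i receives values[perm[i]]."""
    return [values[p] for p in perm]


if __name__ == "__main__":
    measurements = [10, 11, 12, 20, 21, 22, 30, 31, 32]
    perm = r_local_permutation(len(measurements), r=3, seed=1)
    print(perm)
    print(permute(measurements, perm))
```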

preprint · 2022 · arXiv

R-local unlabeled sensing: A novel graph matching approach for multiview unlabeled sensing under local permutations

Unlabeled sensing is a linear inverse problem where the measurements are scrambled under an unknown permutation leading to loss of correspondence between the measurements and the rows of the sensing matrix. Motivated by practical tasks such as mobile sensor networks, target tracking and the pose and correspondence estimation between point clouds, we study a special case of this problem restricting the class of permutations to be local and allowing for multiple views. In this setting, namely unlabeled multi-view sensing with local permutation, previous results and algorithms are not directly applicable. In this paper, we propose a computationally efficient algorithm that creatively exploits the machinery of graph alignment and Gromov-Wasserstein alignment and leverages the multiple views to estimate the local permutations. Simulation results on synthetic data sets indicate that the proposed algorithm is scalable and applicable to the challenging regimes of low to moderate SNR.

preprint · 2022 · arXiv

Towards Designing and Exploiting Generative Networks for Neutrino Physics Experiments using Liquid Argon Time Projection Chambers

In this paper, we show that a hybrid approach to generative modeling via combining the decoder from an autoencoder together with an explicit generative model for the latent space is a promising method for producing images of particle trajectories in a liquid argon time projection chamber (LArTPC). LArTPCs are a type of particle physics detector used by several current and future experiments focused on studies of the neutrino. We implement a Vector-Quantized Variational Autoencoder (VQ-VAE) and PixelCNN which produces images with LArTPC-like features and introduce a method to evaluate the quality of the images using a semantic segmentation that identifies important physics-based features.

preprint · 2020 · arXiv

Optimal Transport Based Change Point Detection and Time Series Segment Clustering

Two common problems in time series analysis are the decomposition of the data stream into disjoint segments that are each in some sense "homogeneous" - a problem known as Change Point Detection (CPD) - and the grouping of similar nonadjacent segments, a problem that we call Time Series Segment Clustering (TSSC). Building upon recent theoretical advances characterizing the limiting distribution-free behavior of the Wasserstein two-sample test (Ramdas et al. 2015), we propose a novel algorithm for unsupervised, distribution-free CPD which is amenable to both offline and online settings. We also introduce a method to mitigate false positives in CPD and address TSSC by using the Wasserstein distance between the detected segments to build an affinity matrix to which we apply spectral clustering. Results on both synthetic and real data sets show the benefits of the approach.
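A rough sketch of the idea of using a Wasserstein distance between adjacent windows as a change score is given below. It is a simplified stand-in, not the paper's algorithm: it uses the 1-D Wasserstein-1 distance between equal-size windows (computed by matching sorted samples) and picks the index with the largest score, without the distribution-free test statistic or the false-positive mitigation the abstract mentions. All function names are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """W1 distance between two equal-size 1-D samples via sorted matching."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def change_scores(series, window):
    """Score each index t by the W1 distance between the windows before and after t."""
    return {
        t: wasserstein_1d(series[t - window:t], series[t:t + window])
        for t in range(window, len(series) - window + 1)
    }


def most_likely_change_point(series, window):
    """Return the index whose two adjacent windows differ the most."""
    scores = change_scores(series, window)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    series = [0.0] * 30 + [5.0] * 30
    print(most_likely_change_point(series, window=10))
```

The same pairwise distance, computed between detected segments instead of adjacent windows, could populate an affinity matrix for the segment clustering step.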

preprint · 2020 · arXiv

Optimization-based incentivization and control scheme for autonomous traffic

We consider the problem of incentivization and optimal control of autonomous vehicles for improving traffic congestion. In our scenario, autonomous vehicles must be incentivized in order to participate in traffic improvement. Using the theory and methods of optimal transport, we propose a constrained optimization framework over dynamics governed by partial differential equations, so that we can optimally select a portion of vehicles to be incentivized and controlled. The goal of the optimization is to obtain a uniform distribution of vehicles over the spatial domain. To achieve this, we consider two types of penalties on vehicle density, one is the $L^2$ cost and the other is a multiscale-norm cost, commonly used in fluid-mixing problems. To solve this non-convex optimization problem, we introduce a novel algorithm, which iterates between solving a convex optimization problem and propagating the flow of uncontrolled vehicles according to the Lighthill-Whitham-Richards model. We perform numerical simulations, which suggest that the optimization of the $L^2$ cost is ineffective while optimization of the multiscale norm is effective. The results also suggest the use of a dedicated lane for this type of control in practice.

preprint · 2020 · arXiv

Representation Learning via Adversarially-Contrastive Optimal Transport

In this paper, we study the problem of learning compact (low-dimensional) representations for sequential data that captures its implicit spatio-temporal cues. To maximize extraction of such informative cues from the data, we set the problem within the context of contrastive representation learning and to that end propose a novel objective via optimal transport. Specifically, our formulation seeks a low-dimensional subspace representation of the data that jointly (i) maximizes the distance of the data (embedded in this subspace) from an adversarial data distribution under the optimal transport, a.k.a. the Wasserstein distance, (ii) captures the temporal order, and (iii) minimizes the data distortion. To generate the adversarial distribution, we propose a novel framework connecting Wasserstein GANs with a classifier, allowing a principled mechanism for producing good negative distributions for contrastive learning, which is currently a challenging problem. Our full objective is cast as a subspace learning problem on the Grassmann manifold and solved via Riemannian optimization. To empirically study our formulation, we provide experiments on the task of human action recognition in video sequences. Our results demonstrate competitive performance against challenging baselines.

preprint · 2016 · arXiv

A Randomized Tensor Singular Value Decomposition based on the t-product

The tensor Singular Value Decomposition (t-SVD) for third order tensors that was proposed by Kilmer and Martin (2011) has been applied successfully in many fields, such as computed tomography, facial recognition, and video completion. In this paper, we propose a method that extends a well-known randomized matrix method to the t-SVD. This method can produce a factorization with similar properties to the t-SVD, but is more computationally efficient on very large datasets. We present details of the algorithm, theoretical results, and provide numerical results that show the promise of our approach for compressing and analyzing datasets. We also present an improved analysis of the randomized subspace iteration for matrices, which may be of independent interest to the scientific community.

preprint · 2016 · arXiv

Algorithms for item categorization based on ordinal ranking data

We present a new method for identifying the latent categorization of items based on their rankings. Complementing a recent work that uses a Dirichlet prior on preference vectors and variational inference, we show that this problem can be effectively dealt with using existing community detection algorithms, with the communities corresponding to item categories. In particular, we convert the bipartite ranking data to a unipartite graph of item affinities, and apply community detection algorithms. In this context we modify an existing algorithm, namely the label propagation algorithm, to a variant that uses the distance between the nodes for weighting the label propagation, to identify the categories. We propose and analyze a synthetic ordinal ranking model and show its relation to the recently much studied stochastic block model. We test our algorithms on synthetic data and compare performance with several popular community detection algorithms. We also test the method on real data sets of movie categorization from the Movie Lens database. In all of the cases our algorithm is able to identify the categories for a suitable choice of tuning parameter.

preprint · 2016 · arXiv

Low-tubal-rank Tensor Completion using Alternating Minimization

The low-tubal-rank tensor model has been recently proposed for real-world multidimensional data. In this paper, we study the low-tubal-rank tensor completion problem, i.e., to recover a third-order tensor by observing a subset of its elements selected uniformly at random. We propose a fast iterative algorithm, called Tubal-Alt-Min, that is inspired by a similar approach for low-rank matrix completion. The unknown low-tubal-rank tensor is represented as the product of two much smaller tensors with the low-tubal-rank property being automatically incorporated, and Tubal-Alt-Min alternates between estimating those two tensors using tensor least squares minimization. First, we note that tensor least squares minimization is different from its matrix counterpart and nontrivial as the circular convolution operator of the low-tubal-rank tensor model is intertwined with the sub-sampling operator. Second, the theoretical performance guarantee is challenging since Tubal-Alt-Min is iterative and nonconvex in nature. We prove that 1) Tubal-Alt-Min guarantees exponential convergence to the global optima, and 2) for an $n \times n \times k$ tensor with tubal-rank $r \ll n$, the required sampling complexity is $O(nr^2k \log^3 n)$ and the computational complexity is $O(n^2rk^2 \log^2 n)$. Third, on both synthetic data and real-world video data, evaluation results show that compared with tensor-nuclear norm minimization (TNN-ADMM), Tubal-Alt-Min improves the recovery error dramatically (by orders of magnitude). It is estimated that Tubal-Alt-Min converges at an exponential rate $10^{-0.4423 \text{Iter}}$ where $\text{Iter}$ denotes the number of iterations, which is much faster than TNN-ADMM's $10^{-0.0332 \text{Iter}}$, and the running time can be accelerated by more than $5$ times for a $200 \times 200 \times 20$ tensor.

preprint · 2016 · arXiv

On Deterministic Conditions for Subspace Clustering under Missing Data

In this paper we present deterministic conditions for success of sparse subspace clustering (SSC) under missing data, when data is assumed to come from a Union of Subspaces (UoS) model. We consider two algorithms, which are variants of SSC with entry-wise zero-filling that differ in terms of the optimization problems used to find the affinity matrix for spectral clustering. For both algorithms, we provide deterministic conditions for any pattern of missing data such that perfect clustering can be achieved. We provide extensive sets of simulation results for clustering as well as completion of data at missing entries, under the UoS model. Our experimental results indicate that in contrast to the full data case, accurate clustering does not imply accurate subspace identification and completion, indicating the natural order of relative hardness of these problems.

preprint · 2016 · arXiv

Tensor Completion by Alternating Minimization under the Tensor Train (TT) Model

Using the matrix product state (MPS) representation of tensor train decompositions, in this paper we propose a tensor completion algorithm which alternates over the matrices (tensors) in the MPS representation. This development is motivated in part by the success of matrix completion algorithms which alternate over the (low-rank) factors. We comment on the computational complexity of the proposed algorithm and numerically compare it with existing methods employing low rank tensor train approximation for data completion as well as several other recently proposed methods. We show that our method is superior to existing ones for a variety of real settings.

preprint · 2015 · arXiv

An algorithm for online tensor prediction

We present a new method for online prediction and learning of tensors ($N$-way arrays, $N >2$) from sequential measurements. We focus on the specific case of 3-D tensors and exploit a recently developed framework of structured tensor decompositions proposed in [1]. In this framework it is possible to treat 3-D tensors as linear operators and appropriately generalize notions of rank and positive definiteness to tensors in a natural way. Using these notions we propose a generalization of the matrix exponentiated gradient descent algorithm [2] to a tensor exponentiated gradient descent algorithm using an extension of the notion of von-Neumann divergence to tensors. Then, following a similar construction as in [3], we exploit this algorithm to propose an online algorithm for learning and prediction of tensors with provable regret guarantees. Simulation results are presented on semi-synthetic data sets of ratings evolving in time under local influence over a social network. The results indicate superior performance compared to other (online) convex tensor completion methods.

preprint · 2015 · arXiv

Clustering multi-way data: a novel algebraic approach

In this paper, we develop a method for unsupervised clustering of two-way (matrix) data by combining two recent innovations from different fields: the Sparse Subspace Clustering (SSC) algorithm [10], which groups points coming from a union of subspaces into their respective subspaces, and the t-product [18], which was introduced to provide a matrix-like multiplication for third order tensors. Our algorithm is analogous to SSC in that an "affinity" between different data points is built using a sparse self-representation of the data. Unlike SSC, we employ the t-product in the self-representation. This allows us more flexibility in modeling; in fact, SSC is a special case of our method. When using the t-product, three-way arrays are treated as matrices whose elements (scalars) are n-tuples or tubes. Convolutions take the place of scalar multiplication. This framework allows us to embed the 2-D data into a vector-space-like structure called a free module over a commutative ring. These free modules retain many properties of complex inner-product spaces, and we leverage that to provide theoretical guarantees on our algorithm. We show that compared to vector-space counterparts, SSmC achieves higher accuracy and is better able to cluster data with less preprocessing in some image clustering problems. In particular, we show the performance of the proposed method on the Weizmann face database, the Extended Yale B Face database and the MNIST handwritten digits database.
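The description of the t-product above (matrices whose entries are tubes, with circular convolution in place of scalar multiplication) can be sketched directly. The code below is a plain nested-list illustration under that definition, with hypothetical function names; practical implementations usually compute the same product through FFTs along the third dimension, which is not shown here.

```python
def t_product(a, b):
    """t-product of a (n1 x n2 x n3) and b (n2 x n4 x n3), stored as nested lists."""
    n1, n2, n3 = len(a), len(a[0]), len(a[0][0])
    if len(b) != n2 or len(b[0][0]) != n3:
        raise ValueError("inner dimensions or tube lengths do not match")
    n4 = len(b[0])
    c = [[[0.0] * n3 for _ in range(n4)] for _ in range(n1)]
    for i in range(n1):
        for l in range(n4):
            for j in range(n2):
                # Circular convolution of tube a[i][j] with tube b[j][l].
                for k in range(n3):
                    for s in range(n3):
                        c[i][l][k] += a[i][j][s] * b[j][l][(k - s) % n3]
    return c


def identity_tensor(n, n3):
    """n x n x n3 tensor whose first frontal slice is the identity and the rest zero."""
    return [
        [[1.0 if (i == j and k == 0) else 0.0 for k in range(n3)] for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    a = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
         [[7.0, 8.0, 9.0], [1.0, 0.0, 2.0]]]
    # Multiplying by the identity tensor returns the original tensor.
    print(t_product(a, identity_tensor(2, 3)) == a)
```

With tubes of length one (n3 = 1), the product reduces to ordinary matrix multiplication, which is consistent with SSC being a special case of the tensor method.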

preprint · 2015 · arXiv

Denoising and Completion of 3D Data via Multidimensional Dictionary Learning

In this paper a new dictionary learning algorithm for multidimensional data is proposed. Unlike most conventional dictionary learning methods, which are derived for dealing with vectors or matrices, our algorithm, named KTSVD, learns a multidimensional dictionary directly via a novel algebraic approach for tensor factorization as proposed in [3, 12, 13]. Using this approach one can define a tensor-SVD, and we propose to extend the K-SVD algorithm used for 1-D data to a K-TSVD algorithm for handling 2-D and 3-D data. Our algorithm, based on the idea of sparse coding (using group-sparsity over multidimensional coefficient vectors), alternates between estimating a compact representation and dictionary learning. We analyze our KTSVD algorithm and demonstrate its results on video completion and multispectral image denoising.

preprint · 2015 · arXiv

Exact tensor completion using t-SVD

In this paper we focus on the problem of completion of multidimensional arrays (also referred to as tensors) from limited sampling. Our approach is based on a recently proposed tensor-Singular Value Decomposition (t-SVD) [1]. Using this factorization one can derive a notion of tensor rank, referred to as the tensor tubal rank, which has optimality properties similar to that of matrix rank derived from SVD. As shown in [2], some multidimensional data, such as panning video sequences, exhibit low tensor tubal rank, and we look at the problem of completing such data under random sampling of the data cube. We show that by solving a convex optimization problem, which minimizes the tensor nuclear norm obtained as the convex relaxation of tensor tubal rank, one can guarantee recovery with overwhelming probability as long as samples in proportion to the degrees of freedom in t-SVD are observed. In this sense our results are order-wise optimal. The conditions under which this result holds are very similar to the incoherency conditions for matrix completion, albeit we define incoherency under the algebraic set-up of t-SVD. We show the performance of the algorithm on some real data sets and compare it with other existing approaches based on tensor flattening and Tucker decomposition.

preprint · 2015 · arXiv

Group-Invariant Subspace Clustering

In this paper we consider the problem of group invariant subspace clustering where the data is assumed to come from a union of group-invariant subspaces of a vector space, i.e. subspaces which are invariant with respect to action of a given group. Algebraically, such group-invariant subspaces are also referred to as submodules. Similar to the well known Sparse Subspace Clustering approach where the data is assumed to come from a union of subspaces, we analyze an algorithm which, following a recent work [1], we refer to as Sparse Sub-module Clustering (SSmC). The method is based on finding group-sparse self-representation of data points. In this paper we primarily derive general conditions under which such a group-invariant subspace identification is possible. In particular we extend the geometric analysis in [2] and in the process we identify a related problem in geometric functional analysis.

preprint · 2015 · arXiv

Multilinear Subspace Clustering

In this paper we present a new model and an algorithm for unsupervised clustering of 2-D data such as images. We assume that the data comes from a union of multilinear subspaces (UOMS) model, which is a specific structured case of the much studied union of subspaces (UOS) model. For segmentation under this model, we develop the Multilinear Subspace Clustering (MSC) algorithm and evaluate its performance on the YaleB and Olivetti image data sets. We show that MSC is highly competitive with existing algorithms employing the UOS model in terms of clustering performance while enjoying improvement in computational complexity.

preprint · 2014 · arXiv

First Order Methods for Robust Non-negative Matrix Factorization for Large Scale Noisy Data

Nonnegative matrix factorization (NMF) has been shown to be identifiable under the separability assumption, under which all the columns (or rows) of the input data matrix belong to the convex cone generated by only a few of these columns (or rows) [1]. In real applications, however, such a separability assumption is hard to satisfy. Following [4] and [5], in this paper, we look at the Linear Programming (LP) based reformulation to locate the extreme rays of the convex cone but in a noisy setting. Furthermore, in order to deal with large-scale data, we employ First-Order Methods (FOM) to mitigate the computational complexity of LP, which primarily results from a large number of constraints. We show the performance of the algorithm on real and synthetic data sets.

preprint · 2014 · arXiv

Novel methods for multilinear data completion and de-noising based on tensor-SVD

In this paper we propose novel methods for completion (from limited samples) and de-noising of multilinear (tensor) data and as an application consider 3-D and 4-D (color) video data completion and de-noising. We exploit the recently proposed tensor-Singular Value Decomposition (t-SVD) [11]. Based on t-SVD, the notion of multilinear rank and a related tensor nuclear norm were proposed in [11] to characterize informational and structural complexity of multilinear data. We first show that videos with linear camera motion can be represented more efficiently using t-SVD compared to the approaches based on vectorizing or flattening of the tensors. Since efficiency in representation implies efficiency in recovery, we outline a tensor nuclear norm penalized algorithm for video completion from missing entries. Application of the proposed algorithm for video recovery from missing entries is shown to yield superior performance over existing methods. We also consider the problem of tensor robust Principal Component Analysis (PCA) for de-noising 3-D video data from sparse random corruptions. We show superior performance of our method compared to the matrix robust PCA adapted to this setting as proposed in [4].

preprint · 2014 · arXiv

Robust Large Scale Non-negative Matrix Factorization using Proximal Point Algorithm

A robust algorithm for non-negative matrix factorization (NMF) is presented in this paper with the purpose of dealing with large-scale data, where the separability assumption is satisfied. In particular, we modify the Linear Programming (LP) algorithm of [9] by introducing a reduced set of constraints for exact NMF. In contrast to the previous approaches, the proposed algorithm does not require knowledge of the factorization rank (extreme rays [3] or topics [7]). Furthermore, motivated by a similar problem arising in the context of metabolic network analysis [13], we consider an entirely different regime where the number of extreme rays or topics can be much larger than the dimension of the data vectors. The performance of the algorithm on different synthetic data sets is provided.

preprint · 2013 · arXiv

Complexity penalized hydraulic fracture localization and moment tensor estimation under limited model information

In this paper we present a novel technique for micro-seismic localization using a group sparse penalization that is robust to the focal mechanism of the source and requires only a velocity model of the stratigraphy rather than a full Green's function model of the earth's response. In this technique we construct a set of perfect delta detector responses, one for each detector in the array, to a seismic event at a given location and impose a group sparsity across the array. This scheme is independent of the moment tensor and exploits the time compactness of the incident seismic signal. Furthermore we present a method for improving the inversion of the moment tensor and Green's function when the geometry of seismic array is limited. In particular we demonstrate that both Tikhonov regularization and truncated SVD can improve the recovery of the moment tensor and be robust to noise. We evaluate our algorithm on synthetic data and present error bounds for both estimation of the moment tensor as well as localization. Furthermore we discuss the estimated moment tensor accuracy as a function of both array geometry and fault orientation.

preprint · 2013 · arXiv

Consensus in the presence of interference

This paper studies distributed strategies for average-consensus of arbitrary vectors in the presence of network interference. We assume that the underlying communication on any link suffers from additive interference caused by the communication of other agents following their own consensus protocol. Additionally, no agent knows how many or which agents are interfering with its communication. Clearly, the standard consensus protocol does not remain applicable in such scenarios. In this paper, we cast an algebraic structure over the interference and show that the standard protocol can be modified such that the average is reachable in a subspace whose dimension is complementary to the maximal dimension of the interference subspaces (over all of the communication links). To develop the results, we use information alignment to align the intended transmission (over each link) to the null-space of the interference (on that link). We show that this alignment is indeed invertible, i.e. the intended transmission can be recovered, over which, subsequently, the consensus protocol is implemented. That local protocols exist even when the collection of the interference subspaces spans the entire vector space is somewhat surprising.
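For readers unfamiliar with the baseline this abstract refers to, the sketch below shows a standard, interference-free average-consensus iteration with scalar states on a small undirected ring. The graph, step size, and function names are assumptions chosen for illustration; the interference model and the information-alignment modification proposed in the paper are not implemented here.

```python
def consensus_step(values, neighbors, step_size):
    """One synchronous update: each agent moves toward its neighbors' values."""
    return [
        x + step_size * sum(values[j] - x for j in neighbors[i])
        for i, x in enumerate(values)
    ]


def run_consensus(values, neighbors, step_size, num_iters):
    """Iterate the update a fixed number of times and return the final states."""
    for _ in range(num_iters):
        values = consensus_step(values, neighbors, step_size)
    return values


# Four agents on a ring; each agent exchanges values with its two neighbors.
RING = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}


if __name__ == "__main__":
    start = [1.0, 5.0, 3.0, 7.0]
    final = run_consensus(start, RING, step_size=0.25, num_iters=100)
    # On a symmetric graph the sum is preserved, so states approach the initial mean.
    print(final, sum(start) / len(start))
```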

preprint · 2013 · arXiv

Exploiting Structural Complexity for Robust and Rapid Hyperspectral Imaging

This paper presents several strategies for spectral de-noising of hyperspectral images and hypercube reconstruction from a limited number of tomographic measurements. In particular, we show that the non-noisy spectral data, when stacked across the spectral dimension, exhibits low rank. On the other hand, under the same representation, the spectral noise exhibits a banded structure. Motivated by this, we show that the de-noised spectral data and the unknown spectral noise and the respective bands can be simultaneously estimated through the use of a low-rank and simultaneous sparse minimization operation without prior knowledge of the noisy bands. This result is novel for hyperspectral imaging applications. In addition, we show that imaging for the Computed Tomography Imaging Systems (CTIS) can be improved under limited angle tomography by using low-rank penalization. For both of these cases we exploit the recent results in the theory of low-rank matrix completion using nuclear norm minimization.

preprint2013arXiv

Joint multi-mode dispersion extraction in Fourier and space time domains

In this paper we present a novel broadband approach for the extraction of dispersion curves of multiple time-frequency overlapped dispersive modes such as in borehole acoustic data. The new approach works jointly in the Fourier and space time domains and, in contrast to existing space time approaches that mainly work for time-frequency separated signals, efficiently handles multiple signals with significant time-frequency overlap. The proposed method begins by exploiting the slowness (phase and group) and time location estimates based on the frequency-wavenumber (f-k) domain sparsity-penalized broadband dispersion extraction method presented in \cite{AeronTSP2011}. In this context we first present a Cramér-Rao Bound (CRB) analysis for slowness estimation in the f-k domain and show that, for f-k domain broadband processing, group slowness estimates have more variance than the phase slowness estimates and time location estimates. In order to improve the group slowness estimates we exploit the time compactness property of the modes to effectively represent the data as a linear superposition of time compact space time propagators parameterized by the phase and group slowness. A linear least squares estimation algorithm in the space time domain is then used to obtain improved group slowness estimates. The performance of the method is demonstrated on real borehole acoustic data sets.

preprint2013arXiv

Methods for Large Scale Hydraulic Fracture Monitoring

In this paper we propose computationally efficient and robust methods for estimating the moment tensor and location of micro-seismic event(s) for large search volumes. Our contribution is two-fold. First, we propose a novel joint-complexity measure, namely the sum of nuclear norms, which, while imposing sparsity on the number of fractures (locations) over a large spatial volume, also captures the rank-1 nature of the induced wavefield pattern. This wavefield pattern is modeled as the outer-product of the source signature with the amplitude pattern across the receivers from a seismic source. A rank-1 factorization of the estimated wavefield pattern at each location can therefore be used to estimate the seismic moment tensor using the knowledge of the array geometry. In contrast to existing work, this approach allows us to drop any other assumption on the source signature. Second, we exploit the recently proposed first-order incremental projection algorithms for a fast and efficient implementation of the resulting optimization problem and develop a hybrid stochastic and deterministic algorithm, which results in significant computational savings.

preprint2013arXiv

Novel Factorization Strategies for Higher Order Tensors: Implications for Compression and Recovery of Multi-linear Data

In this paper we propose novel methods for compression and recovery of multilinear data under limited sampling. We exploit the recently proposed tensor Singular Value Decomposition (t-SVD) [1], which is a group theoretic framework for tensor decomposition. In contrast to popular existing tensor decomposition techniques such as higher-order SVD (HOSVD), t-SVD has optimality properties similar to the truncated SVD for matrices. Based on t-SVD, we first construct novel tensor-rank-like measures to characterize the informational and structural complexity of multilinear data. Following that we outline a complexity penalized algorithm for tensor completion from missing entries. As an application, 3-D and 4-D (color) video data compression and recovery are considered. We show that videos with linear camera motion can be represented more efficiently using t-SVD compared to traditional approaches based on vectorizing or flattening of the tensors. Application of the proposed tensor completion algorithm for video recovery from missing entries is shown to yield superior performance over existing methods. In conclusion we point out several research directions and implications for online prediction of multilinear data.
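The t-SVD of a real third-order tensor is commonly computed by taking an FFT along the third mode, running an ordinary matrix SVD on each frontal slice in the Fourier domain, and transforming back. The sketch below follows that recipe under stated assumptions; the helper names `t_product`, `t_transpose` and `t_svd` are illustrative and are not taken from the paper or any library.

```python
import numpy as np

def t_product(A, B):
    """t-product: slice-wise matrix products in the Fourier domain."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    C_hat = np.einsum('ijk,jlk->ilk', A_hat, B_hat)
    return np.real(np.fft.ifft(C_hat, axis=2))

def t_transpose(A):
    """Transpose every frontal slice and reverse the order of slices 2..n3."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def t_svd(A):
    """Return real tensors U, S, V with A = U * S * V^T under the t-product."""
    n1, n2, n3 = A.shape
    k = min(n1, n2)
    A_hat = np.fft.fft(A, axis=2)
    U_hat = np.zeros((n1, k, n3), dtype=complex)
    S_hat = np.zeros((k, k, n3), dtype=complex)
    V_hat = np.zeros((n2, k, n3), dtype=complex)
    half = n3 // 2
    for i in range(half + 1):
        slice_i = A_hat[:, :, i]
        if i == 0 or 2 * i == n3:
            slice_i = slice_i.real  # these Fourier slices are real for real A
        U, s, Vh = np.linalg.svd(slice_i, full_matrices=False)
        U_hat[:, :, i] = U
        S_hat[:, :, i] = np.diag(s)
        V_hat[:, :, i] = Vh.conj().T
    # Fill the remaining slices by conjugate symmetry so the factors are real.
    for i in range(half + 1, n3):
        U_hat[:, :, i] = U_hat[:, :, n3 - i].conj()
        S_hat[:, :, i] = S_hat[:, :, n3 - i].conj()
        V_hat[:, :, i] = V_hat[:, :, n3 - i].conj()
    U = np.real(np.fft.ifft(U_hat, axis=2))
    S = np.real(np.fft.ifft(S_hat, axis=2))
    V = np.real(np.fft.ifft(V_hat, axis=2))
    return U, S, V

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))
U, S, V = t_svd(A)
A_rebuilt = t_product(t_product(U, S), t_transpose(V))
```

`A_rebuilt` should match `A` up to floating-point error. Enforcing conjugate symmetry across the Fourier slices is the design choice that keeps the factors real; computing every slice's SVD independently would not guarantee that because of the sign and phase ambiguity of singular vectors.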

preprint2013arXiv

Robust Hydraulic Fracture Monitoring (HFM) of Multiple Time Overlapping Events Using a Generalized Discrete Radon Transform

In this work we propose a novel algorithm for multiple-event localization for Hydraulic Fracture Monitoring (HFM) through the exploitation of the sparsity of the observed seismic signal when represented in a basis consisting of space time propagators. We provide explicit construction of these propagators using a forward model for wave propagation which depends non-linearly on the problem parameters - the unknown source location and mechanism of fracture, time and extent of event, and the locations of the receivers. Under fairly general assumptions and an appropriate discretization of these parameters we first build an over-complete dictionary of generalized Radon propagators and assume that the data is well represented as a linear superposition of these propagators. Exploiting this structure we propose sparsity penalized algorithms and workflow for super-resolution extraction of time overlapping multiple seismic events from single well data.

preprint2010arXiv

Information theoretic bounds for Compressed Sensing

In this paper we derive information theoretic performance bounds for sensing and reconstruction of sparse phenomena from noisy projections. We consider two settings: output noise models where the noise enters after the projection and input noise models where the noise enters before the projection. We consider two types of distortion for reconstruction: support errors and mean-squared errors. Our goal is to relate the number of measurements, $m$, and $\mathrm{SNR}$, to signal sparsity, $k$, distortion level, $d$, and signal dimension, $n$. We consider support errors in a worst-case setting. We employ different variations of Fano's inequality to derive necessary conditions on the number of measurements and $\mathrm{SNR}$ required for exact reconstruction. To derive sufficient conditions we develop new insights on max-likelihood analysis based on a novel superposition property. In particular this property implies that small support errors are the dominant error events. Consequently, our ML analysis does not suffer the conservatism of the union bound and leads to a tighter analysis of max-likelihood. These results provide order-wise tight bounds. For output noise models we show that asymptotically an $\mathrm{SNR}$ of $\Theta(\log(n))$ together with $\Theta(k \log(n/k))$ measurements is necessary and sufficient for exact support recovery. Furthermore, if a small fraction of support errors can be tolerated, a constant $\mathrm{SNR}$ turns out to be sufficient in the linear sparsity regime. In contrast, for input noise models we show that support recovery fails if the number of measurements scales as $o(n\log(n)/\mathrm{SNR})$, implying poor compression performance for such cases. We also consider a Bayesian set-up and characterize tradeoffs between mean-squared distortion and the number of measurements using rate-distortion theory.

33 published item(s)

Optimal Representations for Generalized Contrastive Learning with Imbalanced Datasets

Conditional entropy minimization principle for learning domain invariant representation features

Easy Variational Inference for Categorical Models via an Independent Binary Approximation

Joint covariate-alignment and concept-alignment: a framework for domain generalization

Measure Estimation in the Barycentric Coding Model

r-local sensing: Improved algorithm and applications

R-local unlabeled sensing: A novel graph matching approach for multiview unlabeled sensing under local permutations

Towards Designing and Exploiting Generative Networks for Neutrino Physics Experiments using Liquid Argon Time Projection Chambers

Optimal Transport Based Change Point Detection and Time Series Segment Clustering

Optimization-based incentivization and control scheme for autonomous traffic

Representation Learning via Adversarially-Contrastive Optimal Transport

A Randomized Tensor Singular Value Decomposition based on the t-product

Algorithms for item categorization based on ordinal ranking data

Low-tubal-rank Tensor Completion using Alternating Minimization

On Deterministic Conditions for Subspace Clustering under Missing Data

Tensor Completion by Alternating Minimization under the Tensor Train (TT) Model

An algorithm for online tensor prediction

Clustering multi-way data: a novel algebraic approach

Denoising and Completion of 3D Data via Multidimensional Dictionary Learning

Exact tensor completion using t-SVD

Group-Invariant Subspace Clustering

Multilinear Subspace Clustering

First Order Methods for Robust Non-negative Matrix Factorization for Large Scale Noisy Data

Novel methods for multilinear data completion and de-noising based on tensor-SVD

Robust Large Scale Non-negative Matrix Factorization using Proximal Point Algorithm

Complexity penalized hydraulic fracture localization and moment tensor estimation under limited model information

Consensus in the presence of interference

Exploiting Structural Complexity for Robust and Rapid Hyperspectral Imaging

Joint multi-mode dispersion extraction in Fourier and space time domains

Methods for Large Scale Hydraulic Fracture Monitoring

Novel Factorization Strategies for Higher Order Tensors: Implications for Compression and Recovery of Multi-linear Data

Robust Hydraulic Fracture Monitoring (HFM) of Multiple Time Overlapping Events Using a Generalized Discrete Radon Transform

Information theoretic bounds for Compressed Sensing