Source author record

Nicholas D. Sidiropoulos

Nicholas D. Sidiropoulos appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Catalog footprint

What is connected

25works

20topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

GAGE: Geometry Preserving Attributed Graph Embeddings

Node embedding is the task of extracting concise and informative representations of certain entities that are connected in a network. Various real-world networks include information about both node connectivity and certain node attributes, in the form of features or time-series data. Modern representation learning techniques employ both the connectivity and attribute information of the nodes to produce embeddings in an unsupervised manner. In this context, deriving embeddings that preserve the geometry of the network and the attribute vectors would be highly desirable, as they would reflect both the topological neighborhood structure and proximity in feature space. While this is fairly straightforward to maintain when only observing the connectivity or attribute information of the network, preserving the geometry of both types of information is challenging. A novel tensor factorization approach for node embedding in attributed networks is proposed in this paper, that preserves the distances of both the connections and the attributes. Furthermore, an effective and lightweight algorithm is developed to tackle the learning task and judicious experiments with multiple state-of-the-art baselines suggest that the proposed algorithm offers significant performance improvements in downstream tasks.

preprint2022arXiv

Probabilistic Simplex Component Analysis

This study presents PRISM, a probabilistic simplex component analysis approach to identifying the vertices of a data-circumscribing simplex from data. The problem has a rich variety of applications, the most notable being hyperspectral unmixing in remote sensing and non-negative matrix factorization in machine learning. PRISM uses a simple probabilistic model, namely, uniform simplex data distribution and additive Gaussian noise, and it carries out inference by maximum likelihood. The inference model is sound in the sense that the vertices are provably identifiable under some assumptions, and it suggests that PRISM can be effective in combating noise when the number of data points is large. PRISM has strong, but hidden, relationships with simplex volume minimization, a powerful geometric approach for the same problem. We study these fundamental aspects, and we also consider algorithmic schemes based on importance sampling and variational inference. In particular, the variational inference scheme is shown to resemble a matrix factorization problem with a special regularizer, which draws an interesting connection to the matrix factorization approach. Numerical results are provided to demonstrate the potential of PRISM.

preprint2021arXiv

Exploring the Subgraph Density-Size Trade-off via the Lovász Extension

Given an undirected graph, the Densest-k-Subgraph problem (DkS) seeks to find a subset of k vertices such that the sum of the edge weights in the corresponding subgraph is maximized. The problem is known to be NP-hard, and is also very difficult to approximate, in the worst-case. In this paper, we present a new convex relaxation for the problem. Our key idea is to reformulate DkS as minimizing a submodular function subject to a cardinality constraint. Exploiting the fact that submodular functions possess a convex, continuous extension (known as the Lovász extension), we propose to minimize the Lovász extension over the convex hull of the cardinality constraints. Although the Lovász extension of a submodular function does not admit an analytical form in general, for DkS we show that it does. We leverage this result to develop a highly scalable algorithm based on the Alternating Direction Method of Multipliers (ADMM) for solving the relaxed problem. Coupled with a pair of fortuitously simple rounding schemes, we demonstrate that our approach outperforms existing baselines on real-world graphs and can yield high quality sub-optimal solutions which typically are a posteriori no worse than 65-80\% of the optimal density.

preprint2021arXiv

Low-rank Characteristic Tensor Density Estimation Part II: Compression and Latent Density Estimation

Learning generative probabilistic models is a core problem in machine learning, which presents significant challenges due to the curse of dimensionality. This paper proposes a joint dimensionality reduction and non-parametric density estimation framework, using a novel estimator that can explicitly capture the underlying distribution of appropriate reduced-dimension representations of the input data. The idea is to jointly design a nonlinear dimensionality reducing auto-encoder to model the training data in terms of a parsimonious set of latent random variables, and learn a canonical low-rank tensor model of the joint distribution of the latent variables in the Fourier domain. The proposed latent density model is non-parametric and universal, as opposed to the predefined prior that is assumed in variational auto-encoders. Joint optimization of the auto-encoder and the latent density estimator is pursued via a formulation which learns both by minimizing a combination of the negative log-likelihood in the latent domain and the auto-encoder reconstruction loss. We demonstrate that the proposed model achieves very promising results on toy, tabular, and image datasets on regression tasks, sampling, and anomaly detection.

preprint2020arXiv

Cell-Edge Detection via Selective Cooperation and Generalized Canonical Correlation

Improving the uplink quality of service for users located around the boundaries between cells is a key challenge in LTE systems. Relying on power control, existing approaches throttle the rates of cell-center users, while multi-user detection requires accurate channel estimates for the cell-edge users, which is another challenge due to their low received signal-to-noise ratio (SNR). Utilizing the fact that cell-edge user signals are weak but common (received at roughly equal power) at different base stations (BSs), this paper establishes a connection between cell-edge user detection and generalized canonical correlation analysis (GCCA). It puts forth a GCCA-based method that leverages selective BS cooperation to recover the cell-edge user signal subspace even at low SNR. The cell-edge user signals can then be extracted from the resulting mixture via algebraic signal processing techniques. The paper includes theoretical analysis showing why GCCA recovers the correct subspace containing the cell-edge user signals under mild conditions. The proposed method can also identify the number of cell-edge users in the system, i.e., the common subspace dimension. Simulations reveal significant performance improvement relative to various multiuser detection techniques. Cell-edge detection performance is further studied as a function of how many / which BSs are selected, and it is shown that using the closest three BS is always the best choice.

preprint2020arXiv

Exactness of OPF Relaxation on Three-phase Radial Networks with Delta Connections

Simulations have shown that while semi-definite relaxations of AC optimal power flow (AC-OPF) on three-phase radial networks with only wye connections tend to be exact, the presence of delta connections seem to render them inexact. This paper shows that such inexactness originates from the non-uniqueness of relaxation solutions and numerical errors amplified by the non-uniqueness. This finding motivates two algorithms to recover the exact solution of AC-OPF in the presence of delta connections. In simulations using IEEE 13, 37 and 123-bus systems, the proposed algorithms provide exact optimal solutions up to numerical precision.

preprint2020arXiv

Fast Algorithms for Joint Multicast Beamforming and Antenna Selection in Massive MIMO

Massive MIMO is currently a leading physical layer technology candidate that can dramatically enhance throughput in 5G systems, for both unicast and multicast transmission modalities. As antenna elements are becoming smaller and cheaper in the mmW range compared to radio frequency (RF) chains, it is crucial to perform antenna selection at the transmitter, such that the available RF chains are switched to an appropriate subset of antennas. This paper considers the joint problem of multicast beamforming and antenna selection for a single multicast group in massive MIMO systems. The prior state-of-art for this problem relies on semi-definite relaxation (SDR), which cannot scale up to the massive MIMO regime. A successive convex approximation (SCA) based approach is proposed to tackle max-min fair joint multicast beamforming and antenna selection. The key idea of SCA is to successively approximate the non-convex problem by a class of non-smooth, convex optimization problems. Two fast and memory efficient first-order methods are proposed to solve each SCA subproblem. Simulations demonstrate that the proposed algorithms outperform the existing state-of-art approach in terms of solution quality and run time, in both traditional and especially in massive MIMO settings.

preprint2020arXiv

GRATE: Granular Recovery of Aggregated Tensor Data by Example

In this paper, we address the challenge of recovering an accurate breakdown of aggregated tensor data using disaggregation examples. This problem is motivated by several applications. For example, given the breakdown of energy consumption at some homes, how can we disaggregate the total energy consumed during the same period at other homes? In order to address this challenge, we propose GRATE, a principled method that turns the ill-posed task at hand into a constrained tensor factorization problem. Then, this optimization problem is tackled using an alternating least-squares algorithm. GRATE has the ability to handle exact aggregated data as well as inexact aggregation where some unobserved quantities contribute to the aggregated data. Special emphasis is given to the energy disaggregation problem where the goal is to provide energy breakdown for consumers from their monthly aggregated consumption. Experiments on two real datasets show the efficacy of GRATE in recovering more accurate disaggregation than state-of-the-art energy disaggregation methods.

preprint2020arXiv

Mining Large Quasi-cliques with Quality Guarantees from Vertex Neighborhoods

Mining dense subgraphs is an important primitive across a spectrum of graph-mining tasks. In this work, we formally establish that two recurring characteristics of real-world graphs, namely heavy-tailed degree distributions and large clustering coefficients, imply the existence of substantially large vertex neighborhoods with high edge-density. This observation suggests a very simple approach for extracting large quasi-cliques: simply scan the vertex neighborhoods, compute the clustering coefficient of each vertex, and output the best such subgraph. The implementation of such a method requires counting the triangles in a graph, which is a well-studied problem in graph mining. When empirically tested across a number of real-world graphs, this approach reveals a surprise: vertex neighborhoods include maximal cliques of non-trivial sizes, and the density of the best neighborhood often compares favorably to subgraphs produced by dedicated algorithms for maximizing subgraph density. For graphs with small clustering coefficients, we demonstrate that small vertex neighborhoods can be refined using a local-search method to ``grow'' larger cliques and near-cliques. Our results indicate that contrary to worst-case theoretical results, mining cliques and quasi-cliques of non-trivial sizes from real-world graphs is often not a difficult problem, and provides motivation for further work geared towards a better explanation of these empirical successes.

preprint2020arXiv

PREMA: Principled Tensor Data Recovery from Multiple Aggregated Views

Multidimensional data have become ubiquitous and are frequently encountered in situations where the information is aggregated over multiple data atoms. The aggregation can be over time or other features, such as geographical location. We often have access to multiple aggregated views of the same data, each aggregated in one or more dimensions, especially when data are collected or measured by different agencies. For instance, item sales can be aggregated temporally, and over groups of stores based on their location or affiliation. However, data mining and machine learning models benefit from detailed data for personalized analysis and prediction. Thus, data disaggregation algorithms are becoming increasingly important in various domains. The goal of this paper is to reconstruct finer-scale data from multiple coarse views, aggregated over different (subsets of) dimensions. The proposed method, called PREMA, leverages low-rank tensor factorization tools to fuse the multiple views and provide recovery guarantees under certain conditions. PREMA can tackle challenging scenarios, such as missing or partially observed data, double aggregation, and even blind disaggregation (without knowledge of the aggregation patterns) using a variant of PREMA called B-PREMA. To showcase the effectiveness of PREMA, the paper includes extensive experiments using real data from different domains: retail sales, crime counts, and weather observations.

preprint2020arXiv

Reliable Detection of Unknown Cell-Edge Users Via Canonical Correlation Analysis

Providing reliable service to users close to the edge between cells remains a challenge in cellular systems, even as 5G deployment is around the corner. These users are subject to significant signal attenuation, which also degrades their uplink channel estimates. Even joint detection using base station (BS) cooperation often fails to reliably detect such users, due to near-far power imbalance, and channel estimation errors. Is it possible to bypass the channel estimation stage and design a detector that can reliably detect cell-edge user signals under significant near-far imbalance? This paper shows, perhaps surprisingly, that the answer is affirmative -- albeit not via traditional multiuser detection. Exploiting that cell-edge user signals are weak but {\em common} to different base stations, while cell-center users are unique to their serving BS, this paper establishes an elegant connection between cell-edge user detection and canonical correlation analysis (CCA) of the associated space-time baseband-equivalent matrices. It proves that CCA identifies the common subspace of these matrices, even under significant intra- and inter-cell interference. The resulting mixture of cell-edge user signals can subsequently be unraveled using a well-known algebraic signal processing technique. Interestingly, the proposed approach does not even require that the signals from the different base stations are synchronized -- the right synchronization can be automatically determined as well. Experimental results demonstrate that the proposed approach achieves order of magnitude BER improvements compared to `oracle' multiuser detection that assumes perfect knowledge of the cell-center user channels.

preprint2019arXiv

Amplitude Retrieval for Channel Estimation of MIMO Systems with One-Bit ADCs

This letter revisits the channel estimation problem for MIMO systems with one-bit analog-to-digital converters (ADCs) through a novel algorithm--Amplitude Retrieval (AR). Unlike the state-of-the-art methods such as those based on one-bit compressive sensing, AR takes a different approach. It accounts for the lost amplitudes of the one-bit quantized measurements, and performs channel estimation and amplitude completion jointly. This way, the direction information of the propagation paths can be estimated via accurate direction finding algorithms in array processing, e.g., maximum likelihood. The upsot is that AR is able to handle off-grid angles and provide more accurate channel estimates. Simulation results are included to showcase the advantages of AR.

preprint2019arXiv

Learning Nonlinear Mixtures: Identifiability and Algorithm

Linear mixture models have proven very useful in a plethora of applications, e.g., topic modeling, clustering, and source separation. As a critical aspect of the linear mixture models, identifiability of the model parameters is well-studied, under frameworks such as independent component analysis and constrained matrix factorization. Nevertheless, when the linear mixtures are distorted by an unknown nonlinear functions -- which is well-motivated and more realistic in many cases -- the identifiability issues are much less studied. This work proposes an identification criterion for a nonlinear mixture model that is well grounded in many real-world applications, and offers identifiability guarantees. A practical implementation based on a judiciously designed neural network is proposed to realize the criterion, and an effective learning algorithm is proposed. Numerical results on synthetic and real-data corroborate effectiveness of the proposed method.

preprint2019arXiv

Tensor Completion from Regular Sub-Nyquist Samples

Signal sampling and reconstruction is a fundamental engineering task at the heart of signal processing. The celebrated Shannon-Nyquist theorem guarantees perfect signal reconstruction from uniform samples, obtained at a rate twice the maximum frequency present in the signal. Unfortunately a large number of signals of interest are far from being band-limited. This motivated research on reconstruction from sub-Nyquist samples, which mainly hinges on the use of random / incoherent sampling procedures. However, uniform or regular sampling is more appealing in practice and from the system design point of view, as it is far simpler to implement, and often necessary due to system constraints. In this work, we study regular sampling and reconstruction of three- or higher-dimensional signals (tensors). We show that reconstructing a tensor signal from regular samples is feasible. Under the proposed framework, the sample complexity is determined by the tensor rank---rather than the signal bandwidth. This result offers new perspectives for designing practical regular sampling patterns and systems for signals that are naturally tensors, e.g., images and video. For a concrete application, we show that functional magnetic resonance imaging (fMRI) acceleration is a tensor sampling problem, and design practical sampling schemes and an algorithmic framework to handle it. Numerical results show that our tensor sampling strategy accelerates the fMRI sampling process significantly without sacrificing reconstruction accuracy.

preprint2016arXiv

Anchor-Free Correlated Topic Modeling: Identifiability and Algorithm

In topic modeling, many algorithms that guarantee identifiability of the topics have been developed under the premise that there exist anchor words -- i.e., words that only appear (with positive probability) in one topic. Follow-up work has resorted to three or higher-order statistics of the data corpus to relax the anchor word assumption. Reliable estimates of higher-order statistics are hard to obtain, however, and the identification of topics under those models hinges on uncorrelatedness of the topics, which can be unrealistic. This paper revisits topic modeling based on second-order moments, and proposes an anchor-free topic mining framework. The proposed approach guarantees the identification of the topics under a much milder condition compared to the anchor-word assumption, thereby exhibiting much better robustness in practice. The associated algorithm only involves one eigen-decomposition and a few small linear programs. This makes it easy to implement and scale up to very large problem instances. Experiments using the TDT2 and Reuters-21578 corpus demonstrate that the proposed anchor-free approach exhibits very favorable performance (measured using coherence, similarity count, and clustering accuracy metrics) compared to the prior art.

preprint2016arXiv

Beyond Relaxation and Newton-Raphson: Solving AC OPF for Multi-phase Systems with Renewables

This paper focuses on the AC Optimal Power Flow (OPF) problem for multi-phase systems. Particular emphasis is given to systems with high integration of renewables, where adjustments of the real and reactive output powers from renewable sources of energy are necessary in order to enforce voltage regulation. The AC OPF problem is known to be nonconvex (and, in fact, NP-hard). Convex relaxation techniques have been recently explored to solve the OPF task with reduced computational burden; however, sufficient conditions for tightness of these relaxations are only available for restricted classes of system topologies and problem setups. Identifying feasible power-flow solutions remains hard in more general problem formulations, especially in unbalanced multi-phase systems with renewables. To identify feasible and optimal AC OPF solutions in challenging scenarios where existing methods may fail, this paper leverages the Feasible Point Pursuit - Successive Convex Approximation algorithm -- a powerful approach for general nonconvex quadratically constrained quadratic programs. The merits of the approach are illustrated using single- and multi-phase distribution networks with renewables, as well as several transmission systems.

preprint2016arXiv

Consensus-ADMM for General Quadratically Constrained Quadratic Programming

Non-convex quadratically constrained quadratic programming (QCQP) problems have numerous applications in signal processing, machine learning, and wireless communications, albeit the general QCQP is NP-hard, and several interesting special cases are NP-hard as well. This paper proposes a new algorithm for general QCQP. The problem is first reformulated in consensus optimization form, to which the alternating direction method of multipliers (ADMM) can be applied. The reformulation is done in such a way that each of the sub-problems is a QCQP with only one constraint (QCQP-1), which is efficiently solvable irrespective of (non-)convexity. The core components are carefully designed to make the overall algorithm more scalable, including efficient methods for solving QCQP-1, memory efficient implementation, parallel/distributed implementation, and smart initialization. The proposed algorithm is then tested in two applications: multicast beamforming and phase retrieval. The results indicate superior performance over prior state-of-the-art methods.

preprint2016arXiv

Learning From Hidden Traits: Joint Factor Analysis and Latent Clustering

Dimensionality reduction techniques play an essential role in data analytics, signal processing and machine learning. Dimensionality reduction is usually performed in a preprocessing stage that is separate from subsequent data analysis, such as clustering or classification. Finding reduced-dimension representations that are well-suited for the intended task is more appealing. This paper proposes a joint factor analysis and latent clustering framework, which aims at learning cluster-aware low-dimensional representations of matrix and tensor data. The proposed approach leverages matrix and tensor factorization models that produce essentially unique latent representations of the data to unravel latent cluster structure -- which is otherwise obscured because of the freedom to apply an oblique transformation in latent space. At the same time, latent cluster structure is used as prior information to enhance the performance of factorization. Specific contributions include several custom-built problem formulations, corresponding algorithms, and discussion of associated convergence properties. Besides extensive simulations, real-world datasets such as Reuters document data and MNIST image data are also employed to showcase the effectiveness of the proposed approaches.

preprint2016arXiv

Phase Retrieval from 1D Fourier Measurements: Convexity, Uniqueness, and Algorithms

This paper considers phase retrieval from the magnitude of 1D over-sampled Fourier measurements, a classical problem that has challenged researchers in various fields of science and engineering. We show that an optimal vector in a least-squares sense can be found by solving a convex problem, thus establishing a hidden convexity in Fourier phase retrieval. We also show that the standard semidefinite relaxation approach yields the optimal cost function value (albeit not necessarily an optimal solution) in this case. A method is then derived to retrieve an optimal minimum phase solution in polynomial time. Using these results, a new measuring technique is proposed which guarantees uniqueness of the solution, along with an efficient algorithm that can solve large-scale Fourier phase retrieval problems with uniqueness and optimality guarantees.

preprint2016arXiv

Phase Retrieval Using Feasible Point Pursuit: Algorithms and Cramér-Rao Bound

Reconstructing a signal from squared linear (rank-one quadratic) measurements is a challenging problem with important applications in optics and imaging, where it is known as phase retrieval. This paper proposes two new phase retrieval algorithms based on non-convex quadratically constrained quadratic programming (QCQP) formulations, and a recently proposed approximation technique dubbed feasible point pursuit (FPP). The first is designed for uniformly distributed bounded measurement errors, such as those arising from high-rate quantization (B-FPP). The second is designed for Gaussian measurement errors, using a least squares criterion (LS-FPP). Their performance is measured against state-of-the-art algorithms and the Cramér-Rao bound (CRB), which is also derived here. Simulations show that LS-FPP outperforms the state-of-art and operates close to the CRB. Compact CRB expressions, properties, and insights are obtained by explicitly computing the CRB in various special cases -- including when the signal of interest admits a sparse parametrization, using harmonic retrieval as an example.

preprint2016arXiv

Robust Volume Minimization-Based Matrix Factorization for Remote Sensing and Document Clustering

This paper considers \emph{volume minimization} (VolMin)-based structured matrix factorization (SMF). VolMin is a factorization criterion that decomposes a given data matrix into a basis matrix times a structured coefficient matrix via finding the minimum-volume simplex that encloses all the columns of the data matrix. Recent work showed that VolMin guarantees the identifiability of the factor matrices under mild conditions that are realistic in a wide variety of applications. This paper focuses on both theoretical and practical aspects of VolMin. On the theory side, exact equivalence of two independently developed sufficient conditions for VolMin identifiability is proven here, thereby providing a more comprehensive understanding of this aspect of VolMin. On the algorithm side, computational complexity and sensitivity to outliers are two key challenges associated with real-world applications of VolMin. These are addressed here via a new VolMin algorithm that handles volume regularization in a computationally simple way, and automatically detects and {iteratively downweights} outliers, simultaneously. Simulations and real-data experiments using a remotely sensed hyperspectral image and the Reuters document corpus are employed to showcase the effectiveness of the proposed algorithm.

preprint2015arXiv

A Flexible and Efficient Algorithmic Framework for Constrained Matrix and Tensor Factorization

We propose a general algorithmic framework for constrained matrix and tensor factorization, which is widely used in signal processing and machine learning. The new framework is a hybrid between alternating optimization (AO) and the alternating direction method of multipliers (ADMM): each matrix factor is updated in turn, using ADMM, hence the name AO-ADMM. This combination can naturally accommodate a great variety of constraints on the factor matrices, and almost all possible loss measures for the fitting. Computation caching and warm start strategies are used to ensure that each update is evaluated efficiently, while the outer AO framework exploits recent developments in block coordinate descent (BCD)-type methods which help ensure that every limit point is a stationary point, as well as faster and more robust convergence in practice. Three special cases are studied in detail: non-negative matrix/tensor factorization, constrained matrix/tensor completion, and dictionary learning. Extensive simulations and experiments with real data are used to showcase the effectiveness and broad applicability of the proposed framework.

preprint2015arXiv

Parallel Algorithms for Constrained Tensor Factorization via the Alternating Direction Method of Multipliers

Tensor factorization has proven useful in a wide range of applications, from sensor array processing to communications, speech and audio signal processing, and machine learning. With few recent exceptions, all tensor factorization algorithms were originally developed for centralized, in-memory computation on a single machine; and the few that break away from this mold do not easily incorporate practically important constraints, such as nonnegativity. A new constrained tensor factorization framework is proposed in this paper, building upon the Alternating Direction method of Multipliers (ADMoM). It is shown that this simplifies computations, bypassing the need to solve constrained optimization problems in each iteration; and it naturally leads to distributed algorithms suitable for parallel implementation on regular high-performance computing (e.g., mesh) architectures. This opens the door for many emerging big data-enabled applications. The methodology is exemplified using nonnegativity as a baseline constraint, but the proposed framework can more-or-less readily incorporate many other types of constraints. Numerical experiments are very encouraging, indicating that the ADMoM-based nonnegative tensor factorization (NTF) has high potential as an alternative to state-of-the-art approaches.

preprint2014arXiv

Feasible Point Pursuit and Successive Approximation of Non-convex QCQPs

Quadratically constrained quadratic programs (QCQPs) have a wide range of applications in signal processing and wireless communications. Non-convex QCQPs are NP-hard in general. Existing approaches relax the non-convexity using semi-definite relaxation (SDR) or linearize the non-convex part and solve the resulting convex problem. However, these techniques are seldom successful in even obtaining a feasible solution when the QCQP matrices are indefinite. In this paper, a new feasible point pursuit successive convex approximation (FPP-SCA) algorithm is proposed for non-convex QCQPs. FPP-SCA linearizes the non-convex parts of the problem as conventional SCA does, but adds slack variables to sustain feasibility, and a penalty to ensure slacks are sparingly used. When FPP-SCA is successful in identifying a feasible point of the non-convex QCQP, convergence to a Karush-Kuhn-Tucker (KKT) point is thereafter ensured. Simulations show the effectiveness of our proposed algorithm in obtaining feasible and near-optimal solutions, significantly outperforming existing approaches.

preprint2013arXiv

Scoup-SMT: Scalable Coupled Sparse Matrix-Tensor Factorization

How can we correlate neural activity in the human brain as it responds to words, with behavioral data expressed as answers to questions about these same words? In short, we want to find latent variables, that explain both the brain activity, as well as the behavioral responses. We show that this is an instance of the Coupled Matrix-Tensor Factorization (CMTF) problem. We propose Scoup-SMT, a novel, fast, and parallel algorithm that solves the CMTF problem and produces a sparse latent low-rank subspace of the data. In our experiments, we find that Scoup-SMT is 50-100 times faster than a state-of-the-art algorithm for CMTF, along with a 5 fold increase in sparsity. Moreover, we extend Scoup-SMT to handle missing data without degradation of performance. We apply Scoup-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Scoup-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Scoup-SMT, by applying it on a Facebook dataset (users, friends, wall-postings); there, Scoup-SMT spots spammer-like anomalies.

Nicholas D. Sidiropoulos

What is connected

Connect this record

See the researcher in context

Building this map preview

25 published item(s)

GAGE: Geometry Preserving Attributed Graph Embeddings

Probabilistic Simplex Component Analysis

Exploring the Subgraph Density-Size Trade-off via the Lovász Extension

Low-rank Characteristic Tensor Density Estimation Part II: Compression and Latent Density Estimation

Cell-Edge Detection via Selective Cooperation and Generalized Canonical Correlation

Exactness of OPF Relaxation on Three-phase Radial Networks with Delta Connections

Fast Algorithms for Joint Multicast Beamforming and Antenna Selection in Massive MIMO

GRATE: Granular Recovery of Aggregated Tensor Data by Example

Mining Large Quasi-cliques with Quality Guarantees from Vertex Neighborhoods

PREMA: Principled Tensor Data Recovery from Multiple Aggregated Views

Reliable Detection of Unknown Cell-Edge Users Via Canonical Correlation Analysis

Amplitude Retrieval for Channel Estimation of MIMO Systems with One-Bit ADCs

Learning Nonlinear Mixtures: Identifiability and Algorithm

Tensor Completion from Regular Sub-Nyquist Samples

Anchor-Free Correlated Topic Modeling: Identifiability and Algorithm

Beyond Relaxation and Newton-Raphson: Solving AC OPF for Multi-phase Systems with Renewables

Consensus-ADMM for General Quadratically Constrained Quadratic Programming

Learning From Hidden Traits: Joint Factor Analysis and Latent Clustering

Phase Retrieval from 1D Fourier Measurements: Convexity, Uniqueness, and Algorithms

Phase Retrieval Using Feasible Point Pursuit: Algorithms and Cramér-Rao Bound

Robust Volume Minimization-Based Matrix Factorization for Remote Sensing and Document Clustering

A Flexible and Efficient Algorithmic Framework for Constrained Matrix and Tensor Factorization

Parallel Algorithms for Constrained Tensor Factorization via the Alternating Direction Method of Multipliers

Feasible Point Pursuit and Successive Approximation of Non-convex QCQPs

Scoup-SMT: Scalable Coupled Sparse Matrix-Tensor Factorization