Source author record

Sergey M. Plis

Sergey M. Plis appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Neural and Evolutionary Computing Computer Vision Distributed, Parallel, and Cluster Computing eess.IV eess.SP Neurons and Cognition Numerical Analysis

Catalog footprint

What is connected

9works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

On self-supervised multi-modal representation learning: An application to Alzheimer's disease

Introspection of deep supervised predictive models trained on functional and structural brain imaging may uncover novel markers of Alzheimer's disease (AD). However, supervised training is prone to learning from spurious features (shortcut learning) impairing its value in the discovery process. Deep unsupervised and, recently, contrastive self-supervised approaches, not biased to classification, are better candidates for the task. Their multimodal options specifically offer additional regularization via modality interactions. In this paper, we introduce a way to exhaustively consider multimodal architectures for contrastive self-supervised fusion of fMRI and MRI of AD patients and controls. We show that this multimodal fusion results in representations that improve the results of the downstream classification for both modalities. We investigate the fused self-supervised features projected into the brain space and introduce a numerically stable way to do so.

preprint2022arXiv

Peering Beyond the Gradient Veil with Distributed Auto Differentiation

Although distributed machine learning has opened up many new and exciting research frontiers, fragmentation of models and data across different machines, nodes, and sites still results in considerable communication overhead, impeding reliable training in real-world contexts. The focus on gradients as the primary shared statistic during training has spawned a number of intuitive algorithms for distributed deep learning; however, gradient-centric training of large deep neural networks (DNNs) tends to be communication-heavy, often requiring additional adaptations such as sparsity constraints, compression, quantization, and more, to curtail bandwidth. We introduce an innovative, communication-friendly approach for training distributed DNNs, which capitalizes on the outer-product structure of the gradient as revealed by the mechanics of auto-differentiation. The exposed structure of the gradient evokes a new class of distributed learning algorithm, which is naturally more communication-efficient than full gradient sharing. Our approach, called distributed auto-differentiation (dAD), builds off a marriage of rank-based compression and the innate structure of the gradient as an outer-product. We demonstrate that dAD trains more efficiently than other state of the art distributed methods on modern architectures, such as transformers, when applied to large-scale text and imaging datasets. The future of distributed learning, we determine, need not be dominated by gradient-centric algorithms.

preprint2022arXiv

Self-supervised multimodal neuroimaging yields predictive representations for a spectrum of Alzheimer's phenotypes

Recent neuroimaging studies that focus on predicting brain disorders via modern machine learning approaches commonly include a single modality and rely on supervised over-parameterized models.However, a single modality provides only a limited view of the highly complex brain. Critically, supervised models in clinical settings lack accurate diagnostic labels for training. Coarse labels do not capture the long-tailed spectrum of brain disorder phenotypes, which leads to a loss of generalizability of the model that makes them less useful in diagnostic settings. This work presents a novel multi-scale coordinated framework for learning multiple representations from multimodal neuroimaging data. We propose a general taxonomy of informative inductive biases to capture unique and joint information in multimodal self-supervised fusion. The taxonomy forms a family of decoder-free models with reduced computational complexity and a propensity to capture multi-scale relationships between local and global representations of the multimodal inputs. We conduct a comprehensive evaluation of the taxonomy using functional and structural magnetic resonance imaging (MRI) data across a spectrum of Alzheimer's disease phenotypes and show that self-supervised models reveal disorder-relevant brain regions and multimodal links without access to the labels during pre-training. The proposed multimodal self-supervised learning yields representations with improved classification performance for both modalities. The concomitant rich and flexible unsupervised deep learning framework captures complex multimodal relationships and provides predictive performance that meets or exceeds that of a more narrow supervised classification analysis. We present elaborate quantitative evidence of how this framework can significantly advance our search for missing links in complex brain disorders.

preprint2022arXiv

Tasting the cake: evaluating self-supervised generalization on out-of-distribution multimodal MRI data

Self-supervised learning has enabled significant improvements on natural image benchmarks. However, there is less work in the medical imaging domain in this area. The optimal models have not yet been determined among the various options. Moreover, little work has evaluated the current applicability limits of novel self-supervised methods. In this paper, we evaluate a range of current contrastive self-supervised methods on out-of-distribution generalization in order to evaluate their applicability to medical imaging. We show that self-supervised models are not as robust as expected based on their results in natural imaging benchmarks and can be outperformed by supervised learning with dropout. We also show that this behavior can be countered with extensive augmentation. Our results highlight the need for out-of-distribution generalization standards and benchmarks to adopt the self-supervised methods in the medical imaging community.

preprint2021arXiv

Improved Differentially Private Decentralized Source Separation for fMRI Data

Blind source separation algorithms such as independent component analysis (ICA) are widely used in the analysis of neuroimaging data. In order to leverage larger sample sizes, different data holders/sites may wish to collaboratively learn feature representations. However, such datasets are often privacy-sensitive, precluding centralized analyses that pool the data at a single site. In this work, we propose a differentially private algorithm for performing ICA in a decentralized data setting. Conventional approaches to decentralized differentially private algorithms may introduce too much noise due to the typically small sample sizes at each site. We propose a novel protocol that uses correlated noise to remedy this problem. We show that our algorithm outperforms existing approaches on synthetic and real neuroimaging datasets and demonstrate that it can sometimes reach the same level of utility as the corresponding non-private algorithm. This indicates that it is possible to have meaningful utility while preserving privacy.

preprint2016arXiv

Variational Autoencoders for Feature Detection of Magnetic Resonance Imaging Data

Independent component analysis (ICA), as an approach to the blind source-separation (BSS) problem, has become the de-facto standard in many medical imaging settings. Despite successes and a large ongoing research effort, the limitation of ICA to square linear transformations have not been overcome, so that general INFOMAX is still far from being realized. As an alternative, we present feature analysis in medical imaging as a problem solved by Helmholtz machines, which include dimensionality reduction and reconstruction of the raw data under the same objective, and which recently have overcome major difficulties in inference and learning with deep and nonlinear configurations. We demonstrate one approach to training Helmholtz machines, variational auto-encoders (VAE), as a viable approach toward feature extraction with magnetic resonance imaging (MRI) data.

preprint2014arXiv

Deep learning for neuroimaging: a validation study

Deep learning methods have recently made notable advances in the tasks of classification and representation learning. These tasks are important for brain imaging and neuroscience discovery, making the methods attractive for porting to a neuroimager's toolbox. Success of these methods is, in part, explained by the flexibility of deep learning models. However, this flexibility makes the process of porting to new areas a difficult parameter optimization problem. In this work we demonstrate our results (and feasible parameter ranges) in application of deep learning methods to structural and functional brain imaging data. We also describe a novel constraint-based approach to visualizing high dimensional data. We use it to analyze the effect of parameter choices on data transformations. Our results show that deep learning methods are able to learn physiologically important representations and detect latent relations in neuroimaging data.

preprint2013arXiv

Block Coordinate Descent for Sparse NMF

Nonnegative matrix factorization (NMF) has become a ubiquitous tool for data analysis. An important variant is the sparse NMF problem which arises when we explicitly require the learnt features to be sparse. A natural measure of sparsity is the L$_0$ norm, however its optimization is NP-hard. Mixed norms, such as L$_1$/L$_2$ measure, have been shown to model sparsity robustly, based on intuitive attributes that such measures need to satisfy. This is in contrast to computationally cheaper alternatives such as the plain L$_1$ norm. However, present algorithms designed for optimizing the mixed norm L$_1$/L$_2$ are slow and other formulations for sparse NMF have been proposed such as those based on L$_1$ and L$_0$ norms. Our proposed algorithm allows us to solve the mixed norm sparsity constraints while not sacrificing computation time. We present experimental evidence on real-world datasets that shows our new algorithm performs an order of magnitude faster compared to the current state-of-the-art solvers optimizing the mixed norm and is suitable for large-scale datasets.

preprint2010arXiv

Directional Statistics on Permutations

Distributions over permutations arise in applications ranging from multi-object tracking to ranking of instances. The difficulty of dealing with these distributions is caused by the size of their domain, which is factorial in the number of considered entities ($n!$). It makes the direct definition of a multinomial distribution over permutation space impractical for all but a very small $n$. In this work we propose an embedding of all $n!$ permutations for a given $n$ in a surface of a hypersphere defined in $\mathbbm{R}^{(n-1)^2}$. As a result of the embedding, we acquire ability to define continuous distributions over a hypersphere with all the benefits of directional statistics. We provide polynomial time projections between the continuous hypersphere representation and the $n!$-element permutation space. The framework provides a way to use continuous directional probability densities and the methods developed thereof for establishing densities over permutations. As a demonstration of the benefits of the framework we derive an inference procedure for a state-space model over permutations. We demonstrate the approach with applications.

Sergey M. Plis

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

On self-supervised multi-modal representation learning: An application to Alzheimer's disease

Peering Beyond the Gradient Veil with Distributed Auto Differentiation

Self-supervised multimodal neuroimaging yields predictive representations for a spectrum of Alzheimer's phenotypes

Tasting the cake: evaluating self-supervised generalization on out-of-distribution multimodal MRI data

Improved Differentially Private Decentralized Source Separation for fMRI Data

Variational Autoencoders for Feature Detection of Magnetic Resonance Imaging Data

Deep learning for neuroimaging: a validation study

Block Coordinate Descent for Sparse NMF

Directional Statistics on Permutations