Source author record

Dimitris Stripelis

Dimitris Stripelis appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Machine Learning; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; Computation and Language; Cryptography and Security; eess.IV; Quantitative Methods

Catalog footprint

What is connected

5 works

7 topics

4 close collaborators

Preprint, 2022, arXiv

Federated Named Entity Recognition

We present an analysis of the performance of Federated Learning in a paradigmatic natural-language processing task: Named-Entity Recognition (NER). For our evaluation, we use the language-independent CoNLL-2003 dataset as our benchmark dataset and a Bi-LSTM-CRF model as our benchmark NER model. We show that federated training reaches almost the same performance as the centralized model, though with some performance degradation as the learning environments become more heterogeneous. We also show the convergence rate of federated models for NER. Finally, we discuss existing challenges of Federated Learning for NLP applications that can foster future research directions.

Preprint, 2022, arXiv

Semi-Synchronous Federated Learning for Energy-Efficient Training and Accelerated Convergence in Cross-Silo Settings

There are situations where data relevant to machine learning problems are distributed across multiple locations that cannot share the data due to regulatory, competitiveness, or privacy reasons. Machine learning approaches that require data to be copied to a single location are hampered by the challenges of data sharing. Federated Learning (FL) is a promising approach to learn a joint model over all the available data across silos. In many cases, the sites participating in a federation have different data distributions and computational capabilities. In these heterogeneous environments, existing approaches exhibit poor performance: synchronous FL protocols are communication efficient, but have slow learning convergence and high energy cost; conversely, asynchronous FL protocols have faster convergence with lower energy cost, but higher communication. In this work, we introduce a novel energy-efficient Semi-Synchronous Federated Learning protocol that mixes local models periodically with minimal idle time and fast convergence. We show through extensive experiments over established benchmark datasets in the computer-vision domain as well as in real-world biomedical settings that our approach significantly outperforms previous work in data and computationally heterogeneous environments.
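The abstract does not spell out how the protocol keeps idle time low. One plausible reading, sketched below, is that every learner trains for the same wall-clock period between synchronization points, so faster learners simply complete more local batches. The function names, the per-batch timings, and the synchronization period are all hypothetical and chosen for illustration only.

```python
import math


def local_steps_per_round(seconds_per_batch, sync_period_seconds):
    """Return how many batches each learner can finish in one sync period.

    seconds_per_batch: dict mapping learner name -> measured time per batch.
    sync_period_seconds: shared wall-clock budget between model mixings.
    Both inputs are assumptions made for this sketch.
    """
    steps = {}
    for learner, t_batch in seconds_per_batch.items():
        if t_batch <= 0:
            raise ValueError("time per batch must be positive")
        # Every learner runs at least one batch, even if it overshoots.
        steps[learner] = max(1, math.floor(sync_period_seconds / t_batch))
    return steps


def idle_seconds(seconds_per_batch, steps, sync_period_seconds):
    """Return the time each learner waits before the next mixing point."""
    return {
        learner: max(0.0, sync_period_seconds - steps[learner] * t_batch)
        for learner, t_batch in seconds_per_batch.items()
    }


timings = {"fast_site": 0.5, "slow_site": 2.0}
steps = local_steps_per_round(timings, sync_period_seconds=10.0)
idle = idle_seconds(timings, steps, sync_period_seconds=10.0)
```

Under these assumed timings the fast site runs four times as many batches as the slow one, and neither site waits for the other. In a fully synchronous scheme with a fixed number of batches per round, the fast site would sit idle while the slow site finishes.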

Preprint, 2022, arXiv

Towards Sparsified Federated Neuroimaging Models via Weight Pruning

Federated training of large deep neural networks can often be restrictive because the cost of communicating updates grows with model size. Various model pruning techniques have been designed in centralized settings to reduce inference times. Combining centralized pruning techniques with federated training seems intuitive for reducing communication costs: the model parameters are pruned right before the communication step. Moreover, such a progressive model pruning approach during training can also reduce training times and costs. To this end, we propose FedSparsify, which performs model pruning during federated training. In our experiments in centralized and federated settings on the brain age prediction task (estimating a person's age from their brain MRI), we demonstrate that models can be pruned up to 95% sparsity without affecting performance even in challenging federated learning environments with highly heterogeneous data distributions. One surprising benefit of model pruning is improved model privacy. We demonstrate that models with high sparsity are less susceptible to membership inference attacks, a type of privacy attack.
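The idea of pruning parameters right before the communication step can be illustrated with a minimal sketch. The abstract does not state which pruning criterion FedSparsify uses, so the code below assumes plain magnitude pruning over a flat list of weights; the function names and the dictionary encoding of the sparse update are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude.

    Magnitude pruning is an assumption of this sketch, not a confirmed
    detail of the method described in the abstract.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def to_sparse(weights):
    """Keep only nonzero entries, as an index -> value mapping to transmit."""
    return {i: w for i, w in enumerate(weights) if w != 0.0}


local_weights = [0.5, -0.1, 0.05, 2.0]
pruned = prune_by_magnitude(local_weights, sparsity=0.5)
update = to_sparse(pruned)
```

Only the surviving entries need to be sent, which is where the communication saving would come from. At the 95% sparsity reported in the abstract, an index-value encoding like this would carry roughly one twentieth of the original parameter values, plus their indices.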

Preprint, 2021, arXiv

Scaling Neuroscience Research using Federated Learning

The amount of biomedical data continues to grow rapidly. However, the ability to analyze these data is limited due to privacy and regulatory concerns. Machine learning approaches that require data to be copied to a single location are hampered by the challenges of data sharing. Federated Learning is a promising approach to learn a joint model over data silos. This architecture does not share any subject data across sites, only aggregated parameters, often in encrypted environments, thus satisfying privacy and regulatory requirements. Here, we describe our Federated Learning architecture and training policies. We demonstrate our approach on a brain age prediction model on structural MRI scans distributed across multiple sites with diverse amounts of data and subject (age) distributions. In these heterogeneous environments, our Semi-Synchronous protocol provides faster convergence.
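To make concrete what sharing "only aggregated parameters" could look like, here is a minimal sketch of a parameter-averaging step at a central controller. It assumes the common federated averaging convention of weighting each site by its number of local examples; the abstract does not confirm this weighting, and the function name and inputs are hypothetical. Encryption, which the abstract mentions, is omitted.

```python
def federated_average(site_params, site_sizes):
    """Average per-site parameter vectors, weighted by local example count.

    site_params: list of equal-length lists of floats, one per site.
    site_sizes: number of local training examples at each site.
    No subject data appears here: only parameters and counts are exchanged.
    """
    if len(site_params) != len(site_sizes) or not site_params:
        raise ValueError("need one size per site and at least one site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    n_params = len(site_params[0])
    averaged = [0.0] * n_params
    for params, size in zip(site_params, site_sizes):
        if len(params) != n_params:
            raise ValueError("all sites must share the same model shape")
        for j, value in enumerate(params):
            averaged[j] += (size / total) * value
    return averaged


global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this toy case the second site holds three times as many examples, so the global parameters land three quarters of the way toward its local values.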

Preprint, 2020, arXiv

Accelerating Federated Learning in Heterogeneous Data and Computational Environments

There are situations where data relevant to a machine learning problem are distributed among multiple locations that cannot share the data due to regulatory, competitiveness, or privacy reasons. Examples include data on users' cellphones, manufacturing data of companies in a given industrial sector, and medical records located at different hospitals. Moreover, participating sites often have different data distributions and computational capabilities. Federated Learning provides an approach to learn a joint model over all the available data in these environments. In this paper, we introduce a novel distributed validation weighting scheme (DVW), which evaluates the performance of a learner in the federation against a distributed validation set. Each learner reserves a small portion (e.g., 5%) of its local training examples as a validation dataset and allows other learners' models to be evaluated against it. We empirically show that DVW results in better performance compared to established methods, such as FedAvg, both under synchronous and asynchronous communication protocols in data and computationally heterogeneous environments.
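The validation split and the resulting mixing weights can be sketched as follows. The abstract says each learner holds out a small validation portion and lets other learners' models be scored on it, but it does not say how the scores are combined. The size-weighted mean and the normalization below are therefore assumptions, and all names are hypothetical.

```python
def split_validation(examples, fraction=0.05):
    """Reserve the last `fraction` of local examples as a validation set."""
    n_val = max(1, int(len(examples) * fraction))
    return examples[:-n_val], examples[-n_val:]


def dvw_weights(scores, val_sizes):
    """Turn cross-site validation scores into mixing weights that sum to 1.

    scores[i][j]: metric of learner i's model on learner j's validation
    set, where higher is better (for example, accuracy).
    val_sizes[j]: number of examples in learner j's validation set.
    """
    total = sum(val_sizes)
    if total <= 0:
        raise ValueError("validation sets must not all be empty")
    # Score of each model over the whole distributed validation set.
    raw = [
        sum(s * n for s, n in zip(row, val_sizes)) / total
        for row in scores
    ]
    norm = sum(raw)
    if norm == 0:
        # Fall back to uniform weights when every model scores zero.
        return [1.0 / len(raw)] * len(raw)
    return [r / norm for r in raw]


train, val = split_validation(list(range(100)), fraction=0.05)
weights = dvw_weights([[0.5, 0.5], [1.0, 1.0]], val_sizes=[10, 30])
```

Here the second learner's model scores twice as well across the distributed validation set, so it receives twice the weight when local models are combined. A scheme based only on local dataset size would ignore that difference in model quality.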
