Source author record

Dmitry Kamzolov

Dmitry Kamzolov appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.OC, Machine Learning, Cryptography and Security

Catalog footprint

What is connected

8 works

3 topics

4 close collaborators


preprint, 2023, arXiv

SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms

Adaptive optimization methods are widely recognized as among the most popular approaches for training Deep Neural Networks (DNNs). Techniques such as Adam, AdaGrad, and AdaHessian utilize a preconditioner that modifies the search direction by incorporating information about the curvature of the objective function. However, despite their adaptive characteristics, these methods still require manual fine-tuning of the step-size. This, in turn, impacts the time required to solve a particular problem. This paper presents an optimization framework named SANIA to tackle these challenges. Beyond eliminating the need for manual step-size hyperparameter settings, SANIA incorporates techniques to address poorly scaled or ill-conditioned problems. We also explore several preconditioning methods, including Hutchinson's method, which approximates the Hessian diagonal of the loss function. We conclude with an extensive empirical examination of the proposed techniques across classification tasks, covering both convex and non-convex contexts.
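
The abstract mentions Hutchinson's method for approximating the Hessian diagonal. A minimal sketch of that estimator follows, assuming access to a Hessian-vector product; it uses the standard identity diag(H) = E[z * (H z)] for Rademacher vectors z. The names `hutchinson_diagonal` and `make_hvp` and the explicit 2x2 matrix are illustrative assumptions, not code from the SANIA paper.

```python
import random


def hutchinson_diagonal(hvp, dim, num_samples=100, seed=0):
    """Estimate diag(H) as the average of z * (H z) over Rademacher vectors z."""
    rng = random.Random(seed)
    estimate = [0.0] * dim
    for _ in range(num_samples):
        # Rademacher probe: each entry is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)
        for i in range(dim):
            estimate[i] += z[i] * hz[i] / num_samples
    return estimate


def make_hvp(matrix):
    """Hypothetical helper: wrap an explicit matrix as a Hessian-vector product."""
    def hvp(v):
        return [sum(a * b for a, b in zip(row, v)) for row in matrix]
    return hvp


# Toy symmetric "Hessian" with a poorly scaled diagonal (4.0 versus 0.5).
H = [[4.0, 1.0], [1.0, 0.5]]
diag_est = hutchinson_diagonal(make_hvp(H), dim=2, num_samples=2000)
```

In practice the Hessian-vector product would come from automatic differentiation rather than an explicit matrix; the explicit matrix here only keeps the sketch self-contained.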

preprint, 2022, arXiv

FLECS: A Federated Learning Second-Order Framework via Compression and Sketching

Inspired by the recent work FedNL (Safaryan et al., FedNL: Making Newton-Type Methods Applicable to Federated Learning), we propose a new communication-efficient second-order framework for federated learning, namely FLECS. The proposed method reduces the high memory requirements of FedNL by using an L-SR1 type update for the Hessian approximation, which is stored on the central server. A low-dimensional "sketch" of the Hessian is all that each device needs to generate an update, so that both the memory costs and the number of Hessian-vector products for the agent are low. Biased and unbiased compressions are used to keep communication costs low as well. Convergence guarantees for FLECS are provided in both the strongly convex and nonconvex cases, and local linear convergence is also established under strong convexity. Numerical experiments confirm the practical benefits of the new FLECS algorithm.
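
For readers unfamiliar with the SR1 family named in the abstract, the classical symmetric rank-one update can be sketched as below. This is the plain full-memory SR1 formula, B+ = B + r r^T / (r^T s) with r = y - B s, and not the limited-memory, sketched, or compressed procedure of FLECS itself. The function names and the skipping tolerance `tol` are assumptions made for illustration.

```python
import math


def matvec(M, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def sr1_update(B, s, y, tol=1e-8):
    """Return the SR1 update of B for step s and gradient difference y.

    The update is skipped (B is returned unchanged) when the denominator
    r^T s is too small relative to |r| * |s|, the usual safeguard.
    """
    r = [yi - bi for yi, bi in zip(y, matvec(B, s))]
    denom = sum(ri * si for ri, si in zip(r, s))
    norm_r = math.sqrt(sum(ri * ri for ri in r))
    norm_s = math.sqrt(sum(si * si for si in s))
    if abs(denom) <= tol * norm_r * norm_s:
        return [row[:] for row in B]
    n = len(B)
    return [[B[i][j] + r[i] * r[j] / denom for j in range(n)] for i in range(n)]


# Toy quadratic with Hessian A: the gradient difference along s is y = A s.
A = [[2.0, 1.0], [1.0, 3.0]]
B0 = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 0.0]
y = matvec(A, s)
B1 = sr1_update(B0, s, y)
```

After the update, `B1` satisfies the secant condition `B1 s = y`, which is the property that lets a sequence of such rank-one corrections accumulate curvature information.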

preprint, 2022, arXiv

Suppressing Poisoning Attacks on Federated Learning for Medical Imaging

Collaboration among multiple data-owning entities (e.g., hospitals) can accelerate the training process and yield better machine learning models due to the availability and diversity of data. However, privacy concerns make it challenging to exchange data while preserving confidentiality. Federated Learning (FL) is a promising solution that enables collaborative training through the exchange of model parameters instead of raw data. However, most existing FL solutions assume that participating clients are honest and thus can fail against poisoning attacks from malicious parties, whose goal is to degrade the global model performance. In this work, we propose a robust aggregation rule called Distance-based Outlier Suppression (DOS) that is resilient to Byzantine failures. The proposed method computes the distance between local parameter updates of different clients and obtains an outlier score for each client using Copula-based Outlier Detection (COPOD). The resulting outlier scores are converted into normalized weights using a softmax function, and a weighted average of the local parameters is used for updating the global model. DOS aggregation can effectively suppress parameter updates from malicious clients without the need for any hyperparameter selection, even when the data distributions are heterogeneous. Evaluation on two medical imaging datasets (CheXpert and HAM10000) demonstrates the higher robustness of the DOS method against a variety of poisoning attacks in comparison to other state-of-the-art methods. The code can be found at https://github.com/Naiftt/SPAFD.
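
The aggregation pipeline described in the abstract (pairwise distances, per-client outlier score, softmax weights, weighted average) can be sketched roughly as follows. Two simplifications are assumed here and are not taken from the paper: the COPOD scorer is replaced by a much simpler stand-in (mean Euclidean distance to the other clients), and the softmax is applied to negated scores so that larger outlier scores yield smaller weights. All function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two parameter vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def outlier_scores(updates):
    """Stand-in for COPOD: mean distance from each client to all others."""
    n = len(updates)
    return [
        sum(euclidean(updates[i], updates[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(updates):
    """Weighted average of client updates, down-weighting outliers."""
    scores = outlier_scores(updates)
    # Assumption: negate scores so that outliers receive small weights.
    weights = softmax([-score for score in scores])
    dim = len(updates[0])
    merged = [sum(w * u[k] for w, u in zip(weights, updates)) for k in range(dim)]
    return merged, weights


# Three similar client updates and one far-away (possibly poisoned) update.
updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [50.0, -50.0]]
global_update, weights = aggregate(updates)
```

In this toy example the fourth client receives a negligible weight, so `global_update` stays close to the three similar updates. No threshold or clipping constant is chosen by hand, in line with the abstract's claim that no hyperparameter selection is needed.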

preprint, 2020, arXiv

Near-Optimal Hyperfast Second-Order Method for convex optimization and its Sliding

In this paper, we present a new Hyperfast Second-Order Method with convergence rate $O(N^{-5})$, up to a logarithmic factor, for convex functions with Lipschitz third derivative. The method is based on two ideas. The first comes from the superfast second-order scheme of Yu. Nesterov (CORE Discussion Paper 2020/07, 2020), which implements a third-order scheme by solving the subproblem using only a second-order oracle; that scheme converges with rate $O(N^{-4})$. The second idea comes from the work of Kamzolov et al. (arXiv:2002.01004): an inexact near-optimal third-order method. In this work, we improve its convergence and merge it with the scheme for solving the subproblem using only the second-order oracle. As a result, we get a convergence rate of $O(N^{-5})$ up to a logarithmic factor. This rate is near-optimal and the best known to date. Further, we consider the case of a sum of two functions and improve the sliding framework from Kamzolov et al. (arXiv:2002.01004) for second-order methods.

preprint, 2020, arXiv

On the Optimal Combination of Tensor Optimization Methods

We consider the problem of minimizing a sum of several functions having Lipschitz $p$-th order derivatives with different Lipschitz constants. To accelerate optimization in this case, we propose a general framework that obtains near-optimal oracle complexity for each function in the sum separately, meaning, in particular, that the oracle for a function with a lower Lipschitz constant is called fewer times. As a building block, we extend the current theory of tensor methods and show how to generalize near-optimal tensor methods to work with an inexact tensor step. Further, we investigate the situation when the functions in the sum have Lipschitz derivatives of different orders. For this situation, we propose a generic way to separate the oracle complexity between the parts of the sum. Our method is not optimal, which leads to an open problem of the optimal combination of oracles of different orders.

preprint, 2016, arXiv

Gradient and gradient-free methods for stochastic convex optimization with inexact oracle

In this paper we generalize the universal gradient method (Yu. Nesterov) to the strongly convex case and to the intermediate gradient method (Devolder-Glineur-Nesterov). We also consider possible generalizations to the stochastic and online settings. We show how these results can be generalized to gradient-free methods and to the method of random direction search. The main ingredient of this paper, however, is the assumption about the oracle: we consider the oracle to be inexact.

preprint, 2016, arXiv

Universal composite prox-method for strictly convex optimization problems

We propose a simple way to explain the Universal method of Yu. Nesterov. Based on this method and using the restart technique, we propose a Universal method for strictly convex optimization problems. We consider a general proximal setup (not necessarily Euclidean).

preprint, 2016, arXiv

Universal method with inexact oracle and its applications for searching equillibriums in multistage transport problems

In this paper we propose a new efficient approach for the numerical calculation of equilibria in multistage transport problems. At the core of our approach is a combination of the Universal Gradient Method proposed by Yu. Nesterov (2013) and the concept of an inexact oracle (Devolder-Glineur-Nesterov, 2011). In particular, our technique allows us to calculate Wasserstein barycenters quickly (this result generalizes M. Cuturi et al. (2014)).
