Source author record

Franck Iutzeler

Franck Iutzeler appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

math.OC; Distributed, Parallel, and Cluster Computing; Machine Learning; Numerical Analysis; Computer Science and Game Theory; Information Retrieval; Information Theory; math.IT; math.NA; Mathematical Software; Multiagent Systems; Systems and Control

Catalog footprint

What is connected

13 works

12 topics

4 close collaborators


preprint 2026 arXiv

$\texttt{skwdro}$: a library for Wasserstein distributionally robust machine learning

We present skwdro, a Python library for training robust machine learning models. The library is based on distributionally robust optimization using Wasserstein distances, popular in optimal transport and machine learning. The goal of the library is to make the training of robust models easier for a wide audience by proposing a wrapper for PyTorch modules, enabling the robustification of a model's loss with minimal code changes. It also provides scikit-learn compatible estimators for some popular objectives. The core of the implementation relies on an entropic smoothing of the original robust objective, in order to ensure maximal model flexibility. The library is available at https://github.com/iutzeler/skwdro and the documentation at https://skwdro.readthedocs.io.
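To make the idea of entropic smoothing concrete, here is a minimal NumPy sketch of a smoothed robust objective. It is not the skwdro API and does not use the library. It assumes a squared Euclidean transport cost, a Gaussian reference distribution around each sample, and a fixed dual variable; the names `smoothed_robust_loss`, `lam`, `rho`, `eps` and `sigma` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def smoothed_robust_loss(loss, xi, lam, rho, eps, sigma, n_samples=64, rng=None):
    """Monte Carlo estimate of
        lam * rho + eps * mean_i log mean_j exp((loss(zeta_ij) - lam * c_ij) / eps),
    where zeta_ij = xi_i + Gaussian noise and c_ij = ||xi_i - zeta_ij||^2.
    All modelling choices here are assumptions made for illustration.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    xi = np.asarray(xi, dtype=float)                     # shape (n, d)
    noise = rng.normal(scale=sigma, size=(n_samples,) + xi.shape)
    zeta = xi[None] + noise                              # shape (m, n, d)
    cost = np.sum(noise ** 2, axis=-1)                   # shape (m, n)
    vals = (loss(zeta) - lam * cost) / eps               # loss maps (..., d) to (...)
    # Numerically stable log-mean-exp over the sampled perturbations.
    vmax = vals.max(axis=0)
    log_mean_exp = vmax + np.log(np.mean(np.exp(vals - vmax), axis=0))
    return lam * rho + eps * np.mean(log_mean_exp)

# Example: a quadratic loss evaluated around five two-dimensional samples.
samples = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
quadratic = lambda z: np.sum(z ** 2, axis=-1)
value = smoothed_robust_loss(quadratic, samples, lam=2.0, rho=0.1, eps=0.5, sigma=0.3)
```

In a full training loop, the dual variable `lam` would typically be optimized jointly with the model parameters; the sketch keeps it fixed to stay short.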

preprint 2023 arXiv

Entropy-regularized Wasserstein distributionally robust shape and topology optimization

This brief note aims to introduce the recent paradigm of distributional robustness in the field of shape and topology optimization. Acknowledging that the probability law of uncertain physical data is rarely known beyond a rough approximation constructed from observed samples, we optimize the worst-case value of the expected cost of a design when the probability law of the uncertainty is "close" to the estimated one up to a prescribed threshold. The "proximity" between probability laws is quantified by the Wasserstein distance, a notion pertaining to optimal transport theory. Combining the classical entropic regularization technique in this field with recent results from convex duality theory makes it possible to reformulate the distributionally robust optimization problem in a way that is tractable for computations. Two numerical examples are presented, in the different settings of density-based topology optimization and geometric shape optimization. They exemplify the relevance and applicability of the proposed formulation regardless of the selected optimal design framework.

preprint 2022 arXiv

Learning over No-Preferred and Preferred Sequence of Items for Robust Recommendation (Extended Abstract)

This paper is an extended version of [Burashnikova et al., 2021, arXiv: 2012.06910], where we proposed a theoretically supported sequential strategy for training a large-scale Recommender System (RS) over implicit feedback, mainly in the form of clicks. The proposed approach consists in minimizing a pairwise ranking loss over blocks of consecutive items, each block consisting of a sequence of non-clicked items followed by a clicked one for each user. We present two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach. To prevent updating the parameters for an abnormally high number of clicks over some targeted items (mainly due to bots), we introduce an upper and a lower threshold on the number of updates for each user. These thresholds are estimated over the distribution of the number of blocks in the training set. They affect the decisions of the RS by shifting the distribution of items that are shown to the users. Furthermore, we provide a convergence analysis of both algorithms and demonstrate their practical efficiency over six large-scale collections with respect to various ranking measures.
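The block construction described in this abstract can be sketched in a few lines of Python. This is an illustrative reading of the abstract, not the authors' code: the function names are hypothetical, a click with no preceding non-clicked items is simply skipped (an assumption), and the pairwise loss is a standard logistic surrogate chosen for the example.

```python
import math

def split_into_blocks(interactions):
    """Split one user's sequence of (item, clicked) pairs into blocks made of
    consecutive non-clicked items followed by one clicked item."""
    blocks, negatives = [], []
    for item, clicked in interactions:
        if clicked:
            if negatives:                      # a block needs at least one non-clicked item
                blocks.append((negatives, item))
            negatives = []
        else:
            negatives.append(item)
    return blocks

def block_ranking_loss(score, negatives, positive):
    """Average logistic pairwise loss: the clicked item should outscore
    every non-clicked item of the block."""
    pos = score(positive)
    return sum(math.log1p(math.exp(-(pos - score(n)))) for n in negatives) / len(negatives)

history = [("a", False), ("b", False), ("c", True), ("d", True), ("e", False), ("f", True)]
blocks = split_into_blocks(history)
losses = [block_ranking_loss(lambda item: 0.0, negs, pos) for negs, pos in blocks]
```

The per-user thresholds mentioned in the abstract would then be applied to `len(blocks)` before any parameter update; that step is left out here.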

preprint 2022 arXiv

Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism

In this paper, we provide a general framework for studying multi-agent online learning problems in the presence of delays and asynchronicities. Specifically, we propose and analyze a class of adaptive dual averaging schemes in which agents only need to accumulate gradient feedback received from the whole system, without requiring any between-agent coordination. In the single-agent case, the adaptivity of the proposed method allows us to extend a range of existing results to problems with potentially unbounded delays between playing an action and receiving the corresponding feedback. In the multi-agent case, the situation is significantly more complicated because agents may not have access to a global clock to use as a reference point; to overcome this, we focus on the information that is available for producing each prediction rather than the actual delay associated with each feedback. This allows us to derive adaptive learning strategies with optimal regret bounds, even in a fully decentralized, asynchronous environment. Finally, we also analyze an "optimistic" variant of the proposed algorithm which is capable of exploiting the predictability of problems with a slower variation and leads to improved regret bounds.

preprint 2020 arXiv

Distributed Learning with Sparse Communications by Identification

In distributed optimization for large-scale learning, a major performance limitation comes from the communications between the different entities. When computations are performed by workers on local data while a coordinator machine coordinates their updates to minimize a global loss, we present an asynchronous optimization algorithm that efficiently reduces the communications between the coordinator and workers. This reduction comes from a random sparsification of the local updates. We show that this algorithm converges linearly in the strongly convex case and also identifies optimal strongly sparse solutions. We further exploit this identification to propose an automatic dimension reduction, aptly sparsifying all exchanges between coordinator and workers.

preprint 2020 arXiv

On the convergence of single-call stochastic extra-gradient methods

Variational inequalities have recently attracted considerable interest in machine learning as a flexible paradigm for models that go beyond ordinary loss function minimization (such as generative adversarial networks and related deep learning systems). In this setting, the optimal $\mathcal{O}(1/t)$ convergence rate for solving smooth monotone variational inequalities is achieved by the Extra-Gradient (EG) algorithm and its variants. Aiming to alleviate the cost of an extra gradient step per iteration (which can become quite substantial in deep learning applications), several algorithms have been proposed as surrogates to Extra-Gradient with a single oracle call per iteration. In this paper, we develop a synthetic view of such algorithms, and we complement the existing literature by showing that they retain an $\mathcal{O}(1/t)$ ergodic convergence rate in smooth, deterministic problems. Subsequently, beyond the monotone deterministic case, we also show that the last iterate of single-call, stochastic extra-gradient methods still enjoys an $\mathcal{O}(1/t)$ local convergence rate to solutions of non-monotone variational inequalities that satisfy a second-order sufficient condition.
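As a small illustration of what a single oracle call per iteration looks like, the sketch below runs the optimistic gradient update, one well-known single-call scheme, on the toy bilinear saddle-point problem min over x, max over y of x*y. The problem, the step size and the function names are assumptions made for this example; the paper's analysis covers a broader family and the stochastic setting.

```python
import numpy as np

def operator(z):
    """Monotone operator of the saddle-point problem min_x max_y x*y:
    V(x, y) = (d/dx (x*y), -d/dy (x*y)) = (y, -x)."""
    x, y = z
    return np.array([y, -x])

def optimistic_gradient(z0, step=0.1, n_iter=2000):
    """Single-call update: z <- z - step * (2 V(z) - V(z_prev)).
    Only one new operator evaluation is made per iteration."""
    z = np.array(z0, dtype=float)
    v_prev = operator(z)
    for _ in range(n_iter):
        v = operator(z)                 # the single oracle call
        z = z - step * (2.0 * v - v_prev)
        v_prev = v
    return z

def gradient_descent_ascent(z0, step=0.1, n_iter=2000):
    """Plain gradient step, shown for contrast: it spirals away on this problem."""
    z = np.array(z0, dtype=float)
    for _ in range(n_iter):
        z = z - step * operator(z)
    return z

z_og = optimistic_gradient([1.0, 1.0])
z_gda = gradient_descent_ascent([1.0, 1.0])
```

On this problem the optimistic iterates approach the solution (0, 0), while plain gradient descent-ascent moves away from it, which is why the correction using the previous operator value matters.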

preprint 2020 arXiv

On the Interplay between Acceleration and Identification for the Proximal Gradient algorithm

In this paper, we study the interplay between acceleration and structure identification for the proximal gradient algorithm. We report and analyze several cases where this interplay has negative effects on the algorithm's behavior (oscillating iterates, loss of structure, etc.). We present a generic method that tames acceleration when structure identification may be at stake; it benefits from a convergence rate that matches that of the accelerated proximal gradient under some qualifying condition. We show empirically that the proposed method is much more stable in terms of subspace identification than the accelerated proximal gradient method while keeping a similar functional decrease.

preprint 2020 arXiv

Proximal Gradient methods with Adaptive Subspace Sampling

Many applications in machine learning or signal processing involve nonsmooth optimization problems. This nonsmoothness brings a low-dimensional structure to the optimal solutions. In this paper, we propose a randomized proximal gradient method harnessing this underlying structure. We introduce two key components: i) a random subspace proximal gradient algorithm; ii) an identification-based sampling of the subspaces. Their interplay brings a significant performance improvement on typical learning problems in terms of dimensions explored.
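The link between nonsmoothness and low-dimensional structure can be seen on the simplest example, l1-regularized least squares. The sketch below is the plain (non-randomized) proximal gradient method, not the subspace-sampling algorithm of the paper; the function names and the tiny test problem are assumptions for illustration. It records the support of each iterate, which is the structure an identification-based scheme would exploit.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrinks entries toward zero and
    sets the small ones exactly to zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient_lasso(A, b, reg, n_iter=500):
    """Minimize 0.5 * ||A x - b||^2 + reg * ||x||_1 with step size 1/L."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    supports = []
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - grad / L, reg / L)
        supports.append(np.flatnonzero(x))  # indices of the nonzero coordinates
    return x, supports

# With an identity design the solution is the soft-thresholded data.
A = np.eye(3)
b = np.array([3.0, 0.5, -2.0])
x_hat, supports = proximal_gradient_lasso(A, b, reg=1.0, n_iter=50)
```

Once the support stops changing, the remaining iterations only move the nonzero coordinates, so the problem effectively lives in a lower-dimensional subspace.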

preprint 2020 arXiv

Rank-one partitioning: formalization, illustrative examples, and a new cluster enhancing strategy

In this paper, we introduce and formalize a rank-one partitioning learning paradigm that unifies partitioning methods that proceed by summarizing a data set using a single vector that is further used to derive the final clustering partition. Using this unification as a starting point, we propose a novel algorithmic solution for the partitioning problem based on rank-one matrix factorization and denoising of piecewise constant signals. Finally, we present an empirical evaluation of our findings and demonstrate the robustness of the proposed denoising step. We believe that our work provides a new point of view for several unsupervised learning techniques that helps to gain a deeper understanding of the general mechanisms of data partitioning.

preprint 2015 arXiv

A Coordinate Descent Primal-Dual Algorithm and Application to Distributed Asynchronous Optimization

Based on the idea of randomized coordinate descent of $α$-averaged operators, a randomized primal-dual optimization algorithm is introduced, where a random subset of coordinates is updated at each iteration. The algorithm builds upon a variant of a recent (deterministic) algorithm proposed by Vũ and Condat that includes the well known ADMM as a particular case. The obtained algorithm is used to solve asynchronously a distributed optimization problem. A network of agents, each having a separate cost function containing a differentiable term, seek to find a consensus on the minimum of the aggregate objective. The method yields an algorithm where at each iteration, a random subset of agents wake up, update their local estimates, exchange some data with their neighbors, and go idle. Numerical results demonstrate the attractive performance of the method. The general approach can be naturally adapted to other situations where coordinate descent convex optimization algorithms are used with a random choice of the coordinates.

preprint 2014 arXiv

Explicit Convergence Rate of a Distributed Alternating Direction Method of Multipliers

Consider a set of N agents seeking to solve distributively the minimization problem $\inf_{x} \sum_{n = 1}^N f_n(x)$ where the convex functions $f_n$ are local to the agents. The popular Alternating Direction Method of Multipliers has the potential to handle distributed optimization problems of this kind. We provide a general reformulation of the problem and obtain a class of distributed algorithms which encompass various network architectures. The rate of convergence of our method is considered. It is assumed that the infimum of the problem is reached at a point $x_\star$, the functions $f_n$ are twice differentiable at this point and $\sum \nabla^2 f_n(x_\star) > 0$ in the positive definite ordering of symmetric matrices. With these assumptions, it is shown that the convergence to the consensus $x_\star$ is linear and the exact rate is provided. Application examples where this rate can be optimized with respect to the ADMM free parameter $ρ$ are also given.
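For readers unfamiliar with consensus ADMM, the following sketch solves the problem above in its simplest form: scalar quadratic local functions f_n(x) = 0.5 * (x - a_n)^2, whose aggregate minimizer is the average of the a_n, with a central averaging step. This is the textbook global-consensus variant, not the general network reformulation of the paper; the function name and parameters are hypothetical.

```python
import numpy as np

def consensus_admm(a, rho=1.0, n_iter=200):
    """Global-consensus ADMM for min_x sum_n 0.5 * (x - a[n])**2.
    x[n] is agent n's local copy, z the consensus variable,
    u[n] the scaled dual variable of the constraint x[n] = z."""
    a = np.asarray(a, dtype=float)
    x = np.zeros_like(a)
    u = np.zeros_like(a)
    z = 0.0
    for _ in range(n_iter):
        # Local step: closed-form minimizer of f_n(x) + (rho/2) * (x - z + u[n])**2.
        x = (a + rho * (z - u)) / (1.0 + rho)
        # Consensus step: average of the local copies plus duals.
        z = np.mean(x + u)
        # Dual step: accumulate the remaining disagreement.
        u = u + x - z
    return x, z

a = np.array([1.0, 4.0, 7.0, 10.0])
x_final, z_final = consensus_admm(a, rho=1.0)
```

In this toy case the error contracts geometrically at a speed that depends on `rho`, which gives a feel for the linear rate and the role of the free parameter discussed in the abstract.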

preprint 2013 arXiv

Asynchronous Distributed Optimization using a Randomized Alternating Direction Method of Multipliers

Consider a set of networked agents endowed with private cost functions and seeking to find a consensus on the minimizer of the aggregate cost. A new class of random asynchronous distributed optimization methods is introduced. The methods generalize the standard Alternating Direction Method of Multipliers (ADMM) to an asynchronous setting where isolated components of the network are activated in an uncoordinated fashion. The algorithms rely on the introduction of randomized Gauss-Seidel iterations of a Douglas-Rachford operator for finding zeros of a sum of two monotone operators. Convergence to the sought minimizers is established under mild connectivity conditions. Numerical results support our claims.

preprint 2012 arXiv

Analysis of Sum-Weight-like algorithms for averaging in Wireless Sensor Networks

Distributed estimation of the average value over a Wireless Sensor Network has recently received a lot of attention. Most papers consider single-variable sensors and communications with feedback (e.g. peer-to-peer communications). However, to make efficient use of the broadcast nature of the wireless channel, communications without feedback are advocated. To ensure convergence in this feedback-free case, the recently introduced Sum-Weight-like algorithms, which rely on two variables at each sensor, are a promising solution. In this paper, the convergence towards the consensus over the average of the initial values is analyzed in depth. Furthermore, it is shown that the squared error decreases exponentially with time. In addition, a powerful algorithm relying on the Sum-Weight structure and taking into account the broadcast nature of the channel is proposed.
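The two-variable idea can be sketched as follows: each sensor keeps a sum and a weight, a randomly activated sensor splits both equally between itself and its neighbors and broadcasts the shares without waiting for any reply, and each sensor estimates the average as sum divided by weight. This is a minimal illustration under assumed settings (a ring of eight sensors, uniform activation, equal shares), not necessarily the exact algorithm proposed in the paper.

```python
import numpy as np

def broadcast_sum_weight(values, neighbors, n_iter=20000, seed=0):
    """Feedback-free averaging with a (sum, weight) pair per sensor.
    Total sum and total weight are conserved, so every ratio s[i] / w[i]
    can converge to the average of the initial values."""
    rng = np.random.default_rng(seed)
    s = np.array(values, dtype=float)   # running sums
    w = np.ones_like(s)                 # running weights
    for _ in range(n_iter):
        i = rng.integers(len(s))        # sensor that wakes up and broadcasts
        share = 1.0 / (len(neighbors[i]) + 1)
        s_part, w_part = s[i] * share, w[i] * share
        s[i], w[i] = s_part, w_part     # the sender keeps one share
        for j in neighbors[i]:          # each neighbor adds what it hears
            s[j] += s_part
            w[j] += w_part
    return s / w, s, w

n = 8
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
initial = np.arange(n, dtype=float)
estimates, sums, weights = broadcast_sum_weight(initial, ring)
```

Tracking only the sums would not work here, because a one-way broadcast changes how much mass each sensor holds; dividing by the weight corrects for that imbalance.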
