Researcher profile

Anton Dereventsov

Anton Dereventsov contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Unverified | Verification L1 | Unclaimed author
3 works
0 followers
4 topics
4 close collaborators

Published work

3 published items

preprint | 2022 | arXiv

An adaptive stochastic gradient-free approach for high-dimensional blackbox optimization

In this work, we propose a novel adaptive stochastic gradient-free (ASGF) approach for solving high-dimensional nonconvex optimization problems based on function evaluations. We employ a directional Gaussian smoothing of the target function that generates a surrogate of the gradient and assists in avoiding bad local optima by utilizing nonlocal information of the loss landscape. Applying a deterministic quadrature scheme results in a massively scalable technique that is sample-efficient and achieves spectral accuracy. At each step we randomly generate the search directions while primarily following the surrogate of the smoothed gradient. This enables exploitation of the gradient direction while maintaining sufficient space exploration, and accelerates convergence towards the global extrema. In addition, we make use of a local approximation of the Lipschitz constant in order to adaptively adjust the values of all hyperparameters, thus removing the careful fine-tuning that current algorithms often need in order to succeed on a large class of learning tasks. As such, the ASGF strategy offers significant improvements when solving high-dimensional nonconvex optimization problems when compared to other gradient-free methods (including the so-called "evolutionary strategies") as well as iterative approaches that rely on the gradient information of the objective function. We illustrate the improved performance of this method by providing several comparative numerical studies on benchmark global optimization problems and reinforcement learning tasks.
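
The directional Gaussian smoothing idea in the abstract above can be illustrated with a small sketch. This is not the authors' ASGF implementation: it assumes a fixed 3-point Gauss-Hermite rule, a constant smoothing radius sigma, and the standard basis as the orthonormal search directions, and it omits both the random direction updates and the Lipschitz-based hyperparameter adaptation. All function names are hypothetical.

```python
import math

# 3-point Gauss-Hermite rule for the weight exp(-t^2) (assumed quadrature size).
_SQRT_PI = math.sqrt(math.pi)
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (_SQRT_PI / 6.0, 2.0 * _SQRT_PI / 3.0, _SQRT_PI / 6.0)


def dgs_directional_derivative(f, x, direction, sigma):
    """Quadrature estimate of d/dy E_v[f(x + (y + sigma*v) * direction)] at y = 0,
    where v is a standard normal variable."""
    total = 0.0
    for t, w in zip(GH_NODES, GH_WEIGHTS):
        step = math.sqrt(2.0) * sigma * t
        point = [xi + step * di for xi, di in zip(x, direction)]
        total += w * math.sqrt(2.0) * t * f(point)
    return total / (_SQRT_PI * sigma)


def dgs_gradient(f, x, sigma=0.5):
    """Gradient surrogate: sum of smoothed directional derivatives along the
    standard basis (a fixed orthonormal basis, chosen here for simplicity)."""
    dim = len(x)
    grad = [0.0] * dim
    for i in range(dim):
        direction = [1.0 if j == i else 0.0 for j in range(dim)]
        d = dgs_directional_derivative(f, x, direction, sigma)
        for j in range(dim):
            grad[j] += d * direction[j]
    return grad


def sphere(x):
    """Toy objective used only to exercise the sketch."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(dgs_gradient(sphere, [1.0, -2.0, 0.5]))
```

For a quadratic objective such as `sphere`, the smoothed directional derivative coincides with the ordinary one and the 3-point rule integrates it exactly, so the surrogate matches the true gradient up to rounding. On nonconvex objectives the two differ, which is where the smoothing is meant to help.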

preprint | 2022 | arXiv

Biorthogonal Greedy Algorithms in Convex Optimization

The study of greedy approximation in the context of convex optimization is becoming a promising research direction as greedy algorithms are actively being employed to construct sparse minimizers for convex functions with respect to given sets of elements. In this paper we propose a unified way of analyzing a certain kind of greedy-type algorithms for the minimization of convex functions on Banach spaces. Specifically, we define the class of Weak Biorthogonal Greedy Algorithms for convex optimization that contains a wide range of greedy algorithms. We analyze the introduced class of algorithms and establish the properties of convergence, rate of convergence, and numerical stability, which is understood in the sense that the steps of the algorithm are allowed to be performed not precisely but with controlled computational inaccuracies. We show that the following well-known algorithms for convex optimization -- the Weak Chebyshev Greedy Algorithm (co) and the Weak Greedy Algorithm with Free Relaxation (co) -- belong to this class, and introduce a new algorithm -- the Rescaled Weak Relaxed Greedy Algorithm (co). Presented numerical experiments demonstrate the practical performance of the aforementioned greedy algorithms in the setting of convex minimization as compared to optimization with regularization, which is the conventional approach of constructing sparse minimizers.
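
To make the greedy selection idea concrete, here is a deliberately simplified sketch in finite dimensions. It is not one of the algorithms named in the abstract: it assumes the quadratic objective E(x) = 0.5 * ||x - target||^2, a dictionary of signed standard basis vectors, and a plain one-dimensional line search in place of the Chebyshev or relaxation steps. The names `weak_greedy_minimize`, `signed_basis`, and the `weakness` parameter are hypothetical.

```python
def inner(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def signed_basis(dim):
    """Dictionary of +e_i and -e_i (an assumption made for this illustration)."""
    elements = []
    for i in range(dim):
        e = [1.0 if j == i else 0.0 for j in range(dim)]
        elements.append(e)
        elements.append([-v for v in e])
    return elements


def weak_greedy_minimize(target, dictionary, steps, weakness=1.0):
    """Greedy minimization of E(x) = 0.5 * ||x - target||^2.

    Each step picks the first dictionary element whose inner product with the
    negative gradient is at least `weakness` times the best available value,
    then minimizes E exactly along that element.
    """
    point = [0.0] * len(target)
    for _ in range(steps):
        neg_grad = [t - p for t, p in zip(target, point)]  # -E'(point)
        scores = [inner(neg_grad, g) for g in dictionary]
        best = max(scores)
        if best <= 0.0:
            break  # no dictionary element gives a descent direction
        index = next(i for i, s in enumerate(scores) if s >= weakness * best)
        element = dictionary[index]
        step = scores[index] / inner(element, element)  # exact line search
        point = [p + step * e for p, e in zip(point, element)]
    return point


if __name__ == "__main__":
    print(weak_greedy_minimize([3.0, -1.0, 0.5], signed_basis(3), steps=2))
```

Stopping after m steps yields a minimizer built from at most m dictionary elements, which is the sense in which greedy methods construct sparse approximate minimizers.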

preprint | 2020 | arXiv

Neural network integral representations with the ReLU activation function

In this effort, we derive a formula for the integral representation of a shallow neural network with the ReLU activation function. We assume that the outer weights admit a finite $L_1$-norm with respect to Lebesgue measure on the sphere. For univariate target functions we further provide a closed-form formula for all possible representations. Additionally, in this case our formula allows one to explicitly solve for the least $L_1$-norm neural network representation of a given function.
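
As a rough illustration of what an integral representation with ReLU looks like in the univariate case, the sketch below checks a classical identity numerically. It is not the paper's formula: it assumes the Taylor expansion with integral remainder on [0, 1], f(x) = f(0) + f'(0) x + \int_0^1 f''(b) ReLU(x - b) db, in which f'' plays the role of the outer weights. The function names and the midpoint discretization are hypothetical choices.

```python
import math


def relu(z):
    """ReLU activation."""
    return z if z > 0.0 else 0.0


def relu_integral_representation(f0, df0, d2f, x, n=2000):
    """Approximate f(x) for x in [0, 1] from f(0), f'(0) and f''.

    The integral over the bias b is discretized with the midpoint rule, which
    turns the representation into a shallow ReLU network with n hidden units.
    """
    h = 1.0 / n
    integral = 0.0
    for k in range(n):
        b = (k + 0.5) * h
        integral += d2f(b) * relu(x - b) * h
    return f0 + df0 * x + integral


if __name__ == "__main__":
    # f(x) = x^3: f(0) = 0, f'(0) = 0, f''(b) = 6b.
    approx = relu_integral_representation(0.0, 0.0, lambda b: 6.0 * b, 0.7)
    print(approx, 0.7 ** 3)
    # f(x) = exp(x): f(0) = 1, f'(0) = 1, f''(b) = exp(b).
    approx = relu_integral_representation(1.0, 1.0, math.exp, 0.5)
    print(approx, math.exp(0.5))
```

In this discretized form the sum of |f''(b)| h over the hidden units approximates the $L_1$-norm of the outer weights for this particular representation. Whether it is the least-norm representation is a separate question that the sketch does not address.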