Researcher profile

Yurii Nesterov

Yurii Nesterov contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
8 works
0 followers
5 topics
4 close collaborators

Published work

8 published items

Preprint, 2022, arXiv

Adaptive Third-Order Methods for Composite Convex Optimization

In this paper we propose third-order methods for composite convex optimization problems in which the smooth part is a three-times continuously differentiable function with Lipschitz continuous third-order derivatives. The methods are adaptive in the sense that they do not require knowledge of the Lipschitz constant. Trial points are computed by the inexact minimization of models that consist of the nonsmooth part of the objective plus a quartic regularization of the third-order Taylor polynomial of the smooth part. Specifically, approximate solutions of the auxiliary problems are obtained by using a Bregman gradient method as the inner solver. Unlike existing adaptive approaches for high-order methods, in our new schemes the regularization parameters are adjusted taking into account the progress of the inner solver. With this technique, we show that the basic method finds an $\varepsilon$-approximate minimizer of the objective function performing at most $\mathcal{O}\left(|\log(\varepsilon)|\varepsilon^{-\frac{1}{3}}\right)$ iterations of the inner solver. An accelerated adaptive third-order method is also presented with total inner iteration complexity of $\mathcal{O}\left(|\log(\varepsilon)|\varepsilon^{-\frac{1}{4}}\right)$.

Preprint, 2022, arXiv

Quartic Regularity

In this paper, we propose new linearly convergent second-order methods for minimizing convex quartic polynomials. This framework is applied to design optimization schemes that can solve general convex problems satisfying a new condition of quartic regularity. It assumes positive definiteness and boundedness of the fourth derivative of the objective function. For such problems, an appropriate quartic regularization of the Damped Newton Method has a global linear rate of convergence. We discuss several important consequences of this result. In particular, it can be used for constructing new second-order methods in the framework of high-order proximal-point schemes. These methods have a convergence rate of $\tilde O(k^{-p})$, where $k$ is the iteration counter, $p$ is equal to 3, 4, or 5, and the tilde indicates the presence of logarithmic factors in the complexity bounds for the auxiliary problems, which are solved at each iteration of the schemes.

Preprint, 2022, arXiv

Super-Universal Regularized Newton Method

We analyze the performance of a variant of the Newton method with quadratic regularization for solving composite convex minimization problems. At each step of our method, we choose the regularization parameter proportional to a certain power of the gradient norm at the current point. We introduce a family of problem classes characterized by Hölder continuity of either the second or third derivative. Then we present the method with a simple adaptive search procedure allowing an automatic adjustment to the problem class with the best global complexity bounds, without knowing specific parameters of the problem. In particular, for the class of functions with Lipschitz continuous third derivative, we get the global $O(1/k^3)$ rate, which was previously attributed to third-order tensor methods. When the objective function is uniformly convex, we justify an automatic acceleration of our scheme, resulting in a faster global rate and local superlinear convergence. The switching between the different rates (sublinear, linear, and superlinear) is automatic. Again, no a priori knowledge of parameters is needed for that.
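
The core step described in this abstract, a Newton step whose quadratic regularization scales with a power of the gradient norm, can be sketched in one dimension as follows. This is only an illustration under stated assumptions: the function name `grad_regularized_newton`, the fixed constants `c` and `alpha`, and the test function log(cosh(x)) are hypothetical choices, and the paper's adaptive search procedure is not reproduced.

```python
import math

def grad_regularized_newton(df, d2f, x0, c=1.0, alpha=0.5, iters=30):
    """1-D Newton iteration with regularization lam = c * |f'(x)|**alpha.

    c and alpha are fixed here (an assumption for illustration); the paper
    adjusts the regularization adaptively instead.
    """
    x = x0
    for _ in range(iters):
        g = df(x)
        if g == 0.0:
            break  # stationary point reached
        lam = c * abs(g) ** alpha      # regularization tied to the gradient norm
        x -= g / (d2f(x) + lam)        # regularized Newton step
    return x

# Hypothetical convex test function f(x) = log(cosh(x)), minimized at x = 0.
df = math.tanh
d2f = lambda x: 1.0 - math.tanh(x) ** 2

x_star = grad_regularized_newton(df, d2f, x0=3.0)
```

Because the regularization shrinks together with the gradient, the step approaches a pure Newton step near the minimizer while staying damped far from it.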

Preprint, 2021, arXiv

Dynamic pricing under nested logit demand

Recently, there has been growing interest in and need for dynamic pricing algorithms, especially in the field of online marketplaces by offering smart pricing options for big online stores. We present an approach to adjust prices based on the observed online market data. The key idea is to characterize optimal prices as minimizers of a total expected revenue function, which turns out to be convex. We assume that consumers face information processing costs and hence follow a discrete choice demand model, and that suppliers are equipped with quantity adjustment costs. We prove the strong smoothness of the total expected revenue function by deriving the strong convexity modulus of its dual. Our gradient-based pricing schemes outbalance supply and demand at the convergence rates of $\mathcal{O}(\frac{1}{t})$ and $\mathcal{O}(\frac{1}{t^2})$, respectively. This suggests that the imperfect behavior of consumers and suppliers helps to stabilize the market.

Preprint, 2020, arXiv

Affine-invariant contracting-point methods for Convex Optimization

In this paper, we develop new affine-invariant algorithms for solving composite convex minimization problems with bounded domain. We present a general framework of Contracting-Point methods, which solve at each iteration an auxiliary subproblem restricting the smooth part of the objective function onto a contraction of the initial domain. This framework provides us with a systematic way for developing optimization methods of different order, endowed with global complexity bounds. We show that using an appropriate affine-invariant smoothness condition, it is possible to implement one iteration of the Contracting-Point method by one step of the pure tensor method of degree $p \geq 1$. The resulting global rate of convergence in functional residual is then $\mathcal{O}(1 / k^p)$, where $k$ is the iteration counter. It is important that all constants in our bounds are affine-invariant. For $p = 1$, our scheme recovers the well-known Frank-Wolfe algorithm, providing it with a new interpretation from the general perspective of tensor methods. Finally, within our framework, we present an efficient implementation and total complexity analysis of the inexact second-order scheme $(p = 2)$, called the Contracting Newton method. It can be seen as a proper implementation of the trust-region idea. Preliminary numerical results confirm its good practical performance both in the number of iterations and in computational time.
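
For orientation, the $p = 1$ case mentioned in this abstract is the classical Frank-Wolfe algorithm, which can be sketched as below. This is the textbook method with the standard step size $2/(k+2)$, not the paper's contracting-point framework; the feasible set (the probability simplex), the function name `frank_wolfe_simplex`, and the quadratic test objective are assumptions chosen for illustration.

```python
def frank_wolfe_simplex(grad, x0, iters=500):
    """Classical Frank-Wolfe over the probability simplex."""
    x = list(x0)
    for k in range(iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, <g, s> is minimized
        # at the vertex with the smallest gradient component.
        i = min(range(len(x)), key=lambda j: g[j])
        gamma = 2.0 / (k + 2.0)                 # standard open-loop step size
        x = [(1.0 - gamma) * xj for xj in x]    # move toward vertex e_i
        x[i] += gamma
    return x

# Hypothetical test problem: f(x) = 0.5 * ||x - c||^2 with c inside the simplex.
c = [0.2, 0.5, 0.3]
f = lambda x: 0.5 * sum((xj - cj) ** 2 for xj, cj in zip(x, c))
grad = lambda x: [xj - cj for xj, cj in zip(x, c)]

x = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0])
```

Each iterate is a convex combination of simplex vertices, so feasibility holds without any projection step.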

Preprint, 2020, arXiv

On the Quality of First-Order Approximation of Functions with Hölder Continuous Gradient

We show that Hölder continuity of the gradient is not only a sufficient condition, but also a necessary condition for the existence of a global upper bound on the error of the first-order Taylor approximation. We also relate this global upper bound to the Hölder constant of the gradient. This relation is expressed as an interval, depending on the Hölder constant, in which the error of the first-order Taylor approximation is guaranteed to be. We show that, for the Lipschitz continuous case, the interval cannot be reduced. An application to the norms of quadratic forms is proposed, which allows us to derive a novel characterization of Euclidean norms.
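
For reference, the sufficiency direction mentioned in this abstract is the standard bound below, written for the Euclidean norm. The symbols $H_\nu$ and $\nu$ are notation assumed here rather than taken from the abstract, and the paper's sharper interval result is not reproduced. If $\|\nabla f(x) - \nabla f(y)\| \le H_\nu \|x - y\|^{\nu}$ for all $x, y$ and some $\nu \in (0, 1]$, then

$$
\left| f(y) - f(x) - \langle \nabla f(x), y - x \rangle \right| \le \frac{H_\nu}{1 + \nu} \, \|y - x\|^{1 + \nu} \quad \text{for all } x, y.
$$

The case $\nu = 1$ gives the familiar bound $\frac{L}{2}\|y - x\|^2$ for functions with $L$-Lipschitz gradient.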

Preprint, 2020, arXiv

Online analysis of epidemics with variable infection rate

In this paper, we continue the development of the new epidemiological model HIT, which is suitable for analyzing and predicting the propagation of COVID-19 epidemics. This is a discrete-time model allowing a reconstruction of the dynamics of asymptomatic virus holders using the available daily statistics on the number of new cases. We suggest using a new indicator, the total infection rate, to distinguish the propagation and recession modes of the epidemic. We check our indicator on the available data for eleven different countries and for the whole world. Our reconstructions are very precise. In several cases, we are able to detect the exact dates of the disastrous political decisions, ensuring the second wave of the epidemics. It appears that for all our examples the decisions made on the basis of the current number of new cases are wrong. In this paper, we suggest a reasonable alternative. Our analysis shows that all tested countries are in a dangerous zone except Sweden.

Preprint, 2020, arXiv

Stochastic Subspace Cubic Newton Method

In this paper, we propose a new randomized second-order optimization algorithm, Stochastic Subspace Cubic Newton (SSCN), for minimizing a high dimensional convex function $f$. Our method can be seen both as a stochastic extension of the cubically-regularized Newton method of Nesterov and Polyak (2006), and a second-order enhancement of stochastic subspace descent of Kozak et al. (2019). We prove that as we vary the minibatch size, the global convergence rate of SSCN interpolates between the rate of stochastic coordinate descent (CD) and the rate of cubic regularized Newton, thus giving new insights into the connection between first and second-order methods. Remarkably, the local convergence rate of SSCN matches the rate of stochastic subspace descent applied to the problem of minimizing the quadratic function $\frac12 (x-x^*)^\top \nabla^2f(x^*)(x-x^*)$, where $x^*$ is the minimizer of $f$, and hence depends on the properties of $f$ at the optimum only. Our numerical experiments show that SSCN outperforms non-accelerated first-order CD algorithms while being competitive to their accelerated variants.
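
To make the idea concrete, here is a minimal sketch of the simplest case one might consider: a subspace of dimension one, that is, a single random coordinate per step, where the cubic-regularized model has a closed-form minimizer. The function names, the constant `M = 2.0`, and the test objective are assumptions made for illustration and do not come from the paper.

```python
import math
import random

def coordinate_cubic_newton(grad_i, hess_ii, x0, M, iters=300, seed=0):
    """Random-coordinate cubic-regularized Newton steps (subspace dimension 1)."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        i = rng.randrange(len(x))          # sample one coordinate uniformly
        g = grad_i(x, i)
        if g == 0.0:
            continue                       # coordinate already stationary
        h = hess_ii(x, i)
        # Closed-form minimizer of g*t + 0.5*h*t**2 + (M/6)*|t|**3,
        # written in a cancellation-free form.
        step = 2.0 * abs(g) / (h + math.sqrt(h * h + 2.0 * M * abs(g)))
        x[i] -= math.copysign(step, g)
    return x

# Hypothetical test problem: f(x) = log(sum_i 2*cosh(x_i)), minimized at x = 0.
def f(x):
    return math.log(sum(2.0 * math.cosh(v) for v in x))

def grad_i(x, i):
    return 2.0 * math.sinh(x[i]) / sum(2.0 * math.cosh(v) for v in x)

def hess_ii(x, i):
    s = sum(2.0 * math.cosh(v) for v in x)
    return 2.0 * math.cosh(x[i]) / s - (2.0 * math.sinh(x[i]) / s) ** 2

x0 = [1.0, -2.0, 0.5]
x = coordinate_cubic_newton(grad_i, hess_ii, x0, M=2.0)
```

Larger minibatches would replace the scalar model with a small cubic-regularized subproblem over the sampled coordinates, which no longer has such a simple closed form.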