Source author record

Naoki Marumo

Naoki Marumo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.OC Machine Learning math.CA math.ST Statistics Theory

Catalog footprint

What is connected

5works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2025arXiv

Complexity and convergence analysis of a single-loop SDCAM for Lipschitz composite optimization and beyond

We develop and analyze a single-loop algorithm for minimizing the sum of a Lipschitz differentiable function $f$, a prox-friendly proper closed function $g$ (with a closed domain on which $g$ is continuous) and the composition of another prox-friendly proper closed function $h$ (whose domain is closed on which $h$ is continuous) with a continuously differentiable mapping $c$ (that is Lipschitz continuous and Lipschitz differentiable on the convex closure of the domain of $g$). Such models arise naturally in many contemporary applications, where $f$ is the loss function for data misfit, and $g$ and $h$ are nonsmooth functions for inducing desirable structures in $x$ and $c(x)$. Existing single-loop algorithms mainly focus either on the case where $h$ is Lipschitz continuous or the case where $h$ is an indicator function of a closed convex set. In this paper, we develop a single-loop algorithm for more general possibly non-Lipschitz $h$. Our algorithm is a single-loop variant of the successive difference-of-convex approximation method (SDCAM) proposed in [22]. We show that when $h$ is Lipschitz, our algorithm exhibits an iteration complexity that matches the best known complexity result for obtaining an $(ε_1,ε_2,0)$-stationary point. Moreover, we show that, by assuming additionally that dom $g$ is compact, our algorithm exhibits an iteration complexity of $\tilde{O}(ε^{-4})$ for obtaining an $(ε,ε,ε)$-stationary point when $h$ is merely continuous and real-valued. Furthermore, we consider a scenario where $h$ does not have full domain and establish vanishing bounds on successive changes of iterates. Finally, in all three cases mentioned above, we show that one can construct a subsequence such that any accumulation point $x^*$ satisfies $c(x^*)\in$ dom $h$, and if a standard constraint qualification holds at $x^*$, then $x^*$ is a stationary point.

preprint2024arXiv

Accelerated-gradient-based generalized Levenberg--Marquardt method with oracle complexity bound and local quadratic convergence

Minimizing the sum of a convex function and a composite function appears in various fields. The generalized Levenberg--Marquardt (LM) method, also known as the prox-linear method, has been developed for such optimization problems. The method iteratively solves strongly convex subproblems with a damping term. This study proposes a new generalized LM method for solving the problem with a smooth composite function. The method enjoys three theoretical guarantees: iteration complexity bound, oracle complexity bound, and local convergence under a Hölderian growth condition. The local convergence results include local quadratic convergence under the quadratic growth condition; this is the first to extend the classical result for least-squares problems to a general smooth composite function. In addition, this is the first LM method with both an oracle complexity bound and local quadratic convergence under standard assumptions. These results are achieved by carefully controlling the damping parameter and solving the subproblems by the accelerated proximal gradient method equipped with a particular termination condition. Experimental results show that the proposed method performs well in practice for several instances, including classification with a neural network and nonnegative matrix factorization.

preprint2024arXiv

Universal heavy-ball method for nonconvex optimization under Hölder continuous Hessians

We propose a new first-order method for minimizing nonconvex functions with Lipschitz continuous gradients and Hölder continuous Hessians. The proposed algorithm is a heavy-ball method equipped with two particular restart mechanisms. It finds a solution where the gradient norm is less than $ε$ in $O(H_ν^{\frac{1}{2 + 2 ν}} ε^{- \frac{4 + 3 ν}{2 + 2 ν}})$ function and gradient evaluations, where $ν\in [0, 1]$ and $H_ν$ are the Hölder exponent and constant, respectively. Our algorithm is $ν$-independent and thus universal; it automatically achieves the above complexity bound with the optimal $ν\in [0, 1]$ without knowledge of $H_ν$. In addition, the algorithm does not require other problem-dependent parameters as input, including the gradient's Lipschitz constant or the target accuracy $ε$. Numerical results illustrate that the proposed method is promising.

preprint2021arXiv

Non-approximate Inference for Collective Graphical Models on Path Graphs via Discrete Difference of Convex Algorithm

The importance of aggregated count data, which is calculated from the data of multiple individuals, continues to increase. Collective Graphical Model (CGM) is a probabilistic approach to the analysis of aggregated data. One of the most important operations in CGM is maximum a posteriori (MAP) inference of unobserved variables under given observations. Because the MAP inference problem for general CGMs has been shown to be NP-hard, an approach that solves an approximate problem has been proposed. However, this approach has two major drawbacks. First, the quality of the solution deteriorates when the values in the count tables are small, because the approximation becomes inaccurate. Second, since continuous relaxation is applied, the integrality constraints of the output are violated. To resolve these problems, this paper proposes a new method for MAP inference for CGMs on path graphs. First we show that the MAP inference problem can be formulated as a (non-linear) minimum cost flow problem. Then, we apply Difference of Convex Algorithm (DCA), which is a general methodology to minimize a function represented as the sum of a convex function and a concave function. In our algorithm, important subroutines in DCA can be efficiently calculated by minimum convex cost flow algorithms. Experiments show that the proposed method outputs higher quality solutions than the conventional approach.

preprint2014arXiv

Properties of powers of functions satisfying second-order linear differential equations with applications to statistics

We derive properties of powers of a function satisfying a second-order linear differential equation. In particular we prove that the n-th power of the function satisfies an (n+1)-th order differential equation and give a simple method for obtaining the differential equation. Also we determine the exponents of the differential equation and derive a bound for the degree of the polynomials, which are coefficients in the differential equation. The bound corresponds to the order of differential equation satisfied by the n-fold convolution of the Fourier transform of the function. These results are applied to some probability density functions used in statistics.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint