Source author record

Rujun Jiang

Rujun Jiang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.OC eess.SY Systems and Control

Catalog footprint

What is connected

5 works

3 topics

4 close collaborators

Preprint · 2025 · arXiv

Adaptive Algorithms for Nonconvex Bilevel Optimization under PŁ Conditions

Existing methods for nonconvex bilevel optimization (NBO) require prior knowledge of first- and second-order problem-specific parameters (e.g., Lipschitz constants and the Polyak-Łojasiewicz (PŁ) parameters) to set step sizes, a requirement that poses practical limitations when such parameters are unknown or computationally expensive. We introduce the Adaptive Fully First-order Bilevel Approximation (AF${}^2$BA) algorithm and its accelerated variant, A${}^2$F${}^2$BA, for solving NBO problems under the PŁ conditions. To our knowledge, these are the first methods to employ fully adaptive step size strategies, eliminating the need for any problem-specific parameters in NBO. We prove that both algorithms achieve $\mathcal{O}(1/ε^2)$ iteration complexity for finding an $ε$-stationary point, matching the iteration complexity of existing well-tuned methods. Furthermore, we show that A${}^2$F${}^2$BA enjoys a near-optimal first-order oracle complexity of $\tilde{\mathcal{O}}(1/ε^2)$, matching the oracle complexity of existing well-tuned methods, and aligning with the complexity of gradient descent for smooth nonconvex single-level optimization when ignoring the logarithmic factors.
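For readers unfamiliar with the PŁ condition named in this abstract, the inequality is usually stated as below. This is the standard textbook form, not copied from the paper: the symbols $g$, $y$, $g^*$ and $\mu$ are notation assumed here, and the paper's exact assumptions (for example, which functions must satisfy it) may differ.

```latex
% Standard PŁ inequality (notation assumed for illustration):
% g is differentiable, g^* is its minimum value, and \mu > 0 is the PŁ parameter.
\frac{1}{2}\,\|\nabla g(y)\|^{2} \;\ge\; \mu\,\bigl(g(y) - g^{*}\bigr)
\qquad \text{for all } y .
```

The condition does not require convexity, and $\mu$ is one of the problem-specific parameters that the abstract says the adaptive methods no longer need to know.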

Preprint · 2024 · arXiv

A Unified Framework for Rank-based Loss Minimization

The empirical loss, commonly referred to as the average loss, is extensively utilized for training machine learning models. However, in order to address the diverse performance requirements of machine learning models, the use of the rank-based loss is prevalent, replacing the empirical loss in many cases. The rank-based loss comprises a weighted sum of sorted individual losses, encompassing both convex losses like the spectral risk, which includes the empirical risk and conditional value-at-risk, and nonconvex losses such as the human-aligned risk and the sum of the ranked range loss. In this paper, we introduce a unified framework for the optimization of the rank-based loss through the utilization of a proximal alternating direction method of multipliers. We demonstrate the convergence and convergence rate of the proposed algorithm under mild conditions. Experiments conducted on synthetic and real datasets illustrate the effectiveness and efficiency of the proposed algorithm.
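The definition in this abstract, a weighted sum of sorted individual losses, can be illustrated with a minimal Python sketch. It only evaluates the loss; it is not the proximal ADMM algorithm of the paper. The function names and the toy loss values are hypothetical, and the tail-average weights are a simplified stand-in for conditional value-at-risk.

```python
def rank_based_loss(losses, weights):
    """Weighted sum of the individual losses sorted in ascending order."""
    sorted_losses = sorted(losses)
    if len(sorted_losses) != len(weights):
        raise ValueError("need exactly one weight per sorted loss")
    return sum(w * l for w, l in zip(weights, sorted_losses))


def average_weights(n):
    """Uniform weights: recovers the empirical (average) loss."""
    return [1.0 / n] * n


def tail_average_weights(n, k):
    """Weights that average only the k largest losses (a CVaR-style tail average)."""
    return [0.0] * (n - k) + [1.0 / k] * k


# Toy per-sample losses, made up for illustration.
losses = [0.2, 1.5, 0.7, 3.0]
avg_loss = rank_based_loss(losses, average_weights(len(losses)))
tail_loss = rank_based_loss(losses, tail_average_weights(len(losses), 2))
print(avg_loss, tail_loss)
```

Changing only the weight vector switches between the losses listed in the abstract, which is what makes a single unified solver plausible.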

Preprint · 2022 · arXiv

Solving Stackelberg Prediction Game with Least Squares Loss via Spherically Constrained Least Squares Reformulation

The Stackelberg prediction game (SPG) is popular in characterizing strategic interactions between a learner and an attacker. As an important special case, the SPG with least squares loss (SPG-LS) has recently received much research attention. Although initially formulated as a difficult bi-level optimization problem, SPG-LS admits tractable reformulations that can be solved globally in polynomial time by semidefinite programming or second-order cone programming. However, none of the available approaches is well suited to large-scale datasets, especially those with huge numbers of features. In this paper, we explore an alternative reformulation of the SPG-LS. By a novel nonlinear change of variables, we rewrite the SPG-LS as a spherically constrained least squares (SCLS) problem. Theoretically, we show that an $ε$-optimal solution to the SCLS (and the SPG-LS) can be achieved in $\tilde{O}(N/\sqrt{ε})$ floating-point operations, where $N$ is the number of nonzero entries in the data matrix. Practically, we apply two well-known methods for solving this new reformulation, i.e., the Krylov subspace method and the Riemannian trust region method. Both algorithms are factorization-free, so they are suitable for solving large-scale problems. Numerical results on both synthetic and real-world datasets indicate that the SPG-LS, equipped with the SCLS reformulation, can be solved orders of magnitude faster than the state of the art.

Preprint · 2022 · arXiv

Tackling A Class of Hard Subset-Sum Problems: Integration of Lattice Attacks with Disaggregation Techniques

Subset-sum problems belong to the NP class and play an important role in both complexity theory and knapsack-based cryptosystems; they have been shown in the literature to be hardest when the so-called density approaches one. Lattice attacks, which are acknowledged in the literature as the most effective methods, fail occasionally even when the number of unknown variables is of medium size. In this paper we propose a modular disaggregation technique and a simplified lattice formulation, based on which two lattice attack algorithms are further designed. We introduce the new concept of "jump points" in our disaggregation technique, and derive inequality conditions to identify superior jump points that can more easily cut off undesirable short integer solutions. Empirical tests have been conducted to show that integrating the disaggregation technique with lattice attacks can effectively raise success ratios to 100% for randomly generated problems with density one and of dimensions up to 100. Finally, statistical regressions are conducted to test significant features, thus revealing reasonable factors behind the empirical success of the algorithms and techniques proposed in this paper.
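The "density" mentioned in this abstract is not defined on this page. A minimal sketch, assuming the usual definition from the lattice-attack literature (number of items divided by the base-2 logarithm of the largest item), is given below. The function name and the example instance are hypothetical, and the paper may use a variant of this definition.

```python
import math


def subset_sum_density(items):
    """Density n / log2(max item), assuming the common literature definition."""
    if not items or max(items) <= 1:
        raise ValueError("need at least one item larger than 1")
    return len(items) / math.log2(max(items))


# Hypothetical instance: 4 items, largest item 16 = 2**4, so the density is 4 / 4.
example_items = [3, 5, 9, 16]
density = subset_sum_density(example_items)
print(density)
```

Under this definition, density one means the items have roughly as many bits as there are items, the regime the abstract describes as hardest.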

Preprint · 2020 · arXiv

Hölderian error bounds and Kurdyka-Łojasiewicz inequality for the trust region subproblem

In this paper, we study the local variational geometry of the optimal solution set of the trust region subproblem (TRS), which minimizes a general, possibly nonconvex, quadratic function over the unit ball. Specifically, we demonstrate that a Hölderian error bound holds globally for the TRS with modulus 1/4 and the Kurdyka-Łojasiewicz (KL) inequality holds locally for the TRS with a KL exponent 3/4 at any optimal solution. We further prove that, except in a special case, the Hölderian error bound modulus, as well as the KL exponent, is 1/2. Finally, based on the obtained KL property, we show that the projected gradient methods studied in [A. Beck and Y. Vaisbourd, SIAM J. Optim., 28 (2018), pp. 1951-1967] for solving the TRS achieve a sublinear or even linear rate of convergence.
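To make the TRS and the projected gradient idea concrete, here is a minimal Python sketch. It is a generic projected gradient iteration, not necessarily the variants analyzed in the cited reference. It assumes the objective is written as 0.5 x^T A x + b^T x with A symmetric, and it uses the Frobenius norm of A as a conservative step-size constant. The matrix, vector and function names are made up for illustration.

```python
import math


def objective(A, b, x):
    """Quadratic 0.5 * x^T A x + b^T x (A is assumed symmetric)."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad + sum(bi * xi for bi, xi in zip(b, x))


def gradient(A, b, x):
    """Gradient A x + b of the quadratic above."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]


def project_unit_ball(x):
    """Euclidean projection onto the unit ball {x : ||x|| <= 1}."""
    norm = math.sqrt(sum(v * v for v in x))
    return list(x) if norm <= 1.0 else [v / norm for v in x]


def projected_gradient_trs(A, b, x0, iters=500):
    """Projected gradient with step 1/L, where L is the Frobenius norm of A.

    The Frobenius norm upper-bounds the spectral norm, so 1/L is a safe
    (if conservative) step size; the fallback 1.0 covers A = 0.
    """
    L = math.sqrt(sum(a * a for row in A for a in row)) or 1.0
    x = project_unit_ball(x0)
    for _ in range(iters):
        g = gradient(A, b, x)
        x = project_unit_ball([xi - gi / L for xi, gi in zip(x, g)])
    return x


# Hypothetical nonconvex instance: A has one negative eigenvalue.
A = [[1.0, 0.0], [0.0, -2.0]]
b = [0.5, 0.1]
x_start = [0.0, 0.0]
x_final = projected_gradient_trs(A, b, x_start)
print(x_final, objective(A, b, x_final))
```

Because the problem may be nonconvex, this sketch only guarantees that the objective does not increase from one iterate to the next; it makes no claim about reaching a global minimizer.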