Source author record

Haoyang Liu

Haoyang Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.OC, Machine Learning, math.NA, math.PR, math.ST, Multiagent Systems, Numerical Analysis, Statistics Theory

Catalog footprint

What is connected

6 works

8 topics

4 close collaborators

preprint · 2026 · arXiv

A reconstructed discontinuous approximation for distributed elliptic control problems

In this paper, we present and analyze an interior penalty discontinuous Galerkin method for distributed elliptic optimal control problems. It is based on a reconstructed discontinuous approximation that admits arbitrarily high-order approximation spaces with only one unknown per element. Applying this method, we develop a discretization scheme that approximates the state and adjoint variables in the approximation space. Our main contributions are twofold: (1) the derivation of both a priori and a posteriori error estimates in the $L^2$-norm and the energy norms, and (2) the implementation of an efficiently solvable discrete system, which is solved via a linearly convergent projected gradient descent method. Numerical experiments verify the convergence order of the a priori error estimate and the efficiency of the a posteriori error estimate.
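The abstract mentions a projected gradient descent solver for the discrete system. As a rough illustration only, the sketch below applies projected gradient descent to a small box-constrained quadratic problem. The matrix `A`, vector `b`, bounds, and step size are hypothetical toy values, not the paper's discretization.

```python
def projected_gradient_descent(A, b, lo, hi, step, iters=200):
    """Minimize 0.5*x^T A x - b^T x subject to lo <= x_i <= hi (toy sketch)."""
    n = len(b)
    # Start from 0 clipped into the box, so the initial point is feasible.
    x = [min(max(0.0, lo), hi)] * n
    for _ in range(iters):
        # Gradient of the quadratic objective: A x - b.
        grad = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
        # Gradient step followed by projection onto the box [lo, hi]^n.
        x = [min(max(x[i] - step * grad[i], lo), hi) for i in range(n)]
    return x


# Hypothetical toy data: the unconstrained minimizer (2, -0.5) lies outside the box.
A = [[2.0, 0.0], [0.0, 2.0]]
b = [4.0, -1.0]
x_star = projected_gradient_descent(A, b, lo=0.0, hi=1.0, step=0.4)
```

For this toy problem the projection clips the unconstrained minimizer to the nearest point of the box, which is why the iterates settle on a corner.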

preprint · 2026 · arXiv

Bi-Mem: Bidirectional Construction of Hierarchical Memory for Personalized LLMs via Inductive-Reflective Agents

Constructing memory from users' long-term conversations overcomes LLMs' contextual limitations and enables personalized interactions. Recent studies focus on hierarchical memory to model users' multi-granular behavioral patterns via clustering and aggregating historical conversations. However, conversational noise and memory hallucinations can be amplified during clustering, causing locally aggregated memories to misalign with the user's global persona. To mitigate this issue, we propose Bi-Mem, an agentic framework ensuring hierarchical memory fidelity through bidirectional construction. Specifically, we deploy an inductive agent to form the hierarchical memory: it extracts factual information from raw conversations to form fact-level memory, aggregates them into thematic scenes (i.e., local scene-level memory) using graph clustering, and infers users' profiles as global persona-level memory. Simultaneously, a reflective agent is designed to calibrate local scene-level memories using global constraints derived from the persona-level memory, thereby enforcing global-local alignment. For coherent memory recall, we propose an associative retrieval mechanism: beyond initial hierarchical search, a spreading activation process allows facts to evoke contextual scenes, while scene-level matches retrieve salient supporting factual information. Empirical evaluations demonstrate that Bi-Mem achieves significant improvements in question answering performance on long-term personalized conversational tasks.
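The associative retrieval idea (facts evoke scenes, and scenes pull in supporting facts) can be sketched roughly as below. This is not Bi-Mem's implementation: the keyword-overlap scoring, the `facts` and `scene_of` structures, and the function name are all assumptions made for illustration.

```python
def associative_retrieve(query_terms, facts, scene_of, top_k=1):
    """Toy two-way retrieval: matched facts activate scenes, scenes recall their facts."""
    # Initial search: score each fact by keyword overlap with the query (assumed scoring).
    scores = {
        fid: len(query_terms & set(text.lower().split()))
        for fid, text in facts.items()
    }
    # Spreading step: every matched fact passes its score to the scene it belongs to.
    activation = {}
    for fid, score in scores.items():
        if score > 0:
            scene = scene_of[fid]
            activation[scene] = activation.get(scene, 0) + score
    # Keep the most activated scenes; ties are broken alphabetically.
    top_scenes = sorted(activation, key=lambda s: (-activation[s], s))[:top_k]
    # Reverse direction: each selected scene recalls all of its member facts.
    recalled = sorted(fid for fid, s in scene_of.items() if s in top_scenes)
    return top_scenes, recalled


# Hypothetical memory contents.
facts = {
    "f1": "user adopted a dog named milo",
    "f2": "user walks milo every morning",
    "f3": "user works as a nurse",
}
scene_of = {"f1": "pets", "f2": "pets", "f3": "career"}
scenes, recalled = associative_retrieve({"dog"}, facts, scene_of)
```

Here the query matches only `f1` directly, yet `f2` is also recalled because it shares the activated scene.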

preprint · 2022 · arXiv

Augmented Lagrangian Methods for Time-varying Constrained Online Convex Optimization

In this paper, we consider online convex optimization (OCO) with time-varying loss and constraint functions. Specifically, the decision maker chooses sequential decisions based only on past information, while the loss and constraint functions are revealed over time. We first develop a class of model-based augmented Lagrangian methods (MALM) for time-varying functional constrained OCO (without feedback delay). Under standard assumptions, we establish sublinear regret and sublinear constraint violation of MALM. Furthermore, we extend MALM to deal with time-varying functional constrained OCO with delayed feedback, in which the feedback information of loss and constraint functions is revealed to the decision maker with delays. Without additional assumptions, we also establish sublinear regret and sublinear constraint violation for the delayed version of MALM. Finally, numerical results for several examples of constrained OCO, including online network resource allocation, online logistic regression and online quadratically constrained quadratic programming, are presented to demonstrate the efficiency of the proposed algorithms.
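To make the setting concrete, here is a minimal sketch of a generic first-order augmented Lagrangian update for a one-dimensional constrained online problem. It is not MALM itself: the loss `(x - a_t)^2`, the constraint `x - c_t <= 0`, the step size `eta`, and the penalty parameter `sigma` are assumptions chosen for illustration.

```python
def online_al_step(x, lam, a_t, c_t, eta=0.1, sigma=1.0):
    """One round: update the decision and multiplier from the functions just revealed."""
    # Revealed loss f_t(x) = (x - a_t)^2 and constraint g_t(x) = x - c_t <= 0 (assumed forms).
    g = x - c_t
    # Multiplier estimate implied by the augmented Lagrangian for an inequality constraint.
    shifted = max(lam + sigma * g, 0.0)
    # Gradient of the augmented Lagrangian in x; note g_t'(x) = 1 here.
    grad = 2.0 * (x - a_t) + shifted
    x_next = x - eta * grad
    # Dual ascent step, projected onto the nonnegative reals.
    lam_next = max(lam + sigma * (x_next - c_t), 0.0)
    return x_next, lam_next


# Hypothetical stationary stream: the constrained minimizer is x = 1 with multiplier 2.
x, lam = 0.0, 0.0
for t in range(400):
    a_t, c_t = 2.0, 1.0  # only revealed after x has been played
    x, lam = online_al_step(x, lam, a_t, c_t)
```

With a stationary stream the iterates approach the constrained minimizer, which is a quick sanity check before trying time-varying `a_t` and `c_t`.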

preprint · 2022 · arXiv

Regrets of Proximal Method of Multipliers for Online Non-convex Optimization with Long Term Constraints

The online optimization problem with non-convex loss functions over a closed convex set, coupled with a set of inequality (possibly non-convex) constraints, is a challenging online learning problem. A proximal method of multipliers with quadratic approximations (named OPMM) is presented to solve this online non-convex optimization problem with long term constraints. Regrets of the violation of Karush-Kuhn-Tucker conditions of OPMM for solving online non-convex optimization problems are analyzed. Under mild conditions, it is shown that this algorithm exhibits $\mathcal{O}(T^{-1/8})$ Lagrangian gradient violation regret, $\mathcal{O}(T^{-1/8})$ constraint violation regret and $\mathcal{O}(T^{-1/4})$ complementarity residual regret if parameters in the algorithm are properly chosen, where $T$ denotes the number of time periods. For the case where the objective is a convex quadratic function, we demonstrate that the regret of the objective reduction can be established even if the feasible set is non-convex. For the case where the constraint functions are convex, if the solution of the subproblem in OPMM is obtained by solving its dual, OPMM is proved to be an implementable projection method for solving the online non-convex optimization problem.

preprint · 2015 · arXiv

On the Marčenko-Pastur law for linear time series

This paper is concerned with extensions of the classical Marčenko-Pastur law to time series. Specifically, $p$-dimensional linear processes are considered which are built from innovation vectors with independent, identically distributed (real- or complex-valued) entries possessing zero mean, unit variance and finite fourth moments. The coefficient matrices of the linear process are assumed to be simultaneously diagonalizable. In this setting, the limiting behavior of the empirical spectral distribution of both sample covariance and symmetrized sample autocovariance matrices is determined in the high-dimensional setting $p/n\to c\in (0,\infty)$, in which dimension $p$ and sample size $n$ diverge to infinity at the same rate. The results extend existing contributions available in the literature for the covariance case and are among the first of their kind for the autocovariance case.
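For reference, the classical Marčenko-Pastur law that this abstract extends can be written as follows for i.i.d. entries with zero mean and unit variance and ratio $c \in (0,1]$. This is the standard textbook form for the sample covariance matrix, not a formula quoted from the paper; for $c > 1$ the limit additionally carries a point mass $1 - 1/c$ at zero.

```latex
f_c(x) = \frac{\sqrt{(b - x)(x - a)}}{2\pi c\, x}\,\mathbf{1}_{[a,\,b]}(x),
\qquad a = (1 - \sqrt{c})^2, \quad b = (1 + \sqrt{c})^2 .
```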

preprint · 2011 · arXiv

Learning in A Changing World: Restless Multi-Armed Bandit with Unknown Dynamics

We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics in which a player chooses M out of N arms to play at each time. The reward state of each arm transits according to an unknown Markovian rule when it is played and evolves according to an arbitrary unknown random process when it is passive. The performance of an arm selection policy is measured by regret, defined as the reward loss with respect to the case where the player knows which M arms are the most rewarding and always plays the M best arms. We construct a policy with an interleaving exploration and exploitation epoch structure that achieves a regret with logarithmic order when arbitrary (but nontrivial) bounds on certain system parameters are known. When no knowledge about the system is available, we show that the proposed policy achieves a regret arbitrarily close to the logarithmic order. We further extend the problem to a decentralized setting where multiple distributed players share the arms without information exchange. Under both an exogenous restless model and an endogenous restless model, we show that a decentralized extension of the proposed policy preserves the logarithmic regret order as in the centralized setting. The results apply to adaptive learning in various dynamic systems and communication networks, as well as financial investment.
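One way to picture an interleaved exploration and exploitation epoch structure is the scheduling sketch below. It is an assumption-laden illustration, not the paper's policy: the rule "explore whenever cumulative exploration time is below `d * log(t + 1)`", the geometric epoch growth, and the constants `d` and `growth` are all hypothetical.

```python
import math


def plan_epochs(horizon, d=5.0, growth=2):
    """Plan alternating epochs so that exploration time grows roughly like log(t)."""
    t = 1                 # next time slot to be scheduled
    explore_time = 0      # total slots assigned to exploration so far
    n_explore = 0         # number of exploration epochs started
    n_exploit = 0         # number of exploitation epochs started
    plan = []
    while t <= horizon:
        if explore_time < d * math.log(t + 1):
            # Not enough exploration yet: start a (geometrically longer) exploration epoch.
            phase, length = "explore", growth ** n_explore
            n_explore += 1
        else:
            # Enough exploration: start a (geometrically longer) exploitation epoch.
            phase, length = "exploit", growth ** n_exploit
            n_exploit += 1
        # Truncate the final epoch so the plan ends exactly at the horizon.
        length = min(length, horizon - t + 1)
        if phase == "explore":
            explore_time += length
        plan.append((phase, length))
        t += length
    return plan


plan = plan_epochs(10_000)
total_slots = sum(length for _, length in plan)
explore_slots = sum(length for phase, length in plan if phase == "explore")
```

In this toy schedule only a small fraction of the horizon is spent exploring, which is the qualitative behavior a logarithmic-regret policy needs.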