Researcher profile

Atilla Eryilmaz

Atilla Eryilmaz contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
9 works
0 followers
6 topics
4 close collaborators

Published work

9 published items

Preprint · 2026 · arXiv

PMF-CL: Pareto-Minimal-Forgetting Continual Learner for Conflicting Tasks

In the literature, many continual learning (CL) algorithms have been proposed to address the issue of catastrophic forgetting in ML models (i.e., learning new tasks leads to the loss of performance on previously learned tasks). Although all CL approaches use some form of memory to retain information about past tasks, a grounded understanding of what information needs to be stored to minimize catastrophic forgetting remains elusive. Recently, it has been recognized that under the strong assumption of the existence of a common global minimizer over all tasks, catastrophic forgetting can be completely avoided. However, in practice, tasks rarely have a common global minimizer, and a certain amount of forgetting is inevitable. In this paper, we propose a foundational framework for principled and systematic CL of conflicting tasks using a multi-task learning (MTL) perspective. The approach is based on finding Pareto-optimal solutions, i.e., the solutions which, by definition, minimally forget the previous tasks in the Pareto sense. We derive Pareto-minimal-forgetting CL algorithms for linear and basis-function regression, as well as for general loss functions that have a quadratic upper bound, e.g., logistic regression. For quadratic problems, PMF-CL uses memory-efficient iterative updates with a static memory footprint of $\mathcal{O}(d^2)$ for models with $d$ parameters.
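
To make the $\mathcal{O}(d^2)$ memory claim concrete, here is a minimal sketch for two conflicting least-squares tasks. It is not the paper's PMF-CL algorithm: it assumes each task's quadratic loss is summarized by a $d \times d$ Gram matrix and a $d$-vector, and it swaps in weighted-sum scalarization as a simple way to pick one Pareto-optimal point. The names `task_stats`, `scalarized_solution`, `loss`, and the weight `lam` are hypothetical.

```python
import numpy as np

def task_stats(X, y):
    """Summarize a least-squares task by its Gram matrix and moment vector (O(d^2) memory)."""
    return X.T @ X, X.T @ y

def scalarized_solution(old, new, lam):
    """Minimize lam * L_old(w) + (1 - lam) * L_new(w) for quadratic losses.

    For 0 < lam < 1 and a positive-definite combined Gram matrix, the unique
    minimizer of this weighted sum is Pareto-optimal for the two tasks.
    """
    (A_old, b_old), (A_new, b_new) = old, new
    A = lam * A_old + (1.0 - lam) * A_new
    b = lam * b_old + (1.0 - lam) * b_new
    return np.linalg.solve(A, b)

def loss(X, y, w):
    """Least-squares loss 0.5 * ||X w - y||^2."""
    return 0.5 * float(np.sum((X @ w - y) ** 2))

rng = np.random.default_rng(0)
d = 3
X1, X2 = rng.normal(size=(50, d)), rng.normal(size=(50, d))
# Conflicting tasks: different ground-truth parameters, so no common minimizer.
y1 = X1 @ np.array([1.0, 0.0, -1.0])
y2 = X2 @ np.array([-1.0, 2.0, 0.5])

# Only the summaries are kept; the raw data of the old task is not needed again.
old, new = task_stats(X1, y1), task_stats(X2, y2)
w_mid = scalarized_solution(old, new, lam=0.5)
w_old_heavy = scalarized_solution(old, new, lam=0.9)
```

Raising `lam` moves the solution along the Pareto front toward the old task: its loss on the old task does not increase, and its loss on the new task does not decrease.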

Preprint · 2024 · arXiv

Optimal Push and Pull-Based Edge Caching For Dynamic Content

We introduce a framework for optimal 'fresh' caching in a content distribution network (CDN) comprising a front-end local cache and a back-end database. The data content is dynamically updated at the back-end database, and end-users are interested in the most recent version of that content. We formulate the average cost minimization problem that captures the system's cost due to the service of aging content as well as the regular cache update cost. We consider the cost minimization problem from two individual perspectives based on the information available to either side of the CDN: the back-end database perspective and the front-end local cache perspective. For the back-end database, the instantaneous version of content is observable but the exact demand is not. Caching decisions made by the back-end database are termed 'push-based caching'. For the front-end local cache, the age of the content version in the cache is not observable, yet the instantaneous demand is. Caching decisions made by the front-end local cache are termed 'pull-based caching'. Our investigations reveal which type of information, update dynamics or demand dynamics, is of higher value for achieving the minimum cost, depending on other network parameters including content popularity, update rate, and demand intensity.

Preprint · 2023 · arXiv

Age-Optimal Multi-Channel-Scheduling under Energy and Tolerance Constraints

We study the optimal scheduling problem where n source nodes attempt to transmit updates over L shared wireless on/off fading channels to optimize their age performance under energy and age-violation tolerance constraints. Specifically, we provide a generic formulation of age-optimization in the form of a constrained Markov Decision Process (CMDP), and obtain the optimal scheduler as the solution of an associated Linear Programming problem. We investigate the characteristics of the optimal single-user multi-channel scheduler for the important special cases of average-age and violation-rate minimization. This leads to several key insights on the nature of the optimal allocation of the limited energy, where a usual threshold-based policy does not apply; these insights will be useful in guiding scheduler designers. We then investigate the stability region of the optimal scheduler for the multi-user case. We also develop an online scheduler using Lyapunov-drift-minimization methods that do not require the knowledge of channel statistics. Our numerical studies compare the stability region of our online scheduler to that of the optimal scheduler and reveal that it performs close to the optimal scheduler even with unknown channel statistics.

Preprint · 2022 · arXiv

A Lyapunov-Based Methodology for Constrained Optimization with Bandit Feedback

In a wide variety of applications including online advertising, contractual hiring, and wireless scheduling, the controller is subject to a stringent budget constraint on the available resources, which are consumed in a random amount by each action, and a stochastic feasibility constraint that may impose important operational limitations on decision-making. In this work, we consider a general model to address such problems, where each action returns a random reward, cost, and penalty from an unknown joint distribution, and the decision-maker aims to maximize the total reward under a budget constraint $B$ on the total cost and a stochastic constraint on the time-average penalty. We propose a novel low-complexity algorithm based on Lyapunov optimization methodology, named ${\tt LyOn}$, and prove that for $K$ arms it achieves $O(\sqrt{K B\log B})$ regret and zero constraint-violation when $B$ is sufficiently large. The low computational cost and sharp performance bounds of ${\tt LyOn}$ suggest that Lyapunov-based algorithm design methodology can be effective in solving constrained bandit optimization problems.

Preprint · 2020 · arXiv

Budget-Constrained Bandits over General Cost and Reward Distributions

We consider a budget-constrained bandit problem where each arm pull incurs a random cost, and yields a random reward in return. The objective is to maximize the total expected reward under a budget constraint on the total cost. The model is general in the sense that it allows correlated and potentially heavy-tailed cost-reward pairs that can take on negative values as required by many applications. We show that if moments of order $(2+\gamma)$ for some $\gamma > 0$ exist for all cost-reward pairs, $O(\log B)$ regret is achievable for a budget $B>0$. In order to achieve tight regret bounds, we propose algorithms that exploit the correlation between the cost and reward of each arm by extracting the common information via linear minimum mean-square error estimation. We prove a regret lower bound for this problem, and show that the proposed algorithms achieve tight problem-dependent regret bounds, which are optimal up to a universal constant factor in the case of jointly Gaussian cost and reward pairs.
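
The linear minimum mean-square error (LMMSE) step mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's algorithm; it only shows the standard LMMSE estimate of one arm's reward from its correlated cost, computed from sample moments. The function name `lmmse_fit` and the synthetic cost-reward data are hypothetical.

```python
import numpy as np

def lmmse_fit(cost, reward):
    """Return (slope, intercept) of the LMMSE estimate reward_hat = slope * cost + intercept."""
    cov = np.cov(cost, reward)          # 2x2 sample covariance matrix
    slope = cov[0, 1] / cov[0, 0]       # Cov(cost, reward) / Var(cost)
    intercept = reward.mean() - slope * cost.mean()
    return slope, intercept

# Synthetic, correlated cost-reward samples for a single arm (assumed for illustration).
rng = np.random.default_rng(1)
cost = rng.normal(loc=2.0, scale=1.0, size=5000)
reward = 0.8 * cost + rng.normal(scale=0.5, size=5000)

slope, intercept = lmmse_fit(cost, reward)
# The part of the reward that the cost does not explain.
residual = reward - (slope * cost + intercept)
```

The residual has lower variance than the raw reward whenever cost and reward are correlated, which is one plausible reading of why exploiting the correlation can tighten estimates.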

Preprint · 2020 · arXiv

Continuous-Time Multi-Armed Bandits with Controlled Restarts

Time-constrained decision processes have been ubiquitous in many fundamental applications in physics, biology and computer science. Recently, restart strategies have gained significant attention for boosting the efficiency of time-constrained processes by expediting the completion times. In this work, we investigate the bandit problem with controlled restarts for time-constrained decision processes, and develop provably good learning algorithms. In particular, we consider a bandit setting where each decision takes a random completion time, and yields a random and correlated reward at the end, with unknown values at the time of decision. The goal of the decision-maker is to maximize the expected total reward subject to a time constraint $\tau$. As an additional control, we allow the decision-maker to interrupt an ongoing task and forgo its reward for a potentially more rewarding alternative. For this problem, we develop efficient online learning algorithms with $O(\log(\tau))$ and $O(\sqrt{\tau\log(\tau)})$ regret in a finite and continuous action space of restart strategies, respectively. We demonstrate the applicability of our algorithm by using it to boost the performance of SAT solvers.

Preprint · 2020 · arXiv

Group-Fair Online Allocation in Continuous Time

The theory of discrete-time online learning has been successfully applied in many problems that involve sequential decision-making under uncertainty. However, in many applications including contractual hiring in online freelancing platforms and server allocation in cloud computing systems, the outcome of each action is observed only after a random and action-dependent time. Furthermore, as a consequence of certain ethical and economic concerns, the controller may impose deadlines on the completion of each task, and require fairness across different groups in the allocation of the total time budget $B$. In order to address these applications, we consider a continuous-time online learning problem with fairness considerations, and present a novel framework based on continuous-time utility maximization. We show that this formulation recovers reward-maximizing, max-min fair and proportionally fair allocation rules across different groups as special cases. We characterize the optimal offline policy, which allocates the total time between different actions in an optimally fair way (as defined by the utility function), and imposes deadlines to maximize time-efficiency. In the absence of any statistical knowledge, we propose a novel online learning algorithm based on dual ascent optimization for time averages, and prove that it achieves an $\tilde{O}(B^{-1/2})$ regret bound.

Preprint · 2019 · arXiv

Predictive Scheduling for Virtual Reality

A significant challenge for future virtual reality (VR) applications is to deliver high quality-of-experience, both in terms of video quality and responsiveness, over wireless networks with limited bandwidth. This paper proposes to address this challenge by leveraging the predictability of user movements in the virtual world. We consider a wireless system where an access point (AP) serves multiple VR users. We show that the VR application process consists of two distinctive phases, whereby during the first (proactive scheduling) phase the controller has uncertain predictions of the demand that will arrive at the second (deadline scheduling) phase. We then develop a predictive scheduling policy for the AP that jointly optimizes the scheduling decisions in both phases. In addition to our theoretical study, we demonstrate the usefulness of our policy by building a prototype system. We show that our policy can be implemented under Furion, a Unity-based VR gaming software, with minor modifications. Experimental results show a visible difference between our policy and the default one. We also conduct extensive simulation studies, which show that our policy not only outperforms others, but also maintains excellent performance even when the prediction of future user movements is not accurate.

Preprint · 2010 · arXiv

Scheduling with Rate Adaptation under Incomplete Knowledge of Channel/Estimator Statistics

In time-varying wireless networks, the states of the communication channels are subject to random variations, and hence need to be estimated for efficient rate adaptation and scheduling. The estimation mechanism possesses inaccuracies that need to be tackled in a probabilistic framework. In this work, we study scheduling with rate adaptation in single-hop queueing networks under two levels of channel uncertainty: when the channel estimates are inaccurate but complete knowledge of the channel/estimator joint statistics is available at the scheduler; and when the knowledge of the joint statistics is incomplete. In the former case, we characterize the network stability region and show that a maximum-weight type scheduling policy is throughput-optimal. In the latter case, we propose a joint channel-statistics learning and scheduling policy. With an associated trade-off in average packet delay and convergence time, the proposed policy has a stability region arbitrarily close to the stability region of the network under full knowledge of channel/estimator joint statistics.
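
A maximum-weight rule of the kind this abstract refers to can be sketched as follows. This is a generic textbook form, not the paper's exact policy: it assumes a single server that serves one user per slot, and that `expected_rates` already holds each user's expected service rate given the current channel estimate. The function name and the example numbers are hypothetical.

```python
def max_weight_schedule(queue_lengths, expected_rates):
    """Pick the user with the largest queue-length-times-rate product."""
    weights = [q * r for q, r in zip(queue_lengths, expected_rates)]
    # Index of the largest weight; ties resolve to the lowest index.
    return max(range(len(weights)), key=weights.__getitem__)

queues = [5, 2, 8]        # backlog of each user, in packets
rates = [1.0, 3.0, 0.5]   # assumed expected service rate of each user
chosen = max_weight_schedule(queues, rates)
```

With these numbers the weights are 5.0, 6.0 and 4.0, so the rule serves the second user (index 1) even though the third user has the longest queue.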