Source author record

Ilan Doron-Arad

Ilan Doron-Arad appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms Machine Learning

Catalog footprint

What is connected

2works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

A Theory of Online Learning with Autoregressive Chain-of-Thought Reasoning

Autoregressive generation lies at the heart of the mechanism of large language models. It can be viewed as the repeated application of a next-token generator: starting from an input string (prompt), the generator is applied for $M$ steps, and the last generated token is taken as the final output. [Joshi et al., 2025] proposed a PAC model for studying the learnability of the input-output maps arising from this process. We develop an online analogue of this framework, focusing on the mistake bound of learning the final output induced by an unknown next-token generator. We distinguish between two forms of feedback. In the End-to-End model, after each round the learner observes only the final token produced after $M$ autoregressive steps. In the Chain-of-Thought model, the learner is additionally shown the entire $M$-step trajectory. Our goal is to understand how the optimal mistake bound depends on the generation horizon $M$, and to what extent observing intermediate tokens can reduce this dependence. Our main results show that the online theory of autoregressive learning exhibits a qualitative picture analogous to the statistical one found by [Hanneke et al., 2026], but with a different scale of dependence on the generation horizon. In the End-to-End model, we prove a taxonomy of possible mistake-bound growth rates in the generation horizon $M$: essentially any rate between constant and logarithmic can arise. We further show that this logarithmic ceiling is unavoidable. In the Chain-of-Thought model, we show that access to the full generated trajectory eliminates the dependence on $M$ altogether. We also analyze autoregressive linear threshold classes, and prove optimal mistake bounds, as well as a new lower bound for the statistical setting. Along the way, our results resolve several questions left open by [Joshi et al., 2025].

preprint2022arXiv

An EPTAS for Budgeted Matroid Independent Set

We consider the budgeted matroid independent set problem. The input is a ground set, where each element has a cost and a non-negative profit, along with a matroid over the elements and a budget. The goal is to select a subset of elements which maximizes the total profit subject to the matroid and budget constraints. Several well known special cases, where we have, e.g., a uniform matroid and a budget, or no matroid constraint (i.e., the classic knapsack problem), admit a fully polynomial-time approximation scheme (FPTAS). In contrast, already a slight generalization to the multi-budgeted matroid independent set problem has a PTAS but does not admit an efficient polynomial-time approximation scheme (EPTAS). This implies a PTAS for our problem, which is the best known result prior to this work. Our main contribution is an EPTAS for the budgeted matroid independent set problem. A key idea of the scheme is to find a representative set for the instance, whose cardinality depends solely on $1/\varepsilon$, where $\varepsilon > 0$ is the accuracy parameter of the scheme. The representative set is identified via matroid basis minimization, which can be solved by a simple greedy algorithm. Our scheme enumerates over subsets of the representative set and extends each subset using a linear program. The notion of representative sets may be useful in solving other variants of the budgeted matroid independent set problem.