Source author record

Morteza Ibrahimi

Morteza Ibrahimi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; math.OC; Information Theory; math.IT; math.ST; Statistics Theory; Computational Engineering, Finance, and Science; math.PR; cond-mat.dis-nn; cond-mat.stat-mech; Discrete Mathematics; q-fin.ST

Catalog footprint

What is connected

9 works

12 topics

4 close collaborators


preprint · 2022 · arXiv

From Predictions to Decisions: The Importance of Joint Predictive Distributions

A fundamental challenge for any intelligent system is prediction: given some inputs, can you predict corresponding outcomes? Most work on supervised learning has focused on producing accurate marginal predictions for each input. However, we show that for a broad class of decision problems, accurate joint predictions are required to deliver good performance. In particular, we establish several results pertaining to combinatorial decision problems, sequential predictions, and multi-armed bandits to elucidate the essential role of joint predictive distributions. Our treatment of multi-armed bandits introduces an approximate Thompson sampling algorithm and analytic techniques that lead to a new kind of regret bound.
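For orientation, the following is a minimal sketch of standard Beta-Bernoulli Thompson sampling on a multi-armed bandit. It is not the approximate algorithm introduced in the paper; the arm means, horizon, and function name are assumptions chosen only for illustration.

```python
import random


def thompson_sampling(arm_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return pull counts per arm."""
    rng = random.Random(seed)
    k = len(arm_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(horizon):
        # Draw one plausible mean per arm from its Beta posterior (uniform prior).
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)]
        # Act greedily with respect to the sampled means.
        arm = max(range(k), key=lambda i: samples[i])
        # Simulated Bernoulli reward; arm_means is hypothetical ground truth.
        reward = 1 if rng.random() < arm_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.2, 0.8], horizon=2000)
print(pulls)
```

Each round samples from the posterior and plays the arm that looks best under that sample, so exploration comes from posterior uncertainty rather than from an explicit bonus.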

preprint · 2020 · arXiv

Hypermodels for Exploration

We study the use of hypermodels to represent epistemic uncertainty and guide exploration. This generalizes and extends the use of ensembles to approximate Thompson sampling. The computational cost of training an ensemble grows with its size, and as such, prior work has typically been limited to ensembles with tens of elements. We show that alternative hypermodels can enjoy dramatic efficiency gains, enabling behavior that would otherwise require hundreds or thousands of elements, and even succeed in situations where ensemble methods fail to learn regardless of size. This allows more accurate approximation of Thompson sampling as well as use of more sophisticated exploration schemes. In particular, we consider an approximate form of information-directed sampling and demonstrate performance gains relative to Thompson sampling. As alternatives to ensembles, we consider linear and neural network hypermodels, also known as hypernetworks. We prove that, with neural network base models, a linear hypermodel can represent essentially any distribution over functions, and as such, hypernetworks are no more expressive.
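A linear hypermodel can be sketched as follows, assuming the common form in which a Gaussian index z is mapped to base-model parameters theta = mu + B z. The names `mu`, `B`, and the linear base model are assumptions for illustration and are not taken from the abstract.

```python
import random


def linear_hypermodel(mu, B, z):
    """Map an index z to base-model parameters theta = mu + B z."""
    return [m + sum(b * zj for b, zj in zip(row, z)) for m, row in zip(mu, B)]


def base_model(theta, x):
    """Linear base model: the prediction is the inner product of theta and x."""
    return sum(t * xi for t, xi in zip(theta, x))


def sample_predictions(mu, B, x, num_samples, seed=0):
    """Sample indices z ~ N(0, I) and return the resulting predictions at x."""
    rng = random.Random(seed)
    index_dim = len(B[0])
    predictions = []
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(index_dim)]
        predictions.append(base_model(linear_hypermodel(mu, B, z), x))
    return predictions


# Hypothetical hypermodel parameters for a two-parameter base model.
mu = [1.0, -1.0]
B = [[0.5, 0.0], [0.0, 2.0]]
print(sample_predictions(mu, B, x=[1.0, 1.0], num_samples=5))
```

Each sampled index plays the role of one ensemble member, so the spread of the predictions stands in for epistemic uncertainty without storing a separate model per member.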

preprint · 2015 · arXiv

The set of solutions of random XORSAT formulae

The XOR-satisfiability (XORSAT) problem requires finding an assignment of $n$ Boolean variables that satisfies $m$ exclusive OR (XOR) clauses, whereby each clause constrains a subset of the variables. We consider random XORSAT instances, drawn uniformly at random from the ensemble of formulae containing $n$ variables and $m$ clauses of size $k$. This model presents several structural similarities to other ensembles of constraint satisfaction problems, such as $k$-satisfiability ($k$-SAT), hypergraph bicoloring and graph coloring. For many of these ensembles, as the number of constraints per variable grows, the set of solutions shatters into an exponential number of well-separated components. This phenomenon appears to be related to the difficulty of solving random instances of such problems. We prove a complete characterization of this clustering phase transition for random $k$-XORSAT. In particular, we prove that the clustering threshold is sharp and determine its exact location. We prove that the set of solutions has large conductance below this threshold and that each of the clusters has large conductance above the same threshold. Our proof constructs a very sparse basis for the set of solutions (or the subset within a cluster). This construction is intimately tied to the construction of specific subgraphs of the hypergraph associated with an instance of $k$-XORSAT. In order to study such subgraphs, we establish novel local weak convergence results for them.
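Because XOR clauses are linear equations over GF(2), a single instance can be solved by Gaussian elimination. The sketch below illustrates only the problem definition, not the clustering analysis of the paper; the clause encoding and function names are assumptions.

```python
def solve_xorsat(n, clauses):
    """Solve XOR clauses over n Boolean variables by elimination on GF(2).

    Each clause is (variables, parity): the XOR of the listed variables must
    equal parity. Returns one satisfying assignment as a list of 0/1 values,
    or None if the instance is unsatisfiable.
    """
    var_mask = (1 << n) - 1
    pivots = {}  # pivot variable -> fully reduced row (bit n holds the parity)
    for variables, parity in clauses:
        row = parity << n
        for v in variables:
            row ^= 1 << v
        # Remove every existing pivot variable from the new row.
        for var, pivot_row in pivots.items():
            if (row >> var) & 1:
                row ^= pivot_row
        if (row & var_mask) == 0:
            if (row >> n) & 1:
                return None  # the clause reduced to 0 = 1
            continue  # redundant clause
        var = (row & -row).bit_length() - 1  # lowest remaining variable
        # Keep the other pivot rows free of the new pivot variable.
        for other in list(pivots):
            if (pivots[other] >> var) & 1:
                pivots[other] ^= row
        pivots[var] = row
    # Free variables are set to 0, so each pivot variable equals its row parity.
    assignment = [0] * n
    for var, pivot_row in pivots.items():
        assignment[var] = (pivot_row >> n) & 1
    return assignment


def satisfies(assignment, clauses):
    """Check that every clause's XOR matches its parity."""
    return all(sum(assignment[v] for v in vs) % 2 == parity for vs, parity in clauses)


sat_instance = [((0, 1), 1), ((1, 2), 1), ((0, 2), 0)]
unsat_instance = [((0, 1), 1), ((1, 2), 1), ((0, 2), 1)]
print(solve_xorsat(3, sat_instance), solve_xorsat(3, unsat_instance))
```

Setting the free variables to other values enumerates the full solution set, which is the object whose geometry the abstract describes.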

preprint · 2013 · arXiv

Accelerated Time-of-Flight Mass Spectrometry

We study a simple modification to conventional time-of-flight mass spectrometry (TOFMS) in which a \emph{variable} and (pseudo)-\emph{random} pulsing rate is used, allowing traces from different pulses to overlap. This modification requires little alteration to the currently employed hardware. However, it requires a reconstruction method to recover the spectrum from highly aliased traces. We propose and demonstrate an efficient algorithm that can process massive TOFMS data using computational resources that can be considered modest by today's standards. This approach can be used to improve duty cycle, speed, and mass resolving power of TOFMS at the same time. We expect this to extend the applicability of TOFMS to new domains.

preprint · 2013 · arXiv

Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems

We study the problem of adaptive control of a high dimensional linear quadratic (LQ) system. Previous work established the asymptotic convergence to an optimal controller for various adaptive control schemes. More recently, for the average cost LQ problem, a regret bound of $O(\sqrt{T})$ was shown, apart from logarithmic factors. However, this bound scales exponentially with $p$, the dimension of the state space. In this work we consider the case where the matrices describing the dynamics of the LQ system are sparse and their dimensions are large. We present an adaptive control scheme that achieves a regret bound of $O(p \sqrt{T})$, apart from logarithmic factors. In particular, our algorithm has an average cost of $(1+\epsilon)$ times the optimum cost after $T = \operatorname{polylog}(p)\, O(1/\epsilon^2)$. This is in comparison to previous work on the dense dynamics where the algorithm requires time that scales exponentially with dimension in order to achieve regret of $\epsilon$ times the optimal cost. We believe that our result has prominent applications in the emerging area of computational advertising, in particular targeted online advertising and advertising in social networks.

preprint · 2013 · arXiv

Support Recovery for the Drift Coefficient of High-Dimensional Diffusions

Consider the problem of learning the drift coefficient of a $p$-dimensional stochastic differential equation from a sample path of length $T$. We assume that the drift is parametrized by a high-dimensional vector, and study the support recovery problem when both $p$ and $T$ can tend to infinity. In particular, we prove a general lower bound on the sample-complexity $T$ by using a characterization of mutual information as a time integral of conditional variance, due to Kadota, Zakai, and Ziv. For linear stochastic differential equations, the drift coefficient is parametrized by a $p\times p$ matrix which describes which degrees of freedom interact under the dynamics. In this case, we analyze an $\ell_1$-regularized least squares estimator and prove an upper bound on $T$ that nearly matches the lower bound on specific classes of sparse matrices.

preprint · 2011 · arXiv

Information Theoretic Limits on Learning Stochastic Differential Equations

Consider the problem of learning the drift coefficient of a stochastic differential equation from a sample path. In this paper, we assume that the drift is parametrized by a high dimensional vector. We address the question of how long the system needs to be observed in order to learn this vector of parameters. We prove a general lower bound on this time complexity by using a characterization of mutual information as a time integral of conditional variance, due to Kadota, Zakai, and Ziv. This general lower bound is applied to specific classes of linear and non-linear stochastic differential equations. In the linear case, the problem under consideration is that of learning a matrix of interaction coefficients. We evaluate our lower bound for ensembles of sparse and dense random matrices. The resulting estimates match the qualitative behavior of upper bounds achieved by computationally efficient procedures.

preprint · 2011 · arXiv

Robust Max-Product Belief Propagation

We study the problem of optimizing a graph-structured objective function under \emph{adversarial} uncertainty. This problem can be modeled as a two-person zero-sum game between an Engineer and Nature. The Engineer controls a subset of the variables (nodes in the graph), and tries to assign their values to maximize an objective function. Nature controls the complementary subset of variables and tries to minimize the same objective. This setting encompasses estimation and optimization problems under model uncertainty, and strategic problems with a graph structure. Von Neumann's minimax theorem guarantees the existence of a (minimax) pair of randomized strategies that provide optimal robustness for each player against its adversary. We prove several structural properties of this strategy pair in the case of a graph-structured payoff function. In particular, the randomized minimax strategies (distributions over variable assignments) can be chosen in such a way as to satisfy the Markov property with respect to the graph. This significantly reduces the problem dimensionality. Finally, we introduce a message passing algorithm to solve this minimax problem. The algorithm generalizes max-product belief propagation to this new domain.
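For reference, the minimax theorem invoked here can be written as below, assuming finite assignment sets. The notation is an assumption introduced for illustration: $x$ is the Engineer's assignment, $y$ is Nature's, $f$ is the objective, and $\mu$, $\nu$ are their randomized strategies.

```latex
\max_{\mu \in \Delta(\mathcal{X})} \; \min_{\nu \in \Delta(\mathcal{Y})} \;
  \mathbb{E}_{x \sim \mu,\, y \sim \nu}\bigl[ f(x, y) \bigr]
\;=\;
\min_{\nu \in \Delta(\mathcal{Y})} \; \max_{\mu \in \Delta(\mathcal{X})} \;
  \mathbb{E}_{x \sim \mu,\, y \sim \nu}\bigl[ f(x, y) \bigr]
```

Here $\Delta(\mathcal{X})$ and $\Delta(\mathcal{Y})$ denote the sets of probability distributions over each player's assignments; a pair $(\mu^\ast, \nu^\ast)$ attaining both sides is the minimax strategy pair the abstract refers to.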

preprint · 2010 · arXiv

Learning Networks of Stochastic Differential Equations

We consider linear models for stochastic dynamics. Any such model can be associated with a network (namely a directed graph) describing which degrees of freedom interact under the dynamics. We tackle the problem of learning such a network from observation of the system trajectory over a time interval $T$. We analyze the $\ell_1$-regularized least squares algorithm and, in the setting in which the underlying network is sparse, we prove performance guarantees that are \emph{uniform in the sampling rate} as long as this is sufficiently high. This result substantiates the notion of a well-defined 'time complexity' for the network inference problem.
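A generic $\ell_1$-regularized least squares problem can be solved by cyclic coordinate descent, as in the minimal sketch below. How the design matrix and targets are built from the sampled trajectory is not specified in the abstract, so `X`, `y`, and `lam` are hypothetical placeholders, and the solver is a standard choice rather than the authors' procedure.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimize 0.5 * ||y - X w||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - X w for the initial w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Toy orthogonal design: the solution is the soft-thresholded least squares fit.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
w = lasso_coordinate_descent(X, y, lam=1.0)
print(w)
```

The nonzero entries of `w` form the estimated support; in the network setting, one such regression per node would give the estimated incoming edges of that node.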
