Source author record

Ilnura Usmanova

Ilnura Usmanova appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

math.OC, Artificial Intelligence, Machine Learning, math.NA, Numerical Analysis, Robotics

Catalog footprint

What is connected

3 works

6 topics

4 close collaborators

Preprint, 2022, arXiv

Constrained Policy Optimization via Bayesian World Models

Improving sample efficiency and safety are crucial challenges when deploying reinforcement learning in high-stakes real-world applications. We propose LAMBDA, a novel model-based approach for policy optimization in safety-critical tasks modeled via constrained Markov decision processes. Our approach utilizes Bayesian world models and harnesses the resulting uncertainty to maximize optimistic upper bounds on the task objective, as well as pessimistic upper bounds on the safety constraints. We demonstrate LAMBDA's state-of-the-art performance on the Safety-Gym benchmark suite in terms of sample efficiency and constraint violation.
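
The idea of pairing an optimistic bound on the objective with a pessimistic bound on the constraint can be illustrated with a small sketch. This is not the LAMBDA algorithm: the function name `select_policy`, the toy numbers, and the choice among a fixed set of candidate policies are all assumptions made for illustration, standing in for the policy optimization described in the abstract. Each row holds one candidate's return or cost as predicted by several sampled world models.

```python
import numpy as np

def select_policy(returns, costs, budget):
    """Pick a candidate using optimistic returns and pessimistic costs.

    returns, costs: arrays of shape (num_candidates, num_model_samples),
    one column per sampled world model (hypothetical inputs).
    """
    optimistic_return = returns.max(axis=1)  # best case over sampled models
    pessimistic_cost = costs.max(axis=1)     # worst case over sampled models
    safe = pessimistic_cost <= budget        # keep candidates safe in the worst case
    if not safe.any():
        return None
    idx = np.flatnonzero(safe)
    return int(idx[np.argmax(optimistic_return[idx])])

# Toy data: 3 candidate policies, 2 sampled models each.
returns = np.array([[1.0, 1.4], [2.0, 2.5], [3.0, 3.2]])
costs = np.array([[0.1, 0.2], [0.3, 0.9], [1.5, 0.4]])
best = select_policy(returns, costs, budget=1.0)
```

In this toy data the third candidate has the highest predicted return but is excluded, because one sampled model predicts a cost above the budget.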

Preprint, 2021, arXiv

Safe non-smooth black-box optimization with application to policy search

For safety-critical black-box optimization tasks, observations of the constraints and the objective are often noisy and available only for feasible points. We propose an approach based on log barriers to find a local solution of a non-convex non-smooth black-box optimization problem $\min f^0(x)$ subject to $f^i(x)\leq 0,~ i = 1,\ldots, m$, while at the same time guaranteeing constraint satisfaction during learning with high probability. Our proposed algorithm exploits noisy observations to iteratively improve on an initial safe point until convergence. We derive the convergence rate and prove safety of our algorithm. We demonstrate its performance in an application to an iterative control design problem.
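
A minimal sketch of the log-barrier idea, under simplifying assumptions that are not taken from the paper: evaluations are noise-free, gradients come from central finite differences, and every constraint is assumed Lipschitz with a known constant `lipschitz`. The function names, the step-size rule, and the toy problem are hypothetical. The sketch descends on $f^0(x) - \eta \sum_i \log(-f^i(x))$ and caps both the probing radius and the step length by the current slack, so every point it evaluates stays feasible.

```python
import numpy as np

def fd_grad(fun, x, h):
    """Central finite-difference gradient of fun at x with probe radius h."""
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        g[j] = (fun(x + e) - fun(x - e)) / (2.0 * h)
    return g

def safe_barrier_descent(f0, constraints, x0, lipschitz,
                         eta=0.1, step_max=2e-3, iters=500, h0=1e-4):
    """Gradient descent on the log barrier, starting from a strictly feasible x0."""
    x = np.asarray(x0, dtype=float)
    path = [x.copy()]
    for _ in range(iters):
        slack = np.array([-f(x) for f in constraints])  # positive when feasible
        # Probe only within slack / (2 * lipschitz), so probes remain feasible.
        h = min(h0, slack.min() / (2.0 * lipschitz))
        g = fd_grad(f0, x, h)
        for f, s in zip(constraints, slack):
            g += eta * fd_grad(f, x, h) / s  # gradient of -eta * log(-f(x))
        norm = np.linalg.norm(g)
        if norm < 1e-10:
            break
        # A move of length <= slack / (2 * lipschitz) cannot leave the feasible set.
        t = min(step_max, slack.min() / (2.0 * lipschitz * norm))
        x = x - t * g
        path.append(x.copy())
    return x, path

# Toy problem: minimize (x0 - 2)^2 + (x1 - 2)^2 subject to x0 + x1 - 1 <= 0.
f0 = lambda x: (x[0] - 2.0) ** 2 + (x[1] - 2.0) ** 2
c1 = lambda x: x[0] + x[1] - 1.0
x_final, path = safe_barrier_descent(f0, [c1], [0.0, 0.0], lipschitz=np.sqrt(2.0))
```

The iterates approach the constrained minimizer (0.5, 0.5) from the interior and stop slightly short of the boundary, at a distance governed by `eta`.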

Preprint, 2016, arXiv

Gradient-free prox-methods with inexact oracle for stochastic convex optimization problems on a simplex

In this paper we show that Euclidean randomization in some situations (i.e., for a gradient-free method on a simplex) can be as good as randomization on the unit sphere in the 1-norm. That is, using the simplex as an example, we show that for gradient-free methods the choice of the prox-structure and the choice of the randomization scheme have to be connected to each other. We demonstrate how this can be done in an optimal way. Importantly, we consider an inexact oracle.
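
To make the two ingredients concrete, the sketch below pairs a Euclidean-sphere randomization with an entropic prox-structure on the simplex. It is an illustration under assumptions, not the method analyzed in the paper: the oracle is exact rather than inexact, the objective is assumed to be defined slightly outside the simplex so that the probes `x + tau * e` can be evaluated, and all function names and parameter values are hypothetical.

```python
import numpy as np

def sphere_direction(n, rng):
    """Uniform random direction on the unit Euclidean sphere."""
    e = rng.standard_normal(n)
    return e / np.linalg.norm(e)

def two_point_grad(f, x, e, tau):
    """Two-point gradient estimate from function values along direction e."""
    return x.size * (f(x + tau * e) - f(x - tau * e)) / (2.0 * tau) * e

def entropic_mirror_descent(f, n, steps=2000, lr=0.05, tau=1e-3, seed=0):
    """Mirror descent with the entropy prox-function, i.e. multiplicative updates."""
    rng = np.random.default_rng(seed)
    x = np.full(n, 1.0 / n)  # start at the center of the simplex
    avg = np.zeros(n)
    for _ in range(steps):
        g = two_point_grad(f, x, sphere_direction(n, rng), tau)
        w = x * np.exp(-lr * g)  # entropic prox step
        x = w / w.sum()          # renormalize onto the simplex
        avg += x
    return avg / steps

# Toy problem: linear objective whose minimum over the simplex is at vertex 1.
c = np.array([3.0, 1.0, 2.0])
x_avg = entropic_mirror_descent(lambda x: c @ x, n=3)
```

For a linear objective the estimate `two_point_grad` is unbiased for `c`, so the averaged iterate concentrates on the coordinate with the smallest cost.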