Source author record

Alexander Grishin

Alexander Grishin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record.

Artificial Intelligence, Machine Learning, cond-mat.mtrl-sci, physics.app-ph, Robotics

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

preprint, 2022, arXiv

Automating Control of Overestimation Bias for Reinforcement Learning

Overestimation bias control techniques are used by the majority of high-performing off-policy reinforcement learning algorithms. However, most of these techniques rely on pre-defined bias correction policies that are either not flexible enough or require environment-specific tuning of hyperparameters. In this work, we present a general data-driven approach for the automatic selection of bias control hyperparameters. We demonstrate its effectiveness on three algorithms: Truncated Quantile Critics, Weighted Delayed DDPG, and Maxmin Q-learning. The proposed technique eliminates the need for an extensive hyperparameter search. We show that it leads to a significant reduction of the actual number of interactions while preserving the performance.

preprint, 2020, arXiv

Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics

Overestimation bias is one of the major impediments to accurate off-policy learning. This paper investigates a novel way to alleviate overestimation bias in a continuous control setting. Our method, Truncated Quantile Critics (TQC), blends three ideas: distributional representation of a critic, truncation of the critics' predictions, and ensembling of multiple critics. Distributional representation and truncation allow for arbitrarily granular overestimation control, while ensembling provides additional score improvements. TQC outperforms the current state of the art on all environments from the continuous control benchmark suite, demonstrating a 25% improvement on the most challenging Humanoid environment.
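The truncation and ensembling steps named in this abstract can be illustrated with a minimal sketch. It assumes each critic outputs the same number of quantile atoms and that a fixed number of the largest atoms is dropped per critic; the function name `truncated_atoms` and the parameter `drop_per_critic` are hypothetical and chosen only for illustration, and this is not the authors' implementation.

```python
def truncated_atoms(critic_atoms, drop_per_critic):
    """Pool the atoms of all critics, sort them, and drop the largest ones.

    critic_atoms: list of lists, one list of quantile atoms per critic
                  (assumed to all have the same length).
    drop_per_critic: how many of the largest atoms to discard per critic
                     (hypothetical parameter name).
    """
    n_critics = len(critic_atoms)
    if n_critics == 0:
        raise ValueError("need at least one critic")
    n_atoms = len(critic_atoms[0])
    if any(len(atoms) != n_atoms for atoms in critic_atoms):
        raise ValueError("all critics must output the same number of atoms")
    if not 0 <= drop_per_critic < n_atoms:
        raise ValueError("drop_per_critic must be in [0, n_atoms)")

    # Ensembling: treat the atoms of all critics as one mixture.
    pooled = sorted(a for atoms in critic_atoms for a in atoms)

    # Truncation: keep only the smallest atoms of the pooled mixture.
    keep = n_critics * (n_atoms - drop_per_critic)
    return pooled[:keep]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


if __name__ == "__main__":
    atoms = [[1.0, 5.0, 9.0], [2.0, 4.0, 10.0]]  # two critics, three atoms each
    print(mean(truncated_atoms(atoms, 0)))  # no truncation
    print(mean(truncated_atoms(atoms, 1)))  # drop one atom per critic
```

Dropping more atoms lowers the mean of the retained atoms, which is the sense in which the number of dropped atoms acts as a fine-grained knob on overestimation in this sketch.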

preprint, 2020, arXiv

UV-laser modification and selective ion-beam etching of amorphous vanadium pentoxide thin films

We present results on excimer laser modification and patterning of amorphous vanadium pentoxide films. Wet positive resist-type and Ar ion-beam negative resist-type etching techniques were employed to develop UV-modified films. V2O5 films were found to possess sufficient resistivity compared to standard electronic materials, and thus to be promising masks for sub-micron lithography.