Source author record

Alexander Novikov

Alexander Novikov appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Artificial Intelligence, Human-Computer Interaction, math.PR, math.ST, Mathematical Software, Neural and Evolutionary Computing, Numerical Analysis, q-fin.MF, q-fin.PR, Robotics, Statistics Theory

Catalog footprint

What is connected

11 works

12 topics

4 close collaborators

preprint · 2026 · arXiv

Intentmaking and Sensemaking: Human Interaction with AI-Guided Mathematical Discovery

Artificial intelligence offers powerful new tools for scientific discovery, but the interaction paradigms required to effectively harness these systems remain underexplored. In this paper, we present findings from a formative user study with 11 expert mathematicians who used AlphaEvolve, an evolutionary coding agent, to tackle advanced problems in their fields of expertise. We identify and characterize a distinct workflow we term intentmaking, the iterative process of discovering, defining, and refining one's experimental goals through active system interaction. We frame this as a natural extension to sensemaking, the cognitive process of building an understanding of complex or novel data. We suggest that users enter a cycle of intentmaking (defining and updating their experiment) and sensemaking (interpreting the results) which repeats many times during the course of an investigation. Our documentation of these themes suggests an approach to designing AI tools for scientific discovery that goes beyond the existing question/answer model of many current systems, treating them as collaborative instruments rather than opaque black-box assistants.

preprint · 2021 · arXiv

RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

Offline methods for reinforcement learning have a potential to help bridge the gap between reinforcement learning research and real-world applications. They make it possible to learn policies from offline datasets, thus overcoming concerns associated with online data collection in the real-world, including cost, safety, or ethical concerns. In this paper, we propose a benchmark called RL Unplugged to evaluate and compare offline RL methods. RL Unplugged includes data from a diverse range of domains including games (e.g., Atari benchmark) and simulated motor control problems (e.g., DM Control Suite). The datasets include domains that are partially or fully observable, use continuous or discrete actions, and have stochastic vs. deterministic dynamics. We propose detailed evaluation protocols for each domain in RL Unplugged and provide an extensive analysis of supervised learning and offline RL methods using these protocols. We will release data for all our tasks and open-source all algorithms presented in this paper. We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community. Moving forward, we view RL Unplugged as a living benchmark suite that will evolve and grow with datasets contributed by the research community and ourselves. Our project page is available on https://git.io/JJUhd.

preprint · 2020 · arXiv

Hyperparameter Selection for Offline Reinforcement Learning

Offline reinforcement learning (RL purely from logged data) is an important avenue for deploying RL techniques in real-world scenarios. However, existing hyperparameter selection methods for offline RL break the offline assumption by evaluating policies corresponding to each hyperparameter setting in the environment. This online execution is often infeasible and hence undermines the main aim of offline RL. Therefore, in this work, we focus on offline hyperparameter selection, i.e. methods for choosing the best policy from a set of many policies trained using different hyperparameters, given only logged data. Through large-scale empirical evaluation we show that: 1) offline RL algorithms are not robust to hyperparameter choices, 2) factors such as the offline RL algorithm and method for estimating Q values can have a big impact on hyperparameter selection, and 3) when we control those factors carefully, we can reliably rank policies across hyperparameter choices, and therefore choose policies which are close to the best policy in the set. Overall, our results present an optimistic view that offline hyperparameter selection is within reach, even in challenging tasks with pixel observations, high dimensional action spaces, and long horizon.

preprint · 2020 · arXiv

Scaling data-driven robotics with reward sketching and batch reinforcement learning

We present a framework for data-driven robotics that makes use of a large dataset of recorded robot experience and scales to several tasks using learned reward functions. We show how to apply this framework to accomplish three different object manipulation tasks on a real robot platform. Given demonstrations of a task together with task-agnostic recorded experience, we use a special form of human annotation as supervision to learn a reward function, which enables us to deal with real-world tasks where the reward signal cannot be acquired directly. Learned rewards are used in combination with a large dataset of experience from different tasks to learn a robot policy offline using batch RL. We show that using our approach it is possible to train agents to perform a variety of challenging manipulation tasks including stacking rigid objects and handling cloth.

preprint · 2020 · arXiv

Tensor Train decomposition on TensorFlow (T3F)

Tensor Train decomposition is used across many branches of machine learning. We present T3F -- a library for Tensor Train decomposition based on TensorFlow. T3F supports GPU execution, batch processing, automatic differentiation, and versatile functionality for the Riemannian optimization framework, which takes into account the underlying manifold structure to construct efficient optimization methods. The library makes it easier to implement machine learning papers that rely on the Tensor Train decomposition. T3F includes documentation, examples and 94% test coverage.

preprint · 2016 · arXiv

Pricing of Asian-type and Basket Options via Upper and Lower Bounds

This paper sets out to provide a general framework for the pricing of average-type options via lower and upper bounds. This class of options includes Asian, basket and options on the volume-weighted average price. We demonstrate that in cases under discussion lower bounds allow for the dimensionality of the problem to be reduced and that these methods provide reasonable approximations to the price of the option. Keywords: Asian options, Basket options, Lower and Upper bounds, Volume-weighted average prices (VWAP), Levy processes.

preprint · 2016 · arXiv

Ultimate tensorization: compressing convolutional and FC layers alike

Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of the tensor framework [1] to the 4-dimensional kernel of convolution does compress the layer, we can do better. We reshape the convolutional kernel into a tensor of higher order and factorize it. We combine the proposed approach with the previous work to compress both convolutional and fully-connected layers of a network and achieve 80x network compression rate with 1.1% accuracy drop on the CIFAR-10 dataset.

preprint · 2015 · arXiv

Bounds for expected maxima of Gaussian processes and their discrete approximations

The paper deals with the expected maxima of continuous Gaussian processes $X = (X_t)_{t\ge 0}$ that are Hölder continuous in $L_2$-norm and/or satisfy the opposite inequality for the $L_2$-norms of their increments. Examples of such processes include the fractional Brownian motion and some of its "relatives" (of which several examples are given in the paper). We establish upper and lower bounds for $E \max_{0\le t\le 1}X_t$ and investigate the rate of convergence to that quantity of its discrete approximation $E \max_{0\le i\le n}X_{i/n}$. Some further properties of these two maxima are established in the special case of the fractional Brownian motion.
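As a rough illustration of the discrete approximation $E \max_{0\le i\le n}X_{i/n}$, the sketch below estimates it by Monte Carlo for standard Brownian motion, which is the fractional Brownian motion with Hurst parameter 1/2. For this special case the continuous-time value $E \max_{0\le t\le 1}X_t = \sqrt{2/\pi}$ is known from the reflection principle, so the gap to the discrete maximum can be observed directly. The function name, step counts, path count and seed are illustrative assumptions, and the code does not reproduce the bounds derived in the paper.

```python
import math
import random


def expected_discrete_max(n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E max_{0<=i<=n} X_{i/n} for standard Brownian motion.

    Hypothetical helper for illustration only; it simulates the process on the
    grid i/n and averages the running maximum over independent paths.
    """
    rng = random.Random(seed)
    step_std = math.sqrt(1.0 / n_steps)  # std. dev. of one increment on [0, 1]
    total = 0.0
    for _ in range(n_paths):
        x = 0.0
        running_max = 0.0  # X_0 = 0 is part of the maximum
        for _ in range(n_steps):
            x += rng.gauss(0.0, step_std)
            if x > running_max:
                running_max = x
        total += running_max
    return total / n_paths


# Known continuous-time value for Brownian motion on [0, 1].
continuous_value = math.sqrt(2.0 / math.pi)

coarse = expected_discrete_max(n_steps=10, n_paths=5000)
fine = expected_discrete_max(n_steps=100, n_paths=5000)

print(f"continuous E max : {continuous_value:.4f}")
print(f"discrete, n = 10 : {coarse:.4f}")
print(f"discrete, n = 100: {fine:.4f}")
```

Because the grid maximum never exceeds the continuous maximum on the same path, both estimates should sit below $\sqrt{2/\pi}$ and move towards it as the grid is refined, up to Monte Carlo noise.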

preprint · 2015 · arXiv

Tensorizing Neural Networks

Deep neural networks currently demonstrate state-of-the-art performance in several domains. At the same time, models of this class are very demanding in terms of computational resources. In particular, a large amount of memory is required by commonly used fully-connected layers, making it hard to use the models on low-end devices and stopping the further increase of the model size. In this paper we convert the dense weight matrices of the fully-connected layers to the Tensor Train format such that the number of parameters is reduced by a huge factor and at the same time the expressive power of the layer is preserved. In particular, for the Very Deep VGG networks we report the compression factor of the dense weight matrix of a fully-connected layer up to 200000 times leading to the compression factor of the whole network up to 7 times.
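To make the source of the parameter reduction concrete, the sketch below counts parameters for a dense weight matrix and for the same matrix stored in Tensor Train (TT-matrix) format, where core k has shape r_{k-1} x m_k x n_k x r_k and the boundary ranks equal 1. The mode sizes and ranks are illustrative assumptions chosen for a 1024 x 1024 layer; they are not the configurations reported in the paper, and the function names are hypothetical.

```python
from math import prod


def tt_matrix_params(row_modes, col_modes, ranks):
    """Number of parameters in a TT-matrix.

    Core k has shape (ranks[k], row_modes[k], col_modes[k], ranks[k + 1]);
    the boundary ranks ranks[0] and ranks[-1] must equal 1.
    """
    if len(row_modes) != len(col_modes):
        raise ValueError("row_modes and col_modes must have the same length")
    d = len(row_modes)
    if len(ranks) != d + 1 or ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("ranks must have length d + 1 with boundary ranks 1")
    return sum(
        ranks[k] * row_modes[k] * col_modes[k] * ranks[k + 1] for k in range(d)
    )


def dense_params(row_modes, col_modes):
    """Number of parameters in the equivalent dense matrix."""
    return prod(row_modes) * prod(col_modes)


# Illustrative (assumed) factorization: 1024 = 4 * 4 * 4 * 4 * 4 on both sides.
row_modes = [4, 4, 4, 4, 4]
col_modes = [4, 4, 4, 4, 4]
ranks = [1, 8, 8, 8, 8, 1]

tt_count = tt_matrix_params(row_modes, col_modes, ranks)
dense_count = dense_params(row_modes, col_modes)

print(f"dense parameters: {dense_count}")  # 1048576
print(f"TT parameters   : {tt_count}")     # 3328
print(f"compression     : {dense_count / tt_count:.1f}x")
```

The dense count grows with the product of all mode sizes, while the TT count grows with their sum weighted by the ranks, which is why small ranks give very large compression factors for a single layer.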

preprint · 2014 · arXiv

On moments of Pitman estimators: the case of fractional Brownian Motion

In some non-regular statistical estimation problems, the limiting likelihood processes are functionals of fractional Brownian motion (fBm) with Hurst's parameter $H$, $0 < H \le 1$. In this paper we present several analytical and numerical results on the moments of Pitman estimators represented in the form of integral functionals of fBm. We also provide Monte Carlo simulation results for variances of Pitman and asymptotic maximum likelihood estimators.

preprint · 2013 · arXiv

On lower and upper bounds for Asian-type options: a unified approach

In the context of dealing with financial risk management problems it is desirable to have accurate bounds for option prices in situations when pricing formulae do not exist in the closed form. A unified approach for obtaining upper and lower bounds for Asian-type options, including options on VWAP, is proposed in this paper. The bounds obtained are applicable to the continuous and discrete-time frameworks for the case of time-dependent interest rates. Numerical examples are provided to illustrate the accuracy of the bounds.
