Researcher profile

Dmitry Kalashnikov

Dmitry Kalashnikov contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
8 topics
4 close collaborators

Published work

7 published items

preprint | 2022 | arXiv

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

Large language models can encode a wealth of semantic knowledge about the world. Such knowledge could be extremely useful to robots aiming to act upon high-level, temporally extended instructions expressed in natural language. However, a significant weakness of language models is that they lack real-world experience, which makes it difficult to leverage them for decision making within a given embodiment. For example, asking a language model to describe how to clean a spill might result in a reasonable narrative, but it may not be applicable to a particular agent, such as a robot, that needs to perform this task in a particular environment. We propose to provide real-world grounding by means of pretrained skills, which are used to constrain the model to propose natural language actions that are both feasible and contextually appropriate. The robot can act as the language model's "hands and eyes," while the language model supplies high-level semantic knowledge about the task. We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions, while value functions associated with these skills provide the grounding necessary to connect this knowledge to a particular physical environment. We evaluate our method on a number of real-world robotic tasks, where we show the need for real-world grounding and that this approach is capable of completing long-horizon, abstract, natural language instructions on a mobile manipulator. The project's website and the video can be found at https://say-can.github.io/.
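The grounding idea in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each candidate skill has a language-model score (how useful the skill is for the instruction) and a value-function score (how likely the skill is to succeed in the current state), and that the two are combined by multiplication. The abstract does not state the combination rule; the function name `select_skill`, the skill strings, and all numbers are hypothetical.

```python
def select_skill(llm_scores, affordance_values):
    """Pick the skill with the highest combined score.

    llm_scores: skill -> relevance to the instruction (hypothetical LLM output).
    affordance_values: skill -> estimated success from a value function.
    Multiplying the two is an assumption made for this sketch.
    """
    combined = {
        skill: llm_scores[skill] * affordance_values.get(skill, 0.0)
        for skill in llm_scores
    }
    best = max(combined, key=combined.get)
    return best, combined


# Made-up scores for an instruction such as "clean up the spill".
llm_scores = {
    "find a sponge": 0.5,
    "pick up the sponge": 0.4,
    "go to the table": 0.1,
}
# No sponge is in view yet, so picking it up has a low value estimate.
affordance_values = {
    "find a sponge": 0.9,
    "pick up the sponge": 0.1,
    "go to the table": 0.8,
}

best, combined = select_skill(llm_scores, affordance_values)
print(best)
```

In this toy setup the language model alone rates "pick up the sponge" almost as highly as "find a sponge", but the low value estimate for picking up rules it out until a sponge has been found.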

preprint | 2022 | arXiv

Hybrid Random Features

We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs) that automatically adapt the quality of kernel estimation to provide the most accurate approximation in the defined regions of interest. Special instantiations of HRFs lead to well-known methods such as trigonometric (Rahimi and Recht, 2007) or (recently introduced in the context of linear-attention Transformers) positive random features (Choromanski et al., 2021). By generalizing Bochner's Theorem for softmax/Gaussian kernels and leveraging random features for compositional kernels, the HRF mechanism provides strong theoretical guarantees: unbiased approximation and strictly smaller worst-case relative errors than its counterparts. We conduct an exhaustive empirical evaluation of HRFs, ranging from pointwise kernel estimation experiments, through tests on data admitting clustering structure, to benchmarking implicit-attention Transformers (also for downstream Robotics applications), demonstrating its quality in a wide spectrum of machine learning problems.

preprint | 2022 | arXiv

Sgoldstino signal at FASER: prospects in searches for supersymmetry

We investigate the prospects of FASER@LHC in searches for light ($0.1-5$ GeV) sgoldstinos in models with low energy ($10-10^4$ TeV) supersymmetry breaking. We consider flavor conserving and flavor violating couplings of sgoldstinos to Standard Model fermions and find both options to be testable at FASER. Even the first FASER run allows one to probe interesting patches in the model parameter space, while the second run, FASER-II, with a significantly larger detector fiducial volume, makes it possible to thoroughly explore a wide class of supersymmetric extensions of particle physics complementary to those probed at the LHC with the ATLAS and CMS detectors.

preprint | 2021 | arXiv

Continuous wave second harmonic generation enabled by quasi-bound-states in the continuum on gallium phosphide metasurfaces

Resonant metasurfaces are an attractive platform for enhancing nonlinear optical processes, such as second harmonic generation (SHG), since they can generate very large local electromagnetic fields while relaxing the phase-matching requirements. Here, we take this platform a step closer to practical applications by demonstrating visible range, continuous wave (CW) SHG. We do so by combining the attractive material properties of gallium phosphide with engineered, high quality-factor photonic modes enabled by bound states in the continuum. For the optimum case, we obtain efficiencies around $5 \times 10^{-5}$ % W$^{-1}$ when the system is pumped at 1200 nm wavelength with CW intensities of 1 kW/cm$^2$. Moreover, we measure external efficiencies as high as 0.1 % W$^{-1}$ with pump intensities of only 10 MW/cm$^2$ for pulsed irradiation. This efficiency is higher than the values previously reported for dielectric metasurfaces, but is achieved here with pump intensities that are two orders of magnitude lower.

preprint | 2020 | arXiv

Disentangled Planning and Control in Vision Based Robotics via Reward Machines

In this work we augment a Deep Q-Learning agent with a Reward Machine (DQRM) to increase the speed of learning vision-based policies for robot tasks, and to overcome some of the limitations of DQN that prevent it from converging to good-quality policies. A reward machine (RM) is a finite state machine that decomposes a task into a discrete planning graph and equips the agent with a reward function to guide it toward task completion. The reward machine can be used both for reward shaping and for informing the policy which abstract state it is currently in. An abstract state is a high-level simplification of the current state, defined in terms of task-relevant features. These two supervisory signals, reward shaping and knowledge of the current abstract state, both come from the reward machine; they complement each other and can both be used to improve policy performance, as demonstrated on several vision-based robotic pick and place tasks. Particularly for vision-based robotics applications, it is often easier to build a reward machine than to try to get a policy to learn the task without this structure.
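A reward machine of the kind described in this abstract can be sketched as a small finite state machine. This is an illustrative toy, not the authors' implementation: the class `RewardMachine`, the event names, the abstract states, and the reward values are all hypothetical, and the deep Q-learning agent itself is left out.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Finite state machine over abstract task states (illustrative only)."""
    initial: str
    terminal: str
    # (abstract_state, event) -> (next_abstract_state, shaping_reward)
    transitions: dict = field(default_factory=dict)

    def step(self, state, event):
        # Events with no matching transition keep the state and give zero reward.
        return self.transitions.get((state, event), (state, 0.0))


# Hypothetical pick-and-place task: reach, then grasp, then place.
rm = RewardMachine(
    initial="start",
    terminal="done",
    transitions={
        ("start", "gripper_near_object"): ("reached", 0.1),
        ("reached", "object_grasped"): ("holding", 0.3),
        ("holding", "object_in_bin"): ("done", 1.0),
    },
)


def run(machine, events):
    """Feed detected events to the machine; return the state trace and total reward."""
    state, total = machine.initial, 0.0
    trace = [state]
    for event in events:
        state, reward = machine.step(state, event)
        total += reward
        trace.append(state)
    return trace, total


trace, total = run(
    rm, ["gripper_near_object", "noise", "object_grasped", "object_in_bin"]
)
print(trace, total)
```

The two signals the abstract mentions both appear here: the per-transition reward acts as reward shaping, and the current abstract state in `trace` is what could be passed to the policy as an extra input.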

preprint | 2020 | arXiv

Integrated Single Photon Emitters

The realization of scalable systems for quantum information processing and networking is of utmost importance to the quantum information community. However, building such systems is difficult because of challenges in achieving all the necessary functionalities on a unified platform while maintaining stringent performance requirements of the individual elements. A promising approach which addresses this challenge is based on the consolidation of experimental and theoretical capabilities in quantum physics and integrated photonics. Integrated quantum photonics devices allow efficient control and read-out of quantum information while being scalable and cost effective. Here we review recent developments in solid-state single photon emitters coupled with various integrated photonic structures, which form a critical component of future scalable quantum devices. Our work contributes to the further development and realization of quantum networking protocols and quantum logic on a scalable and fabrication-friendly platform.

preprint | 2020 | arXiv

Thinking While Moving: Deep Reinforcement Learning with Concurrent Control

We study reinforcement learning in settings where sampling an action from the policy must be done concurrently with the time evolution of the controlled system, such as when a robot must decide on the next action while still performing the previous action. Much like a person or an animal, the robot must think and move at the same time, deciding on its next action before the previous one has completed. In order to develop an algorithmic framework for such concurrent control problems, we start with a continuous-time formulation of the Bellman equations, and then discretize them in a way that is aware of system delays. We instantiate this new class of approximate dynamic programming methods via a simple architectural extension to existing value-based deep reinforcement learning algorithms. We evaluate our methods on simulated benchmark tasks and a large-scale robotic grasping task where the robot must "think while moving".