Researcher profile

Konstantin Yakovlev

Konstantin Yakovlev contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
12works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

12 published item(s)

preprint2026arXiv

Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding

Multi-agent pathfinding (MAPF) is a widely used abstraction for multi-robot trajectory planning problems, where multiple homogeneous agents move simultaneously within a shared environment. Although solving MAPF optimally is NP-hard, scalable and efficient solvers are critical for real-world applications such as logistics and search-and-rescue. To this end, the research community has proposed various decentralized suboptimal MAPF solvers that leverage machine learning. Such methods frame MAPF (from a single agent perspective) as a Dec-POMDP where at each time step an agent has to decide an action based on the local observation and typically solve the problem via reinforcement learning or imitation learning. We follow the same approach but additionally introduce a learnable communication module tailored to enhance cooperation between agents via efficient feature sharing. We present the Local Communication for Multi-agent Pathfinding (LC-MAPF), a generalizable pre-trained model that applies multi-round communication between neighboring agents to exchange information and improve their coordination. Our experiments show that the introduced method outperforms the existing learning-based MAPF solvers, including IL and RL-based approaches, across diverse metrics in a diverse range of (unseen) test scenarios. Remarkably, the introduced communication mechanism does not compromise LC-MAPF's scalability, a common bottleneck for communication-based MAPF solvers.

preprint2025arXiv

Generalization error bound for denoising score matching under relaxed manifold assumption

We examine theoretical properties of the denoising score matching estimate. We model the density of observations with a nonparametric Gaussian mixture. We significantly relax the standard manifold assumption allowing the samples step away from the manifold. At the same time, we are still able to leverage a nice distribution structure. We derive non-asymptotic bounds on the approximation and generalization errors of the denoising score matching estimate. The rates of convergence are determined by the intrinsic dimension. Furthermore, our bounds remain valid even if we allow the ambient dimension grow polynomially with the sample size.

preprint2025arXiv

Implicit score matching meets denoising score matching: improved rates of convergence and log-density Hessian estimation

We study the problem of estimating the score function using both implicit score matching and denoising score matching. Assuming that the data distribution exhibiting a low-dimensional structure, we prove that implicit score matching is able not only to adapt to the intrinsic dimension, but also to achieve the same rates of convergence as denoising score matching in terms of the sample size. Furthermore, we demonstrate that both methods allow us to estimate log-density Hessians without the curse of dimensionality by simple differentiation. This justifies convergence of ODE-based samplers for generative diffusion models. Our approach is based on Gagliardo-Nirenberg-type inequalities relating weighted $L^2$-norms of smooth functions and their derivatives.

preprint2022arXiv

POGEMA: Partially Observable Grid Environment for Multiple Agents

We introduce POGEMA (https://github.com/AIRI-Institute/pogema) a sandbox for challenging partially observable multi-agent pathfinding (PO-MAPF) problems . This is a grid-based environment that was specifically designed to be a flexible, tunable and scalable benchmark. It can be tailored to a variety of PO-MAPF, which can serve as an excellent testing ground for planning and learning methods, and their combination, which will allow us to move towards filling the gap between AI planning and learning.

preprint2022arXiv

Policy Optimization to Learn Adaptive Motion Primitives in Path Planning with Dynamic Obstacles

This paper addresses the kinodynamic motion planning for non-holonomic robots in dynamic environments with both static and dynamic obstacles -- a challenging problem that lacks a universal solution yet. One of the promising approaches to solve it is decomposing the problem into the smaller sub problems and combining the local solutions into the global one. The crux of any planning method for non-holonomic robots is the generation of motion primitives that generates solutions to local planning sub-problems. In this work we introduce a novel learnable steering function (policy), which takes into account kinodynamic constraints of the robot and both static and dynamic obstacles. This policy is efficiently trained via the policy optimization. Empirically, we show that our steering function generalizes well to unseen problems. We then plug in the trained policy into the sampling-based and lattice-based planners, and evaluate the resultant POLAMP algorithm (Policy Optimization that Learns Adaptive Motion Primitives) in a range of challenging setups that involve a car-like robot operating in the obstacle-rich parking-lot environments. We show that POLAMP is able to plan collision-free kinodynamic trajectories with success rates higher than 92%, when 50 simultaneously moving obstacles populate the environment showing better performance than the state-of-the-art competitors.

preprint2022arXiv

Towards A Complete Multi-Agent Pathfinding Algorithm For Large Agents

Multi-agent pathfinding (MAPF) is a challenging problem which is hard to solve optimally even when simplifying assumptions are adopted, e.g. planar graphs (typically -- grids), discretized time, uniform duration of move and wait actions etc. On the other hand, MAPF under such restrictive assumptions (also known as the Classical MAPF) is equivalent to the so-called pebble motion problem for which non-optimal polynomial time algorithms do exist. Recently, a body of works emerged that investigated MAPF beyond the basic setting and, in particular, considered agents of arbitrary size and shape. Still, to the best of our knowledge no complete algorithms for such MAPF variant exists. In this work we attempt to narrow this gap by considering MAPF for large agents and suggesting how this problem can be reduced to pebble motion on (general) graphs. The crux of this reduction is the procedure that moves away the agents away from the edge which is needed to perform a move action of the current agent. We consider different variants of how this procedure can be implemented and present a variant of the pebble motion algorithm which incorporates this procedure. Unfortunately, the algorithm is still incomplete, but empirically we show that it is able to solve much more MAPF instances (under the strict time limit) with large agents on arbitrary non-planar graphs (roadmaps) compared to the state-of-the-art MAPF solver -- Continous Conflict-Based Search (CCBS).

preprint2021arXiv

Improving Continuous-time Conflict Based Search

Conflict-Based Search (CBS) is a powerful algorithmic framework for optimally solving classical multi-agent path finding (MAPF) problems, where time is discretized into the time steps. Continuous-time CBS (CCBS) is a recently proposed version of CBS that guarantees optimal solutions without the need to discretize time. However, the scalability of CCBS is limited because it does not include any known improvements of CBS. In this paper, we begin to close this gap and explore how to adapt successful CBS improvements, namely, prioritizing conflicts (PC), disjoint splitting (DS), and high-level heuristics, to the continuous time setting of CCBS. These adaptions are not trivial, and require careful handling of different types of constraints, applying a generalized version of the Safe interval path planning (SIPP) algorithm, and extending the notion of cardinal conflicts. We evaluate the effect of the suggested enhancements by running experiments both on general graphs and $2^k$-neighborhood grids. CCBS with these improvements significantly outperforms vanilla CCBS, solving problems with almost twice as many agents in some cases and pushing the limits of multiagent path finding in continuous-time domains.

preprint2020arXiv

A Combination of Theta*, ORCA and Push and Rotate for Multi-agent Navigation

We study the problem of multi-agent navigation in static environments when no centralized controller is present. Each agent is controlled individually and relies on three algorithmic components to achieve its goal while avoiding collisions with the other agents and the obstacles: i) individual path planning which is done by Theta* algorithm; ii) collision avoidance while path following which is performed by ORCA* algorithm; iii) locally-confined multi-agent path planning done by Push and Rotate algorithm. The latter component is crucial to avoid deadlocks in confined areas, such as narrow passages or doors. We describe how the suggested components interact and form a coherent navigation pipeline. We carry out an extensive empirical evaluation of this pipeline in simulation. The obtained results clearly demonstrate that the number of occurring deadlocks significantly decreases enabling more agents to reach their goals compared to techniques that rely on collision-avoidance only and do not include multi-agent path planning component

preprint2020arXiv

Map-merging Algorithms for Visual SLAM: Feasibility Study and Empirical Evaluation

Simultaneous localization and mapping, especially the one relying solely on video data (vSLAM), is a challenging problem that has been extensively studied in robotics and computer vision. State-of-the-art vSLAM algorithms are capable of constructing accurate-enough maps that enable a mobile robot to autonomously navigate an unknown environment. In this work, we are interested in an important problem related to vSLAM, i.e. map merging, that might appear in various practically important scenarios, e.g. in a multi-robot coverage scenario. This problem asks whether different vSLAM maps can be merged into a consistent single representation. We examine the existing 2D and 3D map-merging algorithms and conduct an extensive empirical evaluation in realistic simulated environment (Habitat). Both qualitative and quantitative comparison is carried out and the obtained results are reported and analyzed.

preprint2020arXiv

Planning to Score a Goal in Robotic Football with Heuristic Search

This paper considers a problem of planning an attack in robotic football (RoboCup). The problem is reduced to finding a trajectory of the ball from its current position to the opponents goals. Heuristic search algorithm, i.e. A*, is used to find such a trajectory. For this algorithm to be applicable we introduce a discretized model of the environment, i.e. a graph, as well as the core search components: cost function and heuristic function. Both are designed to take into account all the available information of the game state. We extensively evaluate the suggested approach in simulation comparing it to a range of baselines. The result of the conducted evaluation clearly shows the benefit of utilizing heuristic search within the RoboCup context.

preprint2020arXiv

Revisiting Bounded-Suboptimal Safe Interval Path Planning

Safe-interval path planning (SIPP) is a powerful algorithm for finding a path in the presence of dynamic obstacles. SIPP returns provably optimal solutions. However, in many practical applications of SIPP such as path planning for robots, one would like to trade-off optimality for shorter planning time. In this paper we explore different ways to build a bounded-suboptimal SIPP and discuss their pros and cons. We compare the different bounded-suboptimal versions of SIPP experimentally. While there is no universal winner, the results provide insights into when each method should be used.

preprint2019arXiv

Prioritized Multi-agent Path Finding for Differential Drive Robots

Methods for centralized planning of the collision-free trajectories for a fleet of mobile robots typically solve the discretized version of the problem and rely on numerous simplifying assumptions, e.g. moves of uniform duration, cardinal only translations, equal speed and size of the robots etc., thus the resultant plans can not always be directly executed by the real robotic systems. To mitigate this issue we suggest a set of modifications to the prominent prioritized planner -- AA-SIPP(m) -- aimed at lifting the most restrictive assumptions (syncronized translation only moves, equal size and speed of the robots) and at providing robustness to the solutions. We evaluate the suggested algorithm in simulation and on differential drive robots in typical lab environment (indoor polygon with external video-based navigation system). The results of the evaluation provide a clear evidence that the algorithm scales well to large number of robots (up to hundreds in simulation) and is able to produce solutions that are safely executed by the robots prone to imperfect trajectory following. The video of the experiments can be found at https://youtu.be/Fer_irn4BG0.