Researcher profile

Dmitry Yudin

Dmitry Yudin contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
10 works
0 followers
8 topics
4 close collaborators

Published work

10 published items

preprint, 2026, arXiv

SceneGraphVLM: Dynamic Scene Graph Generation from Video with Vision-Language Models

Scene graph generation provides a compact structured representation for visual perception, but accurate and fast graph prediction from images and videos remains challenging. Recent VLM-based methods can generate scene graphs end-to-end as structured text, yet often produce long outputs with irrelevant objects and relations. We present SceneGraphVLM, a compact method for image and video scene graph generation with small visual language models. SceneGraphVLM serializes graphs in a token-efficient TOON format and trains the model in two stages: supervised fine-tuning followed by reinforcement learning with hallucination-aware rewards that balance relation coverage and precision while penalizing unsupported objects and relations. For videos, the model can optionally condition each frame on the previously generated graph, providing lightweight short-term context without tracking or post-processing. We evaluate SceneGraphVLM on PSG, PVSG, and Action Genome. With compact VLMs and vLLM-accelerated decoding, SceneGraphVLM achieves a strong quality-speed trade-off, improves precision-oriented SGG metrics while preserving reasonable recall, and generates complete scene graphs with approximately one-second latency. Code and implementation details are available at: https://github.com/markus0440/SceneGraphVLM.git.
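The abstract above describes a reward that balances relation coverage and precision while penalizing unsupported objects. A minimal sketch of that idea follows. It is not the SceneGraphVLM reward: the function name `triplet_reward`, the F-beta form, the `beta` and `halluc_penalty` values, and the exact-match comparison of triplets are all assumptions made for illustration.

```python
def triplet_reward(pred, gold, beta=0.5, halluc_penalty=0.5):
    """Score predicted (subject, predicate, object) triplets against a reference.

    Hypothetical reward: an F-beta score over exact triplet matches
    (beta < 1 weights precision more than recall), minus a penalty
    proportional to the share of predicted objects absent from the reference.
    """
    pred_set, gold_set = set(pred), set(gold)
    if not pred_set:
        return 0.0  # an empty graph earns nothing

    matched = pred_set & gold_set
    precision = len(matched) / len(pred_set)
    recall = len(matched) / len(gold_set) if gold_set else 0.0

    if precision + recall == 0.0:
        f_beta = 0.0
    else:
        b2 = beta ** 2
        f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)

    # Objects named in the prediction but never in the reference count
    # as hallucinated in this sketch.
    gold_objects = {e for s, _, o in gold_set for e in (s, o)}
    pred_objects = {e for s, _, o in pred_set for e in (s, o)}
    halluc_rate = len(pred_objects - gold_objects) / len(pred_objects)

    return f_beta - halluc_penalty * halluc_rate


gold = [("person", "holding", "cup"), ("cup", "on", "table")]
exact = triplet_reward(gold, gold)
noisy = triplet_reward([("person", "holding", "cup"), ("dog", "on", "table")], gold)
print(exact, noisy)
```

In this toy case the prediction containing the unsupported object "dog" scores lower than the exact prediction, because it loses both precision and the hallucination term.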

preprint, 2023, arXiv

Rethinking Voxelization and Classification for 3D Object Detection

The main challenge in 3D object detection from LiDAR point clouds is achieving real-time performance without affecting the reliability of the network. In other words, the detection network must be confident enough about its predictions. In this paper, we present a solution that improves network inference speed and precision at the same time by implementing a fast dynamic voxelizer that works on fast pillar-based models in the same way a voxelizer works on slow voxel-based models. In addition, we propose a lightweight detection sub-head model that classifies predicted objects and filters out falsely detected objects, which significantly improves model precision at negligible time and computing cost. The developed code is publicly available at: https://github.com/YoushaaMurhij/RVCDet.
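The general idea of dynamic voxelization on a pillar grid can be sketched as follows: every point is assigned to the pillar it falls in, with no fixed cap on points per pillar and no pre-allocated pillar count. This is an illustrative sketch only, not the RVCDet implementation; the function name `dynamic_pillarize`, the grid parameters, and the mean-pooled pillar feature are assumptions.

```python
from collections import defaultdict


def dynamic_pillarize(points, pillar_size=0.5, x_min=0.0, y_min=0.0):
    """Group (x, y, z) points into vertical pillars and mean-pool each one.

    Hypothetical helper: pillars are created only where points exist, and
    no point is dropped, which is the property usually meant by "dynamic".
    """
    pillars = defaultdict(list)
    for x, y, z in points:
        # Integer grid index of the pillar containing this point.
        key = (int((x - x_min) // pillar_size), int((y - y_min) // pillar_size))
        pillars[key].append((x, y, z))

    # One feature per occupied pillar: the centroid of its points.
    features = {}
    for key, pts in pillars.items():
        n = len(pts)
        features[key] = tuple(sum(p[i] for p in pts) / n for i in range(3))
    return features


cloud = [(0.1, 0.1, 1.0), (0.3, 0.2, 3.0), (1.2, 0.1, 0.5)]
print(dynamic_pillarize(cloud))
```

A real detector would run this grouping on the GPU with scatter operations and learned per-point features; the dictionary here only makes the grouping step visible.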

preprint, 2022, arXiv

HPointLoc: Point-based Indoor Place Recognition using Synthetic RGB-D Images

We present a novel dataset named HPointLoc, specially designed for exploring the capabilities of visual place recognition in indoor environments and loop detection in simultaneous localization and mapping. The loop detection sub-task is especially relevant when a robot with an on-board RGB-D camera can drive past the same place ("Point") at different angles. The dataset is based on the popular Habitat simulator, in which it is possible to generate photorealistic indoor scenes using both one's own sensor data and open datasets, such as Matterport3D. To study the main stages of solving the place recognition problem on the HPointLoc dataset, we propose a new modular approach named PNTR. It first performs image retrieval with the Patch-NetVLAD method, then extracts keypoints and matches them using R2D2, LoFTR or SuperPoint with SuperGlue, and finally performs a camera pose optimization step with TEASER++. Such a solution to the place recognition problem has not been previously studied in existing publications. The PNTR approach has shown the best quality metrics on the HPointLoc dataset and has a high potential for real use in localization systems for unmanned vehicles. The proposed dataset and framework are publicly available: https://github.com/metra4ok/HPointLoc.

preprint, 2022, arXiv

Quantum-machine-learning channel discrimination

In the problem of quantum channel discrimination, one distinguishes between a given number of quantum channels, which is done by sending an input state through a channel and measuring the output state. This work studies applications of variational quantum circuits and machine learning techniques for discriminating such channels. In particular, we explore (i) the practical implementation of embedding this task into the framework of variational quantum computing, (ii) training a quantum classifier based on variational quantum circuits, and (iii) applying the quantum kernel estimation technique. For testing these three channel discrimination approaches, we considered a pair of entanglement-breaking channels and the depolarizing channel with two different depolarization factors. For approach (i), we address solving the quantum channel discrimination problem using the widely discussed parallel and sequential strategies. We show the advantage of the latter in terms of better convergence with fewer quantum resources. Quantum channel discrimination with a variational quantum classifier (ii) allows one to operate even with random and mixed input states and simple variational circuits. The kernel-based classification approach (iii) is also found to be effective, as it allows one to discriminate depolarizing channels associated not just with fixed values of the depolarization factor, but with ranges of it. Additionally, we discovered that a simple modification of one of the commonly used kernels significantly increases the efficiency of this approach. Finally, our numerical findings reveal that the performance of variational methods of channel discrimination depends on the trace of the product of the output states. These findings demonstrate that quantum machine learning can be used to discriminate channels, such as those representing physical noise processes.
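As a textbook baseline for the depolarizing case mentioned above (not the variational method of the paper), the sketch below computes the single-shot Helstrom success probability and the trace of the product of the output states for two single-qubit depolarizing channels. It assumes the parametrization rho -> (1 - p) rho + p I/2, the fixed input |0><0|, equal priors, and no ancilla or entanglement, so it is not the optimal strategy in general. The function names are hypothetical.

```python
def depolarized_zero_state(p):
    """Diagonal of the output state when |0><0| passes through the channel.

    Assumed channel: rho -> (1 - p) * rho + p * I / 2, which maps |0><0|
    to diag(1 - p/2, p/2). Only the diagonal is stored because both
    outputs are diagonal in the same basis.
    """
    return (1.0 - p / 2.0, p / 2.0)


def helstrom_success(p1, p2):
    """Best single-shot success probability for equal priors.

    Helstrom bound: 1/2 + (1/4) * ||rho1 - rho2||_1. For commuting
    diagonal states the trace norm is the sum of absolute differences.
    """
    a, b = depolarized_zero_state(p1), depolarized_zero_state(p2)
    trace_norm = sum(abs(x - y) for x, y in zip(a, b))
    return 0.5 + 0.25 * trace_norm


def output_overlap(p1, p2):
    """Tr(rho1 rho2) for the two diagonal output states."""
    a, b = depolarized_zero_state(p1), depolarized_zero_state(p2)
    return sum(x * y for x, y in zip(a, b))


print(helstrom_success(0.0, 1.0), output_overlap(0.0, 1.0))
```

In this setting the success probability reduces to 1/2 + |p1 - p2| / 4, so channels with closer depolarization factors have a larger output overlap and are harder to tell apart.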

preprint, 2021, arXiv

On the origin of electron accumulation layer at clean InAs(111) surfaces

In this paper, we provide a comprehensive theoretical analysis of the electronic structure of InAs(111) surfaces, with special attention paid to the energy region close to the fundamental bandgap. Starting from the bulk electronic structure of InAs as calculated using the PBE functional with a Hubbard correction and spin-orbit coupling included, we obtain proper values for the bandgap, the split-off energy, and the effective electron, light-hole and heavy-hole masses, in full consistency with available experimental results. On the basis of optimized atomic surfaces we recover scanning tunneling microscopy images which, combined with accessible experimental data, make it possible to speculate on the formation of an electron accumulation layer for both As- and In-terminated InAs(111) surfaces. Moreover, these results are accompanied by band structure simulations of conduction band states.

preprint, 2020, arXiv

An ab initio perspective on scanning tunneling microscopy measurements of the tunable Kondo resonance of the TbPc$_2$ molecule on a gold substrate

With recent advances in the areas of nanostructure fabrication and molecular spintronics, the idea of using single-molecule magnets as building blocks for next-generation electronic devices becomes viable. A particular example is a metal-organic complex in which organic ligands surround a rare-earth element or transition metal. Recently, it was explicitly shown that the relative position of the ligands with respect to each other can be reversibly changed by an external voltage without any need for chemical modification of the sample. This opens a way to tune the Kondo effect electrically in such metal-organic complexes. In this work, we present a detailed and systematic analysis of this effect in TbPc$_2$ from an ab initio perspective and compare the obtained results with the existing experimental data.

preprint, 2020, arXiv

Localized surface electromagnetic waves in CrI$_3$-based magnetophotonic structures

Resulting from strong magnetic anisotropy, two-dimensional ferromagnetism was recently shown to be stabilized in chromium triiodide, CrI$_3$, in the monolayer limit. While its properties remain largely unexplored, it provides a unique material-specific platform to unveil its electromagnetic properties associated with coupling of modes. Indeed, trigonal symmetry in the presence of out-of-plane magnetization results in a non-trivial structure of the conductivity tensor, including the off-diagonal terms. In this paper, we study the surface electromagnetic waves localized in a CrI$_3$-based structure using the results of ab initio calculations for the CrI$_3$ conductivity tensor. In particular, we provide an estimate for the critical angle corresponding to surface plasmon polariton generation in the Kretschmann-Raether configuration by a detailed investigation of the reflectance spectrum as well as the magnetic field distribution for different CrI$_3$ layer thicknesses. We also study the bilayer structure formed by two CrI$_3$ layers separated by a SiO$_2$ spacer and show that the surface plasmon resonance can be achieved at the interface between CrI$_3$ and air depending on the spacer thickness.

preprint, 2020, arXiv

Oxygen Vacancy in ZnO-$w$ Phase: Pseudohybrid Hubbard Density Functional Study

The study of zinc oxide within the homogeneous electron gas approximation results in overhybridization of the zinc $3d$ shell with the oxygen $2p$ shell, a problem shown for most transition metal chalcogenides. This problem can be partially overcome by using the LDA+$U$ (or GGA+$U$) methodology. However, in contrast to the zinc $3d$ orbital, a Hubbard-type correction is typically excluded for the oxygen $2p$ orbital. In this work, we provide results of electronic structure calculations of an oxygen vacancy in a ZnO supercell from an ab initio perspective, with two Hubbard-type corrections, $U_{\mathrm{Zn}-3d}$ and $U_{\mathrm{O}-2p}$. The results of our numerical simulations clearly reveal that accounting for $U_{\mathrm{O}-2p}$ has a significant impact on the properties of bulk ZnO, in particular the relaxed lattice constants, the effective mass of charge carriers and the bandgap. For a set of validated values of $U_{\mathrm{Zn}-3d}$ and $U_{\mathrm{O}-2p}$ we demonstrate the appearance of a localized state associated with the oxygen vacancy positioned in the bandgap of the ZnO supercell. Our numerical findings suggest that the defect state is characterized by the highest overlap with the conduction band states as obtained in the calculations with no Hubbard-type correction included. We argue that the electronic density of the defect state is primarily determined by Zn atoms closest to the vacancy.

preprint, 2020, arXiv

Variational Quantum Eigensolver for Frustrated Quantum Systems

Hybrid quantum-classical algorithms have been proposed as a potentially viable application of quantum computers. A particular example - the variational quantum eigensolver, or VQE - is designed to determine a global minimum in an energy landscape specified by a quantum Hamiltonian, which makes it appealing for the needs of quantum chemistry. Experimental realizations have been reported in recent years and theoretical estimates of its efficiency are a subject of intense effort. Here we consider the performance of the VQE technique for a Hubbard-like model describing a one-dimensional chain of fermions with competing nearest- and next-nearest-neighbor interactions. We find that recovering the VQE solution allows one to obtain the correlation function of the ground state consistent with the exact result. We also study the barren plateau phenomenon for the Hamiltonian in question and find that the severity of this effect depends on the encoding of fermions to qubits. Our results are consistent with the current knowledge about the barren plateaus in quantum optimization.

preprint, 2019, arXiv

Experimental neural network enhanced quantum tomography

Quantum tomography is currently ubiquitous for testing any implementation of a quantum information processing device. Various sophisticated procedures for state and process reconstruction from measured data are well developed and benefit from precise knowledge of the model describing state preparation and the measurement apparatus. However, physical models suffer from intrinsic limitations, as actual measurement operators and trial states cannot be known precisely. This scenario inevitably leads to state-preparation-and-measurement (SPAM) errors degrading reconstruction performance. Here we develop and experimentally implement a machine-learning-based protocol reducing SPAM errors. We trained a supervised neural network to filter the experimental data and hence uncovered salient patterns that characterize the measurement probabilities for the original state and the ideal experimental apparatus free from SPAM errors. We compared the neural network state reconstruction protocol with a protocol treating SPAM errors by process tomography, as well as with a SPAM-agnostic protocol with idealized measurements. The average reconstruction fidelity is shown to be enhanced by 10% and 27%, respectively. The presented methods apply to the vast range of quantum experiments which rely on tomography.