Researcher profile

Maria Gorlatova

Maria Gorlatova contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2026arXiv

Can a Unimodal Language Agent Provide Preferences to Tune a Multimodal Vision-Language Model?

To explore a more scalable path for adding multimodal capabilities to existing LLMs, this paper addresses a fundamental question: Can a unimodal LLM, relying solely on text, reason about its own informational needs and provide effective feedback to optimize a multimodal model? To answer this, we propose a method that enables a language agent to give feedback to a vision-language model (VLM) to adapt text generation to the agent's preferences. Our results from different experiments affirm this hypothesis, showing that LLM preference feedback significantly enhances VLM descriptions. Using our proposed method, we find that the VLM can generate multimodal scene descriptions to help the LLM better understand multimodal context, leading to improvements of maximum 13% in absolute accuracy compared to the baseline multimodal approach. Furthermore, a human study validated our AI-driven feedback, showing a 64.6% preference alignment rate between the LLM's choices and human judgments. Extensive experiments provide insights on how and why the method works and its limitations.

preprint2023arXiv

AdaptSLAM: Edge-Assisted Adaptive SLAM with Resource Constraints via Uncertainty Minimization

Edge computing is increasingly proposed as a solution for reducing resource consumption of mobile devices running simultaneous localization and mapping (SLAM) algorithms, with most edge-assisted SLAM systems assuming the communication resources between the mobile device and the edge server to be unlimited, or relying on heuristics to choose the information to be transmitted to the edge. This paper presents AdaptSLAM, an edge-assisted visual (V) and visual-inertial (VI) SLAM system that adapts to the available communication and computation resources, based on a theoretically grounded method we developed to select the subset of keyframes (the representative frames) for constructing the best local and global maps in the mobile device and the edge server under resource constraints. We implemented AdaptSLAM to work with the state-of-the-art open-source V- and VI-SLAM ORB-SLAM3 framework, and demonstrated that, under constrained network bandwidth, AdaptSLAM reduces the tracking error by 62% compared to the best baseline method.

preprint2022arXiv

VR Viewport Pose Model for Quantifying and Exploiting Frame Correlations

The importance of the dynamics of the viewport pose, i.e., the location and the orientation of users' points of view, for virtual reality (VR) experiences calls for the development of VR viewport pose models. In this paper, informed by our experimental measurements of viewport trajectories across 3 different types of VR interfaces, we first develop a statistical model of viewport poses in VR environments. Based on the developed model, we examine the correlations between pixels in VR frames that correspond to different viewport poses, and obtain an analytical expression for the visibility similarity (ViS) of the pixels across different VR frames. We then propose a lightweight ViS-based ALG-ViS algorithm that adaptively splits VR frames into the background and the foreground, reusing the background across different frames. Our implementation of ALG-ViS in two Oculus Quest 2 rendering systems demonstrates ALG-ViS running in real time, supporting the full VR frame rate, and outperforming baselines on measures of frame quality and bandwidth consumption.