Researcher profile

Yanyan Li

Yanyan Li contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
14 works
0 followers
9 topics
4 close collaborators

Published work

14 published items

preprint | 2025 | arXiv

Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations

Recent advances in large language models (LLMs) have revolutionized mobile robots, including unmanned aerial vehicles (UAVs), enabling their intelligent operation within Internet of Things (IoT) ecosystems. However, LLMs still face challenges in logical reasoning and complex decision-making, raising concerns about the reliability of LLM-driven UAV operations in IoT applications. In this paper, we propose a closed-loop LLM-driven UAV operation code generation framework that enables reliable UAV operations through effective feedback and refinement using two LLM modules, i.e., a Code Generator and an Evaluator. Our framework transforms numerical state observations from UAV operations into semantic trajectory descriptions to enhance the evaluator LLM's understanding of UAV dynamics for precise feedback generation. Our framework also enables a simulation-based refinement process, and hence eliminates the risks to physical UAVs caused by incorrect code execution during refinement. Extensive experiments on UAV control tasks of varying complexity are conducted. The experimental results show that our framework achieves reliable UAV operations using LLMs and significantly outperforms baseline methods in success rate and completeness as task complexity increases.
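
The idea of turning numerical state observations into semantic trajectory descriptions can be illustrated with a minimal sketch. Nothing below is taken from the paper: the function name `describe_trajectory`, the `(x, y, z)` state layout, the axis convention (x forward, y left, z up) and the threshold `eps` are all assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical state: (x, y, z) position in metres.
# Assumed axis convention: +x forward, +y left, +z up.
State = Tuple[float, float, float]


def describe_trajectory(states: List[State], eps: float = 0.05) -> List[str]:
    """Turn consecutive position samples into short textual motion descriptions."""
    sentences = []
    for k in range(1, len(states)):
        # Displacement between two consecutive samples.
        dx, dy, dz = (b - a for a, b in zip(states[k - 1], states[k]))
        parts = []
        for delta, pos, neg in (
            (dx, "forward", "backward"),
            (dy, "left", "right"),
            (dz, "up", "down"),
        ):
            # Ignore displacements below the noise threshold eps.
            if abs(delta) > eps:
                parts.append(f"{abs(delta):.1f} m {pos if delta > 0 else neg}")
        body = "moved " + ", ".join(parts) if parts else "hovered in place"
        sentences.append(f"Step {k}: {body}")
    return sentences


states = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.5), (2.0, 0.0, 1.5), (2.0, 0.0, 1.5)]
for line in describe_trajectory(states):
    print(line)
```

Text of this kind could then be passed to an evaluator model in place of raw numbers; how the paper actually builds its descriptions is not specified in the abstract.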

preprint | 2023 | arXiv

Towards Table-to-Text Generation with Pretrained Language Model: A Table Structure Understanding and Text Deliberating Approach

Although remarkable progress has been made on neural table-to-text methods, generalization issues hinder the applicability of these models due to the limited source tables. Large-scale pretrained language models appear to be a promising solution to such issues. However, how to effectively bridge the gap between the structured table and the text input by fully leveraging table information to fuel the pretrained model is still not well explored. In addition, integrating the deliberation mechanism into the text-to-text pretrained model for the table-to-text task has seldom been studied. In this paper, to implement table-to-text generation with a pretrained language model, we propose a table structure understanding and text deliberating approach, namely TASD. Specifically, we devise a three-layered multi-head attention network to realize the table-structure-aware text generation model with the help of the pretrained language model. Furthermore, a multi-pass decoder framework is adopted to enhance the capability of polishing generated text for table descriptions. Empirical studies, as well as human evaluation, on two public datasets validate that our approach can generate faithful and fluent descriptive texts for different types of tables.

preprint | 2022 | arXiv

E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs

Minimal solutions for relative rotation and translation estimation tasks have been explored in different scenarios, typically relying on the so-called co-visibility graph. However, how to build direct rotation relationships between two frames without overlap is still an open topic, which, if solved, could greatly improve the accuracy of visual odometry. In this paper, a new minimal solution is proposed to solve relative rotation estimation between two images without overlapping areas by exploiting a new graph structure, which we call Extensibility Graph (E-Graph). Unlike a co-visibility graph, our E-Graph stores high-level landmarks, including vanishing directions and plane normals, which are geometrically extensible. Based on E-Graph, the rotation estimation problem becomes simpler and more elegant, as it can deal with pure rotational motion and requires fewer assumptions, e.g. Manhattan/Atlanta World, planar/vertical motion. Finally, we embed our rotation estimation strategy into a complete camera tracking and mapping system which obtains 6-DoF camera poses and a dense 3D mesh model. Extensive experiments on public benchmarks demonstrate that the proposed method achieves state-of-the-art tracking performance.

preprint | 2022 | arXiv

Gradient estimates for the insulated conductivity problem: the non-umbilical case

We study the insulated conductivity problem with inclusions embedded in a bounded domain in $\mathbb R^n$, for $n \ge 3$. The gradient of solutions may blow up as $\varepsilon$, the distance between inclusions, approaches $0$. In a recent paper, we established optimal gradient estimates for a class of inclusions including balls. In this paper, we prove such gradient estimates for general strictly convex inclusions. Unlike the perfect conductivity problem, the estimates depend on the principal curvatures of the inclusions, and we show that these estimates are characterized by the first non-zero eigenvalue of a divergence form elliptic operator on $\mathbb S^{n-2}$.

preprint | 2022 | arXiv

Optimal gradient estimates of solutions to the insulated conductivity problem in dimension greater than two

We study the insulated conductivity problem with inclusions embedded in a bounded domain in $\mathbb{R}^n$. The gradient of solutions may blow up as $\varepsilon$, the distance between inclusions, approaches $0$. It was known that the optimal blow-up rate in dimension $n = 2$ is of order $\varepsilon^{-1/2}$. It has recently been proved that in dimensions $n \ge 3$, an upper bound of the gradient is of order $\varepsilon^{-1/2 + \beta}$ for some $\beta > 0$. On the other hand, optimal values of $\beta$ have not been identified. In this paper, we prove that when the inclusions are balls, the optimal value of $\beta$ is $[-(n-1)+\sqrt{(n-1)^2+4(n-2)}~]/4 \in (0,1/2)$ in dimensions $n \ge 3$.
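
As a quick numerical illustration, the closed-form value stated in the abstract can be evaluated for a few dimensions. The function name `optimal_beta` is an assumption introduced here; the code only evaluates the stated expression and does not reproduce any part of the proof.

```python
import math


def optimal_beta(n: int) -> float:
    """Evaluate [-(n-1) + sqrt((n-1)^2 + 4(n-2))] / 4 for a dimension n >= 3."""
    if n < 3:
        raise ValueError("the formula is stated for dimensions n >= 3")
    return (-(n - 1) + math.sqrt((n - 1) ** 2 + 4 * (n - 2))) / 4


for n in (3, 4, 5, 10, 100):
    beta = optimal_beta(n)
    # The gradient bound quoted in the abstract is of order eps^(-1/2 + beta).
    print(f"n={n:3d}  beta={beta:.4f}  exponent={-0.5 + beta:.4f}")
```

For $n = 3$ the expression simplifies to $(\sqrt{2}-1)/2 \approx 0.2071$. Since $(n-1)^2 + 4(n-2) < (n+1)^2$, the value stays below $1/2$ for every $n$, consistent with the interval $(0, 1/2)$ given in the abstract.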

preprint | 2021 | arXiv

A Survey on Amazon Alexa Attack Surfaces

Since its launch in 2014, Alexa, Amazon's versatile cloud-based voice service, has become active in over 100 million households worldwide. Alexa's user-friendly, personalized vocal experience offers customers a more natural way of interacting with cutting-edge technology by letting them dictate commands directly to the assistant. At the time of writing, the Alexa service is more accessible than ever, available on hundreds of millions of devices from not only Amazon but also third-party device manufacturers. Unfortunately, that success has also been a source of concern and controversy. The success of Alexa is based on its effortless usability, but that, in turn, has led to a lack of sufficient security. This paper surveys various attacks against the Amazon Alexa ecosystem, including attacks against the frontend voice capturing and the cloud backend voice command recognition and processing. Overall, we have identified six attack surfaces covering the lifecycle of Alexa voice interaction, which spans several stages including voice data collection, transmission, processing and storage. We also discuss potential mitigation solutions for each attack surface to improve the security and privacy of Alexa and other voice assistants.

preprint | 2021 | arXiv

Diffeomorphic Image Registration with An Optimal Control Relaxation and Its Implementation

Image registration plays an important role in image processing problems, especially in medical imaging applications. It is well known that when the deformation is large, many variational models cannot ensure diffeomorphism. In this paper, we propose a new registration model based on an optimal control relaxation constraint for large deformation images, which can theoretically guarantee that the registration mapping is diffeomorphic. We present an analysis of optimal control relaxation for indirectly seeking the diffeomorphic transformation of the Jacobian determinant equation and its registration applications, including the construction of diffeomorphic transformation as a special space. We also provide an existence result for the control increment optimization problem in the proposed diffeomorphic image registration model with an optimal control relaxation. Furthermore, a fast iterative scheme based on the augmented Lagrangian multipliers method (ALMM) is analyzed to solve the control increment optimization problem, followed by a convergence analysis. Finally, a grid unfolding indicator is given, and a robust algorithm using deformation correction and a backtracking strategy is proposed to guarantee that the solution is diffeomorphic. Numerical experiments show that the proposed registration model not only obtains a diffeomorphic mapping when the deformation is large, but also achieves state-of-the-art performance in quantitative evaluations in comparison with other classical models.
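
A common way to check whether a 2D deformation folds the grid is to test the sign of its Jacobian determinant at every grid point. The sketch below shows that generic check with forward differences; it is not the grid unfolding indicator defined in the paper, and the names `jacobian_determinants`, `is_unfolded`, `smooth` and `folding` are assumptions introduced for illustration.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def jacobian_determinants(
    phi: Callable[[float, float], Point], n: int, h: float
) -> List[float]:
    """Forward-difference Jacobian determinant of phi on an n x n grid of spacing h."""
    dets = []
    for i in range(n):
        for j in range(n):
            x, y = i * h, j * h
            p1, p2 = phi(x, y)
            p1_x, p2_x = phi(x + h, y)  # phi evaluated one step along x
            p1_y, p2_y = phi(x, y + h)  # phi evaluated one step along y
            a = (p1_x - p1) / h  # d(phi_1)/dx
            b = (p1_y - p1) / h  # d(phi_1)/dy
            c = (p2_x - p2) / h  # d(phi_2)/dx
            d = (p2_y - p2) / h  # d(phi_2)/dy
            dets.append(a * d - b * c)
    return dets


def is_unfolded(dets: List[float]) -> bool:
    """True when every sampled determinant is strictly positive (no folding detected)."""
    return min(dets) > 0.0


def smooth(x: float, y: float) -> Point:
    # Small perturbation of the identity map: should not fold.
    return (x + 0.05 * math.sin(2 * math.pi * y), y + 0.05 * math.sin(2 * math.pi * x))


def folding(x: float, y: float) -> Point:
    # Large perturbation along x: the map reverses direction near x = 0.5.
    return (x + 0.5 * math.sin(2 * math.pi * x), y)


print(is_unfolded(jacobian_determinants(smooth, 20, 0.05)))
print(is_unfolded(jacobian_determinants(folding, 20, 0.05)))
```

A strictly positive determinant on the sampled grid is only a discrete proxy for a diffeomorphic mapping; a finer grid or a different difference scheme may be needed in practice.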

preprint | 2020 | arXiv

Solutions to the $\sigma_k$-Loewner-Nirenberg problem on annuli are locally Lipschitz and not differentiable

We show for $k \geq 2$ that the locally Lipschitz viscosity solution to the $\sigma_k$-Loewner-Nirenberg problem on a given annulus $\{a < |x| < b\}$ is $C^{1,\frac{1}{k}}_{\rm loc}$ in each of $\{a < |x| \leq \sqrt{ab}\}$ and $\{\sqrt{ab} \leq |x| < b\}$ and has a jump in radial derivative across $|x| = \sqrt{ab}$. Furthermore, the solution is not $C^{1,\gamma}_{\rm loc}$ for any $\gamma > \frac{1}{k}$. Optimal regularity for solutions to the $\sigma_k$-Yamabe problem on annuli with finite constant boundary values is also established.

preprint | 2020 | arXiv

Structure-SLAM: Low-Drift Monocular SLAM in Indoor Environments

In this paper, a low-drift monocular SLAM method is proposed for indoor scenarios, where monocular SLAM often fails due to the lack of textured surfaces. Our approach decouples rotation and translation estimation in the tracking process to reduce long-term drift in indoor environments. To take full advantage of the geometric information available in the scene, surface normals are predicted by a convolutional neural network from each input RGB image in real time. First, a drift-free rotation is estimated based on lines and surface normals using spherical mean-shift clustering, leveraging the weak Manhattan World assumption. Then translation is computed from point and line features. Finally, the estimated poses are refined with a map-to-frame optimization strategy. The proposed method outperforms the state of the art on common SLAM benchmarks such as ICL-NUIM and TUM RGB-D.