Source author record

Yiran Xu

Yiran Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, math.AP, Artificial Intelligence, Computation and Language, Multimedia

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

preprint, 2026, arXiv

SIN-Bench: Tracing Native Evidence Chains in Long-Context Multimodal Scientific Interleaved Literature

Evaluating whether multimodal large language models truly understand long-form scientific papers remains challenging: answer-only metrics and synthetic "Needle-In-A-Haystack" tests often reward answer matching without requiring a causal, evidence-linked reasoning trace in the document. We propose the "Fish-in-the-Ocean" (FITO) paradigm, which requires models to construct explicit cross-modal evidence chains within native scientific documents. To operationalize FITO, we build SIN-Data, a scientific interleaved corpus that preserves the native interleaving of text and figures. On top of it, we construct SIN-Bench with four progressive tasks covering evidence discovery (SIN-Find), hypothesis verification (SIN-Verify), grounded QA (SIN-QA), and evidence-anchored synthesis (SIN-Summary). We further introduce "No Evidence, No Score", which scores predictions only when they are grounded in verifiable anchors and diagnoses evidence quality via matching, relevance, and logic. Experiments on eight MLLMs show that grounding is the primary bottleneck: Gemini-3-pro achieves the best average overall score (0.573), while GPT-5 attains the highest SIN-QA answer accuracy (0.767) but underperforms on evidence-aligned overall scores, exposing a gap between correctness and traceable support.
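The "No Evidence, No Score" idea can be illustrated with a minimal Python sketch. It is not the benchmark's actual scoring code: the `Prediction` fields, the `gated_score` function, and in particular the rule that combines the three evidence diagnostics (a plain average multiplied by answer correctness) are assumptions made for illustration. The only property taken from the abstract is that a prediction without verifiable anchors earns nothing.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    """Hypothetical container for one model prediction."""
    answer_correct: bool
    anchors: list[str] = field(default_factory=list)  # cited evidence ids (assumed format)
    matching: float = 0.0   # evidence diagnostics in [0, 1] (assumed scale)
    relevance: float = 0.0
    logic: float = 0.0


def gated_score(pred: Prediction, valid_anchors: set[str]) -> float:
    """Return 0 unless at least one cited anchor can be verified."""
    verified = [a for a in pred.anchors if a in valid_anchors]
    if not verified:
        return 0.0  # no evidence, no score
    # Assumed aggregation: average the three diagnostics.
    evidence_quality = (pred.matching + pred.relevance + pred.logic) / 3
    return float(pred.answer_correct) * evidence_quality


if __name__ == "__main__":
    document_anchors = {"fig2", "sec3.1"}
    ungrounded = Prediction(answer_correct=True)
    grounded = Prediction(True, ["fig2"], matching=1.0, relevance=1.0, logic=1.0)
    print(gated_score(ungrounded, document_anchors))
    print(gated_score(grounded, document_anchors))
```

Under this gating, a correct answer with no verifiable anchor scores the same as a wrong one, which mirrors the gap between answer accuracy and evidence-aligned scores that the abstract reports.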

preprint, 2022, arXiv

Nonlinear Landau damping for the 2d Vlasov-Poisson system with massless electrons around Penrose-stable equilibria

In this paper, we prove the nonlinear asymptotic stability of the Penrose-stable equilibria among solutions of the $2d$ Vlasov-Poisson system with massless electrons.

preprint, 2022, arXiv

Sharp estimates for screened Vlasov-Poisson system around Penrose-stable equilibria in $\mathbb{R}^d$, $d\geq 3$

In this paper, we study the asymptotic stability of Penrose-stable equilibria among solutions of the screened Vlasov-Poisson system in $\mathbb{R}^d$ with $d\geq 3$, a result first established by Bedrossian, Masmoudi, and Mouhot [JBedrossian2018] for smooth initial data. More precisely, we prove sharp decay estimates for the density of the perturbed system, matching those of free transport, for initial perturbations that are only Hölder continuous (i.e., $C^{a}$ with $0<a<1$). This improves the recent works of Han-Kwan, Nguyen, and Rousset [HanKwanD2021] for lower derivatives of the density and of T. Nguyen [NguyenTT2020] for higher derivatives with a logarithmic correction in time. To obtain this result, we establish new estimates and cancellations for the kernel of the linearized problem. We also prove the result for the Vlasov-Poisson system in which the electric field obeys a general nonlinear Poisson equation, which includes the massless electrons/ions case.

preprint, 2022, arXiv

Temporally Consistent Semantic Video Editing

Generative adversarial networks (GANs) have demonstrated impressive image generation quality and semantic editing capability on real images, e.g., changing object classes, modifying attributes, or transferring styles. However, applying these GAN-based edits to a video independently for each frame inevitably results in temporal flickering artifacts. We present a simple yet effective method to facilitate temporally coherent video editing. Our core idea is to minimize the temporal photometric inconsistency by optimizing both the latent code and the pre-trained generator. We evaluate the quality of our editing on different domains and GAN inversion techniques and show favorable results against the baselines.

preprint, 2020, arXiv

Explainable Object-induced Action Decision for Autonomous Vehicles

A new paradigm is proposed for autonomous driving. The new paradigm lies between the end-to-end and pipelined approaches, and is inspired by how humans solve the problem. While it relies on scene understanding, the latter only considers objects that could create hazards. These are denoted as action-inducing, since changes in their state should trigger vehicle actions. They also define a set of explanations for these actions, which should be produced jointly with the latter. An extension of the BDD100K dataset, annotated for a set of 4 actions and 21 explanations, is proposed. A new multi-task formulation of the problem, which optimizes the accuracy of both action commands and explanations, is then introduced. A CNN architecture is finally proposed to solve this problem, by combining reasoning about action-inducing objects and global scene context. Experimental results show that the requirement of explanations improves the recognition of action-inducing objects, which in turn leads to better action predictions.
