Source author record

Yi-Fan Zhang

Yi-Fan Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision, Artificial Intelligence, astro-ph.CO, Computation and Language, hep-ph, physics.plasm-ph

Catalog footprint

What is connected

6 works

6 topics

4 close collaborators

preprint, 2026, arXiv

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos

Recent video generative models have greatly improved the realism of AI-generated videos, yet their outputs still exhibit artifacts such as temporal inconsistencies, structural distortions, and semantic incoherence. While Multimodal Large Language Models (MLLMs) show strong visual understanding capabilities, their ability to perceive and reason about such artifacts remains unclear. Existing benchmarks often lack systematic evaluation of artifact-aware perception and fine-grained diagnostic reasoning, especially across diverse AI-generated video domains beyond photorealistic content. To address this gap, we introduce Artifact-Bench, a comprehensive benchmark for evaluating MLLMs on AI-generated video artifact detection and analysis. We first establish a three-level hierarchical taxonomy of realism artifacts, covering photorealistic, animated, and CG-style videos. Based on this taxonomy, Artifact-Bench defines three complementary tasks: real vs. AI-generated video classification, pairwise realism comparison, and fine-grained artifact identification. Experiments on 19 leading MLLMs reveal substantial limitations in artifact perception and reasoning, with many models approaching random or even below-random performance in challenging settings. We further observe significant misalignment between MLLM judgments and human perceptual preferences, highlighting their limited reliability as general evaluators for AI-generated video realism.

preprint, 2026, arXiv

Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling

Recent image editing models have achieved remarkable progress in instruction following, multimodal understanding, and complex visual editing. However, existing benchmarks often fail to faithfully reflect human judgment, especially for strong frontier models, due to limited task difficulty and coarse-grained evaluation protocols. In parallel, reward models have become increasingly important for RL-based image editing optimization, yet existing reward model benchmarks still rely on unrealistic evaluation settings that deviate from practical RL scenarios. These limitations hinder reliable assessment of both image editing models and reward models. To address these challenges, we introduce Edit-Compass and EditReward-Compass, a unified evaluation suite for image editing and reward modeling. Edit-Compass contains 2,388 carefully annotated instances spanning six progressively challenging task categories, covering capabilities such as world knowledge reasoning, visual reasoning, and multi-image editing. Beyond broad task coverage, Edit-Compass adopts a fine-grained multidimensional evaluation framework based on structured reasoning and carefully designed scoring rubrics. In parallel, EditReward-Compass contains 2,251 preference pairs that simulate realistic reward modeling scenarios during RL optimization.

preprint, 2026, arXiv

The effect of surface quenching coefficients of $O_2(a^{1}\Delta_g)$ and $O_2(b^{1}\Sigma_g^{+})$ on capacitively coupled $Ar$/$O_2$ discharge: A global/equivalent circuit model study

Capacitively coupled discharges operated in mixtures of $Ar$ and $O_2$ are extensively utilized in plasma etching and deposition processes due to the oxidative properties and precursor functionality of the reactive species produced in the discharge. In $Ar$/$O_2$ discharges, the surface quenching coefficient of $O_2(a^{1}\Delta_g)$ is known to affect the density of this metastable, which, in turn, affects the electronegativity and other important plasma characteristics. In this work, in addition to $O_2(a^{1}\Delta_g)$, $O_2(b^{1}\Sigma_g^{+})$ and its associated reactions are incorporated into a global/equivalent circuit model of an $Ar$/$O_2$ discharge. When the quenching coefficients of the two metastable species are adjusted independently, changes in these surface coefficients are found to significantly affect the discharge characteristics, indicating that the role of $O_2(b^{1}\Sigma_g^{+})$ cannot be neglected. The effects of the surface quenching coefficients of these metastables on the discharge are examined for various wall materials, including their effects on particle species densities, plasma impedance, voltage drops across the sheaths, and plasma power absorption.

preprint, 2022, arXiv

Focal and Efficient IOU Loss for Accurate Bounding Box Regression

In object detection, bounding box regression (BBR) is a crucial step that determines the object localization performance. However, we find that most previous loss functions for BBR have two main drawbacks: (i) Both $\ell_n$-norm and IOU-based loss functions are inefficient to depict the objective of BBR, which leads to slow convergence and inaccurate regression results. (ii) Most of the loss functions ignore the imbalance problem in BBR that the large number of anchor boxes which have small overlaps with the target boxes contribute most to the optimization of BBR. To mitigate the adverse effects caused thereby, we perform thorough studies to exploit the potential of BBR losses in this paper. Firstly, an Efficient Intersection over Union (EIOU) loss is proposed, which explicitly measures the discrepancies of three geometric factors in BBR, i.e., the overlap area, the central point and the side length. After that, we state the Effective Example Mining (EEM) problem and propose a regression version of focal loss to make the regression process focus on high-quality anchor boxes. Finally, the above two parts are combined to obtain a new loss function, namely Focal-EIOU loss. Extensive experiments on both synthetic and real datasets are performed. Notable superiorities on both the convergence speed and the localization accuracy can be achieved over other BBR losses.
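
The three geometric factors named in this abstract (overlap area, central point, side length) can be illustrated with a minimal pure-Python sketch. The abstract does not state the formula, so the code assumes the commonly cited form of the loss: $1 - \mathrm{IoU}$ plus squared center, width and height discrepancies, each normalized by the smallest enclosing box, with a focal weight of $\mathrm{IoU}^{\gamma}$. The function names, the `(x1, y1, x2, y2)` box layout and the default `gamma=0.5` are illustrative assumptions, not the authors' reference implementation.

```python
def eiou_loss(pred, target, eps=1e-9):
    """Return (loss, iou) for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed form: 1 - IoU + center term + width term + height term,
    each distance term normalized by the smallest enclosing box.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Overlap area factor: intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Central point factor: squared center distance over squared diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Side length factor: squared width and height differences.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term, iou


def focal_eiou_loss(pred, target, gamma=0.5):
    """Weight the EIOU loss by IoU**gamma (gamma is an assumed default)."""
    loss, iou = eiou_loss(pred, target)
    return (iou ** gamma) * loss


if __name__ == "__main__":
    print(eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
    print(focal_eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Under this weighting, a prediction with no overlap with its target receives zero weight, which is one way to make regression focus on high-quality anchor boxes as the abstract describes.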

preprint, 2022, arXiv

Vision-Language Intelligence: Tasks, Representation Learning, and Large Models

This paper presents a comprehensive survey of vision-language (VL) intelligence from the perspective of time. This survey is inspired by the remarkable progress in both computer vision and natural language processing, and recent trends shifting from single modality processing to multiple modality comprehension. We summarize the development in this field into three time periods, namely task-specific methods, vision-language pre-training (VLP) methods, and larger models empowered by large-scale weakly-labeled data. We first take some common VL tasks as examples to introduce the development of task-specific methods. Then we focus on VLP methods and comprehensively review key components of the model structures and training methods. After that, we show how recent work utilizes large-scale raw image-text data to learn language-aligned visual representations that generalize better on zero or few shot learning tasks. Finally, we discuss some potential future trends towards modality cooperation, unified representation, and knowledge incorporation. We believe that this review will be of help for researchers and practitioners of AI and ML, especially those interested in computer vision and natural language processing.

preprint, 2015, arXiv

Leptogenesis scenarios for natural SUSY with mixed axion-higgsino dark matter

Supersymmetric models with radiatively-driven electroweak naturalness require light higgsinos of mass $\sim 100$-$300$ GeV. Naturalness in the QCD sector is invoked via the Peccei-Quinn (PQ) axion leading to mixed axion-higgsino dark matter. The SUSY DFSZ axion model provides a solution to the SUSY $\mu$ problem and the Little Hierarchy $\mu \ll m_{3/2}$ may emerge as a consequence of a mismatch between PQ and hidden sector mass scales. The traditional gravitino problem is now augmented by the axino and saxion problems, since these latter particles can also contribute to overproduction of WIMPs or dark radiation, or violation of BBN constraints. We compute regions of the $T_R$ vs. $m_{3/2}$ plane allowed by BBN, dark matter and dark radiation constraints for various PQ scale choices $f_a$. These regions are compared to the values needed for thermal leptogenesis, non-thermal leptogenesis, oscillating sneutrino leptogenesis and Affleck-Dine leptogenesis. The latter three are allowed in wide regions of parameter space for PQ scale $f_a \sim 10^{10}$-$10^{12}$ GeV which is also favored by naturalness: $f_a \sim \sqrt{\mu M_P/\lambda_\mu} \sim 10^{10}$-$10^{12}$ GeV. These $f_a$ values correspond to axion masses somewhat above the projected ADMX search regions.
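
As a rough order-of-magnitude check of the naturalness relation quoted at the end of this abstract, the sketch below evaluates $f_a \sim \sqrt{\mu M_P/\lambda_\mu}$ for a higgsino mass parameter inside the stated 100-300 GeV window. The reduced Planck mass value $M_P \approx 2.4 \times 10^{18}$ GeV, the sample $\mu = 150$ GeV, the scanned values of $\lambda_\mu$ and the function name are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math

# Assumed reduced Planck mass in GeV; the abstract does not state which
# convention M_P refers to.
M_P_GEV = 2.4e18


def pq_scale_gev(mu_gev, lambda_mu, m_planck_gev=M_P_GEV):
    """Estimate f_a ~ sqrt(mu * M_P / lambda_mu), all masses in GeV."""
    return math.sqrt(mu_gev * m_planck_gev / lambda_mu)


if __name__ == "__main__":
    mu = 150.0  # assumed sample value inside the 100-300 GeV window
    for lambda_mu in (1.0, 1e-2, 1e-4):  # assumed illustrative couplings
        f_a = pq_scale_gev(mu, lambda_mu)
        print(f"lambda_mu = {lambda_mu:g}: f_a ~ {f_a:.2e} GeV")
```

Under these assumptions, couplings $\lambda_\mu$ between about 1 and $10^{-4}$ give $f_a$ between roughly $2 \times 10^{10}$ and $2 \times 10^{12}$ GeV, close to the range quoted in the abstract.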

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint