Source author record

Jiaxin Ai

Jiaxin Ai appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Computer Vision Software Engineering

Catalog footprint

What is connected

2works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2025arXiv

ProSoftArena: Benchmarking Hierarchical Capabilities of Multimodal Agents in Professional Software Environments

Multimodal agents are making rapid progress on general computer-use tasks, yet existing benchmarks remain largely confined to browsers and basic desktop applications, falling short in professional software workflows that dominate real-world scientific and industrial practice. To close this gap, we introduce ProSoftArena, a benchmark and platform specifically for evaluating multimodal agents in professional software environments. We establish the first capability hierarchy tailored to agent use of professional software and construct a benchmark of 436 realistic work and research tasks spanning 6 disciplines and 13 core professional applications. To ensure reliable and reproducible assessment, we build an executable real-computer environment with an execution-based evaluation framework and uniquely incorporate a human-in-the-loop evaluation paradigm. Extensive experiments show that even the best-performing agent attains only a 24.4\% success rate on L2 tasks and completely fails on L3 multi-software workflow. In-depth analysis further provides valuable insights for addressing current agent limitations and more effective design principles, paving the way to build more capable agents in professional software settings. This project is available at: https://prosoftarena.github.io.

preprint2022arXiv

Deepfake Face Traceability with Disentangling Reversing Network

Deepfake face not only violates the privacy of personal identity, but also confuses the public and causes huge social harm. The current deepfake detection only stays at the level of distinguishing true and false, and cannot trace the original genuine face corresponding to the fake face, that is, it does not have the ability to trace the source of evidence. The deepfake countermeasure technology for judicial forensics urgently calls for deepfake traceability. This paper pioneers an interesting question about face deepfake, active forensics that "know it and how it happened". Given that deepfake faces do not completely discard the features of original faces, especially facial expressions and poses, we argue that original faces can be approximately speculated from their deepfake counterparts. Correspondingly, we design a disentangling reversing network that decouples latent space features of deepfake faces under the supervision of fake-original face pair samples to infer original faces in reverse.