Researcher profile

Jing Yu

Jing Yu contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging. Verification L1. Unclaimed author.
16 works
0 followers
16 topics
4 close collaborators

Published work

16 published items

preprint, 2026, arXiv

APEX: Academic Poster Editing Agentic Expert

Designing academic posters is a labor-intensive process requiring a precise balance of high-density content and sophisticated layout. While existing paper-to-poster generation methods automate initial drafting, they are typically single-pass and non-interactive, and often fail to align with complex, subjective user intent. To bridge this gap, we propose APEX (Academic Poster Editing agentic eXpert), the first agentic framework for interactive academic poster editing, supporting fine-grained control with robust multi-level API-based editing and a review-and-adjustment mechanism. In addition, we introduce APEX-Bench, the first systematic benchmark comprising 514 academic poster editing instructions, categorized by a multi-dimensional taxonomy including operation type, difficulty, and abstraction level, constructed via reference-guided and reference-free strategies to ensure realism and diversity. We further establish a multi-dimensional VLM-as-a-judge evaluation protocol to assess instruction fulfillment, modification scope, and visual consistency and harmony. Experimental results demonstrate that APEX significantly outperforms baseline methods. Our implementation is available at https://github.com/Breesiu/APEX.

preprint, 2026, arXiv

CrossCult-KIBench: A Benchmark for Cross-Cultural Knowledge Insertion in MLLMs

Multimodal Large Language Models (MLLMs), trained primarily on English-centric data, frequently generate culturally inappropriate or misaligned responses in cross-cultural settings. To mitigate this, we introduce the task of cross-cultural knowledge insertion, which focuses on adapting models to specific cultural contexts while preserving their original behavior in other cultures. To facilitate research in this area, we introduce CrossCult-KIBench, a comprehensive evaluation benchmark for assessing both the effectiveness of knowledge insertion and its unintended side effects on non-target cultures. The benchmark includes 9,800 image-grounded cases covering 49 culturally relevant visual scenarios across English, Chinese, and Arabic language-culture groups. It supports evaluation in both single-insert and sequential-insert settings. We also propose Memory-Conditioned Knowledge Insertion (MCKI) as a baseline method. MCKI retrieves relevant cultural knowledge from an external memory using frozen MLLM representations, prepending matched entries as conditional prompts when applicable. Extensive experiments on CrossCult-KIBench reveal that current approaches struggle to balance effective cultural adaptation with behavioral preservation, highlighting a key challenge in developing culturally-aware MLLMs. Our work thus underscores an important research direction for developing more culturally adaptive and responsible MLLMs.

preprint, 2026, arXiv

On derived categories of module categories over multiring categories

Let $\mathcal{A}$ and $\mathcal{B}$ be subcategories of tensor categories $\mathcal{C}$ and $\mathcal{D}$, respectively, both of which are abelian categories with finitely many isomorphism classes of simple objects. We prove that if their derived categories $\mathbf{D}^b(\mathcal{A})$ and $\mathbf{D}^b(\mathcal{B})$ are left triangulated tensor ideals and are equivalent as triangulated $\mathbf{D}^b(\mathcal{C})$-module categories via an equivalence induced by a monoidal triangulated functor $F:\mathbf{D}^b(\mathcal{C})\rightarrow \mathbf{D}^b(\mathcal{D})$, then the original module categories $\mathcal{A}$ and $\mathcal{B}$ are themselves equivalent. We then apply this result to smash product algebras. Furthermore, the localization theory of module categories and triangulated module categories is investigated.

preprint, 2026, arXiv

SLVC-DIDA: Signature-less Verifiable Credential-based Issuer-hiding and Multi-party Authentication for Decentralized Identity

As an emerging paradigm in digital identity, Decentralized Identity (DID) offers advantages over traditional identity management methods in a variety of aspects, e.g., enhancing user-centric online services and ensuring complete user autonomy and control. Verifiable Credential (VC) techniques are used to facilitate decentralized DID-based access control across multiple entities. However, existing DID schemes generally rely on a distributed public key infrastructure that also causes challenges, such as context information deduction, key exposure, and issuer data leakage. To address these issues, this paper proposes, for the first time, an issuer-hiding and privacy-preserving DID multi-party authentication model with a signature-less VC scheme, named SLVC-DIDA. Our proposed scheme avoids the dependence on signing keys by employing hashing and issuer membership proofs, which supports universal zero-knowledge multi-party DID authentications, eliminating additional technical integrations. We adopt a novel zero-knowledge circuit to maintain the anonymity of the issuer set, thereby enabling public verification while safeguarding the privacy of identity attributes via a Merkle tree-based VC list. Furthermore, by eliminating reliance on a Public Key Infrastructure (PKI), SLVC-DIDA enables decentralized and self-sovereign DID authentication. Our experiments further evaluate the effectiveness and practicality of SLVC-DIDA.
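
The abstract above mentions a Merkle tree-based VC list and hash-based membership proofs. As a rough illustration only, the following sketch shows a plain Merkle membership proof in Python. It is not the SLVC-DIDA scheme: it reveals the leaf and the authentication path, whereas the paper describes a zero-knowledge circuit. The function names, the use of SHA-256, and the rule of duplicating the last node on odd levels are all assumptions made for this example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest (hash choice is an assumption for this sketch)."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every tree level, from the hashed leaves up to the root."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Assumed padding rule: duplicate the last node on odd levels.
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect (sibling_hash, sibling_is_right) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling > index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf and its sibling path."""
    node = _h(leaf)
    for sibling, sibling_is_right in proof:
        node = _h(node + sibling) if sibling_is_right else _h(sibling + node)
    return node == root


# Hypothetical credential entries, used only to exercise the functions.
credentials = [b"vc-0", b"vc-1", b"vc-2"]
levels = build_levels(credentials)
root = levels[-1][0]
print(verify(b"vc-2", prove(levels, 2), root))
```

A verifier holding only the root can check membership with a logarithmic number of hashes. A production design would also need domain separation between leaf and inner hashes, which this sketch omits.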

preprint, 2025, arXiv

Enhancing atomic-resolution in electron microscopy: A frequency-domain deep learning denoiser

Atomic-resolution electron microscopy, particularly high-angle annular dark-field scanning transmission electron microscopy, has become an essential tool in many scientific fields where direct visualization of atomic arrangements and defects is needed, as these dictate the material's functional and mechanical behavior. However, achieving this precision is often hindered by noise arising from acquisition limitations, particularly when imaging beam-sensitive materials or light atoms. In this work, we present a deep learning-based denoising approach that operates in the frequency domain using a convolutional neural network U-Net trained on simulated data. To generate the training dataset, we simulate FFT patterns for various materials, crystallographic orientations, and imaging conditions, introducing noise and drift artifacts to accurately mimic experimental scenarios. The model is trained to identify relevant frequency components, which are then amplified by element-wise multiplication in the frequency domain to enhance experimental images, significantly improving signal-to-noise ratio while preserving structural integrity. Applied to both Ge quantum wells and WS2 monolayers, the method facilitates more accurate quantitative strain analyses, critical for assessing functional device performance (e.g. quantum properties in SiGe quantum wells), and enables the clear identification of light atoms in beam-sensitive materials. Our results demonstrate the potential of automated frequency-based deep learning denoising as a useful tool for atomic-resolution nano-materials analysis.
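
The filtering step described above, element-wise multiplication in the frequency domain, can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: in the paper the mask comes from a trained U-Net, whereas here `low_pass_mask` is a hypothetical hand-written stand-in, and both function names are assumptions.

```python
import numpy as np


def apply_frequency_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Filter a 2-D image by element-wise multiplication in the frequency domain.

    The mask must use the same (unshifted) layout as np.fft.fft2 output.
    """
    if image.shape != mask.shape:
        raise ValueError("image and mask must have the same shape")
    spectrum = np.fft.fft2(image)           # forward 2-D FFT
    filtered = spectrum * mask              # element-wise product
    return np.real(np.fft.ifft2(filtered))  # back to real space


def low_pass_mask(shape, cutoff):
    """Hypothetical stand-in for a learned mask.

    Keeps frequencies whose radius (in cycles per pixel) is <= cutoff.
    """
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return (np.hypot(fy, fx) <= cutoff).astype(float)


# Synthetic example: a periodic lattice-like signal plus Gaussian noise.
rng = np.random.default_rng(0)
y, x = np.mgrid[0:64, 0:64]
clean = np.cos(2 * np.pi * x / 16) + np.cos(2 * np.pi * y / 16)
noisy = clean + rng.normal(scale=0.5, size=clean.shape)
denoised = apply_frequency_mask(noisy, low_pass_mask(noisy.shape, 0.1))
print(denoised.shape)
```

A mask of all ones returns the input unchanged, which gives a quick sanity check before swapping in a learned mask.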

preprint, 2022, arXiv

MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering

Knowledge-based visual question answering requires the ability to associate external knowledge for open-ended cross-modal scene understanding. One limitation of existing solutions is that they capture relevant knowledge from text-only knowledge bases, which merely contain facts expressed by first-order predicates or language descriptions while lacking complex but indispensable multimodal knowledge for visual understanding. How to construct vision-relevant and explainable multimodal knowledge for the VQA scenario has been less studied. In this paper, we propose MuKEA to represent multimodal knowledge by an explicit triplet to correlate visual objects and fact answers with implicit relations. To bridge the heterogeneous gap, we propose three objective losses to learn the triplet representations from complementary views: embedding structure, topological relation and semantic space. By adopting a pre-training and fine-tuning learning strategy, both basic and domain-specific multimodal knowledge are progressively accumulated for answer prediction. We outperform the state-of-the-art by 3.35% and 6.08% respectively on two challenging knowledge-required datasets: OK-VQA and KRVQA. Experimental results demonstrate the complementary benefits of the multimodal knowledge with existing knowledge bases and the advantages of our end-to-end framework over the existing pipeline methods. The code is available at https://github.com/AndersonStra/MuKEA.

preprint, 2022, arXiv

Robust Online Voltage Control with an Unknown Grid Topology

Voltage control generally requires accurate information about the grid's topology in order to guarantee network stability. However, accurate topology identification is a challenging problem for existing methods, especially as the grid is subject to increasingly frequent reconfiguration due to the adoption of renewable energy. Further, running existing control mechanisms with incorrect network information may lead to unstable control. In this work, we combine a nested convex body chasing algorithm with a robust predictive controller to achieve provably finite-time convergence to safe voltage limits in the online setting where the network topology is initially unknown. Specifically, the online controller does not know the true network topology and line parameters, but instead must learn them over time by narrowing down the set of network topologies and line parameters that are consistent with its observations and adjusting reactive power generation accordingly to keep voltages within desired safety limits. We demonstrate the effectiveness of the approach using a case study, which shows that in practical settings the controller is indeed able to narrow the set of consistent topologies quickly enough to make control decisions that ensure stability.

preprint, 2021, arXiv

Evolving Attention with Residual Convolutions

The Transformer is a ubiquitous model for natural language processing and has attracted wide attention in computer vision. The attention maps are indispensable for a transformer model to encode the dependencies among input tokens. However, they are learned independently in each layer and sometimes fail to capture precise patterns. In this paper, we propose a novel and generic mechanism based on evolving attention to improve the performance of transformers. On the one hand, the attention maps in different layers share common knowledge, so the ones in preceding layers can instruct the attention in succeeding layers through residual connections. On the other hand, low-level and high-level attentions vary in the level of abstraction, so we adopt convolutional layers to model the evolutionary process of attention maps. The proposed evolving attention mechanism achieves significant performance improvement over various state-of-the-art models for multiple tasks, including image classification, natural language understanding and machine translation.

preprint, 2021, arXiv

Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees

Pre-trained language models like BERT achieve superior performance in various NLP tasks without explicit consideration of syntactic information. Meanwhile, syntactic information has been shown to be crucial for the success of NLP applications. However, how to incorporate syntax trees effectively and efficiently into pre-trained Transformers is still unsettled. In this paper, we address this problem by proposing a novel framework named Syntax-BERT. This framework works in a plug-and-play mode and is applicable to an arbitrary pre-trained checkpoint based on the Transformer architecture. Experiments on various datasets of natural language understanding verify the effectiveness of syntax trees and achieve consistent improvement over multiple pre-trained models, including BERT, RoBERTa, and T5.

preprint, 2020, arXiv

Achieving Performance and Safety in Large Scale Systems with Saturation using a Nonlinear System Level Synthesis Approach

We present a novel class of nonlinear controllers that interpolates among differently behaving linear controllers as a case study for the recently proposed Linear and Nonlinear System Level Synthesis framework. The structure of the nonlinear controller allows for simultaneously satisfying performance and safety objectives defined for small- and large-disturbance regimes. The proposed controller is distributed, handles delays and sparse actuation, and localizes disturbances. We show our nonlinear controller always outperforms its linear counterpart for constrained LQR problems. We further demonstrate the anti-windup property of an augmented control strategy based on the proposed controller for saturated systems via simulation.

preprint, 2020, arXiv

DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue

The Visual Dialogue task requires an agent to engage in a conversation with a human about an image. The ability to generate detailed and non-repetitive responses is crucial for the agent to achieve human-like conversation. In this paper, we propose a novel generative decoding architecture to generate high-quality responses, which moves away from decoding the whole encoded semantics towards a design that advocates both transparency and flexibility. In this architecture, word generation is decomposed into a series of attention-based information selection steps, performed by the novel recurrent Deliberation, Abandon and Memory (DAM) module. Each DAM module performs an adaptive combination of the response-level semantics captured from the encoder and the word-level semantics specifically selected for generating each word. Therefore, the responses contain more detailed and non-repetitive descriptions while maintaining semantic accuracy. Furthermore, DAM can flexibly cooperate with existing visual dialogue encoders and adapt to the encoder structures by constraining the information selection mode in DAM. We apply DAM to three typical encoders and verify the performance on the VisDial v1.0 dataset. Experimental results show that the proposed models achieve new state-of-the-art performance with high-quality responses. The code is available at https://github.com/JXZe/DAM.

preprint, 2020, arXiv

DARWIN: A Highly Flexible Platform for Imaging Research in Radiology

To conduct a radiomics or deep learning research experiment, radiologists or physicians need to acquire the necessary programming skills, which can be frustrating and costly when they have limited coding experience. In this paper, we present DARWIN, a flexible research platform with a graphical user interface for medical imaging research. Our platform consists of a radiomics module and a deep learning module. The radiomics module can extract more than 1000 feature dimensions (first-, second-, and higher-order) and provides many draggable supervised and unsupervised machine learning models. Our deep learning module integrates state-of-the-art architectures for classification, detection, and segmentation tasks. It allows users to manually select hyperparameters, or choose an algorithm to automatically search for the best ones. DARWIN also lets users define a custom pipeline for their experiment. This flexibility enables radiologists to carry out various experiments easily.

preprint, 2020, arXiv

KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue

Visual dialogue is a challenging task that requires extracting implicit information from both visual (image) and textual (dialogue history) contexts. Classical approaches pay more attention to the integration of the current question, vision knowledge and text knowledge, while neglecting the heterogeneous semantic gaps between the cross-modal information. Meanwhile, the concatenation operation has become the de facto standard for cross-modal information fusion, although it has limited ability in information retrieval. In this paper, we propose a novel Knowledge-Bridge Graph Network (KBGN) model that uses a graph to bridge the cross-modal semantic relations between vision and text knowledge at fine granularity, and retrieves the required knowledge via an adaptive information selection mode. Moreover, the reasoning clues for visual dialogue can be clearly drawn from intra-modal entities and inter-modal bridges. Experimental results on VisDial v1.0 and VisDial-Q datasets demonstrate that our model outperforms existing models with state-of-the-art results.

preprint, 2020, arXiv

Resilience for Landslide Geohazards and Promoting Strategies in the Three Gorges Reservoir Area

Resilience is increasingly used as a concept for understanding natural disaster systems. Landslides are among the most frequent geohazards in the Three Gorges Reservoir Area (TGRA). However, local disaster resilience is difficult to measure because of the TGRA's particular geographical setting and the particular nature of landslide disasters. Current approaches to disaster resilience evaluation are usually limited either by their qualitative methods or by the properties of the different disasters considered. Therefore, practical methods for evaluating disaster resilience are needed. In this study, we developed an indicator system to evaluate landslide disaster resilience in the TGRA at the county level. It includes two properties, inherent geological stress and external social response, which are summarized as physical stress and social forces. The evaluated disaster resilience can be simulated with a fuzzy cognitive map (FCM) to inform promoting strategies.

preprint, 2015, arXiv

An effective criterion for Eulerian multizeta values in positive characteristic

Characteristic p multizeta values were initially studied by Thakur, who defined them as analogues of classical multiple zeta values of Euler. In the present paper we establish an effective criterion for Eulerian multizeta values, which characterizes when a multizeta value is a rational multiple of a power of the Carlitz period. The resulting "t-motivic" algorithm can tell whether any given multizeta value is Eulerian or not. We also prove that if zeta_A(s_1,...,s_r) is Eulerian, then zeta_A(s_2,...,s_r) has to be Eulerian. When r=2, this was conjectured (and later on conjectured for arbitrary r) by Lara Rodriguez and Thakur for the zeta-like case from numerical data. Our methods apply equally well to values of Carlitz multiple polylogarithms at algebraic points and zeta-like multizeta values.

preprint, 2009, arXiv

Frobenius difference equations and algebraic independence of zeta values in positive equal characteristic

In analogy with the Riemann zeta function at positive integers, for each finite field F_{p^r} with fixed characteristic p we consider Carlitz zeta values zeta_r(n) at positive integers n. Our theorem asserts that among the zeta values in {zeta_r(1), zeta_r(2), zeta_r(3), ... | r = 1, 2, 3, ...}, all the algebraic relations are those algebraic relations within each individual family {zeta_r(1), zeta_r(2), zeta_r(3), ...}. These are the algebraic relations coming from the Euler-Carlitz relations and the Frobenius relations. To prove this, a motivic method for extracting algebraic independence results from systems of Frobenius difference equations is developed.