Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
18 works
0 followers
16 topics
4 close collaborators

Published work

18 published items

preprint, 2026, arXiv

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

We present GoLongRL, a fully open-source, capability-oriented post-training recipe for long-context reinforcement learning with verifiable rewards (RLVR). Existing long-context RL methods often treat data construction as a matter of designing increasingly complex retrieval paths, leading to homogeneous task coverage and reward formulations that inadequately reflect practical long-context requirements. Our work offers two contributions. (1) Capability-oriented data construction with full open release. We openly release a dataset of 23K RLVR samples, the complete construction pipeline, and all training code. Guided by a taxonomy of long-context capabilities, the dataset spans 9 task types, each paired with its natural evaluation metric. It comprises curated open-source samples from established corpora and synthetic samples whose QA pairs are generated from real source documents such as books, academic papers, and multi-turn dialogues. Under the same vanilla GRPO setup, our dataset alone outperforms the closed-source QwenLong-L1.5 dataset. Moreover, our Qwen3-30B-A3B model trained on this data delivers long-context performance comparable to DeepSeek-R1-0528 and Qwen3-235B-A22B-Thinking-2507, suggesting that broader coverage and greater reward diversity substantially benefit long-context capability improvement. (2) TMN-Reweight for heterogeneous multitask optimization. To address optimization challenges from heterogeneous rewards, we propose TMN-Reweight, which combines task-level mean normalization for cross-task reward scale alignment with difficulty-adaptive weighting for more reliable advantage estimation. TMN-Reweight further improves average performance over vanilla GRPO, with general capabilities preserved or improved across reported evaluations.
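
The cross-task scale alignment mentioned above can be illustrated with a small sketch. The abstract gives no formulas, so everything here is an assumption for illustration: the function names, the choice to divide each reward by a per-task mean reward, and the use of mean-centered (not variance-standardized) group advantages. Difficulty-adaptive weighting is left out because the abstract does not specify it. This is not the paper's definition of TMN-Reweight.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Vanilla GRPO-style advantage: standardize rewards within one rollout group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def task_mean_normalized_advantages(rewards, task_mean, eps=1e-6):
    """Hypothetical task-level mean normalization.

    Each reward is divided by the mean reward of its task (assumed to be
    tracked elsewhere), then centered on the group mean. Tasks whose metrics
    live on different scales then yield advantages of comparable magnitude.
    """
    scaled = [r / (task_mean + eps) for r in rewards]
    baseline = mean(scaled)
    return [s - baseline for s in scaled]


# Two tasks with the same relative spread but very different reward scales.
small_scale = task_mean_normalized_advantages([0.2, 0.4], task_mean=0.3)
large_scale = task_mean_normalized_advantages([20.0, 40.0], task_mean=30.0)
```

In this sketch `small_scale` and `large_scale` come out nearly identical, which is the alignment property the abstract describes in words.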

preprint, 2026, arXiv

Industrial Data-Service-Knowledge Governance: Toward Integrated and Trusted Intelligence for Industry 5.0

The convergence of artificial intelligence, cyber-physical systems, and cross-enterprise data ecosystems has propelled industrial intelligence to unprecedented scales. Yet, the absence of a unified trust foundation across data, services, and knowledge layers undermines reliability, accountability, and regulatory compliance in real-world deployments. While existing surveys address isolated aspects, such as data governance, service orchestration, and knowledge representation, none provides a holistic, cross-layer perspective on trustworthiness tailored to industrial settings. To bridge this gap, we present Trisk (TRusted Industrial Data-Service-Knowledge governance), a novel conceptual and taxonomic framework for trustworthy industrial intelligence. Grounded in a five-dimensional trust model (quality, security, privacy, fairness, and explainability), Trisk unifies 120+ representative studies along three orthogonal axes: governance scope (data, service, and knowledge), architectural paradigm (centralized, federated, or edge-embedded), and enabling technology (knowledge graphs, zero-trust policies, causal inference, etc.). We systematically analyze how trust propagates across digital layers, identify critical gaps in semantic interoperability, runtime policy enforcement, and operational and information technology alignment, and evaluate the maturity of current industrial implementations. Finally, we articulate a forward-looking research agenda for Industry 5.0, advocating for an integrated governance fabric that embeds verifiable trust semantics into every layer of the industrial intelligence stack. This survey serves as both a foundational reference for researchers and a practical roadmap for engineers to deploy trustworthy AI in complex and multi-stakeholder environments.

preprint, 2026, arXiv

InterLight: Leveraging Intrinsic Illumination Priors for Low-Light Image Enhancement

Low-Light Image Enhancement (LLIE) has long been a challenging problem in low-level vision, as insufficient illumination often leads to low contrast, detail loss, and noise. Recent studies show that deep learning-based Retinex theory can effectively decouple illumination and reflectance. However, existing methods frequently suffer from over-enhancement or color distortion, and often assume uniform noise or ideal lighting. To address these limitations, we propose InterLight, a novel framework that systematically excavates and operationalizes intrinsic illumination priors for LLIE. Our core insight is that robust enhancement requires not just estimating illumination, but constructing an illumination-aware pipeline. We first inject sensor-level illumination-response priors via physics-guided augmentation, then represent the degradation through adaptive prompts conditioned on the scene's latent illumination state. This explicit representation directly guides a luminance-gated intrinsic memory mechanism to selectively compensate for information loss, prioritizing reconstruction in dark regions while preserving fidelity in bright ones. Finally, the entire process is regularized by a self-supervised consistency objective that distills illumination-invariant features. By deeply exploiting intrinsic illumination priors, our method achieves clearer textures and more visually coherent enhancement results. Extensive experiments across multiple benchmarks demonstrate the effectiveness of our approach. Code is available at: https://github.com/House-yuyu/InterLight.

preprint, 2026, arXiv

StrLoRA: Towards Streaming Continual Visual Instruction Tuning for MLLMs

Continual Visual Instruction Tuning (CVIT) enables Multimodal Large Language Models to incrementally acquire new abilities. However, existing CVIT methods operate under a restrictive task-incremental setting, where each training phase corresponds to a single, predefined task. This does not reflect real-world conditions, where data arrives as a continuous stream of interleaved and dynamically evolving tasks. To bridge this gap, we introduce Streaming CVIT (StrCVIT), a more general and realistic setting where models learn from a stream of data chunks containing a dynamic mixture of tasks. In StrCVIT, a model must simultaneously acquire new abilities, reinforce recurring abilities, and mitigate forgetting. Existing CVIT methods fail here as they cannot reliably distinguish or adapt to the heterogeneous task samples within each chunk. We therefore propose StrLoRA, a regularized two-stage expert routing framework. StrLoRA first performs task-aware expert selection using the textual instruction to activate a sparse subset of relevant experts, reducing cross-task interference. It then applies token-wise expert weighting within this subset, where contribution weights are computed via cross-modal attention between local visual tokens and the global instruction representation. To maintain stability across the non-stationary stream, a routing-stability regularization aligns current routing distributions with a historical exponential moving average reference. Extensive experiments on a newly developed StrCVIT benchmark show that StrLoRA substantially outperforms existing methods, effectively enhancing the model's abilities from continuously evolving data streams. The code is available at https://github.com/chanceche/StrCVIT.
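
The routing-stability idea (aligning the current routing distribution with an exponential moving average reference) can be sketched as follows. The abstract does not state the divergence, its direction, or the decay rate, so the KL penalty, the `decay` value, the uniform initialization, and the class name `RoutingStabilityRegularizer` are all assumptions made for illustration rather than StrLoRA's actual formulation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


class RoutingStabilityRegularizer:
    """Hypothetical regularizer: penalize drift from an EMA of past routing."""

    def __init__(self, num_experts, decay=0.99):
        self.decay = decay
        # Start from a uniform reference over experts (an assumption).
        self.ema = [1.0 / num_experts] * num_experts

    def penalty(self, routing_probs):
        """Divergence between the current routing distribution and the EMA."""
        return kl_divergence(routing_probs, self.ema)

    def update(self, routing_probs):
        """Fold the current routing distribution into the EMA reference."""
        d = self.decay
        self.ema = [d * e + (1.0 - d) * p for e, p in zip(self.ema, routing_probs)]


reg = RoutingStabilityRegularizer(num_experts=2, decay=0.5)
loss_before = reg.penalty([0.5, 0.5])  # matches the uniform reference
reg.update([1.0, 0.0])                 # the reference moves toward expert 0
loss_after = reg.penalty([0.5, 0.5])   # the same routing now incurs a penalty
```

In a training loop the penalty would be added to the task loss with a small coefficient; that coefficient is another free choice not given in the abstract.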

preprint, 2026, arXiv

Trust or Abstain? A Self-Aware RAG Approach

Retrieval-augmented generation (RAG) improves large language models (LLMs) by incorporating external evidence, but it also introduces knowledge conflicts when retrieved contextual knowledge (CK) and parametric knowledge (PK) disagree or are both unreliable. Existing approaches mainly coordinate which source to use, without explicitly asking whether each answer path is correct. We argue that faithful RAG requires LLM self-awareness, namely the ability to recognize the limits of its own knowledge and reasoning. To ground this problem, we construct a model-specific, ground-truth-aligned knowledge-conflict benchmark by evaluating LLM backbones on PK-only and CK-conditioned answer paths over approximately 69K query-context instances per backbone, drawn from five conflict-QA datasets. We then introduce SABER, a Self-Aware Belief Estimator for RAG that requires no LLM fine-tuning. SABER combines a self-prior with PK-side and CK-side conditional reasoning representations from multi-trace inference, then estimates reliability beliefs with two lightweight predictors to drive a 4-cell decision over trust PK, trust CK, trust either, or abstain. Across four LLM backbones, SABER improves end-to-end accuracy and conflict-specific faithfulness over ten inference-time and fine-tuning baselines, with the largest gains on conflict-heavy datasets. Under abstention, SABER's risk-coverage curve Pareto-dominates every prompt-based abstainer, providing a tunable balance between coverage and answer risk. Our code is available at https://github.com/xizhu1022/SABER.
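
The 4-cell decision described above can be written down in a few lines once the two reliability beliefs are available. The sketch below assumes each belief is a probability in [0, 1] and that a single shared threshold splits the cells; the threshold value, the function name `decide`, and the `Decision` enum are illustrative choices, not SABER's published interface.

```python
from enum import Enum


class Decision(Enum):
    TRUST_PK = "trust PK"
    TRUST_CK = "trust CK"
    TRUST_EITHER = "trust either"
    ABSTAIN = "abstain"


def decide(pk_belief, ck_belief, threshold=0.5):
    """Map two reliability beliefs to one of four cells.

    pk_belief: estimated probability that the parametric-knowledge answer is correct.
    ck_belief: estimated probability that the context-conditioned answer is correct.
    threshold: assumed shared cutoff; raising it trades coverage for lower risk.
    """
    pk_ok = pk_belief >= threshold
    ck_ok = ck_belief >= threshold
    if pk_ok and ck_ok:
        return Decision.TRUST_EITHER
    if pk_ok:
        return Decision.TRUST_PK
    if ck_ok:
        return Decision.TRUST_CK
    return Decision.ABSTAIN
```

Sweeping `threshold` from 0 to 1 traces out a risk-coverage trade-off: a higher cutoff sends more queries to the abstain cell.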

preprint, 2026, arXiv

When Does Hierarchy Help? Benchmarking Agent Coordination in Event-Driven Industrial Scheduling

Recent advances in agent and multi-agent systems have shown strong performance on tool use, reasoning, and collaborative tasks. However, existing benchmarks mostly evaluate task completion in weakly coupled environments, and provide limited support for studying coordination in shared, dynamically evolving systems with hierarchy and coupled constraints. This leaves an important question underexplored: when do different coordination paradigms succeed or fail? We introduce Distributed Event-driven Scheduling Benchmark (DESBench), a benchmark for evaluating agent coordination in hierarchical event-driven scheduling. Built on a shared discrete-event driven environment in industrial scheduling, our benchmark captures multi-timescale decision making, partial observability, and dynamically coupled constraints. We define tasks and metrics that evaluate effectiveness, constraint alignment, coordination efficiency, and robustness, and focus on four representative coordination paradigms: centralized, hierarchical, heterarchical, and holonic. These paradigms correspond to distinct mechanisms of information flow, decision authority, and conflict resolution. Our controlled evaluations reveal clear coordination trade-offs: centralized coordination is robust and communication-efficient but scales poorly with difficulty; hierarchical coordination improves efficiency through decomposition but suffers from cross-level misalignment; heterarchical coordination is flexible but communication-heavy; and holonic coordination satisfies constraints well but loses global robustness. These findings demonstrate that coordination design fundamentally shapes agent system behavior in complex environments, revealing structural trade-offs that cannot be captured by outcome metrics alone and underscoring the imperative for more adaptive, principled, and dynamic coordination mechanisms in future MAS research.

preprint, 2025, arXiv

Recorded Versus Synthetic Spectral-compatible Ground Motions: A Comparative Analysis of Structural Seismic Responses

This paper presents a comparative analysis of structural seismic responses under two types of ground motion inputs: (i) synthetic motions generated by stochastic spectral-compatible ground motion models and (ii) recorded motions from an earthquake database. Both ground motion datasets are calibrated to a shared target response spectrum to ensure consistent spectral median, variance, and correlation structure. Five key stochastic response metrics (probability distributions, statistical moments, correlations, tail indices, and variance-based global sensitivity indices) are systematically evaluated for two representative structures: a medium-period building and a limiting case of a long-period tower. The comparison accounts for uncertainties both from ground motion and structural parameters. The results reveal that synthetic motions closely replicate recorded motions in terms of global response behavior, including distributions, mean and variance, correlation structure, and dominant uncertainty sources, indicating their suitability for routine seismic design and parametric studies. However, substantial differences emerge in response extremes for long-period structures, particularly in metrics governed by rare events, such as higher-order moments and tail behavior. These differences, which often exceed 50%, can be attributed to the non-Gaussian features and complex characteristics inherent in recorded motions, which are less pronounced in synthetic datasets. The findings support the use of synthetic ground motions for evaluating global seismic response characteristics, while highlighting their limitations in capturing rare-event behavior and long-period structural dynamics.

preprint, 2022, arXiv

Enhancing Classifier Conservativeness and Robustness by Polynomiality

We illustrate the detrimental effect, such as overconfident decisions, that exponential behavior can have in methods like classical LDA and logistic regression. We then show how polynomiality can remedy the situation. Among other things, this purposefully leads to random-level performance in the tails, away from the bulk of the training data. A directly related, simple, yet important technical novelty we subsequently present is softRmax: a reasoned alternative to the standard softmax function employed in contemporary (deep) neural networks. It is derived through linking the standard softmax to Gaussian class-conditional models, as employed in LDA, and replacing those by a polynomial alternative. We show that two aspects of softRmax, conservativeness and inherent gradient regularization, lead to robustness against adversarial attacks without gradient obfuscation.
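
To make the contrast between exponential and polynomial behavior concrete, the sketch below compares the standard softmax with one plausible polynomial alternative: class scores proportional to the inverse squared distance between the network output and a one-hot class target. That inverse-squared-distance form, the `eps` constant, and the function name `polynomial_softmax` are assumptions for illustration; the abstract does not give the exact softRmax definition, and the paper's formula may differ.

```python
import math


def softmax(logits):
    """Standard softmax with the usual max-shift for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def polynomial_softmax(outputs, eps=1e-9):
    """Assumed polynomial alternative: inverse squared distance to one-hot targets."""
    k = len(outputs)
    scores = []
    for c in range(k):
        # Squared Euclidean distance from the output vector to the c-th one-hot vector.
        dist_sq = sum((x - (1.0 if j == c else 0.0)) ** 2 for j, x in enumerate(outputs))
        scores.append(1.0 / (dist_sq + eps))
    total = sum(scores)
    return [s / total for s in scores]


far_point = [1000.0, 0.0, 0.0]          # far from the bulk of plausible outputs
exp_probs = softmax(far_point)           # saturates on class 0
poly_probs = polynomial_softmax(far_point)  # drifts toward uniform
```

Far from all class targets the distances become nearly equal, so the polynomial form approaches the uniform distribution, matching the "random-level performance in the tails" described above, while softmax grows ever more confident.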

preprint, 2022, arXiv

FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations

We present a meta-learning framework for learning new visual concepts quickly, from just one or a few examples, guided by multiple naturally occurring data streams: simultaneously looking at images, reading sentences that describe the objects in the scene, and interpreting supplemental sentences that relate the novel concept with other concepts. The learned concepts support downstream applications, such as answering questions by reasoning about unseen images. Our model, namely FALCON, represents individual visual concepts, such as colors and shapes, as axis-aligned boxes in a high-dimensional space (the "box embedding space"). Given an input image and its paired sentence, our model first resolves the referential expression in the sentence and associates the novel concept with particular objects in the scene. Next, our model interprets supplemental sentences to relate the novel concept with other known concepts, such as "X has property Y" or "X is a kind of Y". Finally, it infers an optimal box embedding for the novel concept that jointly 1) maximizes the likelihood of the observed instances in the image, and 2) satisfies the relationships between the novel concepts and the known ones. We demonstrate the effectiveness of our model on both synthetic and real-world datasets.

preprint, 2022, arXiv

FPGA-based electronic system for the control and readout of superconducting quantum processors

Electronic systems for qubit control and measurement serve as a bridge between quantum programming language and quantum information processors. With the rapid development of superconducting quantum circuit (SQC) technology, synchronization in a large-scale system, low-latency execution, and low noise are required for electronic systems. Here, we present a field-programmable gate array (FPGA)-based electronic system with a distributed synchronous clock and trigger architecture. The system supports synchronous control of qubits with jitters of approximately 5 ps. We implement a real-time digital signal processing system in the FPGA, enabling precise timing control, arbitrary waveform generation, IQ demodulation for qubit state discrimination, and the generation of real-time qubit-state-dependent trigger signals for feedback/feedforward control. The hardware and firmware low-latency design reduces the feedback/feedforward latency of the electronic system to 125 ns, significantly less than the decoherence times of the qubit. Finally, we demonstrate the functionalities and low-noise performance of this system using a fluxonium quantum processor.
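
IQ demodulation for state discrimination can be illustrated in software, even though the system described above runs it in FPGA logic. The sketch below mixes a sampled readout tone with cosine and sine references at the intermediate frequency, integrates over a whole number of periods, and assigns the state whose reference point in the IQ plane is nearest. The sample rate, tone frequency, reference points, and function names are hypothetical values chosen for illustration, not parameters from the paper.

```python
import math


def demodulate(samples, freq_hz, sample_rate_hz):
    """Return (I, Q) for a real signal A*cos(2*pi*f*t + phi): I = A*cos(phi), Q = A*sin(phi).

    Assumes the record spans an integer number of periods of the tone.
    """
    n = len(samples)
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz  # radians per sample
    i_sum = sum(s * math.cos(omega * k) for k, s in enumerate(samples))
    q_sum = sum(s * math.sin(omega * k) for k, s in enumerate(samples))
    return 2.0 * i_sum / n, -2.0 * q_sum / n


def discriminate(iq, ref0, ref1):
    """Nearest-reference classifier in the IQ plane; returns 0 or 1."""
    d0 = (iq[0] - ref0[0]) ** 2 + (iq[1] - ref0[1]) ** 2
    d1 = (iq[0] - ref1[0]) ** 2 + (iq[1] - ref1[1]) ** 2
    return 0 if d0 <= d1 else 1


# Hypothetical readout: 50 MHz tone sampled at 1 GS/s for 200 samples (10 periods).
FS, F_IF, N = 1.0e9, 50.0e6, 200
phase = math.pi / 3.0
signal = [math.cos(2.0 * math.pi * F_IF * k / FS + phase) for k in range(N)]

iq_point = demodulate(signal, F_IF, FS)
state = discriminate(iq_point, ref0=(1.0, 0.0), ref1=(0.5, math.sqrt(3.0) / 2.0))
```

A hardware implementation would use fixed-point multiply-accumulate units and calibrated reference points, but the arithmetic follows the same pattern.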

preprint, 2022, arXiv

Online Game Level Generation from Music

Games consist of multiple types of content, and the harmony among different content types plays an essential role in game design. However, most works on procedural content generation consider only one type of content at a time. In this paper, we propose and formulate online level generation from music, which matches a level feature to a music feature in real time while adapting to the player's play speed. A generic framework named online player-adaptive procedural content generation via reinforcement learning, OPARL for short, is built upon experience-driven reinforcement learning and controllable reinforcement learning to enable online level generation from music. Furthermore, a novel control policy based on local search and k-nearest neighbours is proposed and integrated into OPARL to control the level generator using the play data collected online. Results of simulation-based experiments show that our implementation of OPARL can generate playable levels whose difficulty matches the "energy" dynamics of the music for different artificial players in an online fashion.

preprint, 2022, arXiv

Reinforcement Learning with Dual-Observation for General Video Game Playing

Reinforcement learning algorithms have performed well in playing challenging board and video games. More and more studies focus on improving the generalisation ability of reinforcement learning algorithms. The General Video Game AI Learning Competition aims to develop agents capable of learning to play different game levels that were unseen during training. This paper summarises five years of General Video Game AI Learning Competition editions. In each edition, three new games were designed. The training and test levels were designed separately in the first three editions. Since 2020, three test levels of each game have been generated by perturbing or combining two training levels. Then, we present a novel reinforcement learning technique with dual-observation for general video game playing, assuming that it is more likely to observe similar local information in different levels rather than global information. Instead of directly inputting a single, raw pixel-based screenshot of the current game screen, our proposed general technique takes the encoded, transformed global and local observations of the game screen as two simultaneous inputs, aiming at learning local information for playing new levels. Our proposed technique is implemented with three state-of-the-art reinforcement learning algorithms and tested on the game set of the 2020 General Video Game AI Learning Competition. Ablation studies show the outstanding performance of using encoded, transformed global and local observations as input.

preprint, 2020, arXiv

A Novel CNet-assisted Evolutionary Level Repairer and Its Applications to Super Mario Bros

Applying latent variable evolution to game level design has become more and more popular as little human expert knowledge is required. However, defective levels with illegal patterns may be generated due to the violation of constraints for level design. A traditional way of repairing the defective levels is programming specific rule-based repairers to patch the flaw. However, programming these constraints is sometimes complex and not straightforward. An autonomous level repairer which is capable of learning the constraints is needed. In this paper, we propose a novel approach, CNet, to learn the probability distribution of a tile given its surrounding tiles on a set of real levels, and then detect the illegal tiles in newly generated levels. Then, an evolutionary repairer is designed to search for optimal replacement schemes, equipped with a novel search space constructed with the help of CNet and a novel heuristic function. The proposed approaches prove effective in our case study of repairing GAN-generated and artificially destroyed levels of the Super Mario Bros. game. Our CNet-assisted evolutionary repairer can also be easily applied to other games whose levels can be represented by a matrix of objects or tiles.

preprint, 2020, arXiv

Black Magic in Deep Learning: How Human Skill Impacts Network Training

How does a user's prior experience with deep learning impact accuracy? We present an initial study based on 31 participants with different levels of experience. Their task is to perform hyperparameter optimization for a given deep learning architecture. The results show a strong positive correlation between the participant's experience and the final performance. They additionally indicate that an experienced participant finds better solutions using fewer resources on average. The data suggests furthermore that participants with no prior experience follow random strategies in their pursuit of optimal hyperparameters. Our study investigates the subjective human factor in comparisons of state of the art results and scientific reproducibility in deep learning.

preprint, 2020, arXiv

Learning from Explanations with Neural Execution Tree

While deep neural networks have achieved impressive performance on a range of NLP tasks, these data-hungry models heavily rely on labeled data, which restricts their applications in scenarios where data annotation is expensive. Natural language (NL) explanations have been shown to be very useful additional supervision, which can provide sufficient domain knowledge for generating more labeled data over new instances, while the annotation time only doubles. However, directly applying them for augmenting model learning encounters two challenges: (1) NL explanations are unstructured and inherently compositional, which calls for a modularized model to represent their semantics; (2) NL explanations often have large numbers of linguistic variants, resulting in low recall and limited generalization ability. In this paper, we propose a novel Neural Execution Tree (NExT) framework to augment training data for text classification using NL explanations. After transforming NL explanations into executable logical forms by semantic parsing, NExT generalizes different types of actions specified by the logical forms for labeling data instances, which substantially increases the coverage of each NL explanation. Experiments on two NLP tasks (relation extraction and sentiment analysis) demonstrate its superiority over baseline methods. Its extension to multi-hop question answering achieves performance gain with light annotation effort.

preprint, 2020, arXiv

NERO: A Neural Rule Grounding Framework for Label-Efficient Relation Extraction

Deep neural models for relation extraction tend to be less reliable when perfectly labeled data is limited, despite their success in label-sufficient scenarios. Instead of seeking more instance-level labels from human annotators, here we propose to annotate frequent surface patterns to form labeling rules. These rules can be automatically mined from large text corpora and generalized via a soft rule matching mechanism. Prior works use labeling rules in an exact matching fashion, which inherently limits the coverage of sentence matching and results in the low-recall issue. In this paper, we present a neural approach to ground rules for RE, named NERO, which jointly learns a relation extraction module and a soft matching module. One can employ any neural relation extraction model as the instantiation for the RE module. The soft matching module learns to match rules with semantically similar sentences such that raw corpora can be automatically labeled and leveraged by the RE module (with much better coverage) as augmented supervision, in addition to the exactly matched sentences. Extensive experiments and analysis on two public and widely used datasets demonstrate the effectiveness of the proposed NERO framework, compared with both rule-based and semi-supervised methods. Through user studies, we find that the time a human needs to annotate a rule and a sentence is similar (0.30 vs. 0.35 min per label). In particular, NERO's performance using 270 rules is comparable to the models trained using 3,000 labeled sentences, yielding a 9.5x speedup. Moreover, NERO can predict for unseen relations at test time and provide interpretable predictions. We release our code to the community for future research.

preprint, 2020, arXiv

Probabilistic Performance-Pattern Decomposition (PPPD): analysis framework and applications to stochastic mechanical systems

Since the early 1900s, numerous research efforts have been devoted to developing quantitative solutions to stochastic mechanical systems. In general, the problem is perceived as solved when a complete or partial probabilistic description of the quantity of interest (QoI) is determined. However, in the presence of complex system behavior, there is a critical need to go beyond mere probabilistic descriptions. In fact, to gain a full understanding of the system, it is crucial to extract physical characterizations from the probabilistic structure of the QoI, especially when the QoI solution is obtained in a data-driven fashion. Motivated by this perspective, the paper proposes a framework to obtain structured characterizations of the behaviors of stochastic systems. The framework is named Probabilistic Performance-Pattern Decomposition (PPPD). PPPD analysis aims to decompose complex response behaviors, conditional on a prescribed performance state, into meaningful patterns in the space of system responses, and to investigate how the patterns are triggered in the space of basic random variables. To illustrate the application of PPPD, the paper studies three numerical examples: 1) an illustrative example with hypothetical stochastic process input and output; 2) a stochastic Lorenz system with periodic as well as chaotic behaviors; and 3) a simplified shear-building model subjected to a stochastic ground motion excitation.

preprint, 2020, arXiv

The dynamics of entropy in the COVID-19 outbreaks

With the unfolding of the COVID-19 pandemic, mathematical modeling of epidemics has been perceived and used as a central element in understanding, predicting, and governing the pandemic event. However, it soon became clear that long-term predictions were extremely challenging to address. Moreover, it is still unclear which metric should be used for a global description of the evolution of the outbreaks. Yet robust modeling of pandemic dynamics and a consistent choice of the transmission metric are crucial for an in-depth understanding of the macroscopic phenomenology and better-informed mitigation strategies. In this study, we propose a Markovian stochastic framework designed to describe the evolution of entropy during the COVID-19 pandemic and the instantaneous reproductive ratio. We then introduce and use entropy-based metrics of global transmission to measure the impact and temporal evolution of a pandemic event. In the formulation of the model, the temporal evolution of the outbreak is modeled by the master equation of a nonlinear Markov process for a statistically averaged individual, leading to a clear physical interpretation. We also provide a full Bayesian inversion scheme for calibration. The time evolution of the entropy rate, the absolute change in the system entropy, and the instantaneous reproductive ratio are natural and transparent outputs of this framework. The framework has the appealing property of being applicable to any compartmental epidemic model. As an illustration, we apply the proposed approach to a simple modification of the Susceptible-Exposed-Infected-Removed (SEIR) model. Applying the model to COVID-19 datasets from the Hubei region, South Korea, Italy, Spain, Germany, and France, we discover a significant difference in the absolute change of entropy but highly regular trends for both the entropy evolution and the instantaneous reproductive ratio.
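
A toy version of tracking entropy along a compartmental model can be sketched as follows. It integrates a plain deterministic SEIR model with forward Euler steps and computes the Shannon entropy of the compartment occupancy fractions, read as the state probabilities of an averaged individual. The rate values, step size, and function names are hypothetical, and the plain SEIR equations and the Shannon-entropy choice are assumptions; the paper's modified SEIR model, entropy rate, and Bayesian calibration are not reproduced here.

```python
import math


def seir_step(state, beta, sigma, gamma, dt):
    """One forward-Euler step of SEIR on population fractions (s, e, i, r)."""
    s, e, i, r = state
    new_exposed = beta * s * i * dt      # S -> E
    new_infectious = sigma * e * dt      # E -> I
    new_removed = gamma * i * dt         # I -> R
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_removed,
        r + new_removed,
    )


def shannon_entropy(probs):
    """Shannon entropy in nats, skipping empty compartments."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def simulate(state, beta=0.5, sigma=0.2, gamma=0.1, dt=0.1, steps=1000):
    """Return the final state and the entropy recorded at every step."""
    entropies = [shannon_entropy(state)]
    for _ in range(steps):
        state = seir_step(state, beta, sigma, gamma, dt)
        entropies.append(shannon_entropy(state))
    return state, entropies


final_state, entropy_trace = simulate((0.99, 0.0, 0.01, 0.0))
```

The entropy starts near zero when almost everyone is susceptible and rises as the population spreads across compartments; with four compartments it can never exceed ln 4.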