Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
14topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2026arXiv

MHSA: A Lightweight Framework for Mitigating Hallucinations via Steered Attention in LVLMs

Large vision-language models (LVLMs) have achieved remarkable performance across diverse multimodal tasks, yet they continue to suffer from hallucinations, generating content that is inconsistent with the visual input. Prior work DHCP (Detecting Hallucinations by Cross-modal Attention Pattern) has explored hallucination detection from the perspective of cross-modal attention, but does not address hallucination mitigation. In this paper, we propose MHSA (Mitigating Hallucinations via Steered Attention), a lightweight framework that mitigates hallucinations by learning to correct cross-modal attention patterns in LVLMs. MHSA trains a simple three-layer MLP generator to produce corrected attention, guided by supervisory signals from the DHCP discriminator and the LVLM itself. During inference, MHSA mitigates both discriminative and generative hallucinations across various datasets and LVLMs by simply replacing the original cross-modal attention with the corrected one, without modifying any LVLM parameters. By extending cross-modal attention mechanisms from hallucination detection to hallucination mitigation, MHSA offers a novel perspective on hallucination research in LVLMs and helps enhance their reliability.

preprint2025arXiv

Polarization-Differential Loss Enabled High Polarization Extinction in Hollow-Core Fibers

Delivering a well defined state of polarization over hollow core fibres (HCFs) is pivotal for next generation ultra stable photonic systems. Yet in all existing HCFs, whether birefringent or not, their polarization extinction ratio (PER) rapidly deteriorates during propagation or under mechanical disturbance, leaving no practical high and stable PER solution. Here, we break this impasse by embedding a polarization differential loss (PDL) mechanism directly into the cladding architecture.

preprint2022arXiv

Multiscale nonlocal beam theory: An application of distributed-order fractional operators

This study presents a comprehensive theoretical framework to simulate the response of multiscale nonlocal elastic beams. By employing distributed-order (DO) fractional operators with a fourth-order tensor as the strength-function, the framework can accurately capture anisotropic behavior of 2D heterogeneous beams with nonlocal effects localized across multiple scales. Building upon this general continuum theory and on the multiscale character of DO operators, a one-dimensional (1D) multiscale nonlocal Timoshenko model is also presented. This approach enables a significant model-order reduction without compromising the heterogeneous nonlocal description of the material, hence leading to an efficient and accurate multiscale nonlocal modeling approach. Both 1D and 2D approaches are applied to simulate the mechanical responses of nonlocal beams. The direct comparison of numerical simulations produced by either the DO or an integer-order fully-resolved model (used as ground truth) clearly illustrates the ability of the DO formulation to capture the effect of the microstructure on the macroscopic response. The assessment of the computational cost also indicates the superior efficiency of the proposed approach.

preprint2022arXiv

PGADA: Perturbation-Guided Adversarial Alignment for Few-shot Learning Under the Support-Query Shift

Few-shot learning methods aim to embed the data to a low-dimensional embedding space and then classify the unseen query data to the seen support set. While these works assume that the support set and the query set lie in the same embedding space, a distribution shift usually occurs between the support set and the query set, i.e., the Support-Query Shift, in the real world. Though optimal transportation has shown convincing results in aligning different distributions, we find that the small perturbations in the images would significantly misguide the optimal transportation and thus degrade the model performance. To relieve the misalignment, we first propose a novel adversarial data augmentation method, namely Perturbation-Guided Adversarial Alignment (PGADA), which generates the hard examples in a self-supervised manner. In addition, we introduce Regularized Optimal Transportation to derive a smooth optimal transportation plan. Extensive experiments on three benchmark datasets manifest that our framework significantly outperforms the eleven state-of-the-art methods on three datasets.

preprint2021arXiv

Multiscale Nonlocal Elasticity: A Distributed Order Fractional Formulation

This study presents a generalized multiscale nonlocal elasticity theory that leverages distributed order fractional calculus to accurately capture coexisting multiscale and nonlocal effects within a macroscopic continuum. The nonlocal multiscale behavior is captured via distributed order fractional constitutive relations derived from a nonlocal thermodynamic formulation. The governing equations of the inhomogeneous continuum are obtained via the Hamilton principle. As a generalization of the constant order fractional continuum theory, the distributed order theory can model complex media characterized by inhomogeneous nonlocality and multiscale effects. In order to understand the correspondence between microscopic effects and the properties of the continuum, an equivalent mass-spring lattice model is also developed by direct discretization of the distributed order elastic continuum. Detailed theoretical arguments are provided to show the equivalence between the discrete and the continuum distributed order models in terms of internal nonlocal forces, potential energy distribution, and boundary conditions. These theoretical arguments facilitate the physical interpretation of the role played by the distributed order framework within nonlocal elasticity theories. They also highlight the outstanding potential and opportunities offered by this methodology to account for multiscale nonlocal effects. The capabilities of the methodology are also illustrated via a numerical study that highlights the excellent agreement between the displacement profiles and the total potential energy predicted by the two models under various order distributions. Remarkably, multiscale effects such as displacement distortion, material softening, and energy concentration are well captured at continuum level by the distributed order theory.

preprint2020arXiv

Comparing SNNs and RNNs on Neuromorphic Vision Datasets: Similarities and Differences

Neuromorphic data, recording frameless spike events, have attracted considerable attention for the spatiotemporal information components and the event-driven processing fashion. Spiking neural networks (SNNs) represent a family of event-driven models with spatiotemporal dynamics for neuromorphic computing, which are widely benchmarked on neuromorphic data. Interestingly, researchers in the machine learning community can argue that recurrent (artificial) neural networks (RNNs) also have the capability to extract spatiotemporal features although they are not event-driven. Thus, the question of "what will happen if we benchmark these two kinds of models together on neuromorphic data" comes out but remains unclear. In this work, we make a systematic study to compare SNNs and RNNs on neuromorphic data, taking the vision datasets as a case study. First, we identify the similarities and differences between SNNs and RNNs (including the vanilla RNNs and LSTM) from the modeling and learning perspectives. To improve comparability and fairness, we unify the supervised learning algorithm based on backpropagation through time (BPTT), the loss function exploiting the outputs at all timesteps, the network structure with stacked fully-connected or convolutional layers, and the hyper-parameters during training. Especially, given the mainstream loss function used in RNNs, we modify it inspired by the rate coding scheme to approach that of SNNs. Furthermore, we tune the temporal resolution of datasets to test model robustness and generalization. At last, a series of contrast experiments are conducted on two types of neuromorphic datasets: DVS-converted (N-MNIST) and DVS-captured (DVS Gesture).

preprint2020arXiv

Fair Auction and Trade Framework for Cloud VM Allocation based on Blockchain

Cloud auctions provide cost-effective strategies for cloud VM allocation. Most existing cloud auctions simply assume that the auctioneer is trustable, and thus the fairness of auctions can be easily achieved. However, in fact, such a trustable auctioneer may not exist, and the fairness is non-trivial to guarantee. In this work, for the first time, we propose a decentralized cloud VM auction and trade framework based on blockchain. We realize both auction fairness and trade fairness among participants (e.g., cloud provider and cloud users) in this system, which guarantees the interest of each party will not suffer any loss as long as it follows the protocol. Furthermore, we implement our system through the local blockchain and Ethereum official test blockchain, carry out experimental simulations, and demonstrate the feasibility of our system.

preprint2020arXiv

GeoCMS : Towards a Geo-Tagged Media Management System

In this paper, we propose the design and implementation of the new geotagged media management system. A large amount of daily geo-tagged media data generated by user's smart phone, mobile device, dash cam and camera. Geotagged media, such as geovideos and geophotos, can be captured with spatial temporal information such as time, location, visible area, camera direction, moving direction and visible distance information. Due to the increase in geo-tagged multimedia data, the researches for efficient managing and mining geo-tagged multimedia are newly expected to be a new area in database and data mining. This paper proposes a geo-tagged media management system, so called Open GeoCMS(Geotagged media Contents Management System). Open GeoCMS is a new framework to manage geotagged media data on the web. Our framework supports various types which are for moving point, moving photo - a sequence of photos by a drone, moving double and moving video. Also, GeoCMS has the label viewer and editor system for photos and videos. The Open GeoCMS have been developed as an open source system.

preprint2020arXiv

Mitigating Class Boundary Label Uncertainty to Reduce Both Model Bias and Variance

The study of model bias and variance with respect to decision boundaries is critically important in supervised classification. There is generally a tradeoff between the two, as fine-tuning of the decision boundary of a classification model to accommodate more boundary training samples (i.e., higher model complexity) may improve training accuracy (i.e., lower bias) but hurt generalization against unseen data (i.e., higher variance). By focusing on just classification boundary fine-tuning and model complexity, it is difficult to reduce both bias and variance. To overcome this dilemma, we take a different perspective and investigate a new approach to handle inaccuracy and uncertainty in the training data labels, which are inevitable in many applications where labels are conceptual and labeling is performed by human annotators. The process of classification can be undermined by uncertainty in the labels of the training data; extending a boundary to accommodate an inaccurately labeled point will increase both bias and variance. Our novel method can reduce both bias and variance by estimating the pointwise label uncertainty of the training set and accordingly adjusting the training sample weights such that those samples with high uncertainty are weighted down and those with low uncertainty are weighted up. In this way, uncertain samples have a smaller contribution to the objective function of the model's learning algorithm and exert less pull on the decision boundary. In a real-world physical activity recognition case study, the data presents many labeling challenges, and we show that this new approach improves model performance and reduces model variance.

preprint2020arXiv

SummerTime: Variable-length Time SeriesSummarization with Applications to PhysicalActivity Analysis

\textit{SummerTime} seeks to summarize globally time series signals and provides a fixed-length, robust summarization of the variable-length time series. Many classical machine learning methods for classification and regression depend on data instances with a fixed number of features. As a result, those methods cannot be directly applied to variable-length time series data. One common approach is to perform classification over a sliding window on the data and aggregate the decisions made at local sections of the time series in some way, through majority voting for classification or averaging for regression. The downside to this approach is that minority local information is lost in the voting process and averaging assumes that each time series measurement is equal in significance. Also, since time series can be of varying length, the quality of votes and averages could vary greatly in cases where there is a close voting tie or bimodal distribution of regression domain. Summarization conducted by the \textit{SummerTime} method will be a fixed-length feature vector which can be used in-place of the time series dataset for use with classical machine learning methods. We use Gaussian Mixture models (GMM) over small same-length disjoint windows in the time series to group local data into clusters. The time series' rate of membership for each cluster will be a feature in the summarization. The model is naturally capable of converging to an appropriate cluster count. We compare our results to state-of-the-art studies in physical activity classification and show high-quality improvement by classifying with only the summarization. Finally, we show that regression using the summarization can augment energy expenditure estimation, producing more robust and precise results.