Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
19 works
0 followers
17 topics
4 close collaborators

Published work

19 published item(s)

preprint, 2026, arXiv

BGG: Bridging the Geometric Gap between Cross-View images by Vision Foundation Model Adaptation for Geo-Localization

Geometric differences between cross-view images, such as drone and satellite views, significantly increase the challenge of Cross-View Geo-Localization (CVGL), which aims to acquire the geolocation of images by image retrieval. To further enhance the CVGL performance, this paper proposes a parameter-efficient adaptation framework for bridging the geometric gap across images based on the vision foundation model (VFM) (e.g., DINOv3), termed BGG. BGG not only effectively leverages the general visual representations of VFM and captures the robust and consistent features from cross-view images, but also utilizes the generalization capabilities of the VFM, significantly improving the CVGL performance. It mainly contains a Multi-granularity Feature Enhancement Adapter (MFEA) and a Frequency-Aware Structural Aggregation (FASA) module. Specifically, MFEA enhances the scale adaptability and viewpoint robustness of features by multi-level dilated convolutions, effectively bridging the cross-view geometric gap with small training costs. Additionally, considering the [CLS] token lacks spatial details for precise image retrieval and localization, the FASA module modulates patch tokens in the frequency domain and performs adaptive aggregation for local structural feature enhancement. Finally, BGG fuses the enhanced local features with the [CLS] token for more accurate CVGL. Extensive experiments on University-1652 and SUES-200 datasets demonstrate that BGG has significant advantages over other methods and achieves state-of-the-art localization performance with low training costs.

preprint, 2026, arXiv

Modulation Consistency-based Contrastive Learning for Self-Supervised Automatic Modulation Classification

Deep learning-based AMC methods have achieved remarkable performance, but their practical deployment remains constrained by the high cost of labeled data. Although self-supervised learning (SSL) reduces the reliance on labels, existing SSL-based AMC methods often rely on task-agnostic pretext objectives misaligned with modulation classification, leading to representations entangled with nuisance factors such as symbol, channel, and noise. In this paper, we identify intra-instance modulation consistency as a task-aware structural prior, whereby different temporal segments of the same signal may differ in waveform while preserving the same modulation type, thus providing a principled cue for task-aligned self-supervision. Based on this prior, we propose Mod-CL, a Modulation consistency-based Contrastive Learning framework that constructs positive pairs from different temporal segments of the same signal instance, to encourage the model to learn shared modulation information while suppressing nuisance variations. We further develop a contrastive objective tailored to Mod-CL, which jointly exploits temporal segmentation and data augmentation to pull together views sharing the same modulation semantics while avoiding supervisory conflicts within each signal instance. Extensive experiments on RadioML datasets show that Mod-CL consistently outperforms strong baselines, especially in low-label regimes, achieving substantial improvements in linear probing accuracy.
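
As a rough illustration of the positive-pair idea described in this abstract, the sketch below cuts one signal instance into temporal segments and pairs them up. The function names, the equal-length non-overlapping split, and the toy signal are assumptions made for illustration; the abstract does not specify how Mod-CL segments signals or defines its loss.

```python
def split_segments(signal, num_segments):
    """Split a list of samples into equal-length, non-overlapping temporal segments.

    Assumption: a plain equal split; the actual segmentation scheme is not given.
    """
    seg_len = len(signal) // num_segments
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(num_segments)]


def positive_pairs(signal, num_segments):
    """Return every pair of distinct segments taken from the same signal instance.

    Segments of one instance share a modulation type, so each pair is treated
    as a positive pair for contrastive learning.
    """
    segments = split_segments(signal, num_segments)
    return [
        (segments[i], segments[j])
        for i in range(len(segments))
        for j in range(i + 1, len(segments))
    ]


# Toy complex-valued (I/Q-style) signal with 8 samples; hypothetical data.
signal = [complex(i, -i) for i in range(8)]
pairs = positive_pairs(signal, num_segments=4)
```

With four segments, this yields six positive pairs per signal instance; a contrastive objective would then pull the embeddings of each pair together.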

preprint, 2026, arXiv

OpenEM: Large-scale multi-structural 3D datasets for electromagnetic methods

Electromagnetic methods have become one of the most widely used techniques in geological exploration. With the remarkable success of deep learning, applying such techniques to EM methods has emerged as a promising research direction to overcome the limitations of conventional approaches. The effectiveness of deep learning methods depends heavily on the quality of datasets, which directly influences model performance and generalization ability. Existing application studies often construct datasets from random one-dimensional or structurally simple three-dimensional models, which fail to represent real geological environments. Furthermore, the absence of standardized, publicly available 3D geoelectric datasets continues to hinder progress in deep learning-based EM exploration. To address these limitations, we present OpenEM, a large-scale, multi-structural three-dimensional geoelectric dataset that encompasses a broad range of geologically plausible subsurface structures. OpenEM consists of nine categories of geoelectric models, spanning from simple configurations with anomalous bodies in half-space to more complex structures such as flat layers, folded layers, flat faults, curved faults, and their corresponding variants with anomalous bodies. Since three-dimensional forward modeling in electromagnetics is extremely time-consuming, we further developed a deep learning-based fast forward modeling approach for OpenEM, enabling efficient and reliable forward modeling across the entire dataset. This capability allows OpenEM to be rapidly deployed for a wide range of tasks. OpenEM provides a unified, comprehensive, and large-scale dataset for common EM exploration systems to accelerate the application of deep learning in electromagnetic methods. The complete dataset is publicly available at https://doi.org/10.5281/zenodo.17141981.

preprint, 2026, arXiv

TAR: Text Semantic Assisted Cross-modal Image Registration Framework for Optical and SAR Images

Existing deep learning-based methods can capture shared features from optical and synthetic aperture radar (SAR) images for spatial alignment. However, optical-SAR registration remains challenging under large geometric deformations, because the model needs to simultaneously handle cross-modal appearance discrepancies and complex spatial transformations. To address this issue, this paper proposes a text semantic-assisted cross-modal image registration framework, named TAR, for optical and SAR images. TAR exploits text semantic priors from remote sensing scenes and land-cover categories to alleviate the modality gap and enhance cross-modal feature learning. TAR consists of three components: a multi-scale visual feature learning (MSFL) module, a text-assisted feature enhancement (TAFE) module, and a coarse-to-fine dense matching (CFDM) module. MSFL extracts multi-scale visual features from optical and SAR images. TAFE constructs text descriptors related to remote sensing scenes and land-cover objects, and uses a frozen RemoteCLIP text encoder to extract text features. These text features are introduced through visual-text interaction to enhance high-level visual features for more reliable coarse matching. CFDM then establishes coarse correspondences based on the enhanced high-level features and refines the matched locations using low-level features. Experimental results on cross-modal remote sensing images demonstrate the effectiveness of TAR, which achieves stronger matching performance than several state-of-the-art methods and yields significant gains under large geometric deformations.

preprint, 2025, arXiv

SeisRDT: Latent Diffusion Model Based On Representation Learning For Seismic Data Interpolation And Reconstruction

Due to limitations such as geographic, physical, or economic factors, collected seismic data often have missing traces. Traditional seismic data reconstruction methods face the challenge of selecting numerous empirical parameters and struggle to handle large-scale continuous missing traces. With the advancement of deep learning, various diffusion models have demonstrated strong reconstruction capabilities. However, these UNet-based diffusion models require significant computational resources and struggle to learn the correlation between different traces in seismic data. To address the complex and irregular missing situations in seismic data, we propose a latent diffusion transformer utilizing representation learning for seismic data reconstruction. By employing a mask modeling scheme based on representation learning, the representation module uses the token sequence of known data to infer the token sequence of unknown data, enabling the reconstructed data from the diffusion model to have a more consistent data distribution and better correlation and accuracy with the known data. We propose the Representation Diffusion Transformer architecture, in which a relative positional bias is added when calculating attention, enabling the diffusion model to achieve global modeling capability for seismic data. Using a pre-trained data compression model, we move the training and inference processes of the diffusion model into a latent space, which reduces computational and inference costs compared to other diffusion model-based reconstruction methods. Reconstruction experiments on field and synthetic datasets indicate that our method achieves higher reconstruction accuracy than existing methods and can handle various complex missing scenarios.

preprint, 2024, arXiv

Disentangle Estimation of Causal Effects from Cross-Silo Data

Estimating causal effects among different events is of great importance to critical fields such as drug development. Nevertheless, the data features associated with events may be distributed across various silos and remain private within respective parties, impeding direct information exchange between them. This, in turn, can result in biased estimations of local causal effects, which rely on the characteristics of only a subset of the covariates. To tackle this challenge, we introduce an innovative disentangle architecture designed to facilitate the seamless cross-silo transmission of model parameters, enriched with causal mechanisms, through a combination of shared and private branches. Besides, we introduce global constraints into the equation to effectively mitigate bias within the various missing domains, thereby elevating the accuracy of our causal effect estimation. Extensive experiments conducted on new semi-synthetic datasets show that our method outperforms state-of-the-art baselines.

preprint, 2022, arXiv

Temporal Cascade Model for Analyzing Spread in Evolving Networks with Disease Monitoring Applications

Current approaches for modeling propagation in networks (e.g., spread of disease) are unable to adequately capture temporal properties of the data such as order and duration of evolving connections or dynamic likelihoods of propagation along these connections. Temporal models in evolving networks are crucial in many applications that need to analyze dynamic spread. For example, a disease-spreading virus has varying transmissibility based on interactions between individuals occurring over time with different frequency, proximity, and venue population density. To capture such behaviors, we first develop the Temporal Independent Cascade (T-IC) model and propose a novel spread function, that we prove to be submodular, with a hypergraph-based sampling strategy that efficiently utilizes dynamic propagation probabilities. We then introduce the notion of 'reverse spread' using the proposed T-IC processes, and develop solutions to identify both sentinel/detector nodes and highly susceptible nodes. The proven guarantees of approximation quality enable scalable analysis of highly granular temporal networks. Extensive experimental results on a variety of real-world datasets show that the proposed approach significantly outperforms the alternatives in modeling both if and how spread occurs, by considering evolving network topology as well as granular contact/interaction information. Our approach has numerous applications, including its utility for the vital challenge of monitoring disease spread. Utilizing the proposed methods and T-IC, we analyze the impact of various intervention strategies over real spatio-temporal contact networks. Our approach is shown also to be highly effective in quantifying the importance of superspreaders, designing targeted restrictions for controlling spread, and backward contact tracing.

preprint, 2022, arXiv

The Outcome of the 2022 Landslide4Sense Competition: Advanced Landslide Detection from Multi-Source Satellite Imagery

The scientific outcomes of the 2022 Landslide4Sense (L4S) competition organized by the Institute of Advanced Research in Artificial Intelligence (IARAI) are presented here. The objective of the competition is to automatically detect landslides based on large-scale multiple sources of satellite imagery collected globally. The 2022 L4S aims to foster interdisciplinary research on recent developments in deep learning (DL) models for the semantic segmentation task using satellite imagery. In the past few years, DL-based models have achieved performance that meets expectations on image interpretation, due to the development of convolutional neural networks (CNNs). The main objective of this article is to present the details and the best-performing algorithms featured in this competition. The winning solutions are elaborated with state-of-the-art models like the Swin Transformer, SegFormer, and U-Net. Advanced machine learning techniques and strategies such as hard example mining, self-training, and mix-up data augmentation are also considered. Moreover, we describe the L4S benchmark data set in order to facilitate further comparisons, and report the results of the accuracy assessment online. The data is accessible on the Future Development Leaderboard for future evaluation at https://www.iarai.ac.at/landslide4sense/challenge/, and researchers are invited to submit more prediction results, evaluate the accuracy of their methods, compare them with those of other users, and, ideally, improve the landslide detection results reported in this article.

preprint, 2022, arXiv

Unbalanced-basis-misalignment tolerant measurement-device-independent quantum key distribution

Measurement-device-independent quantum key distribution (MDIQKD) is a revolutionary protocol since it is physically immune to all attacks on the detection side. However, the protocol still keeps the strict assumption on the source side that the four BB84 states must be perfectly prepared to ensure security. Some protocols relax part of the assumptions on the encoding system to maintain practical security, but their performance is dramatically reduced. In this work, we present an MDIQKD protocol that requires less knowledge of the encoding system to combat troublesome modulation errors and fluctuations. We have also experimentally demonstrated the protocol. The results indicate high performance and good security for practical applications. In addition, its robustness and flexibility make it valuable for complex scenarios such as QKD networks.

preprint, 2021, arXiv

Effects of the initial perturbations on the Rayleigh-Taylor-Kelvin-Helmholtz instability system

In the paper, the effects of initial perturbations on the Rayleigh-Taylor instability (RTI), Kelvin-Helmholtz instability (KHI), and the coupled Rayleigh-Taylor-Kelvin-Helmholtz instability (RTKHI) systems are investigated using a multiple-relaxation-time discrete Boltzmann model. Six different perturbation interfaces are designed to study the effects of the initial perturbations on the instability systems. Based on the mean heat flux strength $D_{3,1}$, the effects of initial interfaces on the coupled RTKHI are examined in detail. The research is focused on two aspects: (i) the main mechanism in the early stage of the RTKHI, (ii) the transition point from KHI-like to RTI-like for the case where the KHI dominates at earlier time and the RTI dominates at later time. It is found that the early main mechanism is related to the shape of the initial interface, which is represented by both the bilateral contact angle $θ_{1}$ and the middle contact angle $θ_{2}$. The influence of inverted parabolic and inverted ellipse perturbations ($θ_{1}<90$) on the transition point of the RTKHI system is greater than that of other interfaces.

preprint, 2021, arXiv

Experimental test of the majorization uncertainty relation with mixed states

The uncertainty relation lies at the heart of quantum theory and behaves as a non-classical constraint on the indeterminacies of incompatible observables in a system. In the literature, many experiments have been devoted to testing uncertainty relations, mainly focusing on pure states. In this work we test the novel majorization uncertainty relations of three incompatible observables using a series of mixed states with adjustable mixing degrees, and compare the compactness of various entropy uncertainty relations. The experimental results confirm that for a general mixed quantum system, the majorization uncertainty relation tends to be the tightest constraint on uncertainty, and indicate that the entropy uncertainty relation obtained from the majorization uncertainty relation is the optimal one. Our experimental setup provides an easy means of preparing mixed states, and on this basis simple optical elements can be used to realize the required quantum states.

preprint, 2021, arXiv

Quantum key distribution over scattering channel

Scattering of light by cloud, haze, and fog decreases the transmission efficiency of communication channels in quantum key distribution (QKD), reduces the system's practical security, and thus constrains the deployment of free-space QKD. Here, we employ wavefront shaping technology to compensate distorted optical signals in high-loss scattering quantum channels and carry out a polarization-encoded BB84 QKD experiment. With this quantum channel compensation technology, we achieve a typical enhancement of about 250 in transmission efficiency and improve the secure key rate from 0 to $1.85\times10^{-6}$ per sifted key. The method and its first-time validation show the great potential to expand the territory of QKD systems from lossless channels to highly scattering ones and therefore enhance the deployability of a global quantum communication network.

preprint, 2021, arXiv

Security Analysis and Improvement of Source Independent Quantum Random Number Generators with Imperfect Devices

A quantum random number generator (QRNG) as a genuine source of randomness is essential in many applications, such as number simulation and cryptography. Recently, a source-independent quantum random number generator (SI-QRNG), which can generate secure random numbers with untrusted sources, has been realized. However, the measurement loopholes of the trusted but imperfect devices used in SI-QRNGs have not yet been fully explored, which will cause security problems, especially in high-speed systems. Here, we point out and evaluate the security loopholes of practical imperfect measurement devices in SI-QRNGs. We also provide corresponding countermeasures to prevent these information leakages by recalculating the conditional minimum entropy and adding a monitor. Furthermore, by taking into account the finite-size effect, we show that the influence of the afterpulse can exceed that of the finite-size effect with a large number of sampled rounds. Our protocol is simple and effective, and it promotes the security of SI-QRNG in practice as well as the compatibility with high-speed measurement devices, thus paving the way for constructing ultrafast and security-certified commercial SI-QRNG systems.

preprint, 2020, arXiv

Efficient decoy-states for the reference-frame-independent measurement-device-independent quantum key distribution

Reference-frame-independent measurement-device-independent quantum key distribution (RFI-MDI-QKD) is a novel protocol which eliminates all possible attacks on the detector side and the need for reference-frame alignment on the source side. However, its performance may degrade notably due to statistical fluctuations, since more parameters, e.g. yields and error rates for mismatched-basis events, must be accumulated to monitor the security. In this work, we find that the original decoy-state method estimates these yields too pessimistically since it ignores the potential relations between different bases. By processing the parameters of different bases jointly, the performance of RFI-MDI-QKD is greatly improved in terms of secret key rate and achievable distance when statistical fluctuations are considered. Our results pave the way towards practical RFI-MDI-QKD.

preprint, 2020, arXiv

Finite-key analysis for twin-field quantum key distribution based on generalized operator dominance condition

Quantum key distribution (QKD) can help two distant peers to share secret key bits, whose security is guaranteed by the laws of physics. In practice, the secret key rate of a QKD protocol always decreases as the channel distance increases, which severely limits the applications of QKD. Recently, twin-field (TF) QKD has been proposed and intensively studied, since it can beat the rate-distance limit and greatly increase the achievable distance of QKD. Remarkably, K. Maeda et al. proposed a simple finite-key analysis for TF-QKD based on an operator dominance condition. Although they showed that their method is sufficient to beat the rate-distance limit, their operator dominance condition is not general, i.e. it can only be applied in three-decoy-state scenarios, which implies that its key rate cannot be increased by introducing more decoy states, and also cannot reach the asymptotic bound even when preparing infinitely many decoy states and optical pulses. Here, to bridge this gap, we propose an improved finite-key analysis of TF-QKD by devising a new operator dominance condition. We show that by increasing the number of decoy states, the secret key rate can be further improved and approach the asymptotic bound. Our theory can be directly used in TF-QKD experiments to obtain higher secret key rates.

preprint, 2020, arXiv

Gate-tunable van der Waals heterostructure for reconfigurable neural network vision sensor

Early processing of visual information takes place in the human retina. Mimicking the neurobiological structures and functionalities of the retina provides a promising pathway to achieving a vision sensor with highly efficient image processing. Here, we demonstrate a prototype vision sensor that operates via the gate-tunable positive and negative photoresponses of van der Waals (vdW) vertical heterostructures. The sensor emulates not only the neurobiological functionalities of bipolar cells and photoreceptors but also the unique synaptic connectivity between bipolar cells and photoreceptors. By tuning the gate voltage for each pixel, we achieve a reconfigurable vision sensor for simultaneous image sensing and processing. Furthermore, our prototype vision sensor itself can be trained to classify the input images by updating the gate voltages applied individually to each pixel in the sensor. Our work indicates that vdW vertical heterostructures offer a promising platform for the development of neural network vision sensors.

preprint, 2020, arXiv

Non-Markovian Majority-Vote model

Non-Markovian dynamics pervades human activity and social networks and it induces memory effects and burstiness in a wide range of processes including inter-event time distributions, duration of interactions in temporal networks and human mobility. Here we propose a non-Markovian Majority-Vote model (NMMV) that introduces non-Markovian effects in the standard (Markovian) Majority-Vote model (SMV). The SMV model is one of the simplest two-state stochastic models for studying opinion dynamics, and displays a continuous order-disorder phase transition at a critical noise. In the NMMV model we assume that the probability that an agent changes state is not only dependent on the majority state of his neighbors but it also depends on his age, i.e. how long the agent has been in his current state. The NMMV model has two regimes: the aging regime implies that the probability that an agent changes state is decreasing with his age, while in the anti-aging regime the probability that an agent changes state is increasing with his age. Interestingly, we find that the critical noise at which we observe the order-disorder phase transition is a non-monotonic function of the rate $β$ of the aging (anti-aging) process. In particular the critical noise in the aging regime displays a maximum as a function of $β$ while in the anti-aging regime displays a minimum. This implies that the aging/anti-aging dynamics can retard/anticipate the transition and that there is an optimal rate $β$ for maximally perturbing the value of the critical noise. The analytical results obtained in the framework of the heterogeneous mean-field approach are validated by extensive numerical simulations on a large variety of network topologies.
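
To make the two ingredients in this abstract concrete, the sketch below combines the standard majority-vote flip rate with an age-dependent activation factor. The flip rate is the usual SMV form, 0.5 * (1 - (1 - 2f) * s_i * sign(sum of neighbor spins)), for noise f. The exponential age dependence and the function names are assumptions chosen only for illustration; the abstract does not give the functional form used in the NMMV model.

```python
import math


def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def smv_flip_probability(spin, neighbor_spins, noise):
    """Standard majority-vote flip probability for a +1/-1 spin.

    An agent aligned with the local majority flips with probability `noise`;
    an agent opposed to it flips with probability 1 - noise; ties give 0.5.
    """
    majority = sign(sum(neighbor_spins))
    return 0.5 * (1.0 - (1.0 - 2.0 * noise) * spin * majority)


def activation_probability(age, beta, aging=True):
    """Hypothetical age-dependent activation probability (illustrative only).

    Aging: decreases with age. Anti-aging: increases with age.
    """
    if aging:
        return math.exp(-beta * age)
    return 1.0 - math.exp(-beta * (age + 1))


def nmmv_flip_probability(spin, neighbor_spins, noise, age, beta, aging=True):
    """Agent is first activated based on its age, then applies the SMV rule."""
    return activation_probability(age, beta, aging) * smv_flip_probability(
        spin, neighbor_spins, noise
    )
```

In a simulation, an agent's age would be reset to zero whenever it changes state and incremented otherwise, which is what makes the dynamics non-Markovian.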

preprint, 2020, arXiv

Optimized protocol for twin-field quantum key distribution

Twin-field quantum key distribution (TF-QKD) and its variant protocols are highly attractive due to the advantage of overcoming the rate-loss limit for secret key rates of point-to-point QKD protocols. For variations of TF-QKD, the key point to ensure security is switching randomly between a code mode and a test mode. Among all TF-QKD protocols, their code modes are very different, e.g. modulating continuous phases, modulating only two opposite phases, and sending or not sending signal pulses. Here we show that, by discretizing the number of global phases in the code mode, we can give a unified view on the first two types of TF-QKD protocols, and demonstrate that increasing the number of discrete phases extends the achievable distance, and as a trade-off, lowers the secret key rate at short distances due to the phase post-selection.

preprint, 2020, arXiv

Quantum key distribution with dissipative Kerr soliton generated by on-chip microresonators

Quantum key distribution (QKD) can distribute symmetric key bits between remote legitimate users with the guarantee of quantum mechanics principles. For practical applications, compact and robust photonic components for QKD are essential, and there is increasing attention on integrating the source, detector and modulators on a photonic chip. However, massive and parallel QKD based on wavelength multiplexing is still a challenge, due to the limited coherent light sources on the chip. Here, we introduce the dissipative Kerr soliton in a microresonator, which provides a locked coherent frequency comb with 49 GHz frequency spacing, for QKD. We demonstrate parallel QKD by demultiplexing the coherent comb lines from the soliton, and show the potential of a Gbps secret key rate if the hundreds of channels covering the C and L bands are fully exploited. The demonstrated soliton-based QKD architecture is compatible with the efforts on quantum photonic integrated circuits, which are compact, robust and low-cost, and provides a competitive platform for practical QKD chips.