Source author record

Yang Xiang

Yang Xiang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Catalog footprint

What is connected

60 works

25 topics

4 close collaborators

preprint, 2026, arXiv

Diffeomorphic Cortical Alignment via Direct Warping of Streamline Endpoints

Cortical surface registration is often driven by local geometric descriptors (e.g., sulcal depth and curvature). While this approach achieves geometric correspondence, it neglects the long-range wiring constraints imposed by white-matter anatomy. Diffusion MRI tractography offers these crucial constraints; however, prior connectivity-informed pipelines typically align precomputed connectivity matrices, making the optimization highly sensitive to connectivity estimation and its resolution. In this paper, we introduce a novel connectivity-based surface registration method that aligns cortical surfaces by operating directly on white-matter fiber-tract endpoints. We model tract endpoints as a point cloud on the product manifold $Ω\times Ω$, where $Ω$ represents the spherical domain of the inflated cortical hemispheres. Our alignment method iteratively (i) computes a small diffeomorphic warp for $Ω$ by minimizing connectivity mismatch, and (ii) updates the endpoints based on this warp. The method relies on a geometric framework that ensures output warps are diffeomorphisms and has a final goal that optimizes the matching of well-known fiber bundles. Experiments on Human Connectome Project (HCP) data demonstrate improved tract-level correspondence, achieving higher connectivity-level overlap coefficients on major fiber bundles and stronger robustness across grid resolutions for $Ω$ compared to state-of-the-art methods such as ENCORE and MSMAll.

preprint, 2026, arXiv

Exploring the Translation Mechanism of Large Language Models

While large language models (LLMs) demonstrate remarkable success in multilingual translation, their internal core translation mechanisms, even at the fundamental word level, remain insufficiently understood. To address this critical gap, this work introduces a systematic framework for interpreting the mechanism behind LLM translation from the perspective of computational components. This paper first proposes subspace-intervened path patching for precise, fine-grained causal analysis, enabling the detection of components crucial to translation tasks and subsequently characterizing their behavioral patterns in human-interpretable terms. Comprehensive experiments reveal that translation is predominantly driven by a sparse subset of components: specialized attention heads serve critical roles in extracting source language, translation indicators, and positional features, which are then integrated and processed by specific multi-layer perceptrons (MLPs) into intermediary English-centric latent representations before ultimately yielding the final translation. The significance of these findings is underscored by the empirical demonstration that targeted fine-tuning of a minimal parameter subset ($<5\%$) enhances translation performance while preserving general capabilities. This result further indicates that these crucial components generalize effectively to sentence-level translation and are instrumental in elucidating more intricate translation tasks.

preprint, 2026, arXiv

StablePDENet: Enhancing Stability of Operator Learning for Solving Differential Equations

Learning solution operators for differential equations with neural networks has shown great potential in scientific computing, but ensuring their stability under input perturbations remains a critical challenge. This paper presents a robust self-supervised neural operator framework that enhances stability through adversarial training while preserving accuracy. We formulate operator learning as a min-max optimization problem, where the model is trained against worst-case input perturbations to achieve consistent performance under both normal and adversarial conditions. We demonstrate that our method not only achieves good performance on standard inputs, but also maintains high fidelity under adversarially perturbed inputs. The results highlight the importance of stability-aware training in operator learning and provide a foundation for developing reliable neural PDE solvers in real-world applications, where input noise and uncertainties are inevitable.
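
The min-max formulation described in this abstract can be written schematically as below. The notation is an assumption made for illustration and is not taken from the paper: $G_\theta$ is the neural operator with parameters $\theta$, $a$ is an input function, $\delta$ is an input perturbation bounded by a budget $\epsilon$, and $\mathcal{L}$ is the self-supervised training loss.

$$\min_{\theta}\; \mathbb{E}_{a}\Big[\max_{\|\delta\|\le\epsilon} \mathcal{L}\big(G_\theta(a+\delta),\, a+\delta\big)\Big]$$

In this sketch, setting $\epsilon = 0$ removes the inner maximization and recovers ordinary training on unperturbed inputs.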

preprint, 2026, arXiv

TeachPro: Multi-Label Qualitative Teaching Evaluation via Cross-View Graph Synergy and Semantic Anchored Evidence Encoding

Standardized Student Evaluations of Teaching often suffer from low reliability, restricted response options, and response distortion. Existing machine learning methods that mine open-ended comments usually reduce feedback to binary sentiment, which overlooks concrete concerns such as content clarity, feedback timeliness, and instructor demeanor, and provides limited guidance for instructional improvement. We propose TeachPro, a multi-label learning framework that systematically assesses five key teaching dimensions: professional expertise, instructional behavior, pedagogical efficacy, classroom experience, and other performance metrics. We first propose a Dimension-Anchored Evidence Encoder, which integrates three core components: (i) a pre-trained text encoder that transforms qualitative feedback annotations into contextualized embeddings; (ii) a prompt module that represents five teaching dimensions as learnable semantic anchors; and (iii) a cross-attention mechanism that aligns evidence with pedagogical dimensions within a structured semantic space. We then propose a Cross-View Graph Synergy Network to represent student comments. This network comprises two components: (i) a Syntactic Branch that extracts explicit grammatical dependencies from parse trees, and (ii) a Semantic Branch that models latent conceptual relations derived from BERT-based similarity graphs. A BiAffine fusion module aligns syntactic and semantic units, while a differential regularizer disentangles embeddings to encourage complementary representations. Finally, a cross-attention mechanism bridges the dimension-anchored evidence with the multi-view comment representations. We also contribute a novel benchmark dataset featuring expert qualitative annotations and multi-label scores. Extensive experiments demonstrate that TeachPro offers superior diagnostic granularity and robustness across diverse evaluation settings.

preprint, 2025, arXiv

Dual prototype attentive graph network for cross-market recommendation

Cross-market recommender systems (CMRS) aim to utilize historical data from mature markets to promote multinational products in emerging markets. However, existing CMRS approaches often overlook the potential for shared preferences among users in different markets, focusing primarily on modeling specific preferences within each market. In this paper, we argue that incorporating both market-specific and market-shared insights can enhance the generalizability and robustness of CMRS. We propose a novel approach called Dual Prototype Attentive Graph Network for Cross-Market Recommendation (DGRE) to address this. DGRE leverages prototypes based on graph representation learning from both items and users to capture market-specific and market-shared insights. Specifically, DGRE incorporates market-shared prototypes by clustering users from various markets to identify behavioural similarities and create market-shared user profiles. Additionally, it constructs item-side prototypes by aggregating item features within each market, providing valuable market-specific insights. We conduct extensive experiments to validate the effectiveness of DGRE on a real-world cross-market dataset, and the results show that considering both market-specific and market-sharing aspects in modelling can improve the generalization and robustness of CMRS.

preprint, 2025, arXiv

Non-Euclidean interfaces decode the continuous landscape of graphene-induced surface reconstructions

Interfacial reconstruction between two-dimensional (2D) materials and metal substrates fundamentally governs heterostructure properties, yet conventional flat substrates fail to capture the continuous crystallographic landscape. Here, we overcome this topological limitation using non-Euclidean interfaces (curved 2D graphene-copper surfaces as a model system) to traverse the infinite spectrum of lattice orientations. By integrating multimodal microscopy with a deep-learning-enhanced dimensional upscaling framework, we translate 2D scanning electron microscopy (SEM) contrast into quantitative three-dimensional (3D) morphologies with accurate facet identification. Coupling these observations with machine-learning-assisted density functional theory, we demonstrate that reconstruction is governed by a unified thermodynamic mechanism where high-index facets correspond to specific local minima in the surface energy landscape. This work resolves the long-standing complexity of graphene-copper faceting and establishes non-Euclidean surface topologies as a generalizable paradigm for decoding and controlling interfacial reconstruction in diverse metal-2D material systems.

preprint, 2025, arXiv

RDSA: A Robust Deep Graph Clustering Framework via Dual Soft Assignment

Graph clustering is an essential aspect of network analysis that involves grouping nodes into separate clusters. Recent developments in deep learning have led to deep graph clustering methods, which have proven effective in many applications. Nonetheless, these methods often encounter difficulties when dealing with real-world graphs, particularly in the presence of noisy edges. Additionally, many denoising graph clustering methods tend to suffer from lower performance, training instability, and challenges in scaling to large datasets compared to non-denoised models. To tackle these issues, we introduce a new framework called the Robust Deep Graph Clustering Framework via Dual Soft Assignment (RDSA). RDSA consists of three key components: (i) a node embedding module that effectively integrates the graph's topological features and node attributes; (ii) a structure-based soft assignment module that improves graph modularity by utilizing an affinity matrix for node assignments; and (iii) a node-based soft assignment module that identifies community landmarks and refines node assignments to enhance the model's robustness. We assess RDSA on various real-world datasets, demonstrating its superior performance relative to existing state-of-the-art methods. Our findings indicate that RDSA provides robust clustering across different graph types, excelling in clustering effectiveness and robustness, including adaptability to noise, stability, and scalability.

preprint, 2022, arXiv

A Bayesian Permutation training deep representation learning method for speech enhancement with variational autoencoder

Recently, the variational autoencoder (VAE), a deep representation learning (DRL) model, has been used to perform speech enhancement (SE). However, to the best of our knowledge, current VAE-based SE methods only apply the VAE to model the speech signal, while noise is modeled using the traditional non-negative matrix factorization (NMF) model. One of the most important reasons for using NMF is that these VAE-based methods cannot disentangle the speech and noise latent variables from the observed signal. Based on Bayesian theory, this paper derives a novel variational lower bound for the VAE, which ensures that the VAE can be trained in a supervised manner and can disentangle speech and noise latent variables from the observed signal. This means that the proposed method can apply the VAE to model both speech and noise signals, which is fundamentally different from previous VAE-based SE works. More specifically, the proposed DRL method can learn to impose speech and noise signal priors on different sets of latent variables for SE. The experimental results show that the proposed method can not only disentangle speech and noise latent variables from the observed signal but also obtain a higher scale-invariant signal-to-distortion ratio and speech quality score than a comparable deep neural network-based (DNN) SE method.

preprint, 2022, arXiv

A deep representation learning speech enhancement method using $β$-VAE

In previous work, we proposed a variational autoencoder-based (VAE) Bayesian permutation training speech enhancement (SE) method (PVAE), which indicated that the SE performance of the traditional deep neural network-based (DNN) method could be improved by deep representation learning (DRL). Building on that work, in this paper we propose to use $β$-VAE to further improve PVAE's representation learning ability. More specifically, our $β$-VAE can improve PVAE's capacity for disentangling different latent variables from the observed signal without the trade-off between disentanglement and signal reconstruction. This trade-off problem is widespread in previous $β$-VAE algorithms. Unlike the previous $β$-VAE algorithms, the proposed $β$-VAE strategy can also be used to optimize the DNN's structure. This means that the proposed method can not only improve PVAE's SE performance but also reduce the number of PVAE training parameters. The experimental results show that the proposed method can acquire better speech and noise latent representations than PVAE. It also obtains a higher scale-invariant signal-to-distortion ratio, speech quality, and speech intelligibility.

preprint, 2022, arXiv

Approximation of Functionals by Neural Network without Curse of Dimensionality

In this paper, we establish a neural network to approximate functionals, which are maps from infinite dimensional spaces to finite dimensional spaces. The approximation error of the neural network is $O(1/\sqrt{m})$, where $m$ is the size of the network, which overcomes the curse of dimensionality. The key idea of the approximation is to define a Barron spectral space of functionals.

preprint, 2022, arXiv

Bunching instability and asymptotic properties in epitaxial growth with elasticity effects: continuum model

We study the continuum epitaxial model for elastic interacting atomic steps on vicinal surfaces proposed by Xiang and E (Xiang, SIAM J. Appl. Math. 63:241-258, 2002; Xiang and E, Phys. Rev. B 69:035409, 2004). The non-local term and the singularity complicate the analysis of its PDE. In this paper, we first generalize this model to the Lennard-Jones (m,n) interaction between steps. Based on several important formulations of the non-local energy, we prove the existence, symmetry, unimodality, and regularity of the energy minimizer in the periodic setting. In particular, the symmetry and unimodality of the minimizer imply that it has a bunching profile. Furthermore, we derive the minimum energy scaling law for the original continuum model. All results are consistent with the corresponding results proved for discrete models by Luo et al. (Luo et al., Multiscale Model. Simul. 14:737-771, 2016).

preprint, 2022, arXiv

CATNet: Cross-event Attention-based Time-aware Network for Medical Event Prediction

Medical event prediction (MEP) is a fundamental task in the medical domain, which requires predicting medical events, including medications, diagnosis codes, laboratory tests, procedures, outcomes, and so on, according to historical medical records. The task is challenging because medical data is a type of complex time series data with heterogeneous and temporally irregular characteristics. Many machine learning methods that consider the two characteristics have been proposed for medical event prediction. However, most of them consider the two characteristics separately and ignore the correlations among different types of medical events, especially relations between historical medical events and target medical events. In this paper, we propose a novel neural network based on the attention mechanism, called cross-event attention-based time-aware network (CATNet), for medical event prediction. It is a time-aware, event-aware and task-adaptive method with the following advantages: 1) modeling heterogeneous information and temporal information in a unified way and considering temporally irregular characteristics both locally and globally, and 2) taking full advantage of correlations among different types of events via cross-event attention. Experiments on two public datasets (MIMIC-III and eICU) show that CATNet adapts to different MEP tasks and outperforms other state-of-the-art methods on various MEP tasks. The source code of CATNet will be released after this manuscript is accepted.

preprint, 2022, arXiv

Existence, uniqueness, and energy scaling of 2+1 dimensional continuum model for stepped epitaxial surfaces with elastic effects

We study the 2+1 dimensional continuum model for the evolution of stepped epitaxial surfaces under long-range elastic interaction proposed by Xu and Xiang (SIAM J. Appl. Math. 69, 1393-1414, 2009). The long-range interaction term and the two length scales in this model make PDE analysis challenging. Moreover, unlike in the 1+1 dimensional case, there is a nonconvexity contribution in the total energy in the 2+1 dimensional case, and it is not easy to prove that the solution is always in the well-posed regime during the evolution. In this paper, we propose a modified 2+1 dimensional continuum model based on the underlying physics. This modification fixes the problem of possible ill-posedness due to the nonconvexity of the energy functional. We prove the existence and uniqueness of both the static and dynamic solutions and derive a minimum energy scaling law for them. We show that the minimum energy surface profile is mainly attained by surfaces with step meandering instability. This is essentially different from the energy scaling law for the 1+1 dimensional epitaxial surfaces under elastic effects attained by step bunching surface profiles. We also discuss the transition from the step bunching instability to the step meandering instability in 2+1 dimensions.

preprint, 2022, arXiv

Exploring Unfairness on Proof of Authority: Order Manipulation Attacks and Remedies

Proof of Authority (PoA) is a type of permissioned consensus algorithm with a fixed committee. PoA has been widely adopted by communities and industries due to its better performance and faster finality. In this paper, we explore the unfairness issue existing in the current PoA implementations. We have investigated 2,500+ in-the-wild projects and selected 10+ as our main focus (covering Ethereum, Binance smart chain, etc.). We have identified two types of order manipulation attacks that separately break the transaction-level (a.k.a. transaction ordering) and the block-level (sealer position ordering) fairness. Both of them rely only on the honest-but-profitable sealer assumption without modifying original settings. We launch these attacks on the forked branches under an isolated environment and carefully evaluate the attacking scope towards different implementations. To date (as of Nov 2021), the potentially affected PoA market cap can reach up to $681,087$ million USD. In addition, we dive into the source code of selected projects and, accordingly, propose our recommendations for the fix. To the best of our knowledge, this work provides the first exploration of the unfairness issue in PoA algorithms.

preprint, 2022, arXiv

FAAG: Fast Adversarial Audio Generation through Interactive Attack Optimisation

Automatic Speech Recognition services (ASRs) inherit deep neural networks' vulnerabilities, such as crafted adversarial examples. Existing methods often suffer from low efficiency because the target phases are added to the entire audio sample, resulting in high demand for computational resources. This paper proposes a novel scheme named FAAG as an iterative optimization-based method to generate targeted adversarial examples quickly. By injecting the noise over the beginning part of the audio, FAAG quickly generates high-quality adversarial audio with a high success rate. Specifically, we use the audio's logits output to map each character in the transcription to an approximate position of the audio's frame. Thus, an adversarial example can be generated by FAAG in approximately two minutes using CPUs only and around ten seconds with one GPU while maintaining an average success rate over 85%. Compared with the baseline method, FAAG speeds up the adversarial example generation process by around 60%. Furthermore, we found that appending benign audio to any suspicious examples can effectively defend against the targeted adversarial attack. We hope that this work paves the way for inventing new adversarial attacks against speech recognition with computational constraints.

preprint, 2022, arXiv

Formal Security Analysis on dBFT Protocol of NEO

NEO is one of the top public chains worldwide. We focus on its backbone consensus protocol, called delegated Byzantine Fault Tolerance (dBFT). The dBFT protocol has been adopted by a variety of blockchain systems such as ONT. dBFT claims to guarantee security when no more than $f = \lfloor \frac{n}{3} \rfloor$ nodes are Byzantine, where $n$ is the total number of consensus participants. However, we identify attacks that break the claimed security. In this paper, we show our results by providing a security analysis of the dBFT protocol. First, we evaluate NEO's source code and formally present the procedures of dBFT via the state machine replication (SMR) model. Next, we provide a theoretical analysis with two example attacks. These attacks break the security of dBFT with no more than $f$ nodes. Then, we provide recommendations on how to fix the system against the identified attacks. The suggested fixes have been accepted by the NEO official team. Finally, we further discuss the reasons causing such issues, the relationship with current permissioned blockchain systems, and the scope of potential influence.
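
As a small arithmetic aid, the sketch below tabulates the threshold $f = \lfloor n/3 \rfloor$ quoted in this abstract next to the classical Byzantine fault tolerance requirement $n \ge 3f + 1$. The function names are hypothetical, and the side-by-side comparison is an addition for illustration only; it is not a description of the attacks analyzed in the paper.

```python
def quoted_tolerance(n: int) -> int:
    """Fault threshold f = floor(n / 3), as quoted in the abstract."""
    return n // 3


def classical_tolerance(n: int) -> int:
    """Largest f satisfying the classical BFT requirement n >= 3f + 1."""
    return (n - 1) // 3


# Tabulate both thresholds for a few committee sizes.
# The two values differ exactly when n is a positive multiple of 3.
table = {n: (quoted_tolerance(n), classical_tolerance(n)) for n in (4, 6, 7, 9)}
for n, (quoted, classical) in table.items():
    print(f"n={n}: quoted f={quoted}, classical f={classical}")
```

The two formulas agree for most committee sizes and differ by one when $n$ is divisible by 3.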

preprint, 2022, arXiv

Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters

The ever-growing model size and scale of compute have attracted increasing interest in training deep learning models over multiple nodes. However, training on cloud clusters, especially across remote clusters, poses major challenges. In this work, we introduce a general framework, Nebula-I, for collaboratively training deep learning models over remote heterogeneous clusters, the connections between which are low-bandwidth wide area networks (WANs). We take natural language processing (NLP) as an example to show how Nebula-I works in different training phases, which include: a) pre-training a multilingual language model using two remote clusters; and b) fine-tuning a machine translation model using knowledge distilled from pre-trained models. Together these phases cover the most popular paradigm of recent deep learning. To balance accuracy and communication efficiency, Nebula-I jointly applies parameter-efficient training strategies, hybrid parallel computing methods and adaptive communication acceleration techniques. Meanwhile, security strategies are employed to guarantee safety, reliability and privacy in intra-cluster computation and inter-cluster communication. Nebula-I is implemented with the PaddlePaddle deep learning framework, which can support collaborative training over heterogeneous hardware, e.g. GPU and NPU. Experiments demonstrate that the proposed framework can substantially improve training efficiency while preserving satisfactory NLP performance. By using Nebula-I, users can run large-scale training tasks over cloud clusters with minimal development effort, and the utility of existing large pre-trained models can be further promoted. We also introduce new state-of-the-art results on cross-lingual natural language inference tasks, which are generated based upon a novel learning framework and Nebula-I.

preprint, 2022, arXiv

Stochastic Continuum Models for High-Entropy Alloys with Short-range Order

High entropy alloys (HEAs) are a class of novel materials that exhibit superb engineering properties. It has been demonstrated by extensive experiments and first principles/atomistic simulations that short-range order in the atomic level randomness strongly influences the properties of HEAs. In this paper, we derive stochastic continuum models for HEAs with short-range order from atomistic models. A proper continuum limit is obtained such that the mean and variance of the atomic level randomness together with the short-range order described by a characteristic length are kept in the process from the atomistic interaction model to the continuum equation. The obtained continuum model with short-range order is in the form of an Ornstein-Uhlenbeck (OU) process. This validates the continuum model based on the OU process adopted phenomenologically by Zhang et al. [Acta Mater., 166 (2019), pp. 424-434] for HEAs with short-range order. We derive such stochastic continuum models with short-range order for both elasticity in HEAs without defects and HEAs with dislocations (line defects). The obtained stochastic continuum models are based on the energy formulations, whose variations lead to stochastic partial differential equations.
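
To make the role of the mean, variance and characteristic length concrete, here is a minimal sketch of a one-dimensional stationary Ornstein-Uhlenbeck process sampled on a uniform grid. It is a generic textbook construction, not the continuum model derived in the paper; the function names, grid spacing `h`, and parameter values are all assumptions chosen for illustration. The sampler uses the exact one-step transition of the OU process, so the correlation between two grid points a distance $d$ apart is $e^{-d/\ell}$ for characteristic length $\ell$.

```python
import math
import random


def simulate_ou(n_steps, h, length, mean=0.0, sigma=1.0, seed=0):
    """Sample a stationary OU process on a grid with spacing h.

    `length` is the correlation (characteristic) length, `mean` and
    `sigma` are the stationary mean and standard deviation.
    """
    rng = random.Random(seed)
    decay = math.exp(-h / length)                       # one-step correlation
    noise_scale = sigma * math.sqrt(1.0 - decay * decay)
    x = mean + sigma * rng.gauss(0.0, 1.0)              # draw from the stationary law
    path = [x]
    for _ in range(n_steps - 1):
        # Exact OU transition: relax toward the mean, then add fresh noise.
        x = mean + (x - mean) * decay + noise_scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def autocorrelation(path, lag):
    """Empirical autocorrelation of `path` at an integer lag."""
    n = len(path) - lag
    m = sum(path) / len(path)
    var = sum((p - m) ** 2 for p in path) / len(path)
    cov = sum((path[i] - m) * (path[i + lag] - m) for i in range(n)) / n
    return cov / var


# Grid spacing 0.1 and characteristic length 1.0, so lag 10 spans one length.
path = simulate_ou(n_steps=100_000, h=0.1, length=1.0)
rho = autocorrelation(path, lag=10)
print(f"empirical correlation at one characteristic length: {rho:.3f}")
print(f"theoretical value exp(-1): {math.exp(-1.0):.3f}")
```

A shorter characteristic length makes neighboring values decorrelate faster while leaving the stationary mean and variance unchanged.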

preprint, 2022, arXiv

Video is All You Need: Attacking PPG-based Biometric Authentication

Unobservable physiological signals enhance biometric authentication systems. Photoplethysmography (PPG) signals are convenient owing to their ease of measurement and are usually well protected against remote adversaries in authentication. Any leaked PPG signals help adversaries compromise the biometric authentication systems, and the advent of remote PPG (rPPG) enables adversaries to acquire PPG signals through restoration. While potentially dangerous, rPPG-based attacks are overlooked because existing methods require the victim's PPG signals. This paper proposes a novel spoofing attack approach that uses the waveforms of rPPG signals extracted from video clips to fool PPG-based biometric authentication. We develop a new PPG restoration model that does not require leaked PPG signals for adversarial attacks. Test results on state-of-the-art PPG-based biometric authentication show that the signals recovered through rPPG pose a severe threat to PPG-based biometric authentication.

preprint, 2022, arXiv

Weak solutions to an initial-boundary value problem for a continuum equation of motion of grain boundaries

We investigate an initial-(periodic-)boundary value problem for a continuum equation, which is a model for motion of grain boundaries based on the underlying microscopic mechanisms of line defects (disconnections) and integrates the effects of a diverse range of thermodynamic driving forces. We first prove the global-in-time existence and uniqueness of a weak solution to this initial-boundary value problem in the case with positive equilibrium disconnection density parameter B, and then investigate the asymptotic behavior of the solutions as B goes to zero. The main difficulties in the proofs of the main theorems are due to the degeneracy at B=0, a non-local term with singularity, and a non-smooth coefficient of the highest derivative associated with the gradient of the unknown. The key ingredients in the proofs are the energy method, an estimate for a singular integral of the Hilbert type, and a compactness lemma.

preprint, 2022, arXiv

Well-posedness of a modified degenerate Cahn-Hilliard model for surface diffusion

We study the well-posedness of a modified degenerate Cahn-Hilliard type model for surface diffusion. With degenerate phase-dependent diffusion mobility and an additional stabilizing function, this model is able to give the correct sharp interface limit. We introduce a notion of weak solutions for the nonlinear model. The existence result is obtained by approximations of the proposed model with nondegenerate mobilities. We also employ this method to prove existence of weak solutions to a related model where the chemical potential contains a nonlocal term originating from self-climb of dislocations in crystalline materials.

preprint, 2020, arXiv

A New Formulation of Coupling and Sliding Motions of Grain Boundaries Based on Dislocation Structure

A continuum model of the two dimensional low angle grain boundary motion and the dislocation structure evolution on the grain boundaries has been developed in Ref. [48]. The model is based on the motion and reaction of the constituent dislocations of the grain boundaries. The long-range elastic interaction between dislocations is included in the continuum model, and it maintains a stable dislocation structure described by Frank's formula for grain boundaries. In this paper, we develop a new continuum model for the coupling and sliding motions of grain boundaries that avoids the time-consuming calculation of the long-range elastic interaction. In this model, the long-range elastic interaction is replaced by a constraint given by Frank's formula. The constrained evolution problem in our new continuum model is then solved by using the projection method. Effects of the coupling and sliding motions in our new continuum model and its relationship with the classical motion by curvature model are discussed. The continuum model is validated by comparisons with the discrete dislocation dynamics model and the earlier continuum model [48] in which the long-range dislocation interaction is explicitly calculated.

preprint, 2020, arXiv

A Speech Enhancement Algorithm based on Non-negative Hidden Markov Model and Kullback-Leibler Divergence

In this paper, we propose a novel supervised single-channel speech enhancement method combining the Kullback-Leibler divergence-based non-negative matrix factorization (NMF) and hidden Markov model (NMF-HMM). With the application of HMM, the temporal dynamics information of speech signals can be taken into account. In the training stage, the sum of Poisson, leading to the KL divergence measure, is used as the observation model for each state of HMM. This ensures that a computationally efficient multiplicative update can be used for the parameter update of the proposed model. In the online enhancement stage, we propose a novel minimum mean-square error (MMSE) estimator for the proposed NMF-HMM. This estimator can be implemented using parallel computing, reducing the computation time. The performance of the proposed algorithm is verified by objective measures. The experimental results show that the proposed strategy achieves better speech enhancement performance than state-of-the-art speech enhancement methods. More specifically, compared with the traditional NMF-based speech enhancement methods, our proposed algorithm achieves a 5\% improvement for short-time objective intelligibility (STOI) and a 0.18 improvement for perceptual evaluation of speech quality (PESQ).
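
For readers unfamiliar with the multiplicative updates mentioned in this abstract, the sketch below shows the standard multiplicative update rules for plain NMF under the generalized KL divergence. This is the classical NMF building block only, not the NMF-HMM training procedure of the paper; the helper names, the toy matrix `V`, and the rank are assumptions for illustration. Pure Python lists are used to keep the example dependency-free.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kl_divergence(V, R, eps=1e-12):
    """Generalized KL divergence D(V || R), summed over all entries."""
    total = 0.0
    for v_row, r_row in zip(V, R):
        for v, r in zip(v_row, r_row):
            log_term = v * math.log((v + eps) / (r + eps)) if v > 0 else 0.0
            total += log_term - v + r
    return total


def kl_nmf(V, rank, iters=100, seed=0, eps=1e-12):
    """Factor V (n x m, non-negative) as W @ H with multiplicative KL updates."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    history = []
    for _ in range(iters):
        # H <- H * (W^T (V / WH)) / (column sums of W)
        R = matmul(W, H)
        Q = [[V[i][j] / (R[i][j] + eps) for j in range(m)] for i in range(n)]
        num = matmul(transpose(W), Q)
        w_sums = [sum(W[i][k] for i in range(n)) for k in range(rank)]
        H = [[H[k][j] * num[k][j] / (w_sums[k] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * ((V / WH) H^T) / (row sums of H)
        R = matmul(W, H)
        Q = [[V[i][j] / (R[i][j] + eps) for j in range(m)] for i in range(n)]
        num = matmul(Q, transpose(H))
        h_sums = [sum(H[k][j] for j in range(m)) for k in range(rank)]
        W = [[W[i][k] * num[i][k] / (h_sums[k] + eps) for k in range(rank)]
             for i in range(n)]
        history.append(kl_divergence(V, matmul(W, H)))
    return W, H, history


# Toy non-negative "spectrogram" with 6 frequency bins and 8 frames.
data_rng = random.Random(1)
V = [[data_rng.random() + 0.05 for _ in range(8)] for _ in range(6)]
W, H, history = kl_nmf(V, rank=2)
print(f"KL divergence after first / last iteration: {history[0]:.4f} / {history[-1]:.4f}")
```

Because every update multiplies by a non-negative ratio, `W` and `H` stay non-negative without any explicit projection step, which is the main practical appeal of this update form.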

preprint, 2020, arXiv

Analysis of Trending Topics and Text-based Channels of Information Delivery in Cybersecurity

Computer users generally face difficulties in making correct security decisions. While fewer and fewer people are trying or willing to take formal security training, online sources including news, security blogs, and websites are continuously making security knowledge more accessible. Analysis of cybersecurity texts can provide insights into the trending topics and identify current security issues as well as how cyber attacks evolve over time. These in turn can support researchers and practitioners in predicting and preparing for these attacks. Comparing different sources may facilitate the learning process for normal users by persisting the security knowledge gained from different cybersecurity contexts. Prior studies neither systematically analysed the wide range of digital sources nor provided any standardisation in analysing the trending topics from recent security texts. Although LDA has been widely adopted in topic generation, its generated topics cannot cover the cybersecurity concepts completely and overlap considerably. To address this issue, we propose a semi-automated classification method to generate comprehensive security categories instead of LDA-generated topics. We further compare the identified 16 security categories across different sources based on their popularity and impact. We have revealed several surprising findings. (1) The impact reflected in cybersecurity texts strongly correlates with the monetary loss caused by cybercrimes. (2) For most categories, security blogs have the largest popularity and the largest absolute/relative impact over time. (3) Websites deliver security information with little regard for timeliness: one third of the articles do not specify the date, and the rest have a time lag in posting emerging security issues.

preprint2020arXiv

Catering to Your Concerns: Automatic Generation of Personalised Security-Centric Descriptions for Android Apps

Android users are increasingly concerned with the privacy of their data and security of their devices. To improve the security awareness of users, recent automatic techniques produce security-centric descriptions by performing program analysis. However, the generated text does not always address users' concerns as they are generally too technical to be understood by ordinary users. Moreover, different users have varied linguistic preferences, which do not match the text. Motivated by this challenge, we develop an innovative scheme to help users avoid malware and privacy-breaching apps by generating security descriptions that explain the privacy and security related aspects of an Android app in clear and understandable terms. We implement a prototype system, PERSCRIPTION, to generate personalised security-centric descriptions that automatically learn users' security concerns and linguistic preferences to produce user-oriented descriptions. We evaluate our scheme through experiments and user studies. The results clearly demonstrate the improvement on readability and users' security awareness of PERSCRIPTION's descriptions compared to existing description generators.

preprint2020arXiv

Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples

This paper demonstrates that Non-Maximum Suppression (NMS), which is commonly used in Object Detection (OD) tasks to filter redundant detection results, is no longer secure. Considering that NMS has been an integral part of OD systems, thwarting the functionality of NMS can result in unexpected or even lethal consequences for such systems. In this paper, an adversarial example attack which triggers malfunctioning of NMS in end-to-end OD models is proposed. The attack, namely \texttt{Daedalus}, compresses the dimensions of detection boxes to evade NMS. As a result, the final detection output contains extremely dense false positives. This can be fatal for many OD applications such as autonomous vehicles and surveillance systems. The attack can be generalised to different end-to-end OD models, such that the attack cripples various OD applications. Furthermore, a way to craft robust adversarial examples is developed by using an ensemble of popular detection models as the substitutes. Considering the pervasive nature of model reusing in real-world OD scenarios, Daedalus examples crafted based on an \textit{ensemble of substitutes} can launch attacks without knowing the parameters of the victim models. Experimental results demonstrate that the attack effectively stops NMS from filtering redundant bounding boxes. As the evaluation results suggest, Daedalus increases the false positive rate in detection results to $99.9\%$ and reduces the mean average precision scores to $0$, while maintaining a low cost of distortion on the original inputs. It is also demonstrated that the attack can be practically launched against real-world OD systems via printed posters.
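
To see why compressing detection boxes can defeat NMS, it helps to look at a plain greedy NMS routine. The sketch below is a minimal, generic version with an assumed IoU threshold of 0.5 and hypothetical toy boxes; it is not the adversarial optimisation used by Daedalus, only an illustration of the filtering step that the attack targets.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep a box unless it overlaps a higher-scoring kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def shrink(box, factor):
    """Scale a box around its centre (illustrative stand-in for the attack's effect)."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    w, h = (box[2] - box[0]) * factor, (box[3] - box[1]) * factor
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Three heavily overlapping detections of the same object (toy values).
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (11, 9, 51, 49)]
scores = [0.9, 0.8, 0.7]

kept_normal = nms(boxes, scores)
kept_shrunk = nms([shrink(b, 0.1) for b in boxes], scores)
```

With the original boxes only the top-scoring detection survives, while the shrunken boxes barely overlap, so all three pass the IoU test and remain in the output as redundant detections.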

preprint2020arXiv

Defending against Adversarial Attack towards Deep Neural Networks via Collaborative Multi-task Training

Deep neural networks (DNNs) are known to be vulnerable to adversarial examples which contain human-imperceptible perturbations. A series of defending methods, either proactive defence or reactive defence, have been proposed in the recent years. However, most of the methods can only handle specific attacks. For example, proactive defending methods are invalid against grey-box or white-box attacks, while reactive defending methods are challenged by low-distortion adversarial examples or transferring adversarial examples. This becomes a critical problem since a defender usually does not have the type of the attack as a priori knowledge. Moreover, existing two-pronged defences (e.g., MagNet), which take advantages of both proactive and reactive methods, have been reported as broken under transferring attacks. To address this problem, this paper proposed a novel defensive framework based on collaborative multi-task training, aiming at providing defence for different types of attacks. The proposed defence first encodes training labels into label pairs and counters black-box attacks leveraging adversarial training supervised by the encoded label pairs. The defence further constructs a detector to identify and reject high-confidence adversarial examples that bypass the black-box defence. In addition, the proposed collaborative architecture can prevent adversaries from finding valid adversarial examples when the defence strategy is exposed. In the experiments, we evaluated our defence against four state-of-the-art attacks on $MNIST$ and $CIFAR10$ datasets. The results showed that our defending method achieved up to $96.3\%$ classification accuracy on black-box adversarial examples, and detected up to $98.7\%$ of the high confidence adversarial examples. It only decreased the model accuracy on benign example classification by $2.1\%$ for the $CIFAR10$ dataset.

preprint2020arXiv

Hybrid Neural Tagging Model for Open Relation Extraction

Open relation extraction (ORE) remains a challenge to obtain a semantic representation by discovering arbitrary relation tuples from the unstructured text. Conventional methods heavily depend on feature engineering or syntactic parsing, so they are inefficient or error-cascading. Recently, leveraging supervised deep learning structures to address the ORE task has become an extraordinarily promising approach. However, there are two main challenges: (1) the lack of a sufficiently large labeled corpus to support supervised training; (2) the exploration of a specific neural architecture that adapts to the characteristics of open relation extraction. In this paper, to overcome these difficulties, we build a large-scale, high-quality training corpus in a fully automated way, and design a tagging scheme to assist in transforming the ORE task into a sequence tagging process. Furthermore, we propose a hybrid neural network model (HNN4ORT) for open relation tagging. The model employs the Ordered Neurons LSTM to encode potential syntactic information for capturing the associations among the arguments and relations. It also introduces a novel Dual Aware Mechanism, including Local-aware Attention and Global-aware Convolution. The two kinds of awareness complement each other so that the model can take the sentence-level semantics as a global perspective, and at the same time implement salient local features to achieve sparse annotation. Experimental results on various testing sets show that our model can achieve state-of-the-art performance compared to the conventional methods or other neural models.

preprint2020arXiv

Incorporating Uncertain Segmentation Information into Chinese NER for Social Media Text

Chinese word segmentation is necessary to provide word-level information for Chinese named entity recognition (NER) systems. However, segmentation error propagation is a challenge for Chinese NER while processing colloquial data like social media text. In this paper, we propose a model (UIcwsNN) that specializes in identifying entities from Chinese social media text, especially by leveraging ambiguous information of word segmentation. Such uncertain information contains all the potential segmentation states of a sentence, which provides a channel for the model to infer deep word-level characteristics. We propose a trilogy (i.e., candidate position embedding -> position selective attention -> adaptive word convolution) to encode uncertain word segmentation information and acquire appropriate word-level representation. Experimental results on the social media corpus show that our model effectively alleviates the cascading of segmentation errors, and achieves a significant performance improvement of more than 2% over previous state-of-the-art methods.

preprint2020arXiv

Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction

Deep learning (DL) based predictive models from electronic health records (EHR) deliver impressive performance in many clinical tasks. Large training cohorts, however, are often required to achieve high accuracy, hindering the adoption of DL-based models in scenarios with limited training data size. Recently, bidirectional encoder representations from transformers (BERT) and related models have achieved tremendous successes in the natural language processing domain. The pre-training of BERT on a very large training corpus generates contextualized embeddings that can boost the performance of models trained on smaller datasets. We propose Med-BERT, which adapts the BERT framework for pre-training contextualized embedding models on structured diagnosis data from an EHR dataset of 28,490,650 patients. Fine-tuning experiments are conducted on two disease-prediction tasks: (1) prediction of heart failure in patients with diabetes and (2) prediction of pancreatic cancer from two clinical databases. Med-BERT substantially improves prediction accuracy, boosting the area under the receiver operating characteristic curve (AUC) by 2.02-7.12%. In particular, pre-trained Med-BERT substantially improves the performance of tasks with very small fine-tuning training sets (300-500 samples), boosting the AUC by more than 20%, equivalent to the AUC of a training set 10 times larger. We believe that Med-BERT will benefit disease-prediction studies with small local training datasets, reduce data collection expenses, and accelerate the pace of artificial intelligence aided healthcare.

preprint2020arXiv

Security Analysis on Tangle-based Blockchain through Simulation

The Tangle-based structure has become one of the most promising solutions when designing DAG-based blockchain systems. The approach improves scalability by directly confirming multiple transactions in parallel instead of single blocks in a linear sequence. However, the performance gain may bring potential security risks. In this paper, we construct three types of attacks with comprehensive evaluations, namely parasite attack (PS), double spending attack (DS), and hybrid attack (HB). To achieve that, we deconstruct the Tangle-based projects (e.g. IOTA) and abstract the main components to rebuild a simple but flexible network for the simulation. Then, we informally define the three smallest actions to build up the attack strategies layer by layer. Based on that, we provide analyses to evaluate different types of attacks. To the best of our knowledge, this is the first study to provide a comprehensive security analysis of Tangle-based blockchains.

preprint2020arXiv

SIAT: A Systematic Inter-Component Communication Analysis Technology for Detecting Threats on Android

In this paper, we present the design and implementation of a Systematic Inter-Component Communication Analysis Technology (SIAT) consisting of two key modules: \emph{Monitor} and \emph{Analyzer}. As an extension to the Android operating system at framework layer, the \emph{Monitor} makes the first attempt to revise the taint tag approach named TaintDroid both at method-level and file-level, to migrate it to the app-pair ICC paths identification through systemwide tracing and analysis of taint in intent both at the data flow and control flow. By taking over the taint logs offered by the \emph{Monitor}, the \emph{Analyzer} can build the accurate and integrated ICC models adopted to identify the specific threat models with the detection algorithms and predefined rules. Meanwhile, we employ the models' deflation technology to improve the efficiency of the \emph{Analyzer}. We implement the SIAT with Android Open Source Project and evaluate its performance through extensive experiments on well-known datasets and real-world apps. The experimental results show that, compared to state-of-the-art approaches, the SIAT can achieve about 25\%$\sim$200\% accuracy improvements with 1.0 precision and 0.98 recall at the cost of negligible runtime overhead. Moreover, the SIAT can identify two undisclosed cases of bypassing that prior technologies cannot detect and quite a few malicious ICC threats in real-world apps with lots of downloads on the Google Play market.

preprint2020arXiv

Stochastic Peierls-Nabarro Model for Dislocations in High Entropy Alloys

High entropy alloys (HEAs) are single phase crystals that consist of random solid solutions of multiple elements in approximately equal proportions. This class of novel materials has exhibited superb mechanical properties, such as high strength combined with other desired features. The strength of crystalline materials is associated with the motion of dislocations. In this paper, we derive a stochastic continuum model based on the Peierls-Nabarro framework for inter-layer dislocations in a bilayer HEA from an atomistic model that incorporates the atomic level randomness. We use asymptotic analysis and limit theorem in the convergence from the atomistic model to the continuum model. The total energy in the continuum model consists of a stochastic elastic energy in the two layers, and a stochastic misfit energy that accounts for the inter-layer nonlinear interaction. The obtained continuum model can be considered as a stochastic generalization of the classical, deterministic Peierls-Nabarro model for the dislocation core and related properties. This derivation also validates the stochastic model adopted by Zhang et al. (Acta Mater. 166, 424-434, 2019).

preprint2016arXiv

A View of Fog Computing from Networking Perspective

With smart devices, particularly smartphones, becoming our everyday companions, the ubiquitous mobile Internet and computing applications pervade people's daily lives. With the surging demand for high-quality mobile services anywhere, how to address the ubiquitous user demand and accommodate the explosive growth of mobile traffic is the key issue of the next generation mobile networks. Fog computing is a promising solution towards this goal. Fog computing extends cloud computing by providing virtualized resources and engaged location-based services to the edge of the mobile networks so as to better serve mobile traffic. Therefore, Fog computing is a lubricant of the combination of cloud computing and mobile applications. In this article, we outline the main features of Fog computing and describe its concept, architecture and design goals. Lastly, we discuss some of the future research issues from the networking perspective.

preprint2016arXiv

Continuum dynamics of the formation, migration and dissociation of self-locked dislocation structures on parallel slip planes

In continuum models of dislocations, proper formulations of short-range elastic interactions of dislocations are crucial for capturing various types of dislocation patterns formed in crystalline materials. In this article, the continuum dynamics of straight dislocations distributed on two parallel slip planes is modelled through upscaling the underlying discrete dislocation dynamics. Two continuum velocity field quantities are introduced to facilitate the discrete-to-continuum transition. The first one is the local migration velocity of dislocation ensembles which is found fully independent of the short-range dislocation correlations. The second one is the decoupling velocity of dislocation pairs controlled by a threshold stress value, which is proposed to be the effective flow stress for single slip systems. Compared to the almost ubiquitously adopted Taylor relationship, the derived flow stress formula exhibits two features that are more consistent with the underlying discrete dislocation dynamics: i) the flow stress increases with the in-plane component of the dislocation density only up to a certain value, hence the derived formula admits a minimum inter-dislocation distance within slip planes; ii) the flow stress smoothly transits to zero when all dislocations become geometrically necessary dislocations. A regime under which inhomogeneities in dislocation density grow is identified, and is further validated through comparison with discrete dislocation dynamical simulation results. Based on the findings in this article and in our previous works, a general strategy for incorporating short-range dislocation correlations in continuum models of dislocations is proposed.

preprint2016arXiv

Dislocation climb models from atomistic scheme to dislocation dynamics

We develop a mesoscopic dislocation dynamics model for vacancy-assisted dislocation climb by upscalings from a stochastic model on the atomistic scale. Our models incorporate microscopic mechanisms of (i) bulk diffusion of vacancies, (ii) vacancy exchange dynamics between bulk and dislocation core, (iii) vacancy pipe diffusion along the dislocation core, and (iv) vacancy attachment-detachment kinetics at jogs leading to the motion of jogs. Our mesoscopic model consists of the vacancy bulk diffusion equation and a dislocation climb velocity formula. The effects of pipe diffusion and the jog structure on dislocations are incorporated by a Robin boundary condition near the dislocations for the bulk diffusion equation and a new contribution in the dislocation climb velocity due to vacancy pipe diffusion driven by the stress variation along the dislocation. Our climb formulation is able to quantitatively describe the translation of prismatic loops at low temperatures when the bulk diffusion is negligible. Using this new formulation, we derive analytical formulas for the climb velocity of a straight edge dislocation and a prismatic circular loop. Our dislocation climb formulation can be implemented in dislocation dynamics simulations to incorporate all the above four microscopic mechanisms of dislocation climb.

preprint2016arXiv

Energy of low angle grain boundaries based on continuum dislocation structure

In this paper, we present a continuum model to compute the energy of low angle grain boundaries for any given degrees of freedom (arbitrary rotation axis, rotation angle and boundary plane orientation) based on a continuum dislocation structure. In our continuum model, we minimize the grain boundary energy associated with the dislocation structure subject to the constraint of Frank's formula for dislocations with all possible Burgers vectors. This constrained minimization problem is solved by the penalty method, by which it is turned into an unconstrained minimization problem. The grain boundary dislocation structure is approximated by a network of straight dislocations that predicts the energy and dislocation densities of the grain boundaries. The grain boundary energy based on the calculated dislocation structure is able to incorporate its anisotropic nature. We use our continuum model to systematically study the energy of $<111>$ low angle grain boundaries in fcc Al with any boundary plane orientation and all six possible Burgers vectors. Comparisons with results of the atomistic simulations show that our continuum model is able to give excellent predictions of the energy and dislocation densities of low angle grain boundaries. We also study the energy of low angle grain boundaries in fcc Al with varying rotation axis while the remaining degrees of freedom are fixed. With minor modifications, our model can also apply to dislocation structures and energy of heterogeneous interfaces.

preprint2016arXiv

Fog Computing: Focusing on Mobile Users at the Edge

With smart devices, particularly smartphones, becoming our everyday companions, the ubiquitous mobile Internet and computing applications pervade people's daily lives. With the surging demand for high-quality mobile services anywhere, how to address the ubiquitous user demand and accommodate the explosive growth of mobile traffic is the key issue of the next generation mobile networks. Fog computing is a promising solution towards this goal. Fog computing extends cloud computing by providing virtualized resources and engaged location-based services to the edge of the mobile networks so as to better serve mobile traffic. Therefore, Fog computing is a lubricant of the combination of cloud computing and mobile applications. In this article, we outline the main features of Fog computing and describe its concept, architecture and design goals. Lastly, we discuss some of the future research issues from the networking perspective.

preprint2016arXiv

Homogenisation of a Row of Dislocation Dipoles from Discrete Dislocation Dynamics

Conventional discrete-to-continuum approaches have seen their limitation in describing the collective behaviour of the multi-polar configurations of dislocations, which are widely observed in crystalline materials. The reason is that dislocation dipoles, which play an important role in determining the mechanical properties of crystals, often get smeared out when traditional homogenisation methods are applied. To address such difficulties, the collective behaviour of a row of dislocation dipoles is studied by using matched asymptotic techniques. The discrete-to-continuum transition is facilitated by introducing two field variables respectively describing the dislocation pair density potential and the dislocation pair width. It is found that the dislocation pair width evolves much faster than the pair density. Such hierarchy in evolution time scales enables us to describe the dislocation dynamics at the coarse-grained level by an evolution equation for the slowly varying variable (the pair density) coupled with an equilibrium equation for the fast varying variable (the pair width). The time-scale separation method adopted here paves a way for properly incorporating dipole-like (zero net Burgers vector but non-vanishing) dislocation structures, known as the statistically stored dislocations (SSDs) into macroscopic models of crystal plasticity in three dimensions. Moreover, the natural transition between different equilibrium patterns found here may also shed light on understanding the emergence of the persistent slip bands (PSBs) in fatigue metals induced by cyclic loads.

preprint2016arXiv

Smoothed Hierarchical Dirichlet Process: A Non-Parametric Approach to Constraint Measures

Time-varying mixture densities occur in many scenarios, for example, the distributions of keywords that appear in publications may evolve from year to year, video frame features associated with multiple targets may evolve in a sequence. Any models that realistically cater to this phenomenon must exhibit two important properties: the underlying mixture densities must have an unknown number of mixtures, and there must be some "smoothness" constraints in place for the adjacent mixture densities. The traditional Hierarchical Dirichlet Process (HDP) may be suited to the first property, but certainly not the second. This is due to how each random measure in the lower hierarchies is sampled independent of each other and hence does not facilitate any temporal correlations. To overcome such shortcomings, we proposed a new Smoothed Hierarchical Dirichlet Process (sHDP). The key novelty of this model is that we place a temporal constraint amongst the nearby discrete measures $\{G_j\}$ in the form of symmetric Kullback-Leibler (KL) Divergence with a fixed bound $B$. Although the constraint we place only involves a single scalar value, it nonetheless allows for flexibility in the corresponding successive measures. Remarkably, it also led us to infer the model within the stick-breaking process where the traditional Beta distribution used in stick-breaking is now replaced by a new constraint calculated from $B$. We present the inference algorithm and elaborate on its solutions. Our experiment using NIPS keywords has shown the desirable effect of the model.
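
The temporal constraint described above can be made concrete with a small check on two adjacent measures. The sketch below assumes finite discrete distributions on a shared support with strictly positive weights, and it assumes the convention that the symmetric KL divergence is the sum of the two directed divergences (some texts use half of that sum). The names `symmetric_kl` and `within_bound` and the bound value are illustrative, not taken from the paper.

```python
import math


def kl(p, q):
    """Directed KL divergence KL(p || q) for discrete distributions.

    Assumes q has strictly positive weight wherever p does.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetric KL divergence, taken here as KL(p||q) + KL(q||p)."""
    return kl(p, q) + kl(q, p)


def within_bound(p, q, bound):
    """True if two adjacent measures satisfy the smoothness constraint."""
    return symmetric_kl(p, q) <= bound


# Toy mixture weights for three time steps (hypothetical values).
g1 = [0.5, 0.3, 0.2]
g2 = [0.4, 0.4, 0.2]    # close to g1
g3 = [0.05, 0.05, 0.9]  # far from g1
B = 0.1

smooth_pair = within_bound(g1, g2, B)
abrupt_pair = within_bound(g1, g3, B)
```

Here `g1` and `g2` differ only slightly and satisfy the bound, whereas the jump from `g1` to `g3` violates it, which is the kind of abrupt change a smoothness constraint of this form rules out.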

preprint2016arXiv

The Dependent Random Measures with Independent Increments in Mixture Models

When observations are organized into groups where commonalities exist amongst them, the dependent random measures can be an ideal choice for modeling. One of the propositions of the dependent random measures is that the atoms of the posterior distribution are shared amongst groups, and hence groups can borrow information from each other. When normalized dependent random measures prior with independent increments are applied, we can derive the appropriate exchangeable probability partition function (EPPF), and subsequently also deduce its inference algorithm given any mixture model likelihood. We provide all the necessary derivations and solutions for this framework. For demonstration, we used a mixture of Gaussians likelihood in combination with a dependent structure constructed by linear combinations of CRMs. Our experiments show superior performance when using this framework, where the inferred values including the mixing weights and the number of clusters both respond appropriately to the number of completely random measures used.

preprint2015arXiv

A continuum model for dislocation dynamics in three dimensions using the dislocation density potential functions and its application in understanding the micro-pillar size effect

In this paper, we present a dislocation-density-based three-dimensional continuum model, where the dislocation substructures are represented by pairs of dislocation density potential functions (DDPFs), denoted by $ϕ$ and $ψ$. The slip plane distribution is characterized by the contour surfaces of $ψ$, while the distribution of dislocation curves on each slip plane is identified by the contour curves of $ϕ$ which represents the plastic slip on the slip plane. By using DDPFs, we can explicitly write down an evolution equation system, which is shown consistent with the underlying discrete dislocation dynamics. The system includes i) A constitutive stress rule, which describes how the total stress field is determined in the presence of dislocation networks and applied loads; ii) A plastic flow rule, which describes how dislocation ensembles evolve. The proposed continuum model is validated through comparisons with discrete dislocation dynamics simulation results and experimental data. As an application of the proposed model, the "smaller-being-stronger" size effect observed in single-crystal micro-pillars is studied. A scaling law for the pillar flow stress $σ_{\text{flow}}$ against its (non-dimensionalized) size $D$ is derived to be $σ_{\text{flow}}\sim\log(D)/D$.

preprint2015arXiv

Synergy, suppression and immorality: forward differences of the entropy function

Conditional mutual information is important in the selection and interpretation of graphical models. Its empirical version is well known to be a generalised likelihood ratio test and may be represented as a difference in entropy. We consider the forward difference expansion of the entropy function defined on all subsets of the variables under study. The elements of this expansion are invariant to permutation of their suffices and relate higher order mutual informations to lower order ones. The third order difference is expressible as an apparently asymmetric difference between a marginal and a conditional mutual information. Its role in the decomposition for explained information provides a technical definition for synergy between three random variables. Positive values occur when two variables provide alternative explanations for a third; negative values, termed synergies, occur when the sum of explained information is greater than the sum of its parts. Synergies tend to be infrequent; they connect the seemingly unrelated concepts of suppressor variables in regression, on the one hand, and unshielded colliders in Bayes networks (immoralities), on the other. We give novel characterizations of these phenomena that generalise to categorical variables and to higher dimensions. We propose an algorithm for systematically computing low order differences from a given graph. Examples from small scale real-life studies indicate the potential of these techniques for empirical statistical analysis.
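
The third order difference described above can be computed directly from empirical entropies. The sketch below uses the form stated in the abstract, a marginal mutual information minus a conditional one, with the sign chosen so that negative values correspond to synergy; the exact sign convention of the paper's forward difference is an assumption here, and the function names and toy samples are illustrative.

```python
import math
from collections import Counter


def entropy(samples, idx):
    """Empirical Shannon entropy (in bits) of the variables at positions idx."""
    counts = Counter(tuple(s[i] for i in idx) for s in samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def mutual_info(samples, a, b):
    """I(A; B) = H(A) + H(B) - H(A, B)."""
    return entropy(samples, a) + entropy(samples, b) - entropy(samples, a + b)


def cond_mutual_info(samples, a, b, c):
    """I(A; B | C) = H(A, C) + H(B, C) - H(A, B, C) - H(C)."""
    return (entropy(samples, a + c) + entropy(samples, b + c)
            - entropy(samples, a + b + c) - entropy(samples, c))


def third_order_difference(samples):
    """I(X; Z) - I(X; Z | Y) for samples of the form (x, y, z)."""
    x, y, z = (0,), (1,), (2,)
    return mutual_info(samples, x, z) - cond_mutual_info(samples, x, z, y)


# Synergy: z = x XOR y, so neither x nor y alone says anything about z.
xor_samples = [(x, y, x ^ y) for x in (0, 1) for y in (0, 1)]

# Redundancy: x and y are copies of z, i.e. alternative explanations.
copy_samples = [(0, 0, 0), (1, 1, 1)]

synergy_value = third_order_difference(xor_samples)
redundancy_value = third_order_difference(copy_samples)
```

Under this convention the XOR samples give a negative value (the pair explains more than the two variables do separately), and the copy samples give a positive value (either variable already explains everything).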

preprint2014arXiv

Simple linear algorithms for mining graph cores

Batagelj and Zaversnik proposed a linear algorithm for the well-known $k$-core decomposition problem. However, when $k$-cores are desired for a given $k$, we find that a simple linear algorithm requiring no sorting works for mining $k$-cores. In addition, this algorithm can be extended to mine $(k_1, k_2,\ldots, k_p)$-cores from $p$-partite graphs in linear time, and this mining approach can be efficiently implemented in a distributed computing environment with a lower message complexity bound in comparison with the best known method of distributed $k$-core decomposition.
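
When only the $k$-core for one fixed $k$ is needed, a standard peeling procedure suffices: repeatedly delete vertices whose remaining degree is below $k$. The sketch below is a generic queue-based version that needs no sorting and touches each edge a constant number of times. It is offered as an illustration of the idea and is not claimed to be the exact algorithm of the paper; the adjacency format (a dict mapping each vertex to a set of neighbours in an undirected simple graph) is an assumption.

```python
from collections import deque


def k_core(adj, k):
    """Return the vertex set of the k-core of an undirected graph.

    adj maps each vertex to the set of its neighbours.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    # Seed the queue with every vertex that already fails the degree test.
    queue = deque(v for v, d in degree.items() if d < k)
    removed = set(queue)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u in removed:
                continue
            degree[u] -= 1
            # A neighbour that drops below k is peeled as well.
            if degree[u] < k:
                removed.add(u)
                queue.append(u)
    return set(adj) - removed


# Toy graph: a 4-clique {a, b, c, d} with a tail d - e - f.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}

core3 = k_core(graph, 3)
core4 = k_core(graph, 4)
```

For this graph the 3-core is the clique, since the tail is peeled away first, and the 4-core is empty because no vertex keeps four neighbours once the tail is gone.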

preprint2013arXiv

A Method for Implementing a Probabilistic Model as a Relational Database

This paper discusses a method for implementing a probabilistic inference system based on an extended relational data model. This model provides a unified approach for a variety of applications such as dynamic programming, solving sparse linear equations, and constraint propagation. In this framework, the probability model is represented as a generalized relational database. Subsequent probabilistic requests can be processed as standard relational queries. Conventional database management systems can be easily adopted for implementing such an approximate reasoning system.
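
The idea of answering probabilistic requests with relational queries can be sketched with an ordinary SQL engine. In the example below, a product of probability tables is expressed as a join and marginalisation as `GROUP BY` with `SUM`. This uses plain SQLite through Python's standard `sqlite3` module and is only an analogy under stated assumptions; it does not reproduce the extended relational data model of the paper, and the table names, column names, and probabilities are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE prior_a (a INTEGER, p REAL);            -- P(A)
    CREATE TABLE cond_b  (a INTEGER, b INTEGER, p REAL); -- P(B | A)
    """
)
conn.executemany("INSERT INTO prior_a VALUES (?, ?)", [(0, 0.6), (1, 0.4)])
conn.executemany(
    "INSERT INTO cond_b VALUES (?, ?, ?)",
    [(0, 0, 0.9), (0, 1, 0.1), (1, 0, 0.2), (1, 1, 0.8)],
)

# P(B) = sum over a of P(A = a) * P(B | A = a):
# the join forms the product, GROUP BY / SUM marginalises A out.
query = """
    SELECT cond_b.b, SUM(prior_a.p * cond_b.p)
    FROM prior_a JOIN cond_b USING (a)
    GROUP BY cond_b.b
    ORDER BY cond_b.b
"""
marginal_b = dict(conn.execute(query).fetchall())
conn.close()
```

The resulting `marginal_b` maps each value of `B` to its marginal probability, and the same join-then-aggregate pattern extends to longer chains of tables.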

preprint2013arXiv

Can Uncertainty Management be Realized in a Finite Totally Ordered Probability Algebra?

In this paper, the feasibility of using finite totally ordered probability models under Aleliunas's Theory of Probabilistic Logic [Aleliunas, 1988] is investigated. The general form of the probability algebra of these models is derived and the number of possible algebras with given size is deduced. Based on this analysis, we discuss problems of denominator-indifference and ambiguity-generation that arise in reasoning by cases and abductive reasoning. An example is given that illustrates how these problems arise. The investigation shows that a finite probability model may be of very limited usage.

preprint2013arXiv

Critical Remarks on Single Link Search in Learning Belief Networks

In learning belief networks, the single link lookahead search is widely adopted to reduce the search space. We show that there exists a class of probabilistic domain models which displays a special pattern of dependency. We analyze the behavior of several learning algorithms using different scoring metrics such as the entropy, conditional independence, minimal description length and Bayesian metrics. We demonstrate that single link lookahead search procedures (employed in these algorithms) cannot learn these models correctly. Thus, when the underlying domain model actually belongs to this class, the use of a single link search procedure will result in learning of an incorrect model. This may lead to inference errors when the model is used. Our analysis suggests that if the prior knowledge about a domain does not rule out the possible existence of these models, a multi-link lookahead search or other heuristics should be used for the learning process.

preprint2013arXiv

Exploring Localization in Bayesian Networks for Large Expert Systems

Current Bayesian net representations do not consider structure in the domain and include all variables in a homogeneous network. At any time, a human reasoner in a large domain may direct his attention to only one of a number of natural subdomains, i.e., there is 'localization' of queries and evidence. In such a case, propagating evidence through a homogeneous network is inefficient since the entire network has to be updated each time. This paper presents multiply sectioned Bayesian networks that enable a (localization preserving) representation of natural subdomains by separate Bayesian subnets. The subnets are transformed into a set of permanent junction trees such that evidential reasoning takes place at only one of them at a time. Probabilities obtained are identical to those that would be obtained from the homogeneous network. We discuss attention shift to a different junction tree and propagation of previously acquired evidence. Although the overall system can be large, computational requirements are governed by the size of only one junction tree.

preprint2013arXiv

Exploring Parallelism in Learning Belief Networks

It has been shown that a class of probabilistic domain models cannot be learned correctly by several existing algorithms which employ a single-link lookahead search. When a multi-link lookahead search is used, the computational complexity of the learning algorithm increases. We study how to use parallelism to tackle the increased complexity in learning such models and to speed up learning in large domains. An algorithm is proposed to decompose the learning task for parallel processing. A further task decomposition is used to balance load among processors and to increase the speed-up and efficiency. For learning from very large datasets, we present a regrouping of the available processors such that slow data access through files can be replaced by fast memory access. Our implementation on a parallel computer demonstrates the effectiveness of the algorithm.

preprint2013arXiv

Learning Belief Networks in Domains with Recursively Embedded Pseudo Independent Submodels

A pseudo independent (PI) model is a probabilistic domain model (PDM) where proper subsets of a set of collectively dependent variables display marginal independence. PI models cannot be learned correctly by many algorithms that rely on a single link search. Earlier work on learning PI models has suggested a straightforward multi-link search algorithm. However, when a domain contains recursively embedded PI submodels, it may escape the detection of such an algorithm. In this paper, we propose an improved algorithm that ensures the learning of all embedded PI submodels whose sizes are upper bounded by a predetermined parameter. We show that this improved learning capability only increases the complexity slightly beyond that of the previous algorithm. The performance of the new algorithm is demonstrated through experiment.
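The definition of a PI model above can be illustrated with a small numerical check. The sketch below is not taken from the paper: it assumes a three-variable parity (XOR) distribution as a hypothetical example and uses mutual information as a stand-in dependence score. Every pair of variables is marginally independent, yet the three variables are collectively dependent, so a search that scores one link at a time sees no reason to add any link.

```python
from itertools import product
from math import log2

# Hypothetical PI model: x1, x2 are independent fair coins and x3 = x1 XOR x2.
joint = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}

def marginal(dist, idx):
    """Marginalize a joint distribution onto the variable positions in idx."""
    out = {}
    for state, p in dist.items():
        key = tuple(state[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out

def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def mutual_information(dist, i, j):
    """I(Xi; Xj) in bits; zero means the pair is marginally independent."""
    pij = marginal(dist, (i, j))
    pi = marginal(dist, (i,))
    pj = marginal(dist, (j,))
    return sum(p * log2(p / (pi[(a,)] * pj[(b,)]))
               for (a, b), p in pij.items() if p > 0)

def total_correlation(dist):
    """Sum of single-variable entropies minus the joint entropy, in bits."""
    n = len(next(iter(dist)))
    return sum(entropy(marginal(dist, (i,))) for i in range(n)) - entropy(dist)

# Score of every candidate single link, and the collective dependence.
pairwise = [mutual_information(joint, i, j) for i, j in ((0, 1), (0, 2), (1, 2))]
collective = total_correlation(joint)
print(pairwise, collective)
```

All three pairwise scores vanish while the collective dependence is one full bit, which is the pattern that a multi-link search is meant to detect.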

preprint2013arXiv

Optimization of Inter-Subnet Belief Updating in Multiply Sectioned Bayesian Networks

Recent developments show that Multiply Sectioned Bayesian Networks (MSBNs) can be used for diagnosis of natural systems as well as for model-based diagnosis of artificial systems. They can be applied to single-agent oriented reasoning systems as well as multi-agent distributed probabilistic reasoning systems. Belief propagation between a pair of subnets plays a central role in maintenance of global consistency in an MSBN. This paper studies the operation UpdateBelief, presented originally with MSBNs, for inter-subnet propagation. We analyze how the operation achieves its intended functionality, which provides hints as to how its efficiency can be improved. We then define two new versions of UpdateBelief that reduce the computation time for inter-subnet propagation. One of them is optimal in the sense that the minimum amount of computation for coordinating multi-linkage belief propagation is required. The optimization problem is solved through the solution of a graph-theoretic problem: the minimum weight open tour in a tree.

preprint2012arXiv

A Robust Quantum Random Access Memory

A "bucket brigade" architecture for a quantum random access memory of $N=2^n$ memory cells needs $n(n+5)/2$ quantum manipulations on control circuit nodes per memory call. Here we propose a scheme in which only $n/2$ manipulations on average are required to accomplish a memory call. This scheme may significantly decrease the time spent on a memory call and the average overall error rate per memory call. A physical implementation scheme for storing an arbitrary state in a selected memory cell and then reading it out is discussed.
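To give a feel for the two operation counts quoted in this abstract, the short sketch below evaluates them for a few address widths. The function names are hypothetical, and the code only evaluates the two expressions stated above; it does not model either architecture.

```python
def bucket_brigade_ops(n: int) -> float:
    """Manipulations per memory call quoted for the bucket-brigade design."""
    return n * (n + 5) / 2

def proposed_avg_ops(n: int) -> float:
    """Average manipulations per memory call quoted for the proposed scheme."""
    return n / 2

# Compare the two counts for memories of N = 2**n cells.
for n in (4, 10, 20):
    cells = 2 ** n
    ratio = bucket_brigade_ops(n) / proposed_avg_ops(n)
    print(f"n={n:2d}  N={cells:8d}  bucket={bucket_brigade_ops(n):6.1f}  "
          f"proposed={proposed_avg_ops(n):5.1f}  ratio={ratio:.0f}")
```

The ratio of the two expressions simplifies to $n+5$, so the quoted saving grows linearly with the number of address bits.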

preprint2011arXiv

A Graphical Password Based System for Small Mobile Devices

Passwords provide a security mechanism for authentication and protect resources against unwanted access. A graphical password is one promising alternative to textual passwords. According to human psychology, humans are able to remember pictures easily. In this paper, we propose a new hybrid graphical password based system, which combines recognition and recall based techniques, offers many advantages over existing systems, and may be more convenient for the user. Our scheme is resistant to shoulder surfing and many other attacks on graphical passwords. The scheme is proposed for smart mobile devices (such as smartphones, e.g., iPod, iPhone, and PDAs), which are handier and more convenient to use than traditional desktop computer systems.

preprint2010arXiv

Anomalous photoelectron spectrum caused by finite interaction time in few-cycle xuv laser pulses

With the development of laser technology, pulse length enters the optical cycle regime and hence the interaction time between the laser pulse and atoms becomes prominent. We investigate this problem in this Letter through the photoelectron spectrum of the hydrogen atom in few-cycle xuv laser pulses. By solving the one-dimensional time-dependent Schrödinger equation, we find that, due to the insufficient interaction time, the electron cannot gain enough energy from the optical field when escaping the binding of the nucleus, and an abnormality then appears in the photoelectron spectrum: the peak of the photoelectron spectrum shows a red shift compared with the well-known Einstein photoelectric effect formula. The shift becomes larger as the pulse duration decreases.

preprint2010arXiv

Investigation of nonlocal information as condition for violations of Bell inequality and information causality

On the basis of local realism theory, nonlocal information is necessary for violation of Bell's inequality. From a theoretical point of view, nonlocal information is essentially the mutual information about the distant outcome and measurement setting. In this work we prove that if the measurement is free and unbiased, the mutual information about the distant outcome and about the distant setting are both necessary for the violation of Bell's inequality in the case of unbiased marginal probabilities. In the case of biased marginal probabilities, we point out that the mutual information about the distant outcome ceases to be necessary for violation of Bell's inequality, while the mutual information about distant measurement settings is still required. We also prove that the mutual information about distant measurement settings must be contained in the transmitted messages due to the freedom of measurement choices. Finally we point out that the mutual information about both the distant outcome and the measurement settings is necessary for a violation of information causality.

preprint2010arXiv

Joint measurements and Svetlichny's inequality

We prove that Svetlichny's inequality can be derived from the existence of joint measurements and the principle of no-signaling. Then we show that, on the basis of the quantum measurement assumption, a violation of Svetlichny's inequality whose magnitude exceeds the quantum bound would imply a breach of causality.

preprint2010arXiv

Maximal violation of Bell inequality for any given two-qubit pure state

In the case of bipartite two-qubit systems, we derive the analytical expression of the bound of the Bell operator for any given pure state. Our result not only manifests some properties of the Bell inequality, for example that it may be violated by any pure entangled state and is maximally violated only by a maximally entangled state, but also gives the explicit values of maximal violation for any pure state. Finally we point out that for two-qubit systems there is no mixed state which can produce maximal violation of the Bell inequality.
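The properties listed in this abstract can be checked numerically with the standard textbook result for the CHSH form of the Bell inequality. The sketch below assumes a pure state written in Schmidt form, $\cos\theta\,|00\rangle + \sin\theta\,|11\rangle$, for which the Horodecki criterion gives a maximal CHSH value of $2\sqrt{1+\sin^2 2\theta}$. The abstract does not state the paper's own expression, so this formula and the function name are assumptions used for illustration only.

```python
from math import sin, sqrt, pi

def max_chsh_pure(theta: float) -> float:
    """Maximal CHSH value for cos(theta)|00> + sin(theta)|11>.

    Uses the standard Horodecki-criterion result 2*sqrt(1 + sin(2*theta)**2);
    this is an assumed textbook formula, not the paper's stated expression.
    """
    return 2.0 * sqrt(1.0 + sin(2.0 * theta) ** 2)

# Product state, weakly entangled state, and maximally entangled state.
for theta in (0.0, pi / 16, pi / 8, pi / 4):
    print(f"theta={theta:.4f}  max CHSH={max_chsh_pure(theta):.6f}")
```

The value equals the local bound 2 only for a product state, exceeds 2 for every entangled pure state, and reaches the Tsirelson bound $2\sqrt{2}$ only at $\theta = \pi/4$, consistent with the properties described above.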

preprint2010arXiv

The relation between Hardy's non-locality and violation of Bell inequality

We give an analytic quantitative relation between Hardy's non-locality and the Bell operator. We find that Hardy's non-locality is a sufficient condition for violation of the Bell inequality, that the upper bound of Hardy's non-locality allowed by information causality corresponds exactly to the Tsirelson bound of the Bell inequality, and that the upper bound of Hardy's non-locality allowed by the principle of no-signaling corresponds exactly to the algebraic maximum of the Bell operator. Then we study Cabello's argument of Hardy's non-locality (a generalization of Hardy's argument) and find a similar relation between it and violation of the Bell inequality. Finally, we give a simple derivation of the bound of Hardy's non-locality under the constraint of information causality with the aid of the relation between Hardy's non-locality and the Bell operator derived above; this bound is the main result of a recent work of Ahanj et al. [Phys. Rev. A 81, 032103 (2010)].

preprint2008arXiv

Control the high-order harmonics cutoff through the combination of chirped laser and static electric field

The high harmonic generation from atoms in the combination of a chirped laser pulse and a static field is theoretically investigated. For the first time, we explore a further physical mechanism of the significant extension of the high harmonic generation cutoff based on the three-step model. It is shown that the cutoff is substantially extended due to the asymmetry of the combined field. If appropriate parameters are chosen, the cutoff of high harmonic generation can reach $I_p + 42U_p$. Furthermore, an ultrabroad supercontinuum spectrum can be generated. When the phases are properly compensated for, an isolated 9-attosecond pulse can be obtained.

preprint2005arXiv

The Localization of $s$-Wave and Quantum Effective Potential of a Quasi-Free Particle with Position-Dependent Mass

The properties of the s-wave for a quasi-free particle with position-dependent mass (PDM) are discussed in detail. Unlike the system with constant mass, in which the localization of the s-wave for the free quantum particle around the origin occurs only in two dimensions, the quasi-free particle with PDM can experience attractive forces in $D$ dimensions, except $D=1$, when its mass function satisfies some conditions. The effective mass of a particle varying with its position can induce an effective interaction which may be attractive in some cases. The analytical expressions of the eigenfunctions and the corresponding probability densities for the s-waves of the two- and three-dimensional systems with a special PDM are given, and the existence of localization around the origin for these systems is shown.

60 published item(s)

Diffeomorphic Cortical Alignment via Direct Warping of Streamline Endpoints

Exploring the Translation Mechanism of Large Language Models

StablePDENet: Enhancing Stability of Operator Learning for Solving Differential Equations

TeachPro: Multi-Label Qualitative Teaching Evaluation via Cross-View Graph Synergy and Semantic Anchored Evidence Encoding

Dual prototype attentive graph network for cross-market recommendation

Non-Euclidean interfaces decode the continuous landscape of graphene-induced surface reconstructions

RDSA: A Robust Deep Graph Clustering Framework via Dual Soft Assignment

A Bayesian Permutation training deep representation learning method for speech enhancement with variational autoencoder

A deep representation learning speech enhancement method using $β$-VAE

Approximation of Functionals by Neural Network without Curse of Dimensionality

Bunching instability and asymptotic properties in epitaxial growth with elasticity effects: continuum model

CATNet: Cross-event Attention-based Time-aware Network for Medical Event Prediction

Existence, uniqueness, and energy scaling of 2+1 dimensional continuum model for stepped epitaxial surfaces with elastic effects

Exploring Unfairness on Proof of Authority: Order Manipulation Attacks and Remedies

FAAG: Fast Adversarial Audio Generation through Interactive Attack Optimisation

Formal Security Analysis on dBFT Protocol of NEO

Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters

Stochastic Continuum Models for High-Entropy Alloys with Short-range Order

Video is All You Need: Attacking PPG-based Biometric Authentication

Weak solutions to an initial-boundary value problem for a continuum equation of motion of grain boundaries

Well-posedness of a modified degenerate Cahn-Hilliard model for surface diffusion

A New Formulation of Coupling and Sliding Motions of Grain Boundaries Based on Dislocation Structure

A Speech Enhancement Algorithm based on Non-negative Hidden Markov Model and Kullback-Leibler Divergence

Analysis of Trending Topics and Text-based Channels of Information Delivery in Cybersecurity

Catering to Your Concerns: Automatic Generation of Personalised Security-Centric Descriptions for Android Apps

Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples

Defending against Adversarial Attack towards Deep Neural Networks via Collaborative Multi-task Training

Hybrid Neural Tagging Model for Open Relation Extraction

Incorporating Uncertain Segmentation Information into Chinese NER for Social Media Text

Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction

Security Analysis on Tangle-based Blockchain through Simulation

SIAT: A Systematic Inter-Component Communication Analysis Technology for Detecting Threats on Android

Stochastic Peierls-Nabarro Model for Dislocations in High Entropy Alloys

A View of Fog Computing from Networking Perspective

Continuum dynamics of the formation, migration and dissociation of self-locked dislocation structures on parallel slip planes

Dislocation climb models from atomistic scheme to dislocation dynamics

Energy of low angle grain boundaries based on continuum dislocation structure

Fog Computing: Focusing on Mobile Users at the Edge

Homogenisation of a Row of Dislocation Dipoles from Discrete Dislocation Dynamics

Smoothed Hierarchical Dirichlet Process: A Non-Parametric Approach to Constraint Measures

The Dependent Random Measures with Independent Increments in Mixture Models

A continuum model for dislocation dynamics in three dimensions using the dislocation density potential functions and its application in understanding the micro-pillar size effect

Synergy, suppression and immorality: forward differences of the entropy function

Simple linear algorithms for mining graph cores

A Method for Implementing a Probabilistic Model as a Relational Database

Can Uncertainty Management be Realized in a Finite Totally Ordered Probability Algebra?

Critical Remarks on Single Link Search in Learning Belief Networks

Exploring Localization in Bayesian Networks for Large Expert Systems

Exploring Parallelism in Learning Belief Networks

Learning Belief Networks in Domains with Recursively Embedded Pseudo Independent Submodels

Optimization of Inter-Subnet Belief Updating in Multiply Sectioned Bayesian Networks

A Robust Quantum Random Access Memory

A Graphical Password Based System for Small Mobile Devices

Anomalous photoelectron spectrum caused by finite interaction time in few-cycle xuv laser pulses

Investigation of nonlocal information as condition for violations of Bell inequality and information causality

Joint measurements and Svetlichny's inequality

Maximal violation of Bell inequality for any given two-qubit pure state

The relation between Hardy's non-locality and violation of Bell inequality

Control the high-order harmonics cutoff through the combination of chirped laser and static electric field

The Localization of $s$-Wave and Quantum Effective Potential of a Quasi-Free Particle with Position-Dependent Mass