Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
35works
0followers
21topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

35 published item(s)

preprint2026arXiv

LCGNav: Local Candidate-Aware Geometric Enhancement for General Topological Planning in Vision-Language Navigation

Online topological planning has become an effective paradigm for Vision-Language Navigation in Continuous Environments (VLN-CE), but existing methods still suffer from two limitations: redundant local depth information and weakened focus on current frontier candidates as the topological graph grows. To address this, we propose LCGNav, a modular local geometric enhancement framework for topological VLN. LCGNav explicitly converts candidate depth views into 3D point clouds and applies physical truncation based on the agent's reachable range, enabling more compact local geometric modeling. It further introduces a dimension-preserving local fusion strategy with transient state degradation, so that geometric enhancement is applied only to the currently relevant ghost nodes without changing the original planner interface. Experiments on R2R-CE and RxR-CE show that LCGNav serves as an effective cross-architecture enhancement module, consistently improving multiple key metrics of representative online topological baselines with low additional training cost. When integrated with ETP-R1, LCGNav achieves the best performance among the compared online topological methods on the val-unseen splits of the R2R-CE and RxR-CE benchmarks. The code is available at https://github.com/shannanshouyin/LCGNav.

preprint2025arXiv

Flowing from Reasoning to Motion: Learning 3D Hand Trajectory Prediction from Egocentric Human Interaction Videos

Prior works on 3D hand trajectory prediction are constrained by datasets that decouple motion from semantic supervision and by models that weakly link reasoning and action. To address these, we first present the EgoMAN dataset, a large-scale egocentric dataset for interaction stage-aware 3D hand trajectory prediction with 219K 6DoF trajectories and 3M structured QA pairs for semantic, spatial, and motion reasoning. We then introduce the EgoMAN model, a reasoning-to-motion framework that links vision-language reasoning and motion generation via a trajectory-token interface. Trained progressively to align reasoning with motion dynamics, our approach yields accurate and stage-aware trajectories with generalization across real-world scenes.

preprint2024arXiv

On the Reliability and Explainability of Language Models for Program Generation

Recent studies have adopted pre-trained language models, such as CodeT5 and CodeGPT, for automated program generation tasks like code generation, repair, and translation. Numerous language model-based approaches have been proposed and evaluated on various benchmark datasets, demonstrating promising performance. However, there is still uncertainty about the reliability of these models, particularly their realistic ability to consistently transform code sequences. This raises the question: are these techniques sufficiently trustworthy for automated program generation? Consequently, Further research is needed to understand model logic and assess reliability and explainability. To bridge these research gaps, we conduct a thorough empirical study of eight popular language models on five representative datasets to determine the capabilities and limitations of automated program generation approaches. We further employ advanced explainable AI approaches to highlight the tokens that significantly contribute to the code transformation. We discover that state-of-the-art approaches suffer from inappropriate performance evaluation stemming from severe data duplication, causing over-optimistic results. Our explainability analysis reveals that, in various experimental scenarios, language models can recognize code grammar and structural information, but they exhibit limited robustness to changes in input sequences. Overall, more rigorous evaluation approaches and benchmarks are critical to enhance the reliability and explainability of automated program generation moving forward. Our findings provide important guidelines for this goal.

preprint2023arXiv

Cluster-guided Contrastive Graph Clustering Network

Benefiting from the intrinsic supervision information exploitation capability, contrastive learning has achieved promising performance in the field of deep graph clustering recently. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms from further improvement. 1) The quality of positive samples heavily depends on the carefully designed data augmentations, while inappropriate data augmentations would easily lead to the semantic drift and indiscriminative positive samples. 2) The constructed negative samples are not reliable for ignoring important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) by mining the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in two views. Moreover, to construct semantic meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function to pull close the samples from the same cluster while pushing away those from other clusters by maximizing and minimizing the cross-view cosine similarity between positive and negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms.

preprint2023arXiv

Decentralized Gradient Tracking with Local Steps

Gradient tracking (GT) is an algorithm designed for solving decentralized optimization problems over a network (such as training a machine learning model). A key feature of GT is a tracking mechanism that allows to overcome data heterogeneity between nodes. We develop a novel decentralized tracking mechanism, $K$-GT, that enables communication-efficient local updates in GT while inheriting the data-independence property of GT. We prove a convergence rate for $K$-GT on smooth non-convex functions and prove that it reduces the communication overhead asymptotically by a linear factor $K$, where $K$ denotes the number of local steps. We illustrate the robustness and effectiveness of this heterogeneity correction on convex and non-convex benchmark problems and on a non-convex neural network training task with the MNIST dataset.

preprint2023arXiv

Maximum Likelihood Estimation for Maximal Distribution under Sublinear Expectation

Maximum likelihood estimation is a common method of estimating the parameters of the probability distribution from a given sample. This paper aims to introduce the maximum likelihood estimation in the framework of sublinear expectation. We find the maximum likelihood estimator for the parameters of the maximal distribution via the solution of the associated minimax problem, which coincides with the optimal unbiased estimation given by Jin and Peng \cite{JP21}. A general estimation method for samples with dependent structure is also provided. This result provides a theoretical foundation for the estimator of upper and lower variances, which is widely used in the G-VaR prediction model in finance.

preprint2023arXiv

Swin MAE: Masked Autoencoders for Small Datasets

The development of deep learning models in medical image analysis is majorly limited by the lack of large-sized and well-annotated datasets. Unsupervised learning does not require labels and is more suitable for solving medical image analysis problems. However, most of the current unsupervised learning methods need to be applied to large datasets. To make unsupervised learning applicable to small datasets, we proposed Swin MAE, which is a masked autoencoder with Swin Transformer as its backbone. Even on a dataset of only a few thousand medical images and without using any pre-trained models, Swin MAE is still able to learn useful semantic features purely from images. It can equal or even slightly outperform the supervised model obtained by Swin Transformer trained on ImageNet in terms of the transfer learning results of downstream tasks. The code is publicly available at https://github.com/Zian-Xu/Swin-MAE.

preprint2022arXiv

A Local Method for Identifying Causal Relations under Markov Equivalence

Causality is important for designing interpretable and robust methods in artificial intelligence research. We propose a local approach to identify whether a variable is a cause of a given target under the framework of causal graphical models of directed acyclic graphs (DAGs). In general, the causal relation between two variables may not be identifiable from observational data as many causal DAGs encoding different causal relations are Markov equivalent. In this paper, we first introduce a sufficient and necessary graphical condition to check the existence of a causal path from a variable to a target in every Markov equivalent DAG. Next, we provide local criteria for identifying whether a variable is a cause/non-cause of a target based only on the local structure instead of the entire graph. Finally, we propose a local learning algorithm for this causal query via learning the local structure of the variable and some additional statistical independence tests related to the target. Simulation studies show that our local algorithm is efficient and effective, compared with other state-of-art methods.

preprint2022arXiv

A Systematic Literature Review on Blockchain Governance

Blockchain has been increasingly used as a software component to enable decentralisation in software architecture for a variety of applications. Blockchain governance has received considerable attention to ensure the safe and appropriate use and evolution of blockchain, especially after the Ethereum DAO attack in 2016. However, there are no systematic efforts to analyse existing governance solutions. To understand the state-of-the-art of blockchain governance, we conducted a systematic literature review with 37 primary studies. The extracted data from primary studies are synthesised to answer identified research questions. The study results reveal several major findings: 1) governance can improve the adaptability and upgradability of blockchain, whilst the current studies neglect broader ethical responsibilities as the objectives of blockchain governance; 2) governance is along with the development process of a blockchain platform, while ecosystem-level governance process is missing, and; 3) the responsibilities and capabilities of blockchain stakeholders are briefly discussed, whilst the decision rights, accountability, and incentives of blockchain stakeholders are still under studied. We provide actionable guidelines for academia and practitioners to use throughout the lifecycle of blockchain, and identify future trends to support researchers in this area.

preprint2022arXiv

Control of diffusion-driven pattern formation behind a wave of competency

In certain biological contexts, such as the plumage patterns of birds and stripes on certain species of fishes, pattern formation takes place behind a so-called "wave of competency". Currently, the effects of a wave of competency on the patterning outcome is not well-understood. In this study, we use Turing's diffusion-driven instability model to study pattern formation behind a wave of competency, under a range of wave speeds. Numerical simulations show that in one spatial dimension a slower wave speed drives a sequence of peak splittings in the pattern, whereas a higher wave speed leads to peak insertions. In two spatial dimensions, we observe stripes that are either perpendicular or parallel to the moving boundary under slow or fast wave speeds, respectively. We argue that there is a correspondence between the one- and two-dimensional phenomena, and that pattern formation behind a wave of competency can account for the pattern organization observed in many biological systems.

preprint2022arXiv

Deep Learning for Android Malware Defenses: a Systematic Literature Review

Malicious applications (particularly those targeting the Android platform) pose a serious threat to developers and end-users. Numerous research efforts have been devoted to developing effective approaches to defend against Android malware. However, given the explosive growth of Android malware and the continuous advancement of malicious evasion technologies like obfuscation and reflection, Android malware defense approaches based on manual rules or traditional machine learning may not be effective. In recent years, a dominant research field called deep learning (DL), which provides a powerful feature abstraction ability, has demonstrated a compelling and promising performance in a variety of areas, like natural language processing and computer vision. To this end, employing deep learning techniques to thwart Android malware attacks has recently garnered considerable research attention. Yet, no systematic literature review focusing on deep learning approaches for Android Malware defenses exists. In this paper, we conducted a systematic literature review to search and analyze how deep learning approaches have been applied in the context of malware defenses in the Android environment. As a result, a total of 132 studies covering the period 2014-2021 were identified. Our investigation reveals that, while the majority of these sources mainly consider DL-based on Android malware detection, 53 primary studies (40.1 percent) design defense approaches based on other scenarios. This review also discusses research trends, research focuses, challenges, and future research directions in DL-based Android malware defenses.

preprint2022arXiv

Defining Blockchain Governance Principles: A Comprehensive Framework

Blockchain eliminates the need for trusted third-party intermediaries in business by enabling decentralised architecture design in software applications. However, the vulnerabilities in on-chain autonomous decision-makings and cumbersome off-chain coordination lead to serious concerns about blockchain's ability to behave in a trustworthy and efficient way. Blockchain governance has received considerable attention to support the decision-making process during the use and evolution of blockchain. Nevertheless, the conventional governance frameworks do not apply to blockchain due to its distributed architecture and decentralised decision process. These inherent features lead to the absence of a clear source of authority in blockchain ecosystem. Currently, there is a lack of systematic guidance on the governance of blockchain. Therefore, in this paper, we present a comprehensive blockchain governance framework, which elucidates an integrated view of the degree of decentralisation, decision rights, incentives, accountability, ecosystem, and legal and ethical responsibilities. The above aspects are formulated as six high-level principles for blockchain governance. We demonstrate a qualitative analysis of the proposed framework, including case studies on five extant blockchain platforms, and comparison with existing blockchain governance frameworks. The results show that our proposed framework is feasible and applicable in a real-world context.

preprint2022arXiv

Explainable AI for Android Malware Detection: Towards Understanding Why the Models Perform So Well?

Machine learning (ML)-based Android malware detection has been one of the most popular research topics in the mobile security community. An increasing number of research studies have demonstrated that machine learning is an effective and promising approach for malware detection, and some works have even claimed that their proposed models could achieve 99\% detection accuracy, leaving little room for further improvement. However, numerous prior studies have suggested that unrealistic experimental designs bring substantial biases, resulting in over-optimistic performance in malware detection. Unlike previous research that examined the detection performance of ML classifiers to locate the causes, this study employs Explainable AI (XAI) approaches to explore what ML-based models learned during the training process, inspecting and interpreting why ML-based malware classifiers perform so well under unrealistic experimental settings. We discover that temporal sample inconsistency in the training dataset brings over-optimistic classification performance (up to 99\% F1 score and accuracy). Importantly, our results indicate that ML models classify malware based on temporal differences between malware and benign, rather than the actual malicious behaviors. Our evaluation also confirms the fact that unrealistic experimental designs lead to not only unrealistic detection performance but also poor reliability, posing a significant obstacle to real-world applications. These findings suggest that XAI approaches should be used to help practitioners/researchers better understand how do AI/ML models (i.e., malware detection) work -- not just focusing on accuracy improvement.

preprint2022arXiv

FedSynth: Gradient Compression via Synthetic Data in Federated Learning

Model compression is important in federated learning (FL) with large models to reduce communication cost. Prior works have been focusing on sparsification based compression that could desparately affect the global model accuracy. In this work, we propose a new scheme for upstream communication where instead of transmitting the model update, each client learns and transmits a light-weight synthetic dataset such that using it as the training data, the model performs similarly well on the real training data. The server will recover the local model update via the synthetic data and apply standard aggregation. We then provide a new algorithm FedSynth to learn the synthetic data locally. Empirically, we find our method is comparable/better than random masking baselines in all three common federated learning benchmark datasets.

preprint2022arXiv

Improved Dual Correlation Reduction Network

Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different clusters without human annotations, is a fundamental yet challenging task. However, we observed that the existing methods suffer from the representation collapse problem and easily tend to encode samples with different classes into the same latent embedding. Consequently, the discriminative capability of nodes is limited, resulting in sub-optimal clustering performance. To address this problem, we propose a novel deep graph clustering algorithm termed Improved Dual Correlation Reduction Network (IDCRN) through improving the discriminative capability of samples. Specifically, by approximating the cross-view feature correlation matrix to an identity matrix, we reduce the redundancy between different dimensions of features, thus improving the discriminative capability of the latent space explicitly. Meanwhile, the cross-view sample correlation matrix is forced to approximate the designed clustering-refined adjacency matrix to guide the learned latent representation to recover the affinity matrix even across views, thus enhancing the discriminative capability of features implicitly. Moreover, we avoid the collapsed representation caused by the over-smoothing issue in Graph Convolutional Networks (GCNs) through an introduced propagation regularization term, enabling IDCRN to capture the long-range information with the shallow network structure. Extensive experimental results on six benchmarks have demonstrated the effectiveness and the efficiency of IDCRN compared to the existing state-of-the-art deep graph clustering algorithms.

preprint2022arXiv

Multiple Kernel Clustering with Dual Noise Minimization

Clustering is a representative unsupervised method widely applied in multi-modal and multi-view scenarios. Multiple kernel clustering (MKC) aims to group data by integrating complementary information from base kernels. As a representative, late fusion MKC first decomposes the kernels into orthogonal partition matrices, then learns a consensus one from them, achieving promising performance recently. However, these methods fail to consider the noise inside the partition matrix, preventing further improvement of clustering performance. We discover that the noise can be disassembled into separable dual parts, i.e. N-noise and C-noise (Null space noise and Column space noise). In this paper, we rigorously define dual noise and propose a novel parameter-free MKC algorithm by minimizing them. To solve the resultant optimization problem, we design an efficient two-step iterative strategy. To our best knowledge, it is the first time to investigate dual noise within the partition in the kernel space. We observe that dual noise will pollute the block diagonal structures and incur the degeneration of clustering performance, and C-noise exhibits stronger destruction than N-noise. Owing to our efficient mechanism to minimize dual noise, the proposed algorithm surpasses the recent methods by large margins.

preprint2022arXiv

Pogorelov type $C^2$ estimates for Sum Hessian equations and a rigidity theorem

We mainly study Pogorelov type $C^2$ estimates for solutions to the Dirichlet problem of Sum Hessian equations. We establish respectively Pogorelov type $C^2$ estimates for $k$-convex solutions and admissible solutions under some conditions. Furthermore, we apply such estimates to obtain a rigidity theorem for $k$-convex solutions of Sum Hessian equations in Euclidean space.

preprint2022arXiv

Simple Contrastive Graph Clustering

Contrastive learning has recently attracted plenty of attention in deep graph clustering for its promising performance. However, complicated data augmentations and time-consuming graph convolutional operation undermine the efficiency of these methods. To solve this problem, we propose a Simple Contrastive Graph Clustering (SCGC) algorithm to improve the existing methods from the perspectives of network architecture, data augmentation, and objective function. As to the architecture, our network includes two main parts, i.e., pre-processing and network backbone. A simple low-pass denoising operation conducts neighbor information aggregation as an independent pre-processing, and only two multilayer perceptrons (MLPs) are included as the backbone. For data augmentation, instead of introducing complex operations over graphs, we construct two augmented views of the same vertex by designing parameter un-shared siamese encoders and corrupting the node embeddings directly. Finally, as to the objective function, to further improve the clustering performance, a novel cross-view structural consistency objective function is designed to enhance the discriminative capability of the learned network. Extensive experimental results on seven benchmark datasets validate our proposed algorithm's effectiveness and superiority. Significantly, our algorithm outperforms the recent contrastive deep clustering competitors with at least seven times speedup on average.

preprint2022arXiv

Spatial Transformation for Image Composition via Correspondence Learning

When using cut-and-paste to acquire a composite image, the geometry inconsistency between foreground and background may severely harm its fidelity. To address the geometry inconsistency in composite images, several existing works learned to warp the foreground object for geometric correction. However, the absence of annotated dataset results in unsatisfactory performance and unreliable evaluation. In this work, we contribute a Spatial TRAnsformation for virtual Try-on (STRAT) dataset covering three typical application scenarios. Moreover, previous works simply concatenate foreground and background as input without considering their mutual correspondence. Instead, we propose a novel correspondence learning network (CorrelNet) to model the correspondence between foreground and background using cross-attention maps, based on which we can predict the target coordinate that each source coordinate of foreground should be mapped to on the background. Then, the warping parameters of foreground object can be derived from pairs of source and target coordinates. Additionally, we learn a filtering mask to eliminate noisy pairs of coordinates to estimate more accurate warping parameters. Extensive experiments on our STRAT dataset demonstrate that our proposed CorrelNet performs more favorably against previous methods.

preprint2022arXiv

Temporal epistasis inference from more than 3,500,000 SARS-CoV-2 Genomic Sequences

We use Direct Coupling Analysis (DCA) to determine epistatic interactions between loci of variability of the SARS-CoV-2 virus, segmenting genomes by month of sampling. We use full-length, high-quality genomes from the GISAID repository up to October 2021, in total over 3,500,000 genomes. We find that DCA terms are more stable over time than correlations, but nevertheless change over time as mutations disappear from the global population or reach fixation. Correlations are enriched for phylogenetic effects, and in particularly statistical dependencies at short genomic distances, while DCA brings out links at longer genomic distance. We discuss the validity of a DCA analysis under these conditions in terms of a transient Quasi-Linkage Equilibrium state. We identify putative epistatic interaction mutations involving loci in Spike.

preprint2021arXiv

Deep Graph Clustering via Dual Correlation Reduction

Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different groups, has attracted intensive attention in recent years. However, we observe that, in the process of node encoding, existing methods suffer from representation collapse which tends to map all data into the same representation. Consequently, the discriminative capability of the node representation is limited, leading to unsatisfied clustering performance. To address this issue, we propose a novel self-supervised deep graph clustering method termed Dual Correlation Reduction Network (DCRN) by reducing information correlation in a dual manner. Specifically, in our method, we first design a siamese network to encode samples. Then by forcing the cross-view sample correlation matrix and cross-view feature correlation matrix to approximate two identity matrices, respectively, we reduce the information correlation in the dual-level, thus improving the discriminative capability of the resulting features. Moreover, in order to alleviate representation collapse caused by over-smoothing in GCN, we introduce a propagation regularization term to enable the network to gain long-distance information with the shallow network structure. Extensive experimental results on six benchmark datasets demonstrate the effectiveness of the proposed DCRN against the existing state-of-the-art methods.

preprint2021arXiv

Dynamical anyon generation in Kitaev honeycomb non-Abelian spin liquids

Relativistic Mott insulators known as 'Kitaev materials' potentially realize spin liquids hosting non-Abelian anyons. Motivated by fault-tolerant quantum-computing applications in this setting, we introduce a dynamical anyon-generation protocol that exploits universal edge physics. The setup features holes in the spin liquid, which define energetically cheap locations for non-Abelian anyons, connected by a narrow bridge that can be tuned between spin liquid and topologically trivial phases. We show that modulating the bridge from trivial to spin liquid over intermediate time scales -- quantified by analytics and extensive simulations -- deposits non-Abelian anyons into the holes with O(1) probability. The required bridge manipulations can be implemented by integrating the Kitaev material into magnetic tunnel junction arrays that engender locally tunable exchange fields. Combined with existing readout strategies, our protocol reveals a path to topological qubit experiments in Kitaev materials at zero applied magnetic field.

preprint2021arXiv

Out-of-Distribution Generalization Analysis via Influence Function

The mismatch between training and target data is one major challenge for current machine learning systems. When training data is collected from multiple domains and the target domains include all training domains and other new domains, we are facing an Out-of-Distribution (OOD) generalization problem that aims to find a model with the best OOD accuracy. One of the definitions of OOD accuracy is worst-domain accuracy. In general, the set of target domains is unknown, and the worst over target domains may be unseen when the number of observed domains is limited. In this paper, we show that the worst accuracy over the observed domains may dramatically fail to identify the OOD accuracy. To this end, we introduce Influence Function, a classical tool from robust statistics, into the OOD generalization problem and suggest the variance of influence function to monitor the stability of a model on training domains. We show that the accuracy on test domains and the proposed index together can help us discern whether OOD algorithms are needed and whether a model achieves good OOD generalization.

preprint2020arXiv

A Blockchain-based Platform Architecture for Multimedia Data Management

Massive amounts of multimedia data (i.e., text, audio, video, graphics and animation) are being generated everyday. Conventionally, multimedia data are managed by the platforms maintained by multimedia service providers, which are generally designed using centralised architecture. However, such centralised architecture may lead to a single point of failure and disputes over royalties or other rights. It is hard to ensure the data integrity and track fulfilment of obligations listed on the copyright agreement. To tackle these issues, in this paper, we present a blockchain-based platform architecture for multimedia data management. We adopt self-sovereign identity for identity management and design a multi-level capability-based mechanism for access control. We implement a proof-of-concept prototype using the proposed approach and evaluate it using a use case. The results show that the proposed approach is feasible and has scalable performance.

preprint2020arXiv

A Note on Parallel Distinguishability of two Quantum Operations

We consider a homogeneous system of linear equations of the form $A_α^{\otimes N} {\bf x} = 0$ arising from the distinguishability of two quantum operations by $N$ uses in parallel, where the coefficient matrix $A_α$ depends on a real parameter $α$. It was conjectured by Duan et al. that the system has a non-trivial nonnegative solution if and only if $α$ lies in a certain interval $R_N$ depending on $N$. We affirm the necessity part of the conjecture and establish the sufficiency of the conjecture for $N\leq 10$ by presenting explicit non-trivial nonnegative solutions for the linear system.

preprint2020arXiv

Adding Seemingly Uninformative Labels Helps in Low Data Regimes

Evidence suggests that networks trained on large datasets generalize well not solely because of the numerous training examples, but also class diversity which encourages learning of enriched features. This raises the question of whether this remains true when data is scarce - is there an advantage to learning with additional labels in low-data regimes? In this work, we consider a task that requires difficult-to-obtain expert annotations: tumor segmentation in mammography images. We show that, in low-data settings, performance can be improved by complementing the expert annotations with seemingly uninformative labels from non-expert annotators, turning the task into a multi-class problem. We reveal that these gains increase when less expert data is available, and uncover several interesting properties through further studies. We demonstrate our findings on CSAW-S, a new dataset that we introduce here, and confirm them on two public datasets.

preprint2020arXiv

Decoupling Inherent Risk and Early Cancer Signs in Image-based Breast Cancer Risk Models

The ability to accurately estimate risk of developing breast cancer would be invaluable for clinical decision-making. One promising new approach is to integrate image-based risk models based on deep neural networks. However, one must take care when using such models, as selection of training data influences the patterns the network will learn to identify. With this in mind, we trained networks using three different criteria to select the positive training data (i.e. images from patients that will develop cancer): an inherent risk model trained on images with no visible signs of cancer, a cancer signs model trained on images containing cancer or early signs of cancer, and a conflated model trained on all images from patients with a cancer diagnosis. We find that these three models learn distinctive features that focus on different patterns, which translates to contrasts in performance. Short-term risk is best estimated by the cancer signs model, whilst long-term risk is best estimated by the inherent risk model. Carelessly training with all images conflates inherent risk with early cancer signs, and yields sub-optimal estimates in both regimes. As a consequence, conflated models may lead physicians to recommend preventative action when early cancer signs are already visible.

preprint2020arXiv

Deep Neural Network Approach for Annual Luminance Simulations

Annual luminance maps provide meaningful evaluations for occupants' visual comfort, preferences, and perception. However, acquiring long-term luminance maps require labor-intensive and time-consuming simulations or impracticable long-term field measurements. This paper presents a novel data-driven machine learning approach that makes annual luminance-based evaluations more efficient and accessible. The methodology is based on predicting the annual luminance maps from a limited number of point-in-time high dynamic range imagery by utilizing a deep neural network (DNN). Panoramic views are utilized, as they can be post-processed to study multiple view directions. The proposed DNN model can faithfully predict high-quality annual panoramic luminance maps from one of the three options within 30 minutes training time: a) point-in-time luminance imagery spanning 5% of the year, when evenly distributed during daylight hours, b) one-month hourly imagery generated or collected continuously during daylight hours around the equinoxes (8% of the year); or c) 9 days of hourly data collected around the spring equinox, summer and winter solstices (2.5% of the year) all suffice to predict the luminance maps for the rest of the year. The DNN predicted high-quality panoramas are validated against Radiance (RPICT) renderings using a series of quantitative and qualitative metrics. The most efficient predictions are achieved with 9 days of hourly data collected around the spring equinox, summer and winter solstices. The results clearly show that practitioners and researchers can efficiently incorporate long-term luminance-based metrics over multiple view directions into the design and research processes using the proposed DNN workflow.

preprint2020arXiv

Design Patterns for Blockchain-based Self-Sovereign Identity

Self-sovereign identity is a new identity management paradigm that allows entities to really have the ownership of their identity data and control their use without involving any intermediary. Blockchain is an enabling technology for building self-sovereign identity systems by providing a neutral and trustable storage and computing infrastructure and can be viewed as a component of the systems. Both blockchain and self-sovereign identity are emerging technologies which could present a steep learning curve for architects. We collect and propose 12 design patterns for blockchain-based self-sovereign identity systems to help the architects understand and easily apply the concepts in system design. Based on the lifecycles of three main objects involved in self-sovereign identity, we categorise the patterns into three groups: key management patterns, decentralised identifier management patterns, and credential design patterns. The proposed patterns provide a systematic and holistic guide for architects to design the architecture of blockchain-based self-sovereign identity systems.

preprint2020arXiv

Online NEAT for Credit Evaluation -- a Dynamic Problem with Sequential Data

In this paper, we describe application of Neuroevolution to a P2P lending problem in which a credit evaluation model is updated based on streaming data. We apply the algorithm Neuroevolution of Augmenting Topologies (NEAT) which has not been widely applied generally in the credit evaluation domain. In addition to comparing the methodology with other widely applied machine learning techniques, we develop and evaluate several enhancements to the algorithm which make it suitable for the particular aspects of online learning that are relevant in the problem. These include handling unbalanced streaming data, high computation costs, and maintaining model similarity over time, that is training the stochastic learning algorithm with new data but minimizing model change except where there is a clear benefit for model performance

preprint2020arXiv

Orbital Stability of smooth solitary waves for the Degasperis-Procesi Equation

The Degasperis-Procesi equation is the integrable Camassa-Holm-type model which is an asymptotic approximation for the unidirectional propagation of shallow water waves. This work establishes the orbital stability of localized smooth solitary waves to the Desgasperis-Procesi (DP) equation on the real line. %extending our previous work on their spectral stability \cite{LLW}. The main difficulty stems from the fact that the translation symmetry for the DP equation gives rise to a conserved quantity equivalent to the $L^2$-norm, which by itself can not bound the higher-order nonlinear terms in the Lagrangian. The remedy is to observe that, given a sufficiently smooth initial condition satisfying a measurable constraint, the $L^\infty$ orbital norm of the perturbation is bounded above by a function of its $L^2$ orbital norm, yielding the orbital stability in the $L^2\cap L^\infty$ space.

preprint2020arXiv

Predicting the Porosity Formed in Freeze Casting by Artificial Neural Network

Freeze casting has been increasingly applied to process various porous materials. A linear relationship between the final porosity and the initial solid material fraction in the suspension was reported by other researchers. However, the relationship of the volume fraction between the porosity and the solid material shows high divergence among different materials or frozen solvents, as the nature of materials significantly affects the pores formed in freeze casting. Here, we proposed an artificial neural network (ANN) to evaluate the porosity in freeze casting process. After well training the ANN model on experimental data, a porosity value can be predicted from four inputs which describe the most influential process conditions. The error range, reliability and optimization of the model were also analyzed and discussed in this study. The results proved that this method effectively summarizes a general rule for diverse materials in one model, which is difficult to be realized by linear fitting. Finally, a user-friendly mini program based on a well-trained ANN model is also provided to predict the porosity level for customized freeze-casting experiments.

preprint2020arXiv

Real-time Human Activity Recognition Using Conditionally Parametrized Convolutions on Mobile and Wearable Devices

Recently, deep learning has represented an important research trend in human activity recognition (HAR). In particular, deep convolutional neural networks (CNNs) have achieved state-of-the-art performance on various HAR datasets. For deep learning, improvements in performance have to heavily rely on increasing model size or capacity to scale to larger and larger datasets, which inevitably leads to the increase of operations. A high number of operations in deep leaning increases computational cost and is not suitable for real-time HAR using mobile and wearable sensors. Though shallow learning techniques often are lightweight, they could not achieve good performance. Therefore, deep learning methods that can balance the trade-off between accuracy and computation cost is highly needed, which to our knowledge has seldom been researched. In this paper, we for the first time propose a computation efficient CNN using conditionally parametrized convolution for real-time HAR on mobile and wearable devices. We evaluate the proposed method on four public benchmark HAR datasets consisting of WISDM dataset, PAMAP2 dataset, UNIMIB-SHAR dataset, and OPPORTUNITY dataset, achieving state-of-the-art accuracy without compromising computation cost. Various ablation experiments are performed to show how such a network with large capacity is clearly preferable to baseline while requiring a similar amount of operations. The method can be used as a drop-in replacement for the existing deep HAR architectures and easily deployed onto mobile and wearable devices for real-time HAR applications.

preprint2020arXiv

Stable Prediction via Leveraging Seed Variable

In this paper, we focus on the problem of stable prediction across unknown test data, where the test distribution is agnostic and might be totally different from the training one. In such a case, previous machine learning methods might exploit subtly spurious correlations in training data induced by non-causal variables for prediction. Those spurious correlations are changeable across data, leading to instability of prediction across data. By assuming the relationships between causal variables and response variable are invariant across data, to address this problem, we propose a conditional independence test based algorithm to separate those causal variables with a seed variable as priori, and adopt them for stable prediction. By assuming the independence between causal and non-causal variables, we show, both theoretically and with empirical experiments, that our algorithm can precisely separate causal and non-causal variables for stable prediction across test data. Extensive experiments on both synthetic and real-world datasets demonstrate that our algorithm outperforms state-of-the-art methods for stable prediction.

preprint2020arXiv

Video Moment Retrieval via Natural Language Queries

In this paper, we propose a novel method for video moment retrieval (VMR) that achieves state of the arts (SOTA) performance on R@1 metrics and surpassing the SOTA on the high IoU metric (R@1, IoU=0.7). First, we propose to use a multi-head self-attention mechanism, and further a cross-attention scheme to capture video/query interaction and long-range query dependencies from video context. The attention-based methods can develop frame-to-query interaction and query-to-frame interaction at arbitrary positions and the multi-head setting ensures the sufficient understanding of complicated dependencies. Our model has a simple architecture, which enables faster training and inference while maintaining . Second, We also propose to use multiple task training objective consists of moment segmentation task, start/end distribution prediction and start/end location regression task. We have verified that start/end prediction are noisy due to annotator disagreement and joint training with moment segmentation task can provide richer information since frames inside the target clip are also utilized as positive training examples. Third, we propose to use an early fusion approach, which achieves better performance at the cost of inference time. However, the inference time will not be a problem for our model since our model has a simple architecture which enables efficient training and inference.