Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
76 works
0 followers
36 topics
4 close collaborators

Published work

76 published item(s)

preprint | 2026 | arXiv

$f$-Divergence Regularized RLHF: Two Tales of Sampling and Unified Analyses

Reinforcement Learning from Human Feedback (RLHF) has become a cornerstone technique for post-training large language models. While most existing approaches rely on the reverse KL-regularization, recent empirical studies have begun exploring alternative divergences (e.g., forward KL, chi-squared) as regularizers in RLHF. However, a unified theoretical understanding of general $f$-divergence regularization remains under-explored. To fill this gap, this work develops a comprehensive theoretical framework for online RLHF with a general $f$-divergence regularized objective. Rather than treating each possible divergence function individually, we adopt a holistic perspective across the entire function class and propose two algorithms based on distinct sampling principles. The first extends the classical optimism principle with a carefully designed exploration bonus, while the second introduces a new method that exploits the sensitivity of the optimal policy to reward perturbations under $f$-divergence regularization. Theoretical analysis shows that $O(\log T)$ regret and $O(1/T)$ sub-optimality gap are achievable, establishing provable efficiency of both algorithms and, to the best of our knowledge, the first performance bounds for online RLHF under general $f$-divergence regularization.
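As a rough illustration of the regularizers named in the abstract above, the sketch below evaluates an $f$-divergence between two discrete distributions and the corresponding regularized objective. The function names, the `beta` weight, and the restriction to strictly positive discrete distributions are assumptions made for this example; the two algorithms proposed in the paper are not reproduced here.

```python
import math

# Generator functions f for D_f(P || Q) = sum_x Q(x) * f(P(x) / Q(x)).
# Each f is convex with f(1) = 0. The names are illustrative labels.
GENERATORS = {
    "reverse_kl": lambda t: t * math.log(t),   # yields KL(P || Q)
    "forward_kl": lambda t: -math.log(t),      # yields KL(Q || P)
    "chi_squared": lambda t: (t - 1.0) ** 2,   # yields the chi-squared divergence
}


def f_divergence(p, q, name):
    """D_f(P || Q) for discrete distributions with strictly positive entries."""
    f = GENERATORS[name]
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


def regularized_objective(policy, reference, rewards, beta, name):
    """Expected reward under `policy` minus beta * D_f(policy || reference)."""
    expected_reward = sum(pi * r for pi, r in zip(policy, rewards))
    return expected_reward - beta * f_divergence(policy, reference, name)
```

Here `policy` plays the role of P and `reference` the role of Q, so `"reverse_kl"` corresponds to the usual KL penalty of the policy against the reference model.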

preprint | 2026 | arXiv

LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues

Long-term memory is crucial for agents in specialized web environments, where success depends on recalling interface affordances, state dynamics, workflows, and recurring failure modes. However, existing memory benchmarks for agents mostly focus on user histories, short traces, or downstream task success, leaving open how to directly evaluate whether memory systems effectively internalize environment-specific experience. To address this gap, we introduce LongMemEval-V2 (LME-V2), a benchmark for evaluating whether memory systems can help agents acquire the experience needed to become knowledgeable colleagues in customized environments. LME-V2 contains 451 manually curated questions covering five core memory abilities for web agents: static state recall, dynamic state tracking, workflow knowledge, environment gotchas, and premise awareness. Questions are paired with history trajectories containing up to 500 trajectories and 115M tokens. We use a context gathering formulation: memory systems consume history trajectories and return compact evidence for downstream question answering. We propose a suite of two memory methods: AgentRunbook-R, an efficient RAG-based memory with knowledge pools for raw state observations, events, and strategy notes, and AgentRunbook-C, which stores trajectories as files and invokes a coding agent to gather evidence in an augmented sandbox. Experiments show that AgentRunbook-C achieves the best performance with 72.5% average accuracy, outperforming the strongest RAG baseline (48.5%) and the off-the-shelf coding agent baseline (69.3%). Despite the strong performance gains, coding agent based methods have high latency costs. While AgentRunbook-C advances the accuracy-latency Pareto frontier, substantial room for improvement remains. Together, these results establish LME-V2 as a challenging testbed for developing long-term memory systems for environment experience.

preprint | 2024 | arXiv

EcoFed: Efficient Communication for DNN Partitioning-based Federated Learning

Efficiently running federated learning (FL) on resource-constrained devices is challenging since they are required to train computationally intensive deep neural networks (DNN) independently. DNN partitioning-based FL (DPFL) has been proposed as one mechanism to accelerate training where the layers of a DNN (or computation) are offloaded from the device to the server. However, this creates significant communication overheads since the intermediate activation and gradient need to be transferred between the device and the server during training. While current research reduces the communication introduced by DNN partitioning using local loss-based methods, we demonstrate that these methods are ineffective in improving the overall efficiency (communication overhead and training speed) of a DPFL system. This is because they suffer from accuracy degradation and ignore the communication costs incurred when transferring the activation from the device to the server. This article proposes EcoFed - a communication efficient framework for DPFL systems. EcoFed eliminates the transmission of the gradient by developing pre-trained initialization of the DNN model on the device for the first time. This reduces the accuracy degradation seen in local loss-based methods. In addition, EcoFed proposes a novel replay buffer mechanism and implements a quantization-based compression technique to reduce the transmission of the activation. It is experimentally demonstrated that EcoFed can reduce the communication cost by up to 133x and accelerate training by up to 21x when compared to classic FL. Compared to vanilla DPFL, EcoFed achieves a 16x communication reduction and 2.86x training time speed-up. EcoFed is available from https://github.com/blessonvar/EcoFed.
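The abstract above mentions quantization-based compression of activations without giving the scheme. The following is a minimal sketch of one common choice, uniform min-max quantization to 8-bit integer codes; the function names and the 8-bit default are assumptions for illustration and may differ from what EcoFed actually implements.

```python
def quantize(values, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1].

    Returns the codes plus the (lo, scale) pair needed to decode them.
    """
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Guard against a constant input, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from integer codes."""
    return [lo + c * scale for c in codes]
```

Sending one byte per activation instead of a 32-bit float would cut the payload by roughly 4x, at the cost of a per-value reconstruction error of at most half a quantization step.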

preprint | 2024 | arXiv

On the Selection of Intermediate Length Representative Periods for Capacity Expansion

As the decarbonization of power systems accelerates, there has been increasing interest in capacity expansion models for their role in guiding this transition. Representative period selection is an important component of capacity expansion modeling, enabling computational tractability of optimization while ensuring fidelity between the representative periods and the full year. However, little attention has been devoted to selecting representative periods longer than a single day. This prevents the capacity expansion model from directly simulating interday energy sharing, which is of key importance as energy generation becomes more variable and storage more important. To this end, we propose a novel method for selecting representative periods of any length. The method is validated using a capacity expansion model and production cost model based on California's decarbonization goals. We demonstrate that the representative period length has a substantial impact on the results of the capacity expansion investment plan.

preprint | 2023 | arXiv

A Novel Modular, Reconfigurable Battery Energy Storage System: Design, Control, and Experimentation

This paper presents a novel modular, reconfigurable battery energy storage system. The proposed design is characterized by a tight integration of reconfigurable power switches and DC/DC converters. This characteristic enables isolation of faulty cells from the system and allows fine power control for individual cells toward optimal system-level performance. An optimal power management approach is developed to extensively exploit the merits of the proposed design. Based on receding-horizon convex optimization, this approach aims to minimize the total power losses in charging/discharging while allocating the power in line with each cell's condition to achieve state-of-charge (SoC) and temperature balancing. By appropriate design, the approach manages to regulate the power of a cell across its full SoC range and guarantees the feasibility of the optimization problem. We perform extensive simulations and further develop a lab-scale prototype to validate the proposed system design and power management approach.

preprint | 2023 | arXiv

Gap Minimization for Knowledge Sharing and Transfer

Learning from multiple related tasks by knowledge sharing and transfer has become increasingly relevant over the last two decades. In order to successfully transfer information from one task to another, it is critical to understand the similarities and differences between the domains. In this paper, we introduce the notion of performance gap, an intuitive and novel measure of the distance between learning tasks. Unlike existing measures which are used as tools to bound the difference of expected risks between tasks (e.g., $\mathcal{H}$-divergence or discrepancy distance), we theoretically show that the performance gap can be viewed as a data- and algorithm-dependent regularizer, which controls the model complexity and leads to finer guarantees. More importantly, it also provides new insights and motivates a novel principle for designing strategies for knowledge sharing and transfer: gap minimization. We instantiate this principle with two algorithms: 1. gapBoost, a novel and principled boosting algorithm that explicitly minimizes the performance gap between source and target domains for transfer learning; and 2. gapMTNN, a representation learning algorithm that reformulates gap minimization as semantic conditional matching for multitask learning. Our extensive evaluation on both transfer learning and multitask learning benchmark data sets shows that our methods outperform existing baselines.

preprint | 2023 | arXiv

Topological classes of rotating black holes

In this paper, we investigate the topological numbers for singly rotating Kerr black holes in arbitrary dimensions and the four-dimensional Kerr-Newman black hole. We show that for uncharged black holes, the rotation parameter has a significant effect on the topological number, and for rotating black holes, the dimension of spacetime has a remarkable effect on the topological number too. In addition, we find that the topological numbers of the four-dimensional Kerr and Kerr-Newman black holes are the same, which seems to indicate that the electric charge parameter has no effect on the topological number of rotating black holes. Our current research provides more evidence that the conjecture put forward in Wei et al. [Phys. Rev. Lett. 129, 191101 (2022)], according to which all black hole solutions should be separated into three different topological classes, is accurate, at least in the pure Einstein-Maxwell gravity theory.

preprint | 2022 | arXiv

A Differential Evolution-Enhanced Latent Factor Analysis Model for High-dimensional and Sparse Data

High-dimensional and sparse (HiDS) matrices are frequently adopted to describe the complex relationships in various big data-related systems and applications. A Position-transitional Latent Factor Analysis (PLFA) model can accurately and efficiently represent an HiDS matrix. However, its latent factors are optimized step by step by stochastic gradient descent along a specific gradient direction, which may lead to a suboptimal solution. To address this issue, this paper proposes a Sequential-Group-Differential-Evolution (SGDE) algorithm to refine the latent factors optimized by a PLFA model, thereby achieving a highly accurate SGDE-PLFA model for HiDS matrices. As demonstrated by the experiments on four HiDS matrices, an SGDE-PLFA model outperforms the state-of-the-art models.
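To make the baseline in the abstract above concrete, here is a minimal sketch of plain latent factor analysis trained by stochastic gradient descent on the observed entries of a sparse matrix. It is neither the PLFA model nor the SGDE refinement; the function names, hyperparameters, and the L2 regularization term are assumptions chosen for illustration.

```python
import random


def predict(P, Q, i, j):
    """Inner product of row factor i and column factor j."""
    return sum(a * b for a, b in zip(P[i], Q[j]))


def squared_error(entries, P, Q):
    """Sum of squared errors over the observed (row, col, value) entries."""
    return sum((v - predict(P, Q, i, j)) ** 2 for i, j, v in entries)


def train_lfa(entries, n_rows, n_cols, rank=2, lr=0.02, reg=0.01,
              epochs=1000, seed=0):
    """Fit row factors P and column factors Q using observed entries only."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for i, j, value in entries:
            err = value - predict(P, Q, i, j)
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                # Gradient step on the regularized squared error of one entry.
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q


# Six observed entries of a 3x3 matrix; the other three are missing.
observed = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
            (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = train_lfa(observed, n_rows=3, n_cols=3)
estimate_for_missing_entry = predict(P, Q, 0, 2)
```

Because each update follows a single gradient direction from a random start, the result can settle in a local optimum, which is the limitation the abstract says a refinement stage is meant to address.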

preprint | 2022 | arXiv

A duality-based approach for solving linear parabolic control constrained optimal control problems

This paper is concerned with the optimal control problem governed by a linear parabolic equation and subjected to box constraints on control variables. This type of problem has important applications in heating and cooling systems. By applying the scheme of Fenchel duality, we derive the dual problem explicitly, where the control constraints in the primal problem are embedded in the dual problem's objective functional. The existence and uniqueness of the solution to the dual problem are proved and the first-order optimality conditions are also derived. In addition, we discuss the saddle point property between the solutions of the primal problem and the dual problem. The solution of the primal problem can be readily obtained from the solution of the dual problem. To solve the dual problem numerically, we design two implementable methods: a conjugate gradient method and a semismooth Newton method. Three example problems are solved; numerical results show that the proposed methods are efficient and accurate.

preprint | 2022 | arXiv

A Latent Feature Analysis-based Approach for Spatio-Temporal Traffic Data Recovery

Missing data is an inevitable and common problem in data-driven intelligent transportation systems (ITS). In the past decade, scholars have done much research on the recovery of missing traffic data; however, how to make full use of spatio-temporal traffic patterns to improve the recovery performance is still an open problem. Aiming at the spatio-temporal characteristics of traffic speed data, this paper regards the recovery of missing data as a matrix completion problem and proposes a spatio-temporal traffic data completion method based on hidden feature analysis, which discovers spatio-temporal patterns and underlying structures from incomplete data to complete the recovery task. Therefore, we introduce spatial and temporal correlation to capture the main underlying features of each dimension. Finally, these latent features are applied to recover traffic data through latent feature analysis. The experimental and evaluation results show that the evaluation criterion value of the model is small, which indicates that the model has better performance. The results show that the model can accurately estimate the continuous missing data.

preprint | 2022 | arXiv

A Multi-Metric Latent Factor Model for Analyzing High-Dimensional and Sparse data

High-dimensional and sparse (HiDS) matrices are omnipresent in a variety of big data-related applications. Latent factor analysis (LFA) is a typical representation learning method that extracts useful yet latent knowledge from HiDS matrices via low-rank approximation. Current LFA-based models mainly focus on a single-metric representation, where the representation strategy designed for the approximation loss function is fixed and exclusive. However, real-world HiDS matrices are commonly heterogeneous and inclusive and have diverse underlying patterns, such that a single-metric representation is most likely to yield inferior performance. Motivated by this, in this paper we propose a multi-metric latent factor (MMLF) model. Its main idea is two-fold: 1) two vector spaces and three Lp-norms are simultaneously employed to develop six variants of the LFA model, each of which resides in a unique metric representation space, and 2) all the variants are ensembled with a tailored, self-adaptive weighting strategy. As such, our proposed MMLF enjoys the merits originating from a set of disparate metric spaces all at once, achieving a comprehensive and unbiased representation of HiDS matrices. Theoretical study guarantees that MMLF attains a performance gain. Extensive experiments on eight real-world HiDS datasets, spanning a wide range of industrial and science domains, verify that our MMLF significantly outperforms ten state-of-the-art, shallow and deep counterparts.

preprint | 2022 | arXiv

An Adam-adjusting-antennae BAS Algorithm for Refining Latent Factors

Extracting the latent information in high-dimensional and incomplete matrices is an important and challenging issue. The Latent Factor Analysis (LFA) model handles high-dimensional matrix analysis well. Recently, Particle Swarm Optimization (PSO)-incorporated LFA models have been proposed to tune the hyper-parameters adaptively with high efficiency. However, the incorporation of PSO causes premature convergence. To address this issue, we propose a sequential Adam-adjusting-antennae BAS (A2BAS) optimization algorithm, which refines the latent factors obtained by the PSO-incorporated LFA model. The A2BAS algorithm consists of two sub-algorithms. First, we design an improved BAS algorithm which adjusts beetles' antennae and step-size with Adam; second, we implement the improved BAS algorithm to optimize all the row and column latent factors sequentially. With experimental results on two real high-dimensional matrices, we demonstrate that our algorithm can effectively solve the premature convergence issue.

preprint | 2022 | arXiv

An Online Sparse Streaming Feature Selection Algorithm

Online streaming feature selection (OSFS), which conducts feature selection in an online manner, plays an important role in dealing with high-dimensional data. In many real applications such as intelligent healthcare platforms, streaming features always have some missing data, which raises a crucial challenge in conducting OSFS, i.e., how to establish the uncertain relationship between sparse streaming features and labels. Unfortunately, existing OSFS algorithms never consider such an uncertain relationship. To fill this gap, in this paper we propose an online sparse streaming feature selection with uncertainty (OS2FSU) algorithm. OS2FSU consists of two main parts: 1) latent factor analysis is utilized to pre-estimate the missing data in sparse streaming features before conducting feature selection, and 2) fuzzy logic and neighborhood rough set are employed to alleviate the uncertainty between estimated streaming features and labels while conducting feature selection. In the experiments, OS2FSU is compared with five state-of-the-art OSFS algorithms on six real datasets. The results demonstrate that OS2FSU outperforms its competitors when missing data are encountered in OSFS.

preprint | 2022 | arXiv

Analytical Shaping Method for Low-Thrust Rendezvous Trajectory Using Cubic Spline Functions

Preliminary mission design requires an efficient and accurate approximation to the low-thrust rendezvous trajectories, which might be generally three-dimensional and involve multiple revolutions. In this paper, a new shaping method using cubic spline functions is developed for the analytical approximation, which shows advantages in the optimality and computational efficiency. The rendezvous constraints on the boundary states and transfer time are all satisfied analytically, under the assumption that the boundary conditions and segment numbers of cubic spline functions are designated in advance. Two specific shapes are then formulated according to whether they have free optimization parameters. The shape without free parameters provides an efficient and robust estimation, while the other one allows a subsequent optimization for the satisfaction of additional constraints such as the constraint on the thrust magnitude. Applications of the proposed method in combination with the particle swarm optimization algorithm are discussed through two typical interplanetary rendezvous missions, that is, an inclined multi-revolution trajectory from the Earth to asteroid Dionysus and a multi-rendezvous trajectory of sample return. Simulation examples show that the proposed method is superior to existing methods in terms of providing good estimation for the global search and generating suitable initial guess for the subsequent trajectory optimization.

preprint | 2022 | arXiv

Bridging the Gap Between Patient-specific and Patient-independent Seizure Prediction via Knowledge Distillation

Objective. Deep neural networks (DNNs) have shown unprecedented success in various brain-machine interface applications such as epileptic seizure prediction. However, existing approaches typically train models in a patient-specific fashion due to the highly personalized characteristics of epileptic signals. Therefore, only a limited number of labeled recordings from each subject can be used for training. As a consequence, current DNN based methods demonstrate poor generalization ability to some extent due to the insufficiency of training data. On the other hand, patient-independent models attempt to utilize more patient data to train a universal model for all patients by pooling patient data together. Despite different techniques applied, results show that patient-independent models perform worse than patient-specific models due to high individual variation across patients. A substantial gap thus exists between patient-specific and patient-independent models. Approach. In this paper, we propose a novel training scheme based on knowledge distillation which makes use of a large amount of data from multiple subjects. It first distills informative features from signals of all available subjects with a pre-trained general model. A patient-specific model can then be obtained with the help of distilled knowledge and additional personalized data. Main results. Four state-of-the-art seizure prediction methods are trained on the Children's Hospital of Boston-MIT sEEG database with our proposed scheme. The resulting accuracy, sensitivity, and false prediction rate show that our proposed training scheme consistently improves the prediction performance of state-of-the-art methods by a large margin. Significance. The proposed training scheme significantly improves the performance of patient-specific seizure predictors and bridges the gap between patient-specific and patient-independent predictors.

preprint | 2022 | arXiv

C$^2$SP-Net: Joint Compression and Classification Network for Epilepsy Seizure Prediction

Recent developments in brain-machine interface technology have made seizure prediction possible. However, the communication of large volumes of electrophysiological signals between sensors and processing apparatus and the related computation become two major bottlenecks for seizure prediction systems due to the constrained bandwidth and limited computation resources, especially for wearable and implantable medical devices. Although compressive sensing (CS) can be adopted to compress the signals to reduce the communication bandwidth requirement, it needs a complex reconstruction procedure before the signal can be used for seizure prediction. In this paper, we propose C$^2$SP-Net to jointly solve compression, prediction, and reconstruction with a single neural network. A plug-and-play in-sensor compression matrix is constructed to reduce the transmission bandwidth requirement. The compressed signal can be used for seizure prediction without additional reconstruction steps. Reconstruction of the original signal can also be carried out in high fidelity. Prediction accuracy, sensitivity, false prediction rate, and reconstruction quality of the proposed framework are evaluated under various compression ratios. The experimental results illustrate that our model outperforms the competitive state-of-the-art baselines by a large margin in prediction accuracy. In particular, our proposed method produces an average loss of 0.35% in prediction accuracy with a compression ratio ranging from 1/2 to 1/16.

preprint | 2022 | arXiv

Conditional Approximate Normalizing Flows for Joint Multi-Step Probabilistic Forecasting with Application to Electricity Demand

Some real-world decision-making problems require making probabilistic forecasts over multiple steps at once. However, methods for probabilistic forecasting may fail to capture correlations in the underlying time-series that exist over long time horizons as errors accumulate. One such application is with resource scheduling under uncertainty in a grid environment, which requires forecasting electricity demand that is inherently noisy, but often cyclic. In this paper, we introduce the conditional approximate normalizing flow (CANF) to make probabilistic multi-step time-series forecasts when correlations are present over long time horizons. We first demonstrate our method's efficacy on estimating the density of a toy distribution, finding that CANF improves the KL divergence by one-third compared to that of a Gaussian mixture model while still being amenable to explicit conditioning. We then use a publicly available household electricity consumption dataset to showcase the effectiveness of CANF on joint probabilistic multi-step forecasting. Empirical results show that conditional approximate normalizing flows outperform other methods in terms of multi-step forecast accuracy and lead to up to 10x better scheduling decisions. Our implementation is available at https://github.com/sisl/JointDemandForecasting.

preprint | 2022 | arXiv

Consistent mass formulas for the four-dimensional dyonic NUT-charged spacetimes

In our previous work, a novel idea that the NUT charge can be thought of as a thermodynamical multi-hair has been advocated to describe perfectly the thermodynamical character of the generic four-dimensional Taub-NUT spacetimes. According to this scheme, the Komar mass M, the gravito-magnetic charge and/or the dual (magnetic) mass N, together with a new secondary hair J_N=MN, namely, a Kerr-like conserved angular momentum, enter into the standard forms of the first law and Bekenstein-Smarr mass formula. Distinguished from other recent attempts, our consistent thermodynamic differential and integral mass formulae are both obtainable from a meaningful Christodoulou-Ruffini-type squared mass formula of almost all of the four-dimensional NUT-charged spacetimes. As an excellent consequence, the famous Bekenstein-Hawking one-quarter area-entropy relation can be naturally restored not only in the Lorentzian sector but also in the Euclidean counterpart of the generic Taub-NUT-type spacetimes without imposing any constraint condition. However, only purely electric-charged cases in the four-dimensional Einstein-Maxwell gravity theory with a NUT charge have been addressed there. In this paper, we shall follow the simple, systematic way proposed in that article to further investigate the dyonic NUT-charged case. It is shown that the standard thermodynamic relations continue to hold true provided that no new secondary charge is added; however, the so-obtained electrostatic and magneto-static potentials are not coincident with those computed via the standard method. To rectify this inconsistency, a simple strategy is provided by further introducing two additional secondary hairs: Q_N=QN and P_N=PN, together with their thermodynamical conjugate potentials, so that the first law and Bekenstein-Smarr mass formula are still satisfied.

preprint | 2022 | arXiv

Constrained Adaptive Projection with Pretrained Features for Anomaly Detection

Anomaly detection aims to separate anomalies from normal samples, and the pretrained network is promising for anomaly detection. However, adapting the pretrained features would be confronted with the risk of pattern collapse when finetuning on one-class training data. In this paper, we propose an anomaly detection framework called constrained adaptive projection with pretrained features (CAP). Combined with pretrained features, a simple linear projection head applied on a specific input and its k most similar pretrained normal representations is designed for feature adaptation, and a reformed self-attention is leveraged to mine the inner-relationship among one-class semantic features. A loss function is proposed to avoid potential pattern collapse. Concretely, it considers the similarity between a specific sample and its corresponding adaptive normal representation, and incorporates a constraint term slightly aligning pretrained and adaptive spaces. Our method achieves state-of-the-art anomaly detection performance on semantic anomaly detection and sensory anomaly detection benchmarks including 96.5% AUROC on the CIFAR-100 dataset, 97.0% AUROC on the CIFAR-10 dataset and 89.9% AUROC on the MVTec dataset.

preprint | 2022 | arXiv

Data-driven Ranking and Selection under Input Uncertainty

We consider a simulation-based Ranking and Selection (R&S) problem with input uncertainty, where unknown input distributions can be estimated using input data arriving in batches of varying sizes over time. Each time a batch arrives, additional simulations can be run using updated input distribution estimates. The goal is to confidently identify the best design after collecting as few batches as possible. We first introduce a moving average estimator for aggregating simulation outputs generated under heterogeneous input distributions. Then, based on a Sequential Elimination framework, we devise two major R&S procedures by establishing exact and asymptotic confidence bands for the estimator. In deriving the latter confidence bands, we incorporate the result of "Multiple Comparison with Best" and establish an asymptotic normality result which explicitly characterizes the tradeoff between input uncertainty and stochastic uncertainty in an online environment. We also extend our procedures to the indifference zone setting, which helps save simulation effort for practical usage. Numerical results show the effectiveness and necessity of our procedures. Moreover, the efficiency can be further boosted through optimizing the "drop rate" parameter of the estimator.

preprint | 2022 | arXiv

DeepC2: AI-powered Covert Command and Control on OSNs

Command and control (C&C) is important in an attack. It transfers commands from the attacker to the malware in the compromised hosts. Currently, some attackers use online social networks (OSNs) in C&C tasks. There are two main problems in the C&C on OSNs. First, the process for the malware to find the attacker is reversible. If the malware sample is analyzed by the defender, the attacker would be exposed before publishing the commands. Second, the commands in plain or encrypted form are regarded as abnormal contents by OSNs, which would raise anomalies and trigger restrictions on the attacker. The defender can limit the attacker once it is exposed. In this work, we propose DeepC2, an AI-powered C&C on OSNs, to solve these problems. For the reversible hard-coding, the malware finds the attacker using a neural network model. The attacker's avatars are converted into a batch of feature vectors, and the defender cannot recover the avatars in advance using the model and the feature vectors. To solve the abnormal contents on OSNs, hash collision and text data augmentation are used to embed commands into normal contents. The experiment on Twitter shows that command-embedded tweets can be generated efficiently. The malware can find the attacker covertly on OSNs. Security analysis shows it is hard to recover the attacker's identifiers in advance.

preprint | 2022 | arXiv

Differentially Private AUC Computation in Vertical Federated Learning

Federated learning has gained great attention recently as a privacy-enhancing tool to jointly train a machine learning model by multiple parties. As a sub-category, vertical federated learning (vFL) focuses on the scenario where features and labels are split into different parties. The prior work on vFL has mostly studied how to protect label privacy during model training. However, model evaluation in vFL might also lead to potential leakage of private label information. One mitigation strategy is to apply label differential privacy (DP) but it gives bad estimations of the true (non-private) metrics. In this work, we propose two evaluation algorithms that can more accurately compute the widely used AUC (area under curve) metric when using label DP in vFL. Through extensive experiments, we show our algorithms can achieve more accurate AUCs compared to the baselines.
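The evaluation problem in the abstract above can be illustrated with a small sketch: AUC computed from its pairwise definition, and binary randomized response as one standard way to obtain label differential privacy. This only shows the naive baseline whose noisy labels distort the metric; it is not either of the paper's two algorithms, and the function names are assumptions for this example.

```python
import math
import random


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def randomized_response(labels, epsilon, rng):
    """Keep each binary label with probability e^eps / (1 + e^eps), else flip it."""
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return [y if rng.random() < keep else 1 - y for y in labels]


rng = random.Random(0)
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
true_auc = auc(scores, labels)
noisy_labels = randomized_response(labels, epsilon=1.0, rng=rng)
```

Evaluating `auc(scores, noisy_labels)` on a realistically sized dataset would generally differ from `true_auc`, which is the estimation gap the abstract refers to.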

preprint | 2022 | arXiv

DLME: Deep Local-flatness Manifold Embedding

Manifold learning (ML) aims to seek a low-dimensional embedding from high-dimensional data. The problem is challenging on real-world datasets, especially with under-sampled data, and we find that previous methods perform poorly in this case. Generally, ML methods first transform input data into a low-dimensional embedding space to maintain the data's geometric structure and subsequently perform downstream tasks therein. The poor local connectivity of under-sampled data in the former step and inappropriate optimization objectives in the latter step lead to two problems: structural distortion and underconstrained embedding. This paper proposes a novel ML framework named Deep Local-flatness Manifold Embedding (DLME) to solve these problems. The proposed DLME constructs semantic manifolds by data augmentation and overcomes the structural distortion problem using a smoothness constraint based on a local flatness assumption about the manifold. To overcome the underconstrained embedding problem, we design a loss and theoretically demonstrate that it leads to a more suitable embedding based on the local flatness. Experiments on three types of datasets (toy, biological, and image) for various downstream tasks (classification, clustering, and visualization) show that our proposed DLME outperforms state-of-the-art ML and contrastive learning methods.

preprint2022arXiv

Elucidating the Degradation Mechanism of Gd2Zr2O7 Waste Form under Multi-Energy He Ion Irradiation

We studied the microstructural and helium bubbling evolutions of Gd2Zr2O7 waste form with immobilized TRPO (50 wt%) under multi-energy He ion irradiation. Three structurally heterogeneous regions for the Gd2Zr2O7 waste form were found as a function of the depth from the He-irradiated surface. Specifically, at a depth less than 40 nm below the He-irradiated surface (Region I) the Gd2Zr2O7 waste form is completely amorphous with large spherical He bubbles (5-25 nm). In the intermediate region, Region II (40-800 nm), a partially amorphized Gd2Zr2O7 waste form accompanied by ribbon-like He bubbles that may lead to the formation of microcracks is observed. The crystallinity is not impacted in Region III for a depth of more than 800 nm. For the first time, we elucidated that the Gd2Zr2O7 waste form, which was considered to be structurally intact at 100 dpa, is completely amorphized at 6.5 dpa with the synergistic displacement damage, electronic energy loss, and He concentration enabled. This study leads to new physical insights into the amorphization and He bubble formation mechanisms of Gd2Zr2O7 waste form under multi-energy He irradiation, which is essential for the design and optimization of irradiation-resistant ceramic waste matrices.

preprint2022arXiv

Environment-Aware Hybrid Beamforming by Leveraging Channel Knowledge Map

Hybrid analog/digital beamforming is a promising technique to realize millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems cost-effectively. However, existing hybrid beamforming designs mainly rely on real-time channel training or beam sweeping to find the desired beams, which incurs prohibitive overhead due to a large number of antennas at both the transmitter and receiver with only limited radio frequency (RF) chains. To resolve this challenging issue, in this paper, we propose a new environment-aware hybrid beamforming technique that requires only light real-time training, by leveraging the useful tool of channel knowledge map (CKM) with the user's location information. CKM is a site-specific database, which offers location-specific channel-relevant information to facilitate or even obviate the acquisition of real-time channel state information (CSI). Two specific types of CKM are proposed in this paper for hybrid beamforming design in mmWave massive MIMO systems, namely channel angle map (CAM) and beam index map (BIM). It is shown that compared with existing environment-unaware schemes, the proposed environment-aware hybrid beamforming scheme based on CKM can drastically improve the effective communication rate, even under moderate user location errors, thanks to its great saving of the prohibitive real-time training overhead.

preprint2022arXiv

FedAdapt: Adaptive Offloading for IoT Devices in Federated Learning

Applying Federated Learning (FL) on Internet-of-Things devices is necessitated by the large volumes of data they produce and growing concerns of data privacy. However, there are three challenges that need to be addressed to make FL efficient: (i) execution on devices with limited computational capabilities, (ii) accounting for stragglers due to computational heterogeneity of devices, and (iii) adaptation to the changing network bandwidths. This paper presents FedAdapt, an adaptive offloading FL framework to mitigate the aforementioned challenges. FedAdapt accelerates local training in computationally constrained devices by leveraging layer offloading of deep neural networks (DNNs) to servers. Further, FedAdapt adopts reinforcement learning based optimization and clustering to adaptively identify which layers of the DNN should be offloaded for each individual device on to a server to tackle the challenges of computational heterogeneity and changing network bandwidth. Experimental studies are carried out on a lab-based testbed and it is demonstrated that by offloading a DNN from the device to the server FedAdapt reduces the training time of a typical IoT device by over half compared to classic FL. The training time of extreme stragglers and the overall training time can be reduced by up to 57%. Furthermore, with changing network bandwidth, FedAdapt is demonstrated to reduce the training time by up to 40% when compared to classic FL, without sacrificing accuracy.

preprint2022arXiv

FedComm: Understanding Communication Protocols for Edge-based Federated Learning

Federated learning (FL) trains machine learning (ML) models on devices using locally generated data and exchanges models without transferring raw data to a distant server. This exchange incurs a communication overhead and impacts the performance of FL training. There is limited understanding of how communication protocols specifically contribute to the performance of FL. Such an understanding is essential for selecting the right communication protocol when designing an FL system. This paper presents FedComm, a benchmarking methodology to quantify the impact of optimized application layer protocols, namely Message Queue Telemetry Transport (MQTT), Advanced Message Queuing Protocol (AMQP), and ZeroMQ Message Transport Protocol (ZMTP), and non-optimized application layer protocols, namely TCP and UDP, on the performance of FL. FedComm measures the overall performance of FL in terms of communication time and accuracy under varying computational and network stress and packet loss rates. Experiments on a lab-based testbed demonstrate that TCP outperforms UDP as a non-optimized application layer protocol with higher accuracy and shorter communication times for 4G and Wi-Fi networks. Optimized application layer protocols such as AMQP, MQTT, and ZMTP outperformed non-optimized application layer protocols in most network conditions, resulting in a 2.5x reduction in communication time compared to TCP while maintaining accuracy. The experimental results enable us to highlight a number of open research issues for further investigation. FedComm is available for download from https://github.com/qub-blesson/FedComm.

preprint2022arXiv

FedFly: Towards Migration in Edge-based Distributed Federated Learning

Federated learning (FL) is a privacy-preserving distributed machine learning technique that trains models while keeping all the original data generated on devices locally. Since devices may be resource constrained, offloading can be used to improve FL performance by transferring computational workload from devices to edge servers. However, due to mobility, devices participating in FL may leave the network during training and need to connect to a different edge server. This is challenging because the computations offloaded to the edge server need to be migrated. To address this, we present FedFly, which is, to the best of our knowledge, the first work to migrate a deep neural network (DNN) when devices move between edge servers during FL training. Our empirical results on the CIFAR10 dataset, with both balanced and imbalanced data distribution, support our claims that FedFly can reduce training time by up to 33% when a device moves after 50% of the training is completed, and by up to 45% when 90% of the training is completed, when compared to the state-of-the-art offloading approach in FL. FedFly has negligible overhead of up to two seconds and does not compromise accuracy. Finally, we highlight a number of open research issues for further investigation. FedFly can be downloaded from https://github.com/qub-blesson/FedFly.

preprint2022arXiv

FedMCSA: Personalized Federated Learning via Model Components Self-Attention

Federated learning (FL) facilitates multiple clients to jointly train a machine learning model without sharing their private data. However, Non-IID data of clients presents a tough challenge for FL. Existing personalized FL approaches rely heavily on the default treatment of one complete model as a basic unit and ignore the significance of different layers on Non-IID data of clients. In this work, we propose a new framework, federated model components self-attention (FedMCSA), to handle Non-IID data in FL, which employs model components self-attention mechanism to granularly promote cooperation between different clients. This mechanism facilitates collaboration between similar model components while reducing interference between model components with large differences. We conduct extensive experiments to demonstrate that FedMCSA outperforms the previous methods on four benchmark datasets. Furthermore, we empirically show the effectiveness of the model components self-attention mechanism, which is complementary to existing personalized FL and can significantly improve the performance of FL.

preprint2022arXiv

Graph-incorporated Latent Factor Analysis for High-dimensional and Sparse Matrices

A high-dimensional and sparse (HiDS) matrix is frequently encountered in big data-related applications such as e-commerce systems or social network services. Performing highly accurate representation learning on it is of great significance for extracting latent knowledge and patterns from it. Latent factor analysis (LFA), which represents an HiDS matrix by learning low-rank embeddings based on its observed entries only, is one of the most effective and efficient approaches to this issue. However, most existing LFA-based models perform such embeddings on an HiDS matrix directly without exploiting its hidden graph structures, thereby resulting in accuracy loss. To address this issue, this paper proposes a graph-incorporated latent factor analysis (GLFA) model. It adopts two ideas: 1) a graph is constructed for identifying the hidden high-order interaction (HOI) among nodes described by an HiDS matrix, and 2) a recurrent LFA structure is carefully designed with the incorporation of HOI, thereby improving the representation learning ability of the resultant model. Experimental results on three real-world datasets demonstrate that GLFA outperforms six state-of-the-art models in predicting the missing data of an HiDS matrix, which supports its strong representation learning ability on HiDS data.

preprint2022arXiv

HIFI-Net: A Novel Network for Enhancement to Underwater Images

A novel network for enhancement of underwater images is proposed in this paper. It contains a Reinforcement Fusion Module for Haar wavelet images (RFM-Haar) based on a Reinforcement Fusion Unit (RFU), which is used to fuse an original image and some important information within it. Fusion is achieved for better enhancement. As this network makes "Haar Images into Fusion Images", it is called HIFI-Net. The experimental results show the proposed HIFI-Net performs best among many state-of-the-art methods on three datasets on three standard metrics and a new metric.

preprint2022arXiv

Hunting for extra dimensions in the shadow of Sagittarius A*

Recently, Vagnozzi and Visinelli's work [Phys. Rev. D 100, 024020 (2019)] revealed that M87*'s shadow establishes an upper limit of $l \lesssim 170$ AU, where $l$ is the AdS$_5$ curvature radius and 1 AU is one astronomical unit. The Event Horizon Telescope, on the other hand, just captured the first image of the shadow of Sagittarius A* (SgrA*), the Galactic center source associated with a supermassive black hole. In this paper, we are motivated to comprehensively explore a new upper limit of $l$ with the shadow of SgrA*, and the findings suggest that $l \lesssim 0.097$ AU. Our results improve accuracy by three orders of magnitude. This is also one of the first quantifiable limitations on exotic physics derived from the remarkable first image of the shadow of SgrA*.
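As a quick check of the "three orders of magnitude" statement, using only the two upper limits quoted in the abstract (the rounding is ours):

$$\frac{170\ \mathrm{AU}}{0.097\ \mathrm{AU}} \approx 1.75 \times 10^{3}, \qquad \log_{10}\!\left(1.75 \times 10^{3}\right) \approx 3.2 .$$

The SgrA* bound is therefore tighter than the M87* bound by a factor of roughly 1750, which is slightly more than three orders of magnitude.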

preprint2022arXiv

Impression Allocation and Policy Search in Display Advertising

In online display advertising, guaranteed contracts and real-time bidding (RTB) are two major ways to sell impressions for a publisher. For large publishers, simultaneously selling impressions through both guaranteed contracts and in-house RTB has become a popular choice. Generally speaking, a publisher needs to derive an impression allocation strategy between guaranteed contracts and RTB to maximize its overall outcome (e.g., revenue and/or impression quality). However, deriving the optimal strategy is not a trivial task, e.g., the strategy should encourage incentive compatibility in RTB and tackle common challenges in real-world applications such as unstable traffic patterns (e.g., impression volume and bid landscape changing). In this paper, we formulate impression allocation as an auction problem where each guaranteed contract submits virtual bids for individual impressions. With this formulation, we derive the optimal bidding functions for the guaranteed contracts, which result in the optimal impression allocation. In order to address the unstable traffic pattern challenge and achieve the optimal overall outcome, we propose a multi-agent reinforcement learning method to adjust the bids from each guaranteed contract, which is simple, converging efficiently and scalable. The experiments conducted on real-world datasets demonstrate the effectiveness of our method.

preprint2022arXiv

Large Exchange Bias Effect and Coverage-Dependent Interfacial Coupling in CrI3/MnBi2Te4 van der Waals Heterostructures

Igniting interface magnetic ordering of magnetic topological insulators by building a van der Waals heterostructure can help to reveal novel quantum states and design functional devices. Here, we observe an interesting exchange bias effect, indicating successful interfacial magnetic coupling, in CrI3/MnBi2Te4 ferromagnetic insulator/antiferromagnetic topological insulator (FMI/AFM-TI) heterostructure devices. The devices originally exhibit a negative exchange bias field, which decays with increasing temperature and is unaffected by the back-gate voltage. When we change the device configuration to be half-covered by CrI3, the exchange bias becomes positive with a very large exchange bias field exceeding 300 mT. Such sensitive manipulation is explained by the competition between the FM and AFM coupling at the interface of CrI3 and MnBi2Te4, pointing to coverage-dependent interfacial magnetic interactions. Our work will facilitate the development of topological and antiferromagnetic devices.

preprint2022arXiv

Learning to Simulate Unseen Physical Systems with Graph Neural Networks

Simulation of the dynamics of physical systems is essential to the development of both science and engineering. Recently there is an increasing interest in learning to simulate the dynamics of physical systems using neural networks. However, existing approaches fail to generalize to physical substances not in the training set, such as liquids with different viscosities or elastomers with different elasticities. Here we present a machine learning method embedded with physical priors and material parameters, which we term as "Graph-based Physics Engine" (GPE), to efficiently model the physical dynamics of different substances in a wide variety of scenarios. We demonstrate that GPE can generalize to materials with different properties not seen in the training set and perform well from single-step predictions to multi-step roll-out simulations. In addition, introducing the law of momentum conservation in the model significantly improves the efficiency and stability of learning, allowing convergence to better models with fewer training steps.

preprint2022arXiv

Magnetic phase transition induced ferroelectric polarization in BaFeF4 with room temperature weak ferromagnetism

The BaMF4 (M=Fe, Co, Ni and Mn) family are typical multiferroic materials, having antiferromagnetism at around liquid nitrogen temperature. In this work, polycrystalline BaFeF4 has been prepared by solid state reaction. The slight deficiency of Fe leads to the coexistence of valence states of +2 and +3, facilitating the electrons to hop between the neighboring Fe2+ and Fe3+ ions through the middle F- ion, leading to a strong double exchange interaction with weak ferromagnetism above room temperature. A bifurcation at about 170 K between the zero-field-cooled and field-cooled temperature dependent magnetization curves indicates the onset of 2-dimensional antiferromagnetism, which is completed at about 125 K with the sudden drop of magnetization. Although BaFeF4 is a type-I multiferroic, its magnetoelectricity can be evidenced by the pyroelectric current, which shows a peak starting at about 170 K and finishing at about 125 K. A saturated ferroelectric polarization change of around 34 μC/m2 is observed, which is switchable by the reversed poling electric field and decreases to about 30 μC/m2 under a magnetic field of 90 kOe. This magnetoelectricity can be qualitatively reproduced by first-principles calculations. Our results represent substantial progress in the search for high-temperature multiferroics in ferroelectric fluorides.

preprint2022arXiv

MALICE: Manipulation Attacks on Learned Image ComprEssion

Deep learning techniques have shown promising results in image compression, with competitive bitrate and image reconstruction quality from compressed latent. However, while image compression has progressed towards a higher peak signal-to-noise ratio (PSNR) and fewer bits per pixel (bpp), their robustness to adversarial images has never received deliberation. In this work, we, for the first time, investigate the robustness of image compression systems where imperceptible perturbation of input images can precipitate a significant increase in the bitrate of their compressed latent. To characterize the robustness of state-of-the-art learned image compression, we mount white-box and black-box attacks. Our white-box attack employs fast gradient sign method on the entropy estimation of the bitstream as its bitrate approximation. We propose DCT-Net simulating JPEG compression with architectural simplicity and lightweight training as the substitute in the black-box attack and enable fast adversarial transferability. Our results on six image compression models, each with six different bitrate qualities (thirty-six models in total), show that they are surprisingly fragile, where the white-box attack achieves up to 56.326x and black-box 1.947x bpp change. To improve robustness, we propose a novel compression architecture factorAtn which incorporates attention modules and a basic factorized entropy model, resulting in a promising trade-off between the rate-distortion performance and robustness to adversarial attacks that surpasses existing learned image compressors.

preprint2022arXiv

MirrorAlign: A Super Lightweight Unsupervised Word Alignment Model via Cross-Lingual Contrastive Learning

Word alignment is essential for the downstream cross-lingual language understanding and generation tasks. Recently, the performance of the neural word alignment models has exceeded that of statistical models. However, they heavily rely on sophisticated translation models. In this study, we propose a super lightweight unsupervised word alignment model named MirrorAlign, in which bidirectional symmetric attention trained with a contrastive learning objective is introduced, and an agreement loss is employed to bind the attention maps, such that the alignments follow a mirror-like symmetry hypothesis. Experimental results on several public benchmarks demonstrate that our model achieves competitive, if not better, performance compared to the state of the art in word alignment while significantly reducing the training and decoding time on average. Further ablation analysis and case studies show the superiority of our proposed MirrorAlign. Notably, we recognize our model as a pioneering attempt to unify bilingual word embedding and word alignments. Encouragingly, our approach achieves a 16.4X speedup against GIZA++, and 50X parameter compression compared with the Transformer-based alignment methods. We release our code to facilitate the community: https://github.com/moore3930/MirrorAlign.

preprint2022arXiv

New Approximation Algorithms for Fair $k$-median Problem

The fair $k$-median problem is one of the important clustering problems. The current best approximation ratio is 4.675 for this problem with 1-fair violation, which was proposed by Bercea et al. [APPROX-RANDOM'2019]. As far as we know, there is no available approximation algorithm for the problem without any fair violation. In this paper, we consider the fair $k$-median problem in bounded doubling metrics and general metrics. We provide the first QPTAS for fair $k$-median problem in doubling metrics. Based on the split-tree decomposition of doubling metrics, we present a dynamic programming process to find the candidate centers, and apply min-cost max-flow method to deal with the assignment of clients. Especially, for overcoming the difficulties caused by the fair constraints, we construct an auxiliary graph and use minimum weighted perfect matching to get part of the cost. For the fair $k$-median problem in general metrics, we present an approximation algorithm with ratio $O(\log k)$, which is based on the embedding of given space into tree metrics, and the dynamic programming method. Our two approximation algorithms for the fair $k$-median problem are the first results for the corresponding problems without any fair violation, respectively.

preprint2022arXiv

Object Localization under Single Coarse Point Supervision

Point-based object localization (POL), which pursues high-performance object sensing under low-cost data annotation, has attracted increased attention. However, the point annotation mode inevitably introduces semantic variance because of the inconsistency of annotated points. Existing POL methods heavily rely on accurate key-point annotations, which are difficult to define. In this study, we propose a POL method using coarse point annotations, relaxing the supervision signals from accurate key points to freely spotted points. To this end, we propose a coarse point refinement (CPR) approach, which to the best of our knowledge is the first attempt to alleviate semantic variance from an algorithmic perspective. CPR constructs point bags, selects semantic-correlated points, and produces semantic center points through multiple instance learning (MIL). In this way, CPR defines a weakly supervised evolution procedure, which ensures training a high-performance object localizer under coarse point supervision. Experimental results on COCO, DOTA and our proposed SeaPerson dataset validate the effectiveness of the CPR approach. The dataset and code will be available at https://github.com/ucas-vg/PointTinyBenchmark/.

preprint2022arXiv

On coloring of graphs of girth 2l + 1 without longer odd holes

A hole is an induced cycle of length at least 4. Let $\ell\ge 2$ be a positive integer, let ${\cal G}_\ell$ denote the family of graphs which have girth $2\ell+1$ and have no holes of odd length at least $2\ell+3$, and let $G\in {\cal G}_\ell$. For a vertex $u\in V(G)$ and a nonempty set $S\subseteq V(G)$, let $d(u, S)=\min\{d(u, v):v\in S\}$, and let $L_i(S)=\{u\in V(G) \mbox{ and } d(u, S)=i\}$ for any integer $i\ge 0$. We show that if $G[S]$ is connected and $G[L_i(S)]$ is bipartite for each $i\in\{1, \ldots, \lfloor{\ell\over 2}\rfloor\}$, then $G[L_i(S)]$ is bipartite for each $i>0$, and consequently $\chi(G)\le 4$, where $G[S]$ denotes the subgraph induced by $S$. Let $\theta^-$ be the graph obtained from the Petersen graph by deleting three vertices which induce a path, let $\theta^+$ be the graph obtained from the Petersen graph by deleting two adjacent vertices, and let $\theta$ be the graph obtained from $\theta^+$ by removing an edge incident with two vertices of degree 3. For a graph $G\in{\cal G}_2$, we show that if $G$ is 3-connected and has no unstable 3-cutset then $G$ must induce either $\theta$ or $\theta^-$ but does not induce $\theta^+$. As corollaries, $\chi(G)\le 3$ for every graph $G$ of ${\cal G}_2$ that induces neither $\theta$ nor $\theta^-$, and minimal non-3-colorable graphs of ${\cal G}_2$ induce no $\theta^+$.
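The distance levels $L_i(S)$ and the bipartiteness check on $G[L_i(S)]$ used in the statement above can be sketched with two breadth-first searches. This is only an illustration of the definitions, not of the paper's proof; the adjacency-dictionary representation and the function names are assumptions made for the example.

```python
from collections import deque


def distance_levels(adj, S):
    """Return {i: L_i(S)}, the vertices at distance exactly i from the set S."""
    dist = {v: 0 for v in S}
    queue = deque(S)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    levels = {}
    for v, d in dist.items():
        levels.setdefault(d, set()).add(v)
    return levels


def is_bipartite(adj, vertices):
    """Try to 2-colour the subgraph induced by `vertices`; True iff it has no odd cycle."""
    vertices = set(vertices)
    colour = {}
    for start in vertices:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in vertices:
                    continue
                if w not in colour:
                    colour[w] = 1 - colour[u]
                    queue.append(w)
                elif colour[w] == colour[u]:
                    return False
    return True


# The 5-cycle has girth 5, i.e. the case 2l + 1 with l = 2.
c5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}

if __name__ == "__main__":
    levels = distance_levels(c5, [0])
    for i in sorted(levels):
        print(i, sorted(levels[i]), is_bipartite(c5, levels[i]))
```

On the 5-cycle with $S=\{0\}$, every level induces a bipartite subgraph even though the cycle itself is not bipartite, which is the kind of layer-by-layer structure the colouring bound exploits.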

preprint2022arXiv

Optimal Measurement of Drone Swarm in RSS-based Passive Localization with Region Constraints

Passive geolocation by multiple unmanned aerial vehicles (UAVs) covers a wide range of military and civilian applications including rescue, wildlife tracking and electronic warfare. The sensor-target geometry is known to significantly affect the localization precision. The existing sensor placement strategies mainly address cases without any constraints on the sensor locations. However, UAVs cannot fly or hover in arbitrary regions due to realistic constraints, such as geographical limitations, security issues, and the maximum flying speed. In this paper, optimal geometrical configurations of UAVs in received signal strength (RSS)-based localization under region constraints are investigated. Employing the D-optimal criterion, i.e., maximizing the determinant of the Fisher information matrix (FIM), the optimization problem is formulated. Based on rigorous algebraic and geometrical derivations, optimal closed-form configurations of UAVs under different flying states are proposed. Finally, the effectiveness and practicality of the proposed configurations are demonstrated by simulation examples.
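As an illustration of how the D-optimality criterion scores a sensor geometry (a larger FIM determinant is better), the sketch below evaluates the 2x2 FIM for a 2-D target. It is not the paper's derivation. It assumes the standard log-distance path-loss model $P_i = P_0 - 10\,n\log_{10} d_i$ with Gaussian noise of standard deviation $\sigma$, ignores region constraints, and uses hypothetical parameter values and function names.

```python
import math


def rss_fim(sensors, target, path_loss_exp=2.0, sigma_db=4.0):
    """2x2 Fisher information matrix of the target position under the assumed RSS model."""
    # Scale factor from differentiating 10*n*log10(d) and dividing by the noise std.
    c = (10.0 * path_loss_exp / (sigma_db * math.log(10.0))) ** 2
    fxx = fxy = fyy = 0.0
    tx, ty = target
    for sx, sy in sensors:
        dx, dy = tx - sx, ty - sy
        d2 = dx * dx + dy * dy
        if d2 == 0.0:
            raise ValueError("sensor coincides with target")
        # The gradient of the mean RSS is proportional to (dx, dy) / d^2.
        fxx += c * dx * dx / (d2 * d2)
        fxy += c * dx * dy / (d2 * d2)
        fyy += c * dy * dy / (d2 * d2)
    return ((fxx, fxy), (fxy, fyy))


def det2(m):
    """Determinant of a 2x2 matrix given as nested tuples."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def two_sensors(angle_deg, radius=100.0):
    """Two sensors at equal range from the origin, separated by angle_deg."""
    a = math.radians(angle_deg)
    return [(radius, 0.0), (radius * math.cos(a), radius * math.sin(a))]


if __name__ == "__main__":
    for angle in (0, 30, 60, 90, 180):
        d = det2(rss_fim(two_sensors(angle), (0.0, 0.0)))
        print(angle, d)
```

For two sensors at equal range the determinant scales with $\sin^2$ of the angular separation, so a 90-degree separation scores highest and collinear placements score zero. Region constraints matter precisely when they rule out such ideal geometries.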

preprint2022arXiv

P2P-Loc: Point to Point Tiny Person Localization

Bounding-box annotation has been the most frequently used form of annotation for visual object localization tasks. However, it relies on a large number of precisely annotated bounding boxes, which is expensive and laborious. It is impractical in many real scenarios and even redundant for some applications (such as tiny person localization) where the size does not matter. Therefore, we propose a novel point-based framework for the person localization task by annotating each person as a coarse point (CoarsePoint), which can be any point within the object extent, instead of an accurate bounding box. Then, the network predicts the person's location as a 2D coordinate in the image. Although this greatly simplifies the data annotation pipeline, the CoarsePoint annotation inevitably decreases label reliability (label uncertainty) and causes network confusion during training. As a result, we propose a point self-refinement approach that iteratively updates point annotations in a self-paced way. The proposed refinement system alleviates the label uncertainty and progressively improves localization performance. Experimental results show that our approach has achieved comparable object localization performance while saving up to 80% of annotation cost.

preprint2022arXiv

Predicting Peak Day and Peak Hour of Electricity Demand with Ensemble Machine Learning

Battery energy storage systems can be used for peak demand reduction in power systems, leading to significant economic benefits. Two practical challenges are 1) accurately determining the peak load days and hours and 2) quantifying and reducing uncertainties associated with the forecast in probabilistic risk measures for dispatch decision-making. In this study, we develop a supervised machine learning approach to generate 1) the probability of the next operation day containing the peak hour of the month and 2) the probability of an hour to be the peak hour of the day. Guidance is provided on the preparation and augmentation of data as well as the selection of machine learning models and decision-making thresholds. The proposed approach is applied to the Duke Energy Progress system and successfully captures 69 peak days out of 72 testing months with a 3% exceedance probability threshold. On 90% of the peak days, the actual peak hour is among the 2 hours with the highest probabilities.

preprint2022arXiv

Solutions of Nonlinear Optimal Control Problems Using Quasilinearization and Fenchel Duality

In this paper, we consider a special class of nonlinear optimal control problems, where the control variables are box-constrained and the objective functional is strongly convex with respect to the control variables and separable with respect to the state variables and control variables. Using the quasilinearization method, we convert the original nonlinear problem into a sequence of constrained linear-quadratic optimal control problems. To solve each linear-quadratic problem efficiently, we study its dual problem. We formulate the dual problem by the scheme of Fenchel duality and prove the strong duality property and the saddle point property relating the primal and dual problems, which together ensure that solving the dual problem is effective. Thus, solving the sequence of control-constrained linear-quadratic optimal control problems obtained by quasilinearization is replaced by solving the sequence of their dual problems. We solve the sequence of dual problems and obtain the solution to each primal control-constrained linear-quadratic problem by the saddle point property. Furthermore, we prove that the solutions of the subproblems converge to a solution of the optimality conditions of the original nonlinear problem. We then carry out numerical experiments using the present approach; for each subproblem, we formulate the discretized primal and dual problems by the Euler discretization scheme. The efficiency of the present method is validated by numerical results.

preprint2022arXiv

Time Series Anomaly Detection via Reinforcement Learning-Based Model Selection

Time series anomaly detection has been recognized as of critical importance for the reliable and efficient operation of real-world systems. Many anomaly detection methods have been developed based on various assumptions on anomaly characteristics. However, due to the complex nature of real-world data, different anomalies within a time series usually have diverse profiles supporting different anomaly assumptions. This makes it difficult to find a single anomaly detector that can consistently outperform other models. In this work, to harness the benefits of different base models, we propose a reinforcement learning-based model selection framework. Specifically, we first learn a pool of different anomaly detection models, and then utilize reinforcement learning to dynamically select a candidate model from these base models. Experiments on real-world data have demonstrated that the proposed strategy can indeed outplay all baseline models in terms of overall performance.

preprint2022arXiv

Unconventional Excitonic States with Phonon Sidebands in Layered Silicon Diphosphide

Many-body interactions between quasiparticles (electrons, excitons, and phonons) have led to the emergence of new complex correlated states and are at the core of condensed matter physics and material science. In low-dimensional materials, unique electronic properties for these correlated states could significantly affect their optical properties. Herein, combining photoluminescence, optical reflection measurements and theoretical calculations, we demonstrate an unconventional excitonic state and its bound phonon sideband in layered silicon diphosphide (SiP$_2$), in which the bound electron-hole pair is composed of electrons confined within one-dimensional phosphorus$-$phosphorus chains and holes extended in two-dimensional SiP$_2$ layers. The excitonic state and the emergent phonon sideband show linear dichroism and large energy redshifts with increasing temperature. Within the $GW$ plus Bethe$-$Salpeter equation calculations and solving the generalized Holstein model non-perturbatively, we confirm that the observed sideband feature results from the correlated interaction between excitons and optical phonons. Such a layered material provides a new platform to study excitonic physics and many-particle effects.

preprint2022arXiv

WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit

Recently, we made available WeNet, a production-oriented end-to-end speech recognition toolkit, which introduces a unified two-pass (U2) framework and a built-in runtime to address the streaming and non-streaming decoding modes in a single model. To further improve ASR performance and facilitate various production requirements, in this paper, we present WeNet 2.0 with four important updates. (1) We propose U2++, a unified two-pass framework with bidirectional attention decoders, which includes the future contextual information by a right-to-left attention decoder to improve the representative ability of the shared encoder and the performance during the rescoring stage. (2) We introduce an n-gram based language model and a WFST-based decoder into WeNet 2.0, promoting the use of rich text data in production scenarios. (3) We design a unified contextual biasing framework, which leverages user-specific context (e.g., contact lists) to provide rapid adaptation ability for production and improves ASR accuracy in both with-LM and without-LM scenarios. (4) We design a unified IO to support large-scale data for effective model training. In summary, the brand-new WeNet 2.0 achieves up to 10% relative recognition performance improvement over the original WeNet on various corpora and makes available several important production-oriented features.

preprint2022arXiv

WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition

In this paper, we present WenetSpeech, a multi-domain Mandarin corpus consisting of 10000+ hours of high-quality labeled speech, 2400+ hours of weakly labeled speech, and about 10000 hours of unlabeled speech, with 22400+ hours in total. We collect the data from YouTube and Podcast, which covers a variety of speaking styles, scenarios, domains, topics, and noisy conditions. An optical character recognition (OCR) based method is introduced to generate the audio/text segmentation candidates for the YouTube data on its corresponding video captions, while a high-quality ASR transcription system is used to generate audio/text pair candidates for the Podcast data. Then we propose a novel end-to-end label error detection approach to further validate and filter the candidates. We also provide three manually labelled high-quality test sets along with WenetSpeech for evaluation -- Dev for cross-validation purposes in training, Test_Net, collected from the Internet for a matched test, and Test_Meeting, recorded from real meetings for a more challenging mismatched test. Baseline systems trained with WenetSpeech are provided for three popular speech recognition toolkits, namely Kaldi, ESPnet, and WeNet, and recognition results on the three test sets are also provided as benchmarks. To the best of our knowledge, WenetSpeech is currently the largest open-sourced Mandarin speech corpus with transcriptions, which benefits research on production-level speech recognition.

preprint2021arXiv

End-to-End Learning for Simultaneously Generating Decision Map and Multi-Focus Image Fusion Result

The general aim of multi-focus image fusion is to gather focused regions of different images to generate a unique all-in-focus fused image. Deep learning based methods have become the mainstream of image fusion by virtue of their powerful feature representation ability. However, most of the existing deep learning structures fail to balance fusion quality and end-to-end implementation convenience. End-to-end decoder design often leads to unrealistic results because of its non-linear mapping mechanism. On the other hand, generating an intermediate decision map achieves better quality for the fused image, but relies on rectification with empirical post-processing parameter choices. In this work, to handle the requirements of both output image quality and comprehensive simplicity of structure implementation, we propose a cascade network to simultaneously generate the decision map and the fused result with an end-to-end training procedure. It avoids the dependence on empirical post-processing methods in the inference stage. To improve the fusion quality, we introduce a gradient aware loss function to preserve gradient information in the output fused image. In addition, we design a decision calibration strategy to decrease the time consumption in the application of multiple images fusion. Extensive experiments are conducted to compare with 19 different state-of-the-art multi-focus image fusion structures with 6 assessment metrics. The results prove that our designed structure can generally ameliorate the output fused image quality, while implementation efficiency increases over 30% for multiple images fusion.

preprint2021arXiv

Environment-Aware and Training-Free Beam Alignment for mmWave Massive MIMO via Channel Knowledge Map

Millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) communication systems are expected to achieve enormous transmission rates, provided that the transmit and receive beams are properly aligned with the MIMO channel. However, existing beam alignment techniques rely on either channel estimation or beam sweeping, which incur prohibitively high training overhead, especially for future wireless systems with further increased antenna dimensions and more stringent requirements on cost-effective hardware architectures. In this paper, we propose a new beam alignment technique, which is environment-aware and training-free, by utilizing the emerging concept of channel knowledge map (CKM), together with the user location information that is readily available in contemporary wireless systems. CKM is a site-specific database, tagged with the transmitter/receiver locations, which contains useful channel information to facilitate or even obviate real-time channel state information (CSI) acquisition. Two instances of CKM are proposed for beam alignment in mmWave massive MIMO systems, namely channel path map (CPM) and beam index map (BIM). It is shown that compared with existing training-based beam alignment schemes, the proposed CKM-enabled environment-aware beam alignment is able to drastically improve the effective communication rate, even with moderate user location errors, thanks to its significant saving of the prohibitive training overhead.

preprint2021arXiv

Gain without Pain: Offsetting DP-injected Nosies Stealthily in Cross-device Federated Learning

Federated Learning (FL) is an emerging paradigm through which decentralized devices can collaboratively train a common model. However, a serious concern is the leakage of privacy from gradient information exchanged between clients and the parameter server (PS) in FL. To protect gradient information, clients can adopt differential privacy (DP) to add noises that distort the original gradients before they are uploaded to the PS. Nevertheless, the model accuracy will be significantly impaired by DP noises, making DP impracticable in real systems. In this work, we propose a novel Noise Information Secretly Sharing (NISS) algorithm to alleviate the disturbance of DP noises by sharing negated noises among clients. We theoretically prove that: 1) If clients are trustworthy, DP noises can be perfectly offset on the PS; 2) Clients can easily distort negated DP noises to protect themselves in case other clients are not totally trustworthy, though at the cost of lower model accuracy. NISS is particularly applicable for FL across multiple IoT (Internet of Things) systems, in which all IoT devices need to collaboratively train a model. To verify the effectiveness and the superiority of the NISS algorithm, we conduct experiments with the MNIST and CIFAR-10 datasets. The experiment results verify our analysis and demonstrate that NISS can improve model accuracy by 21% on average and obtain better privacy protection if clients are trustworthy.
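
The offsetting idea in this abstract (negated noises shared among clients so that they cancel on the PS) can be illustrated with a toy sketch. It is not the NISS algorithm itself: the ring-shaped sharing pattern, the Gaussian noise, the scalar gradients and the name `toy_offset_round` are all assumptions made for illustration.

```python
import random

def toy_offset_round(gradients, noise_scale=1.0, seed=0):
    """Toy round: every client uploads a noisy gradient plus a neighbour's negated noise.

    Assumptions (not from the paper): scalar gradients, Gaussian noise,
    and a ring in which client i receives the negated noise of client i-1.
    """
    rng = random.Random(seed)
    n = len(gradients)
    noises = [rng.gauss(0.0, noise_scale) for _ in range(n)]
    uploads = []
    for i in range(n):
        received = -noises[(i - 1) % n]   # negated noise shared by the neighbour
        uploads.append(gradients[i] + noises[i] + received)
    return uploads

grads = [0.5, -1.2, 2.0, 0.1]
uploads = toy_offset_round(grads)
aggregate = sum(uploads)   # what an honest-client server would compute
```

Each individual upload is distorted, while the sum seen by the server matches the sum of the true gradients up to floating-point error, which is the "perfectly offset" case the abstract describes for trustworthy clients.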

preprint2021arXiv

Measuring Discrimination to Boost Comparative Testing for Multiple Deep Learning Models

The boom of DL technology leads to massive DL models built and shared, which facilitates the acquisition and reuse of DL models. For a given task, we encounter multiple DL models available with the same functionality, which are considered as candidates to achieve this task. Testers are expected to compare multiple DL models and select the more suitable ones w.r.t. the whole testing context. Due to the limitation of labeling effort, testers aim to select an efficient subset of samples to estimate the ranking of these models as precisely as possible. To tackle this problem, we propose Sample Discrimination based Selection (SDS) to select efficient samples that could discriminate multiple models, i.e., the prediction behaviors (right/wrong) of these samples would be helpful to indicate the trend of model performance. To evaluate SDS, we conduct an extensive empirical study with three widely-used image datasets and 80 real world DL models. The experimental results show that, compared with state-of-the-art baseline methods, SDS is an effective and efficient sample selection method to rank multiple DL models.

preprint2021arXiv

On the $L^\infty$ stability of Prandtl expansions in Gevrey class

In this paper, we prove the $L^\infty\cap L^2$ stability of Prandtl expansions of shear flow type as $\big(U(y/\sqrt{\nu}),0\big)$ for the initial perturbation in the Gevrey class, where $U(y)$ is a monotone and concave function and $\nu$ is the viscosity coefficient. To this end, we develop the direct resolvent estimate method for the linearized Orr-Sommerfeld operator instead of the Rayleigh-Airy iteration method introduced by Grenier, Guo and Nguyen.

preprint2021arXiv

On the Practicality of Differential Privacy in Federated Learning by Tuning Iteration Times

Although Federated Learning (FL) is well known for its privacy protection when training machine learning models among distributed clients collaboratively, recent studies have pointed out that naive FL is susceptible to gradient leakage attacks. Meanwhile, Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks. However, the adoption of DP by clients in FL may significantly jeopardize the model accuracy. It is still an open problem to understand the practicality of DP from a theoretical perspective. In this paper, we make the first attempt to understand the practicality of DP in FL through tuning the number of conducted iterations. Based on the FedAvg algorithm, we formally derive the convergence rate with DP noises in FL. Then, we theoretically derive: 1) the conditions for the DP based FedAvg to converge as the number of global iterations (GI) approaches infinity; 2) the method to set the number of local iterations (LI) to minimize the negative influence of DP noises. By further substituting the Laplace and Gaussian mechanisms into the derived convergence rate respectively, we show that: 3) The DP based FedAvg with the Laplace mechanism cannot converge, but the divergence rate can be effectively restrained by setting the number of LIs with our method; 4) The learning error of the DP based FedAvg with the Gaussian mechanism can finally converge to a constant if we use a fixed number of LIs per GI. To verify our theoretical findings, we conduct extensive experiments using two real-world datasets. The results not only validate our analysis results, but also provide useful guidelines on how to optimize model accuracy when incorporating DP into FL.
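
To make the roles of global iterations (GI), local iterations (LI) and DP noise concrete, here is a minimal sketch of FedAvg with a clipped, Gaussian-perturbed client update. It is a toy under stated assumptions, not the paper's analysis: the one-dimensional quadratic loss per client, the clipping step, and the names `clip_and_noise`, `local_update` and `dp_fedavg` are all hypothetical.

```python
import math
import random

def clip_and_noise(update, clip_norm, sigma, rng):
    """Clip an update to L2 norm clip_norm, then add Gaussian noise (std = sigma * clip_norm)."""
    norm = math.sqrt(sum(u * u for u in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * factor + rng.gauss(0.0, sigma * clip_norm) for u in update]

def local_update(w, target, lr, local_iters):
    """Run local_iters gradient steps on the toy loss 0.5 * (w - target)^2."""
    for _ in range(local_iters):
        w -= lr * (w - target)
    return w

def dp_fedavg(targets, global_iters, local_iters, lr, clip_norm, sigma, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(global_iters):
        updates = []
        for target in targets:            # one client per target
            delta = local_update(w, target, lr, local_iters) - w
            updates.append(clip_and_noise([delta], clip_norm, sigma, rng)[0])
        w += sum(updates) / len(updates)  # server averages the perturbed updates
    return w

w_clean = dp_fedavg([1.0, 2.0, 3.0], 200, 5, 0.1, clip_norm=10.0, sigma=0.0)
w_noisy = dp_fedavg([1.0, 2.0, 3.0], 200, 5, 0.1, clip_norm=10.0, sigma=0.05)
```

With `sigma=0.0` the toy model settles at the mean of the client targets; with a positive `sigma` it keeps fluctuating around that value, which gives a feel for the residual error the abstract refers to.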

preprint2021arXiv

Optimizing Video Caching at the Edge: A Hybrid Multi-Point Process Approach

It is always a challenging problem to deliver a huge volume of videos over the Internet. To meet the high bandwidth and stringent playback demand, one feasible solution is to cache video contents on edge servers based on predicted video popularity. Traditional caching algorithms (e.g., LRU, LFU) are too simple to capture the dynamics of video popularity, especially for long-tailed videos. Recent learning-driven caching algorithms (e.g., DeepCache) show promising performance; however, such black-box approaches lack explainability and interpretability. Moreover, the parameter tuning requires a large number of historical records, which are difficult to obtain for videos with low popularity. In this paper, we optimize video caching at the edge using a white-box approach, which is highly efficient and also completely explainable. To accurately capture the evolution of video popularity, we develop a mathematical model called the HRS model, which is the combination of multiple point processes, including Hawkes' self-exciting, reactive and self-correcting processes. The key advantages of the HRS model are its explainability and its far smaller number of model parameters. In addition, all its model parameters can be learned automatically through maximizing the log-likelihood function constructed from past video request events. Next, we further design an online HRS-based video caching algorithm. To verify its effectiveness, we conduct a series of experiments using real video traces collected from Tencent Video, one of the largest online video providers in China. Experiment results demonstrate that our proposed algorithm outperforms the state-of-the-art algorithms, with 12.3% improvement on average in terms of cache hit rate under realistic settings.
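
As a rough illustration of the self-exciting ingredient named in this abstract, the sketch below evaluates the conditional intensity of a Hawkes process. The exponential decay kernel and the parameter names `mu`, `alpha` and `beta` are assumptions; the abstract does not specify the kernel, and the reactive and self-correcting components of the HRS model are not represented here.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Self-exciting intensity: lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)).

    mu    : baseline request rate
    alpha : jump in intensity caused by each past request
    beta  : decay rate of that excitation
    Only events strictly before t contribute.
    """
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in events if s < t)

requests = [0.0, 0.4, 0.5]  # hypothetical request timestamps for one video
rate_soon = hawkes_intensity(0.6, requests, mu=0.5, alpha=1.0, beta=2.0)
rate_later = hawkes_intensity(5.0, requests, mu=0.5, alpha=1.0, beta=2.0)
```

Shortly after a burst of requests the predicted rate is well above the baseline, and it relaxes back toward `mu` as the requests age, which is the kind of popularity dynamic a cache could rank videos by.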

preprint2021arXiv

The design of the Ali CMB Polarization Telescope receiver

Ali CMB Polarization Telescope (AliCPT-1) is the first CMB degree-scale polarimeter to be deployed on the Tibetan plateau at 5,250m above sea level. AliCPT-1 is a 90/150 GHz 72 cm aperture, two-lens refracting telescope cooled down to 4 K. Alumina lenses, 800mm in diameter, image the CMB in a 33.4° field of view on a 636mm wide focal plane. The modularized focal plane consists of dichroic polarization-sensitive Transition-Edge Sensors (TESes). Each module includes 1,704 optically active TESes fabricated on a 150mm diameter silicon wafer. Each TES array is read out with a microwave multiplexing readout system capable of a multiplexing factor up to 2,048. Such a large multiplexing factor has allowed the practical deployment of tens of thousands of detectors, enabling the design of a receiver that can operate up to 19 TES arrays for a total of 32,376 TESes. AliCPT-1 leverages the technological advancements in the detector design from multiple generations of previously successful feedhorn-coupled polarimeters, and in the instrument design from BICEP-3, but applied on a larger scale. The cryostat receiver is currently under integration and testing. During the first deployment year, the focal plane will be populated with up to 4 TES arrays. Further TES arrays will be deployed in the following years, fully populating the focal plane with 19 arrays on the fourth deployment year. Here we present the AliCPT-1 receiver design, and how the design has been optimized to meet the experimental requirements.

preprint2021arXiv

Transition of laser-induced terahertz spin currents from torque- to conduction-electron-mediated transport

Spin transport is crucial for future spintronic devices operating at bandwidths up to the terahertz (THz) range. In F|N thin-film stacks made of a ferro/ferrimagnetic layer F and a normal-metal layer N, spin transport is mediated by (1) spin-polarized conduction electrons and/or (2) torque between electron spins. To identify a cross-over from (1) to (2), we study laser-driven spin currents in F|Pt stacks where F consists of model materials with different degrees of electrical conductivity. For the magnetic insulators YIG, GIG and maghemite, identical dynamics is observed. It arises from the THz interfacial spin Seebeck effect (SSE), is fully determined by the relaxation of the electrons in the metal layer and provides an estimate of the spin-mixing conductance of the GIG/Pt interface. Remarkably, in the half-metallic ferrimagnet Fe3O4 (magnetite), our measurements reveal two spin-current components with opposite direction. The slower, positive component exhibits SSE dynamics and is assigned to torque-type magnon excitation of the A- and B-spin sublattices of Fe3O4. The faster, negative component arises from the pyro-spintronic effect and can consistently be assigned to ultrafast demagnetization of e-sublattice minority-spin hopping electrons. This observation supports the magneto-electronic model of Fe3O4. In general, our results provide a new route to the contact-free separation of torque- and conduction-electron-mediated spin currents.

preprint2021arXiv

U2++: Unified Two-pass Bidirectional End-to-end Model for Speech Recognition

The unified streaming and non-streaming two-pass (U2) end-to-end model for speech recognition has shown great performance in terms of streaming capability, accuracy, real-time factor (RTF), and latency. In this paper, we present U2++, an enhanced version of U2 to further improve the accuracy. The core idea of U2++ is to use the forward and the backward information of the labeling sequences at the same time at training to learn richer information, and combine the forward and backward prediction at decoding to give more accurate recognition results. We also propose a new data augmentation method called SpecSub to help the U2++ model to be more accurate and robust. Our experiments show that, compared with U2, U2++ shows faster convergence at training, better robustness to the decoding method, as well as a consistent 5% - 8% word error rate reduction gain over U2. In experiments on AISHELL-1, we achieve a 4.63% character error rate (CER) with a non-streaming setup and 5.05% with a streaming setup with 320ms latency by U2++. To the best of our knowledge, 5.05% is the best published streaming result on the AISHELL-1 test set.

preprint2020arXiv

An Edge Computing-based Photo Crowdsourcing Framework for Real-time 3D Reconstruction

Image-based three-dimensional (3D) reconstruction utilizes a set of photos to build a 3D model and can be widely used in many emerging applications such as augmented reality (AR) and disaster recovery. Most existing 3D reconstruction methods require a mobile user to walk around the target area and reconstruct objects with a hand-held camera, which is inefficient and time-consuming. To meet the requirements of delay-intensive and resource-hungry applications in 5G, we propose an edge computing-based photo crowdsourcing (EC-PCS) framework in this paper. The main objective is to collect a set of representative photos from ubiquitous mobile and Internet of Things (IoT) devices at the network edge for real-time 3D model reconstruction, with network resource and monetary cost considerations. Specifically, we first propose a photo pricing mechanism by jointly considering their freshness, resolution and data size. Then, we design a novel photo selection scheme to dynamically select a set of photos with the required target coverage and the minimum monetary cost. We prove the NP-hardness of this problem, and develop an efficient greedy-based approximation algorithm to obtain a near-optimal solution. Moreover, an optimal network resource allocation scheme is presented, in order to minimize the maximum uploading delay of the selected photos to the edge server. Finally, a 3D reconstruction algorithm and a 3D model caching scheme are performed by the edge server in real time. Extensive experimental results based on real-world datasets demonstrate the superior performance of our EC-PCS system over the existing mechanisms.
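
The photo selection step described above (cover the required targets at minimum monetary cost, solved greedily) resembles weighted set cover. The sketch below shows the classic greedy heuristic for that textbook problem; it is not the EC-PCS algorithm, and the photo names, prices and the function `greedy_select` are hypothetical.

```python
def greedy_select(photos, targets):
    """Classic greedy heuristic for weighted set cover.

    photos  : dict mapping photo name -> (price, set of targets it covers), price > 0
    targets : iterable of targets that must all be covered
    Repeatedly picks the photo with the most newly covered targets per unit price.
    """
    uncovered = set(targets)
    chosen, total_cost = [], 0.0
    while uncovered:
        best, best_ratio = None, 0.0
        for name, (price, covers) in photos.items():
            if name in chosen:
                continue
            gain = len(covers & uncovered)
            if gain and gain / price > best_ratio:
                best, best_ratio = name, gain / price
        if best is None:
            raise ValueError("the available photos cannot cover every target")
        chosen.append(best)
        total_cost += photos[best][0]
        uncovered -= photos[best][1]
    return chosen, total_cost

photos = {
    "a": (3.0, {1, 2, 3}),
    "b": (1.0, {1}),
    "c": (1.0, {2, 3}),
    "d": (2.5, {4}),
}
chosen, total_cost = greedy_select(photos, {1, 2, 3, 4})
```

In this example the heuristic picks `c`, then `b`, then `d` for a total price of 4.5, cheaper than starting with the broad but expensive photo `a`.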

preprint2020arXiv

Are ultra-spinning Kerr-Sen-AdS$_4$ black holes always super-entropic ?

We study thermodynamics of the four-dimensional Kerr-Sen-AdS black hole and its ultra-spinning counterpart, and verify that both black holes fulfil the first law and Bekenstein-Smarr mass formulae of black hole thermodynamics. Furthermore, we derive new Christodoulou-Ruffini-like squared-mass formulae for the usual and ultra-spinning Kerr-Sen-AdS$_4$ solutions. We show that this ultra-spinning Kerr-Sen-AdS$_4$ black hole does not always violate the Reverse Isoperimetric Inequality (RII) since the value of the isoperimetric ratio can be larger/smaller than, or equal to unity, depending upon where the solution parameters lie in the parameter space. This property is obviously different from that of the Kerr-Newman-AdS$_4$ super-entropic black hole, which always strictly violates the RII, although both of them have some similar properties in other aspects, such as horizon geometry and conformal boundary. In addition, it is found that while there exists the same lower bound on mass ($m_e \geqslant 8l/\sqrt{27}$ with $l$ being the cosmological scale) both for the extremal ultra-spinning Kerr-Sen-AdS$_4$ black hole and for the extremal super-entropic Kerr-Newman-AdS$_4$ case, the former has a maximal horizon radius: $r_{\rm\, HP} = l/\sqrt{3}$ which is the minimum of the latter. Therefore, these two different kinds of four-dimensional ultra-spinning charged AdS black holes exhibit some significant physical differences.

preprint2020arXiv

Aspects of the dyonic Kerr-Sen-AdS$_4$ black hole and its ultraspinning version

We explore some (especially, thermodynamical) properties of the dyonic Kerr-Sen-AdS$_4$ black hole and its ultraspinning counterpart, and check whether or not both black holes satisfy the first law and Bekenstein-Smarr mass formulas. To this end, new Christodoulou-Ruffini-like squared-mass formulae for the usual dyonic Kerr-Sen-AdS$_4$ solution and its ultraspinning cousin are deduced. Similar to the ultraspinning Kerr-Sen-AdS$_4$ black hole case, we demonstrate that the ultraspinning dyonic Kerr-Sen-AdS$_4$ black hole does not always violate the reverse isoperimetric inequality (RII) since the value of the isoperimetric ratio can either be larger/smaller than, or equal to unity, depending upon the range of the solution parameters, as is the case only with an electric charge. This property is apparently distinct from that of the superentropic dyonic Kerr-Newman-AdS$_4$ black hole, which always strictly violates the RII, although both of them have some similar properties in other aspects, such as the horizon geometry and conformal boundary.

preprint2020arXiv

Chiral symmetry breaking for deterministic switching of perpendicular magnetization by spin-orbit torque

Symmetry breaking is a characteristic to determine which branch of a bifurcation system follows upon crossing a critical point. Specifically, in spin-orbit torque (SOT) devices, a fundamental question arises: how to break the symmetry of the perpendicular magnetic moment by the in-plane spin polarization? Here, we show that the chiral symmetry breaking by the DMI can induce the deterministic SOT switching of the perpendicular magnetization. By introducing a gradient of saturation magnetization or magnetic anisotropy, non-collinear spin textures are formed by the gradient of effective SOT strength, and thus the chiral symmetry of the SOT-induced spin textures is broken by the DMI, resulting in the deterministic magnetization switching. We introduce a strategy to induce an out-of-plane (z) gradient of magnetic properties, as a practical solution for the wafer-scale manufacture of SOT devices.

preprint2020arXiv

EasyQuant: Post-training Quantization via Scale Optimization

8-bit quantization has been widely applied to accelerate network inference in various deep learning applications. There are two kinds of quantization methods, training-based quantization and post-training quantization. The training-based approach suffers from a cumbersome training process, while post-training quantization may lead to an unacceptable accuracy drop. In this paper, we present an efficient and simple post-training method via scale optimization, named EasyQuant (EQ), that could obtain comparable accuracy with the training-based method. Specifically, we first alternately optimize scales of weights and activations for all layers, targeting convolutional outputs, to further obtain high quantization precision. Then, we lower the bit width to INT7 for both weights and activations, and adopt INT16 intermediate storage and an integer Winograd convolution implementation to accelerate inference. Experimental results on various computer vision tasks show that EQ outperforms the TensorRT method and can achieve near INT8 accuracy in 7-bit width post-training.
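
A minimal sketch of what "scale optimization" can look like for symmetric integer quantization follows. Several choices here are assumptions rather than details from the abstract: the cosine-similarity objective, the grid search around a max-based starting scale, and quantizing a plain list of values instead of comparing convolutional outputs. The names `quantize`, `cosine` and `search_scale` are hypothetical.

```python
import math

def quantize(values, scale, bits):
    """Symmetric fake-quantization: round to integers in [-qmax, qmax], then rescale."""
    qmax = 2 ** (bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def search_scale(values, bits, steps=100):
    """Grid-search a scale near the max-based one, keeping the best cosine similarity."""
    qmax = 2 ** (bits - 1) - 1
    base = max(abs(v) for v in values) / qmax
    best_scale, best_sim = base, cosine(values, quantize(values, base, bits))
    for i in range(1, steps + 1):
        scale = base * (0.5 + i / steps)  # candidates from 0.51x to 1.5x the base scale
        sim = cosine(values, quantize(values, scale, bits))
        if sim > best_sim:
            best_scale, best_sim = scale, sim
    return best_scale, best_sim

weights = [0.02, -0.5, 0.31, 1.7, -0.08, 0.9]
base_scale = max(abs(v) for v in weights) / (2 ** 6 - 1)
base_sim = cosine(weights, quantize(weights, base_scale, 7))
best_scale, best_sim = search_scale(weights, bits=7)
```

By construction the searched scale is never worse than the max-based one under the chosen objective. A fuller treatment would alternate this search between weights and activations, as the abstract describes.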

preprint2020arXiv

Energy Model for UAV Communications: Experimental Validation and Model Generalization

Wireless communication involving unmanned aerial vehicles (UAVs) is expected to play an important role in future wireless networks. However, different from conventional terrestrial communication systems, UAVs typically have rather limited onboard energy on one hand, and require additional flying energy consumption on the other hand, which renders energy-efficient UAV communication with smart energy expenditure of paramount importance. In this paper, via extensive flight experiments, we aim to firstly validate the recently derived theoretical energy model for rotary-wing UAVs, and then develop a general model for those complicated flight scenarios where rigorous theoretical model derivation is quite challenging, if not impossible. Specifically, we first investigate how UAV power consumption varies with its flying speed for the simplest straight-and-level flight. With about 12,000 valid power-speed data points collected, we first apply the model-based curve fitting to obtain the modelling parameters based on the theoretical closed-form energy model in the existing literature. In addition, in order to exclude the potential bias caused by the theoretical energy model, the obtained measurement data is also trained using a model-free deep neural network. It is found that the obtained curve from both methods can match quite well with the theoretical energy model. Next, we further extend the study to arbitrary 2-dimensional (2-D) flight, where, to our best knowledge, no rigorous theoretical derivation is available for the closed-form energy model as a function of its flying speed, direction, and acceleration. To fill the gap, we first propose a heuristic energy model for these more complicated cases, and then provide experimental validation based on the measurement results for circular level flight.

preprint2020arXiv

Exploring open cluster properties with Gaia and LAMOST

Gaia DR2 reaches an unprecedented level of precision, at the sub-mas level for astrometry and the mmag level for photometry. Using cluster members identified with the astrometry and photometry in Gaia DR2, we can obtain a reliable determination of cluster properties. However, because of the shortcoming of Gaia spectroscopic observation in dealing with densely crowded cluster regions, radial velocities and metallicities for cluster member stars are still lacking in Gaia DR2. In this study, we aim to improve the cluster properties by combining the LAMOST spectra. In particular, we provide the list of cluster members with spectroscopic parameters as a value-added catalog in LAMOST DR5, which can be used to perform detailed studies for a better understanding of the stellar properties, by using their spectra and fundamental properties from the host cluster. We cross-matched the spectroscopic catalog in LAMOST DR5 with the identified cluster members in Cantat-Gaudin et al. 2018 and then used members with spectroscopic parameters to derive statistical properties of open clusters. We obtained a list of 8811 members with spectroscopic parameters and a catalog of 295 cluster properties. In addition, we study the radial and vertical metallicity gradient and age-metallicity relation with the compiled open clusters as tracers, finding slopes of -0.053$\pm$0.004 dex kpc$^{-1}$, -0.252$\pm$0.039 dex kpc$^{-1}$ and 0.022$\pm$0.008 dex Gyr$^{-1}$, respectively. Both slopes of the metallicity distribution relation for young clusters (0.1 Gyr < Age < 2 Gyr) and the age-metallicity relation for clusters within 6 Gyr are consistent with literature results. In order to fully study the chemical evolution history in the disk, more spectroscopic observations for old and distant open clusters are needed for further investigation.

preprint2020arXiv

Ferromagnetic MnSn monolayer epitaxially grown on silicon substrate

Two-dimensional (2D) ferromagnetic materials have been exhibiting promising potential in applications, such as spintronics devices. To grow epitaxial magnetic films on silicon substrate, in the single-layer limit, is practically important but challenging. In this study, we realized the epitaxial growth of MnSn monolayer on Si(111) substrate, with an atomically thin Sn/Si(111)-$2\sqrt{3}\times2\sqrt{3}$ buffer layer, and controlled the MnSn thickness with atomic-layer precision. We discovered the ferromagnetism in MnSn monolayer with the Curie temperature (Tc) of ~54 K. As the MnSn film is grown to 4 monolayers, Tc increases accordingly to ~235 K. The lattice of the epitaxial MnSn monolayer as well as the Sn/Si(111)-$2\sqrt{3}\times2\sqrt{3}$ is perfectly compatible with silicon, and thus a sharp interface is formed between MnSn, Sn and Si. This system provides a new platform for exploring the 2D ferromagnetism, integrating magnetic monolayers into silicon-based technology, and engineering the spintronics heterostructures.

preprint2020arXiv

Flow by Gauss curvature to Dual Orlicz-Minkowski problems

In this paper we study a normalised anisotropic Gauss curvature flow of strictly convex, closed hypersurfaces in the Euclidean space $\mathbb{R}^{n+1}$. We prove that the flow exists for all time and converges smoothly to the unique, strictly convex solution of a Monge-Ampère type equation. Our argument provides a parabolic proof in the smooth category for the existence of solutions to the Dual Orlicz-Minkowski problem introduced by Zhu, Xing and Ye.

preprint2020arXiv

Neural Mesh Refiner for 6-DoF Pose Estimation

How can we effectively utilise the 2D monocular image information for recovering the 6D pose (6-DoF) of the visual objects? Deep learning has been shown to be effective for robust and real-time monocular pose estimation. Oftentimes, the network learns to regress the 6-DoF pose using a naive loss function. However, due to a lack of geometrical scene understanding from the directly regressed pose estimation, there are misalignments between the rendered mesh from the 3D object and the 2D instance segmentation result, e.g., bounding boxes and masks prediction. This paper bridges the gap between 2D mask generation and 3D location prediction via a differentiable neural mesh renderer. We utilise the overlay between the accurate mask prediction and less accurate mesh prediction to iteratively optimise the directly regressed 6D pose information with a focus on translation estimation. By leveraging geometry, we demonstrate that our technique significantly improves direct regression performance on the difficult task of translation estimation and achieves state-of-the-art results on the Peking University/Baidu - Autonomous Driving dataset and the ApolloScape 3D Car Instance dataset. The code can be found at https://bit.ly/2IRihfU.

preprint2020arXiv

Notes on thermodynamics of super-entropic AdS black holes

The super-entropic black hole, which possesses a noncompact horizon topology and violates the reverse isoperimetric inequality, has been found to satisfy both the thermodynamic first law and the Bekenstein-Smarr mass formula. In this paper, we first derive a new Christodoulou-Ruffini-like squared-mass formula for the four-dimensional Kerr-Newman-AdS super-entropic black hole, and then establish a set of very simple relations between thermodynamic quantities of the super-entropic Kerr-Newman-AdS$_4$ black hole and its usual counterparts. Using these relations, the thermodynamic quantities of the Kerr-Newman-AdS$_4$ super-entropic black hole can be obtained from those of the usual prototype by taking the ultra-spinning limit properly. Then these relations are extended to the singly-rotating Kerr-AdS black holes in arbitrary dimensions and the double-rotating charged black hole in the five-dimensional minimal gauged supergravity. It can be inferred that the thermodynamic quantities of all super-entropic black holes obey similar limiting relations to those of their corresponding conventional rotating AdS black holes, and thus can be obtained by taking the ultra-spinning limit appropriately.

preprint2020arXiv

Scaling invariant Serrin criterion via one velocity component for the Navier-Stokes equations

In this paper, we prove that the Leray weak solution $u : \mathbb{R}^3\times (0, T)\rightarrow\mathbb{R}^3 $ of the Navier-Stokes equations is regular in $\mathbb{R}^3\times (0,T)$ under the scaling invariant Serrin condition imposed on one component of the velocity $u_3\in L^{q,1}(0, T;L^p(\mathbb{R}^3))$ with \[ \frac{2}{q}+\frac{3}{p}\leq 1,\quad 3<p<+\infty. \] This result is an immediate consequence of a new local regularity criterion in terms of one velocity component for suitable weak solutions.

preprint2020arXiv

Solving Bayesian Risk Optimization via Nested Stochastic Gradient Estimation

In this paper, we aim to solve Bayesian Risk Optimization (BRO), which is a recently proposed framework that formulates simulation optimization under input uncertainty. In order to efficiently solve the BRO problem, we derive nested stochastic gradient estimators and propose corresponding stochastic approximation algorithms. We show that our gradient estimators are asymptotically unbiased and consistent, and that the algorithms converge asymptotically. We demonstrate the empirical performance of the algorithms on a two-sided market model. Our estimators are of independent interest in extending the literature of stochastic gradient estimation to the case of nested risk functions.

preprint 2020 arXiv

When Deep Reinforcement Learning Meets Federated Learning: Intelligent Multi-Timescale Resource Management for Multi-access Edge Computing in 5G Ultra Dense Network

Ultra-dense edge computing (UDEC) has great potential, especially in the 5G era, but it still faces challenges in its current solutions, such as the lack of: i) efficient utilization of multiple 5G resources (e.g., computation, communication, storage and service resources); ii) low overhead offloading decision making and resource allocation strategies; and iii) privacy and security protection schemes. Thus, we propose an intelligent ultra-dense edge computing (I-UDEC) framework, which integrates blockchain and Artificial Intelligence (AI) into 5G ultra-dense edge computing networks. First, we show the architecture of the framework. Then, in order to achieve real-time and low overhead computation offloading decisions and resource allocation strategies, we design a novel two-timescale deep reinforcement learning (\textit{2Ts-DRL}) approach, consisting of a fast-timescale and a slow-timescale learning process. The primary objective is to minimize the total offloading delay and network resource usage by jointly optimizing computation offloading, resource allocation and service caching placement. We also leverage federated learning (FL) to train the \textit{2Ts-DRL} model in a distributed manner, aiming to protect the edge devices' data privacy. Simulation results corroborate the effectiveness of both the \textit{2Ts-DRL} and FL in the I-UDEC framework and show that our proposed algorithm can reduce task execution time by up to 31.87%.

preprint 2019 arXiv

Experimental observation of the gate-controlled reversal of the anomalous Hall effect in the intrinsic magnetic topological insulator MnBi2Te4 device

Here we report the reversed anomalous Hall effect (AHE) in the 5-septuple-layer van der Waals device of the intrinsic magnetic topological insulator MnBi2Te4. By employing the top/bottom gate, a negative AHE loop gradually decreases to zero and changes to a reversed sign. The reversed AHE exhibits coercive fields and temperature dependence distinct from those of the previous AHE. It reaches its maximum inside the gap of the Dirac cone. The newly-seen reversed AHE is attributed to the competition between the intrinsic Berry curvature and the Dirac-gap enhanced extrinsic skew scattering. Its gate-controlled switching contributes a scheme for topological spin field-effect transistors.

preprint 2019 arXiv

Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts

Predicting compound-protein affinity is critical for accelerating drug discovery. Recent progress made by machine learning focuses on accuracy but leaves much to be desired for interpretability. Through molecular contacts underlying affinities, our large-scale interpretability assessment finds commonly-used attention mechanisms inadequate. We thus formulate a hierarchical multi-objective learning problem whose predicted contacts form the basis for predicted affinities. We further design a physics-inspired deep relational network, DeepRelations, with intrinsically explainable architecture. Specifically, various atomic-level contacts or "relations" lead to molecular-level affinity prediction. And the embedded attentions are regularized with predicted structural contexts and supervised with partially available training contacts. DeepRelations shows superior interpretability to the state-of-the-art: without compromising affinity prediction, it boosts the AUPRC of contact prediction 9.5, 16.9, 19.3 and 5.7-fold for the test, compound-unique, protein-unique, and both-unique sets, respectively. Our study represents the first dedicated model development and systematic model assessment for interpretable machine learning of compound-protein affinity.