Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
56 works
0 followers
37 topics
4 close collaborators

Published work

56 published item(s)

preprint 2026 arXiv

ViCrop-Det: Spatial Attention Entropy Guided Cropping for Training-Free Small-Object Detection

Transformer-based architectures have established a dominant paradigm in global semantic perception; however, they remain fundamentally constrained by the profound spatial heterogeneity inherent in natural images. Specifically, the imposition of a uniform global receptive field across regions of varying information density inevitably leads to local feature degradation, particularly in dense conflict zones populated by microscopic targets. To address this mechanistic limitation, we propose ViCrop-Det, a training-free inference framework that introduces adaptive spatial trust region shrinkage. Inspired by the use of attention entropy in anomaly segmentation, ViCrop-Det leverages the detection decoder's cross-attention distribution as an endogenous probe. By utilizing Spatial Attention Entropy (SAE) to heuristically evaluate local spatial ambiguity, the framework executes dynamic spatial routing, allocating a fixed computational budget exclusively to regions exhibiting both high target saliency and high cognitive uncertainty. By shrinking the spatial trust region and injecting high-frequency localized observations, ViCrop-Det actively resolves spatial ambiguity and recovers fine-grained features without requiring architectural modifications. Extensive evaluations on VisDrone and DOTA-v1.5 demonstrate that ViCrop-Det yields competitive performance enhancements, consistently adding +1-3 mAP@50 to RT-DETR-R50 and Deformable DETR with a marginal 20-23\% latency overhead. On MS COCO, $AP_{S}$ improves while $AP_{M}/AP_{L}$ remains stable, indicating precise fine-scale refinement without compromising the global spatial prior. Under compute-matched settings, our adaptive routing strategy comprehensively surpasses uniform slicing baselines, achieving a highly optimized accuracy-speed trade-off.
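As a rough illustration of the entropy-guided selection described above, the sketch below computes the Shannon entropy of a region's attention weights and ranks regions under a fixed budget. The scoring rule (saliency multiplied by entropy), the function names and the data layout are assumptions made for illustration; the abstract does not specify how ViCrop-Det combines saliency and uncertainty.

```python
import math

def attention_entropy(weights):
    """Shannon entropy (in nats) of non-negative attention weights, after normalising."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_regions(regions, budget):
    """Rank regions by saliency * entropy (a hypothetical score) and keep `budget` of them.

    regions maps a region name to (saliency, attention_weights).
    """
    scored = {name: sal * attention_entropy(w) for name, (sal, w) in regions.items()}
    return sorted(scored, key=scored.get, reverse=True)[:budget]
```

A region with flat attention (high entropy) and high saliency ranks first under this score, while a region with sharply peaked attention scores near zero regardless of saliency.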

preprint 2025 arXiv

Intrinsic nonlinear valley Nernst effect

We investigate the intrinsic nonlinear valley Nernst effect, which induces a transverse valley current via a second-order thermoelectric response to a longitudinal temperature gradient. The effect arises from the Berry connection polarizability dipole of valley electrons and is permissible in both inversion-symmetric and inversion-asymmetric materials. We demonstrate that the response tensor is connected to the intrinsic nonlinear valley Hall conductivity through a generalized Mott relation, with the two being directly proportional at low temperatures, scaled by the Lorenz number. We elucidate the symmetry constraints governing this effect and develop a theory for its nonlocal measurement, revealing a nonlocal second-harmonic signal with a distinct $ρ^2$ scaling. This signal comprises two scaling terms, with their ratio corresponding to the square of the thermopower normalized by the Lorenz number. Key characteristics are demonstrated using a tilted Dirac model and first-principles calculations on bilayer WTe$_2$. Possible extrinsic contributions and alternative experimental detection methods, e.g., by valley pumping and by nonreciprocal directional dichroism, are discussed. These findings underscore the significance of band quantum geometry on electron dynamics and establish a theoretical foundation for nonlinear valley caloritronics.

preprint 2024 arXiv

Training and Serving System of Foundation Models: A Comprehensive Survey

Foundation models (e.g., ChatGPT, DALL-E, PengCheng Mind, PanGu-$Σ$) have demonstrated extraordinary performance in key technological areas, such as natural language processing and visual recognition, and have become the mainstream trend of artificial general intelligence. This has led more and more major technology giants to dedicate significant human and financial resources to actively develop their foundation model systems, which drives continuous growth of these models' parameters. As a result, the training and serving of these models have posed significant challenges, including substantial computing power, memory consumption, bandwidth demands, etc. Therefore, employing efficient training and serving strategies becomes particularly crucial. Many researchers have actively explored and proposed effective methods. So, a comprehensive survey of them is essential for system developers and researchers. This paper extensively explores the methods employed in training and serving foundation models from various perspectives. It provides a detailed categorization of these state-of-the-art methods, including finer aspects such as network, computing, and storage. Additionally, the paper summarizes the challenges and presents a perspective on the future development direction of foundation model systems. Through comprehensive discussion and analysis, it hopes to provide a solid theoretical basis and practical guidance for future research and applications, promoting continuous innovation and development in foundation model systems.

preprint 2023 arXiv

Computational Approaches to Model X-ray Photon Correlation Spectroscopy from Molecular Dynamics

X-ray photon correlation spectroscopy (XPCS) allows for the resolution of dynamic processes within a material across a wide range of length and time scales. X-ray speckle visibility spectroscopy (XSVS) is a related method that uses a single diffraction pattern to probe ultrafast dynamics. Interpretation of the XPCS and XSVS data in terms of underlying physical processes is necessary to establish the connection between the macroscopic responses and the microstructural dynamics. To aid the interpretation of the XPCS and XSVS data, we present a computational framework to model these experiments by computing the X-ray scattering intensity directly from the atomic positions obtained from molecular dynamics (MD) simulations. We compare the efficiency and accuracy of two alternative computational methods: the direct method computing the intensity at each diffraction vector separately, and a method based on fast Fourier transform that computes the intensities at all diffraction vectors at once. The computed X-ray speckle patterns capture the density fluctuations over a range of length and time scales and are shown to reproduce the known properties and relations of experimental XPCS and XSVS for liquids.
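The direct method mentioned above can be sketched as follows, assuming the standard kinematic expression $I(\mathbf{q}) = |\sum_j e^{-i\mathbf{q}\cdot\mathbf{r}_j}|^2$ with unit scattering factors for every atom. That simplification and the function name are assumptions; the paper's own formulation may include form factors or other corrections, and the FFT-based alternative is not shown here.

```python
import cmath

def scattering_intensity(positions, q):
    """Direct method: intensity at one diffraction vector q from atomic positions.

    Assumes unit scattering factors; positions and q are sequences of
    coordinates in matching units (e.g. angstrom and inverse angstrom).
    """
    amplitude = sum(
        cmath.exp(-1j * sum(qc * rc for qc, rc in zip(q, r)))
        for r in positions
    )
    return abs(amplitude) ** 2
```

Evaluating this function once per diffraction vector costs one pass over all atoms each time, which is why a method that yields all vectors at once can be attractive for large speckle patterns.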

preprint 2023 arXiv

Control the qubit-qubit coupling in the superconducting circuit with double-resonator couplers

We propose a scheme that uses two fixed-frequency resonator couplers to tune the coupling strength between two Xmon qubits. The indirect qubit-qubit interactions induced by the two resonators can offset each other, and the direct coupling between the two qubits is not necessary for switching off the coupling. A small direct qubit-qubit coupling can effectively suppress the frequency interval between switching off and switching on, and globally suppress the second- and third-order static ZZ couplings. The frequency differences between the resonator couplers and the qubit readout resonators are very large, which might help suppress qubit readout errors. The cross-Kerr resonant processes between a qubit and the two resonators might induce poles and affect the crosstalk between qubits. The double-resonator couplers could relax the restrictions on capacitances and coupling strengths in the superconducting circuit, and they can also reduce flux noise and globally suppress crosstalk.

preprint 2022 arXiv

Atomic-scale Deformation Process of Glasses Unveiled by Stress-induced Structural Anisotropy

Experimentally resolving atomic-scale structural changes of a deformed glass remains challenging owing to the disordered nature of glass structure. Here, we show that structural anisotropy emerges as a general hallmark for different types of glasses (metallic glasses, oxide glass, amorphous selenium, and polymer glass) after thermo-mechanical deformation, and that it is highly correlated with local nonaffine atomic displacements detected by the high-energy X-ray diffraction technique. By analyzing the anisotropic pair density function, we unveil the atomic-level mechanism responsible for the plastic flow, which notably differs between metallic glasses and covalent glasses. The structural rearrangements in metallic glasses are mediated through cutting and formation of atomic bonds, which occurs in some localized inelastic regions embedded in an elastic matrix, whereas that of covalent glasses is mediated through the rotation of atomic bonds or chains without bond length change, which occurs in a less localized manner.

preprint 2022 arXiv

CATNet: Cross-event Attention-based Time-aware Network for Medical Event Prediction

Medical event prediction (MEP) is a fundamental task in the medical domain, which needs to predict medical events, including medications, diagnosis codes, laboratory tests, procedures, outcomes, and so on, according to historical medical records. The task is challenging as medical data is a type of complex time series data with heterogeneous and temporally irregular characteristics. Many machine learning methods that consider the two characteristics have been proposed for medical event prediction. However, most of them consider the two characteristics separately and ignore the correlations among different types of medical events, especially relations between historical medical events and target medical events. In this paper, we propose a novel neural network based on the attention mechanism, called cross-event attention-based time-aware network (CATNet), for medical event prediction. It is a time-aware, event-aware and task-adaptive method with the following advantages: 1) modeling heterogeneous information and temporal information in a unified way and considering temporally irregular characteristics locally and globally respectively, 2) taking full advantage of correlations among different types of events via cross-event attention. Experiments on two public datasets (MIMIC-III and eICU) show that CATNet adapts to different MEP tasks and outperforms other state-of-the-art methods on various MEP tasks. The source code of CATNet will be released after this manuscript is accepted.

preprint 2022 arXiv

Coherently amplifying photon production from vacuum with a dense cloud of accelerating photodetectors

An accelerating photodetector is predicted to see photons in the electromagnetic vacuum. However, the extreme accelerations required have prevented the direct experimental verification of this quantum vacuum effect. In this work, we consider many accelerating photodetectors that are contained within an electromagnetic cavity. We show that the resulting photon production from the cavity vacuum can be collectively enhanced such as to be measurable. The combined cavity-photodetectors system maps onto a parametrically driven Dicke-type model; when the detector number exceeds a certain critical value, the vacuum photon production undergoes a phase transition from a normal phase to an enhanced superradiant-like, inverted lasing phase. Such a model may be realized as a mechanical membrane with a dense concentration of optically active defects undergoing gigahertz flexural motion within a superconducting microwave cavity. We provide estimates suggesting that recent related experimental devices are close to demonstrating this inverted, vacuum photon lasing phase.

preprint 2022 arXiv

Cross-Enhancement Transformer for Action Segmentation

Temporal convolutions have been the paradigm of choice in action segmentation, enlarging long-term receptive fields by adding convolution layers. However, higher layers lose the local information necessary for frame recognition. To solve this problem, a novel encoder-decoder structure, called Cross-Enhancement Transformer, is proposed in this paper. Our approach learns temporal structure representations effectively with an interactive self-attention mechanism. The convolutional feature maps of each encoder layer are concatenated with a set of decoder features produced via self-attention. Therefore, local and global information are used simultaneously in a series of frame actions. In addition, a new loss function that penalizes over-segmentation errors is proposed to enhance the training process. Experiments show that our framework achieves state-of-the-art performance on three challenging datasets: 50Salads, Georgia Tech Egocentric Activities and the Breakfast dataset.

preprint 2022 arXiv

Crystal growth engineering and origin of the weak ferromagnetism in antiferromagnetic matrix of orthochromates from $t$-$e$ orbital hybridization

We report a combined experimental and theoretical study on intriguing magnetic properties of quasiferroelectric orthochromates. Large single crystals of the family of RECrO$_3$ (RE = Y, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, and Lu) compounds were successfully grown. Neutron Laue study indicates a good quality of the obtained single crystals. Applied magnetic-field and temperature dependent magnetization measurements reveal their intrinsic magnetic properties, especially the antiferromagnetic (AFM) transition temperatures. Density functional theory studies of the electronic structures were carried out using the Perdew-Burke-Ernzerhof functional plus Hubbard $U$ method. Crystallographic information and magnetism were theoretically optimized systematically. When RE$^{3+}$ cations vary from Y$^{3+}$ and Eu$^{3+}$ to Lu$^{3+}$ ions, the calculated $t$-$e$ orbital hybridization degree and Néel temperature behave similarly to the experimentally-determined AFM transition temperature with variation in cationic radius. We found that the $t$-$e$ hybridization is anisotropic, causing a magnetic anisotropy of Cr$^{3+}$ sublattices. This was evaluated with the nearest-neighbour $J_1$-$J_2$ model. Our research provides a picture of the electronic structures during the $t$-$e$ hybridization process while changing RE ions and sheds light on the nature of the weak ferromagnetism coexisting with predominant antiferromagnetism. The available large RECrO$_3$ single crystals build a platform for further studies of orthochromates.

preprint 2022 arXiv

DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement

The decoupling-style concept has begun to take hold in the speech enhancement area; it decouples the original complex spectrum estimation task into multiple easier sub-tasks (i.e., magnitude-only recovery and residual complex spectrum estimation), resulting in better performance and easier interpretability. In this paper, we propose a dual-branch federative magnitude and phase estimation framework, dubbed DBT-Net, for monaural speech enhancement, aiming at recovering the coarse- and fine-grained regions of the overall spectrum in parallel. From the complementary perspective, the magnitude estimation branch is designed to filter out dominant noise components in the magnitude domain, while the complex spectrum purification branch is elaborately designed to inpaint the missing spectral details and implicitly estimate the phase information in the complex-valued spectral domain. To facilitate the information flow between each branch, interaction modules are introduced to leverage features learned from one branch, so as to suppress the undesired parts and recover the missing components of the other branch. Instead of adopting the conventional RNNs and temporal convolutional networks for sequence modeling, we employ a novel attention-in-attention transformer-based network within each branch for better feature learning. More specifically, it is composed of several adaptive spectro-temporal attention transformer-based modules and an adaptive hierarchical attention module, aiming to capture long-term time-frequency dependencies and further aggregate intermediate hierarchical contextual information. Comprehensive evaluations on the WSJ0-SI84 + DNS-Challenge and VoiceBank + DEMAND datasets demonstrate that the proposed approach consistently outperforms previous advanced systems and yields state-of-the-art performance in terms of speech quality and intelligibility.

preprint 2022 arXiv

DMF-Net: A decoupling-style multi-band fusion model for full-band speech enhancement

Owing to the difficulty and large computational complexity of modeling more frequency bands, full-band speech enhancement based on deep neural networks is still challenging. Previous studies usually adopt compressed full-band speech features in Bark and ERB scale with relatively low frequency resolution, leading to degraded performance, especially in the high-frequency region. In this paper, we propose a decoupling-style multi-band fusion model to perform full-band speech denoising and dereverberation. Instead of optimizing the full-band speech by a single network structure, we decompose the full-band target into multiple sub-band speech features and then employ a multi-stage chain optimization strategy to estimate the clean spectrum stage by stage. Specifically, the low- (0-8 kHz), middle- (8-16 kHz), and high-frequency (16-24 kHz) regions are mapped by three separate sub-networks and are then fused to obtain the full-band clean target STFT spectrum. Comprehensive experiments on two public datasets demonstrate that the proposed method outperforms previous advanced systems and yields promising performance in terms of speech quality and intelligibility in real complex scenarios.
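The band decomposition described above can be illustrated by mapping the three frequency ranges to bin indices of a one-sided STFT frame. This is only a sketch: the 48 kHz sample rate is inferred from the 24 kHz upper edge, while the FFT size, the function names and the handling of boundary bins are assumptions rather than details taken from the paper.

```python
def band_slices(n_fft, sample_rate=48000, edges_hz=(0, 8000, 16000, 24000)):
    """Return (start, stop) bin ranges for each band of a one-sided STFT frame."""
    hz_per_bin = sample_rate / n_fft
    n_bins = n_fft // 2 + 1
    bins = [min(n_bins, round(edge / hz_per_bin)) for edge in edges_hz]
    bins[-1] = n_bins  # keep the Nyquist bin inside the top band
    return [(bins[i], bins[i + 1]) for i in range(len(bins) - 1)]

def split_frame(frame, slices):
    """Cut one STFT frame (a sequence of bins) into low, middle and high bands."""
    return [frame[start:stop] for start, stop in slices]
```

With a hypothetical `n_fft` of 960, each bin spans 50 Hz, so the low and middle bands hold 160 bins each and the high band holds 161 bins including the Nyquist bin.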

preprint 2022 arXiv

Dual-branch Attention-In-Attention Transformer for single-channel speech enhancement

Curriculum learning begins to thrive in the speech enhancement area, which decouples the original spectrum estimation task into multiple easier sub-tasks to achieve better performance. Motivated by that, we propose a dual-branch attention-in-attention transformer dubbed DB-AIAT to handle both coarse- and fine-grained regions of the spectrum in parallel. From a complementary perspective, a magnitude masking branch is proposed to coarsely estimate the overall magnitude spectrum, and simultaneously a complex refining branch is elaborately designed to compensate for the missing spectral details and implicitly derive phase information. Within each branch, we propose a novel attention-in-attention transformer-based module to replace the conventional RNNs and temporal convolutional networks for temporal sequence modeling. Specifically, the proposed attention-in-attention transformer consists of adaptive temporal-frequency attention transformer blocks and an adaptive hierarchical attention module, aiming to capture long-term temporal-frequency dependencies and further aggregate global hierarchical contextual information. Experimental results on Voice Bank + DEMAND demonstrate that DB-AIAT yields state-of-the-art performance (e.g., 3.31 PESQ, 95.6% STOI and 10.79dB SSNR) over previous advanced systems with a relatively small model size (2.81M).

preprint 2022 arXiv

Feature-aware Diversified Re-ranking with Disentangled Representations for Relevant Recommendation

Relevant recommendation is a special recommendation scenario which provides relevant items when users express interest in one target item (e.g., click, like and purchase). Besides considering the relevance between recommendations and the trigger item, the recommendations should also be diversified to avoid information cocoons. However, existing diversified recommendation methods mainly focus on item-level diversity, which is insufficient when the recommended items are all relevant to the target item. Moreover, redundant or noisy item features might affect the performance of simple feature-aware recommendation approaches. Faced with these issues, we propose a Feature Disentanglement Self-Balancing Re-ranking framework (FDSB) to capture feature-aware diversity. The framework consists of two major modules, namely disentangled attention encoder (DAE) and self-balanced multi-aspect ranker. In DAE, we use multi-head attention to learn disentangled aspects from rich item features. In the ranker, we develop an aspect-specific ranking mechanism that is able to adaptively balance the relevance and diversity for each aspect. In experiments, we conduct offline evaluation on the collected dataset and deploy FDSB on the KuaiShou app for an online A/B test on the function of relevant recommendation. The significant improvements in both recommendation quality and user experience verify the effectiveness of our approach.

preprint 2022 arXiv

FedLab: A Flexible Federated Learning Framework

Federated learning (FL) is a machine learning field in which researchers try to facilitate the model learning process among multiple parties without violating privacy protection regulations. Considerable effort has been invested in FL optimization and communication-related research. In this work, we introduce FedLab, a lightweight open-source framework for FL simulation. The design of FedLab focuses on FL algorithm effectiveness and communication efficiency. FedLab is also scalable across different deployment scenarios. We hope FedLab can provide a flexible API as well as reliable baseline implementations, and relieve researchers in the FL community of the burden of implementing novel approaches.

preprint 2022 arXiv

Graph-based Approximate NN Search: A Revisit

Nearest neighbor search plays a fundamental role in many disciplines such as multimedia information retrieval, data-mining, and machine learning. The graph-based search approaches show superior performance over other types of approaches in recent studies. In this paper, the graph-based NN search is revisited. We optimize two key components in the approach, namely the search procedure and the graph that supports the search. For the graph construction, a two-stage graph diversification scheme is proposed, which makes a good trade-off between the efficiency and reachability for the search procedure that builds upon it. Moreover, the proposed diversification scheme allows the search procedure to decide dynamically how many nodes should be visited in one node's neighborhood. In this way, the computing power of the devices is fully utilized when the search is carried out under different circumstances. Furthermore, two NN search procedures are designed respectively for small and large batch queries on the GPU. The optimized NN search, when supported by the two-stage diversified graph, outperforms all the state-of-the-art approaches on both the CPU and the GPU across all the considered large-scale datasets.
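For readers unfamiliar with graph-based NN search, the sketch below shows a generic best-first search over a precomputed neighbour graph. It is not the paper's procedure: the two-stage diversification, the dynamic neighbourhood sizing and the GPU batch variants are not reproduced, and the function name, the `ef` pool size and the squared Euclidean distance are assumptions chosen for illustration.

```python
import heapq

def graph_nn_search(points, graph, query, entry, ef=8):
    """Best-first search over a neighbour graph; returns the index of the closest point found.

    points: list of coordinate tuples; graph: dict index -> list of neighbour indices;
    entry: index of the starting node; ef: size of the result pool kept during search.
    """
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(points[i], query))

    visited = {entry}
    candidates = [(dist(entry), entry)]  # min-heap of nodes still to expand
    best = [(-dist(entry), entry)]       # max-heap (negated distances) of the ef best so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break  # nearest unexpanded node is worse than the whole pool
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(nb)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the farthest pooled node
    return max(best)[1]
```

A larger `ef` explores more of the graph and usually improves recall at the cost of more distance computations, which is the efficiency and reachability trade-off the abstract refers to.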

preprint 2022 arXiv

Intrinsic Nonlinear Spin Magnetoelectricity in Centrosymmetric Magnets

We propose an intrinsic nonlinear spin magnetoelectric effect in magnetic materials, offering the potential of all-electric control of spin degree of freedom in centrosymmetric magnets, which reside outside of the current paradigm based on linear spin response. We reveal the band geometric origin of this effect in the momentum and magnetization space Berry connection polarizabilities, and clarify its symmetry characters. As an intrinsic effect, it is determined solely by the material's band structure and represents a material characteristic. Combining our theory with first-principles calculations, we predict sizable nonlinear spin magnetoelectricity in single-layer MnBi$_{2}$Te$_{4}$, which can be detected in experiment. Our theory paves the way for exploring rich nonlinear spintronic effects and novel device concepts based on them.

preprint 2022 arXiv

Joint magnitude estimation and phase recovery using Cycle-in-Cycle GAN for non-parallel speech enhancement

Because adequate paired noisy-clean speech corpora are lacking in many real scenarios, non-parallel training is a promising task for DNN-based speech enhancement methods. However, because of the severe mismatch between input and target speech, many previous studies focus only on magnitude spectrum estimation and leave the phase unaltered, resulting in degraded speech quality under low signal-to-noise ratio conditions. To tackle this problem, we decouple the difficult target w.r.t. original spectrum optimization into spectral magnitude and phase, and a novel Cycle-in-Cycle generative adversarial network (dubbed CinCGAN) is proposed to jointly estimate the spectral magnitude and phase information stage by stage under unpaired data. In the first stage, we pretrain a magnitude CycleGAN to coarsely estimate the spectral magnitude of clean speech. In the second stage, we incorporate the pretrained CycleGAN with a complex-valued CycleGAN as a cycle-in-cycle structure to simultaneously recover phase information and refine the overall spectrum. Experimental results demonstrate that the proposed approach significantly outperforms previous baselines under non-parallel training. The evaluation on training the models with standard paired data also shows that CinCGAN achieves remarkable performance, especially in reducing background noise and speech distortion.

preprint 2022 arXiv

Meter-Range Wireless Motor Drive for Pipeline Transportation

This paper proposes and implements a meter-range wireless motor drive (WMD) system for promising applications in underground pipeline transportation or in-pipe robots. To power a pipeline network beneath the earth, both the power grid and the control system are usually required to be deployed deep underground, thus increasing the construction cost, maintenance difficulty and system complexity. The proposed system develops a new hybrid repeater to enable the desired meter-range wireless power and drive transfer, which can offer a fault-tolerant network with a robust structure for the underground sensor-free WMD while maintaining a high transmission efficiency. Hence, this wireless pipeline network can reduce the maintenance requirement and regulate the flow rate effectively. A full-scale prototype has been built for practical verification, and the system efficiency can reach 88.8% at a long transfer distance of 150 cm. Theoretical analysis, software simulation and hardware experimentation are given to verify the feasibility of the proposed meter-range WMD for underground pipeline transportation.

preprint 2022 arXiv

Methane detection to 1 ppm using machine learning analysis of atmospheric pressure plasma optical emission spectra

Optical emission spectroscopy from a small-volume, 5 µL, atmospheric pressure RF-driven helium plasma was used in conjunction with Partial Least Squares Discriminant Analysis (PLS-DA) for the detection of trace concentrations of methane gas. A limit of detection of 1 ppm was obtained and sample concentrations up to 100 ppm CH4 were classified using a nine-category model. A range of algorithm enhancements were investigated including regularization, simple data segmentation and subset selection, VIP feature selection and wavelength variable compression in order to address the high dimensionality and collinearity of spectral emission data. These approaches showed the potential for significant reduction in the number of wavelength variables and the spectral resolution/bandwidth. Wavelength variable compression exhibited reliable predictive performance, with accuracy values > 97%, under more challenging multi-session train-test scenarios. Simple modelling of plasma electron energy distribution functions highlights the complex cross-sensitivities between the target methane, its dissociation products and atmospheric impurities and their impact on excitation and emission.

preprint 2022 arXiv

Motion Gait: Gait Recognition via Motion Excitation

Gait recognition, which can realize long-distance and contactless identification, is an important biometric technology. Recent gait recognition methods focus on learning the pattern of human movement or appearance during walking, and construct the corresponding spatio-temporal representations. However, different individuals have their own movement patterns, and simple spatio-temporal features struggle to describe changes in the motion of human body parts, especially when confounding variables such as clothing and carried objects are included; the distinguishability of the features is thus reduced. In this paper, we propose the Motion Excitation Module (MEM) to guide spatio-temporal features to focus on human parts with large dynamic changes. MEM learns the difference information between frames and intervals so as to obtain a representation of temporal motion changes. Notably, MEM can adapt to frame sequences of uncertain length, and it does not add any parameters. Furthermore, we present the Fine Feature Extractor (FFE), which independently learns the spatio-temporal representations of the human body according to different horizontal parts of individuals. Benefiting from MEM and FFE, our method combines motion change information, significantly improving the performance of the model under cross-appearance conditions. On the popular CASIA-B dataset, our proposed Motion Gait outperforms existing gait recognition methods.

preprint 2022 arXiv

Nuclear phase retrieval spectroscopy using resonant x-ray scattering

Light-matter interaction is exploited in spectroscopic techniques to access information about molecular, atomic or nuclear constituents of the sample of interest. While scattered light carries both amplitude and phase information of the electromagnetic field, most of the time the latter is lost in intensity measurements. However, often the phase information is paramount to reconstruct the desired information of the target, as it is well known from coherent x-ray imaging. Here we introduce a new phase retrieval algorithm which allows us to reconstruct the field phase information from two-dimensional time- and energy-resolved spectra. We apply this method to the particular case of x-ray scattering off Mössbauer nuclei at a synchrotron radiation source. Knowledge of the phase allows also for an excellent reconstruction of the energy spectra from experimental data, which could not be achieved with this resolution otherwise. Our approach provides an efficient novel data analysis tool which will benefit x-ray quantum optics and Mössbauer spectroscopy with synchrotron radiation alike.

preprint 2022 arXiv

On Helical Surfaces with a Constant Ratio of Principal Curvatures

We determine all helical surfaces in three-dimensional Euclidean space which possess a constant ratio $a:=κ_1/κ_2$ of principal curvatures (CRPC surfaces), thus providing the first explicit CRPC surfaces beyond the known rotational ones. A key ingredient in the successful determination of these surfaces is the proper choice of generating profiles. We employ the contours for parallel projection orthogonal to the helical axis. This has the advantage that the CRPC property can be nicely expressed with the help of the involution of conjugate surface tangents. The arising ordinary differential equation has an explicit parametric solution, which forms the basis for a further study and classification of the possible shapes and the singularities arising for $a>0$.

preprint 2022 arXiv

On some estimates involving Fourier coefficients of Maass cusp forms

Let $f$ be a Hecke-Maass cusp form for $\rm SL_2(\mathbb{Z})$ with Laplace eigenvalue $λ_f(Δ)=1/4+μ^2$ and let $λ_f(n)$ be its $n$-th normalized Fourier coefficient. It is proved that, uniformly in $α, β\in \mathbb{R}$, $$ \sum_{n \leq X}λ_f(n)e\left(αn^2+βn\right) \ll X^{7/8+\varepsilon}λ_f(Δ)^{1/2+\varepsilon}, $$ where the implied constant depends only on $\varepsilon$. We also consider the summation function of $λ_f(n)$ and under the Ramanujan conjecture we are able to prove $$ \sum_{n \leq X}λ_f(n)\ll X^{1/3+\varepsilon}λ_f(Δ)^{4/9+\varepsilon} $$ with the implied constant depending only on $\varepsilon$.

preprint2022arXiv

Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement

Due to the high computational complexity of modeling more frequency bands, it is still intractable to conduct real-time full-band speech enhancement based on deep neural networks. Recent studies typically use compressed, perceptually motivated features with relatively low frequency resolution to filter the full-band spectrum with one-stage networks, leading to limited speech quality improvements. In this paper, we propose a coordinated sub-band fusion network for full-band speech enhancement, which aims to recover the low- (0-8 kHz), middle- (8-16 kHz), and high-band (16-24 kHz) in a step-wise manner. Specifically, a dual-stream network is first pretrained to recover the low-band complex spectrum, and another two sub-networks are designed as the middle- and high-band noise suppressors in the magnitude-only domain. To fully capitalize on the information intercommunication, we employ a sub-band interaction module to provide external knowledge guidance across different frequency bands. Extensive experiments show that the proposed method yields consistent performance advantages over state-of-the-art full-band baselines.

preprint2022arXiv

PowerGear: Early-Stage Power Estimation in FPGA HLS via Heterogeneous Edge-Centric GNNs

Power estimation is the basis of many hardware optimization strategies. However, it is still challenging to offer accurate power estimation at an early stage such as high-level synthesis (HLS). In this paper, we propose PowerGear, a graph-learning-assisted power estimation approach for FPGA HLS, which features high accuracy, efficiency and transferability. PowerGear comprises two main components: a graph construction flow and a customized graph neural network (GNN) model. Specifically, in the graph construction flow, we introduce buffer insertion, datapath merging, graph trimming and feature annotation techniques to transform HLS designs into graph-structured data, which encode both intra-operation micro-architectures and inter-operation interconnects annotated with switching activities. Furthermore, we propose a novel power-aware heterogeneous edge-centric GNN model which effectively learns heterogeneous edge semantics and structural properties of the constructed graphs via edge-centric neighborhood aggregation, and fits the formulation of dynamic power. Compared with on-board measurement, PowerGear estimates total and dynamic power for new HLS designs with errors of 3.60% and 8.81%, respectively, which outperforms prior art in research and the commercial product Vivado. In addition, PowerGear demonstrates a speedup of 4x over the Vivado power estimator. Finally, we present a case study in which PowerGear is exploited to facilitate design space exploration for FPGA HLS, leading to a performance gain of up to 11.2%, compared with methods using state-of-the-art predictive models.

preprint2022arXiv

Unsupervised Quantized Prosody Representation for Controllable Speech Synthesis

In this paper, we propose a novel prosody disentanglement method for prosodic Text-to-Speech (TTS) models, which introduces vector quantization (VQ) into the auxiliary prosody encoder to obtain decomposed prosody representations in an unsupervised manner. Relying on its advantages, speaking styles, such as pitch, speaking velocity, and local pitch variance, are decomposed automatically into the latent quantized vectors. We also investigate the internal mechanism of the VQ disentanglement process by means of a latent variable counter and find that higher-value dimensions usually represent prosody information. Experiments show that our model can control the speaking styles of synthesis results by directly manipulating the latent variables. The objective and subjective evaluations illustrate that our model outperforms the popular models.

preprint2021arXiv

En route to high Tc superconductivity via Rb substitution of guest metal atoms in SrB3C3 clathrate

Recently, a host/guest clathrate SrB3C3 with sp3-bonded boron-carbon framework was synthesized at around 50 GPa. On the basis of electron count, the structure is understood as guest Sr2+ cations intercalated in the (B3C3)3- framework. Previous calculations suggest that SrB3C3 is a hole conductor with an estimated superconducting critical temperature (Tc) of 42 K at ambient pressure. If atoms with similar radius, such as Rb, can substitute Sr2+ in the lattice, the electronic as well as superconductivity properties of this material will be modified significantly. Here, we perform extensive simulations on the stability and physical properties of Rb-Sr-B3C3 system using first-principles density functional calculation in combination with cluster expansion and CALYPSO structure prediction method. We predict a phonon-mediated superconductor Rb0.5Sr0.5B3C3 with a remarkably high Tc of 78 K at ambient pressure, which is a significant improvement from the estimated value (42 K) in SrB3C3. The current results suggest that substitution of alkali atom in synthesized clathrate SrB3C3 is a viable route toward high-Tc compounds.

preprint2021arXiv

Experimental test of the majorization uncertainty relation with mixed states

The uncertainty relation lies at the heart of quantum theory and behaves as a non-classical constraint on the indeterminacies of incompatible observables in a system. In the literature, many experiments have been devoted to testing uncertainty relations, mainly focusing on pure states. In this work we test the novel majorization uncertainty relations of three incompatible observables using a series of mixed states with adjustable mixing degrees, and compare the compactness of various entropy uncertainty relations. The experimental results confirm that for general mixed quantum systems, the majorization uncertainty relation tends to be the tightest constraint on uncertainty, and indicate that the entropy uncertainty relation obtained from the majorization uncertainty relation is the optimal one. Our experimental setup provides an easy means of preparing mixed states; on this basis, simple optical elements can be used to realize the required quantum states.

preprint2021arXiv

Gain without Pain: Offsetting DP-injected Noises Stealthily in Cross-device Federated Learning

Federated Learning (FL) is an emerging paradigm through which decentralized devices can collaboratively train a common model. However, a serious concern is the leakage of privacy from exchanged gradient information between clients and the parameter server (PS) in FL. To protect gradient information, clients can adopt differential privacy (DP) to add additional noises and distort original gradients before they are uploaded to the PS. Nevertheless, the model accuracy will be significantly impaired by DP noises, making DP impracticable in real systems. In this work, we propose a novel Noise Information Secretly Sharing (NISS) algorithm to alleviate the disturbance of DP noises by sharing negated noises among clients. We theoretically prove that: 1) If clients are trustworthy, DP noises can be perfectly offset on the PS; 2) Clients can easily distort negated DP noises to protect themselves in case that other clients are not totally trustworthy, though the cost lowers model accuracy. NISS is particularly applicable for FL across multiple IoT (Internet of Things) systems, in which all IoT devices need to collaboratively train a model. To verify the effectiveness and the superiority of the NISS algorithm, we conduct experiments with the MNIST and CIFAR-10 datasets. The experiment results verify our analysis and demonstrate that NISS can improve model accuracy by 21% on average and obtain better privacy protection if clients are trustworthy.
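The noise-offsetting idea in this abstract can be illustrated with a toy round. This is only a minimal sketch under loud assumptions: the ring topology (each client passes its negated noise to the next client), the scalar gradients, the Gaussian noise, and the function name `niss_round` are all hypothetical choices for illustration, and the abstract does not specify the actual sharing protocol. It shows the trustworthy-client case, where the noises cancel in the server's sum while each individual upload stays perturbed.

```python
import random


def niss_round(gradients, noise_std=1.0, seed=0):
    """Toy round of noise offsetting (hypothetical ring-sharing scheme).

    Each client i draws a DP-style noise n_i, keeps it, and passes the
    negated noise -n_i to client i+1 (assumed topology). Every client then
    uploads: gradient + own noise + negated noise received from its
    predecessor.
    """
    rng = random.Random(seed)
    n = len(gradients)
    noises = [rng.gauss(0.0, noise_std) for _ in range(n)]
    uploads = []
    for i, g in enumerate(gradients):
        received_negated = -noises[(i - 1) % n]  # from ring predecessor
        uploads.append(g + noises[i] + received_negated)
    return uploads


gradients = [0.5, -1.2, 0.3, 2.0]   # scalar stand-ins for gradient vectors
uploads = niss_round(gradients)

# Each upload is distorted, but the parameter server's sum is not.
aggregate_error = abs(sum(uploads) - sum(gradients))
```

If some clients are not fully trusted, the abstract states that a client can distort the negated noise it shares; in this sketch that would correspond to perturbing `received_negated`, at which point the sum no longer cancels exactly.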

preprint2021arXiv

Stable Online Computation Offloading via Lyapunov-guided Deep Reinforcement Learning

In this paper, we consider a multi-user mobile-edge computing (MEC) network with time-varying wireless channels and stochastic user task data arrivals in sequential time frames. In particular, we aim to design an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability and average power constraints. The online algorithm is practical in the sense that the decisions for each time frame are made without the assumption of knowing future channel conditions and data arrivals. We formulate the problem as a multi-stage stochastic mixed integer non-linear programming (MINLP) problem that jointly determines the binary offloading (each user computes the task either locally or at the edge server) and system resource allocation decisions in sequential time frames. To address the coupling in the decisions of different time frames, we propose a novel framework, named LyDROO, that combines the advantages of Lyapunov optimization and deep reinforcement learning (DRL). Specifically, LyDROO first applies Lyapunov optimization to decouple the multi-stage stochastic MINLP into deterministic per-frame MINLP subproblems of much smaller size. Then, it integrates model-based optimization and model-free DRL to solve the per-frame MINLP problems with very low computational complexity. Simulation results show that the proposed LyDROO achieves optimal computation performance while satisfying all the long-term constraints. Besides, it induces very low execution latency that is particularly suitable for real-time implementation in fast fading environments.

preprint2021arXiv

Unsupervised Domain Adaptation for Image Classification via Structure-Conditioned Adversarial Learning

Unsupervised domain adaptation (UDA) typically carries out knowledge transfer from a label-rich source domain to an unlabeled target domain by adversarial learning. In principle, existing UDA approaches mainly focus on the global distribution alignment between domains while ignoring the intrinsic local distribution properties. Motivated by this observation, we propose an end-to-end structure-conditioned adversarial learning scheme (SCAL) that is able to preserve the intra-class compactness during domain distribution alignment. By using local structures as structure-aware conditions, the proposed scheme is implemented in a structure-conditioned adversarial learning pipeline. The above learning procedure is iteratively performed by alternating between local structures establishment and structure-conditioned adversarial learning. Experimental results demonstrate the effectiveness of the proposed scheme in UDA scenarios.

preprint2020arXiv

A Group Norm Regularized Factorization Model for Subspace Segmentation

Subspace segmentation assumes that data comes from the union of different subspaces and the purpose of segmentation is to partition the data into the corresponding subspaces. Low-rank representation (LRR) is a classic spectral-type method for solving subspace segmentation problems, that is, one first obtains an affinity matrix by solving an LRR model and then performs spectral clustering for segmentation. This paper proposes a group norm regularized factorization model (GNRFM) inspired by the LRR model for subspace segmentation and then designs an Accelerated Augmented Lagrangian Method (AALM) algorithm to solve this model. Specifically, we adopt group norm regularization to make the columns of the factor matrix sparse, thereby achieving low rank, which means no Singular Value Decompositions (SVD) are required and the computational complexity of each step is greatly reduced. We obtain affinity matrices by using different LRR models and then perform clustering tests on different sets of synthetic noisy data and real data, respectively. Compared with traditional models and algorithms, the proposed method is faster and more robust to noise, so the final clustering results are better. Moreover, the numerical results show that our algorithm converges fast and only requires approximately ten iterations.

preprint2020arXiv

A subconvex bound for twisted $L$-functions

Let $\mathfrak{q}>2$ be a prime number, $χ$ a primitive Dirichlet character modulo $\mathfrak{q}$ and $f$ a primitive holomorphic cusp form or a Hecke-Maass cusp form of level $\mathfrak{q}$ and trivial nebentypus. We prove the subconvex bound $$ L(1/2,f\otimes χ)\ll \mathfrak{q}^{1/2-1/12+\varepsilon}, $$ where the implicit constant depends only on the archimedean parameter of $f$ and $\varepsilon$. The main input is a modifying trivial delta method developed in [1].

preprint2020arXiv

Abnormally low thermal conductivity of 2D selenene: An ab initio study

The lattice thermal conductivity and thermal transport properties of 2D $α$-selenene are investigated based on first-principles calculations. The isotropic in-plane thermal conductivity is as low as 3.04 W m$^{-1}$ K$^{-1}$ at room temperature, abnormally lower even than that of $α$-tellurene, which possesses an analogous configuration and a lower Debye temperature. We find that this abnormal phenomenon stems from the larger anharmonicity of the acoustic phonon branch. Moreover, the phonon spectra, elastic properties, and related thermal properties are also presented. Acoustic phonons contribute mainly to the total thermal conductivity. Furthermore, size effect, boundary effect, the total phase space for three-phonon processes, phonon group velocity and relaxation time are further investigated, and the last one is unveiled to be the key ingredient of thermal transport in 2D selenene.

preprint2020arXiv

Analysis of Hyper-Parameters for Small Games: Iterations or Epochs in Self-Play?

The landmark achievements of AlphaGo Zero have created great research interest in self-play in reinforcement learning. In self-play, Monte Carlo Tree Search is used to train a deep neural network, which is then used in tree searches. Training itself is governed by many hyper-parameters. There has been surprisingly little research on design choices for hyper-parameter values and loss functions, presumably because of the prohibitive computational cost of exploring the parameter space. In this paper, we investigate 12 hyper-parameters in an AlphaZero-like self-play algorithm and evaluate how these parameters contribute to training. We use small games to achieve meaningful exploration with moderate computational effort. The experimental results show that training is highly sensitive to hyper-parameter choices. Through multi-objective analysis we identify 4 important hyper-parameters to further assess. To start, we find surprising results where too much training can sometimes lead to lower performance. Our main result is that the number of self-play iterations subsumes MCTS-search simulations, game-episodes, and training epochs. The intuition is that these three increase together as self-play iterations increase, and that increasing them individually is sub-optimal. A consequence of our experiments is a direct recommendation for setting hyper-parameter values in self-play: the overarching outer-loop of self-play iterations should be maximized, in favor of the three inner-loop hyper-parameters, which should be set at lower values. A secondary result of our experiments concerns the choice of optimization goals, for which we also provide recommendations.

preprint2020arXiv

Approximate k-NN Graph Construction: a Generic Online Approach

Nearest neighbor search and k-nearest neighbor graph construction are two fundamental issues that arise in many disciplines such as multimedia information retrieval, data mining and machine learning. They have become more and more pressing given the emergence of big data in various fields in recent years. In this paper, a simple but effective solution for both approximate k-nearest neighbor search and approximate k-nearest neighbor graph construction is presented. These two issues are addressed jointly in our solution. On the one hand, the approximate k-nearest neighbor graph construction is treated as a search task. Each sample, along with its k-nearest neighbors, is joined into the k-nearest neighbor graph by performing the nearest neighbor search sequentially on the graph under construction. On the other hand, the built k-nearest neighbor graph is used to support k-nearest neighbor search. Since the graph is built online, dynamic updates on the graph, which are not possible with most of the existing solutions, are supported. This solution is feasible for various distance measures. Its effectiveness both as a k-nearest neighbor graph construction approach and as a k-nearest neighbor search approach is verified across different types of data at different scales, in various dimensions and under different metrics.
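The joint search-and-build loop described in this abstract can be sketched as follows. This is an illustrative sketch, not the authors' algorithm: the greedy best-first search, the random seed nodes, the reverse-edge update, and the names `greedy_search` and `build_knn_graph` are assumptions made here for demonstration, and Euclidean distance is used only as an example metric.

```python
import heapq
import math
import random


def greedy_search(graph, points, query, k, n_seeds=3, rng=None):
    """Approximate k-NN of `query` by best-first search on the partial graph.

    `graph` maps node id -> sorted list of (distance, neighbor id).
    """
    rng = rng or random.Random(0)
    nodes = list(graph)
    if not nodes:
        return []
    seeds = rng.sample(nodes, min(n_seeds, len(nodes)))  # assumed entry points
    visited = set(seeds)
    frontier = [(math.dist(points[s], query), s) for s in seeds]
    heapq.heapify(frontier)
    best = sorted(frontier)[:k]  # current top-k as (distance, id), ascending
    while frontier:
        d, node = heapq.heappop(frontier)
        if len(best) == k and d > best[-1][0]:
            break  # closest unexpanded candidate cannot improve the top-k
        for _, nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = math.dist(points[nb], query)
            if len(best) < k or d_nb < best[-1][0]:
                heapq.heappush(frontier, (d_nb, nb))
                best.append((d_nb, nb))
                best.sort()
                del best[k:]
    return best


def build_knn_graph(points, k, seed=0):
    """Insert samples one by one, searching the graph built so far."""
    rng = random.Random(seed)
    graph = {}
    for new_id, p in enumerate(points):
        found = greedy_search(graph, points, p, k, rng=rng)
        graph[new_id] = list(found)
        for d, nb in found:  # reverse update: new node may be a neighbor too
            graph[nb].append((d, new_id))
            graph[nb].sort()
            del graph[nb][k:]
    return graph


rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(200)]
graph = build_knn_graph(points, k=5)
```

Because every insertion is just a search followed by a local update, adding a new sample later uses the same two steps, which is the sense in which an online construction supports dynamic updates.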

preprint2020arXiv

Computation Rate Maximization in Wireless Powered MEC with Spread Spectrum Multiple Access

The integration of mobile edge computing (MEC) and wireless power transfer (WPT) technologies has recently emerged as an effective solution for extending battery life and increasing the computing power of wireless devices. In this paper, we study the resource allocation problem of a multi-user wireless powered MEC system, where the users share the wireless channel via direct sequence code division multiple access (DS-CDMA). In particular, we are interested in jointly optimizing the task offloading decisions and resource allocation, to maximize the weighted sum computation rate of all the users in the network. The optimization problem is formulated as a mixed integer non-linear programming (MINLP). For a given offloading user set, we implement an efficient Fractional Programming (FP) approach to mitigate the multi-user interference in the uplink task offloading. On top of that, we then propose a Stochastic Local Search algorithm to optimize the offloading decisions. Simulation results show that the proposed method can effectively enhance the computing performance of a wireless powered MEC with spread spectrum multiple access compared to other representative benchmark methods.

preprint2020arXiv

Computational neurology: Computational modeling approaches in dementia

Dementia is a collection of symptoms associated with impaired cognition that impedes normal everyday functioning. Dementia, with Alzheimer's disease constituting its most common type, is highly complex in terms of etiology and pathophysiology. A more quantitative or computational attitude towards dementia research, or more generally in neurology, is becoming necessary: Computational Neurology. We provide a focused review of some computational approaches that have been developed and applied to the study of dementia, particularly Alzheimer's disease. Both mechanistic modeling and data-driven approaches, including AI or machine learning, are discussed. Linkage to clinical decision support systems for dementia diagnosis is also discussed.

preprint2020arXiv

Development of Hyperthermia Measurable Fiber Radiometric Thermometer for Monitoring Tissue Temperature during Thermo-therapy

Temperature monitoring is extremely important during thermotherapy. Fiber-optic temperature sensors are preferred because of their flexibility and immunity to electromagnetic interference. Although many types of fiber-optic sensors have been developed, it remains challenging to adopt them clinically. Here, we report a silica fiber-based radiometric thermometer using a low-cost extended InGaAs detector to detect black body radiation between 1.7 µm and 2.4 µm. For the first time, this silica fiber-based thermometer is capable of measuring temperature down to 35°C, making it suitable for seamless integration with current silica fiber catheters used in laser interstitial thermotherapy to monitor hyperthermia during surgery. The feasibility, capability, and sensitivity of tracking tissue temperature variation were demonstrated through ex vivo and in vivo tissue studies. The technology is promising for translation into clinics after further improving the signal-to-noise ratio.

preprint2020arXiv

Iterative Context-Aware Graph Inference for Visual Dialog

Visual dialog is a challenging task that requires the comprehension of the semantic dependencies among implicit visual and textual contexts. This task can be viewed as relation inference in a graphical model with sparse contexts and unknown graph structure (relation descriptor), and how to model the underlying context-aware relation inference is critical. To this end, we propose a novel Context-Aware Graph (CAG) neural network. Each node in the graph corresponds to a joint semantic feature, including both object-based (visual) and history-related (textual) context representations. The graph structure (relations in dialog) is iteratively updated using an adaptive top-$K$ message passing mechanism. Specifically, in every message passing step, each node selects the $K$ most relevant nodes, and only receives messages from them. Then, after the update, we impose graph attention on all the nodes to get the final graph embedding and infer the answer. In CAG, each node has dynamic relations in the graph (different related $K$ neighbor nodes), and only the most relevant nodes contribute to the context-aware relational graph inference. Experimental results on VisDial v0.9 and v1.0 datasets show that CAG outperforms comparative methods. Visualization results further validate the interpretability of our method.
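A single top-$K$ message passing step of the kind described above might look like the sketch below. It is a simplified illustration under stated assumptions: relevance is taken to be a plain dot product between node features, messages are softmax-weighted sums added residually to the receiving node, and the function name `top_k_message_passing` is hypothetical. The abstract does not give the actual scoring or update functions used in CAG.

```python
import math


def top_k_message_passing(features, k):
    """One message passing step where each node listens to its K best peers.

    Assumptions (not from the source): dot-product relevance scores,
    softmax weighting over the selected K nodes, residual feature update.
    """
    n = len(features)
    updated, selected = [], []
    for i in range(n):
        # Relevance of every other node j to node i.
        scores = [
            (sum(a * b for a, b in zip(features[i], features[j])), j)
            for j in range(n) if j != i
        ]
        top = sorted(scores, reverse=True)[:k]  # keep the K most relevant
        peak = max(s for s, _ in top)
        weights = [math.exp(s - peak) for s, _ in top]  # stable softmax
        total = sum(weights)
        message = [
            sum(w / total * features[j][d] for w, (_, j) in zip(weights, top))
            for d in range(len(features[i]))
        ]
        updated.append([x + m for x, m in zip(features[i], message)])
        selected.append([j for _, j in top])
    return updated, selected


features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
updated, selected = top_k_message_passing(features, k=1)
```

Repeating this step with the updated features lets each node's selected neighbors change between iterations, which corresponds to the dynamic relations the abstract mentions.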

preprint2020arXiv

Joint Beamforming and Power Control for Throughput Maximization in IRS-assisted MISO WPCNs

Intelligent reflecting surface (IRS) is an emerging technology to enhance the energy- and spectrum-efficiency of wireless powered communication networks (WPCNs). In this paper, we investigate an IRS-assisted multiuser multiple-input single-output (MISO) WPCN, where the single-antenna wireless devices (WDs) harvest wireless energy in the downlink (DL) and transmit their information simultaneously in the uplink (UL) to a common hybrid access point (HAP) equipped with multiple antennas. Our goal is to maximize the weighted sum rate (WSR) of all the energy-harvesting users. To make full use of the beamforming gain provided by both the HAP and the IRS, we jointly optimize the active beamforming of the HAP and the reflecting coefficients (passive beamforming) of the IRS in both DL and UL transmissions, as well as the transmit power of the WDs to mitigate the inter-user interference at the HAP. To tackle the challenging optimization problem, we first consider fixing the passive beamforming, and converting the remaining joint active beamforming and user transmit power control problem into an equivalent weighted minimum mean square error (WMMSE) problem, where we solve it using an efficient block-coordinate descent (BCD) method. Then, we fix the active beamforming and user transmit power, and optimize the passive beamforming coefficients of the IRS in both the DL and UL using a semidefinite relaxation (SDR) method. Accordingly, we apply a block-structured optimization (BSO) method to update the two sets of variables alternately. Numerical results show that the proposed joint optimization achieves significant performance gain over other representative benchmark methods and effectively improves the throughput performance in multiuser MISO WPCNs.

preprint2020arXiv

Leveraging Historical Interaction Data for Improving Conversational Recommender System

Recently, the conversational recommender system (CRS) has become an emerging and practical research topic. Most of the existing CRS methods focus on learning effective preference representations for users from conversation data alone. In contrast, we take a new perspective and leverage historical interaction data for improving CRS. For this purpose, we propose a novel pre-training approach that integrates both the item-based preference sequence (from historical interaction data) and the attribute-based preference sequence (from conversation data). We carefully design two pre-training tasks to enhance information fusion between item- and attribute-based preference. To improve the learning performance, we further develop an effective negative sample generator which can produce high-quality negative samples. Experimental results on two real-world datasets demonstrate the effectiveness of our approach for improving CRS.

preprint2020arXiv

Privacy for All: Demystify Vulnerability Disparity of Differential Privacy against Membership Inference Attack

Machine learning algorithms, when applied to sensitive data, pose a potential threat to privacy. A growing body of prior work has demonstrated that membership inference attack (MIA) can disclose specific private information in the training data to an attacker. Meanwhile, the algorithmic fairness of machine learning has increasingly caught attention from both academia and industry. Algorithmic fairness ensures that the machine learning models do not discriminate against a particular demographic group of individuals (e.g., black and female people). Given that MIA is indeed a learning model, it raises a serious concern whether MIA "fairly" treats all groups of individuals equally. In other words, is a particular group more vulnerable to MIA than other groups? This paper examines the algorithmic fairness issue in the context of MIA and its defenses. First, for fairness evaluation, it formalizes the notion of vulnerability disparity (VD) to quantify the difference of MIA treatment on different demographic groups. Second, it evaluates VD on four real-world datasets, and shows that VD indeed exists in these datasets. Third, it examines the impacts of differential privacy (DP), as a defense mechanism against MIA, on VD. The results show that although DP brings significant change to VD, it cannot eliminate VD completely. Therefore, fourth, it designs a new mitigation algorithm named FAIRPICK to reduce VD. An extensive set of experimental results demonstrates that FAIRPICK can effectively reduce VD both with and without DP deployment.

preprint2020arXiv

Residual Block-based Multi-Label Classification and Localization Network with Integral Regression for Vertebrae Labeling

Accurate identification and localization of the vertebrae in CT scans is a critical and standard preprocessing step for clinical spinal diagnosis and treatment. Existing methods are mainly based on the integration of multiple neural networks, and most of them use the Gaussian heat map to locate the vertebrae's centroid. However, the process of obtaining the vertebrae's centroid coordinates using heat maps is non-differentiable, so it is impossible to train the network to label the vertebrae directly. Therefore, for end-to-end differentiable training of vertebra coordinates on CT scans, a robust and accurate automatic vertebral labeling algorithm is proposed in this study. Firstly, a novel residual-based multi-label classification and localization network is developed, which can not only capture multi-scale features but also utilize the residual module and skip connection to fuse the multi-level features. Secondly, to solve the problem that the process of finding coordinates is non-differentiable and the spatial structure is not destructible, an integral regression module is used in the localization network. It combines the advantages of heat map representation and direct coordinate regression to achieve end-to-end training, and is compatible with any heat map-based key point detection method for medical images. Finally, multi-label classification of vertebrae is carried out, which uses bidirectional long short-term memory (Bi-LSTM) to enhance the learning of long contextual information and improve the classification performance. The proposed method is evaluated on a challenging dataset and the results are significantly better than the state-of-the-art methods (mean localization error <3mm).
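The integral regression step mentioned above replaces the non-differentiable arg-max over a heat map with an expected coordinate under a softmax distribution. The sketch below shows that general idea on a small 2D heat map; the function name `integral_regression_2d` and the toy values are illustrative assumptions, and the paper's actual module (dimensionality, normalization, network integration) is not specified in the abstract.

```python
import math


def integral_regression_2d(heatmap):
    """Expected (x, y) coordinate of a heat map under a softmax distribution.

    Unlike arg-max, this weighted average is a smooth function of the
    heat map values, so gradients can flow back to the network that
    produced them.
    """
    peak = max(max(row) for row in heatmap)
    # Softmax over all cells (shifted by the peak for numerical stability).
    exp_map = [[math.exp(v - peak) for v in row] for row in heatmap]
    total = sum(sum(row) for row in exp_map)
    x = sum(col * e for row in exp_map for col, e in enumerate(row)) / total
    y = sum(r * e for r, row in enumerate(exp_map) for e in row) / total
    return x, y


# Toy 3x3 heat map with a strong response at row 1, column 2.
heatmap = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 12.0],
    [0.0, 0.0, 0.0],
]
x, y = integral_regression_2d(heatmap)
```

For a sharply peaked heat map the expected coordinate lies close to the arg-max location, while flatter heat maps pull it toward the center of mass.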

preprint2020arXiv

Reusing Wireless Power Transfer for Backscatter-assisted Relaying in WPCNs

User cooperation is an effective technique to tackle the severe near-far user unfairness problem in wireless powered communication networks (WPCNs). In this paper, we consider a WPCN where two collaborating wireless devices (WDs) first harvest wireless energy from a hybrid access point (HAP) and then transmit their information to the HAP. The WD with the stronger WD-to-HAP channel helps relay the message of the other weaker user. In particular, we exploit the use of ambient backscatter communication during the wireless energy transfer phase, where the weaker user backscatters the received energy signal to transmit its information to the relay user in a passive manner. By doing so, the relay user can reuse the energy signal for simultaneous energy harvesting and information decoding (e.g., using an energy detector). Compared to active information transmission in conventional WPCNs, the proposed method effectively saves the energy and time consumed by the weaker user on information transmission during cooperation. With the proposed backscatter-assisted relaying scheme, we jointly optimize the time and power allocations on wireless energy and information transmissions to maximize the common throughput. Specifically, we derive the semi-closed-form expressions of the optimal solution and propose a low-complexity optimal algorithm to solve the joint optimization problem. By comparing with some representative benchmark methods, we simulate under extensive network setups and demonstrate that the proposed cooperation method effectively improves the throughput performance in WPCNs.

preprint2020arXiv

S^3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization

Recently, significant progress has been made in sequential recommendation with deep learning. Existing neural sequential recommendation models usually rely on the item prediction loss to learn model parameters or data representations. However, the model trained with this loss is prone to suffer from data sparsity problem. Since it overemphasizes the final performance, the association or fusion between context data and sequence data has not been well captured and utilized for sequential recommendation. To tackle this problem, we propose the model S^3-Rec, which stands for Self-Supervised learning for Sequential Recommendation, based on the self-attentive neural architecture. The main idea of our approach is to utilize the intrinsic data correlation to derive self-supervision signals and enhance the data representations via pre-training methods for improving sequential recommendation. For our task, we devise four auxiliary self-supervised objectives to learn the correlations among attribute, item, subsequence, and sequence by utilizing the mutual information maximization (MIM) principle. MIM provides a unified way to characterize the correlation between different types of data, which is particularly suitable in our scenario. Extensive experiments conducted on six real-world datasets demonstrate the superiority of our proposed method over existing state-of-the-art methods, especially when only limited training data is available. Besides, we extend our self-supervised learning method to other recommendation models, which also improve their performance.

preprint2020arXiv

Sample-efficient benchmarking of multi-photon interference on a boson sampler in the sparse regime

Verification of a quantum advantage in the presence of noise is a key open problem in the study of near-term quantum devices. In this work, we show how to assess the quality of photonic interference in a linear optical quantum device (boson sampler) by using a maximum likelihood method to measure the strength at which various noise sources are present in the experiment. This allows us to use a sparse set of samples to test whether a given boson sampling experiment meets known upper bounds on the level of noise permissible to demonstrate a quantum advantage. Furthermore, this method allows us to monitor the evolution of noise in real time, creating a valuable diagnostic tool. Finally, we observe that sources of noise in the experiment compound, meaning that the observed value of the mutual photon indistinguishability, which is the main imperfection in our study, is an effective value taking into account all sources of error in the experiment.

preprint2020arXiv

Sequential Recommendation with Self-Attentive Multi-Adversarial Network

Recently, deep learning has made significant progress in the task of sequential recommendation. Existing neural sequential recommenders typically adopt a generative approach trained with Maximum Likelihood Estimation (MLE). When context information (called factor) is involved, it is difficult to analyze when and how each individual factor would affect the final recommendation performance. For this purpose, we take a new perspective and introduce adversarial learning to sequential recommendation. In this paper, we present a Multi-Factor Generative Adversarial Network (MFGAN) for explicitly modeling the effect of context information on sequential recommendation. Specifically, our proposed MFGAN has two kinds of modules: a Transformer-based generator taking user behavior sequences as input to recommend the possible next items, and multiple factor-specific discriminators to evaluate the generated sub-sequence from the perspectives of different factors. To learn the parameters, we adopt the classic policy gradient method, and utilize the reward signal of discriminators for guiding the learning of the generator. Our framework is flexible enough to incorporate multiple kinds of factor information, and is able to trace how each factor contributes to the recommendation decision over time. Extensive experiments conducted on three real-world datasets demonstrate the superiority of our proposed model over the state-of-the-art methods, in terms of effectiveness and interpretability.

preprint2020arXiv

Tackling Morpion Solitaire with AlphaZero-like Ranked Reward Reinforcement Learning

Morpion Solitaire is a popular single-player game, played with paper and pencil. Due to its large state space (on the order of the game of Go), traditional search algorithms, such as MCTS, have not been able to find good solutions. A later algorithm, Nested Rollout Policy Adaptation, was able to find a new record of 82 steps, albeit with large computational resources. After achieving this record, to the best of our knowledge, there has been no further progress reported for about a decade. In this paper we take the recent impressive performance of deep self-learning reinforcement learning approaches from AlphaGo/AlphaZero as inspiration to design a searcher for Morpion Solitaire. A challenge of Morpion Solitaire is that the state space is sparse: there are few win/loss signals. Instead, we use an approach known as ranked reward to create a reinforcement learning self-play framework for Morpion Solitaire. This enables us to find medium-quality solutions with reasonable computational effort. Our record is a 67-step solution, which is very close to the human best (68), without any other adaptation to the problem than using ranked reward. We list many further avenues for potential improvement.

preprint2020arXiv

Throughput Optimization of Intelligent Reflecting Surface Assisted User Cooperation in WPCNs

An intelligent reflecting surface (IRS) can effectively enhance the energy and spectral efficiency of wireless communication systems through the use of a large number of low-cost passive reflecting elements. In this paper, we investigate throughput optimization of IRS-assisted user cooperation in a wireless powered communication network (WPCN), where two WDs harvest wireless energy and transmit information to a common hybrid access point (HAP). In particular, the two WDs first exchange their independent information with each other and then form a virtual antenna array to transmit jointly to the HAP. We aim to maximize the common (minimum) throughput performance by jointly optimizing the transmit time and power allocations of the two WDs on wireless energy and information transmissions and the passive array coefficients on reflecting the wireless energy and information signals. By comparing with some existing benchmark schemes, our results show that the proposed IRS-assisted user cooperation method can effectively improve the throughput performance of cooperative transmission in WPCNs.

preprint2020arXiv

Warm-Start AlphaZero Self-Play Search Enhancements

Recently, AlphaZero has achieved landmark results in deep reinforcement learning, by providing a single self-play architecture that learned three different games at superhuman level. AlphaZero is a large and complicated system with many parameters, and success requires much compute power and fine-tuning. Reproducing results in other games is a challenge, and many researchers are looking for ways to improve results while reducing computational demands. AlphaZero's design is purely based on self-play and makes no use of labeled expert data or domain-specific enhancements; it is designed to learn from scratch. We propose a novel approach to deal with this cold-start problem by employing simple search enhancements at the beginning phase of self-play training, namely Rollout, Rapid Action Value Estimate (RAVE) and dynamically weighted combinations of these with the neural network, and Rolling Horizon Evolutionary Algorithms (RHEA). Our experiments indicate that most of these enhancements improve the performance of their baseline player in three different (small) board games, with especially RAVE based variants playing strongly.

preprint2019arXiv

Nonreciprocal transition between two nondegenerate energy levels

Stimulated emission and absorption are two fundamental processes of light-matter interaction, and the coefficients of the two processes should be equal in general. However, we describe a generic method to realize a significant difference between the stimulated emission and absorption coefficients of two nondegenerate energy levels, which we refer to as nonreciprocal transition. As a simple implementation, a cyclic three-level atom system, comprising two nondegenerate energy levels and one auxiliary energy level, is employed to show nonreciprocal transition via a combination of synthetic magnetism and reservoir engineering. Moreover, a single-photon nonreciprocal transporter is proposed using two one-dimensional semi-infinite coupled-resonator waveguides connected by an atom with nonreciprocal transition effect. Our work opens up a route to design atom-mediated nonreciprocal devices in a wide range of physical systems.

preprint2018arXiv

Generalized Su-Schrieffer-Heeger model in one dimensional optomechanical arrays

We propose an implementation of a generalized Su-Schrieffer-Heeger (SSH) model based on optomechanical arrays. The topological properties of the generalized SSH model depend on the effective optomechanical interactions enhanced by strong driving optical fields. Three phases including one trivial and two distinct topological phases are found in the generalized SSH model. The phase transition can be observed by tuning the strengths and phases of the effective optomechanical interactions via adjusting the external driving fields. Moreover, four types of edge states can be created in the generalized SSH model of an open chain under single-particle excitation, and the dynamical behaviors of the excitation in the open chain are related to the topological properties under the periodic boundary condition. We show that the edge states can be pumped adiabatically along the optomechanical arrays by periodically modulating the amplitude and frequency of the driving fields. The generalized SSH model based on the optomechanical arrays provides a tunable platform to engineer topological phases for photons and phonons, which may have potential applications in controlling the transport of photons and phonons.

preprint2018arXiv

Minimum Margin Loss for Deep Face Recognition

Face recognition has achieved great progress owing to the fast development of deep neural networks in the past few years. As an important part of deep neural networks, a number of loss functions have been proposed that significantly improve the state-of-the-art methods. In this paper, we propose a new loss function called Minimum Margin Loss (MML), which aims at enlarging the margin of overly close class centre pairs so as to enhance the discriminative ability of the deep features. MML supervises the training process together with the Softmax Loss and the Centre Loss, and also makes up for the shortcomings of Softmax + Centre Loss. The experimental results on the MegaFace, LFW and YTF datasets show that the proposed method achieves state-of-the-art performance, which demonstrates the effectiveness of the proposed MML.

preprint2018arXiv

Nonreciprocal photon blockade via quadratic optomechanical coupling

We propose to manipulate the statistical properties of photon transport nonreciprocally via quadratic optomechanical coupling. We present a scheme to generate quadratic optomechanical interactions in the normal optical modes of a whispering-gallery-mode (WGM) optomechanical system by eliminating the linear optomechanical couplings via anticrossing of different modes. By optically pumping the WGM optomechanical system in one direction, the effective quadratic optomechanical coupling in that direction will be enhanced significantly, and nonreciprocal photon blockade will be observed consequently. Our proposal has potential applications for on-chip nonreciprocal single-photon devices.