Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
47 works
0 followers
27 topics
4 close collaborators

Published work

47 published item(s)

preprint 2026 arXiv

A3: Android Agent Arena for Mobile GUI Agents with Essential-State Procedural Evaluation

The advancement of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) has catalyzed the development of mobile graphical user interface (GUI) AI agents, which are designed to autonomously perform tasks on mobile devices. However, a significant gap persists in mobile GUI agent evaluation: existing benchmarks predominantly rely on either static frame assessments such as AndroidControl or offline static apps such as AndroidWorld, and thus fail to capture agent performance in dynamic, real-world online mobile apps. To address this gap, we present Android Agent Arena (A3), a novel "essential-state" based procedural evaluation system for mobile GUI agents. A3 introduces a benchmark of 100 tasks derived from 20 widely used, dynamic online apps across 20 categories from the Google Play Store, ensuring comprehensive evaluation. A3 also presents a novel "essential-state" based procedural evaluation method that leverages MLLMs as reward models to progressively verify task completion and process achievement. This evaluation approach addresses the limitations of traditional function-based evaluation methods on online dynamic apps. Furthermore, A3 includes a toolkit to streamline Android device interaction, reset the online environment and apps, and facilitate data collection from both human and agent demonstrations. The complete A3 system, including the benchmark and tools, will be publicly released to provide a robust foundation for future research and development in mobile GUI agents.

preprint 2026 arXiv

SRU-Pix2Pix: A Fusion-Driven Generator Network for Medical Image Translation with Few-Shot Learning

Magnetic Resonance Imaging (MRI) provides detailed tissue information, but its clinical application is limited by long acquisition time, high cost, and restricted resolution. Image translation has recently gained attention as a strategy to address these limitations. Although Pix2Pix has been widely applied in medical image translation, its potential has not been fully explored. In this study, we propose an enhanced Pix2Pix framework that integrates Squeeze-and-Excitation Residual Networks (SEResNet) and U-Net++ to improve image generation quality and structural fidelity. SEResNet strengthens critical feature representation through channel attention, while U-Net++ enhances multi-scale feature fusion. A simplified PatchGAN discriminator further stabilizes training and refines local anatomical realism. Experimental results demonstrate that under few-shot conditions with fewer than 500 images, the proposed method achieves consistent structural fidelity and superior image quality across multiple intra-modality MRI translation tasks, showing strong generalization ability. These results suggest an effective extension of Pix2Pix for medical image translation.

preprint 2026 arXiv

What Will Happen Next: Large Models-Driven Deduction for Emergency Instances

Traditional simulation methods reproduce past emergency instances through preset configurations to assist people in risk assessment and emergency decision-making. However, due to the lack of randomness and diversity, existing simulation systems struggle to fully explore the potential risk, as emergency instances are scarce. In contrast, Large Models (LMs) can dynamically adjust generation strategies to introduce controllable randomness, while also possessing extensive prior knowledge and cross-domain knowledge transfer capabilities. Inspired by this, we propose the LMs-driven World Line Divergence System (WLDS), which enables diversified visualization and deduction of emergency instances in different domains. WLDS leverages LMs to deduce emergency instances in various development directions, and introduces factual calibration and logical calibration mechanisms to ensure factual accuracy and logical rigor during the deduction process. The interactive module can independently select deduction directions to avoid potential hallucinations that are difficult for the system to identify. Furthermore, by introducing the visualization module, WLDS forms simulation and deduction that combine text and images, which enhances interpretability. Extensive experiments conducted on the proposed Emergency Instances Deduction (EID) benchmark dataset demonstrate that WLDS achieves high-precision and high-fidelity simulation and deduction of emergency instances in multiple specific domains. Relevant experiments further demonstrate that WLDS can generate more emergency instance deduction data for users and provide support for better decision-making in similar emergency instances in the future.

preprint 2024 arXiv

Fact-checking based fake news detection: a review

This paper reviews and summarizes the research results on fact-based fake news from the perspectives of tasks and problems, algorithm strategies, and datasets. First, the paper systematically explains the task definition and core problems of fact-based fake news detection. Second, the paper summarizes the existing detection methods based on the algorithm principles. Third, the paper analyzes the classic and newly proposed datasets in the field, and summarizes the experimental results on each dataset. Finally, the paper summarizes the advantages and disadvantages of existing methods, proposes several challenges that methods in this field may face, and looks forward to the next stage of research. It is hoped that this paper will provide reference for subsequent work in the field.

preprint 2024 arXiv

Multiple Access Techniques for Intelligent and Multi-Functional 6G: Tutorial, Survey, and Outlook

Multiple access (MA) is a crucial part of any wireless system and refers to techniques that make use of the resource dimensions to serve multiple users/devices/machines/services, ideally in the most efficient way. Given the needs of multi-functional wireless networks for integrated communications, sensing, localization, and computing, coupled with the surge of machine learning / artificial intelligence (AI) in wireless networks, MA techniques are expected to experience a paradigm shift in 6G and beyond. In this paper, we provide a tutorial, survey and outlook of past, emerging and future MA techniques and pay particular attention to how wireless network intelligence and multi-functionality will lead to a rethinking of those techniques. The paper starts with an overview of orthogonal, physical layer multicasting, space domain, power domain, rate-splitting, code domain MAs, and other domains, and highlights the importance of researching universal multiple access to shrink instead of grow the knowledge tree of MA schemes by providing a unified understanding of MA schemes across all resource dimensions. It then jumps into rethinking MA schemes in the era of wireless network intelligence, covering AI for MA such as AI-empowered resource allocation, optimization, channel estimation, receiver designs, user behavior predictions, and MA for AI such as federated learning/edge intelligence and over-the-air computation. We then discuss MA for network multi-functionality and the interplay between MA and integrated sensing, localization, and communications. We finish by studying MA for emerging intelligent applications before presenting a roadmap toward 6G standardization. We also point out numerous directions that are promising for future research.

preprint 2023 arXiv

Switchable Giant Bulk Photocurrents and Photo-spin-currents in Monolayer PT-symmetric Anti-ferromagnet MnPSe3

Converting light into steady currents and spin-currents in a two-dimensional (2D) platform is essential for future energy harvesting and spintronics. We show that giant and modulable bulk photovoltaic effects (BPVEs) can be achieved in the air-stable 2D antiferromagnet (AFM) monolayer MnPSe3, with nonlinear photoconductance > 4000 nm$\cdot\mu$A/V$^2$ and photo-spin-conductance > 2000 (nm$\cdot\mu$A/V$^2$ $\hbar$/2e) in the visible spectrum. The propagation and the spin-polarizations of photocurrents can be switched simply by rotating the Néel vector. We unveil that the PT-symmetry, mirror symmetries, and spin-orbit couplings are the keys to the observed sizable and controllable 2D BPVEs. These results provide insights into the BPVEs of 2D AFMs, and suggest that layered MnPSe3 is an outstanding 2D platform for energy devices and photo-spintronics.

preprint 2022 arXiv

A computable multipartite multimode Gaussian correlation measure and the monogamy relation for continuous-variable systems

In this paper, a computable multipartite multimode Gaussian quantum correlation measure ${\mathcal M}^{(k)}$ is proposed for any $k$-partite continuous-variable (CV) system with $k\geq 2$. ${\mathcal M}^{(k)}$ depends only on the covariance matrix of CV states, is invariant under any permutation of subsystems, is a quantification without the ancilla problem, is nonincreasing under $k$-partite local Gaussian channels (in particular, invariant under $k$-partite local Gaussian unitary operations), and vanishes on $k$-partite product states. For a $k$-partite Gaussian state $\rho$, ${\mathcal M}^{(k)}(\rho)=0$ if and only if $\rho$ is a $k$-partite product state. Thus, for the bipartite case, ${\mathcal M}={\mathcal M}^{(2)}$ is an accessible replacement for the Gaussian quantum discord and Gaussian geometric discord. Moreover, ${\mathcal M}^{(k)}$ satisfies the unification condition and the hierarchy condition that a multipartite quantum correlation measure should obey. ${\mathcal M}^{(k)}$ is not bipartite-like monogamous, but it is complete monogamous and tight complete monogamous.

preprint 2022 arXiv

Covariance-Based Joint Device Activity and Delay Detection in Asynchronous mMTC

In this letter, we study the joint device activity and delay detection problem in asynchronous massive machine-type communications (mMTC), where all active devices asynchronously transmit their preassigned preamble sequences to the base station (BS) for device identification and delay detection. We first formulate this joint detection problem as a maximum likelihood estimation problem, which depends on the received signal only through its sample covariance, and then propose efficient coordinate descent type of algorithms to solve the formulated problem. Our proposed covariance-based approach is sharply different from the existing compressed sensing (CS) approach for the same problem. Numerical results show that our proposed covariance-based approach significantly outperforms the CS approach in terms of the detection performance since our proposed approach can make better use of the BS antennas than the CS approach.

preprint 2022 arXiv

Device-Free Sensing in OFDM Cellular Network

This paper considers device-free sensing in an orthogonal frequency division multiplexing (OFDM) cellular network to enable integrated sensing and communication (ISAC). A novel two-phase sensing framework is proposed to localize the passive targets that cannot transmit/receive reference signals to/from the base stations (BSs), where the ranges of the targets are estimated based on their reflected OFDM signals to the BSs in Phase I, and the location of each target is estimated based on its ranges to different BSs in Phase II. Specifically, in Phase I, we design a model-free range estimation approach by leveraging the OFDM channel estimation technique for determining the delay values of all the two-way BS-target-BS paths, which does not rely on any BS-target channel model. In Phase II, we reveal that ghost targets may be falsely detected in some cases as all the targets reflect the same signals to the BSs, which thus do not know how to match each estimated range with the right target. Interestingly, we show that the above data association issue is not a fundamental limitation for device-free sensing: under the ideal case of perfect range estimation in Phase I, the probability for ghost targets to exist is proved to be negligible when the targets are randomly located. Moreover, under the practical case of imperfect range estimation in Phase I, we propose an efficient algorithm for joint data association and target localization in Phase II. Numerical results show that our proposed two-phase framework can achieve very high accuracy in the localization of passive targets, which increases with the system bandwidth.

preprint 2022 arXiv

Efficiently and Globally Solving Joint Beamforming and Compression Problem in the Cooperative Cellular Network via Lagrangian Duality

Consider the joint beamforming and quantization problem in the cooperative cellular network, where multiple relay-like base stations (BSs) connected to the central processor (CP) via rate-limited fronthaul links cooperatively serve the users. This problem can be formulated as the minimization of the total transmit power, subject to all users' signal-to-interference-plus-noise-ratio (SINR) constraints and all relay-like BSs' fronthaul rate constraints. In this paper, we first show that there is no duality gap between the considered problem and its Lagrangian dual by showing the tightness of the semidefinite relaxation (SDR) of the considered problem. Then we propose an efficient algorithm based on Lagrangian duality for solving the considered problem. The proposed algorithm judiciously exploits the special structure of the Karush-Kuhn-Tucker (KKT) conditions of the considered problem and finds the solution that satisfies the KKT conditions via two fixed-point iterations. The proposed algorithm is highly efficient (as evaluating the functions in both fixed-point iterations is computationally cheap) and is guaranteed to find the global solution of the problem. Simulation results show the efficiency and the correctness of the proposed algorithm.

preprint 2022 arXiv

Exploiting Temporal Side Information in Massive IoT Connectivity

This paper considers the joint device activity detection and channel estimation problem in a massive Internet of Things (IoT) connectivity system, where a large number of IoT devices exist but merely a random subset of them become active for short-packet transmission in each coherence block. In particular, we propose to leverage the temporal correlation in device activity, e.g., a device active in the previous coherence block is more likely to be still active in the current coherence block, to improve the detection and estimation performance. However, it is challenging to utilize this temporal correlation as side information (SI), which relies on the knowledge about the exact statistical relation between the estimated activity pattern for the previous coherence block (which may be imperfect with unknown error) and the true activity pattern in the current coherence block. To tackle this challenge, we establish a novel SI-aided multiple measurement vector approximate message passing (MMV-AMP) framework. Specifically, thanks to the state evolution of the MMV-AMP algorithm, the correlation between the activity pattern estimated by the MMV-AMP algorithm in the previous coherence block and the real activity pattern in the current coherence block is quantified explicitly. Based on the well-defined temporal correlation, we further manage to embed this useful SI into the denoiser design under the MMV-AMP framework. Specifically, the SI-based soft-thresholding denoisers with binary thresholds and the SI-based minimum mean-squared error (MMSE) denoisers are characterized for the cases without and with the knowledge of the channel distribution, respectively. Numerical results are given to show the significant gain in device activity detection and channel estimation performance brought by our proposed SI-aided MMV-AMP framework.

preprint 2022 arXiv

FRIH: Fine-grained Region-aware Image Harmonization

Image harmonization aims to generate a more realistic appearance of foreground and background for a composite image. Existing methods perform the same harmonization process for the whole foreground. However, the implanted foreground always contains different appearance patterns. All the existing solutions ignore the differences between color blocks and lose some specific details. Therefore, we propose a novel global-local two-stage framework for Fine-grained Region-aware Image Harmonization (FRIH), which is trained end-to-end. In the first stage, the whole input foreground mask is used to make a global coarse-grained harmonization. In the second stage, we adaptively cluster the input foreground mask into several submasks by the corresponding pixel RGB values in the composite image. Each submask is concatenated with the coarsely adjusted image and fed into a lightweight cascaded module, adjusting the global harmonization performance according to the region-aware local feature. Moreover, we design a fusion prediction module that fuses features from all the cascaded decoder layers to generate the final result, which can utilize the different degrees of harmonization results comprehensively. Without bells and whistles, our FRIH algorithm achieves the best performance on the iHarmony4 dataset (PSNR is 38.19 dB) with a lightweight model. Our model has only 11.98 M parameters, far fewer than the existing methods.

preprint 2022 arXiv

Massive MIMO Communication with Intelligent Reflecting Surface

This paper studies the feasibility of deploying intelligent reflecting surfaces (IRSs) in massive MIMO (multiple-input multiple-output) systems to improve the performance of users in the service dead zone. To reduce the channel training overhead, we advocate a novel protocol for the uplink communication in the IRS-assisted massive MIMO systems. Under this protocol, the IRS reflection coefficients are optimized based on the channel covariance matrices, which are generally fixed for many coherence blocks, to boost the long-term performance. Then, given the IRS reflecting coefficients, the BS beamforming vectors are designed in each coherence block based on the effective channel of each user, which is the superposition of its direct and reflected user-IRS-BS channels, to improve the instantaneous performance. Since merely the user effective channels are estimated in each coherence block, the training overhead of this protocol is the same as that in the legacy wireless systems without IRSs. Moreover, in the asymptotic regime that the numbers of IRS elements and BS antennas both go to infinity with a fixed ratio, we manage to first characterize the minimum mean-squared error (MMSE) estimators of the user effective channels and then quantify the closed-form user achievable rates as functions of channel covariance matrices with channel training overhead and estimation error taken into account. Interestingly, it is shown that the properties of channel hardening and favorable propagation still hold for the user effective channels, and satisfactory user rates are thus achievable even if simple BS beamforming solutions, e.g., maximal-ratio combining, are employed. Finally, thanks to the rate characterization, we design a low-complexity algorithm to optimize the IRS reflection coefficients based on channel covariance matrices.

preprint 2022 arXiv

Networked Sensing in 6G Cellular Networks: Opportunities and Challenges

Radar and wireless communication are widely acknowledged as the two most successful applications of radio technology over the past decades. Recently, there has been a trend in both academia and industry to achieve integrated sensing and communication (ISAC) in one system by utilizing a common radio spectrum and the same hardware platform. This article discusses the possibility of exploiting the future sixth-generation (6G) cellular network to realize ISAC. Our vision is that the cellular base stations (BSs) deployed all over the world can be transformed into a powerful sensor to provide high-resolution localization services. Specifically, motivated by the joint encoding/decoding gain in multi-cell coordinated communication, we advocate the adoption of the networked sensing technique in the 6G network to achieve the above goal, where the BSs can share the sensing information with each other for jointly estimating the locations and velocities of the targets. Several opportunities and challenges to realize networked sensing in the 6G era are revealed in this article. Moreover, the future research directions for this promising trend are outlined as well.

preprint 2022 arXiv

Scalable Multi-view Clustering with Graph Filtering

With the explosive growth of multi-source data, multi-view clustering has attracted great attention in recent years. Most existing multi-view methods operate in raw feature space and heavily depend on the quality of the original feature representation. Moreover, they are often designed for feature data and ignore the rich topology structure information. Accordingly, in this paper, we propose a generic framework to cluster both attribute and graph data with heterogeneous features. It is capable of exploring the interplay between feature and structure. Specifically, we first adopt a graph filtering technique to eliminate high-frequency noise and achieve a clustering-friendly smooth representation. To handle the scalability challenge, we develop a novel sampling strategy to improve the quality of anchors. Extensive experiments on attribute and graph benchmarks demonstrate the superiority of our approach with respect to state-of-the-art approaches.
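
The graph filtering step mentioned in the abstract can be illustrated with a generic low-pass filter. This is a minimal sketch only: the abstract does not give the filter, so the form `(I - L/2)^k` with the symmetric normalized Laplacian `L` is an assumption (a common choice in the graph-filtering literature), and the function name `low_pass_filter` and the parameter `k` are hypothetical.

```python
def low_pass_filter(adj, features, k=2):
    """Apply (I - L/2)^k to node features, with L = I - D^{-1/2} A D^{-1/2}.

    adj      : symmetric adjacency matrix as a list of lists
    features : one feature row per node
    k        : number of filtering passes (assumed, not from the abstract)
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    # Isolated nodes get a zero scaling factor to avoid division by zero.
    inv_sqrt = [1.0 / d ** 0.5 if d > 0 else 0.0 for d in deg]
    x = [list(row) for row in features]
    for _ in range(k):
        new_x = []
        for i in range(n):
            row = []
            for f in range(len(x[i])):
                # Normalized neighbourhood aggregate: (D^{-1/2} A D^{-1/2} x)_i
                neigh = sum(
                    inv_sqrt[i] * adj[i][j] * inv_sqrt[j] * x[j][f]
                    for j in range(n)
                )
                # (I - L/2) x = 0.5 * (x + D^{-1/2} A D^{-1/2} x)
                row.append(0.5 * (x[i][f] + neigh))
            new_x.append(row)
        x = new_x
    return x
```

On a two-node graph with one edge, a signal that flips sign across the edge (the highest graph frequency) is removed in one pass, while a constant signal passes through unchanged.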

preprint 2022 arXiv

STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation

Video Instance Segmentation (VIS) is a task that simultaneously requires classification, segmentation, and instance association in a video. Recent VIS approaches rely on sophisticated pipelines to achieve this goal, including RoI-related operations or 3D convolutions. In contrast, we present a simple and efficient single-stage VIS framework based on the instance segmentation method CondInst by adding an extra tracking head. To improve instance association accuracy, a novel bi-directional spatio-temporal contrastive learning strategy for tracking embedding across frames is proposed. Moreover, an instance-wise temporal consistency scheme is utilized to produce temporally coherent results. Experiments conducted on the YouTube-VIS-2019, YouTube-VIS-2021, and OVIS-2021 datasets validate the effectiveness and efficiency of the proposed method. We hope the proposed framework can serve as a simple and strong alternative for many other instance-level video association tasks.

preprint 2022 arXiv

Trajectory Optimization of Cellular-Connected UAV for Information Collection and Transmission

In this paper, we consider a cellular-connected unmanned aerial vehicle (UAV) with an information collection and transmission mission for multiple ground targets. Specifically, the UAV is required to collect a fixed amount of information of each target by hovering at a pre-determined location (via e.g., photography/videography/sensing), and transmit all the collected information to the cellular network during its flight. We aim to jointly optimize the UAV's trajectory and the information collection order of the ground targets to minimize the mission completion time. The formulated problem is NP-hard due to the need of visiting the information collection locations for all targets; moreover, the UAV's trajectories over different time durations are coupled in non-convex constraints for ensuring information transmission completion. To handle this difficult problem, we first propose a structured communication protocol between the UAV and the cellular network, which decouples the UAV's trajectory designs in different time durations. Then, under the proposed protocol, we establish an equivalent graph-based model for the considered problem, and devise a low-complexity algorithm for finding an approximate solution by exploiting the problem structure and leveraging graph theory. Numerical results show that our proposed design achieves efficient information collection and transmission, and outperforms various benchmark schemes.

preprint 2022 arXiv

Trilateration-Based Device-Free Sensing: Two Base Stations and One Passive IRS Are Sufficient

The classic trilateration technique can localize each target based on its distances to three anchors with known coordinates. Usually, this technique requires all the anchors and targets, e.g., the satellites and the mobile phones in a Global Navigation Satellite System (GNSS), to actively transmit/receive radio signals such that the delay of the one-way radio signal propagated between each anchor and each target can be measured. Excitingly, this paper shows that the trilateration technique can be generalized to the scenario where one of the three anchors and all the targets merely reflect the radio signals passively as in radar networks, even if the propagation delay between the passive IRS and the passive targets is difficult to measure directly, and the data association issue for multi-sensor multi-target tracking arises. Specifically, we consider device-free sensing in a cellular network consisting of two base stations (BSs), one passive intelligent reflecting surface (IRS), and multiple passive targets, to realize integrated sensing and communication (ISAC). The two BSs transmit orthogonal frequency division multiplexing (OFDM) signals in the downlink and estimate the locations of the targets based on their reflected signals via/not via the IRS. We propose an efficient trilateration-based strategy that first estimates the distances of each target to the two BSs and the IRS and then localizes the targets. Numerical results show that the considered networked sensing architecture with heterogeneous anchors can outperform its counterpart with three BSs.
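
The classic trilateration step described in the first sentence of the abstract can be sketched as follows. This is a minimal 2D illustration under ideal, noise-free distances; it covers only the textbook three-anchor case, not the paper's IRS-based strategy or its data association handling. The function name `trilaterate` is hypothetical.

```python
def trilaterate(anchors, distances):
    """Locate a 2D point from its distances to three anchors with known coordinates."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two cancels the
    # quadratic terms and leaves a 2x2 linear system in (x, y).
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Usage: anchors at three corners, target at (3, 4).
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
distances = [5.0, 65 ** 0.5, 45 ** 0.5]
print(trilaterate(anchors, distances))
```

With noisy distances, the three circles generally do not meet in one point, and a least-squares fit would replace the exact solve.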

preprint 2021 arXiv

An Efficient Algorithm for Device Detection and Channel Estimation in Asynchronous IoT Systems

A great amount of effort has recently been devoted to the joint device activity detection and channel estimation problem in massive machine-type communications. This paper targets two practical issues along this line that have not been addressed before: asynchronous transmission from uncoordinated users and efficient algorithms for real-time implementation in systems with a massive number of devices. Specifically, this paper considers a practical system where the preamble sent by each active device is delayed by some unknown number of symbols due to the lack of coordination. We manage to cast the problem of detecting the active devices and estimating their delays and channels into a group LASSO problem. Then, a block coordinate descent algorithm is proposed to solve this problem globally, where a closed-form solution is available when updating each block of variables with the other blocks fixed, thanks to the special structure of the problem of interest. Our analysis shows that the overall complexity of the proposed algorithm is low, making it suitable for real-time application.
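
Closed-form block updates in group LASSO solvers are typically built on the group soft-thresholding operator. The sketch below shows that standard operator for a real-valued block; it is not the paper's exact update, which the abstract does not spell out. The function name `group_soft_threshold` and the parameter `lam` (the regularization weight) are hypothetical.

```python
import math


def group_soft_threshold(v, lam):
    """Return the minimizer of 0.5 * ||x - v||^2 + lam * ||x||_2 over real vectors x."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm <= lam:
        # The whole block is shrunk to zero at once.
        return [0.0] * len(v)
    # Otherwise the block keeps its direction and loses lam in length.
    scale = 1.0 - lam / norm
    return [scale * c for c in v]


print(group_soft_threshold([3.0, 4.0], 1.0))  # block survives, shortened
print(group_soft_threshold([0.3, 0.4], 1.0))  # block is zeroed
```

Because an entire block is zeroed together, a group LASSO solution is sparse at the block level, which is what makes the formulation a natural fit for deciding which blocks (for example, candidate devices) are present.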

preprint 2021 arXiv

Development of a Data-Driven Method to Simulate the Detector Response of Anti-neutron at BESIII

In this paper, a data-driven method to precisely simulate the detector response of anti-neutrons depositing energy in the Electromagnetic Calorimeter (EMC) at BESIII is introduced. A large anti-neutron data sample can be selected using the decay $J/\psi\to p\bar{n}\pi^{-}$ from the BESIII data sample of 10 billion $J/\psi$ events. The detection efficiency for, and various observables of, anti-neutrons interacting in the EMC detector are simulated, taking the correlations among the variables into consideration. The systematic uncertainty of this data-driven simulation method is determined to be less than 1% on average. This method can be widely applied in physics processes that require precise simulation of the detector response of anti-neutrons in the EMC.

preprint 2021 arXiv

High fidelity entanglement of neutral atoms via a Rydberg-mediated single-modulated-pulse controlled-PHASE gate

The neutral atom platform has become an attractive choice for studying quantum information science and quantum simulation, where intense efforts have been devoted to the entangling processes between individual atoms. For the development of this area, the two-qubit controlled-PHASE gate via Rydberg blockade is one of the most essential elements. Recent theoretical studies have suggested the advantages of introducing non-trivial waveform modulation into the gate protocol, which is anticipated to improve its performance towards the next stage. We report our recent experimental results in realizing a two-qubit controlled-PHASE ($C_Z$) gate via off-resonant modulated driving (ORMD) embedded in a two-photon transition for Rb atoms. It relies upon a single modulated driving pulse with a carefully calculated smooth waveform to gain the appropriate phase accumulations required by the two-qubit gate. Combining this $C_Z$ gate with global microwave pulses, two-atom entanglement is generated with a raw fidelity of 0.945(6). Accounting for state preparation and measurement (SPAM) errors, we extract the entanglement operation fidelity to be 0.980(7). Our work features completing the $C_Z$ gate operation within a single pulse to avoid shelved Rydberg population, thus demonstrating another promising route for realizing high-fidelity two-qubit gates on the neutral atom platform.

preprint 2021 arXiv

Quasi one-dimensional diffuse laser cooling of atoms

We demonstrate experimentally the generation of one-dimensional cold gases of $^{87}$Rb atoms by diffuse laser cooling (DLC). A horizontal slender vacuum glass tube with a length of 105 cm and a diameter of 2 cm is used in our experiment. The diffuse laser light inside the tube, which is generated by multi-reflection of injected lasers, cools the background vapor atoms. With 250 mW of cooling light and 50 mW of repumping light, an evenly distributed meter-long profile of the atom cloud is obtained. We observe a factor of 4 improvement in the atomic OD for a typical cooling duration of 170 ms and a sub-Doppler atomic temperature of 25 $\mu$K. The maximum number of detected cold atoms remains constant for a free-fall duration of 30 ms. Such samples are ideal for many quantum optical experiments involving electromagnetically induced transparency, electronically highly excited (Rydberg) atoms and quantum precision measurements.

preprint 2020 arXiv

A Covariance-based User Activity Detection and Channel Estimation Approach with Novel Pilot Design

This paper studies massive machine-type communications (mMTC) for future Internet of Things (IoT) applications, where a large number of IoT devices exist in the network and a random subset of them become active at each time instant. Building upon the fact that the covariance matrix of the received signal can be accurately estimated in the spatial domain if the base station (BS) is equipped with a massive number of antennas, we propose a covariance-based device activity detection and channel estimation strategy in a massive MIMO (multiple-input multiple-output) aided mMTC system. For this strategy, a novel approach for the pilot sequence design is first provided, where the pilot of each device is determined solely by a unique phase parameter. Then, by estimating the phase parameters of the active pilot sequences that contribute to the received covariance matrix, an efficient algorithm is proposed to detect the active devices without prior information about the total number of active devices. Finally, given the estimated set of active devices, channel estimation is conducted based on the conventional minimum mean-squared error (MMSE) approach. It is worth noting that our proposed strategy is able to obtain all the results in closed form, and is thus of much lower complexity than the existing strategies that are based on iterative algorithms for device detection and channel estimation.

preprint2020arXiv

A Learning Framework for n-bit Quantized Neural Networks toward FPGAs

The quantized neural network (QNN) is an efficient approach for network compression and can be widely used in implementations on FPGAs. This paper proposes a novel learning framework for n-bit QNNs, whose weights are constrained to powers of two. To solve the gradient vanishing problem, we propose a reconstructed gradient function for QNNs in the back-propagation algorithm that can directly obtain the real gradient rather than estimating an approximate gradient of the expected loss. We also propose a novel QNN structure named n-BQ-NN, which uses shift operations to replace multiply operations and is more suitable for inference on FPGAs. Furthermore, we design a shift vector processing element (SVPE) array to replace all 16-bit multiplications with SHIFT operations in the convolution operation on FPGAs. We also carry out comparative experiments to evaluate our framework. The experimental results show that the quantized models of ResNet, DenseNet and AlexNet obtained through our learning framework can achieve almost the same accuracies as the original full-precision models. Moreover, when using our learning framework to train our n-BQ-NN from scratch, it can achieve state-of-the-art results compared with typical low-precision QNNs. Experiments on the Xilinx ZCU102 platform show that our n-BQ-NN with our SVPE can execute 2.9 times faster than with the vector processing element (VPE) in inference. As the SHIFT operation in our SVPE array does not consume digital signal processing (DSP) resources on FPGAs, the experiments have shown that the use of the SVPE array also reduces average energy consumption to 68.7% of that of the 16-bit VPE array.
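The shift-for-multiply idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's n-BQ-NN or SVPE design: the function names, the exponent range and the rounding rule (nearest power of two in the log domain) are assumptions made here for illustration only.

```python
import math

def quantize_pow2(w, n_bits=3):
    """Round a real weight to sign * 2**e (hypothetical scheme, not the paper's).

    The exponent is clamped to [-(2**n_bits - 1), 0]; this range is an assumption.
    Returns (sign, exponent); sign 0 encodes a zero weight.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    e = round(math.log2(abs(w)))      # nearest power of two in the log domain
    e_min = -(2 ** n_bits - 1)
    e = max(e_min, min(0, e))
    return sign, e

def shift_mul(x, sign, e):
    """Multiply an integer activation x by sign * 2**e using shifts only."""
    if sign == 0:
        return 0
    y = x << e if e >= 0 else x >> -e  # a right shift floors the result
    return sign * y
```

For example, a weight of 0.24 quantizes to 2**-2, so multiplying an integer activation of 64 by it becomes a right shift by two bits.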

preprint2020arXiv

A New Accelerated Stochastic Gradient Method with Momentum

In this paper, we propose a novel accelerated stochastic gradient method with momentum, in which the momentum is a weighted average of previous gradients. The weights decay in inverse proportion to the iteration count. Stochastic gradient descent with momentum (Sgdm) uses weights that decay exponentially with the iteration count to generate a momentum term. Using exponentially decaying weights, variants of Sgdm with well-designed but complicated forms have been proposed to achieve better performance. The momentum update rule of our method is as simple as that of Sgdm. We provide a theoretical convergence analysis for our method, which shows that both the exponentially decaying weights and our inversely proportional decaying weights can confine the variance of the moving direction of the parameters being optimized to a bounded region. Experimental results show that our method works well on practical problems and outperforms Sgdm, and that it outperforms Adam on convolutional neural networks.

preprint2020arXiv

APB2Face: Audio-guided face reenactment with auxiliary pose and blink signals

Audio-guided face reenactment aims at generating photorealistic faces using audio information while maintaining the same facial movement as when speaking to a real person. However, existing methods either cannot generate vivid face images or can only reenact low-resolution faces, which limits their application value. To solve these problems, we propose a novel deep neural network named APB2Face, which consists of GeometryPredictor and FaceReenactor modules. GeometryPredictor uses extra head pose and blink state signals as well as audio to predict the latent landmark geometry information, while FaceReenactor takes the face landmark image as input to reenact the photorealistic face. A new dataset, AnnVI, collected from YouTube is presented to support the approach, and experimental results indicate the superiority of our method over state-of-the-art methods in both authenticity and controllability.

preprint2020arXiv

Channel Estimation for Intelligent Reflecting Surface Assisted Multiuser Communications

In intelligent reflecting surface (IRS) assisted communication systems, the acquisition of channel state information (CSI) is a crucial impediment to achieving the passive beamforming gain of IRS because of the considerable overhead required for channel estimation. Specifically, under the current beamforming design for IRS-assisted communications, $KMN+KM$ channel coefficients should be estimated if the passive IRS cannot estimate its channels with the base station (BS) and users due to its lack of radio frequency (RF) chains, where $K$, $N$ and $M$ denote the number of users, reflecting elements of the IRS, and antennas at the BS, respectively. This number can be extremely large in practice considering the current trend of massive MIMO (multiple-input multiple-output), i.e., a large $M$, and massive connectivity, i.e., a large $K$. To accurately estimate such a large number of channel coefficients within a short time interval, we investigate in this paper an efficient pilot-based channel estimation method for IRS-assisted uplink communications. Building upon the observation that the IRS reflects the signals from all the users to the BS via the same channels, we analytically verify that a time duration consisting of $K+N+\max(K-1,\lceil (K-1)N/M \rceil)$ pilot symbols is sufficient for the BS to perfectly recover all the $KMN+KM$ channel coefficients in the case without noise. In contrast to conventional uplink communications without IRS, in which the minimum pilot sequence length for channel estimation is independent of the number of receive antennas, our study reveals the significant role of massive MIMO in reducing the channel training time for IRS-assisted communication systems.

preprint2020arXiv

Channel Estimation for Intelligent Reflecting Surface Assisted Multiuser Communications: Framework, Algorithms, and Analysis

In intelligent reflecting surface (IRS) assisted communication systems, the acquisition of channel state information (CSI) is a crucial impediment for achieving the beamforming gain of IRS because of the considerable overhead required for channel estimation. Specifically, under the current beamforming design for IRS-assisted communications, $KMN+KM$ channel coefficients should be estimated, where $K$, $N$ and $M$ denote the numbers of users, IRS reflecting elements, and antennas at the base station (BS), respectively. To accurately estimate such a large number of channel coefficients within a short time interval, we propose a novel three-phase pilot-based channel estimation framework in this paper for IRS-assisted uplink multiuser communications. Under this framework, we analytically prove that a time duration consisting of $K+N+\max(K-1,\lceil (K-1)N/M \rceil)$ pilot symbols is sufficient for the BS to perfectly recover all the $KMN+KM$ channel coefficients for the case without receiver noise at the BS. In contrast to the channel estimation for conventional uplink communications without IRS where the minimum channel estimation time is independent of the number of receive antennas at the BS, our result reveals the crucial role of massive MIMO (multiple-input multiple-output) in reducing the channel estimation time for IRS-assisted communications. Further, for the case with receiver noise, the user pilot sequences, IRS reflecting coefficients, and BS linear minimum mean-squared error (LMMSE) channel estimators are characterized in closed-form, and the corresponding estimation mean-squared error (MSE) is quantified.
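To make the pilot-length expression above concrete, the following sketch simply evaluates the two quantities quoted in the abstract, the number of channel coefficients $KMN+KM$ and the sufficient number of pilot symbols $K+N+\max(K-1,\lceil (K-1)N/M \rceil)$. The function names are hypothetical, and the code implements only the formulas, not the estimation framework itself.

```python
def num_channel_coefficients(K, N, M):
    """Number of coefficients to estimate: K*M*N + K*M."""
    return K * M * N + K * M

def sufficient_pilot_symbols(K, N, M):
    """K + N + max(K - 1, ceil((K - 1) * N / M)), using integer arithmetic."""
    ceil_term = -(-((K - 1) * N) // M)  # integer ceiling of (K-1)N/M
    return K + N + max(K - 1, ceil_term)
```

With K = 8 users and N = 32 reflecting elements, the count is 96 pilot symbols for M = 4 antennas but only 47 for M = 64, which illustrates the abstract's point that more BS antennas shorten the training time until the $K-1$ term dominates.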

preprint2020arXiv

Decentralized Massive MIMO Processing Exploring Daisy-chain Architecture and Recursive Algorithms

Algorithms for Massive MIMO uplink detection and downlink precoding typically rely on a centralized approach, by which baseband data from all antenna modules are routed to a central node in order to be processed. In the case of Massive MIMO, where hundreds or thousands of antennas are expected in the base-station, said routing becomes a bottleneck since interconnection throughput is limited. This paper presents a fully decentralized architecture and an algorithm for Massive MIMO uplink detection and downlink precoding based on the Stochastic Gradient Descent (SGD) method, which does not require a central node for these tasks. Through a recursive approach and very low complexity operations, the proposed algorithm provides a good trade-off between performance, interconnection throughput and latency. Further, our proposed solution achieves significantly lower interconnection data-rate than other architectures, enabling future scalability.

preprint2020arXiv

Extended Feature Pyramid Network for Small Object Detection

Small object detection remains an unsolved challenge because it is hard to extract information of small objects with only a few pixels. While scale-level corresponding detection in feature pyramid network alleviates this problem, we find feature coupling of various scales still impairs the performance of small objects. In this paper, we propose extended feature pyramid network (EFPN) with an extra high-resolution pyramid level specialized for small object detection. Specifically, we design a novel module, named feature texture transfer (FTT), which is used to super-resolve features and extract credible regional details simultaneously. Moreover, we design a foreground-background-balanced loss function to alleviate area imbalance of foreground and background. In our experiments, the proposed EFPN is efficient on both computation and memory, and yields state-of-the-art results on small traffic-sign dataset Tsinghua-Tencent 100K and small category of general object detection dataset MS COCO.

preprint2020arXiv

Focal Loss Analysis of Nerve Fiber Layer Reflectance for Glaucoma Diagnosis

Purpose: To evaluate nerve fiber layer (NFL) reflectance for glaucoma diagnosis. Methods: Participants were imaged with 4.5 × 4.5 mm volumetric disc scans using spectral-domain optical coherence tomography (OCT). The normalized NFL reflectance map was processed by an azimuthal filter to reduce directional reflectance bias due to variation of beam incidence angle. The peripapillary area of the map was divided into 160 superpixels. Average reflectance was the mean of superpixel reflectance. Low-reflectance superpixels were identified as those with NFL reflectance below the 5th-percentile normative cutoff. Focal reflectance loss was measured by summing loss in low-reflectance superpixels. Results: Thirty-five normal, 30 pre-perimetric and 35 perimetric glaucoma participants were enrolled. Azimuthal filtering improved the repeatability of the normalized NFL reflectance, as measured by the pooled superpixel standard deviation (SD), from 0.73 to 0.57 dB (p<0.001, paired t-test) and reduced the population SD from 2.14 to 1.78 dB (p<0.001, t-test). Most glaucomatous reflectance maps showed characteristic patterns of contiguous wedge or diffuse defects. Focal NFL reflectance loss had significantly higher diagnostic sensitivity than the best NFL thickness parameter (overall, inferior, or focal loss volume): 53% vs. 23% (p=0.027) in PPG eyes and 100% vs. 80% (p=0.023) in PG eyes, with the specificity fixed at 99%. Conclusions: Azimuthal filtering reduces the variability of NFL reflectance measurements. Focal NFL reflectance loss has excellent glaucoma diagnostic accuracy compared to the standard NFL thickness parameters. The reflectance map may be useful for localizing NFL defects.

preprint2020arXiv

FReeNet: Multi-Identity Face Reenactment

This paper presents a novel multi-identity face reenactment framework, named FReeNet, to transfer facial expressions from an arbitrary source face to a target face with a shared model. The proposed FReeNet consists of two parts: Unified Landmark Converter (ULC) and Geometry-aware Generator (GAG). The ULC adopts an encoder-decoder architecture to efficiently convert expression in a latent landmark space, which significantly narrows the gap of the face contour between source and target identities. The GAG leverages the converted landmark to reenact the photorealistic image with a reference image of the target person. Moreover, a new triplet perceptual loss is proposed to force the GAG module to learn appearance and geometry information simultaneously, which also enriches facial details of the reenacted images. Further experiments demonstrate the superiority of our approach for generating photorealistic and expression-alike faces, as well as the flexibility for transferring facial expressions between identities.

preprint2020arXiv

From Open Set to Closed Set: Supervised Spatial Divide-and-Conquer for Object Counting

Visual counting, a task that aims to estimate the number of objects from an image/video, is an open-set problem by nature, i.e., the number of population can vary in [0, inf) in theory. However, collected data and labeled instances are limited in reality, which means that only a small closed set is observed. Existing methods typically model this task in a regression manner, while they are prone to suffer from an unseen scene with counts out of the scope of the closed set. In fact, counting has an interesting and exclusive property---spatially decomposable. A dense region can always be divided until sub-region counts are within the previously observed closed set. We therefore introduce the idea of spatial divide-and-conquer (S-DC) that transforms open-set counting into a closed-set problem. This idea is implemented by a novel Supervised Spatial Divide-and-Conquer Network (SS-DCNet). Thus, SS-DCNet can only learn from a closed set but generalize well to open-set scenarios via S-DC. SS-DCNet is also efficient. To avoid repeatedly computing sub-region convolutional features, S-DC is executed on the feature map instead of on the input image. We provide theoretical analyses as well as a controlled experiment on toy data, demonstrating why closed-set modeling makes sense. Extensive experiments show that SS-DCNet achieves the state-of-the-art performance. Code and models are available at: https://tinyurl.com/SS-DCNet.
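The divide-until-in-range idea described above can be sketched on toy data. This is not SS-DCNet: the grid of per-cell object counts, the saturating `capped_count` stand-in for a closed-set counter, and the rule "split whenever the estimate hits the cap" are all assumptions chosen to keep the example small, whereas the paper performs the division on feature maps with learned components.

```python
def capped_count(grid, r0, r1, c0, c1, c_max):
    """Stand-in closed-set counter: can only report values in [0, c_max]."""
    total = sum(grid[r][c] for r in range(r0, r1) for c in range(c0, c1))
    return min(total, c_max)

def sdc_count(grid, r0, r1, c0, c1, c_max):
    """Count a region; if the counter saturates, split into quadrants and recurse."""
    est = capped_count(grid, r0, r1, c0, c1, c_max)
    single_cell = (r1 - r0 <= 1) and (c1 - c0 <= 1)
    if est < c_max or single_cell:
        return est  # a saturated single cell is still underestimated
    rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
    quads = [(r0, rm, c0, cm), (r0, rm, cm, c1),
             (rm, r1, c0, cm), (rm, r1, cm, c1)]
    # Skip empty quadrants that arise when a side has length 1.
    return sum(sdc_count(grid, a, b, c, d, c_max)
               for a, b, c, d in quads if b > a and d > c)

# Toy 4x4 "image" whose total (14) exceeds the closed set [0, 5].
toy_grid = [[1, 0, 2, 0],
            [0, 3, 0, 1],
            [4, 0, 0, 0],
            [0, 2, 1, 0]]
```

On `toy_grid`, the capped counter alone reports 5 for the whole image, while the recursive version recovers the full count of 14 because every sub-region it finally evaluates lies inside the closed set.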

preprint2020arXiv

Hierarchical and Efficient Learning for Person Re-Identification

Recent works on the person re-identification task mainly focus on model accuracy while ignoring factors related to efficiency, e.g. model size and latency, which are critical for practical application. In this paper, we propose a novel Hierarchical and Efficient Network (HENet) that learns an ensemble of hierarchical global, partial, and recovery features under the supervision of multiple loss combinations. To further improve the robustness against irregular occlusion, we propose a new dataset augmentation approach, dubbed Random Polygon Erasing (RPE), which randomly erases irregular areas of the input image to imitate missing body parts. We also propose an Efficiency Score (ES) metric to evaluate model efficiency. Extensive experiments on the Market1501, DukeMTMC-ReID, and CUHK03 datasets show the efficiency and superiority of our approach compared with epoch-making methods.

preprint2020arXiv

Intelligent Reflecting Surface Assisted Massive MIMO Communications

In a practical massive MIMO (multiple-input multiple-output) system, the number of antennas at a base station (BS) is constrained by the space and cost factors, which limits the throughput gain promised by theoretical analysis. This paper thus studies the feasibility of adopting the intelligent reflecting surface (IRS) to further improve the beamforming gain of the uplink communications in a massive MIMO system. Under such a novel system, the central question lies in whether the IRS is able to enhance the network throughput as expected, if the channel estimation overhead is taken into account. In this paper, we first show that the favorable propagation property for the conventional massive MIMO system without IRS, i.e., the channels of arbitrary two users are orthogonal, no longer holds for the IRS-assisted massive MIMO system, due to its special channel property that each IRS element reflects the signals from all the users to the BS via the same channel. As a result, the maximal-ratio combining (MRC) receive beamforming strategy leads to strong inter-user interference and thus even lower user rates than those of the massive MIMO system without IRS. To tackle this challenge, we propose a novel strategy for zero-forcing (ZF) beamforming design at the BS and reflection coefficients design at the IRS to efficiently null the inter-user interference. Under our proposed strategy, it is rigorously shown that even if the channel estimation overhead is considered, the IRS-assisted massive MIMO system can always achieve higher throughput compared to its counterpart without IRS, despite the fact that the favorable propagation property no longer holds.

preprint2020arXiv

Magnetic asymmetry induced anomalous spin-orbit torque in IrMn

We demonstrate an anomalous spin-orbit torque induced by the broken magnetic symmetry in the antiferromagnet IrMn. We study the magnetic structure of three phases of IrMn thin films using neutron diffraction. The magnetic mirror symmetry M' is broken laterally in both L10-IrMn and L12-IrMn3 but not γ-IrMn3. We observe an out-of-plane damping-like spin-orbit torque in both L10-IrMn/permalloy and L12-IrMn3/permalloy bilayers but not in γ-IrMn3/permalloy. This is consistent with both the symmetry analysis on the effects of a broken M' on spin-orbit torque and the theoretical predictions of the spin Hall effect and the Rashba-Edelstein effect. In addition, the measured spin-orbit torque efficiencies are 0.61±0.01, 1.01±0.03 and 0.80±0.01 for the L10, L12 and γ phases, respectively. Our work highlights the critical role of magnetic asymmetry in spin-orbit torque generation.

preprint2020arXiv

Nearly nondestructive thermometry of labeled cold atoms and application to isotropic laser cooling

We have designed and implemented a straightforward method to deterministically measure the temperature of the selected segment of a cold atom ensemble, and we have also developed an upgrade in the form of nondestructive thermometry. The essence is to monitor the thermal expansion of the targeted cold atoms after labeling them through manipulating the internal states, and the nondestructive property relies upon the nearly lossless detection via driving a cycling transition. For cold atoms subject to isotropic laser cooling, this method has the unique capability of addressing only the atoms on the optical detection axis within the enclosure, which is exactly the part we care about in major applications such as atomic clock or quantum sensing. Furthermore, our results confirm the sub-Doppler cooling features in isotropic laser cooling, and we have investigated the relevant cooling properties. Meanwhile, we have applied the recently developed optical configuration with the cooling laser injection in the form of hollow beams, which helps to enhance the cooling performance and accumulate more cold atoms in the central regions.

preprint2020arXiv

On the evolution of word usage of classical Chinese poetry

The hierarchy of classical Chinese poetry has been broadly acknowledged by a number of studies in Chinese literature. However, quantitative investigations of the evolutionary linkages of classical Chinese poetry are limited. The primary goal of this study is to provide quantitative evidence of the evolutionary linkages, with emphasis on character usage, among different period genres of classical Chinese poetry. Specifically, various statistical analyses are performed to find and compare the patterns of character usage in the poems of nine period genres, including shi jing, chu ci, Han shi, Jin shi, Tang shi, Song shi, Yuan shi, Ming shi, and Qing shi. The results of the analysis indicate that each of the nine period genres has unique patterns of character usage, with some Chinese characters that are preferably used in the poems of a particular period genre. The analysis of the general pattern of character preference implies a decreasing trend in the use of Chinese characters that rarely occur in modern Chinese literature along the timeline of dynastic types of classical Chinese poetry. The phylogenetic analysis based on the distance matrix suggests that the evolutionary linkages of different types of classical Chinese poetry are congruent with their chronological order, suggesting that character frequencies contain phylogenetic information that is useful for inferring evolutionary linkages among various types of classical Chinese poetry. The estimated phylogenetic tree identifies four groups (shi jing, chu ci), (Han shi, Jin shi), (Tang shi, Song shi, Yuan shi), and (Ming shi, Qing shi). The statistical analyses conducted in this study can be generalized to analyze the data sets of general Chinese literature. Such analyses can provide quantitative insights about the evolutionary linkages of general Chinese literature.

preprint2020arXiv

Perpendicular magnetic anisotropy and Dzyaloshinskii-Moriya interaction at an oxide/ferromagnetic metal interface

We report on the study of both perpendicular magnetic anisotropy (PMA) and Dzyaloshinskii-Moriya interaction (DMI) at an oxide/ferromagnetic metal (FM) interface, i.e. BaTiO3 (BTO)/CoFeB. Thanks to the functional properties of the BTO film and the capability to precisely control its growth, we are able to distinguish the dominant role of the oxide termination (TiO2 vs BaO), from the moderate effect of ferroelectric polarization in the BTO film, on the PMA and DMI at the oxide/FM interface. We find that the interfacial magnetic anisotropy energy of the BaO-BTO/CoFeB structure is two times larger than that of the TiO2-BTO/CoFeB, while the DMI of the TiO2-BTO/CoFeB interface is larger. We explain the observed phenomena by first-principles calculations, which ascribe them to the different electronic states around the Fermi level at the oxide/ferromagnetic metal interfaces and the different spin-flip processes. This study paves the way for further investigation of the PMA and DMI at various oxide/FM structures and thus their applications in the promising field of energy-efficient devices.

preprint2020arXiv

Processing Distribution and Architecture Tradeoff for Large Intelligent Surface Implementation

The Large Intelligent Surface (LIS) concept has emerged recently as a new paradigm for wireless communication, remote sensing and positioning. It consists of a continuous radiating surface placed relatively close to the users, which is able to communicate with users by independent transmission and reception (replacing base stations). Despite its potential, there are many challenges from an implementation point of view, with the interconnection data-rate and computational complexity being the most relevant. Distributed processing techniques and hierarchical architectures are expected to play a vital role in addressing these challenges while ensuring scalability. In this paper we perform algorithm-architecture codesign and analyze the hardware requirements and architecture trade-offs for a discrete LIS to perform uplink detection. By doing this, we expect to give concrete case studies and guidelines for efficient implementation of LIS systems.

preprint2020arXiv

QPS-r: A Cost-Effective Crossbar Scheduling Algorithm and Its Stability and Delay Analysis

In an input-queued switch, a crossbar schedule, or a matching between the input ports and the output ports, needs to be computed in each switching cycle, or time slot. Designing switching algorithms with very low computational complexity that lead to high throughput and small delay is a challenging problem. There appears to be a fundamental tradeoff between the computational complexity of the switching algorithm and the resultant throughput and delay. Parallel maximal matching algorithms (adapted for switching) appear to have struck a sweet spot in this tradeoff, and prior work has shown the following performance guarantees. Using maximal matchings in every time slot results in at least 50% switch throughput and order-optimal (i.e., independent of the switch size N) average delay bounds for various traffic arrival processes. On the other hand, their computational complexity can be as low as $O(\log^2 N)$ per port/processor, which is much lower than that of algorithms such as maximum weighted matching, which ensure better throughput performance. In this work, we propose QPS-r, a parallel iterative switching algorithm that has the lowest possible computational complexity: O(1) per port. Using Lyapunov stability analysis, we show that the throughput and delay performance is identical to that of maximal matching algorithms. Although QPS-r builds upon an existing technique called Queue-Proportional Sampling (QPS), in this paper we provide analytical guarantees on its throughput and delay under i.i.d. traffic as well as a Markovian traffic model that can model many realistic traffic patterns. We also demonstrate that QPS-3 (running 3 iterations) has empirical throughput and delay performance comparable to that of iSLIP (running $\log_2 N$ iterations), a refined and optimized representative maximal matching algorithm adapted for switching.
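A rough sketch of an r-round, queue-proportional proposal scheme is given below. The abstract does not spell out the mechanics, so the details here are assumptions: each unmatched input samples one output with probability proportional to its virtual output queue (VOQ) lengths `q[i][j]`, and each unmatched output accepts the proposer with the longest VOQ. The naive weighted sampling used here also costs O(N) per port, so the sketch does not reproduce the O(1) complexity claimed for QPS-r.

```python
import random

def qps_r_sketch(q, r=3, rng=None):
    """Return a partial matching {input: output} after r proposal rounds.

    q[i][j] is the VOQ length at input i for output j (hypothetical layout).
    """
    rng = rng or random.Random(0)
    n = len(q)
    match = {}     # input port -> output port
    taken = set()  # outputs that are already matched
    for _ in range(r):
        proposals = {}  # output port -> list of proposing inputs
        for i in range(n):
            if i in match or sum(q[i]) == 0:
                continue  # already matched, or nothing queued
            # Queue-proportional sampling: P(j) = q[i][j] / sum(q[i]).
            j = rng.choices(range(n), weights=q[i])[0]
            if j not in taken:
                proposals.setdefault(j, []).append(i)
        for j, inputs in proposals.items():
            # Assumed acceptance rule: longest VOQ wins.
            best = max(inputs, key=lambda i: q[i][j])
            match[best] = j
            taken.add(j)
    return match
```

Because an output with an empty VOQ has sampling weight zero, every matched pair in the result carries at least one queued packet, and no output is assigned to two inputs.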

preprint2020arXiv

Space- and Computationally-Efficient Set Reconciliation via Parity Bitmap Sketch (PBS)

Set reconciliation is a fundamental algorithmic problem that arises in many networking, system, and database applications. In this problem, two large sets A and B of objects (bitcoins, files, records, etc.) are stored respectively at two different network-connected hosts, which we name Alice and Bob respectively. Alice and Bob communicate with each other to learn $A \Delta B$, the difference between A and B, and as a result the reconciled set $A \cup B$. Current set reconciliation schemes are based on either Invertible Bloom Filters (IBF) or Error-Correction Codes (ECC). The former has a low computational complexity of O(d), where d is the cardinality of $A \Delta B$, but has a high communication overhead that is several times larger than the theoretical minimum. The latter has a low communication overhead close to the theoretical minimum, but has a much higher computational complexity of $O(d^2)$. In this work, we propose Parity Bitmap Sketch (PBS), an ECC-based set reconciliation scheme that gets the best of both worlds: PBS has both a low computational complexity of O(d), just like IBF-based solutions, and a low communication overhead of roughly twice the theoretical minimum. A separate contribution of this work is a novel rigorous analytical framework that can be used for the precise calculation of various performance metrics and for the near-optimal parameter tuning of PBS.
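The parity-bitmap intuition behind the name can be illustrated with a toy single-round sketch. It is an assumption-laden simplification, not the PBS protocol: elements are taken to be non-negative integers, the hash is a hypothetical `x % n_bins`, and the two bitmaps are compared directly, whereas the abstract describes an ECC-based scheme for keeping communication low.

```python
def sketch(s, n_bins):
    """Per-bin parity bit and XOR of the elements hashed into each bin."""
    parity = [0] * n_bins
    xor_sum = [0] * n_bins
    for x in s:
        b = x % n_bins  # hypothetical hash function
        parity[b] ^= 1
        xor_sum[b] ^= x
    return parity, xor_sum

def reconcile_once(a, b, n_bins):
    """Recover elements of the symmetric difference that sit alone in a bin.

    Common elements cancel in both parity and XOR. A bin holding exactly one
    differing element has mismatched parity, and the XOR of the two XOR sums
    is that element. Bins with two differing elements are missed in this
    round, and bins with three or more can yield a spurious value.
    """
    pa, xa = sketch(a, n_bins)
    pb, xb = sketch(b, n_bins)
    return {xa[i] ^ xb[i] for i in range(n_bins) if pa[i] != pb[i]}
```

With `a = {3, 10, 25, 42}`, `b = {3, 10, 42, 77, 90}` and 16 bins, the three differing elements land in distinct bins, so a single round returns the full symmetric difference `{25, 77, 90}`.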

preprint2020arXiv

The contact binary V344 Lacertae: is it a triple system?

The VRI passband light curves of V344 Lac are presented and analyzed using the latest version of the W-D code. The observed spectrum reveals that V344 Lac is not an A3 type star but rather a later F type star, according to the yielded temperature. The solution shows that V344 Lac is an A-subtype contact binary, with a moderate photometric mass ratio of 0.387(0.003) and a moderate contact factor of 44.6(3.0)%. Based on the parallax given by Gaia, the parameters of the components are estimated as: M1 = 1.16 Ms, M2 = 0.45 Ms, R1 = 1.31 Rs, R2 = 0.88 Rs, L1 = 2.512 Ls, L2 = 1.057 Ls. The period investigation indicates that V344 Lac may have an eccentric orbital oscillation, with P3 = 12.4(0.5) yr, A3 = 0.0020(0.0002) d, and e = 0.38(0.16). Analysis shows that such an oscillation could be caused by magnetic activity, which can be explained by the Applegate mechanism. Meanwhile, according to the value of l3 and the estimated physical parameters of V344 Lac, the mass of the third companion may be 0.79 Ms. This third body could be a wide companion.

preprint2020arXiv

Towards an Astronomical Science Platform: Experiences and Lessons Learned from Chinese Virtual Observatory

In the era of big data astronomy, next generation telescopes and large sky surveys produce data sets at the TB or even PB level. Due to their large data volumes, these astronomical data sets are extremely difficult to transfer and analyze using personal computers or small clusters. In order to offer better access to data, data centers now generally provide online science platforms that enable analysis close to the data. The Chinese Virtual Observatory (China-VO) is one of the member projects in the International Virtual Observatory Alliance and it is dedicated to providing a research and education environment where globally distributed astronomy archives are simple to find, access, and interoperate. In this study, we summarize highlights of the work conducted at the China-VO, as well as the experiences and lessons learned during the full life-cycle management of astronomical data. Finally, we discuss the challenges and future trends for astronomical science platforms.

preprint2020arXiv

Towards Reliable UAV Swarm Communication in D2D-Enhanced Cellular Network

In the existing cellular networks, it remains a challenging problem to communicate with and control an unmanned aerial vehicle (UAV) swarm with both high reliability and low latency. Due to the UAV swarm's high working altitude and strong ground-to-air channels, it is generally exposed to multiple ground base stations (GBSs), while the GBSs that are serving ground users (occupied GBSs) can generate strong interference to the UAV swarm. To tackle this issue, we propose a novel two-phase transmission protocol by exploiting cellular plus device-to-device (D2D) communication for the UAV swarm. In Phase I, one swarm head is chosen for ground-to-air channel estimation, and all the GBSs that are not serving ground users (available GBSs) transmit a common control message to the UAV swarm simultaneously, using the same cellular frequency band, to combat the strong interference from occupied GBSs. In Phase II, all the UAVs that have decoded the common control message in Phase I further relay it to the other UAVs in the swarm via D2D communication, by exploiting the less interfered D2D frequency band and the proximity among UAVs. In this paper, we aim to characterize the reliability performance of the above two-phase protocol, i.e., the expected percentage of UAVs in the swarm that can decode the common control message, which is a non-trivial problem due to the complex system setup and the intricate coupling between the two phases. Nevertheless, we manage to obtain an approximated expression of the reliability performance of interest, under reasonable assumptions and with the aid of the Pearson distributions. Numerical results validate the accuracy of our analytical results and show the effectiveness of our protocol over other benchmark protocols. We also study the effect of key system parameters on the reliability performance, to reveal useful insights on the practical system design.

preprint2020arXiv

Weighing Counts: Sequential Crowd Counting by Reinforcement Learning

We formulate counting as a sequential decision problem and present a novel crowd counting model solvable by deep reinforcement learning. In contrast to existing counting models that directly output count values, we divide one-step estimation into a sequence of much easier and more tractable sub-decision problems. Such a sequential decision nature corresponds exactly to a physical process in reality: scale weighing. Inspired by scale weighing, we propose a novel 'counting scale' termed LibraNet where the count value is analogized by weight. By virtually placing a crowd image on one side of a scale, LibraNet (agent) sequentially learns to place appropriate weights on the other side to match the crowd count. At each step, LibraNet chooses one weight (action) from the weight box (the pre-defined action pool) according to the current crowd image features and weights placed on the scale pan (state). LibraNet is required to learn to balance the scale according to the feedback of the needle (Q values). We show that LibraNet exactly implements scale weighing by visualizing the decision process by which LibraNet chooses actions. Extensive experiments demonstrate the effectiveness of our design choices and report state-of-the-art results on a few crowd counting benchmarks. We also demonstrate good cross-dataset generalization of LibraNet. Code and models are made available at: https://git.io/libranet

preprint2018arXiv

Electrical switching of perpendicular magnetization in L10 FePt single layer

Electrical manipulation of magnetization is essential for integration of magnetic functionalities such as magnetic memories and magnetic logic devices into electronic circuits. The current-induced spin-orbit torque (SOT) in heavy metal/ferromagnet (HM/FM) bilayers via the spin Hall effect in the HM and/or the Rashba effect at the interfaces provides an efficient way to switch the magnetization. Meanwhile, current-induced SOT has also been used to switch the in-plane magnetization in single layers such as the ferromagnetic semiconductor (Ga,Mn)As and the antiferromagnetic metal CuMnAs with globally or locally broken inversion symmetry. Here we demonstrate current-induced perpendicular magnetization switching in an L10 FePt single layer. The current-induced spin-orbit effective fields in L10 FePt increase with the chemical ordering parameter (S). In 20 nm FePt films with high S, we observe a large charge-to-spin conversion efficiency and a switching current density as low as $7.0 \times 10^6$ A/cm$^2$. We anticipate that our findings may stimulate the exploration of spin-orbit torques in bulk perpendicular magnetic anisotropic materials and the application of highly efficient perpendicular magnetization switching in a single FM layer.