Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
180 works
0 followers
51 topics
4 close collaborators

Published work

180 published items

preprint 2026 arXiv

6D Movable Antenna Enhanced Cell-free MIMO: Two-timescale Decentralized Beamforming and Antenna Movement Optimization

This paper investigates a six-dimensional movable antenna (6DMA)-aided cell-free multi-user multiple-input multiple-output (MIMO) communication system. In this system, each distributed access point (AP) can flexibly adjust its array orientation and antenna positions to adapt to spatial channel variations and enhance communication performance. However, frequent antenna movements and centralized beamforming based on global instantaneous channel state information (CSI) sharing among APs entail extremely high signal processing delay and system overhead, making them difficult to implement in high-mobility scenarios with short channel coherence time. To address these practical implementation challenges and improve scalability, a two-timescale decentralized optimization framework is proposed to jointly design the beamformer, antenna positions, and array orientations. In the short timescale, each AP updates its receive beamformer based on local instantaneous CSI and global statistical CSI. In the long timescale, the central processing unit optimizes the antenna positions and array orientations at all APs based on global statistical CSI to maximize the ergodic sum rate of all users. The resulting optimization problem is non-convex and involves highly coupled variables, which makes efficient solutions hard to obtain. To address this problem, a constrained stochastic successive convex approximation algorithm is developed. Numerical results demonstrate that the proposed 6DMA-aided cell-free system with decentralized beamforming significantly outperforms other antenna movement schemes with less flexibility and even achieves performance comparable to that of the centralized beamforming benchmark.

preprint 2026 arXiv

A Systematic Evaluation of Imbalance Handling Methods in Biomedical Binary Classification

Objective: The primary goal of this study was to systematically examine the impact of commonly used imbalance handling methods (IHMs) on predictive performance in biomedical binary classification, considering the interplay between model complexity and diverse data modalities. Material and Methods: We evaluated five representative IHMs: random undersampling (RUS), random oversampling (ROS), SMOTE, re-weighting (RW), and direct F1-score optimization (DMO), against a raw training (RAW) baseline. The evaluation encompassed three public biomedical datasets: MIMIC-III (tabular), ADE-Corpus-V2 (text), and MURA (image), spanning three common biomedical data modalities. To assess varying model complexity, we employed a range of architectures, from classical logistic regression and random forest to deep neural networks, including multilayer perceptron (MLP), BiLSTM, BERT, DenseNet, and DINOv2. Results: For simpler models such as logistic regression on tabular data, IHMs yielded no significant advantage over the RAW baseline, aligning with prior findings. However, clear benefits were observed for more complex models and unstructured data: (a) ROS and RW consistently enhanced the performance of powerful models; (b) direct F1-score optimization demonstrated utility primarily for unstructured text and image data; and (c) RUS and SMOTE consistently degraded performance and are therefore not recommended. Conclusion: The effectiveness of IHMs depends on both model complexity and data modality. Performance gains are most pronounced when leveraging appropriate IHMs, such as ROS, RW, and DMO, on high-complexity models.
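Two of the methods named above, random oversampling (ROS) and re-weighting (RW), can be illustrated with a minimal sketch. The function names and the toy data are hypothetical, and the inverse-frequency weight formula is one common choice (it matches the "balanced" heuristic in scikit-learn); the abstract does not say which weighting the study used.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class matches the majority count."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n, k = len(y), len(counts)
    return {label: n / (k * c) for label, c in counts.items()}

# Toy data (assumed for illustration): four negatives, one positive.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]

X_ros, y_ros = random_oversample(X, y)
print(Counter(y_ros))             # both classes now have 4 samples
print(balanced_class_weights(y))  # {0: 0.625, 1: 2.5}
```

ROS changes the training set itself, while RW leaves the data untouched and scales each sample's contribution to the loss, which is why the two can behave differently across model families.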

preprint 2026 arXiv

AI Signal Processing Paradigm for Movable Antenna: From Spatial Position Optimization to Electromagnetic Reconfigurability

As 6G wireless communication systems evolve toward intelligence, high reconfigurability, and space-air-ground integration, the limitations of the traditional fixed antenna (TFA) have become increasingly prominent. As a remedy, the spatially movable antenna (SMA) and the electromagnetically reconfigurable antenna (ERA) have emerged as key technologies to break through this bottleneck. SMA activates spatial degrees of freedom (DoF) by dynamically adjusting antenna positions, whereas ERA regulates radiation characteristics using tunable metamaterials, thereby introducing DoF in the electromagnetic domain. However, the "spatial-electromagnetic dual reconfiguration" paradigm formed by their integration poses severe high-dimensional hybrid optimization challenges for signal processing. To address this issue, we integrate the spatial optimization of SMA and the electromagnetic reconfiguration of ERA, propose a unified modeling framework termed movable and reconfigurable antenna (MARA), and investigate the channel modeling and spectral efficiency (SE) optimization for MARA. In addition, we systematically review artificial intelligence (AI)-based solutions, focusing on the advantages of AI over traditional algorithms in solving high-dimensional non-convex optimization problems. This paper addresses the lack of a comprehensive review of the AI-driven signal processing paradigm under spatial-electromagnetic dual reconfiguration and provides theoretical guidance for the design and optimization of 6G wireless systems with advanced MARA.

preprint 2026 arXiv

DreamSR: Towards Ultra-High-Resolution Image Super-Resolution via a Receptive-Field Enhanced Diffusion Transformer

Large-scale pre-trained diffusion models have been extensively adopted for real-world image super-resolution (SR) because of their powerful generative priors through textual guidance. However, when super-resolving high-resolution images with a patch-wise inference strategy, most existing diffusion-based SR methods tend to suffer from over-generation, due to the misalignment between the global prompt from the low-resolution (LR) image and the incomplete semantic information of local patches during each inference step. In addition, most existing methods fail to generate detailed texture in local patches due to the overemphasis on global generation capabilities in network designs and training strategies. To address these issues, we present DreamSR, a novel SR model that suppresses local over-generation and improves fine-detail synthesis, thereby achieving visually faithful results with ultra-high-quality details. Specifically, we propose a dual-branch MM-ControlNet, where the ControlNet generates local textual features with patch-level prompts while the pre-trained DiT provides global textual features with global prompts, thereby mitigating over-generation and ensuring semantic consistency across patches. We also design a comprehensive training strategy with stage-specific data processing pipelines and a Receptive-Field Enhancement strategy, enhancing the model's capability to capture patch information and effectively restore local textures. Extensive experiments demonstrate that DreamSR outperforms state-of-the-art methods, providing high-quality SR results. Code and model are available at https://github.com/jerrydong0219/DreamSR.

preprint 2026 arXiv

Elastic Attention Cores for Scalable Vision Transformers

Vision Transformers (ViTs) achieve strong data-driven scaling by leveraging all-to-all self-attention. However, this flexibility incurs a computational cost that scales quadratically with image resolution, limiting ViTs in high-resolution domains. Underlying this approach is the assumption that pairwise token interactions are necessary for learning rich visual-semantic representations. In this work, we challenge this assumption, demonstrating that effective visual representations can be learned without any direct patch-to-patch interaction. We propose VECA (Visual Elastic Core Attention), a vision transformer architecture that uses efficient linear-time core-periphery structured attention enabled by a small set of learned cores. In VECA, these cores act as a communication interface: patch tokens exchange information exclusively through the core tokens, which are initialized from scratch and propagated across layers. Because the $N$ image patches only directly interact with a resolution invariant set of $C$ learned "core" embeddings, this yields linear complexity $O(N)$ for predetermined $C$, which bypasses quadratic scaling. Compared to prior cross-attention architectures, VECA maintains and iteratively updates the full set of $N$ input tokens, avoiding a small $C$-way bottleneck. Combined with nested training along the core axis, our model can elastically trade off compute and accuracy during inference. Across classification and dense tasks, VECA achieves performance competitive with the latest vision foundation models while reducing computational cost. Our results establish elastic core-periphery attention as a scalable alternative building block for Vision Transformers.
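The linear-cost claim can be made concrete with a rough NumPy sketch of one core-periphery exchange. This is not the VECA architecture itself: learned projections, multiple heads, normalization, and nested training are all omitted, and every name below is hypothetical. It only shows that the score matrices are C x N and N x C, never N x N.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, keys_values):
    """Single-head attention without learned projections (a simplification)."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    return softmax(scores) @ keys_values

def core_periphery_layer(patches, cores):
    """Patches exchange information only through the cores."""
    # Cores read from all patches: score matrix is (C, N).
    cores = cores + cross_attend(cores, patches)
    # Patches read back from the updated cores: score matrix is (N, C).
    patches = patches + cross_attend(patches, cores)
    return patches, cores

rng = np.random.default_rng(0)
N, C, d = 196, 8, 32  # assumed sizes for illustration
patches = rng.standard_normal((N, d))
cores = rng.standard_normal((C, d))

out_patches, out_cores = core_periphery_layer(patches, cores)
print(out_patches.shape, out_cores.shape)  # (196, 32) (8, 32)

# Using a prefix of the cores still works, which is the shape-level
# precondition for trading compute against accuracy at inference time.
small_patches, small_cores = core_periphery_layer(patches, cores[:4])
print(small_patches.shape, small_cores.shape)  # (196, 32) (4, 32)
```

Because all N patch tokens are kept and updated, the cores act as a relay and not as a compressed replacement for the input, which is the distinction the abstract draws against earlier cross-attention designs.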

preprint 2026 arXiv

Enhanced Distributed Variational Quantum Eigensolver for Large-Scale MaxCut Problem

MaxCut is a canonical NP-hard combinatorial optimization problem in graph theory with broad applications ranging from physics to bioinformatics. Although variational quantum algorithms offer promising new approaches that may eventually outperform classical schemes, they suffer from resource constraints and trainability issues such as barren plateaus, making large-scale instances intractable on noisy intermediate-scale quantum devices. In this paper, we propose an enhanced distributed variational quantum eigensolver for large-scale MaxCut problems, which extends our prior distributed variational quantum eigensolver framework by integrating a novel hybrid classical-quantum perturbation strategy, thereby enhancing optimization scalability and efficiency. Our algorithm solves weighted MaxCut instances with up to 1000 vertices using only 10 qubits, and numerical results indicate that it consistently outperforms the Goemans-Williamson algorithm. We further employ a warm-start initialization strategy, seeding the algorithm with high-quality solutions from the Goemans-Williamson algorithm, with results confirming that the best classical solution can be further improved. The practical utility of the proposed algorithm is further validated through its application to haplotype phasing on genome sequencing data of the human ABCA1 gene, producing high-quality haplotypes that rival those obtained by the Goemans-Williamson algorithm with $10^6$ projections. These results establish the proposed algorithm as a scalable, NISQ-compatible framework for near-term quantum-enhanced large-scale combinatorial optimization.
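For readers unfamiliar with the objective, the weighted MaxCut value and the idea of warm-starting from an existing partition can be sketched classically. The single-vertex-flip local search below is a simple baseline chosen for illustration; it is neither the quantum algorithm described above nor Goemans-Williamson, and the function names are hypothetical.

```python
def cut_value(edges, side):
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for u, v, w in edges if side[u] != side[v])

def local_search(edges, n, side=None):
    """Flip single vertices while a flip strictly increases the cut.

    `side` is an optional warm-start partition (list of 0/1 labels).
    """
    side = list(side) if side is not None else [0] * n
    improved = True
    while improved:
        improved = False
        for v in range(n):
            before = cut_value(edges, side)
            side[v] ^= 1
            if cut_value(edges, side) > before:
                improved = True
            else:
                side[v] ^= 1  # revert the flip
    return side, cut_value(edges, side)

# 4-cycle with unit weights: the maximum cut is 4 (alternate the two sides).
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
side, value = local_search(edges, 4)
print(side, value)  # [1, 0, 1, 0] 4.0
```

The search terminates because every accepted flip strictly increases a bounded objective. Passing a non-trivial `side` mirrors the warm-start idea: begin from a good partition and try to improve on it.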

preprint 2026 arXiv

Entanglement Detection with Variational Quantum Interference: Theory and Experiment

Entanglement detection is a fundamental task in quantum information science, serving as a cornerstone for quantum benchmarking and foundational studies. With an increasing qubit number that can be effectively controlled, there is a pressing need for a scalable and robust detection protocol which requires minimal resources while maintaining high detection capability. By integrating the Positive Partial Transposition criterion with variational quantum interference, we propose an entanglement detection protocol that requires moderate classical and quantum computation resources. We numerically show that this protocol achieves a high detection capability with shallow quantum circuits, surpassing some widely-used entanglement detection methods. The protocol also exhibits strong resilience to circuit noise, ensuring its applicability across different physical platforms. We further demonstrate the protocol experimentally on an eight-photon linear-optical platform, where it successfully detects the entanglement of a three-qubit mixed state that is inaccessible to conventional entanglement witnesses. By combining quantum interference with classical optimization, our protocol provides a scalable and resource-efficient route toward practical entanglement detection.

preprint 2026 arXiv

Geometry-Aware Neural Optimizer for Shape Optimization and Inversion

Geometry is central to PDE-governed systems, motivating shape optimization and inversion. Classical pipelines conduct costly forward simulation with geometry processing, requiring substantial expert effort. Neural surrogates accelerate forward analysis but do not close the loop because gradients from objectives to geometry are often unavailable. Existing differentiable methods either rely on restrictive parameterizations or unstable latent optimization driven by scalar objectives, limiting interpretability and part-wise control. To address these challenges, we propose Geometry-Aware Neural Optimizer (GANO), an end-to-end differentiable framework that unifies geometry representation, field-level prediction, and automated optimization/inversion in a single latent-space loop. GANO encodes shapes with an auto-decoder and stabilizes latent updates via a denoising mechanism, and a geometry-informed surrogate provides a reliable gradient pathway for geometry updates. Moreover, GANO supports part-wise control through null-space projection and uses remeshing-free projection to accelerate geometry processing. We further prove that denoising induces an implicit Jacobian regularization that reduces decoder sensitivity, yielding controlled deformations. Experiments on three benchmarks spanning 2D Helmholtz, 2D airfoil, and 3D vehicles show state-of-the-art accuracy and stable, controllable updates, achieving up to +55.9% lift-to-drag improvement for airfoils and ~7% drag reduction for vehicles.

preprint 2026 arXiv

Hardwired-Neurons Language Processing Units as General-Purpose Cognitive Substrates

The rapid advancement of Large Language Models (LLMs) has established language as a core general-purpose cognitive substrate, driving the demand for specialized Language Processing Units (LPUs) tailored for LLM inference. To overcome the growing energy consumption of LLM inference systems, this paper proposes a Hardwired-Neurons Language Processing Unit (HNLPU), which physically hardwires LLM weight parameters into the computational fabric, improving computational efficiency by several orders of magnitude through extreme specialization. However, a significant challenge still lies in the scale of modern LLMs. A straightforward hardwiring of gpt-oss 120B would require fabricating photomask sets valued at over 6 billion dollars, rendering this approach economically impractical. To address this challenge, we propose the novel Metal-Embedding methodology. Instead of embedding weights in a 2D grid of silicon device cells, Metal-Embedding embeds weight parameters into the 3D topology of metal wires. This brings two benefits: (1) a 15x increase in density, and (2) 60 out of 70 photomask layers are homogeneous across chips, including all EUV photomasks. In total, Metal-Embedding reduces the photomask cost by 112x, bringing the Non-Recurring Engineering (NRE) cost of HNLPU into an economically viable range. Experimental results show that HNLPU achieves 249,960 tokens/s (5,555x/85x that of GPU/WSE), 36 tokens/J (1,047x/283x that of GPU/WSE), a total die area of 13,232 mm², and an estimated NRE of $59.46M to $123.5M at 5 nm technology. Analysis shows that HNLPU achieves a 41.7-80.4x improvement in cost-effectiveness and a 357x reduction in carbon footprint compared to OpenAI-scale H100 clusters, under an annual weight updating assumption.

preprint 2026 arXiv

Movable Antenna for Integrating Near-field Channel Estimation and Localization

Movable antenna (MA) introduces a new degree of freedom for future wireless communication systems by enabling the adaptive adjustment of antenna positions. Its large movement range pushes wireless channel propagation into the near-field region, which offers new performance gains for integrated sensing and communication (ISAC). This paper proposes a novel multi-stage design framework for broadband near-field ISAC assisted by MA. The framework first divides the MA movement area into multiple subregions and employs the Newtonized orthogonal matching pursuit (NOMP) algorithm to achieve high-precision angle estimation in each subregion. Subsequently, a method called near-field localization via subregion ray clustering (LSRC) is proposed for identifying the positions of scatterers. This method finds the coordinates of each scatterer by jointly processing the angle estimates across all subregions. Finally, according to the estimated locations of the scatterers, the near-field channel estimation (CE) is refined to improve communication performance. Simulation results demonstrate that the proposed scheme can significantly enhance MA sensing accuracy and CE, providing an efficient solution for MA-aided near-field ISAC.
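The geometric idea behind combining per-subregion angle estimates can be sketched as a least-squares ray intersection in 2D. This is only an illustration under simplifying assumptions (a single scatterer, noise-free angles, known subregion centres); it is not the LSRC method, which additionally clusters rays across multiple scatterers, and the names below are hypothetical.

```python
import numpy as np

def intersect_rays(origins, angles):
    """Point minimizing the summed squared distance to a set of 2D rays.

    Each ray starts at origins[i] and points along angle angles[i] (radians).
    """
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for p, theta in zip(origins, angles):
        d = np.array([np.cos(theta), np.sin(theta)])
        P = np.eye(2) - np.outer(d, d)  # projector onto the ray's normal
        A += P
        b += P @ p
    return np.linalg.solve(A, b)

# Assumed setup: three subregion centres along the x-axis, one scatterer.
scatterer = np.array([3.0, 4.0])
origins = np.array([[-0.5, 0.0], [0.0, 0.0], [0.5, 0.0]])
angles = [np.arctan2(scatterer[1] - p[1], scatterer[0] - p[0]) for p in origins]

estimate = intersect_rays(origins, angles)
print(estimate)  # close to [3. 4.]
```

Each subregion alone yields only a direction; it is the spread of viewpoints across the movement area that makes the position, and not just the angle, observable.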

preprint 2026 arXiv

muT2-NMR: Micro-Scale Correlation Relaxometry for in-situ High-Pressure Nuclear Magnetic Resonance

Over the last decade, frequency-domain in-situ high-pressure nuclear magnetic resonance (NMR) spectroscopy in diamond anvil cells (DACs) has been employed as a structural and electronic probe of condensed matter systems at pressures well into the megabar range. However, extensive spin interactions and sample heterogeneities under pressure often lead to significant spectral overlap, inhibiting independent observation of chemically similar spin sub-species in the same sample. In this work, we introduce a time-domain relaxometry framework specifically suited for DAC experiments, named muT2-NMR. Experimental flexibility and operational robustness are benchmarked on three hydrogen-rich molecular solids at pressures up to 72 GPa. We demonstrate that muT2-NMR can resolve individual molecular subunits in relaxation space, paving the way for novel high-pressure, high-resolution NMR applications in molecular solids.

preprint 2026 arXiv

Nexus: An Agentic Framework for Time Series Forecasting

Time series forecasting is not just numerical extrapolation; it often requires reasoning with unstructured contextual data such as news or events. While specialized Time Series Foundation Models (TSFMs) excel at forecasting based on numerical patterns, they remain unaware of real-world textual signals. Conversely, while LLMs are emerging as zero-shot forecasters, their performance remains uneven across domains and contextual grounding. To bridge this gap, we introduce Nexus, a multi-agent forecasting framework that decomposes prediction into specialized stages: isolating macro-level and micro-level temporal fluctuations, and integrating contextual information when available before synthesizing a final forecast. This decomposition enables Nexus to adapt from seasonal signals to volatile, event-driven information without relying on external statistical anchors or monolithic prompting. We show that current-generation LLMs possess substantially stronger intrinsic forecasting ability than previously recognized, depending critically on how numerical and contextual reasoning are organized. Evaluated on data strictly succeeding LLM knowledge cutoffs spanning Zillow real estate metrics and volatile stock market equities, Nexus consistently matches or outperforms state-of-the-art TSFMs and strong LLM baselines. Beyond numerical accuracy, Nexus produces high-quality reasoning traces that explicitly show the fundamental drivers behind each forecast. Our results establish that real-world forecasting is an agentic reasoning problem extending well beyond sequence modeling.

preprint 2026 arXiv

PEMNet: Towards Autonomous and Enhanced Environment-Aware Mobile Networks

With 5G deployment and the evolution toward 6G, mobile networks must make decisions in highly dynamic environments under strict latency, energy, and spectrum constraints. Achieving this goal, however, depends on prior knowledge of spatial-temporal variations in wireless channels and traffic demands. This motivates a joint, site-specific representation of radio propagation and user demand that is queryable at low online overhead. In this work, we propose the perception embedding map (PEM), a localized framework that embeds fine-grained channel statistics together with grid-level spatial-temporal traffic patterns over a base station's coverage. PEM is built from standard-compliant measurements (such as measurement reports and scheduling/quality-of-service logs), so it can be deployed and maintained at scale with low cost. Integrated into PEM, this joint knowledge supports enhanced environment-aware optimization across PHY, MAC, and network layers while substantially reducing training overhead and signaling. Compared with existing site-specific channel maps and digital-twin replicas, PEM distinctively emphasizes (i) joint channel-traffic embedding, which is essential for network optimization, and (ii) practical construction using standard measurements, enabling network autonomy while striking a favorable fidelity-cost balance.

preprint 2026 arXiv

PerFlow: Physics-Embedded Rectified Flow for Efficient Reconstruction and Uncertainty Quantification of Spatiotemporal Dynamics

Reconstructing PDE-governed fields from sparse and irregular measurements is challenging because the problem is ill-posed. Deterministic surrogates trained on dense fields struggle with limited measurements and uncertainty quantification. Generative models, by learning distributions over spatiotemporal fields, can better handle sparsity and uncertainty. However, existing generative approaches enforce data consistency and PDE constraints simultaneously via sampling-time gradient guidance, resulting in slow and unstable inference. To this end, we propose PerFlow, a Physics-embedded rectified Flow for efficient sparse reconstruction and uncertainty quantification of spatiotemporal dynamics. PerFlow decouples observation conditioning from physics enforcement, performing guidance-free conditioning by feeding observations into rectified-flow dynamics while embedding hard physics via a constraint-preserving projection (e.g., incompressibility or conservation). Theoretically, we establish invariance guarantees to ensure that trajectories remain on the physics-consistent manifold throughout sampling. Experiments on various PDE systems demonstrate competitive reconstruction accuracy with sound physics consistency, while enabling efficient conditional sampling (e.g., 50 steps) and up to 320x faster inference than 2000-step guided diffusion baselines.

preprint 2026 arXiv

SAMamba3D: adapting Segment Anything for generalizable 3D segmentation of multiphase pore-scale images

Reliable segmentation of multiphase pore-scale X-ray images of rocks is necessary to quantify fluid saturation, connectivity, and interfacial geometry. However, current 3D segmentation methods are typically dataset-specific, requiring retraining or extensive fine-tuning whenever rock type, fluid pattern, scanner, or acquisition conditions change. Foundation models such as the Segment Anything Model (SAM) provide strong 2D boundary priors, but they are not directly applicable to 3D data. We present SAMamba3D, a parameter-efficient framework that adapts a largely frozen SAM encoder to generalizable 3D pore-scale segmentation by coupling it with Mamba-based volumetric context modeling and progressive cross-scale feature interaction. For sandstone and carbonate datasets, with different fluids, wettability, and scanning conditions, SAMamba3D matches or outperforms current 3D baselines while reducing the need for case-specific retraining. The resulting segmented images preserve physically meaningful descriptors, including fluid saturation, connectivity, and interface morphology, enabling more reliable and rapid analysis of large 3D multiphase images.

preprint 2026 arXiv

Think-J: Learning to Think for Generative LLM-as-a-Judge

LLM-as-a-Judge refers to the automatic modeling of preferences for responses generated by Large Language Models (LLMs), which is of significant importance for both LLM evaluation and reward modeling. Although generative LLMs have made substantial progress in various tasks, their performance as LLM-Judge still falls short of expectations. In this work, we propose Think-J, which improves generative LLM-as-a-Judge by learning how to think. We first use a small amount of curated data to give the model initial judgment thinking capabilities. Subsequently, we optimize the judgment thinking traces based on reinforcement learning (RL). We propose two methods for judgment thinking optimization, based on offline and online RL, respectively. The offline method requires training a critic model to construct positive and negative examples for learning. The online method defines a rule-based reward as feedback for optimization. Experimental results show that our approach can significantly enhance the evaluation capability of generative LLM-Judge, surpassing both generative and classifier-based LLM-Judge without requiring extra human annotations.

preprint 2026 arXiv

To Use AI as Dice of Possibilities with Timing Computation

The dominant noun-based modeling paradigm has fundamentally constrained AI development, precluding any adequate representation of the future as an open temporal dimension. This paper introduces a verb-based paradigm, together with precise definitions of \emph{timing computation} and \emph{possibility}, that enables AI to function as an effective instrument for realizing the grammar of our thought. Applied to longitudinal EHR data from 3,276 breast cancer patients, the framework empirically demonstrates: (1) automatic discovery of clinically significant patient trajectories, and (2) counterfactual timing deduction. Both results are purely data-driven, require no prior domain knowledge, and, to our knowledge, represent the first such demonstrations in the machine learning literature.

preprint 2026 arXiv

Two-stage Multi-beam Training for Multiuser Millimeter-Wave Communications

In this letter, we study an efficient multi-beam training method for multiuser millimeter-wave communication systems. Unlike the conventional single-beam training method that relies on exhaustive search, multi-beam training design faces a key challenge in balancing the trade-off between beam training overhead and beam-identification success rate, exacerbated by severe inter-beam interference. To tackle this challenge, we propose a new two-stage multi-beam training method with two distinct multi-beam patterns to enable fast and accurate user angle identification. Specifically, in the first stage, the antenna array is divided into sparse subarrays to generate multiple beams (with high array gains) for identifying candidate user angles. In the second stage, the array is redivided into dense subarrays to generate flexibly steered wide beams, for which a cross-validation method is employed to effectively resolve the angular ambiguity remaining from the first stage. Finally, numerical results demonstrate that the proposed method significantly improves the beam-identification success rate compared to existing multi-beam training methods, while retaining or even reducing the required beam training overhead.

preprint 2026 arXiv

VeriCache: Turning Lossy KV Cache into Lossless LLM Inference

The large size of the KV cache has become a major bottleneck for serving LLMs with increasing context lengths. In response, many KV cache compression methods, such as token dropping and quantization, have been proposed. However, almost all of these methods are inherently lossy: despite minimal accuracy degradation for short outputs, their outputs increasingly diverge from full-KV-cache outputs as more tokens are decoded, which leads to catastrophic failures in code generation and tool calling. We present VeriCache, the first inference framework that ensures the same output as full-KV-cache decoding but largely preserves the high decoding throughput of a range of KV cache compression algorithms. VeriCache uses the compressed KV cache to draft tokens, then verifies them against the full KV cache. While it may seem like just speculative decoding, VeriCache requires addressing a key system challenge to work: keeping the full KV cache out of GPU memory and minimizing the overhead of swapping it in for verification. The insight is two-fold: (1) compressed-KV decoding can be parallelized with full-KV swap, because one is HBM-bandwidth-bound and the other is PCIe/network-bound, and (2) the compressed KV cache often produces output similar to the full KV cache, allowing a long drafting horizon to amortize each full-KV swap. VeriCache applies to both long-context decoding and remote prefix caching, supports a broad family of token-dropping and quantization methods through a uniform compressor interface, and composes with traditional speculative decoding. Experimental results show that VeriCache achieves up to 4x higher throughput than full-KV inference while producing identical outputs.
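The draft-then-verify control flow can be illustrated with a toy sketch. The two "models" below are hypothetical deterministic stand-ins (not real LLMs and not the VeriCache implementation), verification is done token by token where a real system would batch it in one forward pass, and nothing here models the KV swap or its overlap with drafting. The point is only that accepting the longest matching prefix and correcting the first mismatch reproduces full greedy decoding exactly.

```python
def full_next(prefix):
    """Stand-in for greedy decoding with the full KV cache (toy function)."""
    return (sum(prefix) * 31 + len(prefix)) % 50

def lossy_next(prefix):
    """Stand-in for decoding with a compressed cache: usually agrees, sometimes not."""
    tok = full_next(prefix)
    return (tok + 1) % 50 if len(prefix) % 7 == 0 else tok

def decode_full(prompt, n_new):
    """Reference: one full-model step per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(full_next(out))
    return out[len(prompt):]

def decode_verified(prompt, n_new, horizon=4):
    """Draft up to `horizon` tokens cheaply, then verify against the full model."""
    out = list(prompt)
    target = len(prompt) + n_new
    while len(out) < target:
        draft = list(out)
        for _ in range(min(horizon, target - len(out))):
            draft.append(lossy_next(draft))
        # Keep drafted tokens while they match the full model; at the first
        # mismatch, emit the full model's token and discard the rest of the draft.
        for i in range(len(out), len(draft)):
            correct = full_next(draft[:i])
            out.append(correct)
            if correct != draft[i]:
                break
    return out[len(prompt):]

prompt = [3, 1, 4]
reference = decode_full(prompt, 20)
verified = decode_verified(prompt, 20, horizon=4)
print(verified == reference)  # True
```

A longer horizon amortizes each verification over more tokens, but only pays off when the draft rarely diverges, which matches the second insight stated in the abstract.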

preprint 2026 arXiv

When Relations Break: Analyzing Relation Hallucination in Vision-Language Model Under Rotation and Noise

Vision-language models (VLMs) achieve strong multimodal performance but remain prone to relation hallucination, that is, errors on tasks that require accurate reasoning over inter-object interactions. We study the impact of visual perturbations, specifically rotation and noise, and show that even mild distortions significantly degrade relational reasoning across models and datasets. We further evaluate prompt-based augmentation and preprocessing strategies (orientation correction and denoising), finding that while they offer partial improvements, they do not fully resolve hallucinations. Our results reveal a gap between perceptual robustness and relational understanding, highlighting the need for more robust, geometry-aware VLMs.

preprint 2026 arXiv

Why Do Reasoning Models Lose Coverage? The Role of Data and Forks in the Road

Recent progress in large language models has led to the emergence of reasoning models, which have shown strong performance on complex tasks through specialized fine-tuning procedures. While these methods reliably improve pass@1 accuracy, prior works have observed that they show a coverage shrinkage behavior, where pass@k degrades relative to the base model. In this paper, we investigate how this shrinkage arises under SFT-based post-training. We hypothesize that this behavior is driven by properties of the fine-tuning data, specifically related to decision points or "forks in the road" scenarios where the model faces indecipherable patterns with multiple valid reasoning paths. To test this hypothesis, we design controlled case studies that simulate such decision-point settings, spanning indecipherable nodes in graph branching, and reasoning modes. By tracking post-training dynamics in these settings, we find that the shrinkage phenomenon is tightly correlated with the prevalence of decision-point scenarios in the training data. We also demonstrate that this shrinkage behavior can be partially mitigated through targeted data synthesis design of decision points and a more systematic diversity-encouraging decoding mechanism. Our findings identify data-centric factors as a key driver of shrinkage in reasoning models and highlight diversity-aware designs as an effective lever for controlling it.
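The pass@1 versus pass@k tension can be made concrete with the widely used unbiased estimator, pass@k = 1 - C(n-c, k) / C(n, k), for n samples of which c are correct. The abstract does not say which estimator the paper uses, and the per-problem counts below are made-up numbers chosen only to show how pass@1 can rise while pass@k falls.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n samples, c of which are correct."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

def mean_pass_at_k(correct_counts, n, k):
    """Average pass@k over problems."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)

n = 10
base_counts = [3, 3]    # hypothetical base model: sometimes right on both problems
tuned_counts = [10, 0]  # hypothetical tuned model: always right on one, never on the other

for name, counts in (("base", base_counts), ("tuned", tuned_counts)):
    print(name, mean_pass_at_k(counts, n, 1), mean_pass_at_k(counts, n, 5))
# base:  pass@1 = 0.3, pass@5 is about 0.917
# tuned: pass@1 = 0.5, pass@5 = 0.5
```

In this toy case the tuned model concentrates its probability mass, which helps single-sample accuracy but removes the occasional correct sample that repeated sampling would have found.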

preprint2025arXiv

A Tutorial on MIMO-OFDM ISAC: From Far-Field to Near-Field

Integrated sensing and communication (ISAC) is one of the key usage scenarios for future sixth-generation (6G) mobile communication networks, where communication and sensing (C&S) services are simultaneously provided through shared wireless spectrum, signal processing modules, hardware, and network infrastructure. Such an integration is strengthened by the technology trends in 6G, such as denser network nodes, larger antenna arrays, wider bandwidths, higher frequency bands, and more efficient utilization of spectrum and hardware resources, which incentivize and empower enhanced sensing capabilities. As the dominant waveform used in contemporary communication systems, orthogonal frequency division multiplexing (OFDM) is still expected to be a very competitive technology for 6G, rendering it necessary to thoroughly investigate the potential and challenges of OFDM ISAC. Thus, this paper aims to provide a comprehensive tutorial overview of ISAC systems enabled by large-scale multi-input multi-output (MIMO) and OFDM technologies and to discuss their fundamental principles, advantages, and enabling signal processing methods. To this end, a unified MIMO-OFDM ISAC system model is first introduced, followed by four frameworks for estimating parameters across the spatial, delay, and Doppler domains, including parallel one-domain, sequential one-domain, joint two-domain, and joint three-domain parameter estimation. Next, sensing algorithms and performance analyses are presented in detail for far-field scenarios where uniform plane wave (UPW) propagation is valid, followed by their extensions to near-field scenarios where uniform spherical wave (USW) characteristics need to be considered. Finally, this paper points out open challenges and outlines promising avenues for future research on MIMO-OFDM ISAC.

preprint2025arXiv

Frequency-switching Array Enhanced Physical-Layer Security in Terahertz Bands: A Movable Antenna Perspective

In this paper, we propose a new frequency-switching array (FSA) to enhance the physical-layer security (PLS) in the presence of multiple eavesdroppers (Eves), where the carrier frequency can be flexibly switched and small frequency offsets can be imposed on each antenna at the secrecy transmitter (Alice). First, we analytically show that by flexibly controlling the carrier frequency parameters, FSAs can effectively form uniform/non-uniform sparse arrays, hence resembling existing mechanically controlled movable antennas (MAs) via the control of inter-antenna spacing and providing additional degrees of freedom in the beam manipulation. Although the proposed FSA suffers from additional path-gain attenuation in the received signals, it can overcome several hardware and signal processing issues incurred by MAs, such as limited positioning accuracy and extra hardware and energy cost. Then, a secrecy-rate maximization problem is formulated under the constraints on the frequency control. To shed useful insights, we first consider a secrecy-guaranteed problem with a null-steering constraint, for which a maximum ratio transmission beamformer is considered at Alice and the frequency offsets are set as a uniform frequency increment. Interestingly, it is shown that the proposed FSA can flexibly realize null-steering over Eve in both the angular domain and range domain, thereby achieving improved PLS performance. Then, for the general case, we propose an efficient algorithm to solve the formulated non-convex optimization problem by using the block coordinate descent and projected gradient ascent techniques. Finally, numerical results demonstrate that the proposed FSA achieves superior secrecy rate performance over the conventional fixed-position array, while it suffers only a slight secrecy rate loss compared with the existing mechanically controlled MA.

preprint2025arXiv

Ultrahigh-Energy Gamma-ray Emission Associated with Black Hole-Jet Systems

Black holes (BH), one of the most intriguing objects in the universe, can manifest themselves through electromagnetic radiation initiated by the accretion flow. Some stellar-mass BHs drive relativistic jets when accreting matter from their companion stars, forming microquasars. Non-thermal emission from the radio to tera-electronvolt (TeV) gamma-ray band has been observed from microquasars, indicating the acceleration of relativistic particles. Here we report detection of four microquasars (SS 433, V4641 Sgr, GRS 1915+105, MAXI J1820+070) of spectrum extending to the ultrahigh-energy (UHE; photon energy $E>100$ TeV) band and one microquasar (Cygnus X-1) of spectrum approaching 100 TeV, using the Large High Altitude Air Shower Observatory (LHAASO). Notably, the total emission associated with SS 433 cannot be interpreted with a single leptonic component. In the UHE band, its emission is in spatial coincidence with a giant atomic cloud, which is consistent with a hadronic origin. An elongated source is discovered from V4641 Sgr with the spectrum continuing up to 800 TeV. The detection of UHE gamma rays demonstrates that accreting BHs and their environments can operate as extremely efficient accelerators of particles out of 1 peta-electronvolt (PeV), suggesting microquasars to be important contributors to Galactic cosmic rays especially around the `knee' region.

preprint2024arXiv

Roth-type Theorem for high-power system in Piatetski-Shapiro primes (II)

We consider the nonlinear system $c_1p_1^d +c_2p_2^d + \dots + c_s p_s^d = 0$ with $c_1, c_2,\dots, c_s\in\mathbb Z$ being nonzero and satisfying $c_1 +c_2 + \dots + c_s = 0$. We show that for $s\ge 2\lfloor \frac{d^2}2\rfloor+1$ and $c\in\left(1, 1+c(d,s)\right)$, if the system has only $K$-trivial solutions in subset $\mathcal{A}$ of Piatetski-Shapiro primes up to $x$ and corresponding to $c$, then $|\mathcal{A}| \ll \frac{x^{\frac1c}}{\log x} \left(\log \log \log \log x\right)^{\frac{2-s}{dc}+\varepsilon}$.

preprint2023arXiv

A Comprehensive Study on Optimizing Systems with Data Processing Units

New hardware, such as SmartNICs, has been released to offload network applications in data centers. Off-path SmartNICs, a type of multi-core SoC SmartNICs, have attracted the attention of many researchers. Unfortunately, off-path SmartNICs have not yet been fully explored. In this paper, we use a BlueField SmartNIC as an example to conduct a systematic study on the advantages and disadvantages of off-path SmartNICs. We provide a detailed performance characterization of an off-path SmartNIC, including computing power and network communication overhead, and propose the following recommendations: 1) Directly utilize the specific accelerators on the SmartNIC to offload applications; 2) Offload latency-insensitive background processing to the SmartNIC to reduce the load on the host; 3) Regard the SmartNIC as a new endpoint in the network to expand the computing power and storage resources of the server host; 4) Avoid directly employing the design method for systems based on on-path SmartNICs. We apply these recommendations to several use cases and show the performance improvements.

preprint2023arXiv

Adaptive Depth Graph Attention Networks

As one of the most popular GNN architectures, the graph attention network (GAT) is considered the most advanced learning architecture for graph representation and has been widely used in various graph mining tasks with impressive results. However, since GAT was proposed, none of the existing studies have provided systematic insight into the relationship between the performance of GAT and the number of layers, which is a critical issue in guiding model performance improvement. In this paper, we perform a systematic experimental evaluation and, based on the experimental results, we find two important facts: (1) the main factor limiting the accuracy of the GAT model as the number of layers increases is the oversquashing phenomenon; (2) among the previous improvements applied to the GNN model, only the residual connection can significantly improve the GAT model performance. We combine these two important findings to provide a theoretical explanation that it is the residual connection that mitigates the loss of original feature information due to oversquashing and thus improves the deep GAT model performance. This provides empirical insights and guidelines for researchers to design GAT variant models with appropriate depth and good performance. To demonstrate the effectiveness of our proposed guidelines, we propose a GAT variant model, ADGAT, that adaptively selects the number of layers based on the sparsity of the graph, and experimentally demonstrate that the effectiveness of our model is significantly improved over the original GAT.

preprint2023arXiv

JEPOO: Highly Accurate Joint Estimation of Pitch, Onset and Offset for Music Information Retrieval

Melody extraction is a core task in music information retrieval, and the estimation of pitch, onset and offset are key sub-tasks in melody extraction. Existing methods have limited accuracy, and work for only one type of data, either single-pitch or multi-pitch. In this paper, we propose a highly accurate method for joint estimation of pitch, onset and offset, named JEPOO. We address the challenges of joint learning optimization and handling both single-pitch and multi-pitch data through novel model design and a new optimization technique named Pareto modulated loss with loss weight regularization. This is the first method that can accurately handle both single-pitch and multi-pitch music data, and even a mix of them. A comprehensive experimental study on a wide range of real datasets shows that JEPOO outperforms state-of-the-art methods by up to 10.6%, 8.3% and 10.3% for the prediction of Pitch, Onset and Offset, respectively, and JEPOO is robust for various types of data and instruments. The ablation study shows the effectiveness of each component of JEPOO.

preprint2022arXiv

"Adversarial Examples" for Proof-of-Learning

In S&P '21, Jia et al. proposed a new concept/mechanism named proof-of-learning (PoL), which allows a prover to demonstrate ownership of a machine learning model by proving integrity of the training procedure. It guarantees that an adversary cannot construct a valid proof with less cost (in both computation and storage) than that made by the prover in generating the proof. A PoL proof includes a set of intermediate models recorded during training, together with the corresponding data points used to obtain each recorded model. Jia et al. claimed that an adversary merely knowing the final model and training dataset cannot efficiently find a set of intermediate models with correct data points. In this paper, however, we show that PoL is vulnerable to "adversarial examples"! Specifically, in a similar way as optimizing an adversarial example, we could make an arbitrarily-chosen data point "generate" a given model, hence efficiently generating intermediate models with correct data points. We demonstrate, both theoretically and empirically, that we are able to generate a valid proof with significantly less cost than generating a proof by the prover.

preprint2022arXiv

A Benchmark and Comprehensive Survey on Knowledge Graph Entity Alignment via Representation Learning

In the last few years, the interest in knowledge bases has grown exponentially in both the research community and the industry due to their essential role in AI applications. Entity alignment is an important task for enriching knowledge bases. This paper provides a comprehensive tutorial-type survey on representative entity alignment techniques that use the new approach of representation learning. We present a framework for capturing the key characteristics of these techniques, propose two datasets to address the limitation of existing benchmark datasets, and conduct extensive experiments using the proposed datasets. The framework gives a clear picture of how the techniques work. The experiments yield important results about the empirical performance of the techniques and how various factors affect the performance. One important observation not stressed by previous work is that techniques making good use of attribute triples and relation predicates as features stand out as winners.

preprint2022arXiv

A sufficient and necessary condition of generalized polynomial Liénard systems with global centers

The aim of this paper is to give a sufficient and necessary condition of the generalized polynomial Liénard system with a global center (including linear type and nilpotent type). Recently, Llibre and Valls [J. Differential Equations, 330 (2022), 66-80] gave a sufficient and necessary condition of the generalized polynomial Liénard system with a linear type global center. It is easy to see that our sufficient and necessary condition is simpler by comparison. In particular, we provide the explicit expressions of all the generalized polynomial Liénard differential systems of degree 5 having a global center at the origin and the explicit expression of a generalized polynomial Liénard differential system of indefinite degree having a global center at the origin.

preprint2022arXiv

A Survey on Channel Estimation and Practical Passive Beamforming Design for Intelligent Reflecting Surface Aided Wireless Communications

Intelligent reflecting surface (IRS) has emerged as a key enabling technology to realize smart and reconfigurable radio environment for wireless communications, by digitally controlling the signal reflection via a large number of passive reflecting elements in real-time. Different from conventional wireless communication techniques that only adapt to but have no or limited control over dynamic wireless channels, IRS provides a new and cost-effective means to combat the wireless channel impairments in a proactive manner. However, despite its great potential, IRS faces new and unique challenges in its efficient integration into wireless communication systems, especially its channel estimation and passive beamforming design under various practical hardware constraints. In this paper, we provide a comprehensive survey on the up-to-date research in IRS-aided wireless communications, with an emphasis on the promising solutions to tackle practical design issues. Furthermore, we discuss new and emerging IRS architectures and applications as well as their practical design problems to motivate future research.

preprint2022arXiv

A Survey on Gradient Inversion: Attacks, Defenses and Future Directions

Recent studies have shown that the training samples can be recovered from gradients, which are called Gradient Inversion (GradInv) attacks. However, there remains a lack of extensive surveys covering recent advances and thorough analysis of this issue. In this paper, we present a comprehensive survey on GradInv, aiming to summarize the cutting-edge research and broaden the horizons for different domains. Firstly, we propose a taxonomy of GradInv attacks by characterizing existing attacks into two paradigms: iteration- and recursion-based attacks. In particular, we dig out some critical ingredients from the iteration-based attacks, including data initialization, model training and gradient matching. Second, we summarize emerging defense strategies against GradInv attacks. We find these approaches focus on three perspectives covering data obscuration, model improvement and gradient protection. Finally, we discuss some promising directions and open problems for further research.

preprint2022arXiv

Achievable Rate Maximization for Underlay Spectrum Sharing MIMO System with Intelligent Reflecting Surface

In this letter, the achievable rate maximization problem is considered for intelligent reflecting surface (IRS) assisted multiple-input multiple-output (MIMO) systems in an underlay spectrum sharing scenario, subject to interference power constraints at the primary users. The formulated non-convex optimization problem is challenging to solve due to its non-convexity as well as coupling design variables in the constraints. Different from existing works that are mostly based on alternating optimization (AO), we propose a penalty dual decomposition based gradient projection (PDDGP) algorithm to solve this problem. We also provide a convergence proof and a complexity analysis for the proposed algorithm. We benchmark the proposed algorithm against two known solutions, namely a minimum mean-square error based AO algorithm and an inner approximation method with block coordinate descent. Specifically, the complexity of the proposed algorithm grows linearly with respect to the number of reflecting elements at the IRS, while that of the two benchmark methods grows with the third power of the number of IRS elements. Moreover, numerical results show that the proposed PDDGP algorithm yields considerably higher achievable rate than the benchmark solutions.

preprint2022arXiv

Active and Passive IRS Jointly Aided Communication: Deployment Design and Achievable Rate

In this letter, we study the wireless point-to-point communication from a transmitter (Tx) to a receiver (Rx), which is jointly aided by an active intelligent reflecting surface (AIRS) and a passive IRS (PIRS). We consider two practical transmission schemes by deploying the two IRSs in different orders, namely, Tx$\rightarrow$PIRS$\rightarrow$AIRS$\rightarrow$Rx (TPAR) and Tx$\rightarrow$AIRS$\rightarrow$PIRS$\rightarrow$Rx (TAPR). Assuming line-of-sight channels, we derive the achievable rates for the two schemes by optimizing the placement of the AIRS with the location of the PIRS fixed. Our analysis shows that when the number of PIRS elements and/or the AIRS amplification power is small, the AIRS should be deployed closer to the Rx in both schemes, and TAPR outperforms TPAR with their respective optimized AIRS/PIRS placement. Simulation results validate our analysis and show the considerable performance gain achieved by the jointly optimized AIRS/PIRS deployment over the baseline double-PIRS system under the same power and IRS element budgets.

preprint2022arXiv

Active-Passive IRS aided Wireless Communication: New Hybrid Architecture and Elements Allocation Optimization

Intelligent reflecting surface (IRS) has emerged as a promising technology to enhance the wireless communication network coverage and capacity by dynamically controlling the radio signal propagation environment. In contrast to the existing works that considered active or passive IRS only, we propose in this paper a new hybrid active-passive IRS architecture that consists of both active and passive reflecting elements, thus achieving their combined advantages flexibly. Under a practical channel setup with Rician fading where only the statistical channel state information (CSI) is available, we study the hybrid IRS design in a multi-user communication system. Specifically, we formulate an optimization problem to maximize the achievable ergodic capacity of the worst-case user by designing the hybrid IRS beamforming and active/passive elements allocation based on the statistical CSI, subject to various practical constraints on the active-element amplification factor and amplification power consumption, as well as the total active and passive elements deployment budget. To solve this challenging problem, we first approximate the ergodic capacity in a simpler form and then propose an efficient algorithm to solve the problem optimally. Moreover, we show that for the special case with all channels to be line-of-sight (LoS), only active elements need to be deployed when the total deployment budget is sufficiently small, while both active and passive elements should be deployed with a decreasing number ratio when the budget increases and exceeds a certain threshold. Finally, numerical results are presented which demonstrate the performance gains of the proposed hybrid IRS architecture and its optimal design over the conventional schemes with active/passive IRS only under various practical system setups.

preprint2022arXiv

Adversarial attacks and defenses in Speaker Recognition Systems: A survey

Speaker recognition has become very popular in many application scenarios, such as smart homes and smart assistants, due to its ease of use for remote control and its low cost. The rapid development of speaker recognition systems (SRSs) is inseparable from the advancement of machine learning, especially neural networks. However, previous work has shown that machine learning models are vulnerable to adversarial attacks in the image domain, which inspired researchers to explore adversarial attacks and defenses in SRSs. Unfortunately, existing literature lacks a thorough review of this topic. In this paper, we fill this gap by performing a comprehensive survey on adversarial attacks and defenses in SRSs. We first introduce the basics of SRSs and concepts related to adversarial attacks. Then, we propose two sets of criteria to evaluate the performance of attack methods and defense methods in SRSs, respectively. After that, we provide taxonomies of existing attack methods and defense methods, and further review them by employing our proposed criteria. Finally, based on our review, we identify some open issues and further specify a number of future directions to motivate research on SRS security.

preprint2022arXiv

AdvSmo: Black-box Adversarial Attack by Smoothing Linear Structure of Texture

Black-box attacks usually face two problems: poor transferability and the inability to evade the adversarial defense. To overcome these shortcomings, we create an original approach to generate adversarial examples by smoothing the linear structure of the texture in the benign image, called AdvSmo. We construct the adversarial examples without relying on any internal information to the target model and design the imperceptible-high attack success rate constraint to guide the Gabor filter to select appropriate angles and scales to smooth the linear texture from the input images to generate adversarial examples. Benefiting from the above design concept, AdvSmo will generate adversarial examples with strong transferability and solid evasiveness. Finally, compared to the four advanced black-box adversarial attack methods, for the eight target models, the results show that AdvSmo improves the average attack success rate by 9% on the CIFAR-10 and 16% on the Tiny-ImageNet dataset compared to the best of these attack methods.

preprint2022arXiv

An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation

We study the interpretability issue of task-oriented dialogue systems in this paper. Previously, most neural-based task-oriented dialogue systems employ an implicit reasoning strategy that makes the model predictions uninterpretable to humans. To obtain a transparent reasoning process, we introduce a neuro-symbolic approach to perform explicit reasoning that justifies model decisions by reasoning chains. Since deriving reasoning chains requires multi-hop reasoning for task-oriented dialogues, existing neuro-symbolic approaches would induce error propagation due to the one-phase design. To overcome this, we propose a two-phase approach that consists of a hypothesis generator and a reasoner. We first obtain multiple hypotheses, i.e., potential operations to perform the desired task, through the hypothesis generator. Each hypothesis is then verified by the reasoner, and the valid one is selected to conduct the final prediction. The whole system is trained by exploiting raw textual dialogues without using any reasoning chain annotations. Experimental studies on two public benchmark datasets demonstrate that the proposed approach not only achieves better results, but also introduces an interpretable decision process.

preprint2022arXiv

An Open Natural Language Processing Development Framework for EHR-based Clinical Research: A case demonstration using the National COVID Cohort Collaborative (N3C)

While we pay attention to the latest advances in clinical natural language processing (NLP), we can notice some resistance in the clinical and translational research community to adopt NLP models due to limited transparency, interpretability, and usability. In this study, we proposed an open natural language processing development framework. We evaluated it through the implementation of NLP algorithms for the National COVID Cohort Collaborative (N3C). Based on the interests in information extraction from COVID-19 related clinical notes, our work includes 1) an open data annotation process using COVID-19 signs and symptoms as the use case, 2) a community-driven ruleset composing platform, and 3) a synthetic text data generation workflow to generate texts for information extraction tasks without involving human subjects. The corpora were derived from texts from three different institutions (Mayo Clinic, University of Kentucky, University of Minnesota). The gold standard annotations were tested with a single institution's (Mayo) ruleset. This resulted in performances of 0.876, 0.706, and 0.694 in F-scores for Mayo, Minnesota, and Kentucky test datasets, respectively. The study as a consortium effort of the N3C NLP subgroup demonstrates the feasibility of creating a federated NLP algorithm development and benchmarking platform to enhance multi-institution clinical NLP study and adoption. Although we use COVID-19 as a use case in this effort, our framework is general enough to be applied to other domains of interest in clinical NLP.

preprint2022arXiv

AutoDES: AutoML Pipeline Generation of Classification with Dynamic Ensemble Strategy Selection

Automating machine learning has achieved remarkable technological developments in recent years, and building an automated machine learning pipeline is now an essential task. The model ensemble is the technique of combining multiple models to get a better and more robust model. However, existing automated machine learning tends to be simplistic in handling the model ensemble, where the ensemble strategy is fixed, such as stacked generalization. There have been many techniques on different ensemble methods, especially ensemble selection, and the fixed ensemble strategy limits the upper limit of the model's performance. In this article, we present a novel framework for automated machine learning. Our framework incorporates advances in dynamic ensemble selection, and to our best knowledge, our approach is the first in the field of AutoML to search and optimize ensemble strategies. In the comparison experiments, our method outperforms the state-of-the-art automated machine learning frameworks with the same CPU time in 42 classification datasets from the OpenML platform. Ablation experiments on our framework validate the effectiveness of our proposed method.

preprint2022arXiv

BARS: Towards Open Benchmarking for Recommender Systems

The past two decades have witnessed the rapid development of personalized recommendation techniques. Despite significant progress made in both research and practice of recommender systems, to date, there is a lack of a widely-recognized benchmarking standard in this field. Many existing studies perform model evaluations and comparisons in an ad-hoc manner, for example, by employing their own private data splits or using different experimental settings. Such conventions not only increase the difficulty in reproducing existing studies, but also lead to inconsistent experimental results among them. This largely limits the credibility and practical value of research results in this field. To tackle these issues, we present an initiative project (namely BARS) aiming for open benchmarking for recommender systems. In comparison to some earlier attempts towards this goal, we take a further step by setting up a standardized benchmarking pipeline for reproducible research, which integrates all the details about datasets, source code, hyper-parameter settings, running logs, and evaluation results. The benchmark is designed with comprehensiveness and sustainability in mind. It covers both matching and ranking tasks, and also enables researchers to easily follow and contribute to the research in this field. This project will not only reduce the redundant efforts of researchers to re-implement or re-run existing baselines, but also drive more solid and reproducible research on recommender systems. We would like to call upon everyone to use the BARS benchmark for future evaluation, and contribute to the project through the portal at: https://openbenchmark.github.io/BARS.

preprint2022arXiv

Bridge the Gap between Supervised and Unsupervised Learning for Fine-Grained Classification

Unsupervised learning technology has caught up with or even surpassed supervised learning technology in general object classification (GOC) and person re-identification (re-ID). However, it is found that the unsupervised learning of fine-grained visual classification (FGVC) is more challenging than GOC and person re-ID. In order to bridge the gap between unsupervised and supervised learning for FGVC, we investigate the essential factors (including feature extraction, clustering, and contrastive learning) for the performance gap between supervised and unsupervised FGVC. Furthermore, we propose a simple, effective, and practical method, termed as UFCL, to alleviate the gap. Three key issues are concerned and improved: First, we introduce a robust and powerful backbone, ResNet50-IBN, which has an ability of domain adaptation when we transfer ImageNet pre-trained models to FGVC tasks. Next, we propose to introduce HDBSCAN instead of DBSCAN to do clustering, which can generate better clusters for adjacent categories with fewer hyper-parameters. Finally, we propose a weighted feature agent and its updating mechanism to do contrastive learning by using the pseudo labels with inevitable noise, which can improve the optimization process of learning the parameters of the network. The effectiveness of our UFCL is verified on CUB-200-2011, Oxford-Flowers, Oxford-Pets, Stanford-Dogs, Stanford-Cars and FGVC-Aircraft datasets. Under the unsupervised FGVC setting, we achieve state-of-the-art results, and analyze the key factors and the important parameters to provide a practical guidance.

preprint2022arXiv

CancerBERT: a BERT model for Extracting Breast Cancer Phenotypes from Electronic Health Records

Accurate extraction of breast cancer patients' phenotypes is important for clinical decision support and clinical research. Current models do not take full advantage of cancer domain-specific corpora, and whether pre-training a Bidirectional Encoder Representations from Transformers (BERT) model on a cancer-specific corpus could improve the performance of extracting breast cancer phenotypes from text data remains to be explored. The objective of this study is to develop and evaluate the CancerBERT model for extracting breast cancer phenotypes from clinical texts in electronic health records. The data used in the study included 21,291 breast cancer patients diagnosed from 2010 to 2020; patients' clinical notes and pathology reports were collected from the University of Minnesota Clinical Data Repository (UMN). Results: About 3 million clinical notes and pathology reports in electronic health records for 21,291 breast cancer patients were collected to train the CancerBERT model. 200 pathology reports and 50 clinical notes of breast cancer patients that contain 9,685 sentences and 221,356 tokens were manually annotated by two annotators. 20% of the annotated data was used as a test set. Our CancerBERT model achieved the best performance with macro F1 scores equal to 0.876 (95% CI, 0.896-0.902) for exact match and 0.904 (95% CI, 0.896-0.902) for the lenient match. The NER models we developed would facilitate the automated information extraction from clinical texts to further help clinical decision support. Conclusions and Relevance: In this study, we focused on the breast cancer-related concepts extraction from EHR data and obtained a comprehensive annotated dataset that contains 7 types of breast cancer-related concepts. The CancerBERT model with customized vocabulary could significantly improve the performance for extracting breast cancer phenotypes from clinical texts.

preprint2022arXiv

Code Comment Inconsistency Detection with BERT and Longformer

Comments, or natural language descriptions of source code, are standard practice among software developers. By communicating important aspects of the code such as functionality and usage, comments help with software project maintenance. However, when the code is modified without an accompanying correction to the comment, an inconsistency between the comment and code can arise, which opens up the possibility for developer confusion and bugs. In this paper, we propose two models based on BERT (Devlin et al., 2019) and Longformer (Beltagy et al., 2020) to detect such inconsistencies in a natural language inference (NLI) context. Through an evaluation on a previously established corpus of comment-method pairs both during and after code changes, we demonstrate that our models outperform multiple baselines and yield comparable results to the state-of-the-art models that exclude linguistic and lexical features. We further discuss ideas for future research in using pretrained language models for both inconsistency detection and automatic comment updating.

preprint2022arXiv

CONTaiNER: Few-Shot Named Entity Recognition via Contrastive Learning

Named Entity Recognition (NER) in Few-Shot setting is imperative for entity tagging in low resource domains. Existing approaches only learn class-specific semantic features and intermediate representations from source domains. This affects generalizability to unseen target domains, resulting in suboptimal performances. To this end, we present CONTaiNER, a novel contrastive learning technique that optimizes the inter-token distribution distance for Few-Shot NER. Instead of optimizing class-specific attributes, CONTaiNER optimizes a generalized objective of differentiating between token categories based on their Gaussian-distributed embeddings. This effectively alleviates overfitting issues originating from training domains. Our experiments in several traditional test domains (OntoNotes, CoNLL'03, WNUT '17, GUM) and a new large scale Few-Shot NER dataset (Few-NERD) demonstrate that on average, CONTaiNER outperforms previous methods by 3%-13% absolute F1 points while showing consistent performance trends, even in challenging scenarios where previous approaches could not achieve appreciable performance.
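
The idea of comparing tokens through Gaussian-distributed embeddings can be illustrated with a small sketch. The abstract does not say which distance is used between the distributions, so the symmetrised KL divergence below is an assumption, as are the function names and the diagonal-covariance simplification.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(N_p || N_q) for two Gaussians with diagonal covariance.

    Each argument is a list of per-dimension means or variances.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_distance(p, q):
    """Symmetrised KL between two token embeddings given as (mean, variance) pairs.

    Assumption: a contrastive objective would pull same-category tokens
    together and push different-category tokens apart under this distance.
    """
    return 0.5 * (kl_diag_gaussians(*p, *q) + kl_diag_gaussians(*q, *p))


if __name__ == "__main__":
    token_a = ([0.0, 0.0], [1.0, 1.0])
    token_b = ([1.0, 0.0], [1.0, 1.0])
    print(symmetric_distance(token_a, token_a))
    print(symmetric_distance(token_a, token_b))
```

Representing each token as a distribution rather than a point lets the distance account for uncertainty in the embedding, which is one plausible reading of why the abstract stresses Gaussian embeddings.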

preprint2022arXiv

Contrastive Learning with Positive-Negative Frame Mask for Music Representation

Self-supervised learning, especially contrastive learning, has made an outstanding contribution to the development of many deep learning research fields. Recently, researchers in the acoustic signal processing field noticed its success and leveraged contrastive learning for better music representation. Typically, existing approaches maximize the similarity between two distorted audio segments sampled from the same music. In other words, they ensure a semantic agreement at the music level. However, those coarse-grained methods neglect some inessential or noisy elements at the frame level, which may be detrimental to the model to learn the effective representation of music. Towards this end, this paper proposes a novel Positive-nEgative frame mask for Music Representation based on the contrastive learning framework, abbreviated as PEMR. Concretely, PEMR incorporates a Positive-Negative Mask Generation module, which leverages transformer blocks to generate frame masks on the Log-Mel spectrogram. We can generate self-augmented negative and positive samples by masking important components or inessential components, respectively. We devise a novel contrastive learning objective to accommodate both self-augmented positives/negatives sampled from the same music. We conduct experiments on four public datasets. The experimental results of two music-related downstream tasks, music classification, and cover song identification, demonstrate the generalization ability and transferability of music representation learned by PEMR.

preprint2022arXiv

Deep Manifold Learning with Graph Mining

Admittedly, Graph Convolution Network (GCN) has achieved excellent results on graph datasets such as social networks, citation networks, etc. However, softmax used as the decision layer in these frameworks is generally optimized with thousands of iterations via gradient descent. Furthermore, because it ignores the inner distribution of the graph nodes, the decision layer might lead to unsatisfactory performance in semi-supervised learning with less label support. To address these issues, we propose a novel graph deep model with a non-gradient decision layer for graph mining. Firstly, manifold learning is unified with label local-structure preservation to capture the topological information of the nodes. Moreover, owing to the non-gradient property, a closed-form solution is obtained and employed as the decision layer for GCN. In particular, a joint optimization method is designed for this graph model, which greatly accelerates the convergence of the model. Finally, extensive experiments show that the proposed model achieves state-of-the-art performance compared to current models.

preprint2022arXiv

Deep Random Vortex Method for Simulation and Inference of Navier-Stokes Equations

Navier-Stokes equations are significant partial differential equations that describe the motion of fluids such as liquids and air. Due to the importance of Navier-Stokes equations, the development of efficient numerical schemes is important for both science and engineering. Recently, with the development of AI techniques, several approaches have been designed to integrate deep neural networks in simulating and inferring the fluid dynamics governed by incompressible Navier-Stokes equations, which can accelerate the simulation or inference process in a mesh-free and differentiable way. In this paper, we point out that existing deep Navier-Stokes informed methods are limited in handling non-smooth or fractional equations, which are two critical situations in reality. To this end, we propose the Deep Random Vortex Method (DRVM), which combines the neural network with a random vortex dynamics system equivalent to the Navier-Stokes equation. Specifically, the random vortex dynamics motivates a Monte Carlo based loss function for training the neural network, which avoids the calculation of derivatives through auto-differentiation. Therefore, DRVM not only can efficiently solve Navier-Stokes equations involving rough paths, non-differentiable initial conditions and fractional operators, but also inherits the mesh-free and differentiable benefits of the deep-learning-based solver. We conduct experiments on the Cauchy problem, parametric solver learning, and the inverse problem of both 2-d and 3-d incompressible Navier-Stokes equations. The proposed method achieves accurate results for simulation and inference of Navier-Stokes equations. Especially for the cases that include singular initial conditions, DRVM significantly outperforms the existing PINN method.

preprint2022arXiv

Detecting Arbitrary Order Beneficial Feature Interactions for Recommender Systems

Detecting beneficial feature interactions is essential in recommender systems, and existing approaches achieve this by examining all the possible feature interactions. However, the cost of examining all the possible higher-order feature interactions is prohibitive (exponentially growing with the order increasing). Hence existing approaches only detect limited order (e.g., combinations of up to four features) beneficial feature interactions, which may miss beneficial feature interactions with orders higher than the limitation. In this paper, we propose a hypergraph neural network based model named HIRS. HIRS is the first work that directly generates beneficial feature interactions of arbitrary orders and makes recommendation predictions accordingly. The number of generated feature interactions can be specified to be much smaller than the number of all the possible interactions and hence, our model admits a much lower running time. To achieve an effective algorithm, we exploit three properties of beneficial feature interactions, and propose deep-infomax-based methods to guide the interaction generation. Our experimental results show that HIRS outperforms state-of-the-art algorithms by up to 5% in terms of recommendation accuracy.

preprint2022arXiv

DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization

Transformer-based models have achieved state-of-the-art performance on short-input summarization. However, they still struggle with summarizing longer text. In this paper, we present DYLE, a novel dynamic latent extraction approach for abstractive long-input summarization. DYLE jointly trains an extractor and a generator and treats the extracted text snippets as the latent variable, allowing dynamic snippet-level attention weights during decoding. To provide adequate supervision, we propose simple yet effective heuristics for oracle extraction as well as a consistency loss term, which encourages the extractor to approximate the averaged dynamic weights predicted by the generator. We evaluate our method on different long-document and long-dialogue summarization tasks: GovReport, QMSum, and arXiv. Experiment results show that DYLE outperforms all existing methods on GovReport and QMSum, with gains up to 6.1 ROUGE, while yielding strong results on arXiv. Further analysis shows that the proposed dynamic weights provide interpretability of our generation process.

preprint2022arXiv

Efficient Bipartite Entanglement Detection Scheme with a Quantum Adversarial Solver

The recognition of entanglement states is a notoriously difficult problem when no prior information is available. Here, we propose an efficient quantum adversarial bipartite entanglement detection scheme to address this issue. Our proposal reformulates the bipartite entanglement detection as a two-player zero-sum game completed by parameterized quantum circuits, where a two-outcome measurement can be used to query a classical binary result about whether the input state is bipartite entangled or not. In principle, for an $N$-qubit quantum state, the runtime complexity of our proposal is $O(\text{poly}(N)T)$ with $T$ being the number of iterations. We experimentally implement our protocol on a linear optical network and exhibit its effectiveness to accomplish the bipartite entanglement detection for 5-qubit quantum pure states and 2-qubit quantum mixed states. Our work paves the way for using near-term quantum machines to tackle entanglement detection on multipartite entangled quantum systems.

preprint2022arXiv

Efficient Non-parametric Bayesian Hawkes Processes

In this paper, we develop an efficient nonparametric Bayesian estimation of the kernel function of Hawkes processes. The non-parametric Bayesian approach is important because it provides flexible Hawkes kernels and quantifies their uncertainty. Our method is based on the cluster representation of Hawkes processes. Utilizing the finite support assumption of the Hawkes process, we efficiently sample random branching structures and thus, we split the Hawkes process into clusters of Poisson processes. We derive two algorithms -- a block Gibbs sampler and a maximum a posteriori estimator based on expectation maximization -- and we show that our methods have a linear time complexity, both theoretically and empirically. On synthetic data, we show our methods to be able to infer flexible Hawkes triggering kernels. On two large-scale Twitter diffusion datasets, we show that our methods outperform the current state-of-the-art in goodness-of-fit and that the time complexity is linear in the size of the dataset. We also observe that on diffusions related to online videos, the learned kernels reflect the perceived longevity for different content types such as music or pet videos.
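
The branching-structure sampling step can be sketched as follows. This is a minimal illustration rather than the authors' sampler: the function name, the fixed background rate `mu`, and the exponential kernel in the usage lines are all assumptions. Each event is attributed either to the background process or to one earlier event, with probability proportional to the background rate or to the kernel evaluated at the time gap. Stopping the backward scan once the gap exceeds the kernel's support is what keeps the per-event cost bounded.

```python
import math
import random


def sample_branching(times, mu, kernel, support, rng):
    """Sample one parent per event for a Hawkes process.

    times   : strictly increasing event times
    mu      : background rate (assumed constant here)
    kernel  : triggering kernel, a function of the time gap
    support : gaps larger than this contribute nothing (finite support)
    rng     : random.Random instance
    Returns a list where entry i is None (background event) or the index
    of the earlier event that triggered event i.
    """
    parents = []
    for i, t in enumerate(times):
        candidates = [None]
        weights = [mu]
        # Scan backwards and stop as soon as the gap leaves the support.
        for j in range(i - 1, -1, -1):
            gap = t - times[j]
            if gap > support:
                break
            candidates.append(j)
            weights.append(kernel(gap))
        parents.append(rng.choices(candidates, weights=weights, k=1)[0])
    return parents


if __name__ == "__main__":
    rng = random.Random(0)
    events = [0.1, 0.4, 0.5, 2.0, 2.1, 5.0]
    parents = sample_branching(events, 0.5, lambda gap: 0.8 * math.exp(-gap), 1.0, rng)
    print(parents)
```

Events whose parent is `None` form the background Poisson process, and the children of each event form a further Poisson process, which is the cluster split the abstract refers to.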

preprint2022arXiv

Empowering Base Stations with Co-Site Intelligent Reflecting Surfaces: User Association, Channel Estimation and Reflection Optimization

Intelligent reflecting surface (IRS) has emerged as a promising technique to enhance wireless communication performance cost-effectively. The existing literature has mainly considered IRS being deployed near user terminals to improve their performance. However, this approach may incur a high cost if IRSs need to be densely deployed in the network to cater to random user locations. To avoid such high deployment cost, in this paper we consider a new IRS-aided wireless network architecture, where IRSs are deployed in the vicinity of each base station (BS) to assist in its communications with distributed users regardless of their locations. Besides significantly enhancing IRSs' signal coverage, this scheme helps reduce the IRS-associated channel estimation overhead as compared to conventional user-side IRSs, by exploiting the nearly static BS-IRS channels over short distance. For this scheme, we propose a new two-stage transmission protocol to achieve IRS channel estimation and reflection optimization for uplink data transmission efficiently. In addition, we propose effective methods for solving the user-IRS association problem based on long-term channel knowledge and the selected user-IRS-BS cascaded channel estimation problem. Finally, all IRSs' passive reflections are jointly optimized with the BS's multi-antenna receive combining to maximize the minimum achievable rate among all users for data transmission. Numerical results show that the proposed co-site IRS-empowered BS scheme can achieve significant performance gains over the conventional BS without co-site IRS and existing schemes for IRS channel estimation and reflection optimization, thus enabling an appealing low-cost and high-performance BS design for future wireless networks.

preprint2022arXiv

Environment-Aware Hybrid Beamforming by Leveraging Channel Knowledge Map

Hybrid analog/digital beamforming is a promising technique to realize millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems cost-effectively. However, existing hybrid beamforming designs mainly rely on real-time channel training or beam sweeping to find the desired beams, which incurs prohibitive overhead due to a large number of antennas at both the transmitter and receiver with only limited radio frequency (RF) chains. To resolve this challenging issue, in this paper, we propose a new environment-aware hybrid beamforming technique that requires only light real-time training, by leveraging the useful tool of channel knowledge map (CKM) with the user's location information. CKM is a site-specific database, which offers location-specific channel-relevant information to facilitate or even obviate the acquisition of real-time channel state information (CSI). Two specific types of CKM are proposed in this paper for hybrid beamforming design in mmWave massive MIMO systems, namely channel angle map (CAM) and beam index map (BIM). It is shown that compared with existing environment-unaware schemes, the proposed environment-aware hybrid beamforming scheme based on CKM can drastically improve the effective communication rate, even under moderate user location errors, thanks to its great saving of the prohibitive real-time training overhead.

preprint2022arXiv

FedOCR: Communication-Efficient Federated Learning for Scene Text Recognition

While scene text recognition techniques have been widely used in commercial applications, data privacy has rarely been taken into account by this research community. Most existing algorithms have assumed a set of shared or centralized training data. However, in practice, data may be distributed across different local devices and cannot be centralized or shared due to privacy restrictions. In this paper, we study how to make use of decentralized datasets for training a robust scene text recognizer while keeping them on local devices. To the best of our knowledge, we propose the first framework leveraging federated learning for scene text recognition, which is trained with decentralized datasets collaboratively. Hence we name it FedOCR. To make FedOCR suitable for deployment on end devices, we make two improvements: using lightweight models and hashing techniques. We argue that both are crucial for FedOCR in terms of the communication efficiency of federated learning. The simulations on decentralized datasets show that the proposed FedOCR achieves results competitive with models trained on centralized data, with lower communication costs and stronger privacy preservation.

preprint2022arXiv

Flavor Diagonal Nucleon Charges

This talk provides an update on the calculation of matrix elements of flavor diagonal axial, scalar and tensor quark bilinear operators between the nucleon ground state. The simulations are done using Wilson-clover fermions on a sea of eight 2+1+1-flavor HISQ ensembles generated by the MILC collaboration. We discuss the signal in the connected and disconnected contributions, calculation of the renormalization constants and mixing in the RI-sMOM scheme, and control over the simultaneous chiral-continuum-finite-volume fit used to extract the final charges.

preprint2022arXiv

FRIB: Low-poisoning Rate Invisible Backdoor Attack based on Feature Repair

During the generation of invisible backdoor attack poisoned data, the feature space transformation operation tends to cause the loss of some poisoned features and weakens the mapping relationship between source images with triggers and target labels, resulting in the need for a higher poisoning rate to achieve the corresponding backdoor attack success rate. To solve the above problems, we propose the idea of feature repair for the first time and introduce the blind watermark technique to repair the poisoned features lost during the generation of poisoned data. Under the premise of ensuring consistent labeling, we propose a low-poisoning rate invisible backdoor attack based on feature repair, named FRIB. Benefiting from the above design concept, the new method enhances the mapping relationship between the source images with triggers and the target labels, and increases the degree of misleading DNNs, thus achieving a high backdoor attack success rate with a very low poisoning rate. Ultimately, the detailed experimental results show that the goal of achieving a high success rate of backdoor attacks with a very low poisoning rate is achieved on all MNIST, CIFAR10, GTSRB, and ImageNet datasets.

preprint2022arXiv

Generalised Image Outpainting with U-Transformer

In this paper, we develop a novel transformer-based generative adversarial neural network called U-Transformer for the generalised image outpainting problem. Different from most present image outpainting methods, which conduct horizontal extrapolation, our generalised image outpainting can extrapolate visual context on all sides of a given image with plausible structure and details, even for complicated scenery, building, and art images. Specifically, we design a generator as an encoder-to-decoder structure embedded with the popular Swin Transformer blocks. As such, our novel neural network can better cope with image long-range dependencies, which are crucially important for generalised image outpainting. We additionally propose a U-shaped structure and multi-view Temporal Spatial Predictor (TSP) module to reinforce image self-reconstruction as well as unknown-part prediction smoothly and realistically. By adjusting the predicting step in the TSP module in the testing stage, we can generate arbitrary outpainting sizes given the input sub-image. We experimentally demonstrate that our proposed method can produce visually appealing results for generalised image outpainting compared with state-of-the-art image outpainting approaches.

preprint2022arXiv

How to Deploy Intelligent Reflecting Surfaces in Wireless Network: BS-side, User-side, or Both Sides?

The performance of wireless communication systems is fundamentally constrained by the random and uncontrollable wireless channel. By leveraging the recent advance in digitally-controlled metasurface, intelligent reflecting surface (IRS) has emerged as a promising solution to enhance the wireless network performance by smartly reconfiguring the radio propagation environment. Despite the substantial research on IRS-aided communications, this article addresses the important issue of how to deploy IRSs in a wireless network to achieve its optimum performance. We first compare the two conventional strategies of deploying IRS at the side of base station or distributed users in terms of various communication performance metrics, and then propose a new hybrid IRS deployment strategy by combining their complementary advantages. Moreover, the main challenges in optimizing IRS deployment as well as their promising solutions are discussed. A case study is also presented to compare the performance of different IRS deployment strategies and draw useful insights for practical design.

preprint2022arXiv

Improving Transferability for Domain Adaptive Detection Transformers

DETR-style detectors stand out amongst in-domain scenarios, but their properties in domain shift settings are under-explored. This paper aims to build a simple but effective baseline with a DETR-style detector on domain shift settings based on two findings. For one, mitigating the domain shift on the backbone and the decoder output features excels in getting favorable results. For another, advanced domain alignment methods in both parts further enhance the performance. Thus, we propose the Object-Aware Alignment (OAA) module and the Optimal Transport based Alignment (OTA) module to achieve comprehensive domain alignment on the outputs of the backbone and the detector. The OAA module aligns the foreground regions identified by pseudo-labels in the backbone outputs, leading to domain-invariant based features. The OTA module utilizes sliced Wasserstein distance to maximize the retention of location information while minimizing the domain gap in the decoder outputs. We implement the findings and the alignment modules into our adaptation method, and it benchmarks the DETR-style detector on the domain shift settings. Experiments on various domain adaptive scenarios validate the effectiveness of our method.
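
The sliced Wasserstein distance mentioned for the OTA module can be illustrated with a short sketch. This is a generic Monte Carlo estimate between two equally sized point sets, not the paper's alignment loss: the function name, the number of projections, and the use of the 1-Wasserstein cost are assumptions. Each random direction reduces the problem to one dimension, where optimal transport amounts to matching sorted projections.

```python
import math
import random


def sliced_wasserstein(xs, ys, n_projections=64, seed=0):
    """Monte Carlo estimate of the sliced 1-Wasserstein distance.

    xs, ys : lists of equal length holding points (lists of floats)
             of the same dimension.
    """
    assert len(xs) == len(ys), "this sketch assumes equally sized sets"
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        # Draw a random unit direction.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in direction))
        direction = [c / norm for c in direction]
        # Project both sets onto the direction and sort.
        px = sorted(sum(a * b for a, b in zip(p, direction)) for p in xs)
        py = sorted(sum(a * b for a, b in zip(p, direction)) for p in ys)
        # In 1-D the optimal transport plan pairs sorted samples.
        total += sum(abs(a - b) for a, b in zip(px, py)) / len(px)
    return total / n_projections


if __name__ == "__main__":
    source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    target = [[2.0, 2.0], [3.0, 2.0], [2.0, 3.0]]
    print(sliced_wasserstein(source, source))
    print(sliced_wasserstein(source, target))
```

Because each slice only needs a sort, the estimate is cheap compared with solving a full transport problem, which is a common reason to prefer it for feature alignment.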

preprint2022arXiv

Instantaneous indirect measurement principle in quantum mechanics

In quantum systems, the measurement of operators and the measurement of the quantum states of the system are very challenging tasks. In this Letter, we propose a method to obtain the average value of one operator in a certain state by measuring the instantaneous change of the average value of another operator with the assistance of a known reference state. We refer to this measurement method as the instantaneous indirect measurement method. By studying the application of this method to some typical models, we find that this measurement can be applied to the measurement of an arbitrary state of a quantum system. Furthermore, for the system to be measured, we find that such measurement neither significantly affects the wave function of the system nor causes wave function collapse of the system. Also, our study shows that when two independent systems are coupled, the information mapping between them is done instantaneously. Finally, we discuss applying this measurement method to the measurement of quantum Fisher information, which quantifies the limit on the accuracy of estimating a parameter from a quantum state.

preprint2022arXiv

Intelligent Reflecting Surface Aided Wireless Networks: From Single-Reflection to Multi-Reflection Design and Optimization

Intelligent reflecting surface (IRS) has emerged as a promising technique for wireless communication networks. By dynamically tuning the reflection amplitudes/phase shifts of a large number of passive elements, IRS enables flexible wireless channel control and configuration, and thereby enhances the wireless signal transmission rate and reliability significantly. Despite the vast literature on designing and optimizing assorted IRS-aided wireless systems, prior works have mainly focused on enhancing wireless links with single signal reflection only by one or multiple IRSs, which may be insufficient to boost the wireless link capacity under some harsh propagation conditions (e.g., indoor environment with dense blockages/obstructions). This issue can be tackled by employing two or more IRSs to assist each wireless link and jointly exploiting their single as well as multiple signal reflections over them. However, the resultant double-/multi-IRS aided wireless systems face more complex design issues as well as new practical challenges for implementation as compared to the conventional single-IRS counterpart, in terms of IRS reflection optimization, channel acquisition, as well as IRS deployment and association/selection. As such, a new paradigm for designing multi-IRS cooperative passive beamforming and joint active/passive beam routing arises which calls for innovative design approaches and optimization methods. In this paper, we give a tutorial overview of multi-IRS aided wireless networks, with an emphasis on addressing the new challenges due to multi-IRS signal reflection and routing. Moreover, we point out important directions worthy of research and investigation in the future.

preprint2022arXiv

Intelligent Reflecting Surface for MIMO VLC: Joint Design of Surface Configuration and Transceiver Signal Processing

With the capability of reconfiguring the wireless electromagnetic environment, intelligent reflecting surface (IRS) is a new paradigm for designing future wireless communication systems. In this paper, we consider optical IRS for improving the performance of visible light communication (VLC) under a multiple-input and multiple-output (MIMO) setting. Specifically, we focus on the downlink communication of an indoor MIMO VLC system and aim to minimize the mean square error (MSE) of demodulated signals at the receiver. To this end, the MIMO channel gain of the IRS-aided VLC is first derived under the point source assumption, based on which the MSE minimization problem is then formulated subject to the emission power constraints. Next, we propose an alternating optimization algorithm, which decomposes the original problem into three subproblems, to iteratively optimize the IRS configuration and the precoding and detection matrices for minimizing the MSE. Moreover, theoretical analysis of the performance of the proposed algorithm in high and low signal-to-noise ratio (SNR) regimes is provided, revealing that the joint optimization process can be simplified in such special cases, and the algorithm's convergence property and computational complexity are also discussed. Finally, numerical results show that IRS-aided schemes significantly reduce the MSE as compared to their counterparts without IRS, and the proposed algorithm outperforms other baseline schemes.

preprint2022arXiv

Intelligent Reflecting Surface for Multi-Path Beam Routing with Active/Passive Beam Splitting and Combining

Intelligent reflecting surface (IRS) can be densely deployed in wireless networks to significantly enhance the communication channels. In this letter, we consider the downlink transmission from a multi-antenna base station (BS) to a single-antenna user, by exploiting the cooperative passive beamforming (CPB) and line-of-sight (LoS) path diversity gains of multi-IRS signal reflection. Unlike existing works where only one single multi-IRS reflection path from the BS to user is selected, we propose a new and more general multi-path beam routing scheme. Specifically, the BS sends the user's information signal via multiple orthogonal active beams (termed as active beam splitting), which point towards different IRSs. Then, these beamed signals are subsequently reflected by selected IRSs via their CPB in different paths, and finally coherently combined at the user's receiver (thus named passive beam combining). For this scheme, we formulate a new multi-path beam routing design problem to jointly optimize the number of IRS reflection paths, the selected IRSs for each of the reflection paths, the active/passive beamforming at the BS/each selected IRS, as well as the BS's power allocation over different active beams, so as to maximize the received signal power at the user. To solve this challenging problem, we first derive the optimal BS/IRS beamforming and BS power allocation for a given set of reflection paths. The clique-based approach in graph theory is then applied to solve the remaining multi-path selection problem efficiently. Simulation results show that our proposed multi-path beam routing scheme significantly outperforms its conventional single-path beam routing special case.

preprint2022arXiv

Intelligent Reflecting Surface-Aided LEO Satellite Communication: Cooperative Passive Beamforming and Distributed Channel Estimation

We consider in this paper a new intelligent reflecting surface (IRS)-aided LEO satellite communication system, by utilizing the controllable phase shifts of massive passive reflecting elements to achieve flexible beamforming, which copes with the time-varying channel between the high-mobility satellite (SAT) and ground node (GN) cost-effectively. In particular, we propose a new architecture for IRS-aided LEO satellite communication where IRSs are deployed at both sides of the SAT and GN, and study their cooperative passive beamforming (CPB) design over line-of-sight (LoS)-dominant single-reflection and double-reflection channels. Specifically, we jointly optimize the active transmit/receive beamforming at the SAT/GN as well as the CPB at two-sided IRSs to maximize the overall channel gain from the SAT to each GN. Interestingly, we show that under LoS channel conditions, the high-dimensional SAT-GN channel can be decomposed into the outer product of two low-dimensional vectors. By exploiting the decomposed SAT-GN channel, we decouple the original beamforming optimization problem into two simpler subproblems corresponding to the SAT and GN sides, respectively, which are both solved in closed-form. Furthermore, we propose an efficient transmission protocol to conduct channel estimation and beam tracking, which only requires independent processing of the SAT and GN in a distributed manner, thus substantially reducing the implementation complexity. Simulation results validate the performance advantages of the proposed IRS-aided LEO satellite communication system with two-sided cooperative IRSs, as compared to various baseline schemes such as the conventional reflect-array and one-sided IRS.

preprint2022arXiv

Intelligent Reflecting Surface-Aided Spectrum Sensing for Cognitive Radio

Spectrum sensing is a key enabling technique for cognitive radio (CR), which provides essential information on the spectrum availability. However, due to severe wireless channel fading and path loss, the primary user (PU) signals received at the CR or secondary user (SU) can be practically too weak for reliable detection. To tackle this issue, we consider in this letter a new intelligent reflecting surface (IRS)-aided spectrum sensing scheme for CR, by exploiting the large aperture and passive beamforming gains of IRS to boost the PU signal strength received at the SU to facilitate its spectrum sensing. Specifically, by dynamically changing the IRS reflection over time according to a given codebook, its reflected signal power varies substantially at the SU, which is utilized for opportunistic signal detection. Furthermore, we propose a weighted energy detection method by combining the received signal power values over different IRS reflections, which significantly improves the detection performance. Simulation results validate the performance gain of the proposed IRS-aided spectrum sensing scheme, as compared to different benchmark schemes.
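
The weighted energy detection idea can be sketched in a few lines. The abstract does not give the weights or the decision threshold, so both are hypothetical inputs here, and the function names are illustrative only. The statistic averages the received signal power under each IRS reflection pattern and then combines those averages with per-pattern weights.

```python
def weighted_energy_statistic(samples_per_reflection, weights):
    """Weighted sum of average received power over IRS reflection patterns.

    samples_per_reflection : one list of complex baseband samples per
                             reflection pattern in the codebook
    weights                : one non-negative weight per pattern
                             (hypothetical; not specified in the abstract)
    """
    statistic = 0.0
    for samples, weight in zip(samples_per_reflection, weights):
        average_power = sum(abs(s) ** 2 for s in samples) / len(samples)
        statistic += weight * average_power
    return statistic


def primary_user_present(samples_per_reflection, weights, threshold):
    """Declare the primary user present when the statistic exceeds the threshold."""
    return weighted_energy_statistic(samples_per_reflection, weights) > threshold


if __name__ == "__main__":
    received = [[1 + 0j, 1j], [2 + 0j]]
    print(weighted_energy_statistic(received, [0.5, 0.5]))
    print(primary_user_present(received, [0.5, 0.5], threshold=2.0))
```

Giving larger weights to reflection patterns that yield stronger received power is one way such a detector could exploit the power variation across patterns described in the abstract.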

preprint2022arXiv

Loop induced top quark FCNC through top quark and dark matter interactions

We present a comprehensive analysis of the loop-induced top quark FCNC signals at the LHC within one class of simplified models. The loop-level FCNC interactions are well motivated to avoid the hierarchy of the top quark couplings from the new physics and standard model. Such a theory will posit a Majorana dark matter candidate and could be tested through dark matter relic density, direct detection experiments (the scattering between dark matter and heavy nuclei), and the collider signals at the LHC. We find that the spin-independent (SI) scattering between Majorana dark matter and nuclei will vanish at the leading order, while the next-to-leading order correction to the SI scattering becomes significant in constraining the parameter space of the model. A detailed comparison of direct detection experiments and LHC searches is also discussed; both are essential to fully constrain the model.

preprint2022arXiv

Loss-tolerant all-photonic quantum repeater with generalized Shor code

The all-photonic quantum repeater (APQR) is a promising repeater scheme to realize long-distance quantum communication. For a practical APQR, an indispensable requirement is the robustness of the repeater graph state (RGS) against photon loss. We propose a new loss-tolerant scheme by applying the generalized Shor code to RGS, which can be experimentally demonstrated with current technology. Experimentally, we first prepare and verify the nine-qubit Shor code. Then, by applying the generalized Shor code to APQR and preparing a simplified encoded RGS with the structure of $1\times2$ based on the Shor code state, the effectiveness of our loss-tolerant scheme and the loss tolerance of the encoded RGS are respectively verified. Our results make an essential step toward a practical APQR and enrich the research of quantum error correction code.

preprint2022arXiv

Matrix Completion via Non-Convex Relaxation and Adaptive Correlation Learning

The existing matrix completion methods focus on optimizing the relaxation of rank function such as nuclear norm, Schatten-p norm, etc. They usually need many iterations to converge. Moreover, only the low-rank property of matrices is utilized in most existing models and several methods that incorporate other knowledge are quite time-consuming in practice. To address these issues, we propose a novel non-convex surrogate that can be optimized by closed-form solutions, such that it empirically converges within dozens of iterations. Besides, the optimization is parameter-free and the convergence is proved. Compared with the relaxation of rank, the surrogate is motivated by optimizing an upper bound of the rank. We theoretically validate that it is equivalent to the existing matrix completion models. Besides the low-rank assumption, we intend to exploit the column-wise correlation for matrix completion, and thus an adaptive correlation learning, which is scaling-invariant, is developed. More importantly, after incorporating the correlation learning, the model can still be solved by closed-form solutions such that it still converges fast. Experiments show the effectiveness of the non-convex surrogate and adaptive correlation learning.

preprint2022arXiv

MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question Answering for 16 Diverse Languages

We present the results of the Workshop on Multilingual Information Access (MIA) 2022 Shared Task, evaluating cross-lingual open-retrieval question answering (QA) systems in 16 typologically diverse languages. In this task, we adapted two large-scale cross-lingual open-retrieval QA datasets in 14 typologically diverse languages, and newly annotated open-retrieval QA data in 2 underrepresented languages: Tagalog and Tamil. Four teams submitted their systems. The best system leveraging iteratively mined diverse negative examples and larger pretrained models achieves 32.2 F1, outperforming our baseline by 4.5 points. The second best system uses entity-aware contextualized representations for document retrieval, and achieves significant improvements in Tamil (20.8 F1), whereas most of the other systems yield nearly zero scores.

preprint2022arXiv

MISS: Multi-Interest Self-Supervised Learning Framework for Click-Through Rate Prediction

CTR prediction is essential for modern recommender systems. Ranging from early factorization machines to deep learning based models in recent years, existing CTR methods focus on capturing useful feature interactions or mining important behavior patterns. Despite their effectiveness, we argue that these methods suffer from the risk of label sparsity (i.e., the user-item interactions are highly sparse with respect to the feature space), label noise (i.e., the collected user-item interactions are usually noisy), and the underuse of domain knowledge (i.e., the pairwise correlations between samples). To address these challenging problems, we propose a novel Multi-Interest Self-Supervised learning (MISS) framework which enhances the feature embeddings with interest-level self-supervision signals. With the help of two novel CNN-based multi-interest extractors, self-supervision signals are discovered with full considerations of different interest representations (point-wise and union-wise), interest dependencies (short-range and long-range), and interest correlations (inter-item and intra-item). Based on that, contrastive learning losses are further applied to the augmented views of interest representations, which effectively improves the feature representation learning. Furthermore, our proposed MISS framework can be used as a plug-in component with existing CTR prediction models and further boost their performance. Extensive experiments on three large-scale datasets show that MISS significantly outperforms the state-of-the-art models, by up to 13.55% in AUC, and also enjoys good compatibility with representative deep CTR models.

preprint2022arXiv

Modular Extremely Large-Scale Array Communication: Near-Field Modelling and Performance Analysis

This paper investigates wireless communications based on a new antenna array architecture, termed modular extremely large-scale array (XL-array), where an extremely large number of antenna elements are regularly arranged on a common platform in a modular manner. Each module consists of a flexible/moderate number of antenna elements, and different modules are separated with an inter-module spacing that is typically much larger than the inter-element spacing/signal wavelength for ease of deployment. By properly modelling the variations of signal phase, amplitude and projected aperture across different array modules/elements, we develop the new channel model and analyze the signal-to-noise ratio (SNR) performance of the modular XL-array based communications. Under the practical non-uniform spherical wave (NUSW) model, the closed-form expression of the maximum achievable SNR is derived in terms of key geometric parameters, including the total planar array size, module separation distances along each dimension, as well as the user's location in the three-dimensional (3D) space. Besides, the asymptotic SNR scaling laws are revealed as the number of modules along different dimensions goes to infinity. Moreover, we show that our developed near-field modelling and performance analysis include the existing ones for the collocated XL-array, the far-field uniform plane wave (UPW) model, as well as the one-dimensional (1D) modular extremely large-scale uniform linear array (XL-ULA) as special cases. Extensive simulation results are provided to validate our obtained results.

preprint2022arXiv

Multi-Level Interaction Reranking with User Behavior History

As the final stage of the multi-stage recommender system (MRS), reranking directly affects users' experience and satisfaction, thus playing a critical role in MRS. Despite the improvement achieved in the existing work, three issues are yet to be solved. First, users' historical behaviors contain rich preference information, such as users' long and short-term interests, but are not fully exploited in reranking. Previous work typically treats items in history equally important, neglecting the dynamic interaction between the history and candidate items. Second, existing reranking models focus on learning interactions at the item level while ignoring the fine-grained feature-level interactions. Lastly, estimating the reranking score on the ordered initial list before reranking may lead to the early scoring problem, thereby yielding suboptimal reranking performance. To address the above issues, we propose a framework named Multi-level Interaction Reranking (MIR). MIR combines low-level cross-item interaction and high-level set-to-list interaction, where we view the candidate items to be reranked as a set and the users' behavior history in chronological order as a list. We design a novel SLAttention structure for modeling the set-to-list interactions with personalized long-short term interests. Moreover, feature-level interactions are incorporated to capture the fine-grained influence among items. We design MIR in such a way that any permutation of the input items would not change the output ranking, and we theoretically prove it. Extensive experiments on three public and proprietary datasets show that MIR significantly outperforms the state-of-the-art models using various ranking and utility metrics.

preprint2022arXiv

MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data

Numerical reasoning over hybrid data containing both textual and tabular content (e.g., financial reports) has recently attracted much attention in the NLP community. However, existing question answering (QA) benchmarks over hybrid data only include a single flat table in each document and thus lack examples of multi-step numerical reasoning across multiple hierarchical tables. To facilitate data analytical progress, we construct a new large-scale benchmark, MultiHiertt, with QA pairs over Multi Hierarchical Tabular and Textual data. MultiHiertt is built from a wealth of financial reports and has the following unique characteristics: 1) each document contains multiple tables and longer unstructured texts; 2) most of the tables are hierarchical; 3) the reasoning process required for each question is more complex and challenging than in existing benchmarks; and 4) fine-grained annotations of reasoning processes and supporting facts are provided to reveal complex numerical reasoning. We further introduce a novel QA model termed MT2Net, which first applies fact retrieval to extract relevant supporting facts from both tables and text and then uses a reasoning module to perform symbolic reasoning over the retrieved facts. We conduct comprehensive experiments on various baselines. The experimental results show that MultiHiertt presents a strong challenge for existing baselines, whose results lag far behind the performance of human experts. The dataset and code are publicly available at https://github.com/psunlpgroup/MultiHiertt.

preprint2022arXiv

Near-Field Modelling and Performance Analysis of Modular Extremely Large-Scale Array Communications

This letter studies a new array architecture, termed as modular extremely large-scale array (XL-array), for which a large number of array elements are arranged in a modular manner. Each module consists of a moderate number of array elements and the modules are regularly arranged with the inter-module space typically much larger than signal wavelength to cater to the actual mounting structure. We study the mathematical modelling and conduct the performance analysis for modular XL-array communications, by considering the non-uniform spherical wave (NUSW) characteristic that is more suitable than the conventional uniform plane wave (UPW) assumption for physically large arrays. A closed-form expression is derived for the maximum signal-to-noise ratio (SNR) in terms of the geometries of the modular XL-array, including the total array size and module separation, as well as the user's location. The asymptotic SNR scaling law is revealed as the size of modular array goes to infinity. Furthermore, we show that the developed modelling and performance analysis include the existing results for collocated XL-array or far-field UPW assumption as special cases. Numerical results demonstrate the importance of near-field modelling for modular XL-array communications since it leads to significantly different results from the conventional far-field UPW modelling.

preprint2022arXiv

Nematic fluctuations in the non-superconducting iron pnictide BaFe$_{1.9-x}$Ni$_{0.1}$Cr$_{x}$As$_{2}$

The main driving force of the electronic nematic phase in iron-based superconductors is still under debate. Here, we report a comprehensive study on the nematic fluctuations in a non-superconducting iron pnictide system BaFe$_{1.9-x}$Ni$_{0.1}$Cr$_{x}$As$_{2}$ by electronic transport, angle-resolved photoemission spectroscopy (ARPES) and inelastic neutron scattering (INS) measurements. Previous neutron diffraction and transport measurements suggested that the collinear antiferromagnetism persists to $x=0.8$, with similar Néel temperature $T_N$ and structural transition temperature $T_s$ around 32 K, but the charge carriers change from electron type to hole type around $x=0.5$. In this study, we have found that the in-plane resistivity anisotropy also depends strongly on the Cr doping and the type of charge carriers. While ARPES measurements suggest a possibly weak orbital anisotropy onset near $T_s$ for both $x=0.05$ and $x=0.5$ compounds, INS experiments reveal clearly different onset temperatures of low-energy spin excitation anisotropy, which is likely related to the energy scale of spin nematicity. These results suggest that the interplay between the local spins on Fe atoms and the itinerant electrons on Fermi surfaces is crucial to the nematic fluctuations of iron pnictides, where the orbital degree of freedom may behave differently from the spin degree of freedom, and the transport properties are intimately related to the spin dynamics.

preprint2022arXiv

Neural Program Synthesis with Query

Aiming to find a program satisfying the user intent given input-output examples, program synthesis has attracted increasing interest in the area of machine learning. Despite the promising performance of existing methods, most of their success comes from the privileged information of well-designed input-output examples. However, providing such input-output examples is unrealistic because it requires the users to have the ability to describe the underlying program with a few input-output examples under the training distribution. In this work, we propose a query-based framework that trains a query neural network to generate informative input-output examples automatically and interactively from a large query space. The quality of the query depends on the amount of the mutual information between the query and the corresponding program, which can guide the optimization of the query framework. To estimate the mutual information more accurately, we introduce the functional space (F-space) which models the relevance between the input-output examples and the programs in a differentiable way. We evaluate the effectiveness and generalization of the proposed query-based framework on the Karel task and the list processing task. Experimental results show that the query-based framework can generate informative input-output examples which achieve and even outperform well-designed input-output examples.

preprint2022arXiv

Neural Re-ranking in Multi-stage Recommender Systems: A Review

As the final stage of the multi-stage recommender system (MRS), re-ranking directly affects user experience and satisfaction by rearranging the input ranking lists, and thereby plays a critical role in MRS. With the advances in deep learning, neural re-ranking has become a trending topic and been widely applied in industrial applications. This review aims at integrating re-ranking algorithms into a broader picture, and paving ways for more comprehensive solutions for future research. For this purpose, we first present a taxonomy of current methods on neural re-ranking. Then we give a description of these methods along with the historic development according to their objectives. The network structure, personalization, and complexity are also discussed and compared. Next, we provide benchmarks of the major neural re-ranking models and quantitatively analyze their re-ranking performance. Finally, the review concludes with a discussion on future prospects of this field. A list of papers discussed in this review, the benchmark datasets, our re-ranking library LibRerank, and detailed parameter settings are publicly available at https://github.com/LibRerank-Community/LibRerank.

preprint2022arXiv

New criterions on nonexistence of periodic orbits of planar dynamical systems and their applications

Characterizing the existence or nonexistence of periodic orbits is a classical problem with both theoretical importance and many real applications. In this paper, several new criteria for the nonexistence of periodic orbits of the planar dynamical system $\dot x=y,~\dot y=-g(x)-f(x,y)y$ are obtained, and examples show that these criteria apply in cases where the known ones fail. Based on these criteria, we further characterize the local topological structures of its equilibrium, which also shows that one of the classical results by A.F. Andreev [Amer. Math. Soc. Transl. 8 (1958), 183--207] on the local topological classification of the degenerate equilibrium is incomplete. Finally, as another application of these results, we classify the global phase portraits of a planar differential system, which comes from the third question in the list of the 33 questions posed by A. Gasull and also from a mechanical oscillator under suitable restrictions on its parameters.

preprint2022arXiv

On the Opportunity of Causal Learning in Recommendation Systems: Foundation, Estimation, Prediction and Challenges

Recently, recommender systems (RS) based on causal inference have gained much attention in the industrial community and have achieved state-of-the-art performance in many prediction and debiasing tasks. Nevertheless, a unified causal analysis framework has not been established yet. Many causal-based prediction and debiasing studies rarely discuss the causal interpretation of various biases and the rationality of the corresponding causal assumptions. In this paper, we first provide a formal causal analysis framework to survey and unify the existing causal-inspired recommendation methods, which can accommodate different scenarios in RS. Then we propose a new taxonomy and give formal causal definitions of various biases in RS from the perspective of violating the assumptions adopted in causal analysis. Finally, we formalize many debiasing and prediction tasks in RS, and summarize the statistical and machine learning-based causal estimation methods, expecting to provide new research opportunities and perspectives to the causal RS community.

preprint2022arXiv

Optimizing Age of Information in Wireless Uplink Networks with Partial Observations

We consider a wireless uplink network consisting of multiple end devices and an access point (AP). Each device monitors a physical process with stochastic arrival of status updates and sends these updates to the AP over a shared channel. The AP aims to schedule the transmissions of these devices to optimize the network-wide information freshness, quantified by the Age of Information (AoI) metric. Due to the stochastic arrival of the status updates at the devices, the AP only has partial observations of system times of the latest status updates at the devices when making scheduling decisions. We formulate such a decision-making problem as a belief Markov Decision Process (belief-MDP). The belief-MDP in its original form is difficult to solve as the dimension of its states can go to infinity and its belief space is uncountable. By leveraging the properties of the status update arrival (i.e., Bernoulli) processes, we manage to simplify the feasible states of the belief-MDP to two-dimensional vectors. Built on that, we devise a low-complexity scheduling policy. We derive upper bounds for the AoI performance of the low-complexity policy and analyze the performance guarantee by comparing its performance with a universal lower bound. Numerical results validate our analyses.
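
For orientation, the age bookkeeping behind this kind of scheduling problem can be sketched as below. It is a minimal illustration under assumed simplifications (slotted time, at most one device polled per slot, error-free delivery, and each device keeping only its freshest undelivered update). The function name `step_aoi` and all variable names are hypothetical, and the belief-MDP and the scheduling policy from the abstract are not implemented here.

```python
def step_aoi(ages, system_times, scheduled, arrivals):
    """Advance the network by one slot and return (new_ages, new_system_times).

    ages[i]         : AoI of device i as seen by the AP.
    system_times[i] : slots since device i's freshest undelivered update was
                      generated, or None if it holds no undelivered update.
    scheduled       : index of the device polled in this slot, or None.
    arrivals[i]     : True if a new status update arrives at device i at the
                      end of this slot.
    """
    new_ages, new_times = [], []
    for i, (age, z) in enumerate(zip(ages, system_times)):
        if i == scheduled and z is not None:
            # The delivered update is z + 1 slots old when it reaches the AP.
            new_ages.append(z + 1)
            z = None
        else:
            # No fresh information received, so the age grows by one slot.
            new_ages.append(age + 1)
        if arrivals[i]:
            z = 0          # a newer update replaces whatever was held
        elif z is not None:
            z += 1         # the held update keeps getting older
        new_times.append(z)
    return new_ages, new_times
```

In this sketch the AP would need `system_times` to pick the best device, which is exactly the quantity it only observes partially in the setting the abstract describes.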

preprint2022arXiv

Outpainting by Queries

Image outpainting, which is well studied with Convolutional Neural Network (CNN) based frameworks, has recently drawn more attention in computer vision. However, CNNs rely on inherent inductive biases to achieve effective sample learning, which may degrade the performance ceiling. In this paper, motivated by the flexible self-attention mechanism with minimal inductive biases in transformer architecture, we reframe the generalised image outpainting problem as a patch-wise sequence-to-sequence autoregression problem, enabling query-based image outpainting. Specifically, we propose a novel hybrid vision-transformer-based encoder-decoder framework, named Query Outpainting TRansformer (QueryOTR), for extrapolating visual context on all sides of a given image. Patch-wise mode's global modeling capacity allows us to extrapolate images from the attention mechanism's query standpoint. A novel Query Expansion Module (QEM) is designed to integrate information from the predicted queries based on the encoder's output, hence accelerating the convergence of the pure transformer even with a relatively small dataset. To further enhance connectivity between each patch, the proposed Patch Smoothing Module (PSM) re-allocates and averages the overlapped regions, thus providing seamless predicted images. We experimentally show that QueryOTR could generate visually appealing results smoothly and realistically against the state-of-the-art image outpainting approaches.

preprint2022arXiv

Performance Analysis of OMP in Super-Resolution

Given a spectrally sparse signal $\mathbf{y} = \sum_{i=1}^s x_i\mathbf{f}(\tau_i) \in \mathbb{C}^{2n+1}$ consisting of $s$ complex sinusoids, we consider the super-resolution problem, which is about estimating the frequency components $\{\tau_i\}_{i=1}^s$ of $\mathbf y$. We consider OMP-type algorithms for super-resolution, which are more efficient than other approaches based on Semi-Definite Programming. Our analysis shows that a two-stage algorithm with OMP initialization can recover the frequency components under the separation condition $n\Delta \gtrsim \text{dyn}(\mathbf{x})$ and that the dependency on $\text{dyn}(\mathbf{x})$ is inevitable for the vanilla OMP algorithm. We further show that the Sliding-OMP algorithm, a variant of the OMP algorithm with an additional refinement step at each iteration, provably recovers $\{\tau_i\}_{i=1}^s$ under the separation condition $n\Delta \geq c$. Moreover, our result can be extended to an incomplete measurement model with $O(s^2\log n)$ measurements.

preprint2022arXiv

Predicting Cancer Treatments Induced Cardiotoxicity of Breast Cancer Patients

Cardiotoxicity induced by the breast cancer treatments (i.e., chemotherapy, targeted therapy and radiation therapy) is a significant problem for breast cancer patients. The cardiotoxicity risk for breast cancer patients receiving different treatments remains unclear. We developed and evaluated risk predictive models for cardiotoxicity in breast cancer patients using EHR data. The AUC scores to predict the CHF, CAD, CM and MI are 0.846, 0.857, 0.858 and 0.804 respectively. After adjusting for baseline differences in cardiovascular health, patients who received chemotherapy or targeted therapy appeared to have higher risk of cardiotoxicity than patients who received radiation therapy. Due to differences in baseline cardiac health across the different breast cancer treatment groups, caution is recommended in interpreting the cardiotoxic effect of these treatments.

preprint2022arXiv

Quantum Support Vector Machine without Iteration

Quantum algorithms can enhance machine learning in different aspects. In 2014, Rebentrost et al. constructed a least squares quantum support vector machine (LS-QSVM), in which the Swap Test plays a crucial role in realizing the classification. However, as the output states of a previous test cannot be reused for a new test in the Swap Test, the quantum algorithm LS-QSVM has to be repeated in preparing qubits, manipulating operations, and carrying out the measurement. This paper proposes a QSVM based on the generalized quantum amplitude estimation (AE-QSVM) which gets rid of the constraint of repetitive processes and saves the quantum resources. First, AE-QSVM is trained by using the quantum singular value decomposition. Then, a query sample is classified by using the generalized quantum amplitude estimation in which high accuracy can be achieved by adding auxiliary qubits instead of repeating the algorithm. The complexity of AE-QSVM is reduced to $O(\kappa^{3}\varepsilon^{-3}(\log(mn)+1))$ with an accuracy $\varepsilon$, where $m$ is the number of training vectors, $n$ is the dimension of the feature space, and $\kappa$ is the condition number. Experiments demonstrate that AE-QSVM is advantageous in terms of training matrix, the number of iterations, space complexity, and time complexity.
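
To make the repetition cost of the Swap Test concrete, here is a minimal classical sketch. It relies on the standard Swap Test outcome probability, $P(0) = \tfrac{1}{2} + \tfrac{1}{2}|\langle a|b\rangle|^2$ for normalized states, and simulates the repeated measurements with a pseudo-random generator. The function names, the `shots` parameter, and the seeding are assumptions for illustration; this is not the AE-QSVM procedure from the abstract.

```python
import random


def swap_test_p0(a, b):
    """Probability that the Swap Test ancilla reads 0 for normalized states a, b."""
    inner = sum(x.conjugate() * y for x, y in zip(a, b))
    return 0.5 + 0.5 * abs(inner) ** 2


def estimate_overlap(a, b, shots, seed=0):
    """Estimate |<a|b>|^2 from `shots` simulated single-shot measurements.

    Every shot stands for a fresh state preparation, which is the repetition
    that the abstract says LS-QSVM cannot avoid.
    """
    rng = random.Random(seed)
    p0 = swap_test_p0(a, b)
    zeros = sum(rng.random() < p0 for _ in range(shots))
    # Invert P(0) = 1/2 + overlap/2 and clip sampling noise below zero.
    return max(0.0, 2 * zeros / shots - 1)
```

Because each shot yields a single bit, the statistical error of such an estimate shrinks only as one over the square root of the number of shots, which is the overhead that an amplitude-estimation-based readout aims to avoid.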

preprint2022arXiv

ReLoop: A Self-Correction Continual Learning Loop for Recommender Systems

Deep learning-based recommendation has become a widely adopted technique in various online applications. Typically, a deployed model undergoes frequent re-training to capture users' dynamic behaviors from newly collected interaction logs. However, the current model training process only acquires users' feedback as labels, but fails to take into account the errors made in previous recommendations. Inspired by the intuition that humans usually reflect and learn from mistakes, in this paper, we attempt to build a self-correction learning loop (dubbed ReLoop) for recommender systems. In particular, a new customized loss is employed to encourage every new model version to reduce prediction errors over the previous model version during training. Our ReLoop learning framework enables a continual self-correction process in the long run and thus is expected to obtain better performance over existing training strategies. Both offline experiments and an online A/B test have been conducted to validate the effectiveness of ReLoop.

preprint2022arXiv

Self Supervised Lesion Recognition For Breast Ultrasound Diagnosis

Previous deep learning-based Computer Aided Diagnosis (CAD) systems treat multiple views of the same lesion as independent images. Since an ultrasound image only describes a partial 2D projection of a 3D lesion, such a paradigm ignores the semantic relationship between different views of a lesion, which is inconsistent with the traditional diagnosis where sonographers analyze a lesion from at least two views. In this paper, we propose a multi-task framework that complements the Benign/Malignant classification task with lesion recognition (LR), which helps leverage the relationships among multiple views of a single lesion to learn a complete representation of the lesion. To be specific, the LR task employs contrastive learning to encourage a representation that pulls together multiple views of the same lesion and pushes apart those of different lesions. The task therefore facilitates a representation that is not only invariant to the view change of the lesion, but also captures fine-grained features to distinguish between different lesions. Experiments show that the proposed multi-task framework boosts the performance of Benign/Malignant classification as the two sub-tasks complement each other and enhance the learned representation of ultrasound images.

preprint2022arXiv

SepLUT: Separable Image-adaptive Lookup Tables for Real-time Image Enhancement

Image-adaptive lookup tables (LUTs) have achieved great success in real-time image enhancement tasks due to their high efficiency for modeling color transforms. However, they embed the complete transform, including the color component-independent and the component-correlated parts, into only a single type of LUT, either 1D or 3D, in a coupled manner. This scheme raises a dilemma of improving model expressiveness or efficiency due to two factors. On the one hand, the 1D LUTs provide high computational efficiency but lack the critical capability of color component interaction. On the other, the 3D LUTs present enhanced component-correlated transform capability but suffer from heavy memory footprint, high training difficulty, and limited cell utilization. Inspired by the conventional divide-and-conquer practice in the image signal processor, we present SepLUT (separable image-adaptive lookup table) to tackle the above limitations. Specifically, we separate a single color transform into a cascade of component-independent and component-correlated sub-transforms instantiated as 1D and 3D LUTs, respectively. In this way, the capabilities of two sub-transforms can facilitate each other, where the 3D LUT complements the ability to mix up color components, and the 1D LUT redistributes the input colors to increase the cell utilization of the 3D LUT and thus enable the use of a more lightweight 3D LUT. Experiments demonstrate that the proposed method outperforms the current state of the art on photo retouching benchmark datasets and achieves real-time processing on both GPUs and CPUs.

preprint2022arXiv

Simultaneous Transmit Diversity and Passive Beamforming with Large-Scale Intelligent Reflecting Surface: Far-Field or Near-Field?

Intelligent reflecting surface (IRS) has emerged as a cost-effective solution to enhance wireless communication performance via passive signal reflection. Existing works on IRS have mainly focused on investigating IRS's passive beamforming/reflection design to boost the communication rate for users assuming that their channel state information (CSI) is fully or partially known. However, how to exploit IRS to improve the wireless transmission reliability without any CSI, which is typical in high-mobility/delay-sensitive communication scenarios, remains largely open. In this paper, we study a new IRS-aided communication system with the IRS integrated to its aided access point (AP) to achieve both functions of transmit diversity and passive beamforming simultaneously. Specifically, we first show an interesting result that the IRS's passive beamforming gain in any direction is invariant to the common phase-shift applied to all of its reflecting elements. Accordingly, we design the common phase-shift of IRS elements to achieve transmit diversity at the AP side without the need of any CSI of the users. In addition, we propose a practical method for the users to estimate the CSI at the receiver side for information decoding. Meanwhile, we show that the conventional passive beamforming gain of IRS can be retained for the other users with their CSI known at the AP. Furthermore, we derive the asymptotic performance of both IRS-aided transmit diversity and passive beamforming in closed-form, by considering the large-scale IRS with an infinite number of elements. Numerical results validate our analysis and show the performance gains of the proposed IRS-aided simultaneous transmit diversity and passive beamforming scheme over other benchmark schemes.
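
The invariance to a common phase shift stated in this abstract can be checked numerically with a short sketch. It assumes a single-antenna link whose cascaded channel is $\sum_n h_n e^{j(\theta_n+\phi)} g_n$, where $\phi$ is the common phase shift. The channel coefficients, the phase values, and the function names below are hypothetical and chosen only for illustration.

```python
import cmath


def effective_channel(h, g, phases, common=0.0):
    """Cascaded channel sum_n h[n] * exp(j*(phases[n] + common)) * g[n]."""
    return sum(
        hn * cmath.exp(1j * (theta + common)) * gn
        for hn, gn, theta in zip(h, g, phases)
    )


def beamforming_gain(h, g, phases, common=0.0):
    """Power gain |effective channel|^2 for a given reflection pattern."""
    return abs(effective_channel(h, g, phases, common)) ** 2


# Hypothetical 3-element example (values are arbitrary).
h = [0.8 + 0.1j, -0.3 + 0.5j, 0.2 - 0.7j]
g = [0.4 - 0.2j, 0.6 + 0.3j, -0.5 + 0.1j]
phases = [0.3, 1.1, -2.0]

gain_without_shift = beamforming_gain(h, g, phases)
gain_with_shift = beamforming_gain(h, g, phases, common=1.234)
```

The two gains agree up to floating-point rounding because the common term factors out of the sum as $e^{j\phi}$, which has unit magnitude. The effective channel itself is rotated by $\phi$, and that rotation is the degree of freedom left over for a design that does not rely on user CSI.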

preprint2022arXiv

Solving The Long-Tailed Problem via Intra- and Inter-Category Balance

Benchmark datasets for visual recognition assume that data is uniformly distributed, while real-world datasets obey a long-tailed distribution. Current approaches handle the long-tailed problem by transforming the long-tailed dataset to a uniform distribution through re-sampling or re-weighting strategies. These approaches emphasize the tail classes but ignore the hard examples in head classes, which results in performance degradation. In this paper, we propose a novel gradient harmonized mechanism with category-wise adaptive precision to decouple the difficulty and sample size imbalance in the long-tailed problem, which are correspondingly solved via intra- and inter-category balance strategies. Specifically, intra-category balance focuses on the hard examples in each category to optimize the decision boundary, while inter-category balance aims to correct the shift of decision boundary by taking each category as a unit. Extensive experiments demonstrate that the proposed method consistently outperforms other approaches on all the datasets.

preprint2022arXiv

Spatial-temporal Conv-sequence Learning with Accident Encoding for Traffic Flow Prediction

In an intelligent transportation system, the key problem of traffic forecasting is how to extract periodic temporal dependencies and complex spatial correlations. Current state-of-the-art methods for predicting traffic flow are based on graph architectures and sequence learning models, but they do not fully exploit dynamic spatial-temporal information in the traffic system. Specifically, the temporal dependencies in the short range are diluted by recurrent neural networks. Moreover, local spatial information is also ignored by existing sequence models, because their convolution operation uses global average pooling. Besides, accidents may occur during object transition, which will cause congestion in the real world and further decrease prediction accuracy. To overcome these challenges, we propose Spatial-Temporal Conv-sequence Learning (STCL), where a focused temporal block uses unidirectional convolution to capture short-term periodic temporal dependencies effectively, and a spatial-temporal fusion module is responsible for extracting dependencies of interactions and decreasing the feature dimensions. Moreover, as the accident features have an impact on local traffic congestion, we employ position encoding to detect anomalies in complex traffic situations. We have conducted a large number of experiments on real-world tasks and verified the effectiveness of our proposed method.

preprint2022arXiv

SSMI: How to Make Objects of Interest Disappear without Accessing Object Detectors?

Most black-box adversarial attack schemes for object detectors face two shortcomings: requiring access to the target model and generating inefficient adversarial examples (failing to make objects disappear in large numbers). To overcome these shortcomings, we propose a black-box adversarial attack scheme based on semantic segmentation and model inversion (SSMI). We first locate the position of the target object using semantic segmentation techniques. Next, we design a neighborhood background pixel replacement to replace the target region pixels with background pixels to ensure that the pixel modifications are not easily detected by human vision. Finally, we reconstruct a machine-recognizable example and use the mask matrix to select pixels in the reconstructed example to modify the benign image to generate an adversarial example. Detailed experimental results show that SSMI can generate efficient adversarial examples to evade human-eye perception and make objects of interest disappear. More importantly, SSMI outperforms existing attacks of the same kind. The maximum increase in new and disappearing labels is 16%, and the maximum decrease in mAP metrics for object detection is 36%.

preprint2022arXiv

Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents

Text summarization helps readers capture salient information from documents, news, interviews, and meetings. However, most state-of-the-art pretrained language models (LM) are unable to efficiently process long text for many summarization tasks. In this paper, we propose Summ$^N$, a simple, flexible, and effective multi-stage framework for input texts that are longer than the maximum context length of typical pretrained LMs. Summ$^N$ first splits the data samples and generates a coarse summary in multiple stages and then produces the final fine-grained summary based on it. Our framework can process input text of arbitrary length by adjusting the number of stages while keeping the LM input size fixed. Moreover, it can deal with both single-source documents and dialogues, and it can be used on top of different backbone abstractive summarization models. To the best of our knowledge, Summ$^N$ is the first multi-stage split-then-summarize framework for long input summarization. Our experiments demonstrate that Summ$^N$ outperforms previous state-of-the-art methods by improving ROUGE scores on three long meeting summarization datasets AMI, ICSI, and QMSum, two long TV series datasets from SummScreen, and a long document summarization dataset GovReport. Our data and code are available at https://github.com/psunlpgroup/Summ-N.
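The split-then-summarize idea in this abstract can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `summarize` is a hypothetical stand-in for any backbone summarizer, length is counted in whitespace-separated words rather than model tokens, and the per-stage training described in the paper is omitted. `max_stages` is an added safety guard in case the stand-in summarizer does not shorten its input.

```python
from typing import Callable, List, Tuple


def split_into_chunks(words: List[str], max_len: int) -> List[List[str]]:
    """Cut a word list into consecutive chunks of at most max_len words."""
    return [words[i:i + max_len] for i in range(0, len(words), max_len)]


def multi_stage_summarize(
    text: str,
    summarize: Callable[[str], str],  # hypothetical backbone summarizer
    max_len: int,
    max_stages: int = 10,             # guard against non-shrinking summarizers
) -> Tuple[str, int]:
    """Return (final summary, total number of stages including the final one)."""
    words = text.split()
    coarse_stages = 0
    # Coarse stages: keep splitting and summarizing until the text fits
    # into a single fixed-size input.
    while len(words) > max_len and coarse_stages < max_stages:
        chunks = split_into_chunks(words, max_len)
        coarse = [summarize(" ".join(chunk)) for chunk in chunks]
        words = " ".join(coarse).split()
        coarse_stages += 1
    # Fine-grained stage: one last pass over the (now short) text.
    return summarize(" ".join(words)), coarse_stages + 1


if __name__ == "__main__":
    # Toy summarizer for demonstration only: keep the first two words.
    keep_two = lambda s: " ".join(s.split()[:2])
    demo_text = " ".join(f"w{i}" for i in range(16))
    print(multi_stage_summarize(demo_text, keep_two, max_len=4))
```

The input size seen by `summarize` never exceeds `max_len` words during the coarse stages, which mirrors the abstract's point that longer inputs are handled by adding stages rather than by enlarging the model input.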

preprint2022arXiv

Target Sensing with Intelligent Reflecting Surface: Architecture and Performance

Intelligent reflecting surface (IRS) has emerged as a promising technology to reconfigure the radio propagation environment by dynamically controlling wireless signal's amplitude and/or phase via a large number of reflecting elements. In contrast to the vast literature on studying IRS's performance gains in wireless communications, we study in this paper a new application of IRS for sensing/localizing targets in wireless networks. Specifically, we propose a new self-sensing IRS architecture where the IRS controller is capable of transmitting probing signals that are not only directly reflected by the target (referred to as the direct echo link), but also consecutively reflected by the IRS and then the target (referred to as the IRS-reflected echo link). Moreover, dedicated sensors are installed at the IRS for receiving both the direct and IRS-reflected echo signals from the target, such that the IRS can sense the direction of its nearby target by applying a customized multiple signal classification (MUSIC) algorithm. However, since the angle estimation mean square error (MSE) by the MUSIC algorithm is intractable, we propose to optimize the IRS passive reflection for maximizing the average echo signals' total power at the IRS sensors and derive the resultant Cramer-Rao bound (CRB) of the angle estimation MSE. Last, numerical results are presented to show the effectiveness of the proposed new IRS sensing architecture and algorithm, as compared to other benchmark sensing systems/algorithms.

preprint2022arXiv

The extremal process of super-Brownian motion: a probabilistic approach via skeletons

Recently Ren et al. [Stoch. Proc. Appl., 137 (2021)] have proved that the extremal process of the super-Brownian motion converges in distribution in the limit of large times. Their techniques rely heavily on the study of the convergence of solutions to the Kolmogorov-Petrovsky-Piscounov equation along the lines of [M. Bramson, Mem. Amer. Math. Soc., 44 (1983)]. In this paper we take a different approach. Our approach is based on the skeleton decomposition of super-Brownian motion. The skeleton may be interpreted as immortal particles that determine the large time behaviour of the process. We exploit this fact and carry asymptotic properties from the skeleton over to the super-Brownian motion. Some new results concerning the probabilistic representations of the limiting process are obtained, which cannot be directly obtained through the results of [Y.-X. Ren et al., Stoch. Proc. Appl., 137 (2021)]. Apart from the results, our approach offers insights into the driving force behind the limiting process for super-Brownian motions.

preprint2022arXiv

Thermal radiative cooling of carbon cluster cations C$_N^+$, $N = 9, 11, 12, 17-27$

The radiative cooling rates of C$_N^+$ clusters ($N = 9, 11, 12, 17-27$) have been measured in the ultrahigh vacuum of an electrostatic storage ring to values on the order of $10^4$ s$^{-1}$. The rates were measured as a competing channel to unimolecular decay, and the rate constants pertain to the excitation energies where these two channels compete. Such high values can only be explained as photon emission from thermally excited electronic states, a mechanism that has also been seen in polycyclic aromatic hydrocarbon cations. The high rates have a very strong stabilizing effect on the clusters and the underlying mechanism gives a high energy conversion efficiency, with the potential to reach high quantum efficiencies in the emission process. The competing decay of unimolecular fragmentation defines upper limits for photon energies that can be down-converted to lower energy photons. Including previously measured cluster sizes provides the limits for all clusters C$_N^+$, $N=8-27$, of values that vary from 10 to 14.5 eV, with a general increase with size. Clusters absorbing photons of energies below these limits cool down efficiently by emission of photons via electronic transitions and their fragmentation is strongly reduced, increasing their survival in HI regions.

preprint2022arXiv

Transformation between elastic dipoles, quadrupoles, octupoles and hexadecapoles driven by surfactant self-assembly in nematic emulsion

Emulsions comprising isotropic fluid drops within a nematic host are of interest for applications ranging from biodetection to smart windows, which rely on changes of molecular alignment structures around the drops in response to chemical, thermal, electric and other stimuli. We show that absorption or desorption of trace amounts of common surfactants can drive continuous transformations of elastic multipoles induced by the droplets within the uniformly aligned nematic host. Out-of-equilibrium dynamics of director structures emerge from a controlled self-assembly or desorption of different surfactants at the drop-nematic interfaces, with ensuing forward and reverse transformations between elastic dipoles, quadrupoles, octupoles and hexadecapoles. We characterize inter-transformations of droplet-induced surface and bulk defects, probe elastic pair interactions and discuss emergent prospects for fundamental science and applications of the reconfigurable nematic emulsions.

preprint2021arXiv

A General Machine Learning-based Approach for Inverse Design of One-dimensional Photonic Crystals Toward Targeted Visible Light Reflection Spectrum

Data-driven methods have increasingly been applied to the development of optical systems as inexpensive and effective inverse design approaches. Optical properties (e.g., band-gap properties) of photonic crystals (PCs) are closely associated with characteristics of their light reflection spectra. Finding optimal PC constructions (within a pre-specified parameter space) that generate reflection spectra closest to a targeted spectrum is thus an interesting and meaningful inverse design problem, although relevant studies are still limited. Here we report a generally effective machine learning-based inverse design approach for one-dimensional photonic crystals (1DPCs), focusing on visible light spectra which are of high practical relevance. For a given class of 1DPC system, a deep neural network (DNN) in a unified structure is first trained over data from sizeable forward calculations (from layer thicknesses to spectrum). An iterative optimization scheme is then developed based on a coherent integration of DNN backward predictions (from spectrum to layer thicknesses), forward calculations, and Monte Carlo moves. We employ this new approach to four representative 1DPC systems including periodic structures with two-, three-, and four-layer repeating units and a heterostructure. The approach successfully converges to solutions of optimal 1DPC constructions for various targeted spectra regardless of their exact achievability. As two demonstrating examples, inverse designs toward a specially constructed "rectangle-shaped" green-light or red-light reflection spectrum are presented and discussed in detail. Remarkably, the results show that the approach can efficiently find out optimal layer thicknesses even when they are far outside the range covered by the original training data of DNN.

preprint2021arXiv

An Overview of Signal Processing Techniques for RIS/IRS-aided Wireless Systems

In past as well as present wireless communication systems, the wireless propagation environment is regarded as an uncontrollable black box that impairs the received signal quality, and its negative impacts are compensated for by relying on the design of various sophisticated transmission/reception schemes. However, the improvements through applying such schemes operating at two endpoints (i.e., transmitter and receiver) only are limited even after five generations of wireless systems. Reconfigurable intelligent surface (RIS) or intelligent reflecting surface (IRS) has emerged as a new and revolutionary technology that can configure the wireless environment in a favorable manner by properly tuning the phase shifts of a large number of quasi-passive and low-cost reflecting elements, thus standing out as a promising candidate technology for the next-/sixth-generation (6G) wireless system. However, to reap the performance benefits promised by RIS/IRS, efficient signal processing techniques are crucial, for a variety of purposes such as channel estimation, transmission design, radio localization, and so on. In this paper, we provide a comprehensive overview of recent advances on RIS/IRS-aided wireless systems from the signal processing perspective. We also highlight promising research directions that are worthy of investigation in the future.

preprint2021arXiv

Cooperative Multi-Beam Routing for Multi-IRS Aided Massive MIMO

Intelligent reflecting surface (IRS) is envisioned to play a significant role in future wireless communication systems thanks to its powerful capability of enabling smart and reconfigurable radio environment. In this paper, we study the multi-IRS aided downlink communication in a massive multiple-input multiple-output (MIMO) system, where a multi-antenna BS simultaneously serves multiple remote single-antenna users with orthogonal beams reflected by multiple IRSs. By exploiting the line-of-sight (LoS) link between each pair of selected IRSs, a multi-hop cascaded LoS link can be established between the BS and each user via their cooperative beam routing. Under this setup, we optimize the selected IRSs and their beam routing path for each user, along with the BS/IRS active/passive beamforming such that the minimum received signal power among all users is maximized, subject to a new multi-beam routing path separation constraint for avoiding the inter-user/route interference. To tackle this problem, we first derive the optimal BS/IRS active/passive beamforming in closed-form for any given beam routes and show the beam routing optimization is NP-complete by recasting it as an equivalent graph-optimization problem. To solve this challenging problem, we then propose an efficient recursive algorithm to partially enumerate the feasible solutions, which effectively balances the performance-complexity trade-off by tuning its design parameter. Numerical results demonstrate that the proposed algorithm can achieve near-optimal performance with low enumeration complexity and also outperform other benchmark schemes.

preprint2021arXiv

Distributed quantum phase estimation with entangled photons

Distributed quantum metrology can enhance the sensitivity for sensing spatially distributed parameters beyond the classical limits. Here we demonstrate distributed quantum phase estimation with discrete variables to achieve Heisenberg limit phase measurements. Based on parallel entanglement in modes and particles, we demonstrate distributed quantum sensing for both individual phase shifts and an averaged phase shift, with an error reduction up to 1.4 dB and 2.7 dB below the shot-noise limit. Furthermore, we demonstrate a combined strategy with parallel mode entanglement and multiple passes of the phase shifter in each mode. In particular, our experiment uses six entangled photons with each photon passing the phase shifter up to six times, and achieves a total number of photon passes N=21 at an error reduction up to 4.7 dB below the shot-noise limit. Our research provides a faithful verification of the benefit of entanglement and coherence for distributed quantum sensing in general quantum networks.

preprint2021arXiv

Double-IRS Aided MIMO Communication under LoS Channels: Capacity Maximization and Scaling

Intelligent reflecting surface (IRS) is a promising technology to extend wireless signal coverage and support high-performance communication. By intelligently adjusting the reflection coefficients of a large number of passive reflecting elements, the IRS can modify the wireless propagation environment in favour of signal transmission. Different from most of the prior works which did not consider any cooperation between IRSs, in this work we propose and study a cooperative double-IRS aided multiple-input multiple-output (MIMO) communication system under the line-of-sight (LoS) propagation channels. We investigate the capacity maximization problem by jointly optimizing the transmit covariance matrix and the passive beamforming matrices of the two cooperative IRSs. Although the above problem is non-convex and difficult to solve, we transform and simplify the original problem by exploiting a tractable characterization of the LoS channels. Then we develop a novel low-complexity algorithm whose complexity is independent of the number of IRS elements. Moreover, we analyze the capacity scaling orders of the double-IRS aided MIMO system with respect to an asymptotically large number of IRS elements or transmit power, which significantly outperform those of the conventional single-IRS aided MIMO system, thanks to the cooperative passive beamforming gain brought by the double-reflection link and the spatial multiplexing gain harvested from the two single-reflection links. Extensive numerical results are provided to show that by exploiting the LoS channel properties, our proposed algorithm can achieve a desirable performance with low computational time. Also, our capacity scaling analysis is validated, and the double-IRS system is shown to achieve a much higher rate than its single-IRS counterpart as long as the number of IRS elements or the transmit power is not small.

preprint2021arXiv

Drug Repurposing for COVID-19 via Knowledge Graph Completion

Objective: To discover candidate drugs to repurpose for COVID-19 using literature-derived knowledge and knowledge graph completion methods. Methods: We propose a novel, integrative, and neural network-based literature-based discovery (LBD) approach to identify drug candidates from both PubMed and COVID-19-focused research literature. Our approach relies on semantic triples extracted using SemRep (via SemMedDB). We identified an informative subset of semantic triples using filtering rules and an accuracy classifier developed on a BERT variant, and used this subset to construct a knowledge graph. Five SOTA, neural knowledge graph completion algorithms were used to predict drug repurposing candidates. The models were trained and assessed using a time slicing approach and the predicted drugs were compared with a list of drugs reported in the literature and evaluated in clinical trials. These models were complemented by a discovery pattern-based approach. Results: Accuracy classifier based on PubMedBERT achieved the best performance (F1= 0.854) in classifying semantic predications. Among five knowledge graph completion models, TransE outperformed others (MR = 0.923, Hits@1=0.417). Some known drugs linked to COVID-19 in the literature were identified, as well as some candidate drugs that have not yet been studied. Discovery patterns enabled generation of plausible hypotheses regarding the relationships between the candidate drugs and COVID-19. Among them, five highly ranked and novel drugs (paclitaxel, SB 203580, alpha 2-antiplasmin, pyrrolidine dithiocarbamate, and butylated hydroxytoluene) with their mechanistic explanations were further discussed. Conclusion: We show that an LBD approach can be feasible for discovering drug candidates for COVID-19, and for generating mechanistic explanations. Our approach can be generalized to other diseases as well as to other clinical questions.

preprint2021arXiv

Environment-Aware and Training-Free Beam Alignment for mmWave Massive MIMO via Channel Knowledge Map

Millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) communication systems are expected to achieve enormous transmission rates, provided that the transmit and receive beams are properly aligned with the MIMO channel. However, existing beam alignment techniques rely on either channel estimation or beam sweeping, which incur prohibitively high training overhead, especially for future wireless systems with further increased antenna dimensions and more stringent requirements on cost-effective hardware architectures. In this paper, we propose a new beam alignment technique, which is environment-aware and training-free, by utilizing the emerging concept of channel knowledge map (CKM), together with the user location information that is readily available in contemporary wireless systems. CKM is a site-specific database, tagged with the transmitter/receiver locations, which contains useful channel information to facilitate or even obviate real-time channel state information (CSI) acquisition. Two instances of CKM are proposed for beam alignment in mmWave massive MIMO systems, namely channel path map (CPM) and beam index map (BIM). It is shown that compared with existing training-based beam alignment schemes, the proposed CKM-enabled environment-aware beam alignment is able to drastically improve the effective communication rate, even with moderate user location errors, thanks to its significant saving of the prohibitive training overhead.

preprint2021arXiv

Extracting Lifestyle Factors for Alzheimer's Disease from Clinical Notes Using Deep Learning with Weak Supervision

Since no effective therapies exist for Alzheimer's disease (AD), prevention has become more critical through lifestyle factor changes and interventions. Analyzing electronic health records (EHR) of patients with AD can help us better understand lifestyle's effect on AD. However, lifestyle information is typically stored in clinical narratives. Thus, the objective of the study was to demonstrate the feasibility of natural language processing (NLP) models to classify lifestyle factors (e.g., physical activity and excessive diet) from clinical texts. We automatically generated labels for the training data by using a rule-based NLP algorithm. We conducted weak supervision for pre-trained Bidirectional Encoder Representations from Transformers (BERT) models on the weakly labeled training corpus. These models include the BERT base model, PubMedBERT(abstracts + full text), PubMedBERT(only abstracts), Unified Medical Language System (UMLS) BERT, Bio BERT, and Bio-clinical BERT. We performed two case studies, physical activity and excessive diet, in order to validate the effectiveness of BERT models in classifying lifestyle factors for AD. These models were compared on the developed Gold Standard Corpus (GSC) on the two case studies. The PubMedBERT(only abstracts) model achieved the best performance for physical activity, with precision, recall, and F-1 scores of 0.96, 0.96, and 0.96, respectively. Regarding classifying excessive diet, the Bio BERT model showed the highest performance with perfect precision, recall, and F-1 scores. The proposed approach leveraging weak supervision could significantly increase the sample size, which is required for training the deep learning models. The study also demonstrates the effectiveness of BERT models for extracting lifestyle factors for Alzheimer's disease from clinical notes.

preprint2021arXiv

HexCNN: A Framework for Native Hexagonal Convolutional Neural Networks

Hexagonal CNN models have shown superior performance in applications such as IACT data analysis and aerial scene classification due to their better rotation symmetry and reduced anisotropy. In order to realize hexagonal processing, existing studies mainly use the ZeroOut method to imitate hexagonal processing, which causes substantial memory and computation overheads. We address this deficiency with a novel native hexagonal CNN framework named HexCNN. HexCNN takes hexagon-shaped input and performs forward and backward propagation on the original form of the input based on hexagon-shaped filters, hence avoiding computation and memory overheads caused by imitation. For applications that have rectangle-shaped input but require hexagonal processing, HexCNN can be applied by padding the input into a hexagon shape as preprocessing. In this case, we show that the time and space efficiency of HexCNN still outperforms existing hexagonal CNN methods substantially. Experimental results show that compared with the state-of-the-art models, which imitate hexagonal processing using rectangle-shaped filters, HexCNN reduces the training time by up to 42.2%. Meanwhile, HexCNN saves the memory space cost by up to 25% and 41.7% for loading the input and performing convolution, respectively.

preprint2021arXiv

INSQ: An Influential Neighbor Set Based Moving kNN Query Processing System

We revisit the moving k nearest neighbor (MkNN) query, which computes one's k nearest neighbor set and maintains it while on the move. Existing MkNN algorithms are mostly safe region based, which lack efficiency due to either computing small safe regions with a high recomputation frequency or computing larger safe regions but with a high cost for each computation. In this demonstration, we showcase a system named INSQ that adopts a novel algorithm called the Influential Neighbor Set (INS) algorithm to process the MkNN query in both two-dimensional Euclidean space and road networks. This algorithm uses a small set of safe guarding objects instead of safe regions. As long as the current k nearest neighbors are closer to the query object than the safe guarding objects are, the current k nearest neighbors stay valid and no recomputation is required. Meanwhile, the region defined by the safe guarding objects is the largest possible safe region. This means that the recomputation frequency is also minimized and hence, the INS algorithm achieves high overall query processing efficiency.
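The validity condition stated in this abstract can be sketched for the two-dimensional Euclidean case. This is a minimal sketch, not the INSQ implementation: the function name and point representation are assumptions, and how the safe guarding objects are selected (the core of the INS algorithm) is not covered here.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def knn_still_valid(
    query: Point,
    current_knn: Sequence[Point],
    guards: Sequence[Point],
) -> bool:
    """Check whether the current kNN set can be kept without recomputation.

    The set stays valid while every current neighbor is closer to the
    query position than every safe guarding object.
    """
    if not guards:
        # No competing objects to overtake the current neighbors.
        return True
    farthest_neighbor = max(math.dist(query, p) for p in current_knn)
    nearest_guard = min(math.dist(query, g) for g in guards)
    return farthest_neighbor < nearest_guard


if __name__ == "__main__":
    neighbors = [(1.0, 0.0), (0.0, 2.0)]
    guarding = [(3.0, 0.0), (0.0, 4.0)]
    print(knn_still_valid((0.0, 0.0), neighbors, guarding))  # query at origin
    print(knn_still_valid((2.5, 0.0), neighbors, guarding))  # query has moved
```

Each check costs one distance computation per neighbor and per guarding object, so it can be repeated at every position update; a full kNN recomputation is needed only when the check fails.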

preprint2021arXiv

Mechanical Properties of Atomically Thin Tungsten Dichalcogenides: WS$_2$, WSe$_2$ and WTe$_2$

Two-dimensional (2D) tungsten disulfide (WS$_2$), tungsten diselenide (WSe$_2$), and tungsten ditelluride (WTe$_2$) draw increasing attention due to their attractive properties deriving from the heavy tungsten and chalcogenide atoms, but their mechanical properties are still mostly unknown. Here, we determine the intrinsic and air-aged mechanical properties of mono-, bi-, and trilayer (1-3L) WS$_2$, WSe$_2$ and WTe$_2$ using a complementary suite of experiments and theoretical calculations. High-quality 1L WS$_2$ has the highest Young's modulus (302.4 ± 24.1 GPa) and strength (47.0 ± 8.6 GPa) of the entire family, surpassing those of 1L WSe$_2$ (258.6 ± 38.3 and 38.0 ± 6.0 GPa, respectively) and WTe$_2$ (149.1 ± 9.4 and 6.4 ± 3.3 GPa, respectively). However, the elasticity and strength of WS$_2$ decrease most dramatically with increased thickness among the three materials. We interpret the phenomenon by the different tendencies for interlayer sliding in equilibrium state and under in-plane strain and out-of-plane compression conditions in the indentation process, revealed by finite element method (FEM) and density functional theory (DFT) calculations including van der Waals (vdW) interactions. We also demonstrate that the mechanical properties of the high-quality 1-3L WS$_2$ and WSe$_2$ are largely stable in the air for up to 20 weeks. Intriguingly, the 1-3L WSe$_2$ shows increased modulus and strength values with aging in the air. This is ascribed to oxygen doping, which reinforces the structure. The present study will facilitate the design and use of 2D tungsten dichalcogenides in applications, such as strain engineering and flexible field-effect transistors (FETs).

preprint2021arXiv

Online detection of cascading change-points

We propose an online detection procedure for cascading failures in a network from sequential data, which can be modeled as multiple correlated change-points happening during a short period. We consider a temporal diffusion network model to capture the temporal dynamic structure of multiple change-points and develop a sequential Shewhart procedure that uses generalized likelihood ratio statistics derived from the diffusion network model, assuming unknown post-change distribution parameters. We also tackle the computational complexity posed by the unknown propagation. Numerical experiments demonstrate good performance in detecting cascading failures.

preprint2021arXiv

TEMImageNet Training Library and AtomSegNet Deep-Learning Models for High-Precision Atom Segmentation, Localization, Denoising, and Super-Resolution Processing of Atomic-Resolution Images

Atom segmentation and localization, noise reduction and deblurring of atomic-resolution scanning transmission electron microscopy (STEM) images with high precision and robustness is a challenging task. Although several conventional algorithms, such as thresholding, edge detection and clustering, can achieve reasonable performance in some predefined scenarios, they tend to fail when interferences from the background are strong and unpredictable. Particularly, for atomic-resolution STEM images, so far there is no well-established algorithm that is robust enough to segment or detect all atomic columns when there is large thickness variation in a recorded image. Herein, we report the development of a training library and a deep learning method that can perform robust and precise atom segmentation, localization, denoising, and super-resolution processing of experimental images. Despite using simulated images as training datasets, the deep-learning model can self-adapt to experimental STEM images and shows outstanding performance in atom detection and localization in challenging contrast conditions, and the precision consistently outperforms the state-of-the-art two-dimensional Gaussian fit method. Taking a step further, we have deployed our deep-learning models to a desktop app with a graphical user interface and the app is free and open-source. We have also built a TEM ImageNet project website for easy browsing and downloading of the training data.

preprint2021arXiv

The evolution of network controllability in growing networks

The study of network structural controllability focuses on the minimum number of driver nodes needed to control a whole network. Despite intensive studies on this topic, most of them consider static networks only. It is well-known, however, that real networks are growing, with new nodes and links added to the system. Here, we analyze controllability of evolving networks and propose a general rule for the change of driver nodes. We further apply the rule to solve the problem of network augmentation subject to the controllability constraint. The findings fill a gap in our understanding of network controllability and shed light on controllability of real systems.
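The abstract does not say how the minimum number of driver nodes is computed. A standard formulation from the structural controllability literature, used here as an assumption rather than as this paper's method, counts them as max(N - |M|, 1), where |M| is the size of a maximum matching over the directed links. The sketch below uses a simple augmenting-path matching that is only suitable for small illustrative networks; the function name and edge-list input are hypothetical.

```python
from typing import List, Set, Tuple


def min_driver_nodes(n: int, edges: List[Tuple[int, int]]) -> int:
    """Minimum number of driver nodes of a directed network with nodes 0..n-1.

    Assumes the maximum-matching formulation: a node is matched if a
    matching link points to it, and every unmatched node needs its own
    input signal (with at least one driver node overall).
    """
    successors: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)

    # matched_by[v] = source node whose outgoing link is matched to v.
    matched_by = [-1] * n

    def try_match(u: int, seen: Set[int]) -> bool:
        """Try to find an augmenting path starting at source node u."""
        for v in successors[u]:
            if v in seen:
                continue
            seen.add(v)
            if matched_by[v] == -1 or try_match(matched_by[v], seen):
                matched_by[v] = u
                return True
        return False

    matching_size = sum(1 for u in range(n) if try_match(u, set()))
    return max(n - matching_size, 1)


if __name__ == "__main__":
    chain = [(0, 1), (1, 2), (2, 3)]  # directed path
    star = [(0, 1), (0, 2), (0, 3)]   # one hub pointing to three leaves
    print(min_driver_nodes(4, chain), min_driver_nodes(4, star))
```

Re-running such a count after each added node or link is the brute-force baseline against which a rule for the change of driver nodes in a growing network would save work.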

preprint2021arXiv

The Labeled Multiple Canonical Correlation Analysis for Information Fusion

The objective of multimodal information fusion is to mathematically analyze information carried in different sources and create a new representation which will be more effectively utilized in pattern recognition and other multimedia information processing tasks. In this paper, we introduce a new method for multimodal information fusion and representation based on the Labeled Multiple Canonical Correlation Analysis (LMCCA). By incorporating class label information of the training samples, the proposed LMCCA ensures that the fused features carry discriminative characteristics of the multimodal information representations, and are capable of providing superior recognition performance. We implement a prototype of LMCCA to demonstrate its effectiveness on handwritten digit recognition, face recognition and object recognition utilizing multiple features, and bimodal human emotion recognition involving information from both audio and visual domains. The generic nature of LMCCA allows it to take as input features extracted by any means, including those by deep learning (DL) methods. Experimental results show that the proposed method enhances the performance of both statistical machine learning (SML) methods and methods based on DL.

preprint2021arXiv

Transforming Fading Channel from Fast to Slow: IRS-Assisted High-Mobility Communication

In this paper, we study a new intelligent refracting surface (IRS)-assisted high-mobility communication with the IRS deployed in a high-speed moving vehicle to assist its passenger's communication with a static base station (BS) on the roadside. The vehicle's high Doppler frequency results in a fast fading channel between the BS and the passenger/user, which renders channel estimation for the IRS with a large number of refracting elements a more challenging task as compared to the conventional case with low-mobility users only. In order to mitigate the Doppler effect and reap the full IRS passive beamforming gain with low training overhead, we propose a new and efficient transmission protocol to execute channel estimation and IRS refraction design for data transmission. Specifically, by exploiting the quasi-static channel between the IRS and user both moving at the same high speed, we first estimate the cascaded BS-IRS-user channel with the Doppler effect compensated. Then, we estimate the instantaneous BS-user fast fading channel (without IRS refraction) and tune the IRS refraction over time accordingly to align the cascaded channel with the BS-user direct channel, thus maximizing the IRS's passive beamforming gain as well as converting their combined channel from fast to slow fading. Simulation results show the effectiveness of the proposed channel estimation scheme and passive beamforming design as compared to various benchmark schemes.

preprint2021arXiv

UAV-Enabled Wireless Power Transfer: A Tutorial Overview

Unmanned aerial vehicle (UAV)-enabled wireless power transfer (WPT) has recently emerged as a promising technique to provide sustainable energy supply for widely distributed low-power ground devices (GDs) in large-scale wireless networks. Compared with the energy transmitters (ETs) in conventional WPT systems which are deployed at fixed locations, UAV-mounted aerial ETs can fly flexibly in the three-dimensional (3D) space to charge nearby GDs more efficiently. This paper provides a tutorial overview on UAV-enabled WPT and its appealing applications, in particular focusing on how to exploit UAVs' controllable mobility via their 3D trajectory design to maximize the amounts of energy transferred to all GDs in a wireless network with fairness. First, we consider the single-UAV-enabled WPT scenario with one UAV wirelessly charging multiple GDs at known locations. To solve the energy maximization problem in this case, we present a general trajectory design framework consisting of three innovative approaches to optimize the UAV trajectory, which are multi-location hovering, successive-hover-and-fly, and time-quantization-based optimization, respectively. Next, we consider the multi-UAV-enabled WPT scenario where multiple UAVs cooperatively charge many GDs in a large area. Building upon the single-UAV trajectory design, we propose two efficient schemes to jointly optimize multiple UAVs' trajectories, based on the principles of UAV swarming and GD clustering, respectively. Furthermore, we consider two important extensions of UAV-enabled WPT, namely UAV-enabled wireless powered communication networks (WPCN) and UAV-enabled wireless powered mobile edge computing (MEC).

preprint2021arXiv

Vector Measurements Using All Optical Scalar Atomic Magnetometers

Vector field measurement is demonstrated with an all-optical scalar atomic magnetometer using intrinsic parameters related to its scalar operation. The Bell-Bloom type atomic magnetometer measures the Larmor precession of cesium atoms through on-resonant absorption of a probe beam. While the AC component of the probe signal is used for the field magnitude measurement, the probe DC signal contains information about the polar angle, defined as the angle between the magnetic field and the probe beam. Additional polar angle information is obtained from the light-shift-induced magnetic field caused by the frequency modulation of the probe beam. With a measurement time of 100 milliseconds, better than 0.02 degree sensitivity has been achieved using a commercial miniaturized sensor at the optimal sensor orientation. The angle measurement accuracy is checked against an optical encoder over the entire polar angle range of 0 to 180 degrees. Better than 1 degree error is recorded over most set polar angles. Azimuthal angle measurement is also exhibited with two orthogonally oriented sensors.

preprint2020arXiv

A heterogeneous branch and multi-level classification network for person re-identification

Convolutional neural networks with multiple branches have recently proved highly effective in person re-identification (re-ID). Researchers design multi-branch networks using part models, yet they always attribute the effectiveness to multiple parts. In addition, existing multi-branch networks always have isomorphic branches, which lack structural diversity. To address this problem, we propose a novel Heterogeneous Branch and Multi-level Classification Network (HBMCN), which is designed based on the pre-trained ResNet-50 model. A new heterogeneous branch, SE-Res-Branch, is proposed based on the SE-Res module, which consists of the Squeeze-and-Excitation block and the residual block. Furthermore, a new multi-level classification joint objective function is proposed for the supervised learning of HBMCN, whereby multi-level features are extracted from multiple high-level layers and concatenated to represent a person. Based on three public person re-ID benchmarks (Market1501, DukeMTMC-reID and CUHK03), experimental results show that the proposed HBMCN reaches 94.4%, 85.7% and 73.8% in Rank-1, and 85.7%, 74.6% and 69.0% in mAP, achieving state-of-the-art performance. Further analysis demonstrates that the specially designed heterogeneous branch performs better than an isomorphic branch, and multi-level classification provides more discriminative features compared to single-level classification. As a result, HBMCN provides substantial further improvements in person re-ID tasks.

preprint2020arXiv

A Low-Complexity Beamforming Design for Multiuser Wireless Energy Transfer

Wireless energy transfer (WET) is a green enabler of low-power Internet of Things (IoT). Therein, traditional optimization schemes relying on full channel state information (CSI) are often too costly to implement due to excessive energy consumption and high processing complexity. This letter proposes a simple, yet effective, energy beamforming scheme that allows a multi-antenna power beacon (PB) to fairly power a set of IoT devices by only relying on the first-order statistics of the channels. In addition to low complexity, the proposed scheme performs favorably as compared to benchmarking schemes and its performance improves as the number of PB's antennas increases. Finally, it is shown that further performance improvement can be achieved through proper angular rotations of the PB.

preprint2020arXiv

A New Observable for Measuring CP Property of Top-Higgs Interaction

We propose a new dihedral angle observable to measure the CP property of the interaction of top quark and Higgs boson in the $t\bar{t}H$ production at the 14~TeV LHC. We consider two decay modes of the Higgs boson, $H\to b\bar{b}$ and $H\to γγ$ and show that the dihedral angle distribution is able to distinguish the CP-even and the CP-odd hypothesis at 95\% confidence level with an integrated luminosity of $\sim 180~{\rm fb}^{-1}$.

preprint2020arXiv

A New Store-then-Amplify-and-Forward Protocol for UAV Mobile Relaying

In this letter, we consider the use of an unmanned aerial vehicle (UAV) as a mobile relay to assist the communication between two ground users without a direct link. We propose a novel store-then-amplify-and-forward (SAF) relaying protocol for the UAV to exploit its mobility jointly with the low-complexity AF relaying. Specifically, the received signal from the source is first stored in a buffer at the UAV, then amplified and forwarded to the destination when the UAV flies closer to the destination. With this new SAF protocol, we aim to maximize the throughput of the UAV-enabled relaying system by jointly optimizing the source/UAV transmit power and the UAV trajectory, as well as the time-slot pairing for each data packet received and forwarded by the UAV. As this problem is a non-convex mixed-integer optimization problem that is difficult to solve, we propose an efficient algorithm for obtaining a suboptimal solution to it by applying the Hungarian algorithm, alternating optimization and successive convex approximation. Numerical results show that the proposed mobile SAF relaying outperforms the conventional AF relaying without signal storing.

preprint2020arXiv

Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers

Recently emerged quantization techniques have been applied to the inference of deep neural networks for fast and efficient execution. However, directly applying quantization in training can cause significant accuracy loss, and thus remains an open challenge.

preprint2020arXiv

Anchor-Assisted Intelligent Reflecting Surface Channel Estimation for Multiuser Communications

Due to the passive nature of Intelligent Reflecting Surface (IRS), channel estimation is a fundamental challenge in IRS-aided wireless networks. Particularly, as the number of IRS reflecting elements and/or that of IRS-served users increase, the channel training overhead becomes excessively high. To tackle this challenge, we propose in this paper a new anchor-assisted two-phase channel estimation scheme, where two anchor nodes, namely A1 and A2, are deployed near the IRS for helping the base station (BS) to acquire the cascaded BS-IRS-user channels. Specifically, in the first phase, the partial channel state information (CSI), i.e., the element-wise channel gain square, of the BS-IRS link is obtained by estimating the BS-IRS-A1/A2 channels and the A1-IRS-A2 channel, separately. Then, in the second phase, by leveraging such partial knowledge of the BS-IRS channel that is common to all users, the individual cascaded BS-IRS-user channels are efficiently estimated. Simulation results demonstrate that the proposed anchor-assisted channel estimation scheme is able to achieve comparable mean-squared error (MSE) performance as compared to the conventional scheme, but with significantly reduced channel training time.

preprint2020arXiv

Beamforming Optimization for Wireless Network Aided by Intelligent Reflecting Surface with Discrete Phase Shifts

Intelligent reflecting surface (IRS) is a cost-effective solution for achieving high spectrum and energy efficiency in future wireless networks by leveraging massive low-cost passive elements that are able to reflect the signals with adjustable phase shifts. Prior works on IRS mainly consider continuous phase shifts at reflecting elements, which are practically difficult to implement due to the hardware limitation. In contrast, we study in this paper an IRS-aided wireless network, where an IRS with only a finite number of phase shifts at each element is deployed to assist in the communication from a multi-antenna access point (AP) to multiple single-antenna users. We aim to minimize the transmit power at the AP by jointly optimizing the continuous transmit precoding at the AP and the discrete reflect phase shifts at the IRS, subject to a given set of minimum signal-to-interference-plus-noise ratio (SINR) constraints at the user receivers. The considered problem is shown to be a mixed-integer non-linear program (MINLP) and thus is difficult to solve in general. To tackle this problem, we first study the single-user case with one user assisted by the IRS and propose both optimal and suboptimal algorithms for solving it. Besides, we analytically show that as compared to the ideal case with continuous phase shifts, the IRS with discrete phase shifts achieves the same squared power gain in terms of asymptotically large number of reflecting elements, while a constant proportional power loss is incurred that depends only on the number of phase-shift levels. The proposed designs for the single-user case are also extended to the general setup with multiple users among which some are aided by the IRS. Simulation results verify our performance analysis as well as the effectiveness of our proposed designs as compared to various benchmark schemes.
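
The constant power loss from discrete phase shifts mentioned in this abstract can be reproduced numerically in a simplified setting. The sketch below assumes a single user, unit-modulus cascaded channel coefficients with uniformly random phases, and nearest-level quantization of the ideal phase shift; these are assumptions for illustration, not the paper's system model. Under them, the quantization error is uniform over one quantization step, which gives the closed-form ratio computed in `asymptotic_ratio` (a hypothetical helper name).

```python
import cmath
import math
import random


def quantize_phase(theta, bits):
    """Round a phase to the nearest of 2**bits uniformly spaced levels in [0, 2*pi)."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    return (round(theta / step) % levels) * step


def power_ratio(num_elements, bits, rng):
    """Received power with quantized phase shifts, relative to ideal continuous shifts."""
    # Assumed channel: unit-modulus coefficients with uniformly random phases.
    channel_phases = [rng.uniform(0, 2 * math.pi) for _ in range(num_elements)]
    # The ideal shift cancels each channel phase; the discrete IRS uses the nearest level.
    total = sum(
        cmath.exp(1j * (phi + quantize_phase(-phi % (2 * math.pi), bits)))
        for phi in channel_phases
    )
    # With continuous shifts every term is co-phased, so the ideal power is N**2.
    return abs(total) ** 2 / num_elements ** 2


def asymptotic_ratio(bits):
    """Large-N limit under the assumptions above: (2^b / pi * sin(pi / 2^b))^2."""
    levels = 2 ** bits
    return (levels / math.pi * math.sin(math.pi / levels)) ** 2


rng = random.Random(0)
for b in (1, 2, 3):
    print(b, round(power_ratio(20000, b, rng), 3), round(asymptotic_ratio(b), 3))
```

The ratio depends only on the number of bits and not on the number of elements, so the received power still grows with the square of the number of elements, which matches the qualitative statement in the abstract.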

preprint2020arXiv

BrePartition: Optimized High-Dimensional kNN Search with Bregman Distances

Bregman distances (also known as Bregman divergences) are widely used in machine learning, speech recognition and signal processing, and kNN searches with Bregman distances have become increasingly important with the rapid advances of multimedia applications. Data in multimedia applications such as images and videos are commonly transformed into space of hundreds of dimensions. Such high-dimensional space has posed significant challenges for existing kNN search algorithms with Bregman distances, which could only handle data of medium dimensionality (typically less than 100). This paper addresses the urgent problem of high-dimensional kNN search with Bregman distances. We propose a novel partition-filter-refinement framework. Specifically, we propose an optimized dimensionality partitioning scheme to solve several non-trivial issues. First, an effective bound from each partitioned subspace to obtain exact kNN results is derived. Second, we conduct an in-depth analysis of the optimized number of partitions and devise an effective strategy for partitioning. Third, we design an efficient integrated index structure for all the subspaces together to accelerate the search processing. Moreover, we extend our exact solution to an approximate version by a trade-off between the accuracy and efficiency. Experimental results on four real-world datasets and two synthetic datasets show the clear advantage of our method in comparison to state-of-the-art algorithms.
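
For readers unfamiliar with Bregman distances, the sketch below shows the standard definition D_f(x, y) = f(x) - f(y) - <grad f(y), x - y> with two common generators, plus a brute-force kNN baseline. It does not implement the partition-filter-refinement framework of the paper. The helper names are hypothetical, and the direction of the distance used for ranking (data point first, query second) is an assumption.

```python
import math


def bregman(f, grad_f, x, y):
    """D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>."""
    return f(x) - f(y) - sum(g * (a - b) for g, a, b in zip(grad_f(y), x, y))


# Generator f(x) = sum x_i^2 yields the squared Euclidean distance.
def sq_norm(x):
    return sum(v * v for v in x)


def sq_norm_grad(x):
    return [2 * v for v in x]


# Generator f(x) = sum x_i log x_i yields the generalized KL divergence (needs x_i > 0).
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)


def neg_entropy_grad(x):
    return [math.log(v) + 1 for v in x]


def knn(query, data, k, f, grad_f):
    """Brute-force kNN: indices of the k points with the smallest D_f(point, query)."""
    order = sorted(range(len(data)), key=lambda i: bregman(f, grad_f, data[i], query))
    return order[:k]


x, y = [0.2, 0.3, 0.1, 0.4], [0.25, 0.25, 0.25, 0.25]
full = bregman(neg_entropy, neg_entropy_grad, x, y)
# For separable generators the distance is a sum over any partition of the dimensions.
parts = (bregman(neg_entropy, neg_entropy_grad, x[:2], y[:2])
         + bregman(neg_entropy, neg_entropy_grad, x[2:], y[2:]))
print(abs(full - parts) < 1e-12)  # True
```

The additivity over dimension subsets shown at the end holds for any generator that is a sum of per-coordinate terms, which is one property a dimensionality-partitioning scheme can build on.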

preprint2020arXiv

Channel Estimation and Passive Beamforming for Intelligent Reflecting Surface: Discrete Phase Shift and Progressive Refinement

Prior studies on Intelligent Reflecting Surface (IRS) have mostly assumed perfect channel state information (CSI) available for designing the IRS passive beamforming as well as the continuously adjustable phase shift at each of its reflecting elements, which, however, have simplified two challenging issues for implementing IRS in practice, namely, its channel estimation and passive beamforming designs both under the constraint of discrete phase shifts. To address them, we consider in this paper an IRS-aided single-user communication system with discrete phase shifts and design the IRS training reflection matrix for channel estimation as well as the passive beamforming for data transmission, both subject to the constraint of discrete phase shifts. We show that the training reflection matrix design for discrete phase shifts greatly differs from that for continuous phase shifts, and thus the corresponding passive beamforming should be optimized by taking into account the correlated channel estimation error due to discrete phase shifts. Specifically, we consider a practical block-based transmission, where each block has a finite (insufficient) number of training symbols for channel estimation. A novel hierarchical training reflection design is proposed to progressively estimate IRS elements' channels over multiple blocks by exploiting IRS-elements grouping and partition. Based on the resolved IRS channels in each block, we further design the progressive passive beamforming at the IRS with discrete phase shifts to improve the achievable rate for data transmission over the blocks.

preprint2020arXiv

Cooperative Double-IRS Aided Communication: Beamforming Design and Power Scaling

Intelligent reflecting surface (IRS) is a promising technology to support high performance wireless communication. By adaptively configuring the reflection amplitude and/or phase of each passive reflecting element on it, the IRS can reshape the electromagnetic environment in favour of signal transmission. This letter advances the existing research by proposing and analyzing a double-IRS aided wireless communication system. Under the reasonable assumption that the reflection channel from IRS 1 to IRS 2 is of rank 1 (e.g., line-of-sight channel), we propose a joint passive beamforming design for the two IRSs. Based on this, we show that deploying two cooperative IRSs with in total K elements can yield a power gain of order O(K^4), which greatly outperforms the case of deploying one traditional IRS with a power gain of order O(K^2). Our simulation results validate that the performance of deploying two cooperative IRSs is significantly better than that of deploying one IRS given a sufficient total number of IRS elements. We also extend our line-of-sight channel model to show how different channel models affect the performance of the double-IRS aided wireless communication system.
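
The power-scaling claim in this abstract can be checked with a short calculation. The sketch below assumes unit-modulus channel entries, K/2 elements on each IRS, and only the double-reflection link; the symbols $\mathbf{h}_1$, $\mathbf{h}_2$, $\mathbf{a}$, $\mathbf{b}$, $\boldsymbol{\Theta}_1$ and $\boldsymbol{\Theta}_2$ are notation introduced here and are not taken from the paper.

```latex
% Rank-1 inter-IRS channel S = a b^H; IRS 1 and IRS 2 each hold K/2 elements.
% Theta_1, Theta_2 are diagonal unit-modulus phase-shift matrices.
\begin{align}
  g &= \mathbf{h}_2^{H}\boldsymbol{\Theta}_2\,\mathbf{a}\,\mathbf{b}^{H}\boldsymbol{\Theta}_1\mathbf{h}_1
     = \bigl(\mathbf{h}_2^{H}\boldsymbol{\Theta}_2\mathbf{a}\bigr)
       \bigl(\mathbf{b}^{H}\boldsymbol{\Theta}_1\mathbf{h}_1\bigr), \\
  \max_{\boldsymbol{\Theta}_1,\boldsymbol{\Theta}_2} |g|^{2}
    &= \Bigl(\frac{K}{2}\Bigr)^{2}\Bigl(\frac{K}{2}\Bigr)^{2} = \frac{K^{4}}{16},
  \qquad
  \text{single IRS with } K \text{ elements: } \max |g|^{2} = K^{2}.
\end{align}
```

Because the rank-1 channel lets the end-to-end gain factor into two sums of K/2 unit-modulus terms, each IRS can co-phase its own sum independently. Under these assumptions the double-IRS gain exceeds the single-IRS gain once K^4/16 > K^2, that is, for K > 4, which is consistent with the abstract's caveat about a sufficient total number of elements.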

preprint2020arXiv

Cooperative NOMA for Downlink Asymmetric Interference Cancellation

This letter advances the non-orthogonal multiple access (NOMA) technique for cellular downlink co-channel interference mitigation, via exploiting the (limited) cooperation among base stations (BSs). Specifically, we consider a simplified but practically relevant scenario of two co-channel cells with asymmetric interference, i.e., only the user in one cell receives the strong interference from the BS in the other cell. To mitigate such interference, we propose a new cooperative NOMA scheme, where the interfered user's serving BS sends a superposed signal comprising both the desired message and the co-channel user's message (shared by the interfering BS). The co-channel user's signal is aimed to add constructively with the interfering BS's signal at the interfered user's receiver so that the combined interference with enhanced power can be effectively decoded and cancelled. This thus leads to a new problem on how to optimally allocate the transmit power for the two superposed messages. We provide the closed-form solution to this problem and investigate the conditions under which the performance of the proposed scheme is superior over the existing schemes.

preprint2020arXiv

Counting Classical Nodes in Quantum Networks

Quantum networks illustrate the use of connected nodes of quantum systems as the backbone of distributed quantum information processing. When the network nodes are entangled in graph states, such a quantum platform is indispensable to almost all the existing distributed quantum tasks. Unfortunately, real networks unavoidably suffer from noise and technical restrictions, making nodes transit from quantum to classical at worst. Here, we introduce a figure of merit in terms of the number of classical nodes for quantum networks in arbitrary graph states. Such a network property is revealed by exploiting a novel Einstein-Podolsky-Rosen steerability. Experimentally, we demonstrate photonic quantum networks of $n_q$ quantum nodes and $n_c$ classical nodes with $n_q$ up to 6 and $n_c$ up to 18 using spontaneous parametric down-conversion entanglement sources. We show that the proposed method is faithful in quantifying the classical defects in prepared multiphoton quantum networks. Our results provide novel identification of generic quantum networks and nonclassical correlations in graph states.

preprint2020arXiv

Creation of 2000-atom Greenberger-Horne-Zeilinger states by entanglement amplification

We propose a novel entanglement-creation scheme in a multi-atom ensemble, named entanglement amplification, which converts unentangled states into entangled states and amplifies less-entangled ones to maximally-entangled Greenberger-Horne-Zeilinger (GHZ) states. The scheme starts with a multi-atom ensemble initialized in a coherent spin state. By shifting the energy of a particular Dicke state, we break the Hilbert space of the ensemble into two isolated subspaces to tear the coherent spin state into two components so that entanglement is introduced. After that, we utilize the isolated subspaces to further enhance the entanglement by coherently separating the two components. By single-particle Rabi drivings on atoms in a high-finesse optical cavity illuminated by a single-frequency light, 2000-atom GHZ states can be created with a fidelity above 80% in an experimentally achievable system, making resources of ensembles at Heisenberg limit practically available for quantum metrology.

preprint2020arXiv

Data User-Based Attribute-Based Encryption

Attribute-Based Encryption (ABE) has emerged as an information-centric public-key cryptographic system which allows a data owner to share data, according to access policy, with multiple data users based on the attributes they possess, without knowing their identities. In the original ABE schemes, a central authority administrates the system and issues secret keys to data users based on their attributes and both the owner and users need to trust a specific CA. However, in certain real-world applications, the data users would not trust anyone but themselves. For such situations, we introduce a new decentralization model of ABE, termed Data User-based ABE (DU-ABE), which is managed jointly by the data users. DU-ABE is the first decentralized ABE scheme that replaces the authorities with the data users without employing any other extra entities.

preprint2020arXiv

Differentiating Population Spatial Behavior using Representative Features of Geospatial Mobility (ReFGeM)

Understanding how humans use and consume space by comparing stratified groups, either through observation or controlled study, is key to designing better spaces, cities, and policies. GPS data traces provide detailed movement patterns of individuals but can be difficult to interpret due to the scale and scope of the data collected. For actionable insights, GPS traces are usually reduced to one or more features which express the spatial phenomenon of interest. However, it is not always clear which spatial features should be employed, and substantial effort can be invested into designing features which may or may not provide insight. In this paper we present an alternative approach: a standardized feature set with actionable interpretations that can be efficiently run against many datasets. We show that these features can distinguish between disparate human mobility patterns, although no single feature can distinguish them alone.

preprint2020arXiv

DWM: A Decomposable Winograd Method for Convolution Acceleration

Winograd's minimal filtering algorithm has been widely used in Convolutional Neural Networks (CNNs) to reduce the number of multiplications for faster processing. However, it is only effective on convolutions with a kernel size of 3x3 and a stride of 1, because it suffers from significantly increased FLOPs and numerical accuracy problems for kernel sizes larger than 3x3 and fails on convolutions with a stride larger than 1. In this paper, we propose a novel Decomposable Winograd Method (DWM), which extends the original Winograd minimal filtering algorithm to a wide range of general convolutions. DWM decomposes kernels with a large size or large stride into several small kernels with a stride of 1 to which the Winograd method can then be applied, so that DWM can reduce the number of multiplications while keeping the numerical accuracy. It enables fast exploration of larger kernel sizes and larger stride values in CNNs for high performance and accuracy, and even opens the potential for new CNNs. Compared with the original Winograd algorithm, the proposed DWM is able to support all kinds of convolutions with a speedup of ~2, without affecting the numerical accuracy.
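
To make the idea concrete, the sketch below shows the standard one-dimensional Winograd F(2,3) transform, which produces two outputs of a 3-tap filter with four multiplications instead of six. It then shows one way a 5-tap kernel can be split into 3-tap pieces that reuse it. The splitting shown is a simplified one-dimensional illustration under assumptions made here (zero padding, stride 1, an even number of outputs); it is not the paper's exact procedure, and all function names are hypothetical.

```python
def conv1d(d, g):
    """Direct valid correlation: out[i] = sum_j d[i + j] * g[j]."""
    return [sum(d[i + j] * g[j] for j in range(len(g)))
            for i in range(len(d) - len(g) + 1)]


def winograd_f23(d, g):
    """Two outputs of a 3-tap filter over 4 inputs using 4 data-filter multiplications."""
    # Filter transform; in practice this is precomputed once per kernel.
    u0, u3 = g[0], g[2]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    return [m1 + m2 + m3, m2 - m3 - m4]


def winograd_conv3(d, g):
    """Tile F(2,3) over the input; assumes len(d) - 2 is even."""
    out = []
    for i in range(0, len(d) - 2, 2):
        out.extend(winograd_f23(d[i:i + 4], g))
    return out


def decomposed_conv5(d, g5):
    """5-tap filter as the sum of two 3-tap Winograd passes (second piece zero-padded)."""
    n_out = len(d) - 4
    first = winograd_conv3(d, g5[0:3])[:n_out]
    # Taps 3 and 4 act on the input shifted by 3; pad the kernel and input with one zero.
    second = winograd_conv3(d[3:] + [0.0], g5[3:5] + [0.0])
    return [a + b for a, b in zip(first, second)]


d = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
g5 = [1.0, 0.0, -1.0, 2.0, 3.0]
print(conv1d(d, g5))
print(decomposed_conv5(d, g5))
```

Both calls should print the same four values, since the decomposition only regroups the terms of the original sum.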

preprint2020arXiv

Enabling Panoramic Full-Angle Reflection via Aerial Intelligent Reflecting Surface

This paper proposes a new three-dimensional (3D) networking architecture enabled by aerial intelligent reflecting surface (AIRS) to achieve panoramic signal reflection from the sky. Compared to the conventional terrestrial IRS, AIRS not only enjoys higher deployment flexibility, but also is able to achieve 360$^\circ$ panoramic full-angle reflection and requires fewer reflections in general due to its higher likelihood of having line-of-sight (LoS) links with the ground nodes. We focus on the problem of maximizing the worst-case signal-to-noise ratio (SNR) in a given coverage area by jointly optimizing the transmit beamforming, AIRS placement and phase shifts. The formulated problem is non-convex and the optimization variables are coupled with each other in an intricate manner. To tackle this problem, we first consider the special case of single-location SNR maximization to gain useful insights, for which the optimal solution is obtained in closed form. Then for the general case of area coverage, an efficient suboptimal solution is proposed by exploiting the similarity between phase shift optimization for IRS and analog beamforming for the conventional phased array. Numerical results show that the proposed design can achieve significant performance gain over heuristic AIRS deployment schemes.

preprint2020arXiv

ESPRIT: Explaining Solutions to Physical Reasoning Tasks

Neural networks lack the ability to reason about qualitative physics and so cannot generalize to scenarios and tasks unseen during training. We propose ESPRIT, a framework for commonsense reasoning about qualitative physics in natural language that generates interpretable descriptions of physical events. We use a two-step approach of first identifying the pivotal physical events in an environment and then generating natural language descriptions of those events using a data-to-text approach. Our framework learns to generate explanations of how the physical simulation will causally evolve so that an agent or a human can easily reason about a solution using those interpretable descriptions. Human evaluations indicate that ESPRIT produces crucial fine-grained details and has high coverage of physical concepts compared to even human annotations. Dataset, code and documentation are available at https://github.com/salesforce/esprit.

preprint2020arXiv

Explainable Tensorized Neural Ordinary Differential Equations for Arbitrary-step Time Series Prediction

We propose a continuous neural network architecture, termed Explainable Tensorized Neural Ordinary Differential Equations (ETN-ODE), for multi-step time series prediction at arbitrary time points. Unlike the existing approaches, which mainly handle univariate time series for multi-step prediction or multivariate time series for single-step prediction, ETN-ODE could model multivariate time series for arbitrary-step prediction. In addition, it enjoys a tandem attention, w.r.t. temporal attention and variable attention, being able to provide explainable insights into the data. Specifically, ETN-ODE combines an explainable Tensorized Gated Recurrent Unit (Tensorized GRU or TGRU) with Ordinary Differential Equations (ODE). The derivative of the latent states is parameterized with a neural network. This continuous-time ODE network enables a multi-step prediction at arbitrary time points. We quantitatively and qualitatively demonstrate the effectiveness and the interpretability of ETN-ODE on five different multi-step prediction tasks and one arbitrary-step prediction task. Extensive experiments show that ETN-ODE can lead to accurate predictions at arbitrary time points while attaining best performance against the baseline methods in standard multi-step time series prediction.

preprint2020arXiv

Fast Beam Training for IRS-Assisted Multiuser Communications

In this letter, we consider an intelligent reflecting surface (IRS)-assisted multiuser communication system, where an IRS is deployed to provide virtual line-of-sight (LoS) links between an access point (AP) and multiple users. We consider the practical codebook-based IRS passive beamforming and study efficient design for IRS reflect beam training, which is challenging due to the large number of IRS reflecting elements. In contrast to the conventional single-beam training, we propose a new multi-beam training method by dividing the IRS reflecting elements into multiple sub-arrays and designing their simultaneous multi-beam steering over time. By simply comparing the received signal power over time, each user can detect its optimal IRS beam direction with a high probability, even without searching over all possible beam directions as the single-beam training. Simulation results show that our proposed multi-beam training significantly reduces the training time of conventional single-beam training and yet achieves comparable IRS passive beamforming performance for data transmission.

preprint2020arXiv

Fast Channel Estimation for IRS-Assisted OFDM

In this letter, we study efficient channel estimation for an intelligent reflecting surface (IRS)-assisted orthogonal frequency division multiplexing (OFDM) system to achieve minimum training time. First, a fast channel estimation scheme with reduced OFDM symbol duration is proposed for arbitrary frequency-selective fading channels. Next, under the typical condition that the IRS-user channel is line-of-sight (LoS) dominant, another fast channel estimation scheme based on the novel concept of sampling-wise IRS reflection variation is proposed. Moreover, the pilot signal and IRS training reflection pattern are jointly optimized for both proposed schemes. Finally, the proposed schemes are compared in terms of training time and channel estimation performance via simulations, as well as against benchmark schemes.

preprint2020arXiv

Generative Image Inpainting with Submanifold Alignment

Image inpainting aims at restoring missing regions of corrupted images, which has many applications such as image restoration and object removal. However, current GAN-based generative inpainting models do not explicitly exploit the structural or textural consistency between restored contents and their surrounding contexts. To address this limitation, we propose to enforce the alignment (or closeness) between the local data submanifolds (or subspaces) around restored images and those around the original (uncorrupted) images during the learning process of GAN-based inpainting models. We exploit Local Intrinsic Dimensionality (LID) to measure, in deep feature space, the alignment between data submanifolds learned by a GAN model and those of the original data, from a perspective of both images (denoted as iLID) and local patches (denoted as pLID) of images. We then apply iLID and pLID as regularizations for GAN-based inpainting models to encourage two levels of submanifold alignment: 1) an image-level alignment for improving structural consistency, and 2) a patch-level alignment for improving textural details. Experimental results on four benchmark datasets show that our proposed model can generate more accurate results than state-of-the-art models.
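
As background for the LID measure used here, the sketch below implements the commonly used maximum-likelihood estimator of Local Intrinsic Dimensionality from the distances to a point's k nearest neighbours. How iLID and pLID are computed from deep features and turned into regularizers is not described in the abstract, so this is only the generic estimator; the function name and the synthetic distances are assumptions.

```python
import math


def lid_mle(distances):
    """MLE of LID: -1 / mean(log(r_i / r_k)) over the k nearest-neighbour distances.

    Assumes all distances are positive and not all equal.
    """
    r = sorted(distances)
    r_k = r[-1]
    mean_log = sum(math.log(r_i / r_k) for r_i in r) / len(r)
    return -1.0 / mean_log


# Synthetic check: in a d-dimensional neighbourhood the number of points within
# radius r grows like r**d, so the i-th neighbour sits near r_i = (i / k) ** (1 / d).
k = 1000
for d in (2, 3, 8):
    dists = [(i / k) ** (1.0 / d) for i in range(1, k + 1)]
    print(d, round(lid_mle(dists), 2))
```

On these idealized distances the estimate should land close to the true dimension d; on real features it varies from point to point, which is what makes it usable as a local alignment signal.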

preprint2020arXiv

Grafted network for person re-identification

Convolutional neural networks have shown outstanding effectiveness in person re-identification (re-ID). However, the models always have a large number of parameters and require too much computation for mobile applications. To relieve this problem, we propose a novel grafted network (GraftedNet), which is designed by grafting a high-accuracy rootstock and a lightweight scion. The rootstock is based on the former parts of ResNet-50 to provide a strong baseline, while the scion is a newly designed module, composed of the latter parts of SqueezeNet, to compress the parameters. To extract a more discriminative feature representation, a joint multi-level and part-based feature is proposed. In addition, to train GraftedNet efficiently, we propose an accompanying learning method, which adds an accompanying branch during training and removes it in testing to save parameters and computation. On three public person re-ID benchmarks (Market1501, DukeMTMC-reID and CUHK03), the effectiveness of GraftedNet is evaluated and its components are analyzed. Experimental results show that the proposed GraftedNet achieves 93.02%, 85.3% and 76.2% in Rank-1 and 81.6%, 74.7% and 71.6% in mAP, with only 4.6M parameters.

preprint2020arXiv

Holographic MIMO Surfaces for 6G Wireless Networks: Opportunities, Challenges, and Trends

Future wireless networks are expected to evolve towards an intelligent and software reconfigurable paradigm enabling ubiquitous communications between humans and mobile devices. They will also be capable of sensing, controlling, and optimizing the wireless environment to fulfill the visions of low-power, high-throughput, massively-connected, and low-latency communications. A key conceptual enabler that is recently gaining increasing popularity is the Holographic Multiple Input Multiple Output Surface (HMIMOS) that refers to a low-cost transformative wireless planar structure comprising sub-wavelength metallic or dielectric scattering particles, which is capable of impacting electromagnetic waves according to desired objectives. In this article, we provide an overview of HMIMOS communications by introducing the available hardware architectures for such reconfigurable metasurfaces and their main characteristics, as well as highlighting the opportunities and key challenges in designing HMIMOS-enabled communications.

preprint2020arXiv

Hybrid Offline-Online Design for UAV-Enabled Data Harvesting in Probabilistic LoS Channel

This paper considers an unmanned aerial vehicle (UAV)-enabled wireless sensor network (WSN) in urban areas, where a UAV is deployed to collect data from distributed sensor nodes (SNs) within a given duration. To characterize the occasional building blockage between the UAV and SNs, we construct the probabilistic line-of-sight (LoS) channel model for a Manhattan-type city by using the combined simulation and data regression method, which is shown in the form of a generalized logistic function of the UAV-SN elevation angle. We assume that only the knowledge of SNs' locations and the probabilistic LoS channel model is known a priori, while the UAV can obtain the instantaneous LoS/Non-LoS channel state information (CSI) with the SNs in real time along its flight. Our objective is to maximize the minimum (average) data collection rate from all the SNs for the UAV. To this end, we formulate a new rate maximization problem by jointly optimizing the UAV three-dimensional (3D) trajectory and transmission scheduling of SNs. Although the optimal solution is intractable due to the lack of the complete UAV-SNs CSI, we propose in this paper a novel and general design method, called hybrid offline-online optimization, to obtain a suboptimal solution to it, by leveraging both the statistical and real-time CSI. Essentially, our proposed method decouples the joint design of UAV trajectory and communication scheduling into two phases: namely, an offline phase that determines the UAV path prior to its flight based on the probabilistic LoS channel model, followed by an online phase that adaptively adjusts the UAV flying speeds along the offline optimized path as well as communication scheduling based on the instantaneous UAV-SNs CSI and SNs' individual amounts of data received accumulatively.

preprint2020arXiv

Intelligent Reflecting Surface Aided Multicasting with Random Passive Beamforming

In this letter, we consider a multicast system where a single-antenna transmitter sends a common message to multiple single-antenna users, aided by an intelligent reflecting surface (IRS) equipped with $N$ passive reflecting elements. Prior works on IRS have mostly assumed the availability of channel state information (CSI) for designing its passive beamforming. However, the acquisition of CSI requires substantial training overhead that increases with $N$. In contrast, we propose in this letter a novel \emph{random passive beamforming} scheme, where the IRS performs independent random reflection for $Q\geq 1$ times in each channel coherence interval without the need of CSI acquisition. For the proposed scheme, we first derive a closed-form approximation of the outage probability, based on which the optimal $Q$ with best outage performance can be efficiently obtained. Then, for the purpose of comparison, we derive a lower bound of the outage probability with traditional CSI-based passive beamforming. Numerical results show that a small $Q$ is preferred in the high-outage regime (or with high rate target) and the optimal $Q$ becomes larger as the outage probability decreases (or as the rate target decreases). Moreover, the proposed scheme significantly outperforms the CSI-based passive beamforming scheme with training overhead taken into consideration when $N$ and/or the number of users are large, thus offering a promising CSI-free alternative to existing CSI-based schemes.
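The idea of CSI-free random reflection can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption made for illustration and is not taken from the letter: i.i.d. Rayleigh fading on all links, a codeword that spans the Q random reflections so that each user's rate is the average over them, and the hypothetical function name `outage_random_irs`. It does not reproduce the letter's closed-form approximation or its accounting of training overhead.

```python
import numpy as np

def outage_random_irs(N, K, Q, rate_target, snr, trials=2000, seed=0):
    """Estimate multicast outage probability under random IRS phases.

    Assumed model: h_k = d_k + sum_n g_n * r_{k,n} * exp(j*theta_n), with
    all channel coefficients i.i.d. CN(0, 1) and theta_n uniform in [0, 2*pi).
    """
    rng = np.random.default_rng(seed)

    def cn(*shape):
        # Unit-variance circularly symmetric complex Gaussian samples.
        return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

    outages = 0
    for _ in range(trials):
        g = cn(N)        # transmitter -> IRS elements
        r = cn(K, N)     # IRS elements -> each user
        d = cn(K)        # direct transmitter -> user links
        # Q independent random reflection patterns, no CSI used.
        phases = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(Q, N)))
        h = d[None, :] + phases @ (r * g).T            # effective channels, shape (Q, K)
        # Assumption: per-user rate is averaged over the Q reflections.
        rates = np.log2(1.0 + snr * np.abs(h) ** 2).mean(axis=0)
        # A common message is in outage if the worst user cannot decode it.
        outages += rates.min() < rate_target
    return outages / trials
```

Calling, for example, `outage_random_irs(N=32, K=4, Q=2, rate_target=1.0, snr=1.0)` for several values of `Q` gives a rough feel for how the number of random reflections affects outage under these assumed conditions.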

preprint2020arXiv

Intelligent Reflecting Surface Aided Multiple Access: Capacity Region and Deployment Strategy

Intelligent reflecting surface (IRS) is a new promising technology that is able to manipulate the wireless propagation channel via smart and controllable signal reflection. In this paper, we investigate the capacity region of a multiple access channel (MAC) with two users sending independent messages to an access point (AP), aided by $M$ IRS reflecting elements. We consider two practical IRS deployment strategies that lead to different user-AP effective channels, namely, the distributed deployment where the $M$ reflecting elements form two IRSs, each deployed in the vicinity of one user, versus the centralized deployment where all the $M$ reflecting elements are deployed in the vicinity of the AP. For the distributed deployment, we derive the capacity region in closed-form; while for the centralized deployment, we derive a capacity region outer bound and propose an efficient rate-profile based method to characterize an achievable rate region (or capacity region inner bound). Furthermore, we compare the capacity regions of the two cases and draw useful insights into the optimal deployment of IRS in practical systems.

preprint2020arXiv

Intelligent Reflecting Surface Aided Wireless Communications: A Tutorial

Intelligent reflecting surface (IRS) is an enabling technology to engineer the radio signal propagation in wireless networks. By smartly tuning the signal reflection via a large number of low-cost passive reflecting elements, IRS is capable of dynamically altering wireless channels to enhance the communication performance. It is thus expected that the new IRS-aided hybrid wireless network comprising both active and passive components will be highly promising to achieve a sustainable capacity growth cost-effectively in the future. Despite its great potential, IRS faces new challenges to be efficiently integrated into wireless networks, such as reflection optimization, channel estimation, and deployment from communication design perspectives. In this paper, we provide a tutorial overview of IRS-aided wireless communication to address the above issues, and elaborate its reflection and channel models, hardware architecture and practical constraints, as well as various appealing applications in wireless networks. Moreover, we highlight important directions worthy of further investigation in future work.

preprint2020arXiv

Intelligent Reflecting Surface Assisted Multi-User OFDMA: Channel Estimation and Training Design

To achieve the full passive beamforming gains of intelligent reflecting surface (IRS), accurate channel state information (CSI) is indispensable but practically challenging to acquire, due to the excessive amount of channel parameters to be estimated which increases with the number of IRS reflecting elements as well as that of IRS-served users. To tackle this challenge, we propose in this paper two efficient channel estimation schemes for different channel setups in an IRS-assisted multi-user broadband communication system employing the orthogonal frequency division multiple access (OFDMA). The first channel estimation scheme, which estimates the CSI of all users in parallel simultaneously at the access point (AP), is applicable for arbitrary frequency-selective fading channels. In contrast, the second channel estimation scheme, which exploits a key property that all users share the same (common) IRS-AP channel to enhance the training efficiency and support more users, is proposed for the typical scenario with line-of-sight (LoS) dominant user-IRS channels. For the two proposed channel estimation schemes, we further optimize their corresponding training designs (including pilot tone allocations for all users and IRS time-varying reflection pattern) to minimize the channel estimation error. Moreover, we derive and compare the fundamental limits on the minimum training overhead and the maximum number of supportable users of these two schemes. Simulation results verify the effectiveness of the proposed channel estimation schemes and training designs, and show their significant performance improvement over various benchmark schemes.

preprint2020arXiv

Intelligent Reflecting Surface-Assisted Multiple Access with User Pairing: NOMA or OMA?

The integration of intelligent reflecting surface (IRS) to multiple access networks is a cost-effective solution for boosting spectrum/energy efficiency and enlarging network coverage/connections. However, due to the new capability of IRS in reconfiguring the wireless propagation channels, it is fundamentally unknown which multiple access scheme is superior in the IRS-assisted wireless network. In this letter, we pursue a theoretical performance comparison between non-orthogonal multiple access (NOMA) and orthogonal multiple access (OMA) in the IRS-assisted downlink communication, for which the transmit power minimization problems are formulated under the discrete unit-modulus reflection constraint on each IRS element. We analyze the minimum transmit powers required by different multiple access schemes and compare them numerically, which turn out to not fully comply with the stereotyped superiority of NOMA over OMA in conventional systems without IRS. Moreover, to avoid the exponential complexity of the brute-force search for the optimal discrete IRS phase shifts, we propose a low-complexity solution to achieve near-optimal performance.

preprint2020arXiv

Intelligent Reflecting Surface-Enhanced OFDM: Channel Estimation and Reflection Optimization

In the intelligent reflecting surface (IRS)-enhanced wireless communication system, channel state information (CSI) is of paramount importance for achieving the passive beamforming gain of IRS, which, however, is a practically challenging task due to its massive number of passive elements without transmitting/receiving capabilities. In this letter, we propose a practical transmission protocol to execute channel estimation and reflection optimization successively for an IRS-enhanced orthogonal frequency division multiplexing (OFDM) system. Under the unit-modulus constraint, a novel reflection pattern at the IRS is designed to aid the channel estimation at the access point (AP) based on the received pilot signals from the user, for which the channel estimation error is derived in closed-form. With the estimated CSI, the reflection coefficients are then optimized by a low-complexity algorithm based on the resolved strongest signal path in the time domain. Simulation results corroborate the effectiveness of the proposed channel estimation and reflection optimization methods.

preprint2020arXiv

Intelligent Reflecting Surface: Practical Phase Shift Model and Beamforming Optimization

Intelligent reflecting surface (IRS) that enables the control of wireless propagation environment has recently emerged as a promising cost-effective technology for boosting the spectrum and energy efficiency in future wireless communication systems. Prior works on IRS are mainly based on the ideal phase shift model assuming the full signal reflection by each of the elements regardless of its phase shift, which, however, is practically difficult to realize. In contrast, we propose in this paper the practical phase shift model that captures the phase-dependent amplitude variation in the element-wise reflection coefficient. Based on the proposed model and considering an IRS-aided multiuser system with an IRS deployed to assist in the downlink communications from a multi-antenna access point (AP) to multiple single-antenna users, we formulate an optimization problem to minimize the total transmit power at the AP by jointly designing the AP transmit beamforming and the IRS reflect beamforming, subject to the users' individual signal-to-interference-plus-noise ratio (SINR) constraints. Iterative algorithms are proposed to find suboptimal solutions to this problem efficiently by utilizing the alternating optimization (AO) or penalty-based optimization technique. Moreover, we analyze the asymptotic performance loss of the IRS-aided system that employs practical phase shifters but assumes the ideal phase shift model for beamforming optimization, as the number of IRS elements goes to infinity. Simulation results unveil substantial performance gains achieved by the proposed beamforming optimization based on the practical phase shift model as compared to the conventional ideal model.

preprint2020arXiv

Intelligent Reflecting Surface: Practical Phase Shift Model and Beamforming Optimization

Intelligent reflecting surface (IRS) that enables the control of the wireless propagation environment has been looked upon as a promising technology for boosting the spectrum and energy efficiency in future wireless communication systems. Prior works on IRS are mainly based on the ideal phase shift model assuming the full signal reflection by each of the elements regardless of its phase shift, which, however, is practically difficult to realize. In contrast, we propose in this paper a practical phase shift model that captures the phase-dependent amplitude variation in the element-wise reflection coefficient. Applying this new model to an IRS-aided wireless system, we formulate a problem to maximize its achievable rate by jointly optimizing the transmit beamforming and the IRS reflect beamforming. The formulated problem is non-convex and difficult to be optimally solved in general, for which we propose a low-complexity suboptimal solution based on the alternating optimization (AO) technique. Simulation results unveil a substantial performance gain achieved by the joint beamforming optimization based on the proposed phase shift model as compared to the conventional ideal model.

preprint2020arXiv

Joint Power Control and Passive Beamforming in IRS-Assisted Spectrum Sharing

In cognitive radio (CR) communication systems, achieving high secondary user (SU) rate in the presence of strong cross-link interference with the primary user (PU) is challenging. In this letter, we exploit the emerging intelligent reflecting surface (IRS) technology to tackle this problem. Specifically, we investigate an IRS-assisted CR communication system where an IRS is deployed to assist in the spectrum sharing between a PU link and an SU link. We aim to maximize the achievable SU rate subject to a given signal-to-interference-plus-noise ratio target for the PU link, by jointly optimizing the SU transmit power and IRS reflect beamforming. Since the formulated problem is difficult to solve due to its non-convexity and coupled variables, we propose an efficient algorithm based on alternating optimization and successive convex approximation techniques to solve it sub-optimally, along with some heuristic designs for lower complexity. Simulation results show that IRS is able to significantly improve the SU rate, even for the scenarios deemed most challenging in conventional CR systems without using IRS.

preprint2020arXiv

Learning Based Distributed Tracking

Inspired by the great success of machine learning in the past decade, people have been thinking about the possibility of improving the theoretical results by exploring data distribution. In this paper, we revisit a fundamental problem called Distributed Tracking (DT) under an assumption that the data follows a certain (known or unknown) distribution, and propose a number of data-dependent algorithms with improved theoretical bounds. Informally, in the DT problem, there is a coordinator and k players, where the coordinator holds a threshold N and each player has a counter. At each time stamp, at most one counter can be increased by one. The job of the coordinator is to capture the exact moment when the sum of all these k counters reaches N. The goal is to minimise the communication cost. While our first type of algorithms assume the concrete data distribution is known in advance, our second type of algorithms can learn the distribution on the fly. Both of the algorithms achieve a communication cost bounded by O(k log log N) with high probability, improving the state-of-the-art data-independent bound O(k log(N/k)). We further propose a number of implementation optimisation heuristics to improve both efficiency and robustness of the algorithms. Finally, we conduct extensive experiments on three real datasets and four synthetic datasets. The experimental results show that the communication cost of our algorithms is as little as 20% of that of the state-of-the-art algorithms.
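To make the DT problem statement concrete, here is a minimal simulation of the classic data-independent, round-based protocol, which is the kind of baseline behind the O(k log(N/k)) bound. It is not the paper's learning-based algorithm. The function name `distributed_tracking`, the stream representation, and the message accounting (one message per signal, 2k messages per poll) are assumptions for illustration.

```python
def distributed_tracking(stream, k, N):
    """Simulate round-based distributed tracking of a threshold N.

    `stream` yields, for each time step, the index of the player whose
    counter increases by one. Returns (step at which the coordinator
    detects that the sum equals N, number of messages), or (None, messages)
    if the stream ends first.
    """
    counters = [0] * k      # true local counters held by the players
    unreported = [0] * k    # increments since each player's last signal or poll
    known_total = 0         # exact sum known to the coordinator at round start
    signals = 0             # signals received in the current round
    messages = 0
    # Each player signals after `slack` local increments.
    slack = max((N - known_total) // (2 * k), 1)

    for step, player in enumerate(stream, start=1):
        counters[player] += 1
        unreported[player] += 1
        if unreported[player] < slack:
            continue
        unreported[player] = 0
        messages += 1                      # player -> coordinator signal
        if slack == 1:
            # Final phase: every increment is reported, so the count is exact.
            known_total += 1
            if known_total == N:
                return step, messages
            continue
        signals += 1
        if signals == k:
            # After k signals the sum is still below N; poll all players.
            messages += 2 * k              # k requests plus k replies
            known_total = sum(counters)
            unreported = [0] * k
            signals = 0
            slack = max((N - known_total) // (2 * k), 1)
    return None, messages
```

Because fewer than 2k times the slack increments can occur before a round ends, the sum never passes N unnoticed, and the remaining gap roughly halves each round, so the number of rounds grows only logarithmically in N/k.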

preprint2020arXiv

Machine-Learning Prediction for Quasi-PDF Matrix Elements

There have been rapid developments in the direct calculation in lattice QCD (LQCD) of the Bjorken-$x$ dependence of hadron structure through large-momentum effective theory (LaMET). LaMET overcomes the previous limitation of LQCD to moments (that is, integrals over Bjorken-$x$) of hadron structure, allowing LQCD to directly provide the kinematic regions where the experimental values are least known. LaMET requires large-momentum hadron states to minimize its systematics and allow us to reach small-$x$ reliably. This means that very fine lattice spacing to minimize lattice artifacts at order $(P_z a)^n$ will become crucial for next-generation LaMET-like structure calculations. Furthermore, such calculations require operators with long Wilson-link displacements (in finer lattice units), increasing the communication costs relative to that of the propagator inversion. In this work, we explore whether machine-learning (ML) algorithms can make correlator predictions to reduce the computational cost of these LQCD calculations. We consider two algorithms, gradient-boosting decision tree and linear models, applied to LaMET data, the matrix elements needed to determine the kaon and $η_s$ unpolarized parton distribution functions (PDFs), meson distribution amplitude (DA), and the nucleon gluon PDF. We find that both algorithms can reliably predict the target observables with different fit quality and systematic errors. The predictions from smaller displacement $z$ to larger ones work better than those for momentum $p$ due to the higher correlation among the data.

preprint2020arXiv

Map Generation from Large Scale Incomplete and Inaccurate Data Labels

Accurately and globally mapping human infrastructure is an important and challenging task with applications in routing, regulation compliance monitoring, natural disaster response management, etc. In this paper we present progress in developing an algorithmic pipeline and distributed compute system that automates the process of map creation using high resolution aerial images. Unlike previous studies, most of which use datasets that are available only in a few cities across the world, we utilize publicly available imagery and map data, both of which cover the contiguous United States (CONUS). We approach the technical challenge of inaccurate and incomplete training data by adopting state-of-the-art convolutional neural network architectures such as the U-Net and the CycleGAN to incrementally generate maps with increasingly more accurate and more complete labels of man-made infrastructure such as roads and houses. Since scaling the mapping task to CONUS calls for parallelization, we then adopted an asynchronous distributed stochastic parallel gradient descent training scheme to distribute the computational workload onto a cluster of GPUs with nearly linear speed-up.

preprint2020arXiv

Mechanical Properties of Atomically Thin Boron Nitride and the Role of Interlayer Interactions

Atomically thin boron nitride (BN) nanosheets are important two-dimensional nanomaterials with many unique properties distinct from those of graphene, but the investigation of their mechanical properties is still greatly lacking. Here we report that high-quality single-crystalline mono- and few-layer BN nanosheets are one of the strongest electrically insulating materials. More intriguingly, few-layer BN shows mechanical behaviors quite different from those of few-layer graphene under indentation. In striking contrast to graphene, whose strength decreases by more than 30% when the number of layers increases from 1 to 8, the mechanical strength of BN nanosheets is not sensitive to increasing thickness. We attribute this difference to the distinct interlayer interactions and hence sliding tendencies in these two materials under indentation. The significantly better mechanical integrity of BN nanosheets makes them a more attractive candidate than graphene for several applications, e.g. as mechanical reinforcements.

preprint2020arXiv

Multi-Antenna UAV Data Harvesting: Joint Trajectory and Communication Optimization

Unmanned aerial vehicle (UAV)-enabled communication is a promising technology to extend coverage and enhance throughput for traditional terrestrial wireless communication systems. In this paper, we consider a UAV-enabled wireless sensor network (WSN), where a multi-antenna UAV is dispatched to collect data from a group of sensor nodes (SNs). The objective is to maximize the minimum data collection rate from all SNs via jointly optimizing their transmission scheduling and power allocations as well as the trajectory of the UAV, subject to the practical constraints on the maximum transmit power of the SNs and the maximum speed of the UAV. The formulated optimization problem is challenging to solve as it involves non-convex constraints and discrete-value variables. To draw useful insight, we first consider the special case of the formulated problem by ignoring the UAV speed constraint and optimally solve it based on the Lagrange duality method. It is shown that for this relaxed problem, the UAV should hover above a finite number of optimal locations with different durations in general. Next, we address the general case of the formulated problem where the UAV speed constraint is considered and propose a traveling salesman problem (TSP)-based trajectory initialization, where the UAV sequentially visits the locations obtained in the relaxed problem with minimum flying time. Given this initial trajectory, we then find the corresponding transmission scheduling and power allocations of the SNs and further optimize the UAV trajectory by applying the block coordinate descent (BCD) and successive convex approximation (SCA) techniques. Finally, numerical results are provided to illustrate the spectrum and energy efficiency gains of the proposed scheme for multi-antenna UAV data harvesting, as compared to benchmark schemes.

preprint2020arXiv

Nanoscale magnetic resonance imaging of proteins in a single cell

Magnetic resonance imaging (MRI) is a non-invasive and label-free technique widely used in medical diagnosis and life science research, and its success has benefited greatly from continuing efforts on enhancing contrast and resolution. Here we report nanoscale MRI in a single cell using an atomic-size quantum sensor. With a nitrogen-vacancy center in diamond, the intracellular protein ferritin has been imaged with a spatial resolution of ~10 nanometers, and ferritin-containing organelles were co-localized by correlative MRI and electron microscopy. Compared to the micrometer resolution of current state-of-the-art conventional MRI, our approach represents a 100-fold enhancement, and paves the way for MRI of intracellular proteins.

preprint2020arXiv

Optimal l-one Rank One Matrix Decompositions

In this paper we consider the decomposition of positive semidefinite matrices as a sum of rank one matrices. We introduce and investigate the properties of various measures of optimality of such decompositions. For some classes of positive semidefinite matrices we give explicitly these optimal decompositions. These classes include diagonally dominant matrices and certain of their generalizations, $2\times 2$, and a class of $3\times 3$ matrices.

preprint2020arXiv

Pilot Decontamination for Massive MIMO Network with UAVs

This letter studies the pilot contamination (PC) problem for massive multiple-input multiple-output (MIMO) networks with coexisting terrestrial users and unmanned aerial vehicles (UAVs). Due to the strong line-of-sight (LoS) air-to-ground channels between UAVs and base stations (BSs), UAVs usually cause a more severe PC issue as compared to the traditional terrestrial users. To mitigate the PC caused by UAVs, we propose a low-complexity distributed scheme by exploiting the full-dimensional beamforming of massive MIMO BSs and the angle-dependent LoS channels between them and high-altitude UAVs. Numerical results show the effectiveness of the proposed pilot decontamination scheme and the significant signal-to-interference-plus-noise ratio (SINR) gains in both the uplink and downlink after pilot decontamination.

preprint2020arXiv

Portable Intrinsic Gradiometer for Ultra-Sensitive Detection of Magnetic Gradient in Unshielded Environment

We demonstrate a portable all-optical intrinsic scalar magnetic gradiometer composed of miniaturized cesium vapor cells and vertical-cavity surface-emitting lasers (VCSELs). Two cells, with an inner dimension of 5 mm x 5 mm x 5 mm and separated by a baseline of 5 cm, are driven by one VCSEL and the resulting Larmor precessions are probed by a second VCSEL through optical rotation. The off-resonant linearly polarized probe light interrogates two cells at the same time and the output of the intrinsic gradiometer is proportional to the magnetic field gradient measured over the given baseline. This intrinsic gradiometer scheme has the advantage of avoiding added noise from combining two scalar magnetometers. We achieve better than 18 fT/cm/rt-Hz sensitivity in the gradient measurement. Ultra-sensitive short-baseline magnetic gradiometers can potentially play an important role in many practical applications, such as nondestructive evaluation and unexploded ordnance (UXO) detection. Another application of the gradiometer is for magnetocardiography (MCG) in an unshielded environment. Real-time MCG signals can be extracted from the raw gradiometer readings. The demonstrated gradiometer greatly simplifies the MCG setup and may lead to ubiquitous MCG measurement in the future.

preprint2020arXiv

Rank one tensor completion problem

In this paper, we consider the rank-one tensor completion problem. We address the question of existence and uniqueness of the rank-one solution. In particular we show that the global uniqueness over the field of real numbers can be verified in a polynomial time. We give examples showing that there is an essential difference between the question of global uniqueness over the fields of real and complex numbers. Finally we briefly discuss the rank-one approximation problem for noisy observations.

preprint2020arXiv

ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition

In recent years, scene text recognition has commonly been regarded as a sequence-to-sequence problem. Connectionist Temporal Classification (CTC) and Attentional sequence recognition (Attn) are two prevailing approaches to this problem, but each may fail in some scenarios. CTC concentrates more on every individual character but is weak in text semantic dependency modeling. Attn-based methods have better context semantic modeling ability but tend to overfit on limited training data. In this paper, we elaborately design a Rectified Attentional Double Supervised Network (ReADS) for general scene text recognition. To overcome the weaknesses of CTC and Attn, both of them are applied in our method but with different modules in two supervised branches that complement each other. Moreover, effective spatial and channel attention mechanisms are introduced to eliminate background noise and extract valid foreground information. Finally, a simple rectified network is implemented to rectify irregular text. ReADS can be trained end-to-end and only word-level annotations are required. Extensive experiments on various benchmarks verify the effectiveness of ReADS, which achieves state-of-the-art performance.

preprint2020arXiv

Refined Nonlinear Rectenna Modeling and Optimal Waveform Design for Multi-User Multi-Antenna Wireless Power Transfer

In this paper, we study the optimal waveform design for wireless power transfer (WPT) from a multi-antenna energy transmitter (ET) to multiple single-antenna energy receivers (ERs) simultaneously in multi-path frequency-selective channels. First, we propose a refined nonlinear current-voltage model of the diode in the ER rectifier, and accordingly derive new expressions for the output direct current (DC) voltage and corresponding harvested power at the ER. Leveraging this new rectenna model, we first consider the single-ER case and study the multisine-based power waveform design based on the wireless channel to maximize the harvested power at the ER. We propose two efficient algorithms for finding high-quality suboptimal solutions to this non-convex optimization problem. Next, we extend our formulated waveform design problem to the general multi-ER case for maximizing the weighted sum of the harvested powers by all ERs, and propose an efficient difference-of-convex functions programming (DCP)-based algorithm for solving this problem. Finally, we demonstrate the superior performance of our proposed waveform designs based on the new rectenna model over existing schemes/models via simulations.

preprint2020arXiv

Relative Entropy Regularised TDLAS Tomography for Robust Temperature Imaging

Tunable Diode Laser Absorption Spectroscopy (TDLAS) tomography has been widely used for in situ combustion diagnostics, yielding images of both species concentration and temperature. The temperature image is generally obtained from the reconstructed absorbance distributions for two spectral transitions, i.e. two-line thermometry. However, the inherently ill-posed nature of tomographic data inversion leads to noise in each of the reconstructed absorbance distributions. These noise effects propagate into the absorbance ratio and generate artefacts in the retrieved temperature image. To address this problem, we have developed a novel algorithm, which we call Relative Entropy Tomographic RecOnstruction (RETRO), for TDLAS tomography. A relative entropy regularisation is introduced for high-fidelity temperature image retrieval from jointly reconstructed two-line absorbance distributions. We have carried out numerical simulations and proof-of-concept experiments to validate the proposed algorithm. Compared with the well-established Simultaneous Algebraic Reconstruction Technique (SART), the RETRO algorithm significantly improves the quality of the tomographic temperature images, exhibiting excellent robustness against TDLAS tomographic measurement noise. RETRO offers great potential for industrial field applications of TDLAS tomography, where it is common for measurements to be performed in very harsh environments.

preprint2020arXiv

RF-Rhythm: Secure and Usable Two-Factor RFID Authentication

Passive RFID technology is widely used in user authentication and access control. We propose RF-Rhythm, a secure and usable two-factor RFID authentication system with strong resilience to lost/stolen/cloned RFID cards. In RF-Rhythm, each legitimate user performs a sequence of taps on his/her RFID card according to a self-chosen secret melody. Such rhythmic taps can induce phase changes in the backscattered signals, which the RFID reader can detect to recover the user's tapping rhythm. In addition to verifying the RFID card's identification information as usual, the backend server compares the extracted tapping rhythm with what it acquires in the user enrollment phase. The user passes authentication checks if and only if both verifications succeed. We also propose a novel phase-hopping protocol in which the RFID reader emits Continuous Wave (CW) with random phases for extracting the user's secret tapping rhythm. Our protocol can prevent a capable adversary from extracting and then replaying a legitimate tapping rhythm from sniffed RFID signals. Comprehensive user experiments confirm the high security and usability of RF-Rhythm with false-positive and false-negative rates close to zero.

preprint2020arXiv

Sculpting stable structures in pure liquids

Pure liquids in thermodynamic equilibrium are structurally homogeneous. In liquid crystals, flow and light pulses are used to create reconfigurable domains with polar order. Moreover, through careful engineering of concerted microfluidic flows and localized opto-thermal fields, it is possible to achieve complete control over the nucleation, growth, and shape of such domains. Experiments, theory, and simulations indicate that the resulting structures can be stabilized indefinitely, provided the liquids are maintained in a controlled non-equilibrium state. The resulting sculpted liquids could find applications in microfluidic devices for selective encapsulation of solutes and particles into optically active compartments that interact with external stimuli.

preprint2020arXiv

Security of Distributed Machine Learning: A Game-Theoretic Approach to Design Secure DSVM

Distributed machine learning algorithms play a significant role in processing massive data sets over large networks. However, the increasing reliance of machine learning on information and communication technologies (ICTs) makes it inherently vulnerable to cyber threats. This work aims to develop secure distributed algorithms to protect the learning from data poisoning and network attacks. We establish a game-theoretic framework to capture the conflicting goals of a learner who uses distributed support vector machines (SVMs) and an attacker who is capable of modifying training data and labels. We develop a fully distributed and iterative algorithm to capture real-time reactions of the learner at each node to adversarial behaviors. The numerical results show that distributed SVM is prone to fail under different types of attacks, and that the impact of the attacks depends strongly on the network structure and attack capabilities.

preprint2020arXiv

Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation

Dialogue policy optimization often obtains feedback only upon task completion in task-oriented dialogue systems. This is insufficient for training intermediate dialogue turns since supervision signals (or rewards) are only provided at the end of dialogues. To address this issue, reward learning has been introduced to learn from state-action pairs of an optimal policy to provide turn-by-turn rewards. This approach requires complete state-action annotations of human-to-human dialogues (i.e., expert demonstrations), which is labor intensive. To overcome this limitation, we propose a novel reward learning approach for semi-supervised policy learning. The proposed approach learns a dynamics model as the reward function which models dialogue progress (i.e., state-action sequences) based on expert demonstrations, either with or without annotations. The dynamics model computes rewards by predicting whether the dialogue progress is consistent with expert demonstrations. We further propose to learn action embeddings for a better generalization of the reward function. The proposed approach outperforms competitive policy learning baselines on MultiWOZ, a benchmark multi-domain dataset.

preprint2020arXiv

Short-Term and Long-Term Context Aggregation Network for Video Inpainting

Video inpainting aims to restore missing regions of a video and has many applications such as video editing and object removal. However, existing methods either suffer from inaccurate short-term context aggregation or rarely explore long-term frame information. In this work, we present a novel context aggregation network to effectively exploit both short-term and long-term frame information for video inpainting. In the encoding stage, we propose boundary-aware short-term context aggregation, which aligns and aggregates, from neighbor frames, local regions that are closely related to the boundary context of missing regions into the target frame. Furthermore, we propose dynamic long-term context aggregation to globally refine the feature map generated in the encoding stage using long-term frame features, which are dynamically updated throughout the inpainting process. Experiments show that it outperforms state-of-the-art methods with better inpainting results and fast inpainting speed.

preprint2020arXiv

Signal-Dependent Performance Analysis of Orthogonal Matching Pursuit for Exact Sparse Recovery

Exact recovery of $K$-sparse signals $x \in \mathbb{R}^{n}$ from linear measurements $y=Ax$, where $A\in \mathbb{R}^{m\times n}$ is a sensing matrix, arises in many applications. The orthogonal matching pursuit (OMP) algorithm is widely used for reconstructing $x$. A fundamental question in the performance analysis of OMP is the characterization of the probability of exact recovery of $x$ for a random matrix $A$ and the minimal $m$ to guarantee a target recovery performance. In many practical applications, in addition to sparsity, $x$ also has some additional properties. This paper shows that these properties can be used to refine the answer to the above question. In this paper, we first show that the prior information of the nonzero entries of $x$ can be used to provide an upper bound on $\|x\|_1^2/\|x\|_2^2$. Then, we use this upper bound to develop a lower bound on the probability of exact recovery of $x$ using OMP in $K$ iterations. Furthermore, we develop a lower bound on the number of measurements $m$ to guarantee that the exact recovery probability using $K$ iterations of OMP is no smaller than a given target probability. Finally, we show that when $K=O(\sqrt{\ln n})$, as both $n$ and $K$ go to infinity, for any $0<\zeta\leq 1/\sqrt{\pi}$, $m=2K\ln (n/\zeta)$ measurements are sufficient to ensure that the probability of exactly recovering any $K$-sparse $x$ is no lower than $1-\zeta$ with $K$ iterations of OMP. For $K$-sparse $\alpha$-strongly decaying signals and for $K$-sparse $x$ whose nonzero entries independently and identically follow the Gaussian distribution, the number of measurements sufficient for exact recovery with probability no lower than $1-\zeta$ reduces further to $m=(\sqrt{K}+4\sqrt{\frac{\alpha+1}{\alpha-1}\ln(n/\zeta)})^2$ and asymptotically $m\approx 1.9K\ln (n/\zeta)$, respectively.
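For readers unfamiliar with OMP, the following is a minimal textbook-style sketch of the greedy recovery loop with $K$ iterations, together with a helper that evaluates the measurement count $m=2K\ln(n/\zeta)$ quoted above. It is not the authors' code: the function names are hypothetical, and the measurement formula is an asymptotic result stated under $K=O(\sqrt{\ln n})$, so the number it returns is only indicative for small problems.

```python
import math
import numpy as np

def omp(A, y, K):
    """Run K iterations of orthogonal matching pursuit and return the estimate of x."""
    n = A.shape[1]
    residual = np.asarray(y, dtype=float).copy()
    support = []
    x_hat = np.zeros(n)
    for _ in range(K):
        # Select the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # do not reselect chosen columns
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the selected columns, then update the residual.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
        x_hat = np.zeros(n)
        x_hat[support] = coeffs
    return x_hat

def sufficient_measurements(K, n, zeta):
    """Evaluate m = 2 K ln(n / zeta), rounded up (asymptotic formula from the abstract)."""
    return math.ceil(2 * K * math.log(n / zeta))

# Illustrative run with a random Gaussian sensing matrix (sizes are arbitrary).
rng = np.random.default_rng(0)
n, K = 64, 3
m = sufficient_measurements(K, n, zeta=0.01)
A = rng.standard_normal((m, n)) / np.sqrt(m)
x = np.zeros(n)
x[rng.choice(n, size=K, replace=False)] = rng.standard_normal(K)
x_rec = omp(A, A @ x, K)
print(m, np.linalg.norm(x - x_rec))
```

Recovery is exact when every iteration selects a column in the true support, which is the event whose probability the paper bounds.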

preprint2020arXiv

Simultaneous Navigation and Radio Mapping for Cellular-Connected UAV with Deep Reinforcement Learning

Cellular-connected unmanned aerial vehicle (UAV) is a promising technology to unlock the full potential of UAVs in the future. However, how to achieve ubiquitous three-dimensional (3D) communication coverage for the UAVs in the sky is a new challenge. In this paper, we tackle this challenge by a new coverage-aware navigation approach, which exploits the UAV's controllable mobility to design its navigation/trajectory to avoid the cellular BSs' coverage holes while accomplishing its mission. We formulate a UAV trajectory optimization problem to minimize the weighted sum of its mission completion time and expected communication outage duration, and propose a new solution approach based on the technique of deep reinforcement learning (DRL). To further improve the performance, we propose a new framework called simultaneous navigation and radio mapping (SNARM), where the UAV's signal measurement is used not only for training the deep Q network (DQN) directly, but also to create a radio map that is able to predict the outage probabilities at all locations in the area of interest. This thus enables the generation of simulated UAV trajectories and predicting their expected returns, which are then used to further train the DQN via Dyna technique, thus greatly improving the learning efficiency.

preprint2020arXiv

Spatial Throughput Characterization for Intelligent Reflecting Surface Aided Multiuser System

Intelligent Reflecting Surface (IRS) has been recently proposed as a promising solution to enhance the spectral and energy efficiency of future wireless networks by tuning a massive number of low-cost passive reflecting elements and thereby constructing a favorable wireless propagation environment. Different from the prior works that focus on link-level performance optimization for IRS-aided wireless systems, this letter characterizes the spatial throughput of a single-cell multiuser system aided by multiple IRSs that are randomly deployed in the cell. It is shown by simulation that our analysis is valid and the IRS-aided system outperforms the full-duplex relay-aided counterpart system in terms of spatial throughput when the number of IRSs exceeds a certain value. Moreover, it is shown that different deployment strategies for IRSs/active relays should be adopted for their respective throughput maximization. Finally, it is revealed that given the total number of reflecting elements for IRSs, the system spatial throughput increases when fewer IRSs are deployed each with more reflecting elements, but at the cost of more spatially varying user rates.

preprint2020arXiv

Towards Reliable UAV Swarm Communication in D2D-Enhanced Cellular Network

In the existing cellular networks, it remains a challenging problem to communicate with and control an unmanned aerial vehicle (UAV) swarm with both high reliability and low latency. Due to the UAV swarm's high working altitude and strong ground-to-air channels, it is generally exposed to multiple ground base stations (GBSs), while the GBSs that are serving ground users (occupied GBSs) can generate strong interference to the UAV swarm. To tackle this issue, we propose a novel two-phase transmission protocol by exploiting cellular plus device-to-device (D2D) communication for the UAV swarm. In Phase I, one swarm head is chosen for ground-to-air channel estimation, and all the GBSs that are not serving ground users (available GBSs) transmit a common control message to the UAV swarm simultaneously, using the same cellular frequency band, to combat the strong interference from occupied GBSs. In Phase II, all the UAVs that have decoded the common control message in Phase I further relay it to the other UAVs in the swarm via D2D communication, by exploiting the less interfered D2D frequency band and the proximity among UAVs. In this paper, we aim to characterize the reliability performance of the above two-phase protocol, i.e., the expected percentage of UAVs in the swarm that can decode the common control message, which is a non-trivial problem due to the complex system setup and the intricate coupling between the two phases. Nevertheless, we manage to obtain an approximated expression of the reliability performance of interest, under reasonable assumptions and with the aid of the Pearson distributions. Numerical results validate the accuracy of our analytical results and show the effectiveness of our protocol over other benchmark protocols. We also study the effect of key system parameters on the reliability performance, to reveal useful insights on the practical system design.

preprint2020arXiv

UAV-Sensing-Assisted Cellular Interference Coordination: A Cognitive Radio Approach

Aerial-ground interference mitigation has been deemed as the main challenge in realizing cellular-connected unmanned aerial vehicle (UAV) communications. Due to the line-of-sight (LoS)-dominant air-ground channels, the UAV generates/suffers much stronger interference to/from cellular base stations (BSs) over a much larger region in its uplink/downlink communication, as compared to the terrestrial users. As a result, conventional inter-cell interference coordination (ICIC) techniques catered for terrestrial networks become ineffective in mitigating the more severe UAV-induced interference. To deal with this new challenge, this letter introduces a cognitive radio based solution by treating the UAV and terrestrial users as secondary and primary users in the network, respectively. In particular, the LoS channels with terrestrial BSs/users endow the UAV with a powerful spectrum sensing capability for detecting the terrestrial signals over a much larger region than its serving BS. By exploiting this unique feature, we propose a new UAV-sensing-assisted ICIC design for both the UAV downlink and uplink communications. Specifically, the UAV senses its received interference and the transmissions of terrestrial users in the downlink and uplink, respectively, over the resource blocks (RBs) available at its serving BS to assist its RB allocation to the UAV for avoiding the interference with co-channel terrestrial communications. Numerical results demonstrate that the proposed UAV-assisted ICIC outperforms the conventional terrestrial ICIC by engaging the neighboring BSs for cooperation only.

preprint2020arXiv

Uplink Cooperative Interference Cancellation for Cellular-Connected UAV: A Quantize-and-Forward Approach

Aerial-ground interference is the main obstacle to achieving high spectral efficiency in cellular networks with both traditional terrestrial users and new unmanned aerial vehicle (UAV) users. Due to their strong line-of-sight (LoS) channels with the ground, UAVs could cause/suffer severe interference to/from a large number of non-associated but co-channel base stations (BSs) in their uplink/downlink communications. In this letter, we propose a new cooperative interference cancellation (CIC) scheme for the UAV's uplink communication to mitigate its strong interference to co-channel BSs, which requires only local cooperation between each co-channel BS and its adjacent helping BSs. Specifically, the helping BSs without serving any users in the UAV's communication channel quantize and forward their received signals from the UAV to their aided co-channel BS, which then processes the quantized signals jointly with its own received signal to decode its served terrestrial user's message via canceling the UAV's signal by linear/nonlinear interference cancellation techniques (ICTs). We derive the achievable rates of the proposed CIC scheme with different ICTs as a function of the UAV's transmit power and rate, and thereby unveil the conditions under which the proposed CIC scheme outperforms the existing CIC scheme based on the decode-and-forward (DF) operation of the helping BSs.

preprint2020arXiv

Weighted Sum Power Maximization for Intelligent Reflecting Surface Aided SWIPT

The low efficiency of far-field wireless power transfer (WPT) limits the fundamental rate-energy (R-E) performance trade-off of the simultaneous wireless information and power transfer (SWIPT) system. To address this challenge, we propose in this letter a new SWIPT system aided by the emerging intelligent reflecting surface (IRS) technology. By leveraging massive low-cost passive elements that are able to reflect the signals with adjustable phase shifts, an IRS achieves a high passive beamforming gain, which is appealing for drastically enhancing the WPT efficiency and thereby the R-E trade-off of SWIPT systems. We consider an IRS being deployed to assist a multi-antenna access point (AP) to serve multiple information decoding receivers (IDRs) and energy harvesting receivers (EHRs). We aim to maximize the weighted sum-power received by EHRs via jointly optimizing the transmit precoders at the AP and reflect phase shifts at the IRS, subject to the individual signal-to-interference-plus-noise ratio (SINR) constraints for IDRs. Since this problem is non-convex, we propose efficient algorithms to obtain suboptimal solutions for it. In particular, we prove that it is sufficient to send information signals only at the AP to serve both IDRs and EHRs regardless of their channel realizations. Moreover, simulation results show significant performance gains achieved by our proposed designs over benchmark schemes.

preprint2020arXiv

Wireless Communication via Double IRS: Channel Estimation and Passive Beamforming Designs

In this letter, we study efficient channel estimation and passive beamforming designs for a double-intelligent reflecting surface (IRS) aided single-user communication system, where a user communicates with an access point (AP) via the cascaded user-IRS 1-IRS 2-AP double-reflection link. First, a general channel estimation scheme is proposed for the system under any arbitrary inter-IRS channel, where all coefficients of the cascaded channel are estimated. Next, for the typical scenario with a line-of-sight (LoS)-dominant inter-IRS channel, we propose another customized scheme to estimate two signature vectors of the rank-one cascaded channel with significantly less channel training time than the first scheme. For the two proposed channel estimation schemes, we further optimize their corresponding cooperative passive beamforming for data transmission to maximize the achievable rate with the training overhead and channel estimation error taken into account. Numerical results show that deploying two cooperative IRSs with the proposed channel estimation and passive beamforming designs achieves significant rate enhancement as compared to the conventional case of single IRS deployment.

preprint2019arXiv

Doping Dependence of the Second Magnetization Peak, Critical Current Density and Pinning Mechanism in BaFe$_{2-x}$Ni$_x$As$_2$ Pnictide Superconductors

A series of high-quality BaFe$_{2-x}$Ni$_x$As$_2$ pnictide superconductors were studied using magnetic relaxation and isothermal magnetic measurements in order to study the second magnetization peak (SMP) and critical current behaviour in the Ni-doped 122 family. The temperature dependence of the magnetic relaxation rate suggests a pinning crossover, whereas its magnetic field dependence hints at a vortex-lattice structural phase transition. The activation energy ($U$) estimated using the magnetic relaxation data was analyzed in detail for slightly underdoped, slightly overdoped and overdoped samples, using Maley's method and collective creep theory. Our results confirm that the SMP in these samples is due to the collective (elastic) to plastic creep crossover, as has been observed for the other members of the 122 family. In addition, we also investigated the doping dependence of the critical current density ($J_c$) and the vortex-pinning behaviour in these compounds. The observed $J_c$ is higher than the threshold limit (10$^5$ A/cm$^2$) considered for technological potential and even greater than 1 MA/cm$^2$ for the slightly underdoped sample with Ni content x = 0.092. The pinning characteristics were analyzed in terms of the models developed by Dew-Hughes and Griessen et al., which suggest the dominant role of $\delta l$-type pinning.

preprint2019arXiv

Multiple agile Earth observation satellites, oversubscribed targets scheduling using complex networks theory

Scheduling of Earth observation satellites (EOSs) is of great importance for achieving efficient observation missions. Agile EOSs (AEOSs), with stronger attitude maneuvering capacity, can greatly improve observation efficiency while increasing scheduling complexity. The scheduling problem of multiple AEOSs and oversubscribed targets with multiple observations is addressed, and the potential observation missions are modeled as nodes in a complex network. To solve the problem, an improved feedback structured heuristic is designed by defining the node and target importance factors. On the basis of a real-world Chinese AEOS constellation, simulation experiments are conducted to validate the heuristic's efficiency in comparison with a constructive algorithm and a structured genetic algorithm.

preprint2019arXiv

Photonic realization of quantum resetting

Contrary to the usual assumption of at least partial control of quantum dynamics, a surprising recent result proved that an arbitrary quantum state can be probabilistically reset to a state in the past by having it interact with probing systems in a consistent, but uncontrolled way. We present a photonic implementation to achieve this resetting process, experimentally verifying that a state can be probabilistically reset to its past with a fidelity of $0.870\pm0.012$. We further demonstrate the preservation of an entangled state, which still violates a Bell inequality, after half of the entangled pair was reset. The ability to reset uncontrolled quantum states has implications in the foundations of quantum physics and applications in areas of quantum technology.

preprint2019arXiv

Variational Inference for Sparse Gaussian Process Modulated Hawkes Process

The Hawkes process (HP) has been widely applied to modeling self-exciting events including neuron spikes, earthquakes and tweets. To avoid designing a parametric triggering kernel and to be able to quantify the prediction confidence, the non-parametric Bayesian HP has been proposed. However, the inference of such models suffers from poor scalability or slow convergence. In this paper, we aim to solve both problems. Specifically, first, we propose a new non-parametric Bayesian HP in which the triggering kernel is modeled as a squared sparse Gaussian process. Then, we propose a novel variational inference schema for model optimization. We employ the branching structure of the HP so that maximization of the evidence lower bound (ELBO) is tractable by the expectation-maximization algorithm. We propose a tighter ELBO which improves the fitting performance. Further, we accelerate the novel variational inference schema to linear time complexity by leveraging the stationarity of the triggering kernel. Different from prior acceleration methods, ours enjoys higher efficiency. Finally, we exploit synthetic data and two large social media datasets to evaluate our method. We show that our approach outperforms state-of-the-art non-parametric frequentist and Bayesian methods. We validate the efficiency of our accelerated variational inference schema and practical utility of our tighter ELBO for model selection. We observe that the tighter ELBO exceeds the common one in model selection.
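To make the notion of a triggering kernel concrete, the sketch below evaluates the standard Hawkes conditional intensity $\lambda(t)=\mu+\sum_{t_i<t}\phi(t-t_i)$. It uses an exponential kernel purely as a stand-in: the paper models the kernel as a squared sparse Gaussian process, which is not reproduced here. The function names and parameter values are hypothetical.

```python
import math

def exp_kernel(alpha, beta):
    """Parametric stand-in for the triggering kernel: phi(dt) = alpha * exp(-beta * dt)."""
    return lambda dt: alpha * math.exp(-beta * dt)

def hawkes_intensity(t, events, mu, kernel):
    """Conditional intensity: baseline mu plus the kernel summed over past events."""
    return mu + sum(kernel(t - t_i) for t_i in events if t_i < t)

# Each past event raises the intensity, and the effect decays with elapsed time.
phi = exp_kernel(alpha=0.8, beta=1.0)
events = [0.2, 0.9, 1.1]
for t in (0.1, 1.0, 1.2, 5.0):
    print(t, hawkes_intensity(t, events, mu=0.5, kernel=phi))
```

Replacing `exp_kernel` with any non-negative function of the elapsed time gives a non-parametric variant, and squaring a Gaussian process sample is one way to guarantee that non-negativity.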