Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
80 works
0 followers
46 topics
4 close collaborators

Published work

80 published item(s)

preprint 2026 arXiv

PnP-Corrector: A Universal Correction Framework for Coupled Spatiotemporal Forecasting

Coupled spatiotemporal forecasting is important for predicting the future evolution of multiple interacting dynamical systems, such as in climate models. However, existing methods are severely constrained by the persistent bottleneck of compounding errors. In coupled systems, errors from each subsystem simulator propagate and amplify one another, a phenomenon we term Reciprocal Error Amplification, leading to a rapid collapse of long-range predictions. To address this challenge, we propose a universal framework called PnP-Corrector (Plug-and-Play Corrector). The core idea of our framework is to decouple the physical simulation from the error correction process: it freezes pre-trained physics simulation engines and exclusively trains a correction agent to proactively counteract the systematic biases emerging from the coupled system. Furthermore, we design an efficient predictive model architecture, DSLCast, to serve as the backbone of this framework. Extensive experiments demonstrate that our method significantly enhances the long-term stability and accuracy of coupled forecasting systems. For instance, in the challenging task of a 300-day global ocean-atmosphere coupled forecast, our PnP-Corrector framework reduces the prediction error of the baseline model by 29% and surpasses state-of-the-art models on several key metrics.

preprint 2026 arXiv

RoboAlign-R1: Distilled Multimodal Reward Alignment for Robot Video World Models

Existing robot video world models are typically trained with low-level objectives such as reconstruction and perceptual similarity, which are poorly aligned with the capabilities that matter most for robot decision making, including instruction following, manipulation success, and physical plausibility. They also suffer from error accumulation in long-horizon autoregressive prediction. We present RoboAlign-R1, a framework that combines reward-aligned post-training with stabilized long-horizon inference for robot video world models. We construct RobotWorldBench, a benchmark of 10,000 annotated video-instruction pairs collected from four robot data sources, and train a multimodal teacher judge, RoboAlign-Judge, to provide fine-grained six-dimensional evaluation of generated videos. We then distill the teacher into a lightweight student reward model for efficient reinforcement-learning-based post-training. To reduce long-horizon rollout drift, we further introduce Sliding Window Re-encoding (SWR), a training-free inference strategy that periodically refreshes the generation context. Under our in-domain evaluation protocol, RoboAlign-R1 improves the aggregate six-dimension score by 10.1% over the strongest baseline, including gains of 7.5% on Manipulation Accuracy and 4.6% on Instruction Following; these ranking improvements are further supported by an external VLM-based cross-check and a blinded human study. Meanwhile, SWR improves long-horizon prediction quality with only about 1% additional latency, yielding a 2.8% gain in SSIM and a 9.8% reduction in LPIPS. Together, these results show that reward-aligned post-training and stabilized long-horizon decoding improve task consistency, physical realism, and long-horizon prediction quality in robot video world models.

preprint 2025 arXiv

Bridging the Perception-Cognition Gap: Re-engineering SAM2 with Hilbert-Mamba for Robust VLM-based Medical Diagnosis

Recent studies suggest that Visual Language Models (VLMs) hold great potential for tasks such as automated medical diagnosis. However, processing complex three-dimensional (3D) multimodal medical images poses significant challenges - specifically, the effective integration of complementary information and the occasional oversight of subtle yet critical pathological features. To address these issues, we present a novel two-stage fusion framework termed Hilbert-VLM. This framework leverages the HilbertMed-SAM module for precise lesion segmentation, with the generated multimodal enhanced prompts then guiding the VLM toward accurate disease classification. Our key innovation lies in the systematic redesign of the Segment Anything Model 2 (SAM2) architecture: we incorporate Hilbert space-filling curves into the scanning mechanism of the Mamba State Space Model (SSM) to maximize the preservation of spatial locality in 3D data, a property critical for medical image analysis. We also introduce a novel Hilbert-Mamba Cross-Attention (HMCA) mechanism and a scale-aware decoder to capture fine-grained details. Meanwhile, the prompt enhancement module unifies segmentation masks and their corresponding textual attributes into an information-dense prompt to support VLM inference. Extensive experiments were conducted to validate the effectiveness of the Hilbert-VLM model. On the BraTS2021 segmentation benchmark, it achieves a Dice score of 82.35 percent, with a diagnostic classification accuracy (ACC) of 78.85 percent. These results demonstrate that the proposed model offers substantial potential to improve the accuracy and reliability of medical VLM-based analysis.
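The spatial-locality property that motivates Hilbert-curve scanning can be illustrated in two dimensions. The sketch below uses the classic iterative index-to-coordinate conversion for a 2D Hilbert curve; it is not the paper's 3D scanning mechanism, and the function name `hilbert_d2xy` is a hypothetical name chosen for illustration.

```python
def hilbert_d2xy(n: int, d: int) -> tuple[int, int]:
    """Map index d along a 2D Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    Illustrative 2D sketch only (assumption: the 3D case in the paper differs).
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)   # which half along x at this scale
        ry = 1 & (t ^ rx)   # which half along y at this scale
        if ry == 0:
            if rx == 1:
                # Flip the sub-square so the curve stays continuous.
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x     # Transpose the sub-square.
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


if __name__ == "__main__":
    # Consecutive indices always land on neighbouring grid cells.
    path = [hilbert_d2xy(4, d) for d in range(16)]
    print(path)
```

Because consecutive indices map to adjacent cells, a sequence model that scans voxels in this order sees spatial neighbours close together in the sequence, which a row-by-row raster scan does not guarantee at row boundaries.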

preprint 2024 arXiv

Possible Meissner effect near room temperature in copper-substituted lead apatite

With copper-substituted lead apatite below room temperature, we observe diamagnetic dc magnetization under a magnetic field of 25 Oe with remarkable bifurcation between zero-field-cooling and field-cooling measurements; under 200 Oe it becomes paramagnetic. A glassy memory effect is found during cooling. Hysteresis loops typical of superconductors are detected below 250 K, along with an asymmetry between the forward and backward sweeps of the magnetic field. Our experiment suggests that the Meissner effect is possibly present in this material at room temperature.

preprint 2024 arXiv

Verifying Concurrent Stacks by Divergence-Sensitive Bisimulation

The verification of linearizability -- a key correctness criterion for concurrent objects -- is based on trace refinement, whose checking is PSPACE-complete. This paper proposes using branching bisimulation instead. Our approach is based on comparing an abstract specification, in which object methods are executed atomically, to a real object program. By exploiting divergence sensitivity, the approach also applies to progress properties such as lock-freedom. These results enable the use of polynomial-time divergence-sensitive branching bisimulation checking techniques for verifying linearizability and progress. We conducted experiments on concurrent lock-free stacks to validate the efficiency and effectiveness of our methods.

preprint 2023 arXiv

Room-Temperature Highly-Tunable Coercivity and Highly-Efficient Nonvolatile Multi-States Magnetization Switching by Small Current in Single 2D Ferromagnet Fe$_3$GaTe$_2$

Room-temperature electrically-tuned coercivity and nonvolatile multi-state magnetization switching are crucial for next-generation low-power 2D spintronics. However, most methods have limited ability to adjust the coercivity of ferromagnetic systems, and room-temperature electrically-driven magnetization switching shows high critical current density and high power dissipation. Here, highly-tunable coercivity and highly-efficient nonvolatile multi-state magnetization switching are achieved at room temperature in single-material devices based on the 2D van der Waals itinerant ferromagnet Fe$_3$GaTe$_2$. The coercivity can be readily tuned by up to ~98.06% at 300 K by a tiny in-plane electric field that is 2-5 orders of magnitude smaller than that of other ferromagnetic systems. Moreover, the critical current density and power dissipation for room-temperature magnetization switching in 2D Fe$_3$GaTe$_2$ are down to ~1.7E5 A cm$^{-2}$ and ~4E12 W m$^{-3}$, respectively. Such switching power dissipation is 2-6 orders of magnitude lower than that of other 2D ferromagnetic systems. Meanwhile, multi-state magnetization switching is demonstrated by continuously controlling the current, which can dramatically enhance information storage capacity and enable new computing methodologies. This work opens an avenue for room-temperature electrical control of ferromagnetism and potential applications in vdW-integrated 2D spintronics.

preprint 2023 arXiv

Towards simultaneous coherent radiation in the visible and microwave bands with doped molecular crystals

Coherent sources exploiting the stimulated emission of non-equilibrium quantum systems, i.e. gain media, have proven indispensable for advancing fundamental research and engineering. The operating electromagnetic bands of such coherent sources have been continuously extended to meet increasing demands. Nevertheless, for a single benchtop coherent source, simultaneous generation of radiation in multiple bands with a single gain medium, especially when the bands are widely separated, presents formidable challenges. Here, we propose a mechanism for simultaneously realizing the stimulated emission of radiation in the visible and microwave bands, i.e. lasing and masing actions, at ambient conditions by utilizing photoexcited singlet and triplet states of pentacene molecules doped in p-terphenyl. The possibility is validated by the observed amplified spontaneous emission (ASE) at 645 nm with a narrow linewidth of around 1 nm from the pentacene-doped p-terphenyl crystal used for masing at 1.45 GHz, and consolidated by a 20-fold lower threshold of ASE compared to the reported masing threshold. The overall threshold of the pentacene-based multiband coherent source can be optimized by appropriate alignment of the pump-light polarization with the pentacene's transition dipole moment. Our work not only shows great promise for the immediate realization of multiband coherent sources but also establishes an intriguing solid-state platform for fundamental research in quantum optics across multiple frequency domains.

preprint 2022 arXiv

$β$-Divergence-Based Latent Factorization of Tensors model for QoS prediction

A nonnegative latent factorization of tensors (NLFT) model can capture the temporal patterns hidden in nonnegative quality-of-service (QoS) data and predict the unobserved entries with high accuracy. However, the objective function of existing NLFT models is based on Euclidean distance, which is only a special case of $β$-divergence. Hence, can we build a generalized NLFT model by adopting $β$-divergence to gain prediction accuracy? To tackle this issue, this paper proposes a $β$-divergence-based NLFT model ($β$-NLFT). Its ideas are two-fold: 1) building a learning objective with $β$-divergence to achieve higher prediction accuracy, and 2) implementing self-adaptation of hyper-parameters to improve practicability. Empirical studies on two dynamic QoS datasets demonstrate that, compared with state-of-the-art models, the proposed $β$-NLFT model achieves higher prediction accuracy for unobserved QoS data.
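The statement that Euclidean distance is a special case of $β$-divergence can be checked numerically. The sketch below implements the commonly used scalar definition of $β$-divergence (with the Kullback-Leibler and Itakura-Saito limits at $β=1$ and $β=0$); the function name is hypothetical, and this is not the paper's tensor objective or its hyper-parameter adaptation.

```python
import math


def beta_divergence(x: float, y: float, beta: float) -> float:
    """Scalar beta-divergence d_beta(x | y) for positive x and y.

    Standard textbook definition (assumption: the paper may use a
    different normalisation or sum it over tensor entries).
    """
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")
    if beta == 1:
        # Limit beta -> 1: generalised Kullback-Leibler divergence.
        return x * math.log(x / y) - x + y
    if beta == 0:
        # Limit beta -> 0: Itakura-Saito divergence.
        return x / y - math.log(x / y) - 1.0
    return (x**beta + (beta - 1.0) * y**beta - beta * x * y ** (beta - 1.0)) / (
        beta * (beta - 1.0)
    )


if __name__ == "__main__":
    x, y = 3.0, 1.0
    # beta = 2 reduces to half the squared Euclidean distance.
    print(beta_divergence(x, y, 2.0), 0.5 * (x - y) ** 2)
```

Setting `beta=2.0` recovers half the squared difference, which is the sense in which a Euclidean objective sits inside the $β$-divergence family.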

preprint 2022 arXiv

A note on time hierarchies for reasonable semantic classes without advice

We show time hierarchies for reasonable semantic classes without advice by eliminating the constant bits of advice in previous results. The elimination uses a contrapositive argument: for any reasonable computational model, let $\text{CTIME}(f(n))/{g(n)}$ denote the set of all languages decided by machines running in time $O(f(n))$ with $g(n)$ bits of advice in that model; if $\text{CTIME}(t(n))\subseteq \text{CTIME}(T(n))/{A(n)}$ then $\text{CTIME}(t(n))/a \subseteq \text{CTIME}(T(n))/{a+2^aA(n)}$, where $a$ is a constant integer.

preprint 2022 arXiv

A novel robot calibration method with plane constraint based on dial indicator

With advances in electronics and production technology, industrial robots play an increasingly important role in social services and industrial production. However, due to long-term mechanical wear and structural deformation, their absolute positioning accuracy is low, which greatly hinders the development of the manufacturing industry. Calibrating the kinematic parameters of the robot is an effective way to address this. However, the main measuring equipment, such as laser trackers and coordinate measuring machines, is expensive and requires specially trained operators. Additionally, many environmental factors introduce measurement noise during the measurement process, which affects the calibration accuracy of the robot. To address these issues, we make the following contributions: a) developing a robot calibration method based on a plane constraint to simplify the measurement steps; b) employing the Square-root Cubature Kalman Filter (SCKF) algorithm to reduce the influence of measurement noise; c) proposing a novel algorithm for identifying kinematic parameters based on the SCKF algorithm and the Levenberg-Marquardt (LM) algorithm to achieve high calibration accuracy; d) adopting a dial indicator as the measuring equipment to reduce costs. Extensive experiments verify the effectiveness of the proposed calibration algorithm and experimental platform.

preprint 2022 arXiv

An Optimal Transport Approach to the Computation of the LM Rate

Mismatch capacity characterizes the highest information rate for a channel under a prescribed decoding metric, and is thus a highly relevant fundamental performance metric when dealing with many practically important communication scenarios. Compared with the frequently used generalized mutual information (GMI), the LM rate has been known as a tighter lower bound of the mismatch capacity. The computation of the LM rate, however, has been a difficult task, due to the fact that the LM rate involves a maximization over a function of the channel input, which becomes challenging as the input alphabet size grows, and direct numerical methods (e.g., interior point methods) suffer from intensive memory and computational resource requirements. Noting that the computation of the LM rate can also be formulated as an entropy-based optimization problem with constraints, in this work, we transform the task into an optimal transport (OT) problem with an extra constraint. This allows us to efficiently and accurately accomplish our task by using the well-known Sinkhorn algorithm. Indeed, only a few iterations are required for convergence, due to the fact that the formulated problem does not contain additional regularization terms. Moreover, we convert the extra constraint into a root-finding procedure for a one-dimensional monotonic function. Numerical experiments demonstrate the feasibility and efficiency of our OT approach to the computation of the LM rate.
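For readers unfamiliar with the Sinkhorn algorithm named above, the following is a minimal sketch of the standard Sinkhorn matrix-scaling iteration: it alternately rescales the rows and columns of a positive kernel matrix until the result matches prescribed marginals. It does not reproduce the paper's LM-rate formulation or its extra constraint; the function name, kernel and marginals are illustrative assumptions.

```python
def sinkhorn_scale(kernel, row_marginal, col_marginal, iterations=200):
    """Return diag(u) * kernel * diag(v) with the requested row/column sums.

    kernel: positive matrix as a list of lists (hypothetical example input).
    row_marginal, col_marginal: positive vectors with equal total mass.
    """
    n, m = len(kernel), len(kernel[0])
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Row step: choose u so that every row sum matches row_marginal.
        u = [
            row_marginal[i] / sum(kernel[i][j] * v[j] for j in range(m))
            for i in range(n)
        ]
        # Column step: choose v so that every column sum matches col_marginal.
        v = [
            col_marginal[j] / sum(kernel[i][j] * u[i] for i in range(n))
            for j in range(m)
        ]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


if __name__ == "__main__":
    plan = sinkhorn_scale([[1.0, 0.5], [0.5, 1.0]], [0.3, 0.7], [0.6, 0.4])
    print([sum(row) for row in plan])        # row sums
    print([sum(col) for col in zip(*plan)])  # column sums
```

Each iteration costs only matrix-vector products, which is why this style of method scales more gracefully with alphabet size than a generic interior-point solver.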

preprint 2022 arXiv

Aspect-driven User Preference and News Representation Learning for News Recommendation

News recommender systems are essential for helping users efficiently and effectively find interesting news among a large volume of articles. Most existing news recommender systems learn topic-level representations of users and news for recommendation, and neglect the more informative aspect-level features of users and news that would enable more accurate recommendation. As a result, they achieve limited recommendation performance. To address this deficiency, we propose a novel Aspect-driven News Recommender System (ANRS) built on aspect-level user preference and news representation learning. Here, a news aspect is fine-grained semantic information expressed by a set of related words, which indicates a specific aspect described by the news. In ANRS, a news aspect-level encoder and a user aspect-level encoder are devised to learn fine-grained aspect-level representations of the user's preferences and news characteristics respectively, which are fed into a click predictor to estimate the probability of the user clicking the candidate news. Extensive experiments on the commonly used real-world dataset MIND demonstrate the superiority of our method compared with representative and state-of-the-art methods.

preprint 2022 arXiv

Asymptotically Optimal Locally Private Heavy Hitters via Parameterized Sketches

We present two new local differentially private algorithms for frequency estimation. One solves the fundamental frequency oracle problem; the other solves the well-known heavy hitters identification problem. Consistent with prior art, these are randomized algorithms. As a function of failure probability $β$, the former achieves optimal worst-case estimation error for every $β$, while the latter is optimal when $β$ is at least inverse polynomial in $n$, the number of users. In both algorithms, server running time is $\tilde{O}(n)$ while user running time is $\tilde{O}(1)$. Our frequency-oracle algorithm achieves lower estimation error than the prior works of Bassily et al. (NeurIPS 2017). On the other hand, our heavy hitters identification method is as easily implementable as TreeHist (Bassily et al., 2017) and has superior worst-case error, by a factor of $Ω(\sqrt{\log n})$.

preprint 2022 arXiv

Automation Slicing and Testing for in-App Deep Learning Models

Intelligent Apps (iApps), equipped with in-App deep learning (DL) models, are emerging to offer stable DL inference services. However, App marketplaces have trouble automatically testing iApps because the in-App model is a black box and is coupled with ordinary code. In this work, we propose an automated tool, ASTM, which enables large-scale testing of in-App models. ASTM takes an iApp as input, and its outputs can replace the in-App model as the test object. ASTM proposes two reconstruction techniques to translate the in-App model to a backpropagation-enabled version and reconstruct the IO processing code for DL inference. With ASTM's help, we perform a large-scale study on the robustness of 100 unique commercial in-App models and find that 56% of in-App models are vulnerable to robustness issues in our context. ASTM also detects physical attacks against three representative iApps that may cause economic losses and security issues.

preprint 2022 arXiv

AVMiner: Expansible and Semantic-Preserving Anti-Virus Labels Mining Method

With the increase in the variety and quantity of malware, there is an urgent need to speed up the diagnosis and analysis of malware. Extracting malware family-related tokens from AV (Anti-Virus) labels, provided by online anti-virus engines, paves the way for pre-diagnosing malware. Automatically extracting the vital information from AV labels will greatly enhance the detection ability of security enterprises and strengthen the research ability of security analysts. Recent works like AVCLASS and AVCLASS2 try to extract the attributes of malware from AV labels and establish a taxonomy based on expert knowledge. However, due to the uncertain trend of complicated malicious behaviors, a system needs the following abilities to face the challenge: preserving vital semantics, being expansible, and being free from expert knowledge. In this work, we present AVMiner, an expansible malware tagging system that can mine the most vital tokens from AV labels. AVMiner adopts natural language processing techniques and clustering methods to generate, without expert knowledge, a sequence of tokens ranked by importance. AVMiner can self-update when new samples arrive. Finally, we evaluate AVMiner on over 8,000 samples from well-known datasets with manually labeled ground truth, where it outperforms previous works.

preprint 2022 arXiv

Berry-Esseen bounds with targets and Local Limit Theorems for products of random matrices

Let $μ$ be a probability measure on $\text{GL}_d(\mathbb R)$ and denote by $S_n:= g_n \cdots g_1$ the associated random matrix product, where the $g_j$ are i.i.d. with law $μ$. We study statistical properties of random variables of the form $$σ(S_n,x) + u(S_n x),$$ where $x \in \mathbb P^{d-1}$, $σ$ is the norm cocycle and $u$ belongs to a class of admissible functions on $\mathbb P^{d-1}$ with values in $\mathbb R \cup \{\pm \infty\}$. Assuming that $μ$ has a finite exponential moment and generates a proximal and strongly irreducible semigroup, we obtain optimal Berry-Esseen bounds and the Local Limit Theorem for such variables using a large class of observables on $\mathbb R$ and Hölder continuous target functions on $\mathbb P^{d-1}$. As particular cases, we obtain new limit theorems for $σ(S_n,x)$ and for the coefficients of $S_n$.

preprint 2022 arXiv

Beyond the Limitation of Pulse Width in Optical Time-domain Reflectometry

Optical time-domain reflectometry (OTDR) is the basis for distributed time-domain optical fiber sensing techniques. By injecting pulse light into an optical fiber, the distance information of an event can be obtained based on the time of light flight. The minimum distinguishable event separation along the fiber length is called the spatial resolution, which is determined by the optical pulse width. By reducing the pulse width, the spatial resolution can be improved. However, at the same time, the signal-to-noise ratio of the system is degraded, and higher speed equipment is required. To solve this problem, data processing methods such as iterative subdivision, deconvolution, and neural networks have been proposed. However, they all have some shortcomings and thus have not been widely applied. Here, we propose and experimentally demonstrate an OTDR deconvolution neural network based on deep convolutional neural networks. A simplified OTDR model is built to generate a large amount of training data. By optimizing the network structure and training data, an effective OTDR deconvolution is achieved. The simulation and experimental results show that the proposed neural network can achieve more accurate deconvolution than the conventional deconvolution algorithm with a higher signal-to-noise ratio.
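The dependence of spatial resolution on pulse width described above follows the standard two-way time-of-flight relation $Δz = cτ/(2n)$, where $τ$ is the pulse width and $n$ the group index of the fiber. The sketch below evaluates it; the default group index of 1.468 is an assumed typical value for silica fiber, not a figure from the paper, and the function names are hypothetical.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum


def spatial_resolution_m(pulse_width_s: float, group_index: float = 1.468) -> float:
    """Minimum distinguishable event separation for a given pulse width.

    The factor of 2 accounts for the round trip of the backscattered light.
    group_index default is an assumption (typical silica fiber).
    """
    return C_VACUUM_M_PER_S * pulse_width_s / (2.0 * group_index)


def event_distance_m(round_trip_time_s: float, group_index: float = 1.468) -> float:
    """Distance to an event from the measured round-trip time of flight."""
    return C_VACUUM_M_PER_S * round_trip_time_s / (2.0 * group_index)


if __name__ == "__main__":
    for width_ns in (100.0, 10.0, 1.0):
        resolution = spatial_resolution_m(width_ns * 1e-9)
        print(f"{width_ns:6.1f} ns pulse -> {resolution:.3f} m resolution")
```

Under this assumption a 10 ns pulse gives roughly 1 m of resolution, so each tenfold improvement requires a tenfold shorter pulse, which is the trade-off against signal-to-noise ratio that the abstract describes.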

preprint 2022 arXiv

Cavity Quantum Electrodynamics Effects of Optically Cooled Nitrogen-Vacancy Centers Coupled to a High Frequency Microwave Resonator

Recent experiments demonstrated the cooling of a microwave mode of a high-quality dielectric resonator coupled to optically cooled nitrogen-vacancy (NV) spins in diamond. Our recent theoretical study [arXiv:2110.10950] pointed out that the cooled NV spins can be used to realize cavity quantum electrodynamics (CQED) effects at room temperature. In this article, we propose to modify the setup used in a recent diamond maser experiment [Nature 555, 493-496 (2018)], which features a higher spin transition frequency, a lower spin-dephasing rate and a stronger NV spins-resonator coupling, to realize better microwave mode cooling and the room-temperature CQED effects. To describe more precisely the optical spin cooling and the collective spin-resonator coupling, we extend the standard Jaynes-Cummings model to account for the rich electronic and spin levels of the NV centers. Our calculations show that for the proposed setup it is possible to cool the microwave mode from $293$ K (room temperature) to $116$ K, which is about $72$ K lower than the previous records, and to study the intriguing dynamics of the CQED effects under the weak-to-strong coupling transition by varying the laser power. With simple modifications, our model can be applied to, e.g., other solid-state spins or triplet spins of pentacene molecules, and to investigate other effects, such as the operations of pulsed and continuous-wave masing.

preprint 2022 arXiv

CGN: A Capacity-Guaranteed Network Architecture for Future Ultra-Dense Wireless Systems

The sixth generation (6G) era is envisioned to be a fully intelligent and autonomous era, with physical and digital lifestyles merged together. Future wireless network architectures should provide solid support for such new lifestyles. A key question thus arises: what kind of network architecture is suitable for 6G? In this paper, we propose a capacity-guaranteed network (CGN) architecture, which provides high capacity for wireless devices densely distributed everywhere, and simultaneously ensures superior scalability with low signaling overhead and computational complexity. Our theorem proves that the essence of a CGN architecture is to decompose the whole network into non-overlapping clusters with equal cluster sum capacity. Simulation results reveal that, in terms of the minimum cluster sum capacity, the proposed CGN can achieve at least a 30% performance gain compared with existing base station clustering (BS-clustering) architectures. In addition, our theorem is sufficiently general and can be applied to networks with different distributions of BSs and users.

preprint 2022 arXiv

CODE-MVP: Learning to Represent Source Code from Multiple Views with Contrastive Pre-Training

Recent years have witnessed increasing interest in code representation learning, which aims to represent the semantics of source code into distributed vectors. Currently, various works have been proposed to represent the complex semantics of source code from different views, including plain text, Abstract Syntax Tree (AST), and several kinds of code graphs (e.g., Control/Data Flow Graph). However, most of them only consider a single view of source code independently, ignoring the correspondences among different views. In this paper, we propose to integrate different views with the natural-language description of source code into a unified framework with Multi-View contrastive Pre-training, and name our model as CODE-MVP. Specifically, we first extract multiple code views using compiler tools, and learn the complementary information among them under a contrastive learning framework. Inspired by the type checking in compilation, we also design a fine-grained type inference objective in the pre-training. Experiments on three downstream tasks over five datasets demonstrate the superiority of CODE-MVP when compared with several state-of-the-art baselines. For example, we achieve 2.4/2.3/1.1 gain in terms of MRR/MAP/Accuracy metrics on natural language code retrieval, code similarity, and code defect detection tasks, respectively.

preprint 2022 arXiv

Compilable Neural Code Generation with Compiler Feedback

Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering. Existing deep-learning approaches model code generation as text generation, either constrained by grammar structures in decoder, or driven by pre-trained language models on large-scale code corpus (e.g., CodeGPT, PLBART, and CodeT5). However, few of them account for compilability of the generated programs. To improve compilability of the generated programs, this paper proposes COMPCODER, a three-stage pipeline utilizing compiler feedback for compilable code generation, including language model fine-tuning, compilability reinforcement, and compilability discrimination. Comprehensive experiments on two code generation tasks demonstrate the effectiveness of our proposed approach, improving the success rate of compilation from 44.18 to 89.18 in code completion on average and from 70.3 to 96.2 in text-to-code generation, respectively, when comparing with the state-of-the-art CodeGPT.

preprint 2022 arXiv

Contrastive Vision-Language Pre-training with Limited Resources

Pioneering dual-encoder pre-training works (e.g., CLIP and ALIGN) have revealed the potential of aligning multi-modal representations with contrastive learning. However, these works require a tremendous amount of data and computational resources (e.g., billion-level web data and hundreds of GPUs), which prevents researchers with limited resources from reproducing and further exploring them. To this end, we propose a stack of novel methods, which significantly cut down the heavy resource dependency and allow us to conduct dual-encoder multi-modal representation alignment with limited resources. Besides, we provide a reproducible baseline with competitive results, namely ZeroVL, with only 14M publicly accessible academic datasets and 8 V100 GPUs. Additionally, we collect 100M web data for pre-training, and achieve results comparable or superior to state-of-the-art methods, further proving the effectiveness of our methods on large-scale data. We hope that this work will provide useful data points and experience for future research in contrastive vision-language pre-training. Code is available at https://github.com/zerovl/ZeroVL.

preprint 2022 arXiv

Digging into Primary Financial Market: Challenges and Opportunities of Adopting Blockchain

Since the emergence of blockchain technology, its application in the financial market has been an area of focus and exploration by all parties. With characteristics such as anonymity, trust, and tamper resistance, blockchain technology can effectively solve some problems faced by the financial market, such as trust issues and information asymmetry. To understand the application scenarios of blockchain in the financial market in depth, securities issuance and trading in the primary market must be studied carefully. We conducted an empirical study to investigate the main difficulties faced by primary market participants in their business practices and the potential challenges of the deepening application of blockchain technology in the primary market. We adopted a hybrid method combining interviews (qualitative methods) and surveys (quantitative methods) to conduct this research in two stages. In the first stage, we interviewed 15 major primary market participants with different backgrounds and expertise. In the second stage, we conducted a verification survey of 54 primary market practitioners to confirm various insights from the interviews, including challenges and desired improvements. Our interview and survey results revealed several significant challenges facing blockchain applications in the primary market: complex due diligence, mismatch, and difficult monitoring. On this basis, we believe that our future research can focus on some aspects of these challenges.

preprint 2022 arXiv

Distinctive Image Captioning via CLIP Guided Group Optimization

Image captioning models are usually trained on human-annotated ground-truth captions, which tends to yield accurate but generic captions. In this paper, we focus on generating distinctive captions that can distinguish the target image from other similar images. To evaluate the distinctiveness of captions, we introduce a series of metrics that use the large-scale vision-language pre-training model CLIP to quantify distinctiveness. To further improve the distinctiveness of captioning models, we propose a simple and effective training strategy that trains the model by comparing the target image with a group of similar images and optimizing the group embedding gap. Extensive experiments are conducted on various baseline models to demonstrate the wide applicability of our strategy and the consistency of the metric results with human evaluation. By comparing the performance of our best model with existing state-of-the-art models, we claim that our model achieves a new state of the art on the distinctiveness objective.

preprint2022arXiv

Dual Power Spectrum Manifold and Toeplitz HPD Manifold: Enhancement and Analysis for Matrix CFAR Detection

Recently, an innovative matrix CFAR detection scheme based on information geometry, also referred to as the geometric detector, has been developed speedily and exhibits distinct advantages in several practical applications. These advantages benefit from the geometry of the Toeplitz Hermitian positive definite (HPD) manifold $\mathcal{M}_{\mathcal{T}H_{++}}$, but the sophisticated geometry also results in some challenges for geometric detectors, such as the implementation of the enhanced detector to improve the SCR (signal-to-clutter ratio) and the analysis of the detection performance. To meet these challenges, this paper develops the dual power spectrum manifold $\mathcal{M}_{\text{P}}$ as the dual space of $\mathcal{M}_{\mathcal{T}H_{++}}$. For each affine invariant geometric measure on $\mathcal{M}_{\mathcal{T}H_{++}}$, we show that there exists an equivalent function named induced potential function on $\mathcal{M}_{\text{P}}$. By the induced potential function, the measurements of the dissimilarity between two matrices can be implemented on $\mathcal{M}_{\text{P}}$, and the geometric detectors can be reformulated as the form related to the power spectrum. The induced potential function leads to two contributions: 1) The enhancement of the geometric detector, which is formulated as an optimization problem concerning $\mathcal{M}_{\mathcal{T}H_{++}}$, is transformed to an equivalent and simpler optimization on $\mathcal{M}_{\text{P}}$. In the presented example of the enhancement, the closed-form solution, instead of the gradient descent method, is provided through the equivalent optimization. 2) The detection performance is analyzed based on $\mathcal{M}_{\text{P}}$, and the advantageous characteristics, which benefit the detection performance, can be deduced by analyzing the corresponding power spectrum to the maximal point of the induced potential function.

preprint2022arXiv

Edge YOLO: Real-Time Intelligent Object Detection System Based on Edge-Cloud Cooperation in Autonomous Vehicles

Driven by the ever-increasing requirements of autonomous vehicles, such as traffic monitoring and driving assistance, deep learning-based object detection (DL-OD) has been increasingly attractive in intelligent transportation systems. However, it is difficult for the existing DL-OD schemes to realize the responsible, cost-saving, and energy-efficient autonomous vehicle systems due to their inherent defects of low timeliness and high energy consumption. In this paper, we propose an object detection (OD) system based on edge-cloud cooperation and reconstructive convolutional neural networks, which is called Edge YOLO. This system can effectively avoid the excessive dependence on computing power and uneven distribution of cloud computing resources. Specifically, it is a lightweight OD framework realized by combining pruning feature extraction network and compression feature fusion network to enhance the efficiency of multi-scale prediction to the largest extent. In addition, we developed an autonomous driving platform equipped with NVIDIA Jetson for system-level verification. We experimentally demonstrate the reliability and efficiency of Edge YOLO on COCO2017 and KITTI data sets, respectively. According to COCO2017 standard datasets with a speed of 26.6 frames per second (FPS), the results show that the number of parameters in the entire network is only 25.67 MB, while the accuracy (mAP) is up to 47.3%.

preprint2022arXiv

Enhanced quantum sensing with room-temperature solid-state masers

Quantum sensing with solid-state systems finds broad applications in diverse areas ranging from material and biomedical sciences to fundamental physics. Several solid-state spin sensors have been developed, facilitating the ultra-sensitive detection of physical quantities such as magnetic and electric fields and temperature. Exploiting collective behaviour of non-interacting spins holds the promise of pushing the detection limit to even lower levels, while to date, those levels are scarcely reached due to the broadened linewidth and inefficient readout of solid-state spin ensembles. Here, we experimentally demonstrate that such drawbacks can be overcome by newly reborn maser technology at room temperature in the solid state. Owing to maser action, we observe a 4-fold reduction in the inhomogeneously broadened linewidth of a molecular spin ensemble, which is narrower than the same measured from single spins at cryogenic temperatures. The maser-based readout applied to magnetometry showcases a signal-to-noise ratio (SNR) of 30 dB for single shots. This technique would be a significant addition to the toolbox for boosting the sensitivity of solid-state ensemble spin sensors.

preprint2022arXiv

Ergodic Deviations of Degenerate Multidimensional Actions -- Symmetric Convex Bodies

We prove that the ergodic deviation of a degenerate $\mathbb{Z}^2$-action on the torus $\mathbb{T}^2$ relative to a symmetric, strictly convex body can be decomposed into two parts, and that each part admits a limit distribution after choosing a suitable normalizer. Specifically, the first part is similar to an ergodic sum of smooth observables after being normalized by $N$, and the second part is similar to the case of a random toral translation, i.e., the $\mathbb{Z}$-action, but with a normalizer of $N^{\frac{1}{2}}$. The key difference is that we employ the product flow on the product space of $\mathbb{Z}^2$ lattices for the multidimensional action.

preprint2022arXiv

Experimental test of Tsirelson's bound with a single photonic qubit

For many protocols, quantum strategies have advantages over their classical counterparts, and these advantages have attracted considerable interest and many applications. One famous example is the Clauser-Horne-Shimony-Holt (CHSH) game, which recasts Bell's theorem~\cite{2} into the framework of a game. In the CHSH game, two space-like separated players, Alice and Bob, are assigned classical bits $a$ and $b$, respectively. They then return bits $x$ and $y$ according to some pre-agreed strategy. They win the game when $x\oplus y= a\cdot b$. If the players use classical strategies, the optimal success probability is $w(\text{CHSH})=0.75$. However, if they use quantum resources, the success probability increases up to a maximal value of $\cos^2(\pi/8)$, which is known as Tsirelson's bound. Moreover, Popescu and Rohrlich noted that the perfect success probability $1$ can also be achieved in a more general theory without violating the no-signaling assumption.
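The two success probabilities quoted in this abstract can be checked with a short sketch. It assumes the inputs $a$ and $b$ are uniformly random, and it enumerates only deterministic classical strategies, which is sufficient because shared randomness is a mixture of deterministic strategies and cannot raise the optimum. The function names are illustrative and do not come from the paper.

```python
import math
from itertools import product


def chsh_classical_optimum():
    """Best CHSH win probability over all deterministic classical strategies."""
    best = 0.0
    # A deterministic strategy is a pair of lookup tables: x = f[a], y = g[b].
    for f in product((0, 1), repeat=2):
        for g in product((0, 1), repeat=2):
            # Count the input pairs (a, b) on which x XOR y equals a AND b.
            wins = sum((f[a] ^ g[b]) == (a & b) for a in (0, 1) for b in (0, 1))
            best = max(best, wins / 4)
    return best


def tsirelson_success_probability():
    """Quantum optimum cos^2(pi/8) for the CHSH game."""
    return math.cos(math.pi / 8) ** 2


if __name__ == "__main__":
    print("classical optimum:", chsh_classical_optimum())
    print("quantum optimum:  ", tsirelson_success_probability())
```

The enumeration covers all 16 pairs of lookup tables, so the classical value is found by exhaustive search rather than assumed.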

preprint2022arXiv

Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction

Target-oriented Opinion Words Extraction (TOWE) is a fine-grained sentiment analysis task that aims to extract the corresponding opinion words of a given opinion target from the sentence. Recently, deep learning approaches have made remarkable progress on this task. Nevertheless, the TOWE task still suffers from the scarcity of training data due to the expensive data annotation process. Limited labeled data increase the risk of distribution shift between test data and training data. In this paper, we propose exploiting massive unlabeled data to reduce the risk by increasing the exposure of the model to varying distribution shifts. Specifically, we propose a novel Multi-Grained Consistency Regularization (MGCR) method to make use of unlabeled data and design two filters specifically for TOWE to filter noisy data at different granularity. Extensive experimental results on four TOWE benchmark datasets indicate the superiority of MGCR compared with current state-of-the-art methods. The in-depth analysis also demonstrates the effectiveness of the different-granularity filters. Our codes are available at https://github.com/TOWESSL/TOWESSL.

preprint2022arXiv

Fast Sinkhorn II: Collinear Triangular Matrix and Linear Time Accurate Computation of Optimal Transport

In our previous work [arXiv:2202.10042], the complexity of Sinkhorn iteration is reduced from $O(N^2)$ to the optimal $O(N)$ by leveraging the special structure of the kernel matrix. In this paper, we explore the special structure of kernel matrices by defining and utilizing the properties of the Lower-ColLinear Triangular Matrix (L-CoLT matrix) and Upper-ColLinear Triangular Matrix (U-CoLT matrix). We prove that (1) L/U-CoLT matrix-vector multiplications can be carried out in $O(N)$ operations; (2) both families of matrices are closed under the Hadamard product and matrix scaling. These properties help to alleviate two key difficulties for reducing the complexity of the Inexact Proximal point method (IPOT), and allow us to significantly reduce the number of iterations to $O(N)$. This yields the Fast Sinkhorn II (FS-2) algorithm for accurate computation of optimal transport with low algorithm complexity and fast convergence. Numerical experiments are presented to show the effectiveness and efficiency of our approach.
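The abstract does not define the L-CoLT structure, so the following is only an illustration of why a structured triangular matrix-vector product can cost $O(N)$ instead of $O(N^2)$. It assumes a hypothetical lower-triangular matrix with entries $L_{ij} = \lambda^{i-j}$ for $i \ge j$, which is chosen for simplicity and is not necessarily the class of matrices treated in the paper. For this matrix, $(Lv)_i = v_i + \lambda\,(Lv)_{i-1}$, so each output entry reuses the previous one.

```python
def geometric_lower_matvec(lam, v):
    """Compute L @ v in O(N), where L[i][j] = lam ** (i - j) for i >= j (assumed structure)."""
    out = []
    acc = 0.0
    for x in v:
        # Recurrence: (L v)_i = v_i + lam * (L v)_{i-1}
        acc = lam * acc + x
        out.append(acc)
    return out


def dense_lower_matvec(lam, v):
    """Reference O(N^2) product with the same matrix, used only for checking."""
    n = len(v)
    return [sum(lam ** (i - j) * v[j] for j in range(i + 1)) for i in range(n)]


if __name__ == "__main__":
    v = [1.0, 2.0, 3.0]
    print(geometric_lower_matvec(0.5, v))
    print(dense_lower_matvec(0.5, v))
```

The fast routine performs one multiplication and one addition per entry, whereas the dense reference touches every entry of the lower triangle.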

preprint2022arXiv

FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling

The recently proposed FixMatch achieved state-of-the-art results on most semi-supervised learning (SSL) benchmarks. However, like other modern SSL algorithms, FixMatch uses a pre-defined constant threshold for all classes to select unlabeled data that contribute to the training, thus failing to consider different learning status and learning difficulties of different classes. To address this issue, we propose Curriculum Pseudo Labeling (CPL), a curriculum learning approach to leverage unlabeled data according to the model's learning status. The core of CPL is to flexibly adjust thresholds for different classes at each time step to let pass informative unlabeled data and their pseudo labels. CPL does not introduce additional parameters or computations (forward or backward propagation). We apply CPL to FixMatch and call our improved algorithm FlexMatch. FlexMatch achieves state-of-the-art performance on a variety of SSL benchmarks, with especially strong performances when the labeled data are extremely limited or when the task is challenging. For example, FlexMatch achieves 13.96% and 18.96% error rate reduction over FixMatch on CIFAR-100 and STL-10 datasets respectively, when there are only 4 labels per class. CPL also significantly boosts the convergence speed, e.g., FlexMatch can use only 1/5 training time of FixMatch to achieve even better performance. Furthermore, we show that CPL can be easily adapted to other SSL algorithms and remarkably improve their performances. We open-source our code at https://github.com/TorchSSL/TorchSSL.
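The abstract does not spell out how the per-class thresholds are adjusted, so the sketch below shows one simple possibility rather than the authors' exact rule. It assumes that each class threshold is the fixed base threshold scaled by that class's share of confident predictions relative to the best-learned class. The function name, the default `base_threshold=0.95`, and the fallback when no prediction is confident are all assumptions made for illustration.

```python
def flexible_thresholds(confidences, predictions, num_classes, base_threshold=0.95):
    """Per-class thresholds scaled by an estimate of each class's learning status.

    confidences: max predicted probability for each unlabeled sample
    predictions: predicted class index for each unlabeled sample
    """
    # Learning status of a class: number of samples predicted as that class
    # with confidence above the fixed base threshold.
    counts = [0] * num_classes
    for conf, pred in zip(confidences, predictions):
        if conf > base_threshold:
            counts[pred] += 1

    top = max(counts)
    if top == 0:
        # Assumed fallback: with no confident predictions, keep the fixed threshold.
        return [base_threshold] * num_classes

    # Well-learned classes keep a threshold near the base value;
    # poorly learned classes get a lower one so more of their samples pass.
    return [base_threshold * c / top for c in counts]


if __name__ == "__main__":
    confs = [0.99, 0.97, 0.96, 0.50]
    preds = [0, 0, 1, 2]
    print(flexible_thresholds(confs, preds, num_classes=3))
```

Because the thresholds are derived from counts the model already produces, this kind of adjustment needs no extra parameters or forward and backward passes, which matches the property claimed in the abstract.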

preprint2022arXiv

Free Component Analysis: Theory, Algorithms & Applications

We describe a method for unmixing mixtures of freely independent random variables in a manner analogous to the independent component analysis (ICA) based method for unmixing independent random variables from their additive mixtures. Random matrices play the role of free random variables in this context so the method we develop, which we call Free component analysis (FCA), unmixes matrices from additive mixtures of matrices. Thus, while the mixing model is standard, the novelty and difference in unmixing performance comes from the introduction of new statistical criteria, derived from free probability theory, that quantify freeness analogous to how kurtosis and entropy quantify independence. We describe the theory, the various algorithms, and compare FCA to vanilla ICA which does not account for spatial or temporal structure. We highlight why the statistical criteria make FCA also vanilla despite its matricial underpinnings and show that FCA performs comparably to, and sometimes better than, (vanilla) ICA in every application, such as image and speech unmixing, where ICA has been known to succeed. Our computational experiments suggest that not-so-random matrices, such as images and short-time Fourier transform matrices of waveforms, are (closer to being) freer "in the wild" than we might have theoretically expected.

preprint2022arXiv

Joint Offloading Decision and Resource Allocation for Vehicular Fog-Edge Computing Networks: A Contract-Stackelberg Approach

With the popularity of mobile devices and development of computationally intensive applications, researchers are focusing on offloading computation to Mobile Edge Computing (MEC) servers due to their high computational efficiency and low communication delay. As the computing resources of an MEC server are limited, vehicles in urban areas that have abundant idle resources should be fully utilized. However, offloading computing tasks to vehicles faces many challenging issues. In this paper, we introduce a vehicular fog-edge computing paradigm and formulate it as a multi-stage Stackelberg game to deal with these issues. Specifically, vehicles are not obligated to share resources, let alone disclose their private information (e.g., stay time and the amount of resources). Therefore, in the first stage, we design a contract-based incentive mechanism to motivate vehicles to contribute their idle resources. Next, due to the complicated interactions among vehicles, road-side unit (RSU), MEC server and mobile device users, it is challenging to coordinate the resources of all parties and design a transaction mechanism to make all entities benefit. In the second and third stages, based on Stackelberg game, we develop pricing strategies that maximize the utilities of all parties. The analytical forms of optimal strategies for each stage are given. Simulation results demonstrate the effectiveness of our proposed incentive mechanism, reveal the trends of energy consumption and offloading decisions of users with various parameters, and present the performance comparison between our framework and existing MEC offloading paradigm in vehicular networks.

preprint2022arXiv

LiGCN: Label-interpretable Graph Convolutional Networks for Multi-label Text Classification

Multi-label text classification (MLTC) is an attractive and challenging task in natural language processing (NLP). Compared with single-label text classification, MLTC has a wider range of applications in practice. In this paper, we propose a label-interpretable graph convolutional network model to solve the MLTC problem by modeling tokens and labels as nodes in a heterogeneous graph. In this way, we are able to take into account multiple relationships including token-level relationships. Besides, the model allows better interpretability for predicted labels as the token-label edges are exposed. We evaluate our method on four real-world datasets and it achieves competitive scores against selected baseline methods. Specifically, this model achieves a gain of 0.14 on the F1 score in the small label set MLTC, and 0.07 in the large label set scenario.

preprint2022arXiv

Mobility Support for Millimeter Wave Communications: Opportunities and Challenges

Millimeter-wave (mmWave) communication technology offers a potential and promising solution to support 5G and B5G wireless networks in dynamic scenarios and applications. However, mobility introduces many challenges as well as opportunities to mmWave applications. To address these problems, we conduct a survey of the opportunities and technologies to support mmWave communications in mobile scenarios. Firstly, we summarize the mobile scenarios where mmWave communications are exploited, including indoor wireless local area network (WLAN) or wireless personal area network (WPAN), cellular access, vehicle-to-everything (V2X), high speed train (HST), unmanned aerial vehicle (UAV), and the new space-air-ground-sea communication scenarios. Then, to address users' mobility impact on the system performance in different application scenarios, we introduce several representative mobility models in mmWave systems, including human mobility, vehicular mobility, high speed train mobility and ship mobility. Next we survey the key challenges and existing solutions to mmWave applications, such as channel modeling, channel estimation, anti-blockage, and capacity improvement. Lastly, we discuss the open issues concerning mobility-aware mmWave communications that deserve further investigation. In particular, we highlight future heterogeneous mobile networks, dynamic resource management, artificial intelligence (AI) for mobility and integration of geographical information, deployment of large intelligent surface and reconfigurable antenna technology, and finally, the evolution to Terahertz (THz) communications.

preprint2022arXiv

On the Mass-Conserving Allen-Cahn Approximation for Incompressible Binary Fluids

This paper is devoted to the global well-posedness of two Diffuse Interface systems modeling the motion of an incompressible two-phase fluid mixture in the presence of capillarity effects in a bounded smooth domain $Ω\subset \mathbb{R}^d$, $d=2,3$. We focus on dissipative mixing effects originating from the mass-conserving Allen-Cahn dynamics with the physically relevant Flory-Huggins potential. More precisely, we study the mass-conserving Navier-Stokes-Allen-Cahn system for nonhomogeneous fluids and the mass-conserving Euler-Allen-Cahn system for homogeneous fluids. We prove existence and uniqueness of global weak and strong solutions as well as their property of separation from the pure states. In our analysis, we combine the energy and entropy estimates, a novel end-point estimate of the product of two functions, a new estimate for the Stokes problem with non-constant viscosity, and logarithmic type Gronwall arguments.

preprint2022arXiv

Physics-embedded inverse analysis with automatic differentiation for the earth's subsurface

Inverse analysis has been utilized to understand unknown underground geological properties by matching the observational data with simulators. To overcome the underconstrained nature of inverse problems and achieve good performance, an approach is presented with embedded physics and a technique known as automatic differentiation. We use a physics-embedded generative model, which takes statistically simple parameters as input and outputs subsurface properties (e.g., permeability or P-wave velocity), that embeds physical knowledge of the subsurface properties into inverse analysis and improves its performance. We tested the application of this approach on four geologic problems: two heterogeneous hydraulic conductivity fields, a hydraulic fracture network, and a seismic inversion for P-wave velocity. This physics-embedded inverse analysis approach consistently characterizes these geological problems accurately. Furthermore, the excellent performance in matching the observational data demonstrates the reliability of the proposed method. Moreover, the application of automatic differentiation makes this an easy and fast approach to inverse analysis when dealing with complicated geological structures.

preprint2022arXiv

Production of neutron-rich actinide nuclides in isobaric collisions via multinucleon transfer reactions

We have systematically calculated the multinucleon transfer reactions of $^{208}$Os, $^{208}$Pt, $^{208}$Hg, $^{208}$Pb, $^{208}$Po, $^{208}$Rn, $^{208}$Ra, and $^{132,136}$Xe bombarding $^{232}$Th and $^{248}$Cm at Coulomb barrier energies within the dinuclear system model. The results are in good agreement with the available experimental data. The Coulomb effect and shell effect on the production of actinides in these reactions have been investigated thoroughly. The potential energy surface (PES) and total kinetic energy (TKE) mass distributions in the reactions of $^{208}$Hg, $^{208}$Pb and $^{208}$Po colliding with $^{248}$Cm and $^{232}$Th are calculated and analyzed, respectively. It is found that the PES and TKE spectra manifest the fragment formation mechanism in the multinucleon transfer reactions. The isospin effect and shell effect are shown in the PES and TKE. Production cross-sections of multinucleon transfer products are highly dependent on the isobar projectiles with mass number $A=208$. The isobar projectiles with larger N/Z ratios are favorable for creating the neutron-rich target-like fragments. The products induced by isobar projectiles with larger charge numbers tend to shift to the proton-rich region. The Coulomb potential coupled to the shell effect is shown in the production cross-sections of actinide isotopes. Based on the reactions induced by radioactive projectiles, we predict a large number of new actinide isotopes around the nuclear drip lines, and these reactions could even access the superheavy nuclei region.

preprint2022arXiv

Protection of quantum evolutions under parity-time symmetric non-Hermitian Hamiltonians by dynamical decoupling

Parity-time (PT) symmetric non-Hermitian Hamiltonians bring about many novel features and interesting applications such as quantum gates faster than those in Hermitian systems, and topological state transfer. The performance of evolutions under $\mathcal{PT}$-symmetric Hamiltonians is degraded by the inevitable noise and errors due to system-environment interaction and experimental imperfections. In contrast to Hermitian Hamiltonians, the fluctuations in dissipative beams that are utilized to generate non-Hermitian contributions in the PT-symmetric Hamiltonians cause additional errors. Here we achieve the protection of PT-symmetric Hamiltonians against noise acting along the qubit's quantization axis by combining quantum evolutions with dynamical decoupling sequences. We demonstrate the performance of our method by numerical simulations. Realistic noise sources and parameters are chosen including: constant detuning error, time-varying detuning noise and dissipative-beam noise. The fidelities of the protected evolutions are well above the unprotected ones under all the above situations. Our work paves the way for further studies and applications of non-Hermitian $\mathcal{PT}$-symmetric physics in noisy quantum systems.

preprint2022arXiv

Quasi-Continuous Cooling of a Microwave Mode on a Benchtop using Hyperpolarized NV$^-$ Diamond

We demonstrate the cooling of a microwave mode at 2872 MHz through its interaction with optically spin-polarized NV$^-$ centers in diamond at zero applied magnetic field, removing thermal photons from the mode. By photo-exciting (pumping) a brilliant-cut red diamond jewel with a continuous-wave 532-nm laser, outputting 2 W, the microwave mode is cooled down to a noise temperature of 188 K. This noise temperature can be preserved continuously for as long as the diamond is optically excited and kept cool. The latter requirement restricted operation out to 10 ms in our preliminary setup. The mode-cooling performance of NV$^-$ diamond is directly compared against that of pentacene-doped para-terphenyl, where we find that the former affords the advantages of cooling immediately upon light excitation without needing to mase beforehand (or at all) and being able to cool continuously at substantially lower optical pump power.

preprint2022arXiv

Randomize the Future: Asymptotically Optimal Locally Private Frequency Estimation Protocol for Longitudinal Data

Longitudinal data tracking under Local Differential Privacy (LDP) is a challenging task. Baseline solutions that repeatedly invoke a protocol designed for one-time computation lead to linear decay in the privacy or utility guarantee with respect to the number of computations. To avoid this, the recent approach of Erlingsson et al. (2020) exploits the potential sparsity of user data that changes only infrequently. Their protocol targets the fundamental problem of frequency estimation for longitudinal binary data, with $\ell_\infty$ error of $O ( (1 / ε) \cdot (\log d)^{3 / 2} \cdot k \cdot \sqrt{ n \cdot \log ( d / β) } )$, where $ε$ is the privacy budget, $d$ is the number of time periods, $k$ is the maximum number of changes of user data, and $β$ is the failure probability. Notably, the error bound scales polylogarithmically with $d$, but linearly with $k$. In this paper, we break through the linear dependence on $k$ in the estimation error. Our new protocol has error $O ( (1 / ε) \cdot (\log d) \cdot \sqrt{ k \cdot n \cdot \log ( d / β) } )$, matching the lower bound up to a logarithmic factor. The protocol is an online one that outputs an estimate at each time period. The key breakthrough is a new randomizer for sequential data, FutureRand, with two key features. The first is a composition strategy that correlates the noise across the non-zero elements of the sequence. The second is a pre-computation technique which, by exploiting the symmetry of the input space, enables the randomizer to output the results on the fly, without knowing future inputs. Our protocol closes the error gap between existing online and offline algorithms.

preprint2022arXiv

Realizing quantum speed limit in open system with a PT-symmetric trapped-ion qubit

The evolution time of a qubit under a Hamiltonian operation is one of the key issues in quantum control, quantum information processing and quantum computing. It has a lower bound in a Hermitian system, which is limited by the coupling between two states of the qubit, while it has been proposed that in a non-Hermitian system it can be made much smaller without violating the time-energy uncertainty principle. Here we experimentally confirm the proposal in a single dissipative qubit system and demonstrate that the evolution time of a qubit from an initial state to an arbitrary state can be controlled by tuning the dissipation intensity in a non-Hermitian Parity-Time-Symmetric ($\mathcal{P T}$-symmetric) quantum system. It decreases with increasing dissipation intensity and also gives a tighter bound for the quantum speed limit (QSL). We also find that the evolution time of its reversal operation increases with increasing dissipation intensity. These findings give us a well-controlled knob for speeding up the qubit operation, and pave the way towards fast and practical quantum computation, opening the door for solving sophisticated problems with only a few qubits.

preprint2022arXiv

Resilience in Industrial Internet of Things Systems: A Communication Perspective

The Industrial Internet of Things (IIoT) is an ultra-large-scale system that is much more sophisticated and fragile than conventional industrial platforms. The effective management of such a system relies heavily on the resilience of the network, especially the communication part. Imperative as resilient communication is, it has received insufficient attention in the literature, and a standardized framework is still missing. In view of this, this paper intends to provide a systematic overview of resilience in IIoT from a communication perspective, aiming to answer the questions of why we need it, what it is, how to enhance it, and where it can be applied. Specifically, we emphasize the urgency of resilience studies by examining existing literature and analyzing malfunction data from a real satellite communication system. Resilience-related concepts and metrics, together with standardization efforts, are then summarized and discussed, presenting a basic framework for analyzing the resilience of the system before, during, and after disruptive events. On the basis of the framework, key resilience concerns associated with the design, deployment, and operation of IIoT are briefly described to shed light on the methods for resilience enhancement. Promising resilient applications in different IIoT sectors are also introduced to highlight the opportunities and challenges in practical implementations.

preprint2022arXiv

Robust quantum control for the manipulation of solid-state spins

Robust and high-fidelity control of electron spins in solids is the cornerstone for facilitating applications of solid-state spins in quantum information processing and quantum sensing. However, precise control of spin systems is always challenging due to the presence of a variety of noises originating from the environment and control fields. Herein, noise-resilient quantum gates, designed with robust optimal control (ROC) algorithms, are demonstrated experimentally with nitrogen-vacancy (NV) centers in diamond to realize tailored robustness against detunings and Rabi errors simultaneously. In the presence of both 10% off-resonant detuning and deviation of a Rabi frequency, we achieve an average single-qubit gate fidelity of up to 99.97%. Our experiments also show that ROC-based multipulse quantum sensing sequences can suppress spurious responses resulting from finite widths and imperfections of microwave pulses, which provides an efficient strategy for enhancing the performance of existing multipulse quantum sensing sequences.

preprint2022arXiv

Scheduling of UAV-assisted Millimeter Wave Communications for High-Speed Railway

To exploit richer spectrum resources for even better service quality, millimeter wave (mmWave) communication has been considered for high-speed railway (HSR) communication systems. In this paper, we focus on scheduling as many flows as possible while satisfying their QoS requirements. Due to interference, eavesdropping, or other problems, some flows may not be directly transmitted from the track-side BS. We propose a UAV-assisted scheduling scheme which utilizes a UAV to serve as a relay for such flows. The proposed scheme also utilizes two mmWave bands, one for the BS links and the other for the UAV links. The proposed algorithm aims to maximize the number of flows with their QoS requirements satisfied. Simulations demonstrate that the proposed scheme achieves a superior performance on the number of completed flows and the system throughput over two baseline schemes.

preprint2022arXiv

The Effective Sample Size in Bayesian Information Criterion for Level-Specific Fixed and Random Effects Selection in a Two-Level Nested Model

Popular statistical software provides Bayesian information criterion (BIC) for multilevel models or linear mixed models. However, it has been observed that the combination of statistical literature and software documentation has led to discrepancies in the formulas of the BIC and uncertainties of the proper use of the BIC in selecting a multilevel model with respect to level-specific fixed and random effects. These discrepancies and uncertainties result from different specifications of sample size in the BIC's penalty term for multilevel models. In this study, we derive the BIC's penalty term for level-specific fixed and random effect selection in a two-level nested design. In this new version of BIC, called BIC_E, this penalty term is decomposed into two parts if the random effect variance-covariance matrix has full rank: (a) a term with the log of average sample size per cluster whose multiplier involves the overlapping number of dimensions between the column spaces of the random and fixed effect design matrices and (b) the total number of parameters times the log of the total number of clusters. Furthermore, we study the behavior of BIC_E in the presence of redundant random effects. The use of BIC_E is illustrated with a textbook example data set and a numerical demonstration shows that the derived formulae adhere to empirical values.

preprint2022arXiv

The Moment Passing Method for Wireless Channel Capacity Estimation

Wireless network capacity can be regarded as the most important performance metric for wireless communication systems. With the fast development of wireless communication technology, future wireless systems will become more and more complicated. As a result, the channel gain matrix will become a large-dimensional random matrix, leading to an extremely high computational cost to obtain the capacity. In this paper, we propose a moment passing method (MPM) to realize the fast and accurate capacity estimation for future ultra-dense wireless systems. It can determine the capacity with quadratic complexity, which is optimal considering that the cost of a single matrix operation is not less than quadratic complexity. Moreover, it has high accuracy. The simulation results show that the estimation error of this method is below 2 percent. Finally, our method is highly general, as it is independent of the distributions of BSs and users, and the shape of network areas. More importantly, it can be applied not only to the conventional multi-user multiple input and multiple output (MU-MIMO) networks, but also to the capacity-centric networks designed for B5G/6G.

preprint2022arXiv

The Quadratic Wasserstein Metric With Squaring Scaling For Seismic Velocity Inversion

The quadratic Wasserstein metric has shown its power in measuring the difference between probability densities, which benefits optimization objective function with better convexity and is insensitive to data noise. Nevertheless, it is always an important question to make the seismic signals suitable for comparison using the quadratic Wasserstein metric. The squaring scaling is worth exploring since it guarantees the convexity caused by data shift. However, as mentioned in [Commun. Inf. Syst., 2019, 19:95-145], the squaring scaling may lose uniqueness and result in more local minima to the misfit function. In our previous work [J. Comput. Phys., 2018, 373:188-209], the quadratic Wasserstein metric with squaring scaling was successfully applied to the earthquake location problem. But it only discussed the inverse problem with few degrees of freedom. In this work, we will present a more in-depth study on the combination of squaring scaling technique and the quadratic Wasserstein metric. By discarding some inapplicable data, picking seismic phases, and developing a new normalization method, we successfully invert the seismic velocity structure based on the squaring scaling technique and the quadratic Wasserstein metric. The numerical experiments suggest that this newly proposed method is an efficient approach to obtain more accurate inversion results.

preprint2022arXiv

Triple-Band Scheduling with Millimeter Wave and Terahertz Bands for Wireless Backhaul

With the explosive growth of mobile traffic demand, densely deployed small cells underlying macrocells have great potential for 5G and beyond wireless networks. In this paper, we consider the problem of supporting traffic flows with diverse QoS requirements by exploiting three high frequency bands, i.e., the 28GHz band, the E-band, and the Terahertz (THz) band. The cooperation of the three bands is helpful for maximizing the number of flows with their QoS requirements satisfied. To solve the formulated nonlinear integer programming problem, we propose a triple-band scheduling scheme which can select the optimum scheduling band for each flow among three different frequency bands. The proposed scheme also efficiently utilizes the resource to schedule flow transmissions in time slots. Extensive simulations demonstrate the superior performance of the proposed scheme over three baseline schemes with respect to the number of completed flows and the system throughput.

preprint2022arXiv

Walking to Hide: Privacy Amplification via Random Message Exchanges in Network

The *shuffle model* is a powerful tool to amplify the privacy guarantees of the *local model* of differential privacy. In contrast to the fully decentralized manner of guaranteeing privacy in the local model, the shuffle model requires a central, trusted shuffler. To avoid this central shuffler, recent work of Liew et al. (2022) proposes shuffling locally randomized data in a decentralized manner, via random walks on the communication network constituted by the clients. The privacy amplification bound it thus provides depends on the topology of the underlying communication network, even for infinitely long random walks. It does not match the state-of-the-art privacy amplification bound for the shuffle model (Feldman et al., 2021). In this work, we prove that the output of $n$ clients' data, each perturbed by an $ε_0$-local randomizer, and shuffled by random walks with a logarithmic number of steps, is $( O ( (1 - e^{-ε_0} ) \sqrt{ ( e^{ε_0} / n ) \ln (1 / δ) } ), O(δ) )$-differentially private. Importantly, this bound is independent of the topology of the communication network, and asymptotically closes the gap between the privacy amplification bounds for the network shuffle model (Liew et al., 2022) and the shuffle model (Feldman et al., 2021). Our proof is based on a reduction to the shuffle model, and an analysis of the distribution of random walks of finite length. Building on this, we further show that if each client is sampled independently with probability $p$, the privacy guarantee of the network shuffle model can be further improved to $( O ( (1 - e^{-ε_0} ) \sqrt{p ( e^{ε_0} / n ) \ln (1 / δ) } ) , O(δ) )$. Importantly, the subsampling is also performed in a fully decentralized manner that does not require a trusted central entity; compared with related bounds in prior work, our bound is stronger.
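
To get a feel for how the bound quoted in this abstract scales, the sketch below evaluates only its leading term, $(1 - e^{-ε_0}) \sqrt{p\,(e^{ε_0}/n) \ln(1/δ)}$. The $O(\cdot)$ constant is not given in the abstract, so the numbers are indicative of scaling only and are not actual privacy parameters. The function name `amplification_term` and the chosen values of $ε_0$, $n$, $δ$ and $p$ are assumptions made for illustration.

```python
import math

def amplification_term(eps0, n, delta, p=1.0):
    """Leading term of the quoted bound, with the hidden O(.) constant omitted."""
    return (1.0 - math.exp(-eps0)) * math.sqrt(
        p * math.exp(eps0) / n * math.log(1.0 / delta)
    )

# Hypothetical settings: local budget eps0 = 1, delta = 1e-6.
full = amplification_term(eps0=1.0, n=10_000, delta=1e-6)
larger_n = amplification_term(eps0=1.0, n=40_000, delta=1e-6)
sampled = amplification_term(eps0=1.0, n=10_000, delta=1e-6, p=0.25)

print(f"n = 10000, p = 1:    {full:.4f}")
print(f"n = 40000, p = 1:    {larger_n:.4f}")
print(f"n = 10000, p = 0.25: {sampled:.4f}")
```

Quadrupling the number of clients, or sampling each client with probability 0.25, halves the leading term, since both enter under the square root.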

preprint2021arXiv

A Convergent Semi-Proximal Alternating Direction Method of Multipliers for Recovering Internet Traffics from Link Measurements

It is challenging to recover the large-scale internet traffic data purely from the link measurements. With the rapid growth of the problem scale, it will be extremely difficult to sustain the recovery accuracy and the computational cost. In this work, we propose a new Sparsity Low-Rank Recovery (SLRR) model and its Schur Complement Based semi-proximal Alternating Direction Method of Multipliers (SCB-spADMM) solver. Our approach distinguishes itself mainly for the following two aspects. First, we fully exploit the spatial low-rank property and the sparsity of traffic data, which are barely considered in the literature. Our model can be divided into a series of subproblems, which only relate to the traffics in a certain individual time interval. Thus, the model scale is significantly reduced. Second, we establish a globally convergent ADMM-type algorithm inspired by [Li et al., Math. Program., 155(2016)] to solve the SLRR model. In each iteration, all the intermediate variables' optimums can be calculated analytically, which makes the algorithm fast and accurate. Besides, due to the separability of the SLRR model, it is possible to design a parallel algorithm to further reduce computational time. According to the numerical results on the classic datasets Abilene and GEANT, our method achieves the best accuracy with a low computational cost. Moreover, in our newly released large-scale Huawei Origin-Destination (HOD) network traffics, our method perfectly reaches the seconds-level feedback, which meets the essential requirement for practical scenarios.

preprint2021arXiv

A Review on the Cahn-Hilliard Equation: Classical Results and Recent Advances in Dynamic Boundary Conditions

The Cahn-Hilliard equation is a fundamental model that describes the phase separation process in multi-component mixtures. It has been successfully extended to many different contexts in several scientific fields. In this survey article, we briefly review the derivation, structure as well as some analytical issues for the Cahn-Hilliard equation and its variants. Our focus will be placed on the well-posedness and long-time behavior of the Cahn-Hilliard equation in the classical setting and recent progresses on the dynamic boundary conditions accounting for non-trivial boundary effects.

preprint2021arXiv

Coalition Game Based Full-duplex Popular Content Distribution in mmWave Vehicular Networks

The millimeter wave (mmWave) communication has drawn intensive attention with abundant band resources. In this paper, we consider the popular content distribution (PCD) problem in the mmWave vehicular network. In order to offload the communication burden of base stations (BSs), vehicle-to-vehicle (V2V) communication is introduced into the PCD problem to transmit contents between on-board units (OBUs) and improve the transmission efficiency. We propose a full-duplex (FD) cooperative scheme based on coalition formation game, and the utility function is provided based on the maximization of the number of received contents. The contribution of each member in the coalition can be transferable to its individual profit. While maximizing the number of received contents in the fixed time, the cooperative scheme also ensures the individual profit of each OBU in the coalition. We evaluate the proposed scheme by extensive simulations in mmWave vehicular networks. Compared with other existing schemes, the proposed scheme has superior performances on the number of possessed contents and system fairness. Besides, the low complexity of the proposed algorithm is demonstrated by the switch operation number and CPU time.

preprint2021arXiv

Evidence of Andreev blockade in a double quantum dot coupled to a superconductor

We design and investigate an experimental system capable of entering an electron transport blockade regime in which a spin-triplet localized in the path of current is forbidden from entering a spin-singlet superconductor. To stabilize the triplet a double quantum dot is created electrostatically near a superconducting lead in an InAs nanowire. The dots are filled stochastically with electrons of either spin. The superconducting lead is a molecular beam epitaxy grown Al shell. The shell is etched away over a wire segment to make room for the double dot and the normal metal gold lead. The quantum dot closest to the normal lead exhibits Coulomb diamonds, the dot closest to the superconducting lead exhibits Andreev bound states and an induced gap. The experimental observations compare favorably to a theoretical model of Andreev blockade, named so because the triplet double dot configuration suppresses Andreev reflections. Observed leakage currents can be accounted for by finite temperature. We observe the predicted quadruple level degeneracy points of high current and a periodic conductance pattern controlled by the occupation of the normal dot. Even-odd transport asymmetry is lifted with increased temperature and magnetic field. This blockade phenomenon can be used to study spin structure of superconductors. It may also find utility in quantum computing devices that utilize Andreev or Majorana states.

preprint2021arXiv

Normalizing field flows: Solving forward and inverse stochastic differential equations using physics-informed flow models

We introduce in this work the normalizing field flows (NFF) for learning random fields from scattered measurements. More precisely, we construct a bijective transformation (a normalizing flow characterized by neural networks) between a Gaussian random field with the Karhunen-Loève (KL) expansion structure and the target stochastic field, where the KL expansion coefficients and the invertible networks are trained by maximizing the sum of the log-likelihood on scattered measurements. This NFF model can be used to solve data-driven forward, inverse, and mixed forward/inverse stochastic partial differential equations in a unified framework. We demonstrate the capability of the proposed NFF model for learning non-Gaussian processes and different types of stochastic partial differential equations.

preprint2021arXiv

Well-posedness of a Hydrodynamic Phase-field Model for Functionalized Membrane-Fluid Interaction

In this paper, we study a hydrodynamic phase-field system modeling the deformation of functionalized membranes in incompressible viscous fluids. The governing PDE system consists of the Navier-Stokes equations coupled with a convective sixth-order Cahn-Hilliard type equation driven by the functionalized Cahn-Hilliard free energy, which describes phase separation in mixtures with an amphiphilic structure. In the three dimensional case, we first prove existence of global weak solutions provided that the initial total energy is finite. Then we establish uniqueness of weak solutions under suitable regularity assumptions only imposed on the velocity field (or its gradient). Finally, we prove the existence and uniqueness of local strong solutions for arbitrary regular initial data and derive some blow-up criteria. The results are obtained in the general setting with variable viscosity and mobility.

preprint2020arXiv

A Deep Neural Network Model of Particle Thermal Radiation in Packed Bed

Prediction of particle radiative heat transfer flux is an important task in large discrete granular systems, such as pebble beds in power plants and industrial fluidized beds. For particle motion and packing, the discrete element method (DEM) is now widely accepted as an excellent Lagrangian approach. For thermal radiation, traditional methods focus on calculating the obstructed view factor directly by numerical algorithms. The major challenge for the simulation is that the method is proven to be time-consuming and not feasible to apply in practical cases. In this work, we propose an analytical model to calculate macroscopic effective conductivity from particle packing structures. Then, we develop a deep neural network (DNN) model used as a predictor of the complex view factor function. The DNN model is trained on a large dataset and the computational speed is greatly improved with good accuracy. It is feasible to perform real-time simulation with the DNN model for radiative heat transfer in a large pebble bed. The trained model can also be coupled with DEM and used to efficiently analyze the directional radiative conductivity, anisotropic factor and wall effect of the particle thermal radiation.

preprint2020arXiv

Amortized Population Gibbs Samplers with Neural Sufficient Statistics

We develop amortized population Gibbs (APG) samplers, a class of scalable methods that frames structured variational inference as adaptive importance sampling. APG samplers construct high-dimensional proposals by iterating over updates to lower-dimensional blocks of variables. We train each conditional proposal by minimizing the inclusive KL divergence with respect to the conditional posterior. To appropriately account for the size of the input data, we develop a new parameterization in terms of neural sufficient statistics. Experiments show that APG samplers can train highly structured deep generative models in an unsupervised manner, and achieve substantial improvements in inference accuracy relative to standard autoencoding variational methods.

preprint2020arXiv

Beyond the Limits of Conventional Stark Deceleration

Stark deceleration enables the production of cold and dense molecular beams with applications in trapping, collisional studies, and precision measurement. Improving the efficiency of Stark deceleration, and hence the achievable molecular densities, is central to unlock the full potential of such studies. One of the chief limitations arises from the transverse focusing properties of Stark decelerators. We introduce a new operation strategy that circumvents this limit without any hardware modifications, and experimentally verify our results for hydroxyl radicals. Notably, improved focusing results in significant gains in molecule yield with increased operating voltage, formerly limited by transverse-longitudinal coupling. At final velocities sufficiently small for trapping, molecule flux improves by a factor of four, and potentially more with increased voltage. The improvement is more significant for less readily polarized species, thereby expanding the class of candidate molecules for Stark deceleration.

preprint2020arXiv

Chiral symmetry breaking for deterministic switching of perpendicular magnetization by spin-orbit torque

Symmetry breaking is a characteristic to determine which branch of a bifurcation system follows upon crossing a critical point. Specifically, in spin-orbit torque (SOT) devices, a fundamental question arises: how to break the symmetry of the perpendicular magnetic moment by the in-plane spin polarization? Here, we show that the chiral symmetry breaking by the DMI can induce the deterministic SOT switching of the perpendicular magnetization. By introducing a gradient of saturation magnetization or magnetic anisotropy, non-collinear spin textures are formed by the gradient of effective SOT strength, and thus the chiral symmetry of the SOT-induced spin textures is broken by the DMI, resulting in the deterministic magnetization switching. We introduce a strategy to induce an out-of-plane (z) gradient of magnetic properties, as a practical solution for the wafer-scale manufacture of SOT devices.

preprint2020arXiv

Crossing estimates from metric graph and discrete GFF

We compare level-set percolation for Gaussian free fields (GFFs) defined on a rectangular subset of $δ\mathbb{Z}^2$ to level-set percolation for GFFs defined on the corresponding metric graph as the mesh size $δ$ goes to 0. In particular, we look at the probability that there is a path that crosses the rectangle in the horizontal direction on which the field is positive. We show this probability is strictly larger in the discrete graph. In the metric graph case, we show that for appropriate boundary conditions the probability that there exists a closed pivotal edge for the horizontal crossing event decays logarithmically in $δ$. In the discrete graph case, we compute the limit of the probability of a horizontal crossing for appropriate boundary conditions.

preprint2020arXiv

Data Age Aware Scheduling for Wireless Powered Mobile-Edge Computing in Industrial Internet of Things

Wireless powered mobile edge computing has been envisioned as a promising paradigm to enhance the computation capability of low-power wireless devices in Industrial Internet of Things. An efficient resource scheduling method is critical yet challenging to design in such a scenario due to stochastic traffic arrival, time-coupling uplink/downlink decision and incomplete system state knowledge. To tackle these challenges, an online optimization algorithm is proposed in this paper to maximize long-term system utility balancing throughput and fairness, subject to data age and stability constraints. A set of virtual queues is designed to transform the scheduling task, which is hard to solve due to time-dependent data age constraints, into a stochastic optimization problem. Leveraging Lyapunov and convex optimization techniques, the proposed approach can achieve asymptotically near-optimal online decisions without any prior statistical knowledge, and maintain the asymptotic optimality in the presence of partial and outdated network state information. Numerical simulations corroborate the theoretical analysis and demonstrate the effectiveness of the proposed approach.

preprint2020arXiv

DeepDualMapper: A Gated Fusion Network for Automatic Map Extraction using Aerial Images and Trajectories

Automatic map extraction is of great importance to urban computing and location-based services. Aerial image and GPS trajectory data refer to two different data sources that could be leveraged to generate the map, although they carry different types of information. Most previous works on data fusion between aerial images and data from auxiliary sensors do not fully utilize the information of both modalities and hence suffer from the issue of information loss. We propose a deep convolutional neural network called DeepDualMapper which fuses the aerial image and trajectory data in a more seamless manner to extract the digital map. We design a gated fusion module to explicitly control the information flows from both modalities in a complementary-aware manner. Moreover, we propose a novel densely supervised refinement decoder to generate the prediction in a coarse-to-fine way. Our comprehensive experiments demonstrate that DeepDualMapper can fuse the information of images and trajectories much more effectively than existing approaches, and is able to generate maps with higher accuracy.

preprint2020arXiv

Distributed Brillouin frequency shift extraction via a convolutional neural network

Distributed optical fiber Brillouin sensors detect the temperature and strain along a fiber according to the local Brillouin frequency shift, which is usually calculated from the measured Brillouin spectrum using Lorentzian curve fitting. In addition, cross-correlation, principal component analysis, and machine learning methods have been proposed for the more efficient extraction of Brillouin frequency shifts. However, existing methods only process the Brillouin spectrum individually, ignoring the correlation in the time domain, indicating that there is still room for improvement. Here, we propose and experimentally demonstrate a full convolution neural network to extract the distributed Brillouin frequency shift directly from the measured two-dimensional data. Simulated ideal Brillouin spectra with various parameters are used to train the network. Both the simulation and experimental results show that the extraction accuracy of the network is better than that of the traditional curve fitting algorithm with a much shorter processing time. This network has good universality and robustness and can effectively improve the performance of existing Brillouin sensors.

preprint2020arXiv

Electric generation from drops impacting onto charged surfaces

The impact of liquid drops onto solid surfaces leads to conversion of kinetic energy of directed drop motion into various forms of energy including surface energy, vibrational energy, heat, and under suitable conditions, electrical energy. The latter has attracted substantial attention in recent years for its potential to directly convert energy from random environmental flows such as rainfall, spray, and wave motion on the sea to electrical energy. Despite the invention of numerous configurations of such energy harvesters, the underlying physical principles and optimum operation conditions have remained elusive. In this letter, we use a combination of high-speed electrical current and video imaging measurements to develop a parameter-free quantitative description of the energy harvesting process for an optimized electrode configuration. A novel electrowetting-assisted charge injection method, EWCI, enables highly stable surface charges and robust energy conversion for several months with record efficiencies exceeding 2.5 percent of the initial kinetic energy.

preprint2020arXiv

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

Quantization techniques can reduce the size of Deep Neural Networks and improve inference latency and throughput by taking advantage of high throughput integer instructions. In this paper we review the mathematical aspects of quantization parameters and evaluate their choices on a wide range of neural network models for different application domains, including vision, speech, and language. We focus on quantization techniques that are amenable to acceleration by processors with high-throughput integer math pipelines. We also present a workflow for 8-bit quantization that is able to maintain accuracy within 1% of the floating-point baseline on all networks studied, including models that are more difficult to quantize, such as MobileNets and BERT-large.
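
As a concrete illustration of what 8-bit integer quantization involves, the sketch below applies symmetric per-tensor quantization, where a single scale maps the largest absolute value to 127. This is one common scheme chosen for illustration and is not necessarily the workflow evaluated in the paper; the helper names `quantize_int8` and `dequantize` and the sample weights are assumptions.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.0, 1.0]  # made-up weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print("scale:    ", scale)
print("quantized:", q)
print("max error:", max(abs(w - r) for w, r in zip(weights, restored)))
```

The round-trip error per element is at most half the scale, so the choice of range (and hence scale) directly controls the accuracy lost to quantization.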

preprint2020arXiv

Kernel Embedding based Variational Approach for Low-dimensional Approximation of Dynamical Systems

Transfer operators such as Perron-Frobenius or Koopman operator play a key role in modeling and analysis of complex dynamical systems, which allow linear representations of nonlinear dynamics by transforming the original state variables to feature spaces. However, it remains challenging to identify the optimal low-dimensional feature mappings from data. The variational approach for Markov processes (VAMP) provides a comprehensive framework for the evaluation and optimization of feature mappings based on the variational estimation of modeling errors, but it still suffers from a flawed assumption on the transfer operator and therefore sometimes fails to capture the essential structure of system dynamics. In this paper, we develop a powerful alternative to VAMP, called kernel embedding based variational approach for dynamical systems (KVAD). By using the distance measure of functions in the kernel embedding space, KVAD effectively overcomes the theoretical and practical limitations of VAMP. In addition, we develop a data-driven KVAD algorithm for seeking the ideal feature mapping within a subspace spanned by given basis functions, and numerical experiments show that the proposed algorithm can significantly improve the modeling accuracy compared to VAMP.

preprint2020arXiv

Learning Based Distributed Tracking

Inspired by the great success of machine learning in the past decade, people have been thinking about the possibility of improving the theoretical results by exploring data distribution. In this paper, we revisit a fundamental problem called Distributed Tracking (DT) under an assumption that the data follows a certain (known or unknown) distribution, and propose a number of data-dependent algorithms with improved theoretical bounds. Informally, in the DT problem, there is a coordinator and k players, where the coordinator holds a threshold N and each player has a counter. At each time stamp, at most one counter can be increased by one. The job of the coordinator is to capture the exact moment when the sum of all these k counters reaches N. The goal is to minimise the communication cost. While our first type of algorithms assumes the concrete data distribution is known in advance, our second type of algorithms can learn the distribution on the fly. Both types of algorithms achieve a communication cost bounded by O(k log log N) with high probability, improving the state-of-the-art data-independent bound O(k log N/k). We further propose a number of implementation optimisation heuristics to improve both efficiency and robustness of the algorithms. Finally, we conduct extensive experiments on three real datasets and four synthetic datasets. The experimental results show that the communication cost of our algorithms is as little as 20% of that of the state-of-the-art algorithms.
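
The DT problem statement in this abstract can be made concrete with a small simulation. The sketch below implements a classic round-based, data-independent protocol in the spirit of the O(k log N/k) baseline the abstract mentions; it is not the paper's learning-based algorithm. The function name `track`, the message accounting (one message per signal, 2k per poll), and the uniform random stream are assumptions for illustration.

```python
import random

def track(stream, k, threshold):
    """Return (step at which the total reaches threshold, messages sent)."""
    counters = [0] * k      # true local counters held by the players
    unreported = [0] * k    # increments since each player's last signal
    known_total = 0         # exact total known to the coordinator
    signals = 0             # signals received in the current round
    messages = 0
    # Each player signals once per `slack` increments.
    slack = max(1, threshold // (2 * k))
    for step, player in enumerate(stream, start=1):
        counters[player] += 1
        unreported[player] += 1
        if unreported[player] == slack:
            unreported[player] = 0
            signals += 1
            messages += 1
            if slack == 1:
                # Final phase: every increment is reported, so the total is exact.
                known_total += 1
            elif signals == k:
                # End of round: poll all players (request + reply each),
                # then shrink the slack to match the remaining distance.
                messages += 2 * k
                known_total = sum(counters)
                unreported = [0] * k
                signals = 0
                slack = max(1, (threshold - known_total) // (2 * k))
        if known_total >= threshold:
            return step, messages
    return None, messages

rng = random.Random(0)
k, threshold = 4, 10_000
stream = [rng.randrange(k) for _ in range(threshold)]
step, messages = track(stream, k, threshold)
print(f"threshold detected at step {step} using {messages} messages")
```

Within a round, fewer than 2k times the slack increments can occur before the poll, so the coordinator never overshoots; the threshold is only reached in the final phase, where every increment is reported. Reporting every increment from the start would instead cost N messages.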

preprint2020arXiv

Masked Face Recognition Dataset and Application

In order to effectively prevent the spread of the COVID-19 virus, almost everyone wears a mask during the coronavirus epidemic. This makes conventional facial recognition technology nearly ineffective in many cases, such as community access control, face access control, facial attendance, facial security checks at train stations, etc. Therefore, it is very urgent to improve the recognition performance of the existing face recognition technology on masked faces. Most current advanced face recognition approaches are designed based on deep learning, which depends on a large number of face samples. However, at present, there are no publicly available masked face recognition datasets. To this end, this work proposes three types of masked face datasets, including Masked Face Detection Dataset (MFDD), Real-world Masked Face Recognition Dataset (RMFRD) and Simulated Masked Face Recognition Dataset (SMFRD). Among them, to the best of our knowledge, RMFRD is currently the world's largest real-world masked face dataset. These datasets are freely available to industry and academia, based on which various applications on masked faces can be developed. The multi-granularity masked face recognition model we developed achieves 95% accuracy, exceeding the results reported by the industry. Our datasets are available at: https://github.com/X-zhangyang/Real-World-Masked-Face-Dataset.

preprint2020arXiv

Multivariate Regression of Mixed Responses for Evaluation of Visualization Designs

Information visualization significantly enhances human perception by graphically representing complex data sets. The variety of visualization designs makes it challenging to efficiently evaluate all possible designs catering to users' preferences and characteristics. Most of existing evaluation methods perform user studies to obtain multivariate qualitative responses from users via questionnaires and interviews. However, these methods cannot support online evaluation of designs as they are often time-consuming. A statistical model is desired to predict users' preferences on visualization designs based on non-interference measurements (i.e., wearable sensor signals). In this work, we propose a multivariate regression of mixed responses (MRMR) to facilitate quantitative evaluation of visualization designs. The proposed MRMR method is able to provide accurate model prediction with meaningful variable selection. A simulation study and a user study of evaluating visualization designs with 14 effective participants are conducted to illustrate the merits of the proposed model.

preprint2020arXiv

Optical Vortex Shaping & Multiple Singularities Manipulation via High-order Cross-phase

Increasing demand for practical applications is forcing deeper research into optical vortices (OVs): from the generation and measurement to shaping and multiple singularities manipulation of OVs. Herein, we propose a new type of phase structure called the high-order cross phase (HOCP), which can be employed to modulate OVs to implement both polygonal shaping and singularities manipulation. Theoretically, we investigate the propagation characteristics of OVs with the HOCP. In experiments, we achieve the shaping and singularities manipulation of OVs by utilizing the HOCP. On this basis, we discuss the interference patterns of superposed OVs after the modulation. This work provides an alternative method to achieve both polygonal shaping and multiple singularities manipulation, which will facilitate applications in optical micro-manipulation, optical communication, and high-dimensional quantum entanglement.

preprint2020arXiv

Real-time Human Activity Recognition Using Conditionally Parametrized Convolutions on Mobile and Wearable Devices

Recently, deep learning has represented an important research trend in human activity recognition (HAR). In particular, deep convolutional neural networks (CNNs) have achieved state-of-the-art performance on various HAR datasets. For deep learning, improvements in performance have to heavily rely on increasing model size or capacity to scale to larger and larger datasets, which inevitably leads to an increase in operations. A high number of operations in deep learning increases computational cost and is not suitable for real-time HAR using mobile and wearable sensors. Though shallow learning techniques are often lightweight, they cannot achieve good performance. Therefore, deep learning methods that can balance the trade-off between accuracy and computation cost are highly needed, which to our knowledge has seldom been researched. In this paper, we propose, for the first time, a computation-efficient CNN using conditionally parametrized convolution for real-time HAR on mobile and wearable devices. We evaluate the proposed method on four public benchmark HAR datasets, namely the WISDM, PAMAP2, UNIMIB-SHAR, and OPPORTUNITY datasets, achieving state-of-the-art accuracy without compromising computation cost. Various ablation experiments are performed to show how such a network with large capacity is clearly preferable to the baseline while requiring a similar amount of operations. The method can be used as a drop-in replacement for the existing deep HAR architectures and easily deployed onto mobile and wearable devices for real-time HAR applications.

preprint2020arXiv

Response to LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts

Live video commenting systems are an emerging feature of online video sites. Recently, the Chinese video sharing platform Bilibili has popularised a novel captioning system where user comments are displayed as streams of moving subtitles overlaid on the video playback screen and broadcast to all viewers in real-time. LiveBot was recently introduced as a novel Automatic Live Video Commenting (ALVC) application. This enables the automatic generation of live video comments from both the existing video stream and existing viewers' comments. In seeking to reproduce the baseline results reported in the original LiveBot paper, we found differences between the reproduced results using the project codebase and the numbers reported in the paper. Further examination of this situation suggests that this may be caused by a number of small issues in the project code, including a non-obvious overlap between the training and test sets. In this paper, we study these discrepancies in detail and propose an alternative baseline implementation as a reference for other researchers in this field.

preprint2020arXiv

Room-temperature quasi-continuous-wave pentacene maser pumped by an invasive Ce:YAG luminescent concentrator

In this work, we present a quasi-continuous-wave (CW) pentacene maser operating at 1.45 GHz in the Earth's magnetic field at room temperature, with a duration of $\sim$4 ms and an output power of up to -25 dBm. The maser is optically pumped by a cerium-doped YAG (Ce:YAG) luminescent concentrator (LC) whose wedge-shaped output is embedded inside a 0.1% pentacene-doped para-terphenyl (Pc:Ptp) crystal. The pumped crystal is located inside a ring of strontium titanate (STO) that supports a TE$_{01\delta}$ mode of high magnetic Purcell factor. Combined with simulations, our results indicate that CW operation of pentacene masers at room temperature is perfectly feasible so long as excessive heating of the crystal is avoided.

preprint2020arXiv

Scaling Limits of Crossing Probabilities in Metric Graph GFF

We consider the metric graph Gaussian free field (GFF) defined on polygons of $\delta\mathbb{Z}^2$ with alternating boundary data. The crossing probabilities for level-set percolation of the metric graph GFF have scaling limits. When the boundary data is well chosen, the scaling limits of the crossing probabilities can be explicitly constructed as a "fusion" of multiple SLE$_4$ pure partition functions.

preprint2020arXiv

Structural-Aware Sentence Similarity with Recursive Optimal Transport

Measuring sentence similarity is a classic topic in natural language processing. Lightweight similarities are still of particular practical significance even though deep learning models have succeeded in many other tasks. Some lightweight similarities with more theoretical insights have been demonstrated to be even stronger than supervised deep learning approaches. However, successful lightweight models such as Word Mover's Distance [Kusner et al., 2015] or Smooth Inverse Frequency [Arora et al., 2017] fail to detect differences arising from the structure of sentences, i.e., the order of words. To address this issue, we present the Recursive Optimal Transport (ROT) framework to incorporate structural information into classic OT. Moreover, we further develop Recursive Optimal Similarity (ROTS) for sentences, drawing on semantic insights from the connections between the cosine similarity of weighted averages of word vectors and optimal transport. ROTS is structure-aware and has low time complexity compared to optimal transport. Our experiments over 20 sentence textual similarity (STS) datasets show the clear advantage of ROTS over all weakly supervised approaches. A detailed ablation study demonstrates the effectiveness of ROT and the semantic insights.

preprint2019arXiv

A State-Failure--Network Method to Identify Critical Components in Power Systems

To mitigate the risk of cascading-failure blackouts in power systems, the critical components whose failures lead to high blackout risks should be identified. In this paper, such critical components are identified using the state-failure network (SF-network), which is formed from cascading failure chains and loss data that can be gathered from either utilities or simulations. The failures along the chains are recombined in the SF-network, where each failure is allocated a value that reveals the blackout risk after its occurrence. Critical failures can thus be identified in the SF-network as those that raise blackout risks, and the critical components can then be found based on their critical failure risks. The simulation results validate the effectiveness of the proposed method.

preprint2019arXiv

Experimental creation and annihilation of nonvolatile magnetic skyrmions using voltage control of magnetic anisotropy without an external magnetic field

In this work, we utilize voltage-controlled magnetic anisotropy (VCMA) to manipulate magnetic skyrmions that are fixed in space. Memory devices based on this strategy can potentially have a smaller footprint and better energy efficiency than current-controlled, motion-based skyrmionic devices. To demonstrate VCMA-induced manipulation of skyrmions, we fabricate antiferromagnet/ferromagnet/oxide heterostructure films in which skyrmions can be stabilized without any external magnetic field due to the presence of exchange bias. These isolated skyrmions were annihilated by applying a voltage pulse that increased the perpendicular magnetic anisotropy (PMA). On the other hand, decreasing the PMA promoted the formation of more skyrmions. Furthermore, skyrmions can be created from chiral domains by increasing the PMA of the system. To corroborate our experimental observations, we performed micromagnetic simulations. The proposed method could potentially lead to novel skyrmion-based memory devices.

preprint2019arXiv

Generation and Measurement of High-order Optical Vortex via Cross-Phase

The generation and measurement of optical vortices (OVs) are the basis for a variety of related applications. However, the special case of high-order OVs has not yet been sufficiently addressed. Herein, a method for generating and measuring high-order OVs using the cross-phase (CP) is investigated. In the experiment, we generate OVs with l=60, p=20 and successfully measure OVs with l=200, p=0, and the experimental results agree well with the simulations. On this basis, the intensity distributions of LG and HG beams (corresponding to the generation and measurement, respectively) versus the waist radius of the initial light beams are discussed. This work provides an alternative method to generate or measure high-order OVs, which will facilitate applications in optical micro-manipulation, quantum entanglement and rotation speed detection.