Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
98 works
0 followers
49 topics
4 close collaborators

Published work

98 published items

preprint, 2026, arXiv

Doping induced itinerant ferromagnetism and enhanced ferroelectricity in BL-InSe

The microscopic coexistence of ferroelectricity and ferromagnetism in solids remains a fundamental challenge in condensed matter physics, with far-reaching implications for multifunctional materials and next-generation electronic devices. Using first-principles calculations, we predict emergent sliding ferroelectricity and doping-mediated ferromagnetism in bilayer (BL) InSe. The energetically favored AB stacked BL-InSe spontaneously breaks the out-of-plane mirror symmetry, resulting in a switchable polarization with a saturated component of 0.089 pC/m and a low transition barrier of 28.8 meV per unit cell. Strikingly, low-concentration electrostatic doping enhances rather than suppresses the ferroelectric polarization due to the abnormal layer-dependent electronic occupation in BL-InSe, in contrast to the conventional screening paradigm. In addition, the characteristic Mexican-hat-shaped valence band enables doping-induced itinerant half-metallic ferromagnetism, where the interlayer spin density difference scales linearly with doping concentration and can be reversed by switching the polarization direction. These results demonstrate the coexistence of ferroelectric and ferromagnetic orders in BL-InSe and establish a viable platform for realizing voltage-tunable multiferroicity through stacking and carrier doping in otherwise nonpolar and nonmagnetic semiconductors.

preprint, 2026, arXiv

Engineering Ideal 2D Type-II Nodal Line Semimetals via Stacking and Intercalation of van der Waals Layers

Two-dimensional type-II topological semimetals (TSMs), characterized by strongly tilted Dirac cones, have attracted intense interest for their unconventional electronic properties and exotic transport behaviors. However, rational design remains challenging due to the sensitivity of band tilting to lattice geometry, atomic coordination, and symmetry constraints. Here, we present a bottom-up approach to engineer ideal type-II nodal line semimetals (NLSMs) in van der Waals bilayers via atomic intercalation. Using monolayer $h$-AlN as a prototype, we show that fluorine-intercalated bilayer AlN (F@BL-AlN) hosts a symmetry-protected type-II nodal loop precisely at the Fermi level, enabled by preserved mirror symmetry ($\mathcal{M}_z$) and tailored interlayer hybridization. First-principles calculations reveal that fluorine not only tunes interlayer coupling but also aligns the Fermi energy with the nodal line, stabilizing the type-II NLSM phase. The system exhibits tunable electronic properties under external electric and strain fields and features a van Hove singularity that induces spontaneous ferromagnetism, realizing a ferromagnetic topological semimetal state. This work provides a versatile platform for designing type-II NLSMs and offers practical guidance for their experimental realization.

preprint, 2026, arXiv

InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation

Text and faces are among the most perceptually salient and practically important patterns in visual generation, yet they remain challenging for autoregressive generators built on discrete tokenization. A central bottleneck is the tokenizer: aggressive downsampling and quantization often discard the fine-grained structures needed to preserve readable glyphs and distinctive facial features. We attribute this gap to standard discrete-tokenizer objectives being weakly aligned with text legibility and facial fidelity, as these objectives typically optimize generic reconstruction while compressing diverse content uniformly. To address this, we propose InsightTok, a simple yet effective discrete visual tokenization framework that enhances text and face fidelity through localized, content-aware perceptual losses. With a compact 16k codebook and a 16x downsampling rate, InsightTok significantly outperforms prior tokenizers in text and face reconstruction without compromising general reconstruction quality. These gains consistently transfer to autoregressive image generation in InsightAR, producing images with clearer text and more faithful facial details. Overall, our results highlight the potential of specialized supervision in tokenizer training for advancing discrete image generation.

preprint, 2026, arXiv

Interpretable Machine Learning for Quantum-Informed Property Predictions in Artificial Sensing Materials

Digital sensing faces challenges in developing sustainable methods to extend the applicability of customized e-noses to complex body odor volatilome (BOV). To address this challenge, we developed MORE-ML, a computational framework that integrates quantum-mechanical (QM) property data of e-nose molecular building blocks with machine learning (ML) methods to predict sensing-relevant properties. Within this framework, we expanded our previous dataset, MORE-Q, to MORE-QX by sampling a larger conformational space of interactions between BOV molecules and mucin-derived receptors. This dataset provides extensive electronic binding features (BFs) computed upon BOV adsorption. Analysis of MORE-QX property space revealed weak correlations between QM properties of building blocks and resulting BFs. Leveraging this observation, we defined electronic descriptors of building blocks as inputs for tree-based ML models to predict BFs. Benchmarking showed CatBoost models outperform alternatives, especially in transferability to unseen compounds. Explainable AI methods further highlighted which QM properties most influence BF predictions. Collectively, MORE-ML combines QM insights with ML to provide mechanistic understanding and rational design principles for molecular receptors in BOV sensing. This approach establishes a foundation for advancing artificial sensing materials capable of analyzing complex odor mixtures, bridging the gap between molecular-level computations and practical e-nose applications.

preprint, 2026, arXiv

Normalized Solutions for Schrödinger-Bopp-Podolsky Systems with Critical Choquard-Type Nonlinearity on Bounded Domains

In this paper, we study normalized solutions for the following critical Schrödinger-Bopp-Podolsky system: $$-Δu + q(x)ϕu = λu + |u|^{p-2}u + \bigl(I_α * |u|^{3+α}\bigr)|u|^{1+α}u \quad \text{in } Ω_r,$$ $$-Δϕ + Δ^2ϕ = q(x)u^2 \quad \text{in } Ω_r,$$ where $Ω_r \subset \mathbb{R}^3$ is a smooth bounded domain, $p \in \left(2, \frac{8}{3}\right)$, $q(x) \in C(\bar{Ω}_r) \setminus \{0\}$, and $λ \in \mathbb{R}$ is the Lagrange multiplier associated with the constraint $\int_{Ω_r} |u|^2\, \mathrm{d}x = b^2$ for some $b > 0$. Here $α > 0$, $I_α$ denotes the Riesz potential, and the domain parameter $r$ reflects the size of $Ω_r$, whose precise definition will be given in Section 3. By applying a special minimax principle together with a truncation technique, we prove that there exists $b^* > 0$ such that the system admits multiple normalized solutions whenever $b \in (0, b^*)$ under Navier boundary conditions.

preprint, 2026, arXiv

RecGPT-Mobile: On-Device Large Language Models for User Intent Understanding in Taobao Feed Recommendation

Predicting a user's next search query from recent interaction behaviors is a critical problem in modern e-commerce systems, particularly in scenarios where user intent evolves rapidly. Large Language Models (LLMs) offer strong semantic reasoning capabilities and have recently been adopted to enhance training data construction for next-query prediction. However, due to resource constraints on mobile devices, existing applications are deployed on cloud servers, resulting in high inference costs. In this paper, we propose RecGPT-Mobile, a framework that designs a lightweight LLM-based intent understanding agent to improve recommendation quality in mobile e-commerce scenarios. By deploying LLMs directly on mobile devices, our approach can capture evolving interests of users more quickly and adjust the recommendation results in real time. Extensive offline analyses and online experiments demonstrate that our method significantly improves the accuracy of recommendation results, laying a practical path for LLM deployment in production-scale recommendation systems on mobile devices, as well as a scalable solution for integrating LLMs into real-world next-query prediction systems.

preprint, 2026, arXiv

Thompson Sampling for Repeated Newsvendor

In this paper, we investigate the performance of Thompson Sampling (TS) for online learning with censored feedback, focusing primarily on the classic repeated newsvendor model, a foundational framework in inventory management, and demonstrating how our techniques can be naturally extended to a broader class of problems. We first model demand using a Weibull distribution and initialize TS with a Gamma prior to dynamically adjust order quantities. Our analysis establishes optimal (up to logarithmic factors) frequentist regret bounds for TS without imposing restrictive prior assumptions. More importantly, it yields novel and highly interpretable insights on how TS addresses the exploration-exploitation trade-off in the repeated newsvendor setting. Specifically, our results show that when past order quantities are sufficiently large to overcome censoring, TS accurately estimates the unknown demand parameters, leading to near-optimal ordering decisions. Conversely, when past orders are relatively small, TS automatically increases future order quantities to gather additional demand information. Then, we extend our analysis to a general parametric distribution family and provide a proof for the Bayesian regret. Extensive numerical simulations further demonstrate that TS outperforms more conservative and widely used approaches such as online convex optimization, upper confidence bounds, and myopic Bayesian dynamic programming.
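The loop described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or analysis; it assumes a Weibull demand with known shape $k$ and density $f(d) = θ k d^{k-1} e^{-θ d^k}$, a Gamma(1, 1) prior on the unknown $θ$, and a newsvendor critical ratio supplied by the caller. Under these assumptions the Gamma prior is conjugate even with censoring: a fully observed demand adds 1 to the Gamma shape, and every period adds (observed sales)$^k$ to the Gamma rate. All function names are hypothetical.

```python
import math
import random


def optimal_order(theta, shape, critical_ratio):
    """Newsvendor quantile for demand with CDF F(d) = 1 - exp(-theta * d**shape)."""
    return (-math.log(1.0 - critical_ratio) / theta) ** (1.0 / shape)


def sample_demand(theta, shape, rng):
    """Inverse-CDF sample from the same (assumed) Weibull parametrization."""
    u = rng.random()
    return (-math.log(1.0 - u) / theta) ** (1.0 / shape)


def thompson_newsvendor(true_theta, shape, critical_ratio, periods, seed=0):
    """Run Thompson Sampling with censored sales feedback; return orders and posterior."""
    rng = random.Random(seed)
    a, b = 1.0, 1.0  # Gamma(shape=a, rate=b) prior on theta (illustrative choice)
    orders = []
    for _ in range(periods):
        # random.gammavariate takes (shape, scale), so the scale is 1 / rate.
        theta_sample = rng.gammavariate(a, 1.0 / b)
        order = optimal_order(theta_sample, shape, critical_ratio)
        demand = sample_demand(true_theta, shape, rng)
        sales = min(demand, order)  # only sales are observed, not lost demand
        if demand < order:
            a += 1.0  # uncensored period: the demand itself was observed
        b += sales ** shape  # both cases contribute sales**shape to the rate
        orders.append(order)
    return orders, (a, b)
```

A small sampled $θ$ implies a large order, which makes the next observation less likely to be censored; this is the exploration effect the abstract describes, reproduced here only under the stated assumptions.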

preprint, 2025, arXiv

Jailbreak-Zero: A Path to Pareto Optimal Red Teaming for Large Language Models

This paper introduces Jailbreak-Zero, a novel red teaming methodology that shifts the paradigm of Large Language Model (LLM) safety evaluation from a constrained example-based approach to a more expansive and effective policy-based framework. By leveraging an attack LLM to generate a high volume of diverse adversarial prompts and then fine-tuning this attack model with a preference dataset, Jailbreak-Zero achieves Pareto optimality across the crucial objectives of policy coverage, attack strategy diversity, and prompt fidelity to real user inputs. The empirical evidence demonstrates the superiority of this method, showcasing significantly higher attack success rates against both open-source and proprietary models like GPT-4o and Claude 3.5 when compared to existing state-of-the-art techniques. Crucially, Jailbreak-Zero accomplishes this while producing human-readable and effective adversarial prompts with minimal need for human intervention, thereby presenting a more scalable and comprehensive solution for identifying and mitigating the safety vulnerabilities of LLMs.

preprint, 2024, arXiv

Integrating Secondary Structures Information into Triangular Spatial Relationships (TSR) for Advanced Protein Classification

Protein structures represent the key to deciphering biological functions. The more detailed form of similarity among these proteins is sometimes overlooked by the conventional structural comparison methods. In contrast, further advanced methods, such as Triangular Spatial Relationship (TSR), have been demonstrated to make finer differentiations. Still, the classical implementation of TSR does not provide for the integration of secondary structure information, which is important for a more detailed understanding of the folding pattern of a protein. To overcome these limitations, we developed the SSE-TSR approach. The proposed method integrates secondary structure elements (SSEs) into TSR-based protein representations. This allows an enriched representation of protein structures by considering 18 different combinations of helix, strand, and coil arrangements. Our results show that using SSEs improves the accuracy and reliability of protein classification to varying degrees. We worked with two large protein datasets of 9.2K and 7.8K samples, respectively. We applied the SSE-TSR approach and used a neural network model for classification. Interestingly, introducing SSEs improved performance statistics for Dataset 1, with accuracy moving from 96.0% to 98.3%. For Dataset 2, where the performance statistics were already good, further small improvements were found with the introduction of SSE, giving an accuracy of 99.5% compared to 99.4%. These results show that SSE integration can dramatically improve TSR key discrimination, with significant benefits in datasets with low initial accuracies and only incremental gains in those with high baseline performance. Thus, SSE-TSR is a powerful bioinformatics tool that improves protein classification and understanding of protein function and interaction.

preprint, 2024, arXiv

Joint Channel Estimation and Data Recovery for Millimeter Massive MIMO: Using Pilot to Capture Principal Components

Channel state information (CSI) is important to reap the full benefits of millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems. The traditional channel estimation methods using pilot frames (PF) lead to excessive overhead. To reduce the demand for PF, data frames (DF) can be adopted for joint channel estimation and data recovery. However, the computational complexity of the DF-based methods is prohibitively high. To reduce the computational complexity, we propose a joint channel estimation and data recovery (JCD) method assisted by a small number of PF for mmWave massive MIMO systems. The proposed method has two stages. In Stage 1, differing from the traditional PF-based methods, the proposed PF-assisted method is utilized to capture the angle of arrival (AoA) of principal components (PC) of channels. In Stage 2, JCD is designed for parallel implementation based on the multi-user decoupling strategy. The theoretical analysis demonstrates that the PF-assisted JCD method can achieve equivalent performance to the Bayesian-optimal DF-based method, while greatly reducing the computational complexity. Simulation results are also presented to validate the analytical results.

preprint, 2024, arXiv

LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving

A map, as crucial information for downstream applications of an autonomous driving system, is usually represented in lanelines or centerlines. However, existing literature on map learning primarily focuses on either detecting geometry-based lanelines or perceiving topology relationships of centerlines. Both of these methods ignore the intrinsic relationship of lanelines and centerlines, namely that lanelines bind centerlines. Because simply predicting both types of lane in one model leads to mutually exclusive learning objectives, we advocate lane segment as a new representation that seamlessly incorporates both geometry and topology information. Thus, we introduce LaneSegNet, the first end-to-end mapping network generating lane segments to obtain a complete representation of the road structure. Our algorithm features two key modifications. One is a lane attention module to capture pivotal region details within the long-range feature space. Another is an identical initialization strategy for reference points, which enhances the learning of positional priors for lane attention. On the OpenLane-V2 dataset, LaneSegNet outperforms previous counterparts by a substantial gain across three tasks, i.e., map element detection (+4.8 mAP), centerline perception (+6.9 DET$_l$), and the newly defined one, lane segment perception (+5.6 mAP). Furthermore, it obtains a real-time inference speed of 14.7 FPS. Code is accessible at https://github.com/OpenDriveLab/LaneSegNet.

preprint, 2024, arXiv

Parabolic Anderson model in bounded domains of recurrent metric measure spaces

A metric measure space equipped with a Dirichlet form is called recurrent if its Hausdorff dimension is less than its walk dimension. In bounded domains of such spaces we study the parabolic Anderson models \[ \partial_{t} u(t,x) = Δu(t,x) + βu(t,x) \, \dot{W}_α(t,x) \] where the noise $W_α$ is white in time and colored in space when $α>0$ while for $α=0$ it is also white in space. Both Dirichlet and Neumann boundary conditions are considered. Besides proving existence and uniqueness in the Itô sense we also get precise $L^p$ estimates for the moments and intermittency properties of the solution as a consequence. Our study reveals new exponents which are intrinsically associated to the geometry of the underlying space and the results for instance apply in metric graphs or fractals like the Sierpiński gasket for which we prove scaling invariance properties of the models.

preprint, 2023, arXiv

Personalized Prompt Learning for Explainable Recommendation

Providing user-understandable explanations to justify recommendations could help users better understand the recommended items, increase the system's ease of use, and gain users' trust. A typical approach to realize it is natural language generation. However, previous works mostly adopt recurrent neural networks to this end, leaving the potentially more effective pre-trained Transformer models under-explored. In fact, user and item IDs, as important identifiers in recommender systems, are inherently in a different semantic space from the words that pre-trained models were already trained on. Thus, how to effectively fuse IDs into such models becomes a critical issue. Inspired by recent advances in prompt learning, we come up with two solutions: find alternative words to represent IDs (called discrete prompt learning), and directly input ID vectors to a pre-trained model (termed continuous prompt learning). In the latter case, ID vectors are randomly initialized but the model is trained in advance on large corpora, so they are actually in different learning stages. To bridge the gap, we further propose two training strategies: sequential tuning and recommendation as regularization. Extensive experiments show that our continuous prompt learning approach equipped with the training strategies consistently outperforms strong baselines on three datasets of explainable recommendation.

preprint, 2023, arXiv

Reversible Attack based on Local Visual Adversarial Perturbation

Adding perturbations to images can mislead classification models to produce incorrect results. Recently, researchers exploited adversarial perturbations to protect image privacy from retrieval by intelligent models. However, adding adversarial perturbations to images destroys the original data, making images useless in digital forensics and other fields. To prevent illegal or unauthorized access to sensitive image data such as human faces without impeding legitimate users, the use of reversible adversarial attack techniques is increasing. The original image can be recovered from its reversible adversarial examples. However, existing reversible adversarial attack methods are designed for traditional imperceptible adversarial perturbations and ignore the local visible adversarial perturbation. In this paper, we propose a new method for generating reversible adversarial examples based on local visible adversarial perturbation. The information needed for image recovery is embedded into the area beyond the adversarial patch by the reversible data hiding technique. To reduce image distortion, lossless compression and the B-R-G (blue-red-green) embedding principle are adopted. Experiments on CIFAR-10 and ImageNet datasets show that the proposed method can restore the original images error-free while ensuring good attack performance.

preprint, 2023, arXiv

Rigorous Derivation of the Degenerate Parabolic-Elliptic Keller-Segel System from a Moderately Interacting Stochastic Particle System. Part I Partial Differential Equation

The aim of this paper is to provide the analysis results for the partial differential equations arising from the rigorous derivation of the degenerate parabolic-elliptic Keller-Segel system from a moderately interacting stochastic particle system. The rigorous derivation is divided into two articles. In this paper, we establish the solution theory of the degenerate parabolic-elliptic Keller-Segel problem and its non-local version, which will be used in the second paper for the discussion of the mean-field limit. A parabolic regularized system is introduced to bridge the stochastic particle model and the degenerate Keller-Segel system. We derive the existence of the solution to this regularized system by constructing approximate solutions, giving uniform estimates and taking the limits, where a crucial step is to obtain the $L^\infty$ Bernstein-type estimate for the gradient of the approximate solution. Based on this, we obtain the well-posedness of the corresponding non-local equation through a perturbation method. Finally, the weak solution of the degenerate Keller-Segel system is obtained by using a nonlinear version of the Aubin-Lions lemma.

preprint, 2023, arXiv

Visual Point Cloud Forecasting enables Scalable Autonomous Driving

In contrast to extensive studies on general vision, pre-training for scalable visual autonomous driving remains seldom explored. Visual autonomous driving applications require features encompassing semantics, 3D geometry, and temporal information simultaneously for joint perception, prediction, and planning, posing dramatic challenges for pre-training. To resolve this, we introduce a new pre-training task termed visual point cloud forecasting: predicting future point clouds from historical visual input. The key merit of this task is that it captures the synergistic learning of semantics, 3D structures, and temporal dynamics; hence it shows superiority in various downstream tasks. To cope with this new problem, we present ViDAR, a general model to pre-train downstream visual encoders. It first extracts historical embeddings by the encoder. These representations are then transformed to 3D geometric space via a novel Latent Rendering operator for future point cloud prediction. Experiments show significant gain in downstream tasks, e.g., 3.1% NDS on 3D detection, ~10% error reduction on motion forecasting, and ~15% less collision rate on planning.

preprint, 2022, arXiv

A Data-driven Adversarial Examples Recognition Framework via Adversarial Feature Genome

Adversarial examples pose many security threats to convolutional neural networks (CNNs). Most defense algorithms prevent these threats by finding differences between the original images and adversarial examples. However, the found differences do not contain features about the classes, so these defense algorithms can only detect adversarial examples without recovering the correct labels. In this regard, we propose the Adversarial Feature Genome (AFG), a novel type of data that contains both the differences and features about classes. This method is inspired by an observed phenomenon, namely the Adversarial Feature Separability (AFS), where the difference between the feature maps of the original images and adversarial examples becomes larger with deeper layers. On top of that, we further develop an adversarial example recognition framework that detects adversarial examples and can recover the correct labels. In the experiments, the detection and classification of adversarial examples by AFGs has an accuracy of more than 90.01% in various attack scenarios. To the best of our knowledge, our method is the first that focuses on both detecting attacks and recovering labels. AFG gives a new data-driven perspective to improve the robustness of CNNs. The source code is available at https://github.com/GeoX-Lab/Adv_Fea_Genome.

preprint, 2022, arXiv

A Survey of Surface Defect Detection of Industrial Products Based on A Small Number of Labeled Data

The surface defect detection method based on visual perception has been widely used in industrial quality inspection. Because defect data are not easy to obtain and annotating a large number of defect data wastes considerable manpower and material resources, this paper reviews methods for surface defect detection of industrial products based on a small number of labeled data. These methods are divided into traditional image processing-based methods and deep learning-based methods suitable for a small number of labeled data. The traditional image processing-based industrial product surface defect detection methods are divided into statistical methods, spectral methods and model methods. Deep learning-based industrial product surface defect detection methods suitable for a small number of labeled data are divided into those based on data augmentation, transfer learning, and model-based fine-tuning, as well as semi-supervised, weakly supervised and unsupervised methods.

preprint, 2022, arXiv

An Efficient Cervical Whole Slide Image Analysis Framework Based on Multi-scale Semantic and Location Deep Features

Digital gigapixel whole slide images (WSIs) are widely used in clinical diagnosis, and automated WSI analysis is key for computer-aided diagnosis. Currently, the main approach to WSI-level prediction is to analyze an integrated descriptor of probabilities or feature maps from massive numbers of local patches encoded by a ResNet classifier. Feature representations of the sparse and tiny lesion cells in cervical slides, however, are still challenging, while the unused location representations are available to supplement the semantic classification. This study designs a novel and efficient framework built around a lightweight model, YOLCO (You Only Look Cytology Once), constructed with a new module, InCNet. It directly extracts features inside a single cell (or cluster) rather than, as is traditional, from fixed-size image tiles. The InCNet (Inline Connection Network) enriches the multi-scale connectivity without efficiency loss. The proposal allows the input size to be enlarged to megapixels, so that the WSI can be stitched with the average number of repeats decreased from $10^3\sim10^4$ to $10^1\sim10^2$, collecting features and predictions at two scales. Using a Transformer to classify the integrated multi-scale multi-task WSI features, the experimental results show an AUC score of $0.872$, better than the best conventional model on our dataset ($n$=2,019) from four scanners. The code is available at https://github.com/Chrisa142857/You-Only-Look-Cytopathology-Once, where the deployment version has a speed of $\sim$70 s/WSI.

preprint, 2022, arXiv

Besov class via heat semigroup on Dirichlet spaces III: BV functions and sub-Gaussian heat kernel estimates

With a view toward fractal spaces, by using a Korevaar-Schoen space approach, we introduce the class of bounded variation (BV) functions in a general framework of strongly local Dirichlet spaces with a heat kernel satisfying sub-Gaussian estimates. Under a weak Bakry-Émery curvature type condition, which is new in this setting, this BV class is identified with a heat semigroup based Besov class. As a consequence of this identification, properties of BV functions and associated BV measures are studied in detail. In particular, we prove co-area formulas, global $L^1$ Sobolev embeddings and isoperimetric inequalities. It is shown that for nested fractals or their direct products the BV class we define is dense in $L^1$. The examples of the unbounded Vicsek set, unbounded Sierpinski gasket and unbounded Sierpinski carpet are discussed.

preprint, 2022, arXiv

Combined mean-field and semiclassical limits of large fermionic systems

We study the time-dependent Schrödinger equation for large spinless fermions with the semiclassical scale $\hbar = N^{-1/3}$ in three dimensions. By using the Husimi measure defined by coherent states, we rewrite the Schrödinger equation into a BBGKY type of hierarchy for the $k$-particle Husimi measure. Further estimates are derived to obtain the weak compactness of the Husimi measure, and in addition uniform estimates for the remainder terms in the hierarchy are derived in order to show that in the semiclassical regime the weak limit of the Husimi measure is exactly the solution of the Vlasov equation.

preprint, 2022, arXiv

Conversational Recommendation: A Grand AI Challenge

Animated avatars, which look and talk like humans, are iconic visions of the future of AI-powered systems. Through many sci-fi movies we are acquainted with the idea of speaking to such virtual personalities as if they were humans. Today, we talk more and more to machines like Apple's Siri, e.g., to ask them for the weather forecast. However, when asked for recommendations, e.g., for a restaurant to go to, the limitations of such devices quickly become obvious. They do not engage in a conversation to find out what we might prefer, they often do not provide explanations for what they recommend, and they may have difficulties remembering what was said one minute earlier. Conversational recommender systems promise to address these limitations. In this paper, we review existing approaches to build such systems, which developments we observe today, which challenges are still open and why the development of conversational recommenders represents one of the next grand challenges of AI.

preprint, 2022, arXiv

Decentralized Federated Learning: Balancing Communication and Computing Costs

Decentralized stochastic gradient descent (SGD) is a driving engine for decentralized federated learning (DFL). The performance of decentralized SGD is jointly influenced by inter-node communications and local updates. In this paper, we propose a general DFL framework, which implements both multiple local updates and multiple inter-node communications periodically, to strike a balance between communication efficiency and model consensus. It can provide a general decentralized SGD analytical framework. We establish strong convergence guarantees for the proposed DFL algorithm without the assumption of convex objectives. The convergence rate of DFL can be optimized to achieve the balance of communication and computing costs under constrained resources. For improving communication efficiency of DFL, compressed communication is further introduced to the proposed DFL as a new scheme, named DFL with compressed communication (C-DFL). The proposed C-DFL exhibits linear convergence for strongly convex objectives. Experiment results based on MNIST and CIFAR-10 datasets illustrate the superiority of DFL over traditional decentralized SGD methods and show that C-DFL further enhances communication efficiency.
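The alternation between several local updates and several inter-node communications can be sketched on a toy problem. This is an illustration under simplifying assumptions, not the paper's algorithm: each node $i$ holds a scalar model and the quadratic objective $\frac{1}{2}(x - c_i)^2$, exact gradients stand in for stochastic ones so the run is deterministic, and nodes sit on a ring where one communication step averages each model with its two neighbours. The function names and the ring topology are hypothetical choices.

```python
def ring_gossip(models):
    """One communication step: each node averages with its two ring neighbours."""
    n = len(models)
    return [
        (models[i - 1] + models[i] + models[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


def run_dfl(targets, local_steps, comm_steps, rounds, lr=0.1):
    """Alternate local gradient steps and gossip steps; return the final models."""
    models = [0.0 for _ in targets]
    for _ in range(rounds):
        for _ in range(local_steps):
            # The gradient of 0.5 * (x - c_i)**2 with respect to x is (x - c_i).
            models = [x - lr * (x - c) for x, c in zip(models, targets)]
        for _ in range(comm_steps):
            models = ring_gossip(models)
    return models


def disagreement(models):
    """Sum of squared deviations from the network average (consensus error)."""
    mean = sum(models) / len(models)
    return sum((x - mean) ** 2 for x in models)
```

In this toy setting the network average converges to the minimizer of the summed objectives, while raising `comm_steps` shrinks the residual disagreement between nodes at the price of more communication, which mirrors the trade-off the abstract refers to.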

preprint, 2022, arXiv

Deep Learning with Label Noise: A Hierarchical Approach

Deep neural networks are susceptible to label noise. Existing methods to improve robustness, such as meta-learning and regularization, usually require significant change to the network architecture or careful tuning of the optimization procedure. In this work, we propose a simple hierarchical approach that incorporates a label hierarchy when training the deep learning models. Our approach requires no change of the network architecture or the optimization procedure. We investigate our hierarchical network through a wide range of simulated and real datasets and various label noise types. Our hierarchical approach improves upon regular deep neural networks in learning with label noise. Combining our hierarchical approach with pre-trained models achieves state-of-the-art performance in real-world noisy datasets.

preprint2022arXiv

Design of Coded Caching Schemes with Linear Subpacketizations Based on Injective Arc Coloring of Regular Digraphs

Coded caching is an effective technique to reduce the amount of traffic in the backhaul link. In such a scheme, each file hosted in the server is divided into a number of packets to pursue a low transmission rate based on the delicate design of contents cached into users and broadcast messages. However, the implementation complexity of this scheme increases with the number of packets. It is desirable to design a scheme with a small subpacketization level and a relatively low transmission rate. Recently, the placement delivery array (PDA) was proposed to address the subpacketization bottleneck of coded caching. This paper investigates the design of PDAs from a new perspective, i.e., the injective arc coloring of regular digraphs. It is shown that the injective arc coloring of a regular digraph can yield a PDA with the same number of rows and columns. Based on this, a new class of regular digraphs is defined and upper bounds on the injective chromatic index of such digraphs are derived. Consequently, some new coded caching schemes with a linear subpacketization level and a small transmission rate are proposed, one of which generalizes the existing scheme for the scenario with a more flexible number of users.

preprint2022arXiv

Design of Placement Delivery Arrays for Coded Caching with Small Subpacketizations and Flexible Memory Sizes

Coded caching is an emerging technique to reduce the data transmission load during the peak-traffic times. In such a scheme, each file in the data center or library is usually divided into a number of packets to pursue a low broadcasting rate based on the designed placements at each user's cache. However, the implementation complexity of this scheme increases as the number of packets increases. It is crucial to design a scheme with a small subpacketization level, while maintaining a relatively low transmission rate. It is known that the design of caches in users (i.e., the placement phase) and broadcasting (i.e., the delivery phase) can be unified in one matrix, namely the placement delivery array (PDA). This paper proposes a novel PDA construction by selecting proper orthogonal arrays (POAs), which generalizes some known constructions but with a more flexible memory size. Based on the proposed PDA construction, an effective transformation is further proposed to enable a coded caching scheme to have a smaller subpacketization level. Moreover, two new coded caching schemes with the coded placement are considered. It is shown that the proposed schemes yield a lower subpacketization level and transmission rate over some existing schemes.

preprint2022arXiv

Discovery of extended structure around open cluster COIN-Gaia 13 based on Gaia EDR3

COIN-Gaia 13 is a newly discovered open cluster revealed by Gaia DR2 data. It is a nearby open cluster with a distance of about 513 pc. Using the more accurate five-dimensional astrometric data of Gaia EDR3, we apply the membership assignment algorithm pyUPMASK to determine the membership of COIN-Gaia 13 in a large extended spatial region. We find 478 candidate members. After obtaining reliable cluster members, we further study its basic properties and spatial distribution. Our results show that there is an obvious extended structure of the cluster in the X-Y plane. This elongated structure is distributed along the spiral arm, and the whole length is about 270 pc. The cluster age is 250 Myr, the total mass is about 439 M$_\odot$, and the tidal radius of the cluster is about 11 pc. Since more than half of the member stars (352 stars) are located outside twice the tidal radius, it is suspected that this cluster is undergoing dynamical dissolution. Furthermore, the spatial distribution and kinematic analysis indicate that the extended structure in COIN-Gaia 13 is more likely to be caused by the differential rotation of the Galaxy.

preprint2022arXiv

Enhanced Deep Animation Video Interpolation

Existing learning-based frame interpolation algorithms extract consecutive frames from high-speed natural videos to train the model. Compared to natural videos, cartoon videos usually have a low frame rate. Besides, the motion between consecutive cartoon frames is typically nonlinear, which breaks the linear motion assumption of interpolation algorithms. Thus, it is unsuitable to generate a training set directly from cartoon videos. To better adapt frame interpolation algorithms from natural video to animation video, we present AutoFI, a simple and effective method to automatically render training data for deep animation video interpolation. AutoFI takes a layered architecture to render synthetic data, which ensures the assumption of linear motion. Experimental results show that AutoFI performs favorably in training both DAIN and ANIN. However, most frame interpolation algorithms will still fail in error-prone areas, such as fast motion or large occlusion. Besides AutoFI, we also propose a plug-and-play sketch-based post-processing module, named SktFI, to manually refine the final results using user-provided sketches. With AutoFI and SktFI, the interpolated animation frames show high perceptual quality.

preprint2022arXiv

ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention

Sparse Transformer has recently attracted a lot of attention owing to its ability to reduce the quadratic dependency on the sequence length. We argue that two factors, information bottleneck sensitivity and inconsistency between different attention topologies, could affect the performance of the Sparse Transformer. This paper proposes a well-designed model named ERNIE-Sparse. It consists of two distinctive parts: (i) Hierarchical Sparse Transformer (HST) to sequentially unify local and global information. (ii) Self-Attention Regularization (SAR) method, a novel regularization designed to minimize the distance for transformers with different attention topologies. To evaluate the effectiveness of ERNIE-Sparse, we perform extensive evaluations. Firstly, we perform experiments on a multi-modal long sequence modeling task benchmark, Long Range Arena (LRA). Experimental results demonstrate that ERNIE-Sparse significantly outperforms a variety of strong baseline methods including the dense attention and other efficient sparse attention methods and achieves improvements by 2.77% (57.78% vs. 55.01%). Secondly, to further show the effectiveness of our method, we pretrain ERNIE-Sparse and verify it on 3 text classification and 2 QA downstream tasks, achieving improvements on the classification benchmark by 0.83% (92.46% vs. 91.63%) and on the QA benchmark by 3.24% (74.67% vs. 71.43%). Experimental results continue to demonstrate its superior performance.

preprint2022arXiv

Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks

One major problem in black-box adversarial attacks is the high query complexity in the hard-label attack setting, where only the top-1 predicted label is available. In this paper, we propose a novel geometric-based approach called Tangent Attack (TA), which identifies an optimal tangent point of a virtual hemisphere located on the decision boundary to reduce the distortion of the attack. Assuming the decision boundary is locally flat, we theoretically prove that the minimum $\ell_2$ distortion can be obtained by reaching the decision boundary along the tangent line passing through such tangent point in each iteration. To improve the robustness of our method, we further propose a generalized method which replaces the hemisphere with a semi-ellipsoid to adapt to curved decision boundaries. Our approach is free of pre-training. Extensive experiments conducted on the ImageNet and CIFAR-10 datasets demonstrate that our approach can consume only a small number of queries to achieve the low-magnitude distortion. The implementation source code is released online at https://github.com/machanic/TangentAttack.

preprint2022arXiv

Global weak solutions to the Vlasov-Poisson-Fokker-Planck-Navier-Stokes system

We consider the compressible Vlasov-Poisson-Fokker-Planck-Navier-Stokes system in a three-dimensional bounded domain with nonhomogeneous Dirichlet boundary conditions. The system describes the evolution of an ensemble of charged particles dispersed in an isentropic fluid. For the adiabatic coefficient $\gamma>3/2$, we establish the global existence of weak solutions to this system with arbitrarily large initial and boundary data.

preprint2022arXiv

Impacts of Personal Characteristics on User Trust in Conversational Recommender Systems

Conversational recommender systems (CRSs) imitate human advisors to assist users in finding items through conversations and have recently gained increasing attention in domains such as media and e-commerce. Like in human communication, building trust in human-agent communication is essential given its significant influence on user behavior. However, inspiring user trust in CRSs with a "one-size-fits-all" design is difficult, as individual users may have their own expectations for conversational interactions (e.g., who, user or system, takes the initiative), which are potentially related to their personal characteristics. In this study, we investigated the impacts of three personal characteristics, namely personality traits, trust propensity, and domain knowledge, on user trust in two types of text-based CRSs, i.e., user-initiative and mixed-initiative. Our between-subjects user study (N=148) revealed that users' trust propensity and domain knowledge positively influenced their trust in CRSs, and that users with high conscientiousness tended to trust the mixed-initiative system.

preprint2022arXiv

Learning quantum dissipation by the neural ordinary differential equation

Quantum dissipation arises from the unavoidable coupling between a quantum system and its surrounding environment, which is known as a major obstacle in the quantum processing of information. Beyond establishing its existence, how to trace the dissipation from observational data is a crucial topic that may suggest ways to suppress the dissipation. In this paper, we propose to learn the quantum dissipation from dynamical observations using the neural ordinary differential equation, and then demonstrate this method concretely on two open quantum-spin systems -- a large spin system and a spin-1/2 chain. We also investigate the learning efficiency of the dataset, which provides useful guidance for data acquisition in experiments. Our work promises to facilitate effective modeling and decoherence suppression in open quantum systems.

preprint2022arXiv

Level 2 Autonomous Driving on a Single Device: Diving into the Devils of Openpilot

Equipped with a wide span of sensors, predominant autonomous driving solutions are becoming more modular-oriented for safe system design. Though these sensors have laid a solid foundation, most mass-production solutions to date still remain at the L2 level. Among these, Comma.ai stands out, claiming that one $999 aftermarket device with a single camera and board inside can handle L2 scenarios. Together with open-sourced software of the entire system released by Comma.ai, the project is named Openpilot. Is it possible? If so, how is it made possible? With curiosity in mind, we deep-dive into Openpilot and conclude that its key to success is the end-to-end system design instead of a conventional modular framework. The model is referred to as Supercombo, and it can predict the ego vehicle's future trajectory and other road semantics on the fly from monocular input. Unfortunately, the training process and the massive amount of data needed to make all this work are not publicly available. To enable an intensive investigation, we try to reimplement the training details and test the pipeline on public benchmarks. The refactored network proposed in this work is referred to as OP-Deepdive. For a fair comparison of our version to the original Supercombo, we introduce a dual-model deployment scheme to test the driving performance in the real world. Experimental results on nuScenes, Comma2k19, CARLA, and in-house realistic scenarios verify that a low-cost device can indeed achieve most L2 functionalities and be on par with the original Supercombo model. In this report, we would like to share our latest findings, shed some light on the new perspective of end-to-end autonomous driving from an industrial product-level side, and potentially inspire the community to continue improving the performance. Our code and benchmarks are available at https://github.com/OpenPerceptionX/Openpilot-Deepdive.

preprint2022arXiv

Maximum Flow and Minimum-Cost Flow in Almost-Linear Time

We give an algorithm that computes exact maximum flows and minimum-cost flows on directed graphs with $m$ edges and polynomially bounded integral demands, costs, and capacities in $m^{1+o(1)}$ time. Our algorithm builds the flow through a sequence of $m^{1+o(1)}$ approximate undirected minimum-ratio cycles, each of which is computed and processed in amortized $m^{o(1)}$ time using a new dynamic graph data structure. Our framework extends to algorithms running in $m^{1+o(1)}$ time for computing flows that minimize general edge-separable convex functions to high accuracy. This gives almost-linear time algorithms for several problems including entropy-regularized optimal transport, matrix scaling, $p$-norm flows, and $p$-norm isotonic regression on arbitrary directed acyclic graphs.

preprint2022arXiv

Minimum Coverage Instrumentation

Modern compilers leverage block coverage profile data to carry out downstream profile-guided optimizations to improve the runtime performance and the size of a binary. Given a control-flow graph $G=(V, E)$ of a function in the binary, where nodes in $V$ correspond to basic blocks (sequences of instructions that are always executed sequentially) and edges in $E$ represent jumps in the control flow, the goal is to know for each block $u \in V$ whether $u$ was executed during a session. To this end, extra instrumentation code that records when a block is executed needs to be added to the binary. This extra code creates a time and space overhead, which one would like to minimize as much as possible. Motivated by this application, we study the Minimum Coverage Instrumentation problem, where the goal is to find a minimum size subset of blocks to instrument such that the coverage of the remaining blocks in the graph can be inferred from the coverage status of the instrumented subset. Our main result is an algorithm to find an optimal instrumentation strategy and to carry out the inference in $O(|E|)$ time. We also study variants of this basic problem in which we are interested in learning the coverage of edges instead of the nodes, or when we are only allowed to instrument edges instead of the nodes.

preprint2022arXiv

New insights into the structure of open clusters in the Gaia era

With the help of Gaia data, it is noted that in addition to the core components, there are low-density outer halo components in the extended region of open clusters. To study the extended structure beyond the core radius of the cluster ($\sim$ 10 pc), based on Gaia EDR3 data, taking up to 50 pc as the searching radius, we use the pyUPMASK algorithm to re-determine the member stars of open clusters within 1-2 kpc. We obtain the member stars of 256 open clusters, especially those located in the outer halo region of open clusters. Furthermore, we find that the radial density profile of most open clusters deviates from the King profile in the outer region. To better describe the internal and external structural characteristics of open clusters, we propose a two-component model: core components with a King model distribution and outer halo components with a logarithmic Gaussian distribution. We then suggest using four radii ($r_c$, $r_t$, $r_o$, $r_e$) to describe the structure and distribution profile of star clusters, where $r_t$ and $r_e$ represent the boundaries of the core components and the outer halo components respectively. Finally, we provide a catalog of 256 clusters with structural parameters. In addition, our study shows that the sizes of these radii are statistically linearly related, which indicates that the inner and outer regions of the cluster are interrelated and follow similar evolutionary processes. Further, we show that the two-component structure can be used to better trace the cluster evolution properties in different stages.
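For reference, the King model used for the core component is commonly written in the empirical surface-density form of King (1962). The symbols $k$ (a central density scale) and $n_b$ (a constant background density) are notation assumed here rather than taken from the abstract, and the logarithmic Gaussian halo term is left out because the abstract does not give its parameterization:

$$n(r) = k\left[\frac{1}{\sqrt{1+(r/r_c)^2}} - \frac{1}{\sqrt{1+(r_t/r_c)^2}}\right]^2 + n_b, \qquad r \le r_t$$

Here $r_c$ is the core radius and $r_t$ is the radius at which the core component falls to the background level, matching the role of $r_t$ as the boundary of the core components.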

preprint2022arXiv

New Open Cluster candidates Found in Galactic Disk Using Gaia DR2/EDR3 Data

We report 541 new open cluster candidates in Gaia EDR3 through revisiting the cluster results from an earlier analysis of the Gaia DR2, which revealed nearly a thousand open cluster candidates in the solar neighborhood (mostly d < 3 kpc) residing at Galactic latitudes |b| < 20 degrees. A subsequent comparison with lists of known clusters shows a large increase in the cluster sample within 2 kpc from the Sun. We assign membership probabilities to the stars through the open source pyUPMASK algorithm, and also estimate the physical parameters through isochrone fitting for each candidate. Most of the new candidates show small total proper motion dispersions and clear features in the color-magnitude diagrams. Besides, the metallicity gradient of the new candidates is consistent with those found in the literature. The cluster parameters and member stars are available at CDS via anonymous ftp to cdsarc.u-strasbg.fr (130.79.128.5) or via https://cdsarc.unistra.fr/viz-bin/cat/J/ApJS. The discovery of these new objects shows that the open cluster sample in Gaia data is still not complete, and more discoveries are expected in future research.

preprint2022arXiv

PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark

Methods for 3D lane detection have been recently proposed to address the issue of inaccurate lane layouts in many autonomous driving scenarios (uphill/downhill, bump, etc.). Previous work struggled in complex cases due to their simple designs of the spatial transformation between front view and bird's eye view (BEV) and the lack of a realistic dataset. Towards these issues, we present PersFormer: an end-to-end monocular 3D lane detector with a novel Transformer-based spatial feature transformation module. Our model generates BEV features by attending to related front-view local regions with camera parameters as a reference. PersFormer adopts a unified 2D/3D anchor design and an auxiliary task to detect 2D/3D lanes simultaneously, enhancing the feature consistency and sharing the benefits of multi-task learning. Moreover, we release one of the first large-scale real-world 3D lane datasets: OpenLane, with high-quality annotation and scenario diversity. OpenLane contains 200,000 frames, over 880,000 instance-level lanes, 14 lane categories, along with scene tags and the closed-in-path object annotations to encourage the development of lane detection and more industrial-related autonomous driving methods. We show that PersFormer significantly outperforms competitive baselines in the 3D lane detection task on our new OpenLane dataset as well as Apollo 3D Lane Synthetic dataset, and is also on par with state-of-the-art algorithms in the 2D task on OpenLane. The project page is available at https://github.com/OpenPerceptionX/PersFormer_3DLane and OpenLane dataset is provided at https://github.com/OpenPerceptionX/OpenLane.

preprint2022arXiv

Quasi-periodic oscillations of the X-ray burst from the magnetar SGR J1935+2154 and associated with the fast radio burst FRB 200428

The origin(s) and mechanism(s) of fast radio bursts (FRBs), which are short radio pulses from cosmological distances, have remained a major puzzle since their discovery. We report a strong quasi-periodic oscillation (QPO) of 40 Hz in the X-ray burst from the magnetar SGR J1935+2154 associated with FRB 200428, significantly detected with the Hard X-ray Modulation Telescope (Insight-HXMT) and also hinted at by the Konus-Wind data. QPOs from magnetar bursts have only been rarely detected; our 3.4 sigma (p-value is 2.9e-4) detection of the QPO reported here reveals the strongest QPO signal observed from magnetars (except in some very rare giant flares), making this X-ray burst unique among magnetar bursts. The two X-ray spikes coinciding with the two FRB pulses are also among the peaks of the QPO. Our results suggest that at least some FRBs are related to strong oscillation processes of neutron stars. We also show that we may overestimate the significance of the QPO signal and underestimate the errors of QPO parameters if the QPO exists only in a fraction of the time series of an X-ray burst which we use to calculate the Leahy-normalized periodogram.

preprint2022arXiv

Robust Landmark-based Stent Tracking in X-ray Fluoroscopy

In clinical procedures of angioplasty (i.e., open clogged coronary arteries), devices such as balloons and stents need to be placed and expanded in arteries under the guidance of X-ray fluoroscopy. Due to the limitation of X-ray dose, the resulting images are often noisy. To check the correct placement of these devices, typically multiple motion-compensated frames are averaged to enhance the view. Therefore, device tracking is a necessary procedure for this purpose. Even though angioplasty devices are designed to have radiopaque markers for the ease of tracking, current methods struggle to deliver satisfactory results due to the small marker size and complex scenes in angioplasty. In this paper, we propose an end-to-end deep learning framework for single stent tracking, which consists of three hierarchical modules: U-Net based landmark detection, ResNet based stent proposal and feature extraction, and graph convolutional neural network (GCN) based stent tracking that temporally aggregates both spatial information and appearance features. The experiments show that our method performs significantly better in detection compared with the state-of-the-art point-based tracking models. In addition, its fast inference speed satisfies clinical requirements.

preprint2022arXiv

ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning

Many existing autonomous driving paradigms involve a multi-stage discrete pipeline of tasks. To better predict the control signals and enhance user safety, an end-to-end approach that benefits from joint spatial-temporal feature learning is desirable. While there are some pioneering works on LiDAR-based input or implicit design, in this paper we formulate the problem in an interpretable vision-based setting. In particular, we propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously, which is called ST-P3. Specifically, an egocentric-aligned accumulation technique is proposed to preserve geometry information in 3D space before the bird's eye view transformation for perception; a dual pathway modeling is devised to take past motion variations into account for future prediction; a temporal-based refinement unit is introduced to compensate for recognizing vision-based elements for planning. To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system. We benchmark our approach against previous state-of-the-art methods on both the open-loop nuScenes dataset and the closed-loop CARLA simulation. The results show the effectiveness of our method. Source code, model and protocol details are made publicly available at https://github.com/OpenPerceptionX/ST-P3.

preprint2022arXiv

Tail Quantile Estimation for Non-preemptive Priority Queues

Motivated by applications in computing and telecommunication systems, we investigate the problem of estimating p-quantile of steady-state sojourn times in a single-server multi-class queueing system with non-preemptive priorities for p close to 1. The main challenge in this problem lies in efficient sampling from the tail event. To address this issue, we develop a regenerative simulation algorithm with importance sampling. In addition, we establish a central limit theorem for the estimator to construct the confidence interval. Numerical experiments show that our algorithm outperforms benchmark simulation methods. Our result contributes to the literature on rare event simulation for queueing systems.

preprint2022arXiv

Time-constrained Dynamic Mechanisms for College Admissions

Recent literature shows that dynamic matching mechanisms may outperform the standard mechanisms to deliver desirable results. We highlight an under-explored design dimension, the time constraints that students face under such a dynamic mechanism. First, we theoretically explore the effect of time constraints and show that the outcome can be worse than the outcome produced by the student-proposing deferred acceptance mechanism. Second, we present evidence from the Inner Mongolian university admissions that time constraints can prevent dynamic mechanisms from achieving stable outcomes, creating losers and winners among students.

preprint2021arXiv

BU-Trace: A Permissionless Mobile System for Privacy-Preserving Intelligent Contact Tracing

The coronavirus disease 2019 (COVID-19) pandemic has caused an unprecedented global health crisis. Digital contact tracing, as a transmission intervention measure, has shown its effectiveness in pandemic control. Despite intensive research on digital contact tracing, existing solutions can hardly meet users' requirements on privacy and convenience. In this paper, we propose BU-Trace, a novel permissionless mobile system for privacy-preserving intelligent contact tracing based on QR code and NFC technologies. First, a user study is conducted to investigate and quantify the user acceptance of a mobile contact tracing system. Second, a decentralized system is proposed to enable contact tracing while protecting user privacy. Third, an intelligent behavior detection algorithm is designed to ease the use of our system. We implement BU-Trace and conduct extensive experiments in several real-world scenarios. The experimental results show that BU-Trace achieves a privacy-preserving and intelligent mobile system for contact tracing without requesting location or other privacy-related permissions.

preprint2021arXiv

Dynamical reciprocity in interacting games: numerical results and mechanism analysis

We study the evolution of two mutually interacting games with both pairwise games and the public goods game on different topologies. On 2d square lattices, we reveal that the game-game interaction can promote the cooperation prevalence in all cases, and the cooperation-defection phase transitions even become absent and fairly high cooperation is expected when the interaction becomes very strong. A mean-field theory is developed that points out new dynamical routes arising therein. Detailed analysis shows indeed that there are rich categories of interactions in either individual or bulk scenario: invasion, neutral, and catalyzed types; their combination puts cooperators at a persistent advantage, which boosts the cooperation. The robustness of the revealed reciprocity is strengthened by the studies of model variants, including asymmetrical or time-varying interactions, games of different types, games with time-scale separation, different updating rules etc. The structural complexities of the underlying population, such as Newman--Watts small world networks, Erdős--Rényi random networks, and Barabási--Albert networks, also do not alter the working of the dynamical reciprocity. In particular, as the number of games engaged increases, the cooperation level continuously improves in general. Our work thus uncovers a new class of cooperation mechanism and indicates the great potential for human cooperation where concurrent issues are so often seen in the real world.

preprint2021arXiv

Social hierarchy promotes the cooperation prevalence

Social hierarchy is an important factor that cannot be ignored in human socioeconomic activities and in the animal world. Here we incorporate this factor into the evolutionary game to see what impact it could have on the cooperation outcome. The probabilistic strategy adoption between two players is then not only determined by their payoffs, but also by their hierarchy difference -- players of high rank are more likely to reproduce their strategies than their peers of low rank. Through simulating the evolution of the Prisoner's dilemma game with three hierarchical distributions, we find that the levels of cooperation are enhanced in all cases, and the enhancement is optimal in the uniform case. The enhancement is due to the fact that the presence of hierarchy facilitates the formation of cooperation clusters with high-rank players acting as the nucleation cores. This mechanism remains valid on Barabási-Albert scale-free networks; in particular, the cooperation enhancement is maximal when the hubs are of higher social ranks. We also study a two-hierarchy model, where similar cooperation promotion is revealed and some theoretical analyses are provided. Our finding may partially explain why the social hierarchy is so ubiquitous on this planet.

preprint2021arXiv

Speech enhancement with weakly labelled data from AudioSet

Speech enhancement is a task to improve the intelligibility and perceptual quality of degraded speech signals. Recently, neural network based methods have been applied to speech enhancement. However, many neural network based methods require noisy and clean speech pairs for training. We propose a speech enhancement framework that can be trained with the large-scale weakly labelled AudioSet dataset. Weakly labelled data only contain audio tags of audio clips, but not the onset or offset times of speech. We first apply pretrained audio neural networks (PANNs) to detect anchor segments that contain speech or sound events in audio clips. Then, we randomly mix two detected anchor segments containing speech and sound events as a mixture, and build a conditional source separation network using PANNs predictions as soft conditions for speech enhancement. In inference, we input a noisy speech signal with the one-hot encoding of "Speech" as a condition to the trained system to predict enhanced speech. Our system achieves a PESQ of 2.28 and an SSNR of 8.75 dB on the VoiceBank-DEMAND dataset, outperforming the previous SEGAN system of 2.16 and 7.73 dB respectively.

preprint2020arXiv

A Concise Proof of Discrete Jordan Curve Theorem

This paper gives a concise proof of the Jordan curve theorem on discrete surfaces. We also embed the discrete surface in the 2D plane to prove the original version of the Jordan curve theorem. This paper is a simple version of L. Chen, Note on the discrete Jordan curve theorem (revised version), arXiv:1312.0316. We seek to clarify and simplify some statements and proofs. Again, the purpose of this paper is to make the proof of the theorems easier to understand. In revision 2, we added Appendix B to make a self-contained proof on verifying simple connectedness of the Euclidean plane in this paper. In this revision, we added a special case for the proof of Theorem 3 in Appendix B that was found when we were revising a new paper for high dimensional contraction. It was easy to resolve in 2D. We put it in Appendix C of this paper.

preprint2020arXiv

A Discrete Proof of The General Jordan-Schoenflies Theorem

In the early 1960s, Brown and Mazur proved the general Jordan-Schoenflies theorem. This fundamental theorem states: If we embed an $(n-1)$ sphere $S^{(n-1)}$ locally flatly in an $n$ sphere $S^{n}$, then it decomposes $S^{n}$ into two components. In addition, the embedded $S^{(n-1)}$ is the common boundary of the two components and each component is homeomorphic to the $n$-ball. This paper gives a constructive proof of the theorem using the discrete method. More specifically, we prove the equivalent statements: Let $M$ be an $n$-manifold, which is homeomorphic to $S^{n}$. Then, every $(n-1)$-manifold $S$, a submanifold with local flatness in $M$, decomposes the space $M$ into two components where each component is homeomorphic to an $n$-ball. The method was chosen in order to evaluate the computability and computational costs among operations between cells regarding homeomorphism. In addition, methods within the proof can be extended to applications in design algorithms under the assumption that homeomorphic mappings are constructible and computable. In this new revision, we add some new detailed discussions.

preprint2020arXiv

A Systematic Analysis of the Phase Lags Associated with the Type-C Quasi-periodic Oscillation in GRS 1915+105

We present a systematic analysis of the phase lags associated with the type-C QPOs in GRS 1915+105 using RXTE data. Our sample comprises 620 RXTE observations with type-C QPOs ranging from ~0.4 Hz to ~6.3 Hz. Based on our analysis, we confirm that the QPO phase lags decrease with QPO frequency, and change sign from positive to negative at a QPO frequency of ~2 Hz. In addition, we find that the slope of this relation is significantly different between QPOs below and above 2 Hz. The relation between the QPO lags and QPO rms can be well fitted with a broken line: as the QPO lags go from negative to positive, the QPO rms first increases, reaching its maximum at around zero lag, and then decreases. The phase-lag behaviour of the subharmonic of the QPO is similar to that of the QPO fundamental, where the subharmonic lags decrease with subharmonic frequency and change sign from positive to negative at a subharmonic frequency of ~1 Hz; on the contrary, the second harmonic of the QPO shows a quite different phase-lag behaviour, where all the second harmonics show hard lags that remain more or less constant. For both the QPO and its (sub)harmonics, the slope of the lag-energy spectra shows a similar evolution with frequency as the average phase lags. This suggests that the lag-energy spectra drive the average phase lags. We discuss the possibility for the change in lag sign, and the physical origin of the QPO lags.

preprint2020arXiv

Adversarial Example in Remote Sensing Image Recognition

With the wide application of remote sensing technology in various fields, the accuracy and security requirements for remote sensing image (RSI) recognition are also increasing. In recent years, due to the rapid development of deep learning in the field of image recognition, RSI recognition models based on deep convolutional neural networks (CNNs) outperform traditional hand-crafted feature techniques. However, CNNs also pose security issues even as they show their capability for accurate classification. By adding a very small adversarial perturbation to the input image, the CNN model can be caused to produce erroneous results with extremely high confidence, while the modification of the image is not perceived by the human eye. Such a perturbed image is called an adversarial example, and it poses a serious security problem for systems based on CNN model recognition results. This paper, for the first time, analyzes the adversarial example problem of RSI recognition under CNN models. In the experiments, we used different attack algorithms to fool multiple high-accuracy RSI recognition models trained on multiple RSI datasets. The results show that RSI recognition models are also vulnerable to adversarial examples, and that models with different structures trained on the same RSI dataset have different vulnerabilities. For each RSI dataset, the number of features also affects the vulnerability of the model: a larger number of features helps defend against adversarial examples. Further, we find that the attacked class of RSI has an attack selectivity property. The misclassification of adversarial examples of the RSIs is related to the similarity of the original classes in the CNN feature space. In addition, adversarial examples in RSI recognition are of great significance for the security of remote sensing applications, showing a huge potential for future research.

preprint2020arXiv

Algorithms for Deforming and Contracting Simply Connected Discrete Closed Manifolds (III)

In a recent paper, Algorithms for Deforming and Contracting Simply Connected Discrete Closed Manifolds (II), we discussed two algorithms for deforming and contracting a simply connected discrete closed manifold into a discrete sphere. The first algorithm was a continuation of work that began in Algorithms for Deforming and Contracting Simply Connected Discrete Closed Manifolds (I); the second algorithm contained a more direct treatment of contraction for discrete manifolds. In this paper, we clarify that we can use this same method on standard piecewise linear (PL) complexes on the triangulation of general smooth manifolds. Our discussion is based on triangulation techniques invented by Cairns, Whitehead, and Whitney more than half a century ago. In this paper, we use PL or simplicial complexes to replace certain concepts of discrete manifolds in previous papers. Note that some details in the original papers related to discrete manifolds may also need to be slightly modified for this purpose. In this paper, we use the algorithmic procedure (a contraction process) to prove the following theorem: For a finite triangulation of a simply-connected closed and orientable 3-manifold $M$ in Euclidean space, if a (simply-connected closed) 2-cycle which was made by 2-cells of this triangulation separates $M$ into two connected components, then each of the components will also be simply-connected. In addition, we can algorithmically make $M$ homeomorphic to a 3-sphere. The relationship of the theorem to a generalized Jordan separation problem, the general Jordan-Schoenflies theorem, and other important problems is also discussed. This revision adds more figures and explanations as well as two more special cases. We will post the detailed algorithm/procedure for practical triangulations in the following papers.

preprint2020arXiv

An Integrated Quadratic Reconstruction for Finite Volume Schemes to Scalar Conservation Laws in Multiple Dimensions

We propose a piecewise quadratic reconstruction method in multiple dimensions, in an integrated style, for finite volume schemes applied to scalar conservation laws. This integrated quadratic reconstruction is parameter-free and applicable on flexible grids. We show that the finite volume schemes with the new reconstruction satisfy a local maximum principle provided the time step length is chosen properly. Numerical examples are presented to show that the proposed scheme attains third-order accuracy for smooth solutions in both 2D and 3D cases. The numerical results indicate that the local maximum principle helps prevent overshoots in numerical solutions.

preprint2020arXiv

Automated Intracranial Artery Labeling using a Graph Neural Network and Hierarchical Refinement

Automatically labeling intracranial arteries (ICA) with their anatomical names is beneficial for feature extraction and detailed analysis of intracranial vascular structures. There are significant variations in the ICA due to natural and pathological causes, making it challenging for automated labeling. However, the existing public dataset for evaluation of anatomical labeling is limited. We construct a comprehensive dataset with 729 Magnetic Resonance Angiography scans and propose a Graph Neural Network (GNN) method to label arteries by classifying types of nodes and edges in an attributed relational graph. In addition, a hierarchical refinement framework is developed for further improving the GNN outputs to incorporate structural and relational knowledge about the ICA. Our method achieved a node labeling accuracy of 97.5%, and 63.8% of scans were correctly labeled for all Circle of Willis nodes, on a testing set of 105 scans with both healthy and diseased subjects. This is a significant improvement over available state-of-the-art methods. Automatic artery labeling is promising to minimize manual effort in characterizing the complicated ICA networks and provides valuable information for the identification of geometric risk factors of vascular disease. Our code and dataset are available at https://github.com/clatfd/GNN-ARTLABEL.

preprint2020arXiv

Besov class via heat semigroup on Dirichlet spaces I: Sobolev type inequalities

We introduce heat semigroup-based Besov classes in the general framework of Dirichlet spaces. General properties of those classes are studied and quantitative regularization estimates for the heat semigroup in this scale of spaces are obtained. As a highlight of the paper, we obtain a far reaching $L^p$-analogue, $p \ge 1$, of the Sobolev inequality that was proved for $p=2$ by N. Varopoulos under the assumption of ultracontractivity for the heat semigroup. The case $p=1$ is of special interest since it yields isoperimetric type inequalities.

preprint2020arXiv

Besov class via heat semigroup on Dirichlet spaces II: BV functions and Gaussian heat kernel estimates

We introduce the class of bounded variation (BV) functions in a general framework of strictly local Dirichlet spaces with doubling measure. Under the 2-Poincaré inequality and a weak Bakry-Émery curvature type condition, this BV class is identified with the heat semigroup based Besov class $\mathbf{B}^{1,1/2}(X)$ that was introduced in our previous paper. Assuming furthermore a quasi Bakry-Émery curvature type condition, we identify the Sobolev class $W^{1,p}(X)$ with $\mathbf{B}^{p,1/2}(X)$ for $p>1$. Consequences of those identifications in terms of isoperimetric and Sobolev inequalities with sharp exponents are given.

preprint2020arXiv

Blurry Video Frame Interpolation

Existing works reduce motion blur and up-convert frame rate through two separate ways, including frame deblurring and frame interpolation. However, few studies have approached the joint video enhancement problem, namely synthesizing high-frame-rate clear results from low-frame-rate blurry inputs. In this paper, we propose a blurry video frame interpolation method to reduce motion blur and up-convert frame rate simultaneously. Specifically, we develop a pyramid module to cyclically synthesize clear intermediate frames. The pyramid module features adjustable spatial receptive field and temporal scope, thus contributing to controllable computational complexity and restoration ability. Besides, we propose an inter-pyramid recurrent module to connect sequential models to exploit the temporal relationship. The pyramid module integrates a recurrent module, thus can iteratively synthesize temporally smooth results without significantly increasing the model size. Extensive experimental results demonstrate that our method performs favorably against state-of-the-art methods.

preprint2020arXiv

Composite Signalling for DFRC: Dedicated Probing Signal or Not?

Dual-functional radar-communication (DFRC) is a promising new solution to simultaneously probe the radar target and transmit information in wireless networks. In this paper, we study the joint optimization of transmit and receive beamforming for the DFRC system. Specifically, the signal to interference plus noise ratio (SINR) of the radar is maximized under the SINR constraints of the communication user (CU), which characterizes the optimal tradeoff between radar and communication. In addition to simply using the communication signal for target probing, we further consider exploiting dedicated probing signals to enhance the radar sensing performance. We commence by studying the single-CU scenario, where a closed-form solution to the beamforming design problem is provided. It is then proved that a dedicated radar probing signal is not needed. As a further step, we consider a more complicated multi-CU scenario, where the beamforming design is formulated as a non-convex quadratically constrained quadratic program. The optimal solutions are obtained by applying semidefinite relaxation with guaranteed rank-1 property. It is shown that under the multi-CU scenario, the dedicated probing signal should be employed to improve the radar performance at the cost of implementing an additional interference cancellation at the CU. Finally, numerical simulations are provided to verify the effectiveness of the proposed algorithm.

preprint2020arXiv

Construction of a series of new $ν=2/5$ fractional quantum Hall wave functions by conformal field theory

In this paper, a series of $ν=2/5$ fractional quantum Hall wave functions are constructed from conformal field theory (CFT). They share the same topological properties with states constructed by Jain's composite fermion approach. Upon exact lowest Landau level (LLL) projection, some of Jain composite fermion states would not survive if constraints on Landau level indices given in the appendices of this paper were not satisfied. By contrast, states constructed from CFT always stay in LLL. These states are characterized by different topological shifts and multibody relative angular momenta. As a by-product, in the appendices we prove the necessary conditions for general $ν=p/(2p+1)$ composite fermion states to have nonvanishing LLL projection.

preprint2020arXiv

Content Adaptive and Error Propagation Aware Deep Video Compression

Recently, learning based video compression methods have attracted increasing attention. However, previous works suffer from error propagation due to the accumulation of reconstruction error in inter predictive coding. Meanwhile, previous learning based video codecs are also not adaptive to different video contents. To address these two problems, we propose a content adaptive and error propagation aware video compression system. Specifically, our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame. Based on the learned long-term temporal information, our approach effectively alleviates error propagation in reconstructed frames. More importantly, instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system. The proposed approach updates the parameters for encoder according to the rate-distortion criterion but keeps the decoder unchanged in the inference stage. Therefore, the encoder is adaptive to different video contents and achieves better compression performance by reducing the domain gap between the training and testing datasets. Our method is simple yet effective and outperforms the state-of-the-art learning based video codecs on benchmark datasets without increasing the model size or decreasing the decoding speed.

preprint2020arXiv

Convolution Neural Network Architecture Learning for Remote Sensing Scene Classification

Remote sensing image scene classification is a fundamental but challenging task in understanding remote sensing images. Recently, deep learning-based methods, especially convolutional neural network-based (CNN-based) methods, have shown enormous potential to understand remote sensing images. CNN-based methods succeed by utilizing features learned from data rather than features designed manually. The feature-learning procedure of a CNN largely depends on its architecture. However, most of the CNN architectures used for remote sensing scene classification are still designed by hand, which demands a considerable amount of architecture engineering skill and domain knowledge, and may not realize the CNN's full potential on a specific dataset. In this paper, we propose an automatic architecture learning procedure for remote sensing scene classification. We design a parameter space in which every set of parameters represents a certain CNN architecture (i.e., some parameters represent the type of operators used in the architecture such as convolution, pooling, no connection or identity, and the others represent how these operators connect). To discover the optimal set of parameters for a given dataset, we introduce a learning strategy that allows efficient search in the architecture space by means of gradient descent. An architecture generator finally maps the set of parameters into the CNN used in our experiments.

preprint2020arXiv

Discovery of oscillations above 200 keV in a black hole X-ray binary with Insight-HXMT

Low-frequency quasi-periodic oscillations (LFQPOs) are commonly found in black hole X-ray binaries, and their origin is still under debate. The properties of LFQPOs at high energies (above 30 keV) are closely related to the nature of the accretion flow in the innermost regions, and thus play a crucial role in critically testing various theoretical models. The Hard X-ray Modulation Telescope (Insight-HXMT) is capable of detecting emissions above 30 keV, and is therefore an ideal instrument to do so. Here we report the discovery of LFQPOs above 200 keV in the new black hole MAXI J1820+070 in the X-ray hard state, which allows us to understand the behaviours of LFQPOs at hundreds of kiloelectronvolts. The phase lag of the LFQPO is constant around zero below 30 keV, and becomes a soft lag (that is, the high-energy photons arrive first) above 30 keV. The soft lag gradually increases with energy and reaches ~0.9s in the 150-200 keV band. The detection at energies above 200 keV, the large soft lag and the energy-related behaviors of the LFQPO pose a great challenge for most currently existing models, but suggest that the LFQPO probably originates from the precession of a small-scale jet.

preprint2020arXiv

Exploring open cluster properties with Gaia and LAMOST

Gaia DR2 reaches an unprecedented level of precision, sub-mas for astrometry and mmag for photometry. Using cluster members identified with this astrometry and photometry in Gaia DR2, we can obtain a reliable determination of cluster properties. However, because of the shortcomings of Gaia spectroscopic observations in densely crowded cluster regions, radial velocity and metallicity measurements for cluster member stars from Gaia DR2 are still lacking. In this study, we aim to improve the cluster properties by incorporating LAMOST spectra. In particular, we provide the list of cluster members with spectroscopic parameters as a value-added catalog in LAMOST DR5, which can be used to perform detailed studies for a better understanding of the stellar properties, by using their spectra and fundamental properties from the host cluster. We cross-matched the spectroscopic catalog in LAMOST DR5 with the identified cluster members in Cantat-Gaudin et al. 2018 and then used members with spectroscopic parameters to derive statistical properties of open clusters. We obtained a list of 8811 members with spectroscopic parameters and a catalog of 295 cluster properties. In addition, we study the radial and vertical metallicity gradient and age-metallicity relation with the compiled open clusters as tracers, finding slopes of -0.053$\pm$0.004 dex kpc$^{-1}$, -0.252$\pm$0.039 dex kpc$^{-1}$ and 0.022$\pm$0.008 dex Gyr$^{-1}$, respectively. Both slopes of metallicity distribution relation for young clusters (0.1 Gyr < Age < 2 Gyr) and the age-metallicity relation for clusters within 6 Gyr are consistent with literature results. In order to fully study the chemical evolution history in the disk, more spectroscopic observations for old and distant open clusters are needed for further investigation.

preprint2020arXiv

Fast Dynamic Cuts, Distances and Effective Resistances via Vertex Sparsifiers

We present a general framework for designing efficient dynamic approximate algorithms for optimization on undirected graphs. In particular, we develop a technique that, given any problem that admits a certain notion of vertex sparsifiers, gives data structures that maintain approximate solutions in sub-linear update and query time. We illustrate the applicability of our paradigm to the following problems. (1) A fully-dynamic algorithm that approximates all-pair maximum-flows/minimum-cuts up to a nearly logarithmic factor in $\tilde{O}(n^{2/3})$ amortized time against an oblivious adversary, and $\tilde{O}(m^{3/4})$ time against an adaptive adversary. (2) An incremental data structure that maintains $O(1)$-approximate shortest path in $n^{o(1)}$ time per operation, as well as fully dynamic approximate all-pair shortest path and transshipment in $\tilde{O}(n^{2/3+o(1)})$ amortized time per operation. (3) A fully-dynamic algorithm that approximates all-pair effective resistance up to a $(1+ε)$ factor in $\tilde{O}(n^{2/3+o(1)} ε^{-O(1)})$ amortized update time per operation. The key tool behind result (1) is the dynamic maintenance of an algorithmic construction due to Madry [FOCS '10], which partitions a graph into a collection of simpler graph structures (known as j-trees) and approximately captures the cut-flow and metric structure of the graph. The $O(1)$-approximation guarantee of (2) is by adapting the distance oracles by [Thorup-Zwick JACM '05]. Result (3) is obtained by invoking the random-walk based spectral vertex sparsifier by [Durfee et al. STOC '19] in a hierarchical manner, while carefully keeping track of the recourse among levels in the hierarchy.

preprint2020arXiv

Fast Kernel k-means Clustering Using Incomplete Cholesky Factorization

Kernel-based clustering algorithms can identify and capture the non-linear structure in datasets, and thereby can achieve better performance than linear clustering. However, computing and storing the entire kernel matrix requires so much memory that it is difficult for kernel-based clustering to deal with large-scale datasets. In this paper, we employ incomplete Cholesky factorization to accelerate kernel clustering and save memory space. The key idea of the proposed kernel $k$-means clustering using incomplete Cholesky factorization is that we approximate the entire kernel matrix by the product of a low-rank matrix and its transpose. Then linear $k$-means clustering is applied to columns of the transpose of the low-rank matrix. We show both analytically and empirically that the performance of the proposed algorithm is similar to that of the kernel $k$-means clustering algorithm, but our method can deal with large-scale datasets.
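The idea in the abstract above can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes an RBF kernel, a greedy pivoted incomplete Cholesky factorization, and a plain Lloyd-style $k$-means with farthest-first initialization. The function names `incomplete_cholesky` and `kmeans`, and the parameters `gamma`, `max_rank` and `tol`, are hypothetical choices made for this example.

```python
import numpy as np

def incomplete_cholesky(X, gamma, max_rank, tol=1e-6):
    """Pivoted incomplete Cholesky of the RBF kernel matrix K, so that K ~= G @ G.T.

    Only one kernel column is computed per step; the full n x n matrix is never formed.
    """
    n = X.shape[0]
    G = np.zeros((n, max_rank))
    diag = np.ones(n)  # residual diagonal; K[i, i] = 1 for the RBF kernel
    for j in range(max_rank):
        i = int(np.argmax(diag))          # pivot: largest residual diagonal entry
        if diag[i] <= tol:                # residual is small enough, stop early
            return G[:, :j]
        col = np.exp(-gamma * np.sum((X - X[i]) ** 2, axis=1))  # kernel column K[:, i]
        G[:, j] = (col - G[:, :j] @ G[i, :j]) / np.sqrt(diag[i])
        diag = np.maximum(diag - G[:, j] ** 2, 0.0)
    return G

def kmeans(Z, k, n_iter=50):
    """Plain Lloyd k-means on the rows of Z with farthest-first initialization."""
    centers = [Z[0]]
    d2 = ((Z - Z[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        nxt = int(np.argmax(d2))          # point farthest from all chosen centers
        centers.append(Z[nxt])
        d2 = np.minimum(d2, ((Z - Z[nxt]) ** 2).sum(axis=1))
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

# Toy data: two well-separated blobs (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(10.0, 0.5, (30, 2))])
G = incomplete_cholesky(X, gamma=0.05, max_rank=20)  # rows of G = columns of G.T
labels = kmeans(G, k=2)
```

Running linear $k$-means on the rows of `G` costs memory proportional to the number of points times the chosen rank, rather than the square of the number of points.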

preprint2020arXiv

FFusionCGAN: An end-to-end fusion method for few-focus images using conditional GAN in cytopathological digital slides

Multi-focus image fusion technologies compress different focus depth images into an image in which most objects are in focus. However, although existing image fusion techniques, including traditional algorithms and deep learning-based algorithms, can generate high-quality fused images, they need multiple images with different focus depths in the same field of view. This criterion may not be met in some cases where time efficiency is required or the hardware is insufficient. The problem is especially prominent in large-size whole slide images. This paper focused on the multi-focus image fusion of cytopathological digital slide images, and proposed a novel method for generating fused images from single-focus or few-focus images based on conditional generative adversarial network (GAN). Through the adversarial learning of the generator and discriminator, the method is capable of generating fused images with clear textures and large depth of field. Combined with the characteristics of cytopathological images, this paper designs a new generator architecture combining U-Net and DenseBlock, which can effectively improve the network's receptive field and comprehensively encode image features. Meanwhile, this paper develops a semantic segmentation network that identifies the blurred regions in cytopathological images. By integrating the network into the generative model, the quality of the generated fused images is effectively improved. Our method can generate fused images from only single-focus or few-focus images, thereby avoiding the problem of collecting multiple images of different focus depths with increased time and hardware costs. Furthermore, our model is designed to learn the direct mapping of input source images to fused images without the need to manually design complex activity level measurements and fusion rules as in traditional methods.

preprint2020arXiv

Flow by Gauss curvature to Dual Orlicz-Minkowski problems

In this paper we study a normalised anisotropic Gauss curvature flow of strictly convex, closed hypersurfaces in the Euclidean space $\mathbb{R}^{n+1}$. We prove that the flow exists for all time and converges smoothly to the unique, strictly convex solution of a Monge-Ampère type equation. Our argument provides a parabolic proof in the smooth category for the existence of solutions to the Dual Orlicz-Minkowski problem introduced by Zhu, Xing and Ye.

preprint2020arXiv

Frozen Patterns of Impacted Droplets: From Conical Tips to Toroidal Shapes

We report frozen patterns for the water droplets impacting on a cold substrate through fast-speed images. These patterns can be manipulated by several physical parameters (the droplet size, falling height, and substrate temperature), and the scaling analysis has a remarkable agreement with the phase diagram. The observed double-concentric toroidal shape is attributed to the correlation between the impacting dynamics and freezing process, as confirmed by the spatiotemporal evolution of the droplet temperature, the identified timescale associated with the morphology and solidification ($t_{inn}\simeq τ_{sol}$), and the ice front-advection model. These results for frozen patterns provide insight into the complex interplay of the rapid impacting hydrodynamics, the transient heat transfer, and the intricate solidification process.

preprint2020arXiv

Fully nonlinear equations of Krylov type on Riemannian manifolds with negative curvature

In this paper, we consider fully nonlinear equations of Krylov type on Riemannian manifolds with negative curvature which naturally arise in conformal geometry. Moreover, we prove the a priori estimates for solutions to these equations and establish the existence results. Our results can be viewed as an extension of previous results given by Gursky-Viaclovsky and Li-Sheng.

preprint2020arXiv

Horo-convex hypersurfaces with prescribed shifted Gauss curvatures in $\mathbb{H}^{n+1}$

In this paper, we consider prescribed shifted Gauss curvature equations for horo-convex hypersurfaces in $\mathbb{H}^{n+1}$. Under some sufficient condition, we obtain an existence result by the standard degree theory based on the a priori estimates for the solutions to the equations. In contrast to the prescribed Weingarten curvature problem in space forms, we do not impose a sign condition on the radial derivative of the functions on the right-hand side of the equations to prove existence, owing to the horo-convexity of hypersurfaces in $\mathbb{H}^{n+1}$.

preprint2020arXiv

LAMOST Medium-Resolution Spectroscopic Survey (LAMOST-MRS): Scientific goals and survey plan

Since September 2018, LAMOST has been conducting a new 5-year medium-resolution spectroscopic survey (MRS) using bright/gray nights. We present the scientific goals of LAMOST-MRS and propose a near optimistic strategy of the survey. A complete footprint is also provided. In addition to the regular medium-resolution survey, a time-domain spectroscopic survey has been under way since 2018 and will end in 2023. According to the detailed survey plan, we expect that LAMOST-MRS can observe about 2 million stellar spectra with ~7500 and limiting magnitude of around G=15 mag. Moreover, it will also provide about 200 thousand stars with, on average, 60 epochs of observations and a limiting magnitude of G~14 mag. These high quality spectra will give around 20 elemental abundances, rotational velocities, emission line profiles as well as precise radial velocities with uncertainties less than 1 km/s. With these data, we expect that LAMOST can effectively advance science in stellar physics, e.g. exotic binary stars, detailed observation of many types of variable stars etc., planet host stars, emission nebulae, open clusters, young pre-main-sequence stars etc.

preprint2020arXiv

Large-scale Real-time Personalized Similar Product Recommendations

Similar product recommendation is one of the most common scenes in e-commerce. Many recommendation algorithms such as item-to-item Collaborative Filtering are working on measuring item similarities. In this paper, we introduce our real-time personalized algorithm to model product similarity and real-time user interests. We also introduce several other baseline algorithms including an image-similarity-based method, item-to-item collaborative filtering, and item2vec, and compare them on our large-scale real-world e-commerce dataset. The algorithms which achieve good offline results are also tested on the online e-commerce website. Our personalized method achieves a 10% improvement on the add-cart number in the real-world e-commerce scenario.

preprint2020arXiv

On the global classical solution to compressible Euler system with singular velocity alignment

We consider a compressible Euler system with singular velocity alignment, known as the Euler-alignment system, describing the flocking behaviors of large animal groups. We establish a local well-posedness theory for the system, as well as a global well-posedness theory for small initial data. We also show the asymptotic flocking behavior, where solutions converge to a constant steady state exponentially in time.

preprint2020arXiv

SCAttNet: Semantic Segmentation Network with Spatial and Channel Attention Mechanism for High-Resolution Remote Sensing Images

High-resolution remote sensing images (HRRSIs) contain substantial ground object information, such as texture, shape, and spatial location. Semantic segmentation, which is an important task for element extraction, has been widely used in processing mass HRRSIs. However, HRRSIs often exhibit large intraclass variance and small interclass variance due to the diversity and complexity of ground objects, thereby bringing great challenges to a semantic segmentation task. In this paper, we propose a new end-to-end semantic segmentation network, which integrates lightweight spatial and channel attention modules that can refine features adaptively. We compare our method with several classic methods on the ISPRS Vaihingen and Potsdam datasets. Experimental results show that our method can achieve better semantic segmentation results. The source codes are available at https://github.com/lehaifeng/SCAttNet.

preprint2020arXiv

Spin squeezing in a spin-orbit coupled Bose-Einstein condensate

We study the spin squeezing in a spin-1/2 Bose-Einstein condensate (BEC) with Raman induced spin-orbit coupling (SOC). Under the condition of two-photon resonance and weak Raman coupling strength, the system possesses two degenerate ground states, using which we construct an effective two-mode model. The Hamiltonian of the two-mode model takes the form of the one-axis-twisting Hamiltonian which is known to generate spin squeezing. More importantly, we show that the SOC provides a convenient control knob to adjust the spin nonlinearity responsible for spin squeezing. Specifically, the spin nonlinearity strength can be tuned to be comparable to the two-body density-density interaction, hence is much larger than the intrinsic spin-dependent interaction strength in conventional two-component BEC systems such as $^{87}$Rb and $^{23}$Na in the absence of the SOC. We confirm the spin squeezing by carrying out a fully beyond-mean-field numerical calculation using the truncated Wigner method. Additionally, the experimental implementation is also discussed.

preprint2020arXiv

Stable Sparse Subspace Embedding for Dimensionality Reduction

Sparse random projection (RP) is a popular tool for dimensionality reduction that shows promising performance with low computational complexity. However, in the existing sparse RP matrices, the positions of non-zero entries are usually randomly selected. Although they adopt uniform sampling with replacement, due to large sampling variance, the number of non-zeros is uneven among rows of the projection matrix which is generated in one trial, and more data information may be lost after dimension reduction. To break this bottleneck, based on random sampling without replacement in statistics, this paper builds a stable sparse subspace embedded matrix (S-SSE), in which non-zeros are uniformly distributed. It is proved that the S-SSE is stabler than the existing matrix, and it can maintain Euclidean distance between points well after dimension reduction. Our empirical studies corroborate our theoretical findings and demonstrate that our approach can indeed achieve satisfactory performance.
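The contrast between sampling with and without replacement described above can be sketched as follows. This is one plausible reading of the abstract, not the paper's exact S-SSE construction: it assumes one nonzero entry of value ±1 per column, and it chooses the row of each nonzero by concatenating random permutations of the row indices so that every row receives nearly the same number of nonzeros. The function name `balanced_sparse_embedding` and its parameters are hypothetical.

```python
import numpy as np

def balanced_sparse_embedding(d, m, seed=0):
    """Build an m x d sparse projection with one +/-1 entry per column.

    Row positions come from concatenated random permutations of range(m)
    (sampling without replacement), so row counts differ by at most one.
    A dense array is used here only for readability.
    """
    rng = np.random.default_rng(seed)
    n_blocks = -(-d // m)  # ceil(d / m) permutations are enough to cover d columns
    rows = np.concatenate([rng.permutation(m) for _ in range(n_blocks)])[:d]
    signs = rng.choice([-1.0, 1.0], size=d)
    S = np.zeros((m, d))
    S[rows, np.arange(d)] = signs
    return S

# Reduce 1000-dimensional data to 50 dimensions (illustrative sizes).
S = balanced_sparse_embedding(d=1000, m=50)
row_counts = (S != 0).sum(axis=1)   # nonzeros per row
x = np.random.default_rng(1).normal(size=1000)
y = S @ x                           # embedded vector
```

With uniform sampling with replacement, the per-row counts would fluctuate from trial to trial; here they are fixed by construction, which is the evenness property the abstract refers to.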

preprint2020arXiv

Uniform Interpolation Constrained Geodesic Learning on Data Manifold

In this paper, we propose a method to learn a minimizing geodesic within a data manifold. Along the learned geodesic, our method can generate high-quality interpolations between two given data samples. Specifically, we use an autoencoder network to map data samples into latent space and perform interpolation via an interpolation network. We add prior geometric information to regularize our autoencoder for the convexity of representations so that for any given interpolation approach, the generated interpolations remain within the distribution of the data manifold. Before the learning of a geodesic, a proper Riemannian metric should be defined. Therefore, we induce a Riemannian metric by the canonical metric in the Euclidean space in which the data manifold is isometrically immersed. Based on this defined Riemannian metric, we introduce a constant speed loss and a minimizing geodesic loss to regularize the interpolation network to generate uniform interpolation along the learned geodesic on the manifold. We provide a theoretical analysis of our model and use image translation as an example to demonstrate the effectiveness of our method.
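
The idea behind a constant speed loss and a minimizing geodesic loss can be illustrated on a discretized curve. The sketch below is an assumption about how such penalties might look, not the paper's actual loss functions: with a metric induced from the Euclidean ambient space, the length of a short segment is approximated by the Euclidean distance between consecutive points, so uniform interpolation means equal segment lengths and a minimizing geodesic means small total length. The function names are hypothetical.

```python
import numpy as np

def segment_lengths(points):
    """Euclidean lengths of consecutive segments of a discretized curve."""
    return np.linalg.norm(np.diff(points, axis=0), axis=1)

def constant_speed_penalty(points):
    """Variance of segment lengths; zero when the interpolation is uniform."""
    return float(np.var(segment_lengths(points)))

def path_length(points):
    """Total discrete curve length; smaller values are closer to a geodesic."""
    return float(np.sum(segment_lengths(points)))

if __name__ == "__main__":
    t_uniform = np.linspace(0.0, 1.0, 6)
    t_uneven = t_uniform ** 2                      # same endpoints, uneven spacing
    direction = np.array([3.0, 4.0])
    for name, t in (("uniform", t_uniform), ("uneven", t_uneven)):
        curve = t[:, None] * direction
        print(name, constant_speed_penalty(curve), path_length(curve))
```

Both curves trace the same straight segment and have the same length, but only the uniformly parameterized one has a vanishing speed penalty, which is why the two regularizers are complementary.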

preprint2020arXiv

Uniqueness of solutions to Lp-Christoffel-Minkowski problem for p<1

The $L_p$-Christoffel-Minkowski problem arises naturally in the $L_p$-Brunn-Minkowski theory. It connects both curvature measures and area measures of convex bodies and is a fundamental problem in convex geometric analysis. Owing to the lack of Firey's extension of the Brunn-Minkowski inequality and of a constant rank theorem for $p<1$, the existence and uniqueness of the $L_p$-Brunn-Minkowski problem are difficult problems. In this paper, we prove a uniqueness theorem for solutions to the $L_p$-Christoffel-Minkowski problem with $p<1$ and constant prescribed data. Our proof is motivated by the idea of Brendle-Choi-Daskalopoulos's work on the asymptotic behavior of flows by powers of the Gaussian curvature. One of the highlights of our arguments is that we introduce a new auxiliary function $Z$, which is the key to our proof.

preprint2020arXiv

User Validation of Recommendation Serendipity Metrics

Though it has been recognized that recommending serendipitous (i.e., surprising and relevant) items can be helpful for increasing users' satisfaction and behavioral intention, how to measure serendipity in the offline environment is still an open issue. In recent years, a number of metrics have been proposed, but most of them were based on researchers' assumptions because of serendipity's subjective nature. In order to validate these metrics' actual performance, we collected real feedback data from over 10,000 users and compared it with the metrics' results. It turns out that the user-profile-based metrics, especially content-based ones, perform better than those based on item popularity in terms of estimating the unexpectedness facet of recommendations. Moreover, the full metrics, which involve the unexpectedness component, relevance, timeliness, and user curiosity, can more accurately indicate a recommendation's serendipity degree than those that involve only some of them. The application of these metrics to several recommender algorithms further consolidates their practical usage, because the comparison results are consistent with those from user evaluation. Thus, this work helps fill the gap between offline measurement and user study on recommendation serendipity.

preprint2020arXiv

Vertex nomination: The canonical sampling and the extended spectral nomination schemes

Suppose that one particular block in a stochastic block model is of interest, but block labels are only observed for a few of the vertices in the network. Utilizing a graph realized from the model and the observed block labels, the vertex nomination task is to order the vertices with unobserved block labels into a ranked nomination list with the goal of having an abundance of interesting vertices near the top of the list. There are vertex nomination schemes in the literature, including the optimally precise canonical nomination scheme $\mathcal{L}^C$ and the consistent spectral partitioning nomination scheme $\mathcal{L}^P$. While the canonical nomination scheme $\mathcal{L}^C$ is provably optimally precise, it is computationally intractable, being impractical to implement even on modestly sized graphs. With this in mind, an approximation of the canonical scheme, denoted the canonical sampling nomination scheme $\mathcal{L}^{CS}$, is introduced; $\mathcal{L}^{CS}$ relies on a scalable, Markov chain Monte Carlo-based approximation of $\mathcal{L}^{C}$, and converges to $\mathcal{L}^{C}$ as the amount of sampling goes to infinity. The spectral partitioning nomination scheme is also extended to the extended spectral partitioning nomination scheme, $\mathcal{L}^{EP}$, which introduces a novel semisupervised clustering framework to improve upon the precision of $\mathcal{L}^P$. Real-data and simulation experiments are employed to illustrate the precision of these vertex nomination schemes, as well as their empirical computational complexity. Keywords: vertex nomination, Markov chain Monte Carlo, spectral partitioning, Mclust. MSC[2010]: 60J22, 65C40, 62H30, 62H25.
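
To make the nomination task concrete, the sketch below ranks unlabeled vertices by their distance to the known interesting vertices in a spectral embedding of the adjacency matrix. It is a simplified stand-in, not the paper's $\mathcal{L}^P$, $\mathcal{L}^{EP}$, or $\mathcal{L}^{CS}$ schemes: it uses no clustering step and no Markov chain Monte Carlo sampling. The function name `spectral_nomination`, its parameters, and the toy graph are assumptions for illustration.

```python
import numpy as np

def spectral_nomination(adj, interesting_seeds, labeled, dim=2):
    """Return unlabeled vertices ordered from most to least likely interesting.

    adj: symmetric 0/1 adjacency matrix.
    interesting_seeds: vertices known to lie in the block of interest.
    labeled: all vertices whose block label was observed (seeds included).
    """
    vals, vecs = np.linalg.eigh(adj)
    top = np.argsort(np.abs(vals))[-dim:]          # largest-magnitude eigenvalues
    embed = vecs[:, top] * np.sqrt(np.abs(vals[top]))
    centroid = embed[list(interesting_seeds)].mean(axis=0)
    known = set(labeled)
    unlabeled = [v for v in range(adj.shape[0]) if v not in known]
    dist = np.linalg.norm(embed[unlabeled] - centroid, axis=1)
    return [unlabeled[i] for i in np.argsort(dist)]

if __name__ == "__main__":
    # Toy graph: two disjoint 5-cliques; vertices 0-4 form the interesting block.
    block = np.ones((5, 5)) - np.eye(5)
    adj = np.block([[block, np.zeros((5, 5))], [np.zeros((5, 5)), block]])
    print(spectral_nomination(adj, interesting_seeds=[0, 1], labeled=[0, 1, 5]))
```

On the toy graph the remaining members of the first clique are nominated ahead of the second clique; on a noisy stochastic block model the ordering is only approximate, which is where the precision comparisons described in the abstract become relevant.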

preprint2020arXiv

Xiaomingbot: A Multilingual Robot News Reporter

This paper proposes the building of Xiaomingbot, an intelligent, multilingual and multimodal software robot equipped with four integral capabilities: news generation, news translation, news reading and avatar animation. Its system summarizes Chinese news that it automatically generates from data tables. Next, it translates the summary or the full article into multiple languages, and reads the multilingual rendition through synthesized speech. Notably, Xiaomingbot utilizes a voice cloning technology to synthesize the speech trained from a real person's voice data in one input language. The proposed system enjoys several merits: it has an animated avatar, and is able to generate and read multilingual news. Since it was put into practice, Xiaomingbot has written over 600,000 articles, and gained over 150,000 followers on social media platforms.

preprint2019arXiv

BiRA-Net: Bilinear Attention Net for Diabetic Retinopathy Grading

Diabetic retinopathy (DR) is a common retinal disease that leads to blindness. For diagnosis purposes, DR image grading aims to provide automatic DR grade classification, which is not addressed in conventional research methods of binary DR image classification. Small objects in the eye images, like lesions and microaneurysms, are essential to DR grading in medical imaging, but they could easily be influenced by other objects. To address these challenges, we propose a new deep learning architecture, called BiRA-Net, which combines the attention model for feature extraction and bilinear model for fine-grained classification. Furthermore, in considering the distance between different grades of different DR categories, we propose a new loss function, called grading loss, which leads to improved training convergence of the proposed approach. Experimental results are provided to demonstrate the superior performance of the proposed approach.

preprint2019arXiv

Combined Mean Field Limit and Non-relativistic Limit of Vlasov-Maxwell Particle System to Vlasov-Poisson System

In this paper we consider the mean field limit and non-relativistic limit of the relativistic Vlasov-Maxwell particle system to the Vlasov-Poisson equation. With the relativistic Vlasov-Maxwell particle system as the starting point, we carry out estimates (with respect to $N$ and $c$) between the characteristic equations of the Vlasov-Maxwell particle model and the Vlasov-Poisson equation, where a probabilistic method is exploited. In the last step, we take both the large $N$ limit and the non-relativistic limit (meaning $c$ tending to infinity) to close the argument.

preprint2019arXiv

Direct comparison of many-body methods for realistic electronic Hamiltonians

A large collaboration carefully benchmarks 20 first principles many-body electronic structure methods on a test set of 7 transition metal atoms, and their ions and monoxides. Good agreement is attained between the 3 systematically converged methods, resulting in experiment-free reference values. These reference values are used to assess the accuracy of modern emerging and scalable approaches to the many-electron problem. The most accurate methods obtain energies indistinguishable from experimental results, with the agreement mainly limited by the experimental uncertainties. Comparison between methods enables a unique perspective on calculations of many-body systems of electrons.

preprint2019arXiv

Overview to the Hard X-ray Modulation Telescope (Insight-HXMT) Satellite

As China's first X-ray astronomical satellite, the Hard X-ray Modulation Telescope (HXMT), which was dubbed Insight-HXMT after its launch on June 15, 2017, is a wide-band (1-250 keV) slat-collimator-based X-ray astronomy satellite with the capability of all-sky monitoring in 0.2-3 MeV. It was designed to perform pointing, scanning and gamma-ray burst (GRB) observations and, based on the Direct Demodulation Method (DDM), the image of the scanned sky region can be reconstructed. Here we give an overview of the mission and its progress, including payload, core sciences, ground calibration/facility, ground segment, data archive, software, in-orbit performance, calibration, background model, observations and some preliminary results.

preprint2017arXiv

Conical square functions for degenerate elliptic operators

The aim of this paper is to study the boundedness of different conical square functions that arise naturally from second order divergence form degenerate elliptic operators. More precisely, let $L_w=w^{-1}\,{\rm div}(w\,A\,\nabla)$ where $w\in A_2$ and $A$ is an $n\times n$ bounded, complex-valued, uniformly elliptic matrix. D. Cruz-Uribe and C. Rios solved the $L^2(w)$-Kato square root problem obtaining that $\sqrt{L_w}$ is equivalent to the gradient on $L^2(w)$. The same authors in collaboration with the second named author of this paper studied the $L^p(w)$-boundedness of operators that are naturally associated with $L_w$, such as the functional calculus, Riesz transforms, or vertical square functions. The theory developed admitted also weighted estimates (i.e., estimates in $L^p(v dw)$ for $v\in A_\infty(w)$), and in particular a class of "degeneracy" weights $w$ was found in such a way that the classical $L^2$-Kato problem can be solved. In this paper, continuing this line of research, and also that originated in some recent results by the second and third named authors of the current paper, we study the boundedness on $L^p(w)$ and on $L^p(v dw)$, with $v\in A_\infty(w)$, of the conical square functions that one can construct using the heat or Poisson semigroup associated with $L_w$. As a consequence of our methods, we find a class of degeneracy weights $w$ for which $L^2$-estimates for these conical square functions hold. This opens the door to the study of weighted and unweighted Hardy spaces and of boundary value problems associated with $L_w$.