Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
76 works
0 followers
41 topics
4 close collaborators

Published work

76 published item(s)

preprint 2026 arXiv

Distill, Diffuse, and Semanticize (DDS): Annotation-Free 3D Scene Understanding Based on Multi-Granularity Distillation and Graph-Diffusion-Based Segmentation

3D semantic scene understanding is essential for digital twins, autonomous driving, smart agriculture, and embodied perception, yet dense point-wise annotation for point clouds remains expensive and difficult to scale. Existing annotation-free methods often face a trade-off between semantic recognition and structural efficiency: open-vocabulary and foundation-model-driven methods provide strong semantic priors, but often come with substantial computational costs, while structure-oriented methods based on superpoints, clustering, and graph reasoning are lightweight but often produce category-agnostic regions. We propose DDS, a resource-efficient structure-oriented framework for region-consistent and semanticized annotation-free 3D scene understanding. DDS preserves the lightweight superpoint-based organization paradigm while incorporating visual semantic cues from projected features and segmentation-derived masks. It first performs multi-granularity distillation to guide the 3D backbone at the point, mask-prototype, and inter-prototype levels, then applies graph diffusion over superpoints to propagate semantic information directly in 3D, producing coherent region representations without costly spectral decomposition or dense open-vocabulary 3D feature fields. Finally, DDS uses segmentation-cluster association to assign interpretable semantic names to category-agnostic 3D clusters. Experiments on real-world datasets show that DDS achieves the best performance among representative structure-oriented annotation-free baselines, improving oAcc, mAcc, and mIoU by up to 5.9%, 8.1%, and 2.4%, respectively. These results demonstrate that DDS improves region consistency and lightweight semantic recognition, providing a scalable and interpretable solution for annotation-free 3D scene understanding.

preprint 2025 arXiv

GARDO: Reinforcing Diffusion Models without Reward Hacking

Fine-tuning diffusion models via online reinforcement learning (RL) has shown great potential for enhancing text-to-image alignment. However, since precisely specifying a ground-truth objective for visual tasks remains challenging, the models are often optimized using a proxy reward that only partially captures the true goal. This mismatch often leads to reward hacking, where proxy scores increase while real image quality deteriorates and generation diversity collapses. While common solutions add regularization against the reference policy to prevent reward hacking, they compromise sample efficiency and impede the exploration of novel, high-reward regions, as the reference policy is usually sub-optimal. To address the competing demands of sample efficiency, effective exploration, and mitigation of reward hacking, we propose Gated and Adaptive Regularization with Diversity-aware Optimization (GARDO), a versatile framework compatible with various RL algorithms. Our key insight is that regularization need not be applied universally; instead, it is highly effective to selectively penalize a subset of samples that exhibit high uncertainty. To address the exploration challenge, GARDO introduces an adaptive regularization mechanism wherein the reference model is periodically updated to match the capabilities of the online policy, ensuring a relevant regularization target. To address the mode collapse issue in RL, GARDO amplifies the rewards for high-quality samples that also exhibit high diversity, encouraging mode coverage without destabilizing the optimization process. Extensive experiments across diverse proxy rewards and hold-out unseen metrics consistently show that GARDO mitigates reward hacking and enhances generation diversity without sacrificing sample efficiency or exploration, highlighting its effectiveness and robustness.

preprint 2024 arXiv

Hierarchical Aligned Multimodal Learning for NER on Tweet Posts

Mining structured knowledge from tweets using named entity recognition (NER) can be beneficial for many downstream applications such as recommendation and intention understanding. With tweet posts tending to be multimodal, multimodal named entity recognition (MNER) has attracted more attention. In this paper, we propose a novel approach, which can dynamically align the image and text sequence and achieve multi-level cross-modal learning to augment textual word representation for MNER improvement. To be specific, our framework can be split into three main stages: the first stage focuses on intra-modality representation learning to derive the implicit global and local knowledge of each modality, the second evaluates the relevance between the text and its accompanying image and integrates visual information of different granularities based on the relevance, and the third enforces semantic refinement via iterative cross-modal interactions and co-attention. We conduct experiments on two open datasets, and the results and detailed analysis demonstrate the advantage of our model.

preprint 2023 arXiv

Approaching the Limit of Image Rescaling via Flow Guidance

Image downscaling and upscaling are two basic rescaling operations. Once the image is downscaled, it is difficult to reconstruct via upscaling due to the loss of information. To make these two processes more compatible and improve the reconstruction performance, some efforts model them as a joint encoding-decoding task, with the constraint that the downscaled (i.e. encoded) low-resolution (LR) image must preserve the original visual appearance. To implement this constraint, most methods guide the downscaling module by supervising it with the bicubically downscaled LR version of the original high-resolution (HR) image. However, this bicubic LR guidance may be suboptimal for the subsequent upscaling (i.e. decoding) and restrict the final reconstruction performance. In this paper, instead of directly applying the LR guidance, we propose an additional invertible flow guidance module (FGM), which can transform the downscaled representation into a visually plausible image during downscaling and transform it back during upscaling. Benefiting from the invertibility of FGM, the downscaled representation can dispense with the LR guidance without disturbing the downscaling-upscaling process. This allows us to remove the restrictions on the downscaling module and optimize the downscaling and upscaling modules in an end-to-end manner. In this way, these two modules can cooperate to maximize the HR reconstruction performance. Extensive experiments demonstrate that the proposed method can achieve state-of-the-art (SotA) performance on both downscaled and reconstructed images.

preprint 2023 arXiv

Evolution of dispersal in advective patchy environments with varying drift rates

In this paper, we study a two stream species Lotka-Volterra competition patch model with the patches aligned along a line. The two species are supposed to be identical except for the diffusion rates. For each species, the diffusion rates between patches are the same, while the drift rates vary. Our results show that the convexity of the drift rates has a significant impact on the competition outcomes: if the drift rates are convex, then the species with larger diffusion rate wins the competition; if the drift rates are concave, then the species with smaller diffusion rate wins the competition.

preprint 2023 arXiv

Quantum circuit matrix product state ansatz for large-scale simulations of molecules

As in the density matrix renormalization group (DMRG) method, approximating the many-body wave function of electrons using a matrix product state (MPS) is a promising way to solve electronic structure problems. The expressibility of an MPS is determined by the size of the matrices, or in other words the bond dimension, which unfortunately has to be very large in many cases. In this study, we propose to calculate the ground state energies of molecular systems by variationally optimizing quantum circuit MPS (QCMPS) with a relatively small number of qubits. It is demonstrated that with a carefully chosen circuit structure and orbital localization scheme, QCMPS can reach an accuracy similar to that achieved in DMRG with an exponentially large bond dimension. QCMPS simulation of a linear molecule with 50 orbitals can reach chemical accuracy using only 6 qubits at a moderate circuit depth. These results suggest that QCMPS is a promising wave function ansatz in the variational quantum eigensolver algorithm for molecular systems.

preprint 2023 arXiv

Quantum Neural Network Inspired Hardware Adaptable Ansatz for Efficient Quantum Simulation of Chemical Systems

The variational quantum eigensolver is a promising way to solve the Schrödinger equation on a noisy intermediate-scale quantum (NISQ) computer, while its success relies on a well-designed wavefunction ansatz. Compared to physically motivated ansatzes, hardware heuristic ansatzes usually lead to a shallower circuit, but it may still be too deep for an NISQ device. Inspired by the quantum neural network, we propose a new hardware heuristic ansatz where the circuit depth can be significantly reduced by introducing ancilla qubits, which makes a practical simulation of a chemical reaction with more than 20 atoms feasible on a currently available quantum computer. More importantly, the expressibility of this new ansatz can be improved by increasing either the depth or the width of the circuit, which makes it adaptable to different hardware environments. These results open a new avenue to develop practical applications of quantum computation in the NISQ era.

preprint 2022 arXiv

$BC_2$ type multivariable matrix functions and matrix spherical functions

Matrix spherical functions associated to the compact symmetric pair $(\mathrm{SU}(m+2), \mathrm{S}(\mathrm{U}(2)\times \mathrm{U}(m)))$, having reduced root system of type $\mathrm{BC}_2$, are studied. We consider an irreducible $K$-representation $(π,V)$ arising from the $\mathrm{U}(2)$-part of $K$, and the induced representation $\mathrm{Ind}_K^G π$ splits multiplicity free. The corresponding spherical functions, i.e. $Φ\colon G \to \mathrm{End}(V)$ satisfying $Φ(k_1gk_2)=π(k_1)Φ(g)π(k_2)$ for all $g\in G$, $k_1,k_2\in K$, are studied by studying certain leading coefficients which involve hypergeometric functions. This is done explicitly using the action of the radial part of the Casimir operator on these functions and their leading coefficients. To suitably grouped matrix spherical functions we associate two-variable matrix orthogonal polynomials giving a matrix analogue of Koornwinder's 1970s two-variable orthogonal polynomials, which are Heckman-Opdam polynomials for $\mathrm{BC}_2$. In particular, we find explicit orthogonality relations and the matrix polynomials being eigenfunctions to an explicit second order matrix partial differential operator. The scalar part of the matrix weight is less general than Koornwinder's weight.

preprint 2022 arXiv

A practical algorithm to minimize the overall error in FEM computations

When the standard finite element method (FEM) is used to solve general partial differential equations, the round-off error is found to be proportional to $N^{β_{\rm R}}$, with $N$ the number of degrees of freedom (DoFs) and $β_{\rm R}$ a coefficient. A method which uses a few cheap numerical experiments is proposed to determine the coefficient of proportionality and $β_{\rm R}$ in various space dimensions and FEM packages. Using the coefficients obtained above, the strategy put forward in \cite{liu386balancing} for predicting the highest achievable accuracy $E_{\rm min}$ and the associated optimal number of DoFs $N_{\rm opt}$ for specific problems is extended to general problems. This strategy allows predicting $E_{\rm min}$ accurately for general problems, with the CPU time for obtaining the solution with the highest accuracy $E_{\rm min}$ typically reduced by 60%-90%.
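As a rough illustration of the fitting step this abstract describes: a power law $E_{\rm R} = c\,N^{β_{\rm R}}$ is a straight line in log-log coordinates, so a least-squares line through a few (N, error) measurements yields both constants. The sketch below is not the authors' code. The function names are hypothetical, and the truncation-error model $a\,N^{-β_{\rm T}}$ used to locate $N_{\rm opt}$ is an assumption that the abstract does not state.

```python
import math

def fit_power_law(ns, errors):
    """Least-squares fit of errors ~ c * n**beta in log-log space."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                      # slope of the log-log line
    c = math.exp(mean_y - beta * mean_x)  # intercept gives the prefactor
    return c, beta

def optimal_dofs(a, beta_t, c, beta_r):
    """Minimise the ASSUMED total error a*N**(-beta_t) + c*N**beta_r.

    Setting the derivative to zero gives
    N_opt = (a*beta_t / (c*beta_r)) ** (1 / (beta_t + beta_r)).
    """
    n_opt = (a * beta_t / (c * beta_r)) ** (1.0 / (beta_t + beta_r))
    e_min = a * n_opt ** (-beta_t) + c * n_opt ** beta_r
    return n_opt, e_min
```

Calling `fit_power_law` on round-off errors measured at a handful of mesh sizes, then passing the result to `optimal_dofs` together with a separately fitted truncation law, gives an estimate of the mesh size beyond which further refinement only adds round-off error.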

preprint 2022 arXiv

Application of Color Block Code in Image Scaling

To address the high cost of embedding an annotation watermark in a small, narrow area and the information distortion caused by changes in the resolution of the watermark image, this paper proposes a color block code technology, which uses location information and color codes to form recognizable graphics. This both simplifies the annotation graphics and preserves recognition efficiency. First, the constituent elements of the color block code are designed, and then its coding and decoding method is proposed. Experiments show that the color block code is highly resistant to scaling and interference, and can be widely used for labeling small object surfaces and low-resolution images.

preprint 2022 arXiv

Cost-effective BlackWater Raft on Highly Unreliable Nodes at Scale Out

The Raft algorithm maintains strong consistency across data replicas in the cloud. This algorithm divides nodes into leaders and followers, to satisfy read/write requests spanning geo-diverse sites. As the workload increases, Raft should provide scale-out performance in proportion. However, traditional scale-out techniques encounter bottlenecks in Raft, and when the provisioned sites exhaust local resources, the performance loss will grow exponentially. To provide scalability in Raft, this paper proposes a cost-effective mechanism for elastic auto-scaling in Raft, called BlackWater-Raft or BW-Raft. BW-Raft extends the original Raft with the following abstractions: (1) secretary nodes that take over expensive log synchronization operations from the leader, relaxing the performance constraints on locks; (2) massive low-cost observer nodes that handle reads only, improving throughput for typical data-intensive services. These abstractions are stateless, allowing elastic scale-out on unreliable yet cheap spot instances. In theory, we demonstrate that BW-Raft can maintain Raft's strong consistency guarantees when scaling out, processing a 50X increase in the number of nodes compared to the original Raft. We have prototyped BW-Raft on key-value services and evaluated it against many state-of-the-art systems on Amazon EC2 and Alibaba Cloud. Our results show that within the same budget, BW-Raft's resource footprint increments are 5-7X smaller than Multi-Raft, and 2X better than original Raft. Using spot instances, BW-Raft can reduce costs by 84.5% compared to Multi-Raft. In real-world experiments, BW-Raft improves goodput of the 95th-percentile SLO by 9.4X, thus serving as an alternative for services scaling out with strong consistency.

preprint 2022 arXiv

Divide-and-conquer variational quantum algorithms for large-scale electronic structure simulations

Exploring the potential application of quantum computers in material design and drug discovery has attracted a lot of interest in the age of quantum computing. However, the quantum resource requirements for solving practical electronic structure problems are far beyond the capacity of near-term quantum devices. In this work, we integrate the divide-and-conquer (DC) approaches into the variational quantum eigensolver (VQE) for large-scale quantum computational chemistry simulations. Two popular divide-and-conquer schemes, many-body expansion (MBE) fragmentation theory and density matrix embedding theory (DMET), are employed to divide complicated problems into many small parts that are easy to implement on near-term quantum computers. Pilot applications of these methods to systems consisting of tens of atoms are performed with adaptive VQE algorithms. This work should encourage further studies of using the philosophy of DC to solve electronic structure problems on quantum computers.

preprint 2022 arXiv

Dynamic Prototype Convolution Network for Few-Shot Semantic Segmentation

The key challenge for few-shot semantic segmentation (FSS) is how to tailor a desirable interaction among support and query features and/or their prototypes, under the episodic training scenario. Most existing FSS methods implement such support-query interactions by solely leveraging plain operations - e.g., cosine similarity and feature concatenation - for segmenting the query objects. However, these interaction approaches usually cannot capture well the intrinsic object details in the query images that are widely encountered in FSS, e.g., if the query object to be segmented has holes and slots, inaccurate segmentation almost always happens. To this end, we propose a dynamic prototype convolution network (DPCN) to fully capture the aforementioned intrinsic details for accurate FSS. Specifically, in DPCN, a dynamic convolution module (DCM) is first proposed to generate dynamic kernels from the support foreground, then information interaction is achieved by convolution operations over query features using these kernels. Moreover, we equip DPCN with a support activation module (SAM) and a feature filtering module (FFM) to generate a pseudo mask and filter out background information for the query images, respectively. SAM and FFM together can mine enriched context information from the query features. Our DPCN is also flexible and efficient under the k-shot FSS setting. Extensive experiments on PASCAL-5i and COCO-20i show that DPCN yields superior performance under both 1-shot and 5-shot settings.

preprint 2022 arXiv

Enhanced proton-boron nuclear fusion cross sections in intense high-frequency laser

We investigate the proton-boron nuclear fusion cross sections under the influence of the intense linearly polarized monochromatic laser fields with high frequency. First, we rewrite the time-dependent Schrödinger equation using Kramers-Henneberger (KH) transformation which allows for shifting all time dependence of the problem into the potential function. Then, for the intense laser fields that satisfy the high frequency limit, the time-averaged scheme in the KH framework should be valid. We can use WKB approximation to evaluate Coulomb barrier penetrability and then calculate proton-boron nuclear fusion cross sections by a phenomenological Gamow form. We show that the corresponding Coulomb barrier penetrability increases significantly due to the depression of the time-averaged potential barrier. As a result, we find that proton-boron nuclear fusion cross sections can be enhanced effectively depending on a dimensionless quantity $n_{\mathrm{d}}$, which equals the ratio of the quiver oscillation amplitude to the geometrical touching radius of the proton and boron nucleus. For $n_{\mathrm{d}}=9$, we predict that the resonance peak of the fusion cross-section is enhanced by about $26$ times at the incident energy of $\varepsilon=148$ keV. And for another incident energy of $\varepsilon=586$ keV, the resonance peak of fusion cross-section is not only enhanced but also shifted to lower energy of $\varepsilon=392$ keV due to the mechanism of over-barrier fusion.
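For readers unfamiliar with the two ingredients named in this abstract, the standard textbook forms are sketched below. They are not taken from the paper: the symbols $r_n$, $r_c$, $\mu$, $V$ and $S$ are introduced here for illustration, and the exact barrier and prefactor the authors use may differ. In the time-averaged KH scheme the abstract describes, $V(r)$ would presumably be the time-averaged potential rather than the bare Coulomb barrier.

```latex
% WKB penetrability through a barrier V(r) between the inner turning point r_n
% and the outer turning point r_c, for reduced mass \mu and incident energy \varepsilon:
P(\varepsilon) = \exp\!\left[-\frac{2}{\hbar}\int_{r_n}^{r_c}
    \sqrt{2\mu\,\bigl(V(r)-\varepsilon\bigr)}\,\mathrm{d}r\right]

% Gamow-type cross section with astrophysical S-factor S(\varepsilon):
\sigma(\varepsilon) = \frac{S(\varepsilon)}{\varepsilon}\,P(\varepsilon)
```

Lowering the barrier $V(r)$ shrinks the integrand, which raises $P(\varepsilon)$ exponentially; this is the mechanism behind the enhancement the abstract reports.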

preprint 2022 arXiv

Exploring accurate potential energy surfaces via integrating variational quantum eigensolver with machine learning

The potential energy surface (PES) is crucial for interpreting a variety of chemical reaction processes. However, predicting accurate PESs with high-level electronic structure methods is a challenging task due to the high computational cost. As an appealing application of quantum computing, we show in this work that variational quantum algorithms can be integrated with machine learning (ML) techniques as a promising scheme for exploring accurate PESs. Different from using a ML model to represent the potential energy, we encode the molecular geometry information into a deep neural network (DNN) for representing parameters of the variational quantum eigensolver (VQE), leaving the PES to the wave function ansatz. Once the DNN model is trained, the variational optimization procedure that hinders the application of the VQE to complex systems is avoided and thus the evaluation of PESs is significantly accelerated. Numerical results demonstrate that a simple DNN model is able to reproduce accurate PESs for small molecules.

preprint 2022 arXiv

Fast and Memory-Efficient Network Towards Efficient Image Super-Resolution

Runtime and memory consumption are two important aspects for efficient image super-resolution (EISR) models to be deployed on resource-constrained devices. Recent advances in EISR exploit distillation and aggregation strategies with plenty of channel split and concatenation operations to make full use of limited hierarchical features. In contrast, sequential network operations avoid frequently accessing preceding states and extra nodes, and thus are beneficial to reducing the memory consumption and runtime overhead. Following this idea, we design our lightweight network backbone by mainly stacking multiple highly optimized convolution and activation layers and decreasing the usage of feature fusion. We propose a novel sequential attention branch, where every pixel is assigned an importance factor according to local and global contexts, to enhance high-frequency details. In addition, we tailor the residual block for EISR and propose an enhanced residual block (ERB) to further accelerate the network inference. Finally, combining all the above techniques, we construct a fast and memory-efficient network (FMEN) and its small version FMEN-S, which runs 33% faster and reduces memory consumption by 74% compared with the state-of-the-art EISR model: E-RFDN, the champion in the AIM 2020 efficient super-resolution challenge. Besides, FMEN-S achieves the lowest memory consumption and the second shortest runtime in the NTIRE 2022 challenge on efficient super-resolution. Code is available at https://github.com/NJU-Jet/FMEN.

preprint 2022 arXiv

Fishbone resonance structure in the attosecond transient absorption spectroscopy of graphene

We investigate the attosecond transient absorption spectroscopy (ATAS) of graphene by numerically solving four-band density-matrix equations, which demonstrates apparent fishbone resonance structures. To gain insight into these interesting structures, we exploit a simplified model that only considers the electrons of $Γ$ and M points in the Brillouin zone. With the help of this model, we can analytically express the ATAS spectrum as the sum of zeroth- and first-order Bessel functions in the variables of the strength and frequency of the infrared pump field as well as the effective mass of electrons at the $Γ$ and M points. Lorentzian and Fano line shapes in the absorption spectrum are addressed. The fishbone structure consists of a periodic V-shaped structure that can be explained by first-order Bessel functions, and its tilt angle is solely determined by the frequency of the pump laser. The periodicity of the V-shaped structure in the fishbone originates from the periodic dependence of the Lorentzian and Fano line shapes of the absorption spectrum on the time delay between the pump and probe lasers. Compared with the numerical results, our analytical theory can qualitatively or even quantitatively predict the zeroth- and first-order fringes in the fishbone structures of the ATAS spectrum. The gauge issues in the numerical simulations are also discussed.

preprint 2022 arXiv

From General to Specific: Online Updating for Blind Super-Resolution

Most deep learning-based super-resolution (SR) methods are not image-specific: 1) They are trained on samples synthesized by predefined degradations (e.g. bicubic downsampling), regardless of the domain gap between training and testing data. 2) During testing, they super-resolve all images by the same set of model weights, ignoring the degradation variety. As a result, most previous methods may suffer a performance drop when the degradations of test images are unknown and varied (i.e. the case of blind SR). To address these issues, we propose an online SR (ONSR) method. It does not rely on predefined degradations and allows the model weights to be updated according to the degradation of the test image. Specifically, ONSR consists of two branches, namely an internal branch (IB) and an external branch (EB). IB learns the specific degradation of the given test LR image, and EB learns to super-resolve images degraded by the learned degradation. In this way, ONSR can customize a specific model for each test image, and thus become more robust to various degradations. Extensive experiments on both synthesized and real-world images show that ONSR can generate more visually favorable SR results and achieve state-of-the-art performance in blind SR.

preprint 2022 arXiv

KSSOLV 2.0: An efficient MATLAB toolbox for solving the Kohn-Sham equations with plane-wave basis set

KSSOLV (Kohn-Sham Solver) is a MATLAB toolbox for performing Kohn-Sham density functional theory (DFT) calculations with a plane-wave basis set. KSSOLV 2.0 preserves the design features of the original KSSOLV software to allow users and developers to easily set up a problem and perform ground-state calculations as well as to prototype and test new algorithms. Furthermore, it includes new functionalities such as new iterative diagonalization algorithms, k-point sampling for electron band structures, geometry optimization and advanced algorithms for performing DFT calculations with local, semi-local, and hybrid exchange-correlation functionals. It can be used to study the electronic structures of both molecules and solids. We describe these new capabilities in this work through a few use cases. We also demonstrate the numerical accuracy and computational efficiency of KSSOLV on a variety of examples.

preprint 2022 arXiv

LAI Estimation of Cucumber Crop Based on Improved Fully Convolutional Network

LAI (Leaf Area Index) is of great importance for crop yield estimation in agronomy. It is directly related to plant growth status, net assimilation rate, plant photosynthesis, and carbon dioxide in the environment. How to measure LAI accurately and efficiently is the key to the crop yield estimation problem. Manual measurement consumes a lot of human and material resources. Remote sensing technology is not suitable for near-Earth LAI measurement. Besides, methods based on traditional digital image processing are greatly affected by environmental noise and image exposure. Nowadays, deep learning is widely used in many fields. An improved FCN (Fully Convolutional Network) is proposed in our study for the LAI measurement task. Eighty-two cucumber images collected from our greenhouse are labeled to fine-tune the pre-trained model. The result shows that the improved FCN model performs well on our dataset. Our method's mean IoU can reach 0.908, which is 11% better than conventional methods and 4.7% better than the basic FCN model.
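The mean IoU figure quoted in this abstract can be made concrete with a small sketch. The code below computes intersection-over-union for binary leaf/background masks and averages it over images. This is one common convention and an assumption here: the abstract does not say whether the authors average over images or over classes, and the function names are hypothetical.

```python
def iou(pred, target):
    """IoU of two same-sized binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0

def mean_iou(pairs):
    """Average IoU over an iterable of (prediction, ground_truth) mask pairs."""
    pairs = list(pairs)
    return sum(iou(p, t) for p, t in pairs) / len(pairs)
```

A score of 0.908 under this definition would mean that, on average, about 91% of the pixels covered by either the predicted or the labeled leaf region are covered by both.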

preprint 2022 arXiv

Large-Scale Simulation of Quantum Computational Chemistry on a New Sunway Supercomputer

Quantum computational chemistry (QCC) is the use of quantum computers to solve problems in computational quantum chemistry. We develop a high-performance variational quantum eigensolver (VQE) simulator for simulating quantum computational chemistry problems on a new Sunway supercomputer. The major innovations include: (1) a Matrix Product State (MPS) based VQE simulator to reduce the amount of memory needed and increase the simulation efficiency; (2) a combination of the Density Matrix Embedding Theory with the MPS-based VQE simulator to further extend the simulation range; (3) a three-level parallelization scheme to scale up to 20 million cores; (4) usage of the Julia script language as the main programming language, which both makes the programming easier and enables cutting-edge performance comparable to native C or Fortran; (5) study of real chemistry systems based on the VQE simulator, achieving nearly linear strong and weak scaling. Our simulation demonstrates the power of VQE for large quantum chemistry systems, thus paving the way for large-scale VQE experiments on near-term quantum computers.

preprint 2022 arXiv

Multi-features based Semantic Augmentation Networks for Named Entity Recognition in Threat Intelligence

Extracting cybersecurity entities such as attackers and vulnerabilities from unstructured network texts is an important part of security analysis. However, the sparsity of intelligence data resulting from the frequent variation and randomness of cybersecurity entity names makes it difficult for current methods to perform well in extracting security-related concepts and entities. To this end, we propose a semantic augmentation method which incorporates different linguistic features to enrich the representation of input tokens to detect and classify the cybersecurity names over unstructured text. In particular, we encode and aggregate the constituent feature, morphological feature and part-of-speech feature for each input token to improve the robustness of the method. In addition, a token gets augmented semantic information from its K most similar words in a cybersecurity domain corpus, where an attentive module is leveraged to weigh differences of the words, and from contextual clues based on a large-scale general field corpus. We have conducted experiments on the cybersecurity datasets DNRTI and MalwareTextDB, and the results demonstrate the effectiveness of the proposed method.

preprint 2022 arXiv

Multiple Targets Directed Greybox Fuzzing

Directed greybox fuzzing (DGF) can quickly discover or reproduce bugs in programs by seeking to reach a program location or explore some locations in order. However, due to their static stage division and coarse-grained energy scheduling, prior DGF tools perform poorly when facing multiple target locations (targets for short). In this paper, we present multiple targets directed greybox fuzzing, which aims to reach multiple program locations in a fuzzing campaign. Specifically, we propose a novel strategy to adaptively coordinate exploration and exploitation stages, and a novel energy scheduling strategy that considers more relations between seeds and target locations. We implement our approaches in a tool called LeoFuzz and evaluate it on crash reproduction, true positives verification, and vulnerability exposure in real-world programs. Experimental results show that LeoFuzz outperforms six state-of-the-art fuzzers, i.e., QSYM, AFLGo, Lolly, Berry, Beacon and WindRanger, in terms of effectiveness and efficiency. Moreover, LeoFuzz has detected 23 new vulnerabilities in real-world programs, and 11 of them have been assigned CVE IDs.

preprint 2022 arXiv

Non-Hermiticity stabilized Majorana zero modes in semiconductor-superconductor nanowires

Coupled Majorana zero modes with nonzero energies are generally detrimental to the non-Abelian statistics due to the additional dynamic phase. Nevertheless, we show that a well-connected lead can introduce a local non-Hermitian dissipation term to shift the energies of both coupled Majorana modes to zero, and surprisingly turn the coupled Majorana mode far from the lead into a dark Majorana mode with exponentially small dissipation. This dark Majorana mode can overcome the drawback of the partially overlapped Majorana zero modes and possess the properties of a true Majorana zero mode, such as the perfect fractional Josephson effect and the non-Abelian statistics.

preprint2022arXiv

Normalized tangent bundle, varieties with small codegree and pseudoeffective threshold

We propose a conjectural list of Fano manifolds of Picard number $1$ with pseudoeffective normalized tangent bundles, which we prove in various situations by relating it to the complete divisibility conjecture of Russo and Zak on varieties with small codegree. Furthermore, the pseudoeffective thresholds and hence the pseudoeffective cones of the projectivized tangent bundles of rational homogeneous spaces of Picard number $1$ are explicitly determined by studying the total dual VMRT and the geometry of stratified Mukai flops. As a by-product, we obtain sharp vanishing theorems on the global twisted symmetric holomorphic vector fields on rational homogeneous spaces of Picard number $1$.

preprint2022arXiv

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

This paper reviews the NTIRE 2022 challenge on efficient single image super-resolution with a focus on the proposed solutions and results. The task of the challenge was to super-resolve an input image with a magnification factor of $\times$4 based on pairs of low and corresponding high resolution images. The aim was to design a network for single image super-resolution that improved efficiency, measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption, while at least maintaining a PSNR of 29.00 dB on the DIV2K validation set. IMDN is set as the baseline for efficiency measurement. The challenge had 3 tracks: the main track (runtime), sub-track one (model complexity), and sub-track two (overall performance). In the main track, the practical runtime performance of the submissions was evaluated. The ranking of the teams was determined directly by the absolute value of the average runtime on the validation set and test set. In sub-track one, the number of parameters and FLOPs were considered, and the individual rankings of the two metrics were summed to determine a final ranking in this track. In sub-track two, all five metrics mentioned in the description of the challenge, including runtime, parameter count, FLOPs, activations, and memory consumption, were considered. Similar to sub-track one, the rankings of the five metrics were summed to determine a final ranking. The challenge had 303 registered participants, and 43 teams made valid submissions. They gauge the state-of-the-art in efficient single image super-resolution.
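The summed-rank scoring described for the two sub-tracks can be sketched as follows. This is only a minimal illustration, not the organizers' evaluation code: the function names, the rule that tied teams share the better rank, and the alphabetical tie-break on the final order are all assumptions made for the example, and every metric is assumed to be "lower is better".

```python
def rank_positions(scores):
    """Return {team: rank} with rank 1 for the lowest (best) value.

    Assumption: tied teams share the better rank; the abstract does not
    say how ties were handled.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1])
    ranks = {}
    for position, (team, value) in enumerate(ordered, start=1):
        previous_team, previous_value = ordered[position - 2] if position > 1 else (None, None)
        if previous_team is not None and value == previous_value:
            ranks[team] = ranks[previous_team]
        else:
            ranks[team] = position
    return ranks


def summed_ranking(metrics):
    """Sum each team's per-metric ranks and order teams by the total.

    metrics maps a metric name to {team: measured value}.
    """
    totals = {}
    for per_team in metrics.values():
        for team, rank in rank_positions(per_team).items():
            totals[team] = totals.get(team, 0) + rank
    # Hypothetical tie-break: alphabetical order of the team name.
    order = sorted(totals, key=lambda team: (totals[team], team))
    return order, totals


# Made-up numbers for three hypothetical teams (parameters in millions, FLOPs in G).
example_metrics = {
    "parameters": {"A": 0.5, "B": 0.3, "C": 0.9},
    "flops": {"A": 20, "B": 40, "C": 30},
}
order, totals = summed_ranking(example_metrics)
```

With these made-up values, team A ranks 2nd and 1st (total 3), B ranks 1st and 3rd (total 4), and C ranks 3rd and 2nd (total 5), so the final order is A, B, C.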

preprint2022arXiv

Open Set Recognition using Vision Transformer with an Additional Detection Head

Deep neural networks have demonstrated prominent capacities for image classification tasks in a closed set setting, where the test data come from the same distribution as the training data. However, in a more realistic open set scenario, traditional classifiers with incomplete knowledge cannot tackle test data that are not from the training classes. Open set recognition (OSR) aims to address this problem by both identifying unknown classes and distinguishing known classes simultaneously. In this paper, we propose a novel approach to OSR that is based on the vision transformer (ViT) technique. Specifically, our approach employs two separate training stages. First, a ViT model is trained to perform closed set classification. Then, an additional detection head is attached to the embedded features extracted by the ViT and trained to force the representations of known data into compact class-specific clusters. Test examples are identified as known or unknown based on their distance to the cluster centers. To the best of our knowledge, this is the first work to leverage ViT for the purpose of OSR, and our extensive evaluation against several OSR benchmark datasets reveals that our approach significantly outperforms other baseline methods and obtains new state-of-the-art performance.
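The final decision step, identifying a test example as known or unknown from its distance to the cluster centers, can be illustrated with a small sketch. The Euclidean distance, the single global threshold, and the names `classify_open_set`, `centers`, and `threshold` are assumptions chosen for the example; the abstract does not specify the distance measure or how the cut-off is set.

```python
import math


def classify_open_set(embedding, centers, threshold):
    """Return (label, distance) for the nearest class center.

    The label is "unknown" when even the nearest center is farther away
    than the threshold. Euclidean distance is an assumption here.
    """
    best_label, best_distance = None, math.inf
    for label, center in centers.items():
        distance = math.dist(embedding, center)
        if distance < best_distance:
            best_label, best_distance = label, distance
    if best_distance > threshold:
        return "unknown", best_distance
    return best_label, best_distance


# Toy 2D embeddings with two hypothetical known classes.
centers = {"cat": (0.0, 0.0), "dog": (10.0, 0.0)}
near_result = classify_open_set((1.0, 0.0), centers, threshold=3.0)
far_result = classify_open_set((5.0, 0.0), centers, threshold=3.0)
```

In this toy setup the first point lies within the threshold of the "cat" center and is labeled as known, while the second point is 5 units from both centers and is rejected as unknown.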

preprint2022arXiv

Optimizing Irregular-Shaped Matrix-Matrix Multiplication on Multi-Core DSPs

General Matrix Multiplication (GEMM) has a wide range of applications in scientific simulation and artificial intelligence. Although traditional libraries can achieve high performance on large regular-shaped GEMMs, they often perform poorly on irregular-shaped GEMMs, which are often found in new algorithms and applications of high-performance computing (HPC). Due to energy efficiency constraints, low-power multi-core digital signal processors (DSPs) have become an alternative architecture in HPC systems. Targeting multi-core DSPs in FT-m7032, a prototype CPU-DSPs heterogeneous processor for HPC, an efficient implementation, ftIMM, for three types of irregular-shaped GEMMs is proposed. FtIMM supports automatic generation of assembly micro-kernels, two parallelization strategies, and auto-tuning of block sizes and parallelization strategies. The experiments show that ftIMM achieves better performance than the traditional GEMM implementations on multi-core DSPs in FT-m7032, yielding up to a 7.2x performance improvement on irregular-shaped GEMMs. ftIMM on multi-core DSPs can also far outperform the open source library on multi-core CPUs in FT-m7032, delivering up to 3.1x higher efficiency.

preprint2022arXiv

Privacy protection based on mask template

Powerful recognition algorithms are widely used on the Internet and in important medical systems, which poses a serious threat to personal privacy. The law provides various protections, e.g., the General Data Protection Regulation (GDPR) in Europe and Articles 1032 to 1039 of the Civil Code in China. However, the disclosure of biometric data, an important kind of privacy breach, is often hidden, which makes it difficult for the owner to detect and trace to its source. Human biometrics generally exist in images. To avoid the disclosure of personal privacy, we should prevent unauthorized recognition algorithms from acquiring the real features of the original image.

preprint2022arXiv

Promoting Saliency From Depth: Deep Unsupervised RGB-D Saliency Detection

Growing interest in RGB-D salient object detection (RGB-D SOD) has been witnessed in recent years, owing partly to the popularity of depth sensors and the rapid progress of deep learning techniques. Unfortunately, existing RGB-D SOD methods typically demand a large quantity of training images thoroughly annotated at the pixel level. The laborious and time-consuming manual annotation has become a real bottleneck in various practical scenarios. On the other hand, current unsupervised RGB-D SOD methods still rely heavily on handcrafted feature representations. This inspires us to propose in this paper a deep unsupervised RGB-D saliency detection approach, which requires no manual pixel-level annotation during training. It is realized by two key ingredients in our training pipeline. First, a depth-disentangled saliency update (DSU) framework is designed to automatically produce pseudo-labels with iterative follow-up refinements, which provides more trustworthy supervision signals for training the saliency network. Second, an attentive training strategy is introduced to tackle the issue of noisy pseudo-labels by properly re-weighting to highlight the more reliable pseudo-labels. Extensive experiments demonstrate the superior efficiency and effectiveness of our approach in tackling the challenging unsupervised RGB-D SOD scenarios. Moreover, our approach can also be adapted to work in the fully-supervised setting. Empirical studies show that incorporating our approach gives rise to notable performance improvement in existing supervised RGB-D SOD models.

preprint2022arXiv

Q$^2$Chemistry: A quantum computation platform for quantum chemistry

Quantum computers provide new opportunities for quantum chemistry. In this article, we present a versatile, extensible, and efficient software package, named Q$^2$Chemistry, for developing quantum algorithms and quantum inspired classical algorithms in the field of quantum chemistry. In Q$^2$Chemistry, the wave function and Hamiltonian can be conveniently mapped into the qubit space, and quantum circuits can then be generated according to a specific quantum algorithm already implemented in the package or newly developed by the users. The generated circuits can be dispatched either to a physical quantum computer, if available, or to the internal virtual quantum computer realized by simulating quantum circuits on classical supercomputers. As demonstrated by our benchmark simulations with up to 72 qubits, Q$^2$Chemistry achieves excellent performance in simulating medium scale quantum circuits. Applications of Q$^2$Chemistry to simulate molecules and periodic systems are given with performance analysis.

preprint2022arXiv

Reducing circuit depth in adaptive variational quantum algorithms via effective Hamiltonian theories

Electronic structure simulation is an anticipated application for quantum computers. Due to high-dimensional quantum entanglement in strongly correlated systems, the quantum resources required to perform such simulations are far beyond the capacity of current quantum devices. To reduce the quantum circuit complexity, it has been suggested to incorporate a part of the electronic correlation into an effective Hamiltonian, which is often obtained from a similarity transformation of the electronic Hamiltonian. In this work, we introduce a new transformation in the form of a product of a linear combination of excitation operators to construct the effective Hamiltonian with finite terms. To demonstrate its accuracy, we also consider an equivalent adaptive variational algorithm with this transformation and show that it can obtain an accurate ground state wave function. The effective Hamiltonian defined with this new transformation is incorporated into the adaptive variational quantum algorithms to maintain constant-size quantum circuits. The new computational scheme is assessed by performing numerical simulations for small molecules. Chemical accuracy is achieved with a much shallower circuit depth.

preprint2022arXiv

SimT: Handling Open-set Noise for Domain Adaptive Semantic Segmentation

This paper studies a practical domain adaptive (DA) semantic segmentation problem where only pseudo-labeled target data is accessible through a black-box model. Due to the domain gap and label shift between two domains, pseudo-labeled target data contains mixed closed-set and open-set label noises. In this paper, we propose a simplex noise transition matrix (SimT) to model the mixed noise distributions in DA semantic segmentation and formulate the problem as estimation of SimT. By exploiting computational geometry analysis and properties of segmentation, we design three complementary regularizers, i.e. volume regularization, anchor guidance, convex guarantee, to approximate the true SimT. Specifically, volume regularization minimizes the volume of simplex formed by rows of the non-square SimT, which ensures outputs of segmentation model to fit into the ground truth label distribution. To compensate for the lack of open-set knowledge, anchor guidance and convex guarantee are devised to facilitate the modeling of open-set noise distribution and enhance the discriminative feature learning among closed-set and open-set classes. The estimated SimT is further utilized to correct noise issues in pseudo labels and promote the generalization ability of segmentation model on target domain data. Extensive experimental results demonstrate that the proposed SimT can be flexibly plugged into existing DA methods to boost the performance. The source code is available at https://github.com/CityU-AIM-Group/SimT.

preprint2022arXiv

Towards Effective Depthwise Convolutions on ARMv8 Architecture

Depthwise convolutions are widely used in lightweight convolutional neural networks (CNNs). Unlike classic convolutions, the performance of depthwise convolutions is mainly bounded by memory access rather than arithmetic operations, so direct algorithms are often more efficient than indirect ones (matrix multiplication-, Winograd-, and FFT-based convolutions), which incur additional memory accesses. However, the existing direct implementations of depthwise convolutions on ARMv8 architectures feature a poor trade-off between register-level reuse of different tensors, which usually leads to sub-optimal performance. In this paper, we propose new direct implementations of depthwise convolutions by means of implicit padding, register tiling, etc., which cover the forward propagation, backward propagation, and weight gradient update procedures. Compared to the existing ones, our new implementations incur much less communication overhead between registers and cache. Experimental results on two ARMv8 CPUs show that our implementations deliver on average 4.88x and 16.4x performance improvement over the existing direct ones in open source libraries and the matrix multiplication-based ones in PyTorch, respectively.

preprint2022arXiv

Two-Stage Mesh Deep Learning for Automated Tooth Segmentation and Landmark Localization on 3D Intraoral Scans

Accurately segmenting teeth and identifying the corresponding anatomical landmarks on dental mesh models are essential in computer-aided orthodontic treatment. Manually performing these two tasks is time-consuming, tedious, and, more importantly, highly dependent on orthodontists' experiences due to the abnormality and large-scale variance of patients' teeth. Some machine learning-based methods have been designed and applied in the orthodontic field to automatically segment dental meshes (e.g., intraoral scans). In contrast, the number of studies on tooth landmark localization is still limited. This paper proposes a two-stage framework based on mesh deep learning (called TS-MDL) for joint tooth labeling and landmark identification on raw intraoral scans. Our TS-MDL first adopts an end-to-end iMeshSegNet method (i.e., a variant of the existing MeshSegNet with both improved accuracy and efficiency) to label each tooth on the downsampled scan. Guided by the segmentation outputs, our TS-MDL further selects each tooth's region of interest (ROI) on the original mesh to construct a light-weight variant of the pioneering PointNet (i.e., PointNet-Reg) for regressing the corresponding landmark heatmaps. Our TS-MDL was evaluated on a real-clinical dataset, showing promising segmentation and localization performance. Specifically, iMeshSegNet in the first stage of TS-MDL reached an averaged Dice similarity coefficient (DSC) at $0.964\pm0.054$, significantly outperforming the original MeshSegNet. In the second stage, PointNet-Reg achieved a mean absolute error (MAE) of $0.597\pm0.761$ mm in distances between the prediction and ground truth for $66$ landmarks, which is superior compared with other networks for landmark detection. All these results suggest the potential usage of our TS-MDL in orthodontics.

preprint2022arXiv

Weak solutions of the three-dimensional hypoviscous elastodynamics with finite kinetic energy

We construct weak solutions to the 3D hypoviscous incompressible elastodynamics with finite kinetic energy, whose existence was previously unknown in the literature. Our result holds for fractional hypoviscosity $(-\Delta)^{\theta}$, where $0\leq\theta<1$. The proof consists of a convex integration scheme with new building blocks of 2D intermittency and suitable temporal correctors, which are motivated by the inherent geometric structure of the viscoelastic equations.

preprint2021arXiv

Accurate Mode-Coupling Characterization of Low-Crosstalk Ring-Core Fibers using Integral Calculation based Swept-Wavelength Interferometry Measurement

In this paper, to accurately characterize the low inter-mode coupling of the weakly-coupled few mode fibers (FMFs), we propose a modified inter-mode coupling characterization method based on swept-wavelength interferometry measurement, in which an integral calculation approach is used to eliminate significant sources of error that may lead to underestimation of the power coupling coefficient. Using the proposed characterization method, a low-crosstalk ring-core fiber (RCF) with low mode dependent loss (MDL) and with single span length up to 100 km is experimentally measured to have low power coupling coefficients between high-order orbital angular momentum (OAM) mode groups of below -30 dB/km over C band. The measured low coupling coefficients based on the proposed method are verified by the direct system power measurements, proving the feasibility and reliability of the proposed inter-mode coupling characterization method.

preprint2021arXiv

Back-n White Neutron Source at CSNS and its Applications

Back-streaming neutrons from the spallation target of the China Spallation Neutron Source (CSNS) that emit through the incoming proton channel were exploited to build a white neutron beam facility (the so-called Back-n white neutron source), which was completed in March 2018. The Back-n neutron beam is very intense, at approximately $2\times10^{7}$ n/cm$^2$/s at 55 m from the target, and has a nominal proton beam with a power of 100 kW in the CSNS-I phase and a kinetic energy of 1.6 GeV and a thick tungsten target in multiple slices with modest moderation from the cooling water through the slices. In addition, the excellent energy spectrum spanning from 0.5 eV to 200 MeV, and a good time resolution related to the time-of-flight measurements make it a typical white neutron source for nuclear data measurements; its overall performance is among that of the best white neutron sources in the world. Equipped with advanced spectrometers, detectors, and application utilities, the Back-n facility can serve wide applications, with a focus on neutron-induced cross-section measurements. This article presents an overview of the neutron beam characteristics, the experimental setups, and the ongoing applications at Back-n.

preprint2021arXiv

Non-Abelian braiding in spin superconductors utilizing the Aharonov-Casher effect

A spin superconductor (SSC) is an exciton condensate state in which the spin-triplet exciton superfluid is charge neutral while carrying spin $2(\hbar/2)$. In analogy to the Majorana zero mode (MZM) in topological superconductors, the interplay between SSC and band topology also gives rise to a specific kind of topological boundary state obeying non-Abelian braiding statistics. Remarkably, the non-Abelian geometric phase here originates from the Aharonov-Casher effect of the "half-charge" rather than the Aharonov-Bohm effect. Such a topological boundary state of the SSC is bound to the vortex of the electric flux gradient and can be experimentally more distinct than the MZM for being electrically charged. This theoretical proposal provides a new avenue for investigating non-Abelian braiding physics without the assistance of MZMs and charge superconductors.

preprint2021arXiv

Phase-dependent cross sections of deuteron-triton fusion in dichromatic intense fields with high-frequency limit

We investigate the influence of strong dichromatic laser fields (i.e., 1ω-2ω and 1ω-3ω) in the high-frequency limit on the cross sections of deuteron-triton (DT) fusion in the Kramers-Henneberger (KH) frame. We focus on the transitions of phase-dependent effects depending on a dimensionless quantity $n_d$, which equals the ratio of the quiver oscillation amplitude to the geometrical touching radius of the deuteron and triton as defined in our previous research. Theoretical calculations show that the angle-dependent as well as phase-dependent Coulomb barrier penetrabilities can be enhanced in dichromatic intense fields, and the corresponding angle-averaged penetrabilities and the fusion cross sections increase significantly compared with the field-free case. Moreover, we find that the peak in the cross sections shifts twice when the frequency becomes sufficiently low or the intensity sufficiently high. The first shift is caused by the angle-dependence effects for sub-barrier fusion, while the second shift is due to the accumulation of over-barrier fusion. These mechanisms are analyzed in detail in this paper.

preprint2021arXiv

Resonant tunneling of deuteron-triton fusion in strong high-frequency electromagnetic fields

We investigate deuteron-triton (DT) fusion in the presence of linearly polarized strong electromagnetic fields in the high-frequency limit, in which a complex spherical square-well potential is exploited to describe the nuclear potential. Within the framework of the Kramers-Henneberger (KH) transformation, we have calculated the total and angular differential fusion cross sections by investigating the asymptotic phase shifts of the Coulomb wavefunctions. Introducing a dimensionless quantity $n_d$ representing the ratio of the particle quiver oscillation amplitude to the radius of the nuclear potential, we find that, even though the tunneling probability of passing through the Coulomb repulsive potential remains almost identical to that in the absence of electromagnetic fields, the peak of the total fusion cross sections shows an apparent shift from the well-known value of 110 keV to 78 keV for $n_d=0.01$. The angular differential cross sections also show some resonance peaks that shift from zero inclination angle to $\pi/2$ with increasing $n_d$. The corresponding astrophysical $S$-factors are found to be enhanced several times in amplitude. With the help of Wentzel-Kramers-Brillouin (WKB) approximate wavefunctions, the shape-resonance tunneling mechanism behind the above findings is uncovered and some implications are discussed.

preprint2021arXiv

Truncation-Free Matching System for Display Advertising at Alibaba

The matching module plays a critical role in display advertising systems. Without a query from the user, it is challenging for the system to match user traffic and ads suitably. The system packs a group of users with common properties, such as the same gender or similar shopping interests, into a crowd. Here the term crowd can be viewed as a tag over users. Advertisers then bid for different crowds and deliver their ads to those targeted users. The matching module in most industrial display advertising systems follows a two-stage paradigm. When receiving a user request, the matching system (i) finds the crowds that the user belongs to; (ii) retrieves all ads that have targeted those crowds. However, in applications such as display advertising at Alibaba, with very large volumes of crowds and ads, both stages of matching have to truncate the long-tailed parts for online serving under limited latency. That is to say, not all ads have the chance to participate in online matching. This leads to sub-optimal results for both advertising performance and platform revenue. In this paper, we study the truncation problem and propose a Truncation Free Matching System (TFMS). The basic idea is to decouple the matching computation from the online pipeline. Instead of executing the two-stage matching when a user visits, TFMS utilizes a near-line truncation-free matching to pre-calculate and store the top valuable ads for each user. Then the online pipeline just needs to fetch the pre-stored ads as matching results. In this way, we can escape the online system's latency and computation cost limitations, and leverage flexible computation resources to finish the user-ad matching. TFMS has been deployed in our production system since 2019, bringing (i) more than 50% improvement in impressions for advertisers who encountered truncation before, (ii) a 9.4% Revenue Per Mille gain, which is significant enough for the business.

preprint2021arXiv

Unveiling non-Abelian statistics of vortex Majorana bound states in iron-based superconductors using fermionic modes

Motivated by the recent experiments that reported the discovery of vortex Majorana bound states (vMBSs) in iron-based superconductors, we establish a portable scheme to unveil the non-Abelian statistics of vMBSs using normal fermionic modes. The unique non-Abelian statistics of vMBSs is characterized by the charge flip signal of the fermions, which can be easily read out through a charge sensing measurement. In particular, the charge flip signal is significantly suppressed for strongly hybridized vMBSs or trivial vortex modes, which efficiently identifies genuine vMBSs. To eliminate the error induced by the unnecessary dynamical evolution of the fermionic modes, we further propose a correction strategy that continually reverses the energy of the fermions, reminiscent of the quantum Zeno effect. Finally, we establish a feasible protocol to perform non-Abelian braiding operations on vMBSs.

preprint2020arXiv

A Comprehensive Survey of Grammar Error Correction

Grammar error correction (GEC) is an important application of natural language processing techniques. The past decade has witnessed significant progress in GEC owing to the increasing popularity of machine learning and deep learning, especially in the late 2010s, when near human-level GEC systems became available. However, no prior work has focused on recapitulating this progress as a whole. We present the first survey of GEC for a comprehensive retrospect of the literature in this area. We first introduce five public datasets, data annotation schema, two important shared tasks, and four standard evaluation metrics. More importantly, we discuss four kinds of basic approaches, including the statistical machine translation based approach, neural machine translation based approach, classification based approach, and language model based approach, as well as six commonly applied performance boosting techniques for GEC systems and two data augmentation methods. Since GEC is typically viewed as a sister task of machine translation, many GEC systems are based on neural machine translation (NMT) approaches, where the neural sequence-to-sequence model is applied. Similarly, some performance boosting techniques are adapted from machine translation and are successfully combined with GEC systems to enhance the final performance. Furthermore, we analyze basic approaches, performance boosting techniques, and integrated GEC systems based on their respective experimental results to draw clearer patterns and conclusions. Finally, we discuss five prospective directions for future GEC research.

preprint2020arXiv

Adaptive Neural Network-Based Approximation to Accelerate Eulerian Fluid Simulation

The Eulerian fluid simulation is an important HPC application. The neural network has been applied to accelerate it. The current methods that accelerate the fluid simulation with neural networks lack flexibility and generalization. In this paper, we tackle the above limitation and aim to enhance the applicability of neural networks in the Eulerian fluid simulation. We introduce Smartfluidnet, a framework that automates model generation and application. Given an existing neural network as input, Smartfluidnet generates multiple neural networks before the simulation to meet the execution time and simulation quality requirements. During the simulation, Smartfluidnet dynamically switches between the neural networks to make a best effort to reach the user requirement on simulation quality. Evaluating with 20,480 input problems, we show that Smartfluidnet achieves 1.46x and 590x speedup compared with a state-of-the-art neural network model and the original fluid simulation, respectively, on an NVIDIA Titan X Pascal GPU, while providing better simulation quality than the state-of-the-art model.

preprint2020arXiv

AIM 2020 Challenge on Efficient Super-Resolution: Methods and Results

This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor x4 based on a set of prior examples of low and corresponding high resolution images. The goal is to devise a network that reduces one or several aspects such as runtime, parameter count, FLOPs, activations, and memory consumption while at least maintaining PSNR of MSRResNet. The track had 150 registered participants, and 25 teams submitted the final results. They gauge the state-of-the-art in efficient single image super-resolution.

preprint2020arXiv

An efficient and high-resolution topology optimization method based on convolutional neural networks

2. In Section 3, we used some vague statements to affirm the training process of the neural network, which cannot support others to reproduce the results of the paper. In addition, this section does not show the difference between this paper and other work, nor does it reflect innovation. 3. In Section 5, the numerical examples are not compared with other multi-resolution methods, which is not enough to explain the superiority of the method proposed in this paper. Figure 10 also fails to show that this method is significantly improved compared to the traditional method. Thanks for your understanding.

Topology optimization is a pioneering design method that can provide various candidates with high mechanical properties. However, high resolution is highly desirable for the optimum structures, which in turn normally leads to a computationally intractable problem, especially for the well-known Solid Isotropic Material with Penalization (SIMP) method. In this paper, we introduce the Super-Resolution Convolutional Neural Network (SRCNN) technique into the topology optimization framework to improve the resolution of topology solutions with extremely high computational efficiency. Additionally, a pooling strategy is established to balance the number of finite element analyses (FEA) against the output mesh in the optimization process. Considering the high training cost of 3D neural networks, several 2D neural networks are combined to deal with 3D topology optimization design problems. The combined treatment method used in 3D topology optimization design eliminates the expense of retraining a 3D convolutional neural network and guarantees the quality of the 3D design. Some typical examples demonstrate that the high-resolution topology optimization method adopting SRCNN has excellent applicability and high efficiency.

preprint2020arXiv

Colonoscopy Polyp Detection: Domain Adaptation From Medical Report Images to Real-time Videos

Automatic colorectal polyp detection in colonoscopy video is a fundamental task, which has received a lot of attention. Manually annotating polyp region in a large scale video dataset is time-consuming and expensive, which limits the development of deep learning techniques. A compromise is to train the target model by using labeled images and infer on colonoscopy videos. However, there are several issues between the image-based training and video-based inference, including domain differences, lack of positive samples, and temporal smoothness. To address these issues, we propose an Image-video-joint polyp detection network (Ivy-Net) to address the domain gap between colonoscopy images from historical medical reports and real-time videos. In our Ivy-Net, a modified mixup is utilized to generate training data by combining the positive images and negative video frames at the pixel level, which could learn the domain adaptive representations and augment the positive samples. Simultaneously, a temporal coherence regularization (TCR) is proposed to introduce the smooth constraint on feature-level in adjacent frames and improve polyp detection by unlabeled colonoscopy videos. For evaluation, a new large colonoscopy polyp dataset is collected, which contains 3056 images from historical medical reports of 889 positive patients and 7.5-hour videos of 69 patients (28 positive). The experiments on the collected dataset demonstrate that our Ivy-Net achieves the state-of-the-art result on colonoscopy video.
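The pixel-level combination of a positive image with a negative video frame can be illustrated with plain mixup blending. This sketch shows only the standard convex combination; the abstract does not describe what the paper's modification to mixup is, so the function name `mixup_pixels`, the weight `lam`, and the nested-list image representation are assumptions for illustration.

```python
def mixup_pixels(positive_image, negative_frame, lam):
    """Blend two equally sized grayscale images pixel by pixel.

    Each output pixel is lam * positive + (1 - lam) * negative. This is
    standard mixup, not the modified variant used in Ivy-Net.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(positive_image) != len(negative_frame):
        raise ValueError("images must have the same height")
    blended = []
    for positive_row, negative_row in zip(positive_image, negative_frame):
        if len(positive_row) != len(negative_row):
            raise ValueError("images must have the same width")
        blended.append(
            [lam * p + (1.0 - lam) * n for p, n in zip(positive_row, negative_row)]
        )
    return blended


# Toy 1x2 "images" with made-up intensities.
mixed = mixup_pixels([[100, 200]], [[0, 100]], lam=0.5)
```

With equal weighting, the toy example produces the midpoint of each pixel pair, so the blended sample carries content from both the report image and the video frame.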

preprint2020arXiv

Computational Performance of a Germline Variant Calling Pipeline for Next Generation Sequencing

With the boom of next generation sequencing technology and its implementation in clinical practice and life science research, the need for faster and more efficient data analysis methods has become pressing in the field of sequencing. Here we report on the evaluation of an optimized germline mutation calling pipeline, HummingBird, by assessing its performance against the widely accepted BWA-GATK pipeline. We found that the HummingBird pipeline can significantly reduce the running time of the primary data analysis for whole genome sequencing and whole exome sequencing without significantly sacrificing the variant calling accuracy. Thus, we conclude that expanded use of such software will help to improve the primary data analysis efficiency for next generation sequencing.

preprint2020arXiv

CTM: Collaborative Temporal Modeling for Action Recognition

With the rapid development of digital multimedia, video understanding has become an important field. For action recognition, the temporal dimension plays an important role, and this is quite different from image recognition. In order to learn powerful features of videos, we propose a Collaborative Temporal Modeling (CTM) block (Figure 1) to learn temporal information for action recognition. Besides a parameter-free identity shortcut, CTM, as a separate temporal modeling block, includes two collaborative paths: a spatial-aware temporal modeling path, built with the proposed Temporal-Channel Convolution Module (TCCM), which uses unshared parameters for each spatial position (H*W), and a spatial-unaware temporal modeling path. CTM blocks can seamlessly be inserted into many popular networks to generate CTM Networks and bring the capability of learning temporal information to 2D CNN backbone networks, which only capture spatial information. Experiments on several popular action recognition datasets demonstrate that CTM blocks bring performance improvements over 2D CNN baselines, and our method achieves competitive results against the state-of-the-art methods. Code will be made publicly available.

preprint2020arXiv

Deep Convolutional Neural Network-based Bernoulli Heatmap for Head Pose Estimation

Head pose estimation is a crucial problem for many tasks, such as driver attention, fatigue detection, and human behaviour analysis. It is well known that neural networks are better at handling classification problems than regression problems. It is an extremely nonlinear process to let the network output the angle value directly for optimization learning, and the weight constraint of the loss function will be relatively weak. This paper proposes a novel Bernoulli heatmap for head pose estimation from a single RGB image. Our method can achieve the positioning of the head area while estimating the angles of the head. The Bernoulli heatmap makes it possible to construct fully convolutional neural networks without fully connected layers and provides a new idea for the output form of head pose estimation. A deep convolutional neural network (CNN) structure with multiscale representations is adopted to maintain high-resolution information and low-resolution information in parallel. This kind of structure can maintain rich, high-resolution representations. In addition, channelwise fusion is adopted to make the fusion weights learnable instead of simple addition with equal weights. As a result, the estimation is spatially more precise and potentially more accurate. The effectiveness of the proposed method is empirically demonstrated by comparing it with other state-of-the-art methods on public datasets.

preprint2020arXiv

Dipolar spin waves in uniaxial easy-axis antiferromagnets: A natural topological nodal-line semimetal

The existence of the magnetostatic surface spin waves in ferromagnets, known as Damon-Eshbach mode, was recently demonstrated to originate from the topology of the dipole-dipole interaction. In this work, we study the topological characteristics of magnons in easy-axis antiferromagnets with uniaxial anisotropy. The dipolar spin waves are found to be, driven by the dipole-dipole interaction, in a topological nodal-line semimetal phase, which hosts Damon-Eshbach-type surface modes due to the bulk-edge correspondence. The long wavelength character of dipolar spin waves makes our proposal valid for any natural uniaxial easy-axis antiferromagnet, and thus enriches the candidates of topological magnonic materials. In contrast to the nonreciprocal property in ferromagnetic case, the surface modes with opposite momentum coexist at each surface, but with different chiralities. Such a chirality-momentum or spin-momentum locking, similar to that of electronic surface states in topological insulators, offers the opportunity to design novel chirality-based magnonic devices in antiferromagnets.

preprint2020arXiv

Disorder effects in the two-dimensional Lieb lattice and its extensions

We study the localization properties of the two-dimensional Lieb lattice and its extensions in the presence of disorder using transfer matrix method and finite-size scaling. We find that all states in the Lieb lattice and its extensions are localized for $W \geq 1$. Clear differences in the localization properties between disordered flat band and disordered dispersive bands are identified. Our results complement previous experimental studies of clean photonic Lieb lattices and provide information about their stability with respect to disorder.

preprint2020arXiv

FLAME: A Self-Adaptive Auto-labeling System for Heterogeneous Mobile Processors

How to accurately and efficiently label data on a mobile device is critical for the success of training machine learning models on mobile devices. Auto-labeling data on mobile devices is challenging, because data is usually incrementally generated and there is possibility of having unknown labels. Furthermore, the rich hardware heterogeneity on mobile devices creates challenges on efficiently executing auto-labeling workloads. In this paper, we introduce Flame, an auto-labeling system that can label non-stationary data with unknown labels. Flame includes a runtime system that efficiently schedules and executes auto-labeling workloads on heterogeneous mobile processors. Evaluating Flame with eight datasets on a smartphone, we demonstrate that Flame enables auto-labeling with high labeling accuracy and high performance.

preprint2020arXiv

Interpolative separable density fitting decomposition for accelerating Hartree-Fock exchange calculations within numerical atomic orbitals

The high cost associated with the evaluation of Hartree-Fock exchange (HFX) makes hybrid functionals computationally challenging for large systems. In this work, we present an efficient way to accelerate HFX calculations with numerical atomic basis sets. Our approach is based on the recently proposed interpolative separable density fitting (ISDF) decomposition to construct a low rank approximation of HFX matrix, which avoids explicit calculations of the electron repulsion integrals (ERIs) and significantly reduces the computational cost. We implement the ISDF method for hybrid functional (PBE0) calculations in the HONPAS package. We take benzene and polycyclic aromatic hydrocarbons molecules as examples and demonstrate that hybrid functionals with ISDF yields quite promising results at a significantly reduced computational cost. Especially, the ISDF approach reduces the total cost for evaluating HFX matrix by nearly 2 orders of magnitude compared to conventional approaches of direct evaluation of ERIs.

preprint2020arXiv

iqiyi Submission to ActivityNet Challenge 2019 Kinetics-700 challenge: Hierarchical Group-wise Attention

In this report, the method for the iqiyi submission to the task of ActivityNet 2019 Kinetics-700 challenge is described. Three models are involved in the model ensemble stage: TSN, HG-NL and StNet. We propose the hierarchical group-wise non-local (HG-NL) module for frame-level feature aggregation for video classification. The standard non-local (NL) module is effective in aggregating frame-level features on the task of video classification but presents low parameter efficiency and high computational cost. The HG-NL method involves a hierarchical group-wise structure and generates multiple attention maps to enhance performance. Based on this hierarchical group-wise structure, the proposed method has competitive accuracy, fewer parameters and smaller computational cost than the standard NL. For the task of ActivityNet 2019 Kinetics-700 challenge, after model ensemble, we finally obtain an averaged top-1 and top-5 error of 28.444% on the test set.

preprint2020arXiv

Learning to Predict More Accurate Text Instances for Scene Text Detection

At present, multi-oriented text detection methods based on deep neural networks have achieved promising performance on various benchmarks. Nevertheless, there are still some difficulties for arbitrary shape text detection, especially for a simple and proper representation of arbitrary shape text instances. In this paper, a pixel-based text detector is proposed to facilitate the representation and prediction of text instances with arbitrary shapes in a simple manner. Firstly, to alleviate the effect of the target vertex sorting and achieve the direct regression of arbitrary shape text instances, the starting-point-independent coordinates regression loss is proposed. Furthermore, to predict more accurate text instances, the text instance accuracy loss is proposed as an auxiliary task to refine the predicted coordinates under the guidance of IoU. To evaluate the effectiveness of our detector, extensive experiments have been carried out on public benchmarks which contain arbitrary shape text instances and multi-oriented text instances. We obtain 84.8% of F-measure on Total-Text benchmark. The results show that our method can reach state-of-the-art performance.

preprint2020arXiv

Nanotechnology-inspired Information Processing Systems of the Future

Nanoscale semiconductor technology has been a key enabler of the computing revolution. It has done so via advances in new materials and manufacturing processes that resulted in the size of the basic building block of computing systems - the logic switch and memory devices - being reduced into the nanoscale regime. Nanotechnology has provided increased computing functionality per unit volume, energy, and cost. In order for computing systems to continue to deliver substantial benefits for the foreseeable future to society at large, it is critical that the very notion of computing be examined in the light of nanoscale realities. In particular, one needs to ask what it means to compute when the very building block - the logic switch - no longer exhibits the level of determinism required by the von Neumann architecture. There needs to be a sustained and heavy investment in a nation-wide Vertically Integrated Semiconductor Ecosystem (VISE). VISE is a program in which research and development is conducted seamlessly across the entire compute stack - from applications, systems and algorithms, architectures, circuits and nanodevices, and materials. A nation-wide VISE provides clear strategic advantages in ensuring the US's global superiority in semiconductors. First, a VISE provides the highest quality seed-corn for nurturing transformative ideas that are critically needed today in order for nanotechnology-inspired computing to flourish. It does so by dramatically opening up new areas of semiconductor research that are inspired and driven by new application needs. Second, a VISE creates a very high barrier to entry from foreign competitors because it is extremely hard to establish, and even harder to duplicate.

preprint2020arXiv

New timing measurement results of 16 pulsars

A pulsar's position, proper motion and parallax are important parameters in timing equations. It is really challenging to fit astrometric parameters accurately through pulsar timing, especially for pulsars that show irregular timing properties. With the fast development of related techniques, it is possible to measure astrometric parameters of more and more pulsars in a model-independent manner with Very Long Baseline Interferometry (VLBI). In this work, we select 16 normal pulsars, whose parallax and proper motion have not been successfully fitted with timing observations or show obvious differences with corresponding latest VLBI solutions, and do further studies on their timing properties. After updating astrometric parameters in pulsar ephemerides with the latest VLBI measurements, we derive the latest rotation solutions of these pulsars with observation data at S and C-band obtained from the Shanghai Tian Ma Radio Telescope (TMRT). Compared with spin frequency $ν$ inferred from previous rotation solutions, the newly-fitted $ν$ show differences larger than 10$^{-9}$ Hz for most pulsars. The contribution of the Shklovsky effect to period derivative $\dot{P}$ can be properly removed by taking advantage of the accurate proper motion and distance of target pulsars measured by VLBI astrometry. This further leads to a precise estimate of intrinsic characteristic age $τ_{\rm c}$. Differences between the newly-measured $τ_{\rm c}$ and corresponding previous results are as large as 2% for some pulsars. VLBI astrometric parameter solutions also lead to better measurements of timing irregularities. For PSR B0154$+$61, the glitch epoch (MJD 58279.5) measured with previous ephemeris is about 13 d later than the result (MJD 58266.4) obtained with VLBI astrometric parameter solutions.
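The Shklovsky correction mentioned above can be sketched numerically. The sketch assumes the standard textbook expressions, $\dot{P}_{\rm Shk} = P μ^2 d / c$ for the apparent period derivative caused by transverse motion and $τ_{\rm c} = P / (2\dot{P})$ for the characteristic age; the function names and the input values in the usage lines are hypothetical and are not taken from the paper.

```python
import math

C_M_PER_S = 299_792_458.0                           # speed of light
KPC_IN_M = 3.0856775814913673e19                    # one kiloparsec in metres
MAS_IN_RAD = math.pi / (180.0 * 3600.0 * 1000.0)    # one milliarcsecond in radians
YEAR_IN_S = 365.25 * 86400.0                        # Julian year in seconds


def shklovsky_pdot(period_s, mu_mas_per_yr, distance_kpc):
    """Apparent period derivative (s/s) induced by transverse motion."""
    mu_rad_per_s = mu_mas_per_yr * MAS_IN_RAD / YEAR_IN_S
    distance_m = distance_kpc * KPC_IN_M
    return period_s * mu_rad_per_s ** 2 * distance_m / C_M_PER_S


def intrinsic_characteristic_age_yr(period_s, pdot_observed, mu_mas_per_yr, distance_kpc):
    """Characteristic age in years after subtracting the Shklovsky term."""
    pdot_intrinsic = pdot_observed - shklovsky_pdot(period_s, mu_mas_per_yr, distance_kpc)
    if pdot_intrinsic <= 0.0:
        raise ValueError("Shklovsky term exceeds the observed period derivative")
    return period_s / (2.0 * pdot_intrinsic) / YEAR_IN_S


# Hypothetical pulsar: P = 0.5 s, observed Pdot = 1e-15, mu = 50 mas/yr, d = 2 kpc.
age_corrected = intrinsic_characteristic_age_yr(0.5, 1e-15, 50.0, 2.0)
age_uncorrected = intrinsic_characteristic_age_yr(0.5, 1e-15, 0.0, 2.0)
```

Because the Shklovsky term is always positive, subtracting it lowers the intrinsic period derivative and therefore raises the characteristic age; other kinematic contributions (for example Galactic acceleration) are ignored in this sketch.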

preprint2020arXiv

Noise signatures for determining chiral Majorana fermion modes

The conductance measurement of a half quantized plateau in a quantum anomalous Hall insulator-superconductor structure is reported by a recent experiment [Q. L. He \textit{et al.}, Science 357, 294-299 (2017)], which suggests the existence of the chiral Majorana fermion modes. However, such a half quantized conductance plateau may also originate from a disorder-induced metallic phase. To identify the exact mechanism, we study the transport properties of such a system in the presence of strong disorder. Our results show that the local current density distributions of these two mechanisms are different. In particular, the current noise measurement can be used to distinguish them without any further fabrication on the current experimental setup.

preprint2020arXiv

Non-Abelian braiding of Dirac fermionic modes using topological corner states in higher-order topological insulator

We numerically demonstrate that the topological corner states residing in the corners of a higher-order topological insulator possess non-Abelian braiding properties. Such topological corner states are Dirac fermionic modes rather than Majorana zero-modes. We claim that Dirac fermionic modes protected by nontrivial topology also support non-Abelian braiding. An analytical description of such non-Abelian braiding is conducted based on the vortex-induced Dirac-type fermionic modes. The braiding operator for Dirac fermionic modes is also analytically derived and compared with that of the Majorana zero-modes. Experimentally, such a non-Abelian braiding operation on Dirac fermionic modes is proposed to be tested through topological electric circuits.

preprint2020arXiv

Poet: Product-oriented Video Captioner for E-commerce

In e-commerce, a growing number of user-generated videos are used for product promotion. How to generate video descriptions that narrate the user-preferred product characteristics depicted in the video is vital for successful promoting. Traditional video captioning methods, which focus on routinely describing what exists and happens in a video, are not amenable for product-oriented video captioning. To address this problem, we propose a product-oriented video captioner framework, abbreviated as Poet. Poet firstly represents the videos as product-oriented spatial-temporal graphs. Then, based on the aspects of the video-associated product, we perform knowledge-enhanced spatial-temporal inference on those graphs for capturing the dynamic change of fine-grained product-part characteristics. The knowledge leveraging module in Poet differs from the traditional design by performing knowledge filtering and dynamic memory modeling. We show that Poet achieves consistent performance improvement over previous methods concerning generation quality, product aspects capturing, and lexical diversity. Experiments are performed on two product-oriented video captioning datasets, buyer-generated fashion video dataset (BFVD) and fan-generated fashion video dataset (FFVD), collected from Mobile Taobao. We will release the desensitized datasets to promote further investigations on both video captioning and general video analysis problems.

preprint2020arXiv

Simulating periodic systems on quantum computer

The variational quantum eigensolver (VQE) is one of the most appealing quantum algorithms to simulate electronic structure properties of molecules on near-term noisy intermediate-scale quantum devices. In this work, we generalize the VQE algorithm for simulating extended systems. However, the numerical study of a one-dimensional (1D) infinite hydrogen chain using existing VQE algorithms shows a remarkable deviation of the ground state energy with respect to the exact full configuration interaction (FCI) result. Here, we present two schemes to improve the accuracy of quantum simulations for extended systems. The first one is a modified VQE algorithm, which introduces a unitary transformation of Hartree-Fock orbitals to avoid the complex Hamiltonian. The second one is a Post-VQE approach combining VQE with the quantum subspace expansion approach (VQE/QSE). Numerical benchmark calculations demonstrate that both schemes provide a sufficiently accurate description of the potential energy curve of the 1D hydrogen chain. In addition, excited states computed with the VQE/QSE approach also agree very well with FCI results.

preprint2020arXiv

Spectral self-adaptive absorber/emitter for harvesting energy from the sun and outer space

The sun (~6000 K) and outer space (~3 K) are the original heat source and sink for human beings on Earth. The energy applications of absorbing solar irradiation and harvesting the coldness of outer space for energy utilization have attracted considerable interest from researchers. However, combining these two functions in a static device for continuous energy harvesting is unachievable due to the intrinsic infrared spectral conflict. In this study, we developed spectral self-adaptive absorber/emitter (SSA/E) for daytime photothermal and nighttime radiative sky cooling modes depending on the phase transition of the vanadium dioxide coated layer. A 24-hour day-night test showed that the fabricated SSA/E has continuous energy harvesting ability and improved overall energy utilization performance, thus showing remarkable potential in future energy applications.

preprint2020arXiv

Stochastic Learning for Sparse Discrete Markov Random Fields with Controlled Gradient Approximation Error

We study the $L_1$-regularized maximum likelihood estimator/estimation (MLE) problem for discrete Markov random fields (MRFs), where efficient and scalable learning requires both sparse regularization and approximate inference. To address these challenges, we consider a stochastic learning framework called stochastic proximal gradient (SPG; Honorio 2012a, Atchade et al. 2014, Miasojedow and Rejchel 2016). SPG is an inexact proximal gradient algorithm [Schmidt et al., 2011], whose inexactness stems from the stochastic oracle (Gibbs sampling) for gradient approximation - exact gradient evaluation is infeasible in general due to the NP-hard inference problem for discrete MRFs [Koller and Friedman, 2009]. Theoretically, we provide novel verifiable bounds to inspect and control the quality of gradient approximation. Empirically, we propose the tighten asymptotically (TAY) learning strategy based on the verifiable bounds to boost the performance of SPG.
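A single proximal gradient step for an $L_1$-regularized objective can be sketched as a gradient move followed by soft-thresholding. This is a generic illustration, not the paper's implementation: the gradient estimate is passed in as a plain list (in SPG it would come from Gibbs samples), and the names `soft_threshold`, `proximal_gradient_step`, `step_size`, and `l1_weight` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |x|: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def proximal_gradient_step(theta, grad_estimate, step_size, l1_weight):
    """One update: move along the (possibly inexact) negative gradient,
    then apply the L1 proximal operator coordinate by coordinate."""
    return [
        soft_threshold(t - step_size * g, step_size * l1_weight)
        for t, g in zip(theta, grad_estimate)
    ]


# Hypothetical parameters and a hypothetical noisy gradient estimate.
theta = [0.5, -0.25, 0.05]
grad_estimate = [1.0, -1.0, 0.0]
theta_next = proximal_gradient_step(theta, grad_estimate, step_size=0.1, l1_weight=1.0)
```

The soft-thresholding step is what produces exact zeros in the parameter vector, which is why a small coordinate such as the third one above is set to zero rather than merely shrunk.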

preprint2020arXiv

Super-exponential diffusion in nonlinear non-Hermitian systems

We investigate the quantum diffusion of a periodically kicked particle subject to both nonlinearity-induced self-interactions and $\mathcal{PT}$-symmetric potentials. We find that, due to the interplay between the nonlinearity and non-Hermiticity, the expectation value of the mean square of momentum scales with time in a super-exponential form $\langle p^2(t)\rangle\propto\exp[β\exp(αt)]$, which is faster than any known rates of quantum diffusion. In the $\mathcal{PT}$-symmetry-breaking phase, the intensity of a state increases exponentially with time, leading to the exponential growth of the interaction strength. The feedback of the intensity-dependent nonlinearity further turns the interaction energy into the kinetic energy, resulting in a super-exponential growth of the mean energy. These theoretical predictions are in good agreement with numerical simulations in a $\mathcal{PT}$-symmetric nonlinear kicked particle. Our discovery establishes a new mechanism of diffusion in interacting and dissipative quantum systems. Important implications and possible experimental observations are discussed.

preprint2020arXiv

Supporting the Problem-Solving Loop: Designing Highly Interactive Optimisation Systems

Efficient optimisation algorithms have become important tools for finding high-quality solutions to hard, real-world problems such as production scheduling, timetabling, or vehicle routing. These algorithms are typically "black boxes" that work on mathematical models of the problem to solve. However, many problems are difficult to fully specify, and require a "human in the loop" who collaborates with the algorithm by refining the model and guiding the search to produce acceptable solutions. Recently, the Problem-Solving Loop was introduced as a high-level model of such interactive optimisation. Here, we present and evaluate nine recommendations for the design of interactive visualisation tools supporting the Problem-Solving Loop. They range from the choice of visual representation for solutions and constraints to the use of a solution gallery to support exploration of alternate solutions. We first examined the applicability of the recommendations by investigating how well they had been supported in previous interactive optimisation tools. We then evaluated the recommendations in the context of the vehicle routing problem with time windows (VRPTW). To do so we built a sophisticated interactive visual system for solving VRPTW that was informed by the recommendations. Ten participants then used this system to solve a variety of routing problems. We report on participant comments and interaction patterns with the tool. These showed the tool was regarded as highly usable and the results generally supported the usefulness of the underlying recommendations.

preprint2020arXiv

Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning

In existing visual representation learning tasks, deep convolutional neural networks (CNNs) are often trained on images annotated with single tags, such as ImageNet. However, a single tag cannot describe all important contents of one image, and some useful visual information may be wasted during training. In this work, we propose to train CNNs from images annotated with multiple tags, to enhance the quality of visual representation of the trained CNN model. To this end, we build a large-scale multi-label image database with 18M images and 11K categories, dubbed Tencent ML-Images. We efficiently train the ResNet-101 model with multi-label outputs on Tencent ML-Images, taking 90 hours for 60 epochs, based on a large-scale distributed deep learning framework, i.e., TFplus. The good quality of the visual representation of the Tencent ML-Images checkpoint is verified through three transfer learning tasks, including single-label image classification on ImageNet and Caltech-256, object detection on PASCAL VOC 2007, and semantic segmentation on PASCAL VOC 2012. The Tencent ML-Images database, the checkpoints of ResNet-101, and all the training code have been released at https://github.com/Tencent/tencent-ml-images. It is expected to promote other vision tasks in the research and industry community.

preprint2020arXiv

Triaging moderate COVID-19 and other viral pneumonias from routine blood tests

COVID-19 is sweeping the world with deadly consequences. Its contagious nature and clinical similarity to other pneumonias make separating subjects who have contracted COVID-19 from those with non-COVID-19 viral pneumonia a priority and a challenge. However, COVID-19 testing has been greatly limited by the availability and cost of existing methods, even in developed countries like the US. Intrigued by the wide availability of routine blood tests, we propose to leverage them for COVID-19 testing using the power of machine learning. Two proven-robust machine learning model families, random forests (RFs) and support vector machines (SVMs), are employed to tackle the challenge. Trained on blood data from 208 moderate COVID-19 subjects and 86 subjects with non-COVID-19 moderate viral pneumonia, the best result is obtained in an SVM-based classifier with an accuracy of 84%, a sensitivity of 88%, a specificity of 80%, and a precision of 92%. The results are found explainable from both machine learning and medical perspectives. A privacy-protected web portal is set up to help medical personnel in their practice and the trained models are released for developers to further build other applications. We hope our results can help the world fight this pandemic and welcome clinical verification of our approach on larger populations.
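The four reported metrics follow from the counts in a binary confusion matrix. The sketch below uses the standard definitions, treating COVID-19 as the positive class; the function name and the counts in the usage line are hypothetical and do not reproduce the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive-class subjects classified correctly/incorrectly.
    tn/fp: negative-class subjects classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # fraction of all subjects classified correctly
        "sensitivity": tp / (tp + fn),      # recall on the positive class
        "specificity": tn / (tn + fp),      # recall on the negative class
        "precision": tp / (tp + fp),        # fraction of positive calls that are correct
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=5, tn=20, fn=10)
```

Sensitivity and specificity are computed within each class, so they do not depend on class balance, whereas accuracy and precision shift with the ratio of positive to negative subjects in the evaluation set.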

preprint2019arXiv

Bound state and non-Markovian dynamics of a quantum emitter around a surface plasmonic nanostructure

A bound state between a quantum emitter (QE) and surface plasmon polaritons (SPPs) can be formed, where the QE is partially stabilized in its excited state. We put forward a general approach for calculating the energy level shift at a negative frequency $ω$, which is just the negative of the nonresonant part for the energy level shift at positive frequency $-ω$. We also propose an efficient formalism for obtaining the long-time value of the excited-state population without calculating the eigenfrequency of the bound state or performing a time evolution of the system, in which the probability amplitude for the excited state in the steady limit is equal to one minus the integral of the evolution spectrum over the positive frequency range. With the above two quantities obtained, we show that the non-Markovian decay dynamics in the presence of a bound state can be obtained by the method based on the Green's function expression for the evolution operator. A general criterion for identifying the existence of a bound state is presented. These are numerically demonstrated for a QE located around a nanosphere and in a gap plasmonic nanocavity. These findings are instructive in the fields of coherent light-matter interactions.

preprint2019arXiv

Double-frequency Aharonov-Bohm effect and non-Abelian braiding properties of Jackiw-Rebbi zero-mode

Ever since its first proposal in 1976, Jackiw-Rebbi zero-mode has been drawing extensive attention for its charming properties including charge fractionalization, topologically protected zero-energy and possible non-Abelian statistics. We investigate these properties through the Jackiw-Rebbi zero-modes in quantum spin Hall insulator. Though charge fractionalization is not manifested, Jackiw-Rebbi zero-mode's zero-energy nature leads to a double-frequency Aharonov-Bohm effect, implying that it can be viewed as a special case of Majorana zero-mode breaking particle-hole symmetry. Such relation is strengthened since Jackiw-Rebbi zero-modes also exhibit non-Abelian braiding properties in the absence of superconductivity, and the symmetry-protected degeneracy of both Jackiw-Rebbi and Majorana zero-modes is proved to be equally important as the topological gap for their non-Abelian statistics.

preprint2019arXiv

Majorana zero modes by engineering topological kink states in two dimensional electron gas

Majorana zero modes (MZMs)--bearing potential applications for topological quantum computing--are verified in quasi-one-dimensional (1D) fermion systems, including semiconductor nanowires, magnetic atomic chains, and planar Josephson junctions. However, the existence of multiple bands in these systems makes the MZMs fragile to the influence of disorder. Moreover, from a practical perspective, the proximity-induced superconductivity may be difficult and restricted for 1D systems. Here, we propose a flexible route to realize MZMs through 1D topological kink states by engineering a 2D electron gas with antidot lattices, in which both the aforementioned issues can be avoided owing to the robustness of kink states and the intrinsically attainable superconductivity in high-dimensional systems. The MZMs are verified to be quite robust against disorder and the bending of kink states, and can be conveniently tuned by varying the Rashba spin-orbit coupling strength. Our proposal provides an experimentally feasible platform for MZMs with systematic manipulability and assembleability based on the present techniques in 2D electron gas systems.

preprint2019arXiv

Odd singular vector formula for general linear Lie superalgebras

We establish a closed formula for a singular vector of weight $λ-β$ in the Verma module of highest weight $λ$ for Lie superalgebra $\mathfrak{gl}(m|n)$ when $λ$ is atypical with respect to an odd positive root $β$. It is further shown that this vector is unique up to a scalar multiple, and it descends to a singular vector, again unique up to a scalar multiple, in the corresponding Kac module when both $λ$ and $λ-β$ are dominant integral.