Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
23works
0followers
21topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

23 published item(s)

preprint2026arXiv

GraphPI: Efficient Protein Inference with Graph Neural Networks

The integration of deep learning approaches in biomedical research has been transformative, enabling breakthroughs in various applications. Despite these strides, its application in protein inference is impeded by the scarcity of extensively labeled datasets, a challenge compounded by the high costs and complexities of accurate protein annotation. In this study, we introduce GraphPI, a novel framework that treats protein inference as a node classification problem. We treat proteins as interconnected nodes within a protein-peptide-PSM graph, utilizing a Graph Neural Network-based architecture to elucidate their interrelations. To address label scarcity, we train the model on a set of unlabeled public protein datasets with pseudo-labels derived from an existing protein inference algorithm, enhanced by self-training to iteratively refine labels based on confidence scores. Contrary to prevalent methodologies necessitating dataset-specific training, our research illustrates that GraphPI, due to the well normalized nature of Percolator features, exhibits universal applicability without dataset-specific fine-tuning, a feature that not only mitigates the risk of overfitting but also enhances computational efficiency. Our empirical experiments reveal notable performance on various test datasets and deliver significantly reduced computation times compared to common protein inference algorithms.

preprint2022arXiv

An Upper Limit of Decaying Rate with Respect to Frequency in Deep Neural Network

Deep neural network (DNN) usually learns the target function from low to high frequency, which is called frequency principle or spectral bias. This frequency principle sheds light on a high-frequency curse of DNNs -- difficult to learn high-frequency information. Inspired by the frequency principle, a series of works are devoted to develop algorithms for overcoming the high-frequency curse. A natural question arises: what is the upper limit of the decaying rate w.r.t. frequency when one trains a DNN? In this work, our theory, confirmed by numerical experiments, suggests that there is a critical decaying rate w.r.t. frequency in DNN training. Below the upper limit of the decaying rate, the DNN interpolates the training data by a function with a certain regularity. However, above the upper limit, the DNN interpolates the training data by a trivial function, i.e., a function is only non-zero at training data points. Our results indicate a better way to overcome the high-frequency curse is to design a proper pre-condition approach to shift high-frequency information to low-frequency one, which coincides with several previous developed algorithms for fast learning high-frequency information. More importantly, this work rigorously proves that the high-frequency curse is an intrinsic difficulty of DNNs.

preprint2022arXiv

Asymptotic-Preserving Neural Networks for Multiscale Time-Dependent Linear Transport Equations

In this paper we develop a neural network for the numerical simulation of time-dependent linear transport equations with diffusive scaling and uncertainties. The goal of the network is to resolve the computational challenges of curse-of-dimensionality and multiple scales of the problem. We first show that a standard Physics-Informed Neural Network (PINN) fails to capture the multiscale nature of the problem, hence justifies the need to use Asymptotic-Preserving Neural Networks (APNNs). We show that not all classical AP formulations are fit for the neural network approach. We construct a micro-macro decomposition based neural network, and also build in a mass conservation mechanism into the loss function, in order to capture the dynamic and multiscale nature of the solutions. Numerical examples are used to demonstrate the effectiveness of this APNNs.

preprint2022arXiv

On the Ergodic Rate of Cognitive Radio Inspired Uplink Multiple Access

With the exponential increase of the number of devices in the communication ecosystem toward the upcoming sixth generation (6G) of wireless networks, more enabling technologies and potential wireless architectures are necessary to fulfill the networking requirements of high throughput, massive connectivity, ultra reliability, and heterogeneous quality of service (QoS). In this work, we consider an uplink network consisting of a primary user (PU) and a secondary user (SU) and, by integrating the concept of cognitive radio and multiple access, two protocols based on rate-splitting multiple access and non-orthogonal multiple access with successive interference cancellation are investigated in terms of ergodic rate. The considered protocols aim to serve the SU in a resource block which is originally allocated solely for the PU without negatively affecting the QoS of the PU. We extract the ergodic rate of the SU considering a specific QoS for the PU for the two protocols. In the numerical results, we validate the theoretical analysis and illustrate the superiority of the considered protocols over two benchmark schemes.

preprint2022arXiv

OpenCalib: A Multi-sensor Calibration Toolbox for Autonomous Driving

Accurate sensor calibration is a prerequisite for multi-sensor perception and localization systems for autonomous vehicles. The intrinsic parameter calibration of the sensor is to obtain the mapping relationship inside the sensor, and the extrinsic parameter calibration is to transform two or more sensors into a unified spatial coordinate system. Most sensors need to be calibrated after installation to ensure the accuracy of sensor measurements. To this end, we present OpenCalib, a calibration toolbox that contains a rich set of various sensor calibration methods. OpenCalib covers manual calibration tools, automatic calibration tools, factory calibration tools, and online calibration tools for different application scenarios. At the same time, to evaluate the calibration accuracy and subsequently improve the accuracy of the calibration algorithm, we released a corresponding benchmark dataset. This paper introduces various features and calibration methods of this toolbox. To our knowledge, this is the first open-sourced calibration codebase containing the full set of autonomous-driving-related calibration approaches in this area. We wish that the toolbox could be helpful to autonomous driving researchers. We have open-sourced our code on GitHub to benefit the community. Code is available at https://github.com/PJLab-ADG/SensorsCalibration.

preprint2022arXiv

Unified Chinese License Plate Detection and Recognition with High Efficiency

Recently, deep learning-based methods have reached an excellent performance on License Plate (LP) detection and recognition tasks. However, it is still challenging to build a robust model for Chinese LPs since there are not enough large and representative datasets. In this work, we propose a new dataset named Chinese Road Plate Dataset (CRPD) that contains multi-objective Chinese LP images as a supplement to the existing public benchmarks. The images are mainly captured with electronic monitoring systems with detailed annotations. To our knowledge, CRPD is the largest public multi-objective Chinese LP dataset with annotations of vertices. With CRPD, a unified detection and recognition network with high efficiency is presented as the baseline. The network is end-to-end trainable with totally real-time inference efficiency (30 fps with 640p). The experiments on several public benchmarks demonstrate that our method has reached competitive performance. The code and dataset will be publicly available at https://github.com/yxgong0/CRPD.

preprint2021arXiv

Automatic Minimisation of Masking in Multitrack Audio using Subgroups

The iterative process of masking minimisation when mixing multitrack audio is a challenging optimisation problem, in part due to the complexity and non-linearity of auditory perception. In this article, we first propose a multitrack masking metric inspired by the MPEG psychoacoustic model. We investigate different audio processing techniques to manipulate the frequency and dynamic characteristics of the signal in order to reduce masking based on the proposed metric. We also investigate whether or not automatically mixing using subgrouping is beneficial or not to perceived quality and clarity of a mix. Evaluation results suggest that our proposed masking metric when used in an automatic mixing framework can be used to reduce inter-channel auditory masking as well as improve the perceived quality and perceived clarity of a mix. Furthermore, our results suggest that using subgrouping in an automatic mixing framework can be used to improve the perceived quality and perceived clarity of a mix.

preprint2021arXiv

MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

In this paper, we propose a a machine learning approach via model-operator-data network (MOD-Net) for solving PDEs. A MOD-Net is driven by a model to solve PDEs based on operator representation with regularization from data. For linear PDEs, we use a DNN to parameterize the Green's function and obtain the neural operator to approximate the solution according to the Green's method. To train the DNN, the empirical risk consists of the mean squared loss with the least square formulation or the variational formulation of the governing equation and boundary conditions. For complicated problems, the empirical risk also includes a few labels, which are computed on coarse grid points with cheap computation cost and significantly improves the model accuracy. Intuitively, the labeled dataset works as a regularization in addition to the model constraints. The MOD-Net solves a family of PDEs rather than a specific one and is much more efficient than original neural operator because few expensive labels are required. We numerically show MOD-Net is very efficient in solving Poisson equation and one-dimensional radiative transfer equation. For nonlinear PDEs, the nonlinear MOD-Net can be similarly used as an ansatz for solving nonlinear PDEs, exemplified by solving several nonlinear PDE problems, such as the Burgers equation.

preprint2021arXiv

Network Pruning via Resource Reallocation

Channel pruning is broadly recognized as an effective approach to obtain a small compact model through eliminating unimportant channels from a large cumbersome network. Contemporary methods typically perform iterative pruning procedure from the original over-parameterized model, which is both tedious and expensive especially when the pruning is aggressive. In this paper, we propose a simple yet effective channel pruning technique, termed network Pruning via rEsource rEalLocation (PEEL), to quickly produce a desired slim model with negligible cost. Specifically, PEEL first constructs a predefined backbone and then conducts resource reallocation on it to shift parameters from less informative layers to more important layers in one round, thus amplifying the positive effect of these informative layers. To demonstrate the effectiveness of PEEL , we perform extensive experiments on ImageNet with ResNet-18, ResNet-50, MobileNetV2, MobileNetV3-small and EfficientNet-B0. Experimental results show that structures uncovered by PEEL exhibit competitive performance with state-of-the-art pruning algorithms under various pruning settings. Our code is available at https://github.com/cardwing/Codes-for-PEEL.

preprint2020arXiv

3D Face Reconstruction from A Single Image Assisted by 2D Face Images in the Wild

3D face reconstruction from a single 2D image is a challenging problem with broad applications. Recent methods typically aim to learn a CNN-based 3D face model that regresses coefficients of 3D Morphable Model (3DMM) from 2D images to render 3D face reconstruction or dense face alignment. However, the shortage of training data with 3D annotations considerably limits performance of those methods. To alleviate this issue, we propose a novel 2D-assisted self-supervised learning (2DASL) method that can effectively use "in-the-wild" 2D face images with noisy landmark information to substantially improve 3D face model learning. Specifically, taking the sparse 2D facial landmarks as additional information, 2DSAL introduces four novel self-supervision schemes that view the 2D landmark and 3D landmark prediction as a self-mapping process, including the 2D and 3D landmark self-prediction consistency, cycle-consistency over the 2D landmark prediction and self-critic over the predicted 3DMM coefficients based on landmark predictions. Using these four self-supervision schemes, the 2DASL method significantly relieves demands on the the conventional paired 2D-to-3D annotations and gives much higher-quality 3D face models without requiring any additional 3D annotations. Experiments on multiple challenging datasets show that our method outperforms state-of-the-arts for both 3D face reconstruction and dense face alignment by a large margin.

preprint2020arXiv

A Dual Symmetric Gauss-Seidel Alternating Direction Method of Multipliers for Hyperspectral Sparse Unmixing

Since sparse unmixing has emerged as a promising approach to hyperspectral unmixing, some spatial-contextual information in the hyperspectral images has been exploited to improve the performance of the unmixing recently. The total variation (TV) has been widely used to promote the spatial homogeneity as well as the smoothness between adjacent pixels. However, the computation task for hyperspectral sparse unmixing with a TV regularization term is heavy. Besides, the convergence of the primal alternating direction method of multipliers (ADMM) for the hyperspectral sparse unmixing with a TV regularization term has not been explained in details. In this paper, we design an efficient and convergent dual symmetric Gauss-Seidel ADMM (sGS-ADMM) for hyperspectral sparse unmixing with a TV regularization term. We also present the global convergence and local linear convergence rate analysis for this algorithm. As demonstrated in numerical experiments, our algorithm can obviously improve the efficiency of the unmixing compared with the state-of-the-art algorithm. More importantly, we can obtain images with higher quality.

preprint2020arXiv

A type of generalization error induced by initialization in deep neural networks

How initialization and loss function affect the learning of a deep neural network (DNN), specifically its generalization error, is an important problem in practice. In this work, by exploiting the linearity of DNN training dynamics in the NTK regime \citep{jacot2018neural,lee2019wide}, we provide an explicit and quantitative answer to this problem. Focusing on regression problem, we prove that, in the NTK regime, for any loss in a general class of functions, the DNN finds the same \emph{global} minima---the one that is nearest to the initial value in the parameter space, or equivalently, the one that is closest to the initial DNN output in the corresponding reproducing kernel Hilbert space. Using these optimization problems, we quantify the impact of initial output and prove that a random non-zero one increases the generalization error. We further propose an antisymmetrical initialization (ASI) trick that eliminates this type of error and accelerates the training. To understand whether the above results hold in general, we also perform experiments for DNNs in the non-NTK regime, which demonstrate the effectiveness of our theoretical results and the ASI trick in a qualitative sense. Overall, our work serves as a baseline for the further investigation of the impact of initialization and loss function on the generalization of DNNs, which can potentially guide and improve the training of DNNs in practice.

preprint2020arXiv

Fully Decentralized Federated Learning Based Beamforming Design for UAV Communications

To handle the data explosion in the era of internet of things (IoT), it is of interest to investigate the decentralized network, with the aim at relaxing the burden to central server along with keeping data privacy. In this work, we develop a fully decentralized federated learning (FL) framework with an inexact stochastic parallel random walk alternating direction method of multipliers (ISPW-ADMM). Performing more communication efficient and enhanced privacy preservation compared with the current state-of-the-art, the proposed ISPW-ADMM can be partially immune to the impacts from time-varying dynamic network and stochastic data collection, while still in fast convergence. Benefits from the stochastic gradients and biased first-order moment estimation, the proposed framework can be applied to any decentralized FL tasks over time-varying graphs. Thus to further demonstrate the practicability of such framework in providing fast convergence, high communication efficiency, and system robustness, we study the extreme learning machine(ELM)-based FL model for robust beamforming (BF) design in UAV communications, as verified by the numerical simulations.

preprint2020arXiv

Haar Graph Pooling

Deep Graph Neural Networks (GNNs) are useful models for graph classification and graph-based regression tasks. In these tasks, graph pooling is a critical ingredient by which GNNs adapt to input graphs of varying size and structure. We propose a new graph pooling operation based on compressive Haar transforms -- HaarPooling. HaarPooling implements a cascade of pooling operations; it is computed by following a sequence of clusterings of the input graph. A HaarPooling layer transforms a given input graph to an output graph with a smaller node number and the same feature dimension; the compressive Haar transform filters out fine detail information in the Haar wavelet domain. In this way, all the HaarPooling layers together synthesize the features of any given input graph into a feature vector of uniform size. Such transforms provide a sparse characterization of the data and preserve the structure information of the input graph. GNNs implemented with standard graph convolution layers and HaarPooling layers achieve state of the art performance on diverse graph classification and regression problems.

preprint2020arXiv

Inter-Region Affinity Distillation for Road Marking Segmentation

We study the problem of distilling knowledge from a large deep teacher network to a much smaller student network for the task of road marking segmentation. In this work, we explore a novel knowledge distillation (KD) approach that can transfer 'knowledge' on scene structure more effectively from a teacher to a student model. Our method is known as Inter-Region Affinity KD (IntRA-KD). It decomposes a given road scene image into different regions and represents each region as a node in a graph. An inter-region affinity graph is then formed by establishing pairwise relationships between nodes based on their similarity in feature distribution. To learn structural knowledge from the teacher network, the student is required to match the graph generated by the teacher. The proposed method shows promising results on three large-scale road marking segmentation benchmarks, i.e., ApolloScape, CULane and LLAMAS, by taking various lightweight models as students and ResNet-101 as the teacher. IntRA-KD consistently brings higher performance gains on all lightweight models, compared to previous distillation methods. Our code is available at https://github.com/cardwing/Codes-for-IntRA-KD.

preprint2020arXiv

Optimized Large Hyperuniform Binary Colloidal Suspensions in Two Dimensions

The creation of disordered hyperuniform materials with potentially extraordinary optical properties requires a capacity to synthesize large samples that are effectively hyperuniform down to the nanoscale. Motivated by this challenge, we propose a fabrication protocol using binary superparamagnetic colloidal particles confined in a 2D plane. The strong and long-ranged dipolar interaction induced by a tunable magnetic field is free from screening effects that attenuates long-ranged electrostatic interactions in charged colloidal systems. Specifically, we find a family of optimal size ratios that makes the two-phase system effectively hyperuniform. We show that hyperuniformity is a general consequence of low isothermal compressibilities, which makes our protocol suitable to systems with other long-ranged soft interactions, dimensionalities and/or polydispersity. Our methodology paves the way to synthesize large photonic hyperuniform materials that function in the visible to infrared range and hence may accelerate the discovery of novel photonic materials.

preprint2020arXiv

Path Integral Based Convolution and Pooling for Graph Neural Networks

Graph neural networks (GNNs) extends the functionality of traditional neural networks to graph-structured data. Similar to CNNs, an optimized design of graph convolution and pooling is key to success. Borrowing ideas from physics, we propose a path integral based graph neural networks (PAN) for classification and regression tasks on graphs. Specifically, we consider a convolution operation that involves every path linking the message sender and receiver with learnable weights depending on the path length, which corresponds to the maximal entropy random walk. It generalizes the graph Laplacian to a new transition matrix we call maximal entropy transition (MET) matrix derived from a path integral formalism. Importantly, the diagonal entries of the MET matrix are directly related to the subgraph centrality, thus providing a natural and adaptive pooling mechanism. PAN provides a versatile framework that can be tailored for different graph data with varying sizes and structures. We can view most existing GNN architectures as special cases of PAN. Experimental results show that PAN achieves state-of-the-art performance on various graph classification/regression tasks, including a new benchmark dataset from statistical mechanics we propose to boost applications of GNN in physical sciences.

preprint2020arXiv

Predicting permeability via statistical learning on higher-order microstructural information

Quantitative structure-property relationships are crucial for the understanding and prediction of the physical properties of complex materials. For fluid flow in porous materials, characterizing the geometry of the pore microstructure facilitates prediction of permeability, a key property that has been extensively studied in material science, geophysics and chemical engineering. In this work, we study the predictability of different structural descriptors via both linear regressions and neural networks. A large data set of 30,000 virtual, porous microstructures of different types is created for this end. We compute permeabilities of these structures using the lattice Boltzmann method, and characterize the pore space geometry using one-point correlation functions (porosity, specific surface), two-point surface-surface, surface-void, and void-void correlation functions, as well as the geodesic tortuosity as an implicit descriptor. Then, we study the prediction of the permeability using different combinations of these descriptors. We obtain significant improvements of performance when compared to a Kozeny-Carman regression with only lowest-order descriptors (porosity and specific surface). We find that combining all three two-point correlation functions and tortuosity provides the best prediction of permeability, with the void-void correlation function being the most informative individual descriptor. Moreover, the combination of porosity, specific surface, and geodesic tortuosity provides very good predictive performance. This shows that higher-order correlation functions are extremely useful for forming a general model for predicting physical properties of complex materials. Additionally, our results suggest that neural networks are superior to the more conventional regression methods for establishing quantitative structure-property relationships.

preprint2020arXiv

Recent development in kinetic theory of granular materials: analysis and numerical methods

Over the past decades, kinetic description of granular materials has received a lot of attention in mathematical community and applied fields such as physics and engineering. This article aims to review recent mathematical results in kinetic granular materials, especially for those which arose since the last review by Villani on the same subject. We will discuss both theoretical and numerical developments. We will finally showcase some important open problems and conjectures by means of numerical experiments based on spectral methods.

preprint2020arXiv

What's the relationship between CNNs and communication systems?

The interpretability of Convolutional Neural Networks (CNNs) is an important topic in the field of computer vision. In recent years, works in this field generally adopt a mature model to reveal the internal mechanism of CNNs, helping to understand CNNs thoroughly. In this paper, we argue the working mechanism of CNNs can be revealed through a totally different interpretation, by comparing the communication systems and CNNs. This paper successfully obtained the corresponding relationship between the modules of the two, and verified the rationality of the corresponding relationship with experiments. Finally, through the analysis of some cutting-edge research on neural networks, we find the inherent relation between these two tasks can be of help in explaining these researches reasonably, as well as helping us discover the correct research direction of neural networks.

preprint2019arXiv

Defending from adversarial examples with a two-stream architecture

In recent years, deep learning has shown impressive performance on many tasks. However, recent researches showed that deep learning systems are vulnerable to small, specially crafted perturbations that are imperceptible to humans. Images with such perturbations are the so called adversarial examples, which have proven to be an indisputable threat to the DNN based applications. The lack of better understanding of the DNNs has prevented the development of efficient defenses against adversarial examples. In this paper, we propose a two-stream architecture to protect CNN from attacking by adversarial examples. Our model draws on the idea of "two-stream" which commonly used in the security field, and successfully defends different kinds of attack methods by the differences of "high-resolution" and "low-resolution" networks in feature extraction. We provide a reasonable interpretation on why our two-stream architecture is difficult to defeat, and show experimentally that our method is hard to defeat with state-of-the-art attacks. We demonstrate that our two-stream architecture is robust to adversarial examples built by currently known attacking algorithms.

preprint2019arXiv

Focus-Enhanced Scene Text Recognition with Deformable Convolutions

Recently, scene text recognition methods based on deep learning have sprung up in computer vision area. The existing methods achieved great performances, but the recognition of irregular text is still challenging due to the various shapes and distorted patterns. Consider that at the time of reading words in the real world, normally we will not rectify it in our mind but adjust our focus and visual fields. Similarly, through utilizing deformable convolutional layers whose geometric structures are adjustable, we present an enhanced recognition network without the steps of rectification to deal with irregular text in this work. A number of experiments have been applied, where the results on public benchmarks demonstrate the effectiveness of our proposed components and shows that our method has reached satisfactory performances. The code will be publicly available at https://github.com/Alpaca07/dtr soon.

preprint2019arXiv

Generating Text Sequence Images for Recognition

Recently, methods based on deep learning have dominated the field of text recognition. With a large number of training data, most of them can achieve the state-of-the-art performances. However, it is hard to harvest and label sufficient text sequence images from the real scenes. To mitigate this issue, several methods to synthesize text sequence images were proposed, yet they usually need complicated preceding or follow-up steps. In this work, we present a method which is able to generate infinite training data without any auxiliary pre/post-process. We tackle the generation task as an image-to-image translation one and utilize conditional adversarial networks to produce realistic text sequence images in the light of the semantic ones. Some evaluation metrics are involved to assess our method and the results demonstrate that the caliber of the data is satisfactory. The code and dataset will be publicly available soon.