Source author record

Jie Ding

Jie Ding appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

33 works

27 topics

4 close collaborators

preprint, 2022, arXiv

A Framework for Understanding Model Extraction Attack and Defense

The privacy of machine learning models has become a significant concern in many emerging Machine-Learning-as-a-Service applications, where prediction services based on well-trained models are offered to users via pay-per-query. The lack of a defense mechanism can impose a high risk on the privacy of the server's model since an adversary could efficiently steal the model by querying only a few 'good' data points. The interplay between a server's defense and an adversary's attack inevitably leads to an arms race dilemma, as commonly seen in Adversarial Machine Learning. To study the fundamental tradeoffs between model utility from a benign user's view and privacy from an adversary's view, we develop new metrics to quantify such tradeoffs, analyze their theoretical properties, and develop an optimization problem to understand the optimal adversarial attack and defense strategies. The developed concepts and theory match the empirical findings on the 'equilibrium' between privacy and utility. In terms of optimization, the key ingredient that enables our results is a unified representation of the attack-defense problem as a min-max bi-level problem. The developed results will be demonstrated by examples and experiments.

preprint, 2022, arXiv

Asymptotic Critical Radii in Random Geometric Graphs over 3-Dimensional Convex regions

This article presents the precise asymptotic distribution of two types of critical transmission radii, defined in terms of k-connectivity and the minimum vertex degree, for a random geometric graph distributed over a 3-dimensional convex region.

preprint, 2022, arXiv

Asymptotic Critical Transmission Radii in Wireless Networks over a Convex Region

Critical transmission ranges (or radii) in wireless ad-hoc and sensor networks have been extensively investigated for various performance metrics such as connectivity, coverage, power assignment and energy consumption. However, the regions on which the networks are distributed are typically either squares or disks in existing works, which seriously limits their use in real-life applications. In this article, we consider a convex region (i.e., a generalisation of squares and disks) on which wireless nodes are uniformly distributed. We investigate two types of critical transmission radii, defined in terms of k-connectivity and the minimum vertex degree, respectively, and establish their precise asymptotic distributions. The previous results obtained for squares or disks thus become special cases of this work. More importantly, our results reveal how the region shape affects the critical transmission ranges: it is the length of the boundary of the (fixed-area) region that completely determines the transmission ranges. Furthermore, by the isodiametric inequality, the smallest critical transmission ranges are achieved only when the regions are disks.

preprint, 2022, arXiv

Convergence Analysis of Structure-Preserving Numerical Methods Based on Slotboom Transformation for the Poisson--Nernst--Planck Equations

The analysis of structure-preserving numerical methods for the Poisson--Nernst--Planck (PNP) system has attracted growing interest in recent years. In this work, we provide an optimal rate convergence analysis and error estimate for finite difference schemes based on the Slotboom reformulation. Different options of mobility average at the staggered mesh points are considered in the finite-difference spatial discretization, such as the harmonic mean, geometric mean, arithmetic mean, and entropic mean. A semi-implicit temporal discretization is applied, which in turn results in a non-constant coefficient, positive-definite linear system at each time step. A higher order asymptotic expansion is applied in the consistency analysis, and such a higher order consistency estimate is necessary to control the discrete maximum norm of the concentration variables. In the convergence estimate, the harmonic mean for the mobility average, which turns out to simplify the theoretical analysis considerably, is taken for simplicity, while other options of mobility average would also lead to the desired error estimate, with more technical details involved. As a result, an optimal rate convergence analysis on concentrations, electric potential, and ionic fluxes is derived, which is the first such result for the structure-preserving numerical schemes based on the Slotboom reformulation. It is remarked that the convergence analysis leads to a theoretical justification of the conditional energy dissipation analysis, which relies on the maximum norm bounds of the concentration and the gradient of the electric potential. Some numerical results are also presented to demonstrate the accuracy and structure-preserving performance of the associated schemes.

preprint, 2022, arXiv

Federated Learning Challenges and Opportunities: An Outlook

Federated learning (FL) has been developed as a promising framework to leverage the resources of edge devices, enhance customers' privacy, comply with regulations, and reduce development costs. Although many methods and applications have been developed for FL, several critical challenges for practical FL systems remain unaddressed. This paper provides an outlook on FL development, categorized into five emerging directions of FL, namely algorithm foundation, personalization, hardware and security constraints, lifelong learning, and nonstandard data. Our unique perspectives are backed by practical observations from large-scale federated systems for edge devices.

preprint, 2022, arXiv

Interval Privacy: A Framework for Privacy-Preserving Data Collection

The emerging public awareness and government regulations of data privacy motivate new paradigms of collecting and analyzing data that are transparent and acceptable to data owners. We present a new concept of privacy and corresponding data formats, mechanisms, and theories for privatizing data during data collection. The privacy, named Interval Privacy, enforces the raw data conditional distribution on the privatized data to be the same as its unconditional distribution over a nontrivial support set. Correspondingly, the proposed privacy mechanism will record each data value as a random interval (or, more generally, a range) containing it. The proposed interval privacy mechanisms can be easily deployed through survey-based data collection interfaces, e.g., by asking a respondent whether its data value is within a randomly generated range. Another unique feature of interval mechanisms is that they obfuscate the truth but do not perturb it. Using narrowed range to convey information is complementary to the popular paradigm of perturbing data. Also, the interval mechanisms can generate progressively refined information at the discretion of individuals, naturally leading to privacy-adaptive data collection. We develop different aspects of theory such as composition, robustness, distribution estimation, and regression learning from interval-valued data. Interval privacy provides a new perspective of human-centric data privacy where individuals have a perceptible, transparent, and simple way of sharing sensitive data.
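The survey-style mechanism described above can be illustrated with a minimal sketch. This is not the paper's mechanism: it assumes a single random threshold drawn uniformly on a known range, and the names `privatize`, `lo`, and `hi` are hypothetical. The point it shows is that the recorded interval always contains the true value, so the truth is obfuscated but never perturbed.

```python
import random

def privatize(value, lo, hi, rng):
    """Record only which side of a random threshold `value` falls on.

    Assumption for illustration: the threshold is uniform on [lo, hi]
    and is drawn independently of `value`.
    """
    threshold = rng.uniform(lo, hi)
    if value <= threshold:
        return (lo, threshold)   # respondent answers "yes, within [lo, threshold]"
    return (threshold, hi)       # respondent answers "no"

rng = random.Random(0)
interval = privatize(37.0, 0.0, 100.0, rng)
print(interval)  # an interval that contains 37.0
```

Asking further threshold questions inside the returned interval would narrow it step by step, which is one way to read the progressive refinement mentioned in the abstract.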

preprint, 2022, arXiv

Is a Classification Procedure Good Enough? A Goodness-of-Fit Assessment Tool for Classification Learning

In recent years, many non-traditional classification methods, such as Random Forest, Boosting, and neural networks, have been widely used in applications. Their performance is typically measured in terms of classification accuracy. While the classification error rate and the like are important, they do not address a fundamental question: Is the classification method underfitted? To the best of our knowledge, there is no existing method that can assess the goodness-of-fit of a general classification procedure. Indeed, the lack of a parametric assumption makes it challenging to construct proper tests. To overcome this difficulty, we propose a methodology called BAGofT that splits the data into a training set and a validation set. First, the classification procedure to assess is applied to the training set, which is also used to adaptively find a data grouping that reveals the most severe regions of underfitting. Then, based on this grouping, we calculate a test statistic by comparing the estimated success probabilities and the actual observed responses from the validation set. The data splitting guarantees that the size of the test is controlled under the null hypothesis, and the power of the test goes to one as the sample size increases under the alternative hypothesis. For testing parametric classification models, the BAGofT has a broader scope than the existing methods since it is not restricted to specific parametric models (e.g., logistic regression). Extensive simulation studies show the utility of the BAGofT when assessing general classification procedures and its strengths over some existing methods when testing parametric classification models.

preprint, 2022, arXiv

On The Energy Statistics of Feature Maps in Pruning of Neural Networks with Skip-Connections

We propose a new structured pruning framework for compressing Deep Neural Networks (DNNs) with skip connections, based on measuring the statistical dependency of hidden layers and predicted outputs. The dependence measure defined by the energy statistics of hidden layers serves as a model-free measure of information between the feature maps and the output of the network. The estimated dependence measure is subsequently used to prune a collection of redundant and uninformative layers. Model-freeness of our measure guarantees that no parametric assumptions on the feature map distribution are required, making it computationally appealing for very high dimensional feature space in DNNs. Extensive numerical experiments on various architectures show the efficacy of the proposed pruning approach with competitive performance to state-of-the-art methods.

preprint, 2022, arXiv

Self-Aware Personalized Federated Learning

In the context of personalized federated learning (FL), the critical challenge is to balance local model improvement and global model tuning when the personal and global objectives may not be exactly aligned. Inspired by Bayesian hierarchical models, we develop a self-aware personalized FL method where each client can automatically balance the training of its local personal model and the global model that implicitly contributes to other clients' training. Such a balance is derived from the inter-client and intra-client uncertainty quantification. A larger inter-client variation implies more personalization is needed. Correspondingly, our method uses uncertainty-driven local training steps and aggregation rule instead of conventional local fine-tuning and sample size-based aggregation. With experimental studies on synthetic data, Amazon Alexa audio data, and public datasets such as MNIST, FEMNIST, CIFAR10, and Sent140, we show that our proposed method can achieve significantly improved personalization performance compared with the existing counterparts.

preprint, 2022, arXiv

Targeted Cross-Validation

In many applications, we have access to the complete dataset but are only interested in the prediction of a particular region of predictor variables. A standard approach is to find the globally best modeling method from a set of candidate methods. However, it is perhaps rare in reality that one candidate method is uniformly better than the others. A natural approach for this scenario is to apply a weighted $L_2$ loss in performance assessment to reflect the region-specific interest. We propose a targeted cross-validation (TCV) to select models or procedures based on a general weighted $L_2$ loss. We show that the TCV is consistent in selecting the best performing candidate under the weighted $L_2$ loss. Experimental studies are used to demonstrate the use of TCV and its potential advantage over the global CV or the approach of using only local data for modeling a local region. Previous investigations on CV have relied on the condition that when the sample size is large enough, the ranking of two candidates stays the same. However, in many applications with the setup of changing data-generating processes or highly adaptive modeling methods, the relative performance of the methods is not static as the sample size varies. Even with a fixed data-generating process, it is possible that the ranking of two methods switches infinitely many times. In this work, we broaden the concept of the selection consistency by allowing the best candidate to switch as the sample size varies, and then establish the consistency of the TCV. This flexible framework can be applied to high-dimensional and complex machine learning scenarios where the relative performances of modeling procedures are dynamic.
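The idea of scoring candidates with a weighted $L_2$ loss can be sketched as follows. This is a simplified single-split illustration under stated assumptions, not the TCV procedure itself: the helper names (`weighted_l2`, `select_by_weighted_split`, `fit_mean`, `fit_line`) and the indicator weight function are hypothetical.

```python
def weighted_l2(y_true, y_pred, weights):
    """Weighted mean squared error; the weights encode the region of interest."""
    total = sum(weights)
    return sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred)) / total

def fit_mean(x, y):
    """Candidate 1: predict the training mean everywhere."""
    m = sum(y) / len(y)
    return lambda v: m

def fit_line(x, y):
    """Candidate 2: simple least-squares line."""
    xb, yb = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / sum((a - xb) ** 2 for a in x)
    return lambda v: yb + slope * (v - xb)

def select_by_weighted_split(x, y, candidates, weight_fn, n_train):
    """Fit each candidate on the first n_train points, score it on the rest."""
    x_va, y_va = x[n_train:], y[n_train:]
    w = [weight_fn(v) for v in x_va]
    scores = {}
    for name, fit in candidates.items():
        predict = fit(x[:n_train], y[:n_train])
        scores[name] = weighted_l2(y_va, [predict(v) for v in x_va], w)
    return min(scores, key=scores.get), scores

x = [i / 10 for i in range(20)]
y = [2 * v for v in x]
best, scores = select_by_weighted_split(
    x, y, {"mean": fit_mean, "line": fit_line},
    weight_fn=lambda v: 1.0 if v >= 1.5 else 0.0,  # only the region x >= 1.5 matters
    n_train=10,
)
print(best, scores)
```

Changing `weight_fn` changes which candidate wins when no candidate is uniformly best, which is the situation the abstract targets.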

preprint, 2022, arXiv

The Rate of Convergence of Variation-Constrained Deep Neural Networks

Multi-layer feedforward networks have been used to approximate a wide range of nonlinear functions. An important and fundamental problem is to understand the learnability of a network model through its statistical risk, or the expected prediction error on future data. To the best of our knowledge, the rate of convergence of neural networks shown by existing works is bounded by at most the order of $n^{-1/4}$ for a sample size of $n$. In this paper, we show that a class of variation-constrained neural networks, with arbitrary width, can achieve near-parametric rate $n^{-1/2+δ}$ for an arbitrarily small positive constant $δ$. It is equivalent to $n^{-1 +2δ}$ under the mean squared error. This rate is also observed by numerical experiments. The result indicates that the neural function space needed for approximating smooth functions may not be as large as what is often perceived. Our result also provides insight into the phenomenon that deep neural networks do not easily suffer from overfitting when the number of neurons and learning parameters rapidly grow with $n$ or even surpass $n$. We also discuss the rate of convergence regarding other network parameters, including the input dimension, network layer, and coefficient norm.
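The stated equivalence between the two rates is a one-line calculation, assuming the first rate is expressed on the scale of the root of the mean squared error (an assumption about the scale, since the abstract does not name it):

```latex
\[
\bigl(n^{-1/2+\delta}\bigr)^{2} \;=\; n^{2(-1/2+\delta)} \;=\; n^{-1+2\delta}.
\]
```

Squaring the earlier bound in the same way, $(n^{-1/4})^{2} = n^{-1/2}$, gives the corresponding mean-squared-error scale for comparison.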

preprint, 2020, arXiv

Forecasting with Multiple Seasonality

A growing number of modern applications involve forecasting time series data that exhibit both short-time dynamics and long-time seasonality. Specifically, forecasting time series with multiple seasonality is a difficult task that has received comparatively little discussion. In this paper, we propose a two-stage method for time series with multiple seasonality, which does not require pre-determined seasonality periods. In the first stage, we generalize the classical seasonal autoregressive moving average (ARMA) model to the multiple-seasonality regime. In the second stage, we utilize an appropriate criterion for lag order selection. Simulation and empirical studies show the excellent predictive performance of our method, especially compared to the recently popular 'Facebook Prophet' model for time series.

preprint, 2020, arXiv

Imitation Privacy

In recent years, there have been many cloud-based machine learning services, where well-trained models are provided to users on a pay-per-query scheme through a prediction API. The emergence of these services motivates this work, where we will develop a general notion of model privacy named imitation privacy. We show the broad applicability of imitation privacy in classical query-response MLaaS scenarios and new multi-organizational learning scenarios. We also exemplify the fundamental difference between imitation privacy and the usual data-level privacy.

preprint, 2020, arXiv

Information Laundering for Model Privacy

In this work, we propose information laundering, a novel framework for enhancing model privacy. Unlike data privacy that concerns the protection of raw data information, model privacy aims to protect an already-learned model that is to be deployed for public use. The private model can be obtained from general learning methods, and its deployment means that it will return a deterministic or random response for a given input query. An information-laundered model consists of probabilistic components that deliberately maneuver the intended input and output for queries to the model, so the model's adversarial acquisition is less likely. Under the proposed framework, we develop an information-theoretic principle to quantify the fundamental tradeoffs between model utility and privacy leakage and derive the optimal design.

preprint, 2020, arXiv

IoT Connectivity Technologies and Applications: A Survey

The Internet of Things (IoT) is rapidly becoming an integral part of our lives and of multiple industries. We expect the number of IoT-connected devices to grow explosively and reach hundreds of billions during the next few years. To support such massive connectivity, various wireless technologies are being investigated. In this survey, we provide a broad view of the existing wireless IoT connectivity technologies and discuss several new emerging technologies and solutions that can be effectively used to enable massive connectivity for IoT. In particular, we categorize the existing wireless IoT connectivity technologies based on coverage range and review diverse types of connectivity technologies with different specifications. We also point out key technical challenges of the existing connectivity technologies for enabling massive IoT connectivity. To address the challenges, we further review and discuss some examples of promising technologies such as compressive sensing (CS) random access, non-orthogonal multiple access (NOMA), and massive multiple input multiple output (mMIMO) based random access that could be employed in future standards for supporting IoT connectivity. Finally, a classification of IoT applications is considered in terms of various service requirements. For each group of classified applications, we outline its suitable IoT connectivity options.

preprint, 2020, arXiv

Non-escaping points of Zorich maps

We extend results about the dimension of the radial Julia set of certain exponential functions to quasiregular Zorich maps in higher dimensions. Our results improve on previous estimates of the dimension also in the special case of exponential functions.

preprint, 2020, arXiv

Speech Emotion Recognition with Dual-Sequence LSTM Architecture

Speech Emotion Recognition (SER) has emerged as a critical component of the next generation human-machine interfacing technologies. In this work, we propose a new dual-level model that predicts emotions based on both MFCC features and mel-spectrograms produced from raw audio signals. Each utterance is preprocessed into MFCC features and two mel-spectrograms at different time-frequency resolutions. A standard LSTM processes the MFCC features, while a novel LSTM architecture, denoted as Dual-Sequence LSTM (DS-LSTM), processes the two mel-spectrograms simultaneously. The outputs are later averaged to produce a final classification of the utterance. Our proposed model achieves, on average, a weighted accuracy of 72.7% and an unweighted accuracy of 73.3% (a 6% improvement over current state-of-the-art unimodal models) and is comparable with multimodal models that leverage textual information as well as audio signals.

preprint, 2020, arXiv

Structure-Preserving and Efficient Numerical Methods for Ion Transport

Ion transport, often described by the Poisson--Nernst--Planck (PNP) equations, is ubiquitous in electrochemical devices and many biological processes of significance. In this work, we develop conservative, positivity-preserving, energy dissipating, and implicit finite difference schemes for solving the multi-dimensional PNP equations with multiple ionic species. A central-differencing discretization based on harmonic-mean approximations is employed for the Nernst--Planck (NP) equations. The backward Euler discretization in time is employed to derive a fully implicit nonlinear system, which is efficiently solved by a newly proposed Newton's method. The improved computational efficiency of the Newton's method originates from the usage of the electrostatic potential as the iteration variable, rather than the unknowns of the nonlinear system that involves both the potential and concentration of multiple ionic species. Numerical analysis proves that the numerical schemes respect three desired analytical properties (conservation, positivity preserving, and energy dissipation) fully discretely. Based on the advantages brought by the harmonic-mean approximations, we are able to establish an estimate of the upper bound of the condition numbers of the coefficient matrices in linear systems that are solved iteratively. The solvability and stability of the linearized problem in the Newton's method are rigorously established as well. Numerical tests are performed to confirm the anticipated numerical accuracy, computational efficiency, and structure-preserving properties of the developed schemes. Adaptive time stepping is implemented for further efficiency improvement. Finally, the proposed numerical approaches are applied to characterize ion transport subject to a sinusoidal applied potential.

preprint, 2020, arXiv

Towards Enabling Critical mMTC: A Review of URLLC within mMTC

Massive machine-type communication (mMTC) and ultra-reliable and low-latency communication (URLLC) are two key service types in the fifth-generation (5G) communication systems, pursuing scalability and reliability with low-latency, respectively. These two extreme services are envisaged to agglomerate together into critical mMTC shortly with emerging use cases (e.g., wide-area disaster monitoring, wireless factory automation), creating new challenges to designing wireless systems beyond 5G. While conventional network slicing is effective in supporting a simple mixture of mMTC and URLLC, it is difficult to simultaneously guarantee the reliability, latency, and scalability requirements of critical mMTC (e.g., < 4 ms latency, $10^6$ devices/km$^2$ for factory automation) with limited radio resources. Furthermore, recently proposed solutions to scalable URLLC (e.g., machine learning aided URLLC for driverless vehicles) are ill-suited to critical mMTC whose machine type users have minimal energy budget and computing capability that should be (tightly) optimized for given tasks. To this end, our paper aims to characterize promising use cases of critical mMTC and search for their possible solutions. We first review the state-of-the-art (SOTA) technologies for separate mMTC and URLLC services and then identify key challenges from conflicting SOTA requirements, followed by potential approaches to prospective critical mMTC solutions at different layers.

preprint, 2019, arXiv

Deep Clustering of Compressed Variational Embeddings

Motivated by the ever-increasing demands for limited communication bandwidth and low-power consumption, we propose a new methodology, named joint Variational Autoencoders with Bernoulli mixture models (VAB), for performing clustering in the compressed data domain. The idea is to reduce the data dimension by Variational Autoencoders (VAEs) and group data representations by Bernoulli mixture models (BMMs). Once jointly trained for compression and clustering, the model can be decomposed into two parts: a data vendor that encodes the raw data into compressed data, and a data consumer that classifies the received (compressed) data. In this way, the data vendor benefits from data security and communication bandwidth, while the data consumer benefits from low computational complexity. To enable training using the gradient descent algorithm, we propose to use the Gumbel-Softmax distribution to resolve the infeasibility of the back-propagation algorithm when assessing categorical samples.
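The Gumbel-Softmax relaxation mentioned above can be sketched in a few lines. This is a generic standard-library illustration of the sampling step only, not the VAB training code; the function name and the temperature parameter `tau` are chosen here for illustration.

```python
import math
import random

def gumbel_softmax_sample(logits, tau, rng):
    """Draw one relaxed (continuous) sample over the categories."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)        # guard against log(0)
        gumbel = -math.log(-math.log(u))    # standard Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    m = max(noisy)                          # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
sample = gumbel_softmax_sample([1.0, 2.0, 3.0], tau=0.5, rng=rng)
print(sample)  # non-negative weights that sum to 1
```

As `tau` shrinks, the sample concentrates on a single category while remaining a smooth function of the logits, which is what lets gradients pass through the sampling step.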

preprint, 2019, arXiv

DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression

We propose a new architecture for distributed image compression from a group of distributed data sources. The work is motivated by practical needs of data-driven codec design, low power consumption, robustness, and data privacy. The proposed architecture, which we refer to as Distributed Recurrent Autoencoder for Scalable Image Compression (DRASIC), is able to train distributed encoders and one joint decoder on correlated data sources. Its compression capability is much better than the method of training codecs separately. Meanwhile, the performance of our distributed system with 10 distributed sources is only within 2 dB peak signal-to-noise ratio (PSNR) of the performance of a single codec trained with all data sources. We experiment with distributed sources with different correlations and show how our data-driven methodology well matches the Slepian-Wolf Theorem in Distributed Source Coding (DSC). To the best of our knowledge, this is the first data-driven DSC framework for general distributed code design with deep learning.

preprint, 2019, arXiv

Restricted Recurrent Neural Networks

Recurrent Neural Networks (RNNs) and their variations, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), have become standard building blocks for learning online data of sequential nature in many research areas, including natural language processing and speech data analysis. In this paper, we present a new methodology to significantly reduce the number of parameters in RNNs while maintaining performance that is comparable to or even better than that of classical RNNs. The new proposal, referred to as Restricted Recurrent Neural Network (RRNN), restricts the weight matrices corresponding to the input data and hidden states at each time step to share a large proportion of parameters. The new architecture can be regarded as a compression of its classical counterpart, but it does not require pre-training or sophisticated parameter fine-tuning, both of which are major issues in most existing compression techniques. Experiments on natural language modeling show that compared with its classical counterpart, the restricted recurrent architecture generally produces comparable results at about 50% compression rate. In particular, the Restricted LSTM can outperform the classical RNN with even fewer parameters.

preprint, 2019, arXiv

Supervised Encoding for Discrete Representation Learning

Classical supervised classification tasks search for a nonlinear mapping that maps each encoded feature directly to a probability mass over the labels. Such a learning framework typically lacks the intuition that encoded features from the same class tend to be similar and thus has little interpretability for the learned features. In this paper, we propose a novel supervised learning model named Supervised-Encoding Quantizer (SEQ). The SEQ applies a quantizer to cluster and classify the encoded features. We found that the quantizer provides an interpretable graph where each cluster in the graph represents a class of data samples that have a particular style. We also trained a decoder that can decode convex combinations of the encoded features from similar and different clusters and provide guidance on style transfer between sub-classes.

preprint, 2016, arXiv

Bridging AIC and BIC: a new criterion for autoregression

We introduce a new criterion to determine the order of an autoregressive model fitted to time series data. It has the benefits of the two well-known model selection techniques, the Akaike information criterion and the Bayesian information criterion. When the data is generated from a finite order autoregression, the Bayesian information criterion is known to be consistent, and so is the new criterion. When the true order is infinity or suitably high with respect to the sample size, the Akaike information criterion is known to be efficient in the sense that its prediction performance is asymptotically equivalent to the best offered by the candidate models; in this case, the new criterion behaves in a similar manner. Different from the two classical criteria, the proposed criterion adaptively achieves either consistency or efficiency depending on the underlying true model. In practice where the observed time series is given without any prior information about the model specification, the proposed order selection criterion is more flexible and robust compared with classical approaches. Numerical results are presented demonstrating the adaptivity of the proposed technique when applied to various datasets.
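For reference, the two classical criteria being bridged can be compared with a short sketch. It uses the common forms $n \log \hat\sigma_k^2 + 2k$ (AIC) and $n \log \hat\sigma_k^2 + k \log n$ (BIC) for an order-$k$ autoregression with residual variance $\hat\sigma_k^2$; the new criterion itself is not reproduced here, and the residual variances below are made-up illustrative numbers, not results from the paper.

```python
import math

def select_order(resid_var, n):
    """Return the (AIC-selected, BIC-selected) orders.

    resid_var maps a candidate order k to the fitted residual variance
    (hypothetical inputs); n is the sample size.
    """
    aic = {k: n * math.log(v) + 2 * k for k, v in resid_var.items()}
    bic = {k: n * math.log(v) + k * math.log(n) for k, v in resid_var.items()}
    return min(aic, key=aic.get), min(bic, key=bic.get)

# Illustrative residual variances for orders 1..3 with n = 100 observations.
orders = select_order({1: 1.0, 2: 0.97, 3: 0.969}, n=100)
print(orders)
```

With these inputs the heavier BIC penalty favors the smaller order while AIC accepts the extra lag, which is the tension between consistency and efficiency that the abstract describes.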

preprint, 2015, arXiv

Complementary Lattice Arrays for Coded Aperture Imaging

In this work, we consider complementary lattice arrays in order to enable a broader range of designs for coded aperture imaging systems. We provide a general framework and methods that generate richer and more flexible designs than existing ones. Besides this, we review and interpret the state-of-the-art uniformly redundant arrays (URA) designs, broaden the related concepts, and further propose some new design methods.

preprint, 2015, arXiv

Data-Driven Learning of the Number of States in Multi-State Autoregressive Models

In this work, we consider the class of multi-state autoregressive processes that can be used to model non-stationary time-series of interest. In order to capture different autoregressive (AR) states underlying an observed time series, it is crucial to select the appropriate number of states. We propose a new model selection technique based on the Gap statistics, which uses a null reference distribution on the stable AR filters to check whether adding a new AR state significantly improves the performance of the model. To that end, we define a new distance measure between AR filters based on mean squared prediction error (MSPE), and propose an efficient method to generate random stable filters that are uniformly distributed in the coefficient space. Numerical results are provided to evaluate the performance of the proposed approach.

preprint, 2015, arXiv

Key Pre-Distributions From Graph-Based Block Designs

With the development of wireless communication technologies, which has contributed considerably to the development of wireless sensor networks (WSNs), we have witnessed an ever-increasing number of WSN-based applications, which has induced a host of research activities in both academia and industry. Since most of the target WSN applications are very sensitive, security is one of the major challenges in the deployment of WSNs. One of the important building blocks in securing WSNs is key management. Traditional key management solutions developed for other networks are not suitable for WSNs since WSNs are resource (e.g. memory, computation, energy) limited. Key pre-distribution algorithms have recently evolved as efficient alternatives for key management in these networks. In key pre-distribution systems, secure communication is achieved between a pair of nodes either by the existence of a key allowing for direct communication or by a chain of keys forming a key-path between the pair. In this paper, we propose methods which bring prior knowledge of network characteristics and application constraints into the design of key pre-distribution schemes, in order to provide better security and connectivity while requiring fewer resources. Our methods are based on casting the prior information as a graph. Motivated by this idea, we also propose a class of quasi-symmetric designs referred to here as g-designs. These produce key pre-distribution schemes that significantly improve upon the existing constructions based on unital designs. We give some examples, and point out open problems for future research.

preprint · 2015 · arXiv

Learning the Number of Autoregressive Mixtures in Time Series Using the Gap Statistics

Using a proper model to characterize a time series is crucial in making accurate predictions. In this work, we use a time-varying autoregressive (TVAR) process to describe a non-stationary time series and model it as a mixture of multiple stable autoregressive (AR) processes. We introduce a new model selection technique based on the gap statistic to learn the appropriate number of AR filters needed to model a time series. We define a new distance measure between stable AR filters and draw a reference curve that is used to measure how much adding a new AR filter improves the performance of the model, and then choose the number of AR filters that has the maximum gap with the reference curve. To that end, we propose a new method to generate uniform random stable AR filters in the root domain. Numerical results are provided to demonstrate the performance of the proposed approach.
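
As a rough illustration of working in the root domain, the sketch below draws roots inside the unit circle and expands them into AR coefficients. It is not the paper's sampling scheme: the abstract does not give the distribution used, so drawing each conjugate pair uniformly over the disk area is an assumption, as are the names `random_stable_roots`, `ar_coefficients`, `char_poly`, and the safety margin `max_radius`. The convention assumed is x_t = a_1 x_{t-1} + ... + a_p x_{t-p} + e_t, which is stable when all roots of z^p - a_1 z^{p-1} - ... - a_p lie strictly inside the unit circle.

```python
import cmath
import math
import random


def random_stable_roots(order, rng, max_radius=0.99):
    """Draw `order` roots inside the unit circle (conjugate pairs plus one real root if odd)."""
    roots = []
    for _ in range(order // 2):
        radius = max_radius * math.sqrt(rng.random())  # uniform over the disk area
        angle = math.pi * rng.random()                 # upper half-plane; the conjugate covers the rest
        z = cmath.rect(radius, angle)
        roots += [z, z.conjugate()]
    if order % 2:
        roots.append(complex(max_radius * (2.0 * rng.random() - 1.0), 0.0))
    return roots


def ar_coefficients(roots):
    """Expand prod_i (z - r_i) and return [a_1, ..., a_p] for the assumed AR convention."""
    poly = [1 + 0j]  # coefficients in descending powers of z
    for r in roots:
        poly = [a - r * b for a, b in zip(poly + [0j], [0j] + poly)]
    # Conjugate pairs make the imaginary parts vanish up to rounding error.
    return [-c.real for c in poly[1:]]


def char_poly(coeffs, z):
    """Evaluate z^p - a_1 z^(p-1) - ... - a_p at z."""
    p = len(coeffs)
    return z ** p - sum(a * z ** (p - i) for i, a in enumerate(coeffs, start=1))
```

A filter drawn this way is stable by construction, but roots that are uniform in the disk do not in general give coefficients that are uniform over the stable region of coefficient space.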

preprint · 2014 · arXiv

A novel wireless sensor network topology with fewer links

Building on the $k$-NN graph, this paper presents the symmetric $(k,j)$-NN graph $(1 \leq j < k)$, a new topology that could be adopted by a range of network-based structures. We show that the $k$ nearest neighbors of a node contribute unequally to guaranteeing network connectivity, and that connections to the farthest $j$ of these $k$ neighbors suffice to build a connected network, in contrast to the currently popular strategy of connecting to all $k$ neighbors. In particular, for a network with up to $n = 10^3$ nodes, experiments demonstrate that connecting to the farthest three, rather than all, of the five nearest neighbor nodes, i.e. $(k,j)=(5,3)$, guarantees network connectivity with high probability. We further reveal that more than $0.75n$ links or edges in the $5$-NN graph are not necessary for connectivity. Moreover, a composite topology combining the symmetric $(k,j)$-NN graph and the random geometric graph (RGG) is constructed for constrained transmission radii in wireless sensor network (WSN) applications.
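
The construction can be sketched in a few lines. This is a minimal reading of the abstract, not the authors' code: the helper names `kj_nn_edges` and `is_connected` are hypothetical, and "symmetric" is assumed here to mean that an undirected edge exists whenever either endpoint selects the other.

```python
import math
from collections import deque


def kj_nn_edges(points, k, j):
    """Undirected edges linking each node to the farthest j of its k nearest neighbors."""
    n = len(points)
    edges = set()
    for u in range(n):
        by_distance = sorted(
            (v for v in range(n) if v != u),
            key=lambda v: math.dist(points[u], points[v]),
        )
        # Keep the k nearest, then only the last j of those (the farthest ones).
        for v in by_distance[:k][k - j:]:
            edges.add((min(u, v), max(u, v)))
    return edges


def is_connected(n, edges):
    """Breadth-first search connectivity check on nodes 0..n-1."""
    neighbors = {u: set() for u in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in neighbors[u] - seen:
            seen.add(v)
            queue.append(v)
    return len(seen) == n
```

Setting `j = k` recovers the ordinary symmetric $k$-NN graph, so the edge set for any smaller `j` is a subset of it.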

preprint · 2013 · arXiv

Perturbation Analysis of Orthogonal Matching Pursuit

Orthogonal Matching Pursuit (OMP) is a canonical greedy pursuit algorithm for sparse approximation. Previous studies of OMP have mainly considered the exact recovery of a sparse signal $\bm x$ through $\bm \Phi$ and $\bm y=\bm \Phi\bm x$, where $\bm \Phi$ is a matrix with more columns than rows. In this paper, based on the Restricted Isometry Property (RIP), the performance of OMP is analyzed under general perturbations, meaning that both $\bm y$ and $\bm \Phi$ are perturbed. Though exact recovery of an almost sparse signal $\bm x$ is no longer feasible, the main contribution reveals that the exact recovery of the locations of the $k$ largest-magnitude entries of $\bm x$ can be guaranteed under reasonable conditions. The error between $\bm x$ and the solution of OMP is also estimated. An example is constructed to demonstrate that the sufficient condition is rather tight. When $\bm x$ is strong-decaying, it is proved that the sufficient conditions can be relaxed, and the locations can even be recovered in the order of the entries' magnitude.
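
For readers unfamiliar with the greedy loop, here is a minimal pure-Python sketch of textbook OMP in the unperturbed setting. It does not implement the paper's perturbation analysis. The names `omp` and `solve_least_squares` are hypothetical, the matrix is passed as a list of nonzero columns (an assumed layout), and the least-squares step uses the normal equations for brevity rather than a numerically robust factorization.

```python
import math


def solve_least_squares(columns, y):
    """Solve (A^T A) c = A^T y by Gaussian elimination, with A given by its columns."""
    m = len(columns)
    # Augmented Gram system: one row per selected column.
    rows = [
        [sum(a * b for a, b in zip(ci, cj)) for cj in columns]
        + [sum(a * b for a, b in zip(ci, y))]
        for ci in columns
    ]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, m):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [x - factor * p for x, p in zip(rows[r], rows[col])]
    coef = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(rows[r][c] * coef[c] for c in range(r + 1, m))
        coef[r] = (rows[r][m] - tail) / rows[r][r]
    return coef


def omp(phi_columns, y, sparsity):
    """Return {column index: coefficient} after `sparsity` greedy iterations."""
    residual = list(y)
    support, coef = [], []
    for _ in range(sparsity):
        # Normalized correlation of every column with the current residual.
        scores = [
            abs(sum(a * b for a, b in zip(col, residual)))
            / math.sqrt(sum(a * a for a in col))
            for col in phi_columns
        ]
        best = max(
            (i for i in range(len(phi_columns)) if i not in support),
            key=scores.__getitem__,
        )
        support.append(best)
        chosen = [phi_columns[i] for i in support]
        # Re-fit all selected coefficients, then update the residual.
        coef = solve_least_squares(chosen, y)
        approx = [sum(c * col[t] for c, col in zip(coef, chosen)) for t in range(len(y))]
        residual = [yt - at for yt, at in zip(y, approx)]
    return dict(zip(support, coef))
```

The keys of the returned dictionary are the estimated support, which is the quantity whose recovery the abstract's guarantees concern.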

preprint · 2011 · arXiv

Performance of Orthogonal Matching Pursuit for Multiple Measurement Vectors

In this paper, we consider the orthogonal matching pursuit (OMP) algorithm for the multiple measurement vectors (MMV) problem. The robustness of OMPMMV is studied under general perturbations, in which both the measurement vectors and the sensing matrix are corrupted by additive noise. The main result shows that although exact recovery of the sparse solutions is unrealistic in the noisy scenario, recovery of the support set of the solutions is guaranteed under suitable conditions. Specifically, a sufficient condition is derived that guarantees exact recovery of the sparse solutions in the noiseless scenario.

preprint · 2010 · arXiv

Fundamental Results on Fluid Approximations of Stochastic Process Algebra Models

To avoid the state space explosion problem encountered in the quantitative analysis of large-scale PEPA models, a fluid approximation approach has recently been proposed, which results in a set of ordinary differential equations (ODEs) that approximate the underlying continuous-time Markov chain (CTMC). This paper presents a mapping semantics from PEPA to ODEs based on a numerical representation scheme, which extends the class of PEPA models that can be subjected to fluid approximation. Furthermore, we have established the fundamental characteristics of the derived ODEs, such as the existence, uniqueness, boundedness and nonnegativeness of the solution. The convergence of the solution as time tends to infinity has been proved for several classes of PEPA models under some mild conditions. For general PEPA models, the convergence is proved under a particular condition, which has been shown to relate to well-known constants of Markov chains such as the spectral gap and the Log-Sobolev constant. This thesis has established the consistency between the fluid approximation and the underlying CTMCs for PEPA, i.e. the limit of the solution is consistent with the equilibrium probability distribution corresponding to a family of underlying density-dependent CTMCs.
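
The consistency claim can be seen on a toy case. The sketch below is not the PEPA-to-ODE mapping from the abstract and involves no cooperation between components. It assumes a population of identical components that switch independently between two local states A and B at rates `rate_ab` and `rate_ba`, integrates the resulting fluid ODE dx_A/dt = -rate_ab * x_A + rate_ba * x_B with forward Euler, and compares the limit with the equilibrium distribution of the two-state CTMC. All names and parameter values are hypothetical.

```python
def fluid_trajectory(x_a, x_b, rate_ab, rate_ba, dt=0.001, steps=20000):
    """Forward-Euler integration of the two-state fluid ODE; returns the final populations."""
    for _ in range(steps):
        flow = rate_ab * x_a - rate_ba * x_b  # net flow from A to B
        x_a -= dt * flow
        x_b += dt * flow
    return x_a, x_b


def ctmc_equilibrium(rate_ab, rate_ba):
    """Stationary distribution (pi_A, pi_B) of a single two-state component."""
    total = rate_ab + rate_ba
    return rate_ba / total, rate_ab / total
```

Starting from 100 components in state A with `rate_ab = 2` and `rate_ba = 1`, the fluid solution settles at about one third of the population in A, which is the total population multiplied by `pi_A`.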

preprint · 2010 · arXiv

Numerically Representing A Stochastic Process Algebra

The syntactic nature and compositionality of stochastic process algebras make models easy for humans to understand, but inconvenient for both machines and people to use directly for mathematical analysis and stochastic simulation. This paper presents a numerical representation schema for the stochastic process algebra PEPA, which provides a platform for directly and conveniently employing a variety of computational approaches to analyse the models both qualitatively and quantitatively. Moreover, the approaches developed on the basis of the schema are demonstrated and discussed. In particular, algorithms are presented for automatically deriving the schema from a general PEPA model and for simulating the model based on the derived schema to obtain performance measures.

33 published items

A Framework for Understanding Model Extraction Attack and Defense

Asymptotic Critical Radii in Random Geometric Graphs over 3-Dimensional Convex regions

Asymptotic Critical Transmission Radii in Wireless Networks over a Convex Region

Convergence Analysis of Structure-Preserving Numerical Methods Based on Slotboom Transformation for the Poisson--Nernst--Planck Equations

Federated Learning Challenges and Opportunities: An Outlook

Interval Privacy: A Framework for Privacy-Preserving Data Collection

Is a Classification Procedure Good Enough? A Goodness-of-Fit Assessment Tool for Classification Learning

On The Energy Statistics of Feature Maps in Pruning of Neural Networks with Skip-Connections

Self-Aware Personalized Federated Learning

Targeted Cross-Validation

The Rate of Convergence of Variation-Constrained Deep Neural Networks

Forecasting with Multiple Seasonality

Imitation Privacy

Information Laundering for Model Privacy

IoT Connectivity Technologies and Applications: A Survey

Non-escaping points of Zorich maps

Speech Emotion Recognition with Dual-Sequence LSTM Architecture

Structure-Preserving and Efficient Numerical Methods for Ion Transport

Towards Enabling Critical mMTC: A Review of URLLC within mMTC

Deep Clustering of Compressed Variational Embeddings

DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression

Restricted Recurrent Neural Networks

Supervised Encoding for Discrete Representation Learning

Bridging AIC and BIC: a new criterion for autoregression

Complementary Lattice Arrays for Coded Aperture Imaging

Data-Driven Learning of the Number of States in Multi-State Autoregressive Models

Key Pre-Distributions From Graph-Based Block Designs

Learning the Number of Autoregressive Mixtures in Time Series Using the Gap Statistics

A novel wireless sensor network topology with fewer links

Perturbation Analysis of Orthogonal Matching Pursuit

Performance of Orthogonal Matching Pursuit for Multiple Measurement Vectors

Fundamental Results on Fluid Approximations of Stochastic Process Algebra Models

Numerically Representing A Stochastic Process Algebra