Source author record

Xiao-Shan Gao

Xiao-Shan Gao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Symbolic Computation math.AG Machine Learning Cryptography and Security Computational Geometry Information Theory math.IT Computational Complexity Artificial Intelligence Computer Science and Game Theory Graphics math.AC math.GT math.NA math.PR Numerical Analysis

Catalog footprint

What is connected

27works

16topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2024arXiv

Data-Dependent Stability Analysis of Adversarial Training

Stability analysis is an essential aspect of studying the generalization ability of deep learning, as it involves deriving generalization bounds for stochastic gradient descent-based training algorithms. Adversarial training is the most widely used defense against adversarial example attacks. However, previous generalization bounds for adversarial training have not included information regarding the data distribution. In this paper, we fill this gap by providing generalization bounds for stochastic gradient descent-based adversarial training that incorporate data distribution information. We utilize the concepts of on-average stability and high-order approximate Lipschitz conditions to examine how changes in data distribution and adversarial budget can affect robust generalization gaps. Our derived generalization bounds for both convex and non-convex losses are at least as good as the uniform stability-based counterparts which do not include data distribution information. Furthermore, our findings demonstrate how distribution shifts from data poisoning attacks can impact robust generalization.

preprint2022arXiv

A Robust Classification-autoencoder to Defend Outliers and Adversaries

In this paper, a robust classification-autoencoder (CAE) is proposed, which has strong ability to recognize outliers and defend adversaries. The main idea is to change the autoencoder from an unsupervised learning model into a classifier, where the encoder is used to compress samples with different labels into disjoint compression spaces and the decoder is used to recover samples from their compression spaces. The encoder is used both as a compressed feature learner and as a classifier, and the decoder is used to decide whether the classification given by the encoder is correct by comparing the input sample with the output. Since adversary samples are seemingly inevitable for the current DNN framework, the list classifier to defend adversaries is introduced based on CAE, which outputs several labels and the corresponding samples recovered by the CAE. Extensive experimental results are used to show that the CAE achieves state of the art to recognize outliers by finding almost all outliers; the list classifier gives near lossless classification in the sense that the output list contains the correct label for almost all adversaries and the size of the output list is reasonably small.

preprint2022arXiv

Achieve Optimal Adversarial Accuracy for Adversarial Deep Learning using Stackelberg Game

Adversarial deep learning is to train robust DNNs against adversarial attacks, which is one of the major research focuses of deep learning. Game theory has been used to answer some of the basic questions about adversarial deep learning such as the existence of a classifier with optimal robustness and the existence of optimal adversarial samples for a given class of classifiers. In most previous work, adversarial deep learning was formulated as a simultaneous game and the strategy spaces are assumed to be certain probability distributions in order for the Nash equilibrium to exist. But, this assumption is not applicable to the practical situation. In this paper, we give answers to these basic questions for the practical case where the classifiers are DNNs with a given structure, by formulating the adversarial deep learning as sequential games. The existence of Stackelberg equilibria for these games are proved. Furthermore, it is shown that the equilibrium DNN has the largest adversarial accuracy among all DNNs with the same structure, when Carlini-Wagner's margin loss is used. Trade-off between robustness and accuracy in adversarial deep learning is also studied from game theoretical aspect.

preprint2022arXiv

Adversarial Parameter Attack on Deep Neural Networks

In this paper, a new parameter perturbation attack on DNNs, called adversarial parameter attack, is proposed, in which small perturbations to the parameters of the DNN are made such that the accuracy of the attacked DNN does not decrease much, but its robustness becomes much lower. The adversarial parameter attack is stronger than previous parameter perturbation attacks in that the attack is more difficult to be recognized by users and the attacked DNN gives a wrong label for any modified sample input with high probability. The existence of adversarial parameters is proved. For a DNN $F_Θ$ with the parameter set $Θ$ satisfying certain conditions, it is shown that if the depth of the DNN is sufficiently large, then there exists an adversarial parameter set $Θ_a$ for $Θ$ such that the accuracy of $F_{Θ_a}$ is equal to that of $F_Θ$, but the robustness measure of $F_{Θ_a}$ is smaller than any given bound. An effective training algorithm is given to compute adversarial parameters and numerical experiments are used to demonstrate that the algorithms are effective to produce high quality adversarial parameters.

preprint2022arXiv

Bit Complexity of Polynomial GCD on Sparse Representation

An input- and output-sensitive GCD algorithm for multi-variate polynomials over finite fields is proposed by combining the modular method with the Ben-Or/Tiwari sparse interpolation. The bit complexity of the algorithm is given and is sensitive to the sparse representation, while for previous sparse GCD algorithms, the complexities were given only in some special cases. It is shown that the new algorithm is superior both in theory and in practice comparing with existing GCD algorithms: the complexity in the degree is decreased from quadratic to linear and the running times are decreased by 1-3 orders of magnitude in various benchmarks.

preprint2022arXiv

Proving Information Inequalities and Identities with Symbolic Computation

Proving linear inequalities and identities of Shannon's information measures, possibly with linear constraints on the information measures, is an important problem in information theory. For this purpose, ITIP and other variant algorithms have been developed and implemented, which are all based on solving a linear program (LP). In particular, an identity $f = 0$ is verified by solving two LPs, one for $f \ge 0$ and one for $f \le 0$. In this paper, we develop a set of algorithms that can be implemented by symbolic computation. Based on these algorithms, procedures for verifying linear information inequalities and identities are devised. Compared with LP-based algorithms, our procedures can produce analytical proofs that are both human-verifiable and free of numerical errors. Our procedures are also more efficient computationally. For constrained inequalities, by taking advantage of the algebraic structure of the problem, the size of the LP that needs to be solved can be significantly reduced. For identities, instead of solving two LPs, the identity can be verified directly with very little computation.

preprint2022arXiv

Robust and Information-theoretically Safe Bias Classifier against Adversarial Attacks

In this paper, the bias classifier is introduced, that is, the bias part of a DNN with Relu as the activation function is used as a classifier. The work is motivated by the fact that the bias part is a piecewise constant function with zero gradient and hence cannot be directly attacked by gradient-based methods to generate adversaries, such as FGSM. The existence of the bias classifier is proved and an effective training method for the bias classifier is given. It is proved that by adding a proper random first-degree part to the bias classifier, an information-theoretically safe classifier against the original-model gradient attack is obtained in the sense that the attack will generate a totally random attacking direction. This seems to be the first time that the concept of information-theoretically safe classifier is proposed. Several attack methods for the bias classifier are proposed and numerical experiments are used to show that the bias classifier is more robust than DNNs with similar size against these attacks in most cases.

preprint2022arXiv

Skew-sparse matrix multiplication

Based on the observation that $\mathbb{Q}^{(p-1) \times (p-1)}$ is isomorphic to a quotient skew polynomial ring, we propose a new method for $(p-1)\times (p-1)$ matrix multiplication over $\mathbb{Q}$, where $p$ is a prime number. The main feature of our method is the acceleration for matrix multiplication if the product is skew-sparse. Based on the new method, we design a deterministic algorithm with complexity $O(T^{ω-2} p^2)$, where $T\le p-1$ is a parameter determined by the skew-sparsity of input matrices and $ω$ is the asymptotic exponent of matrix multiplication. Moreover, by introducing randomness, we also propose a probabilistic algorithm with complexity $O^\thicksim(t^{ω-2}p^2+p^2\log\frac{1}ν)$, where $t\le p-1$ is the skew-sparsity of the product and $ν$ is the probability parameter.

preprint2020arXiv

Lower Bound on Derivatives of Costa's Differential Entropy

Several conjectures concern the lower bound for the differential entropy $H(X_t)$ of an $n$-dimensional random vector $X_t$ introduced by Costa. Cheng and Geng conjectured that $H(X_t)$ is completely monotone, that is, $C_1(m,n): (-1)^{m+1}(d^m/d^m t)H(X_t)\ge0$. McKean conjectured that Gaussian $X_{Gt}$ achieves the minimum of $(-1)^{m+1}(d^m/d^m t)H(X_t)$ under certain conditions, that is, $C_2(m,n): (-1)^{m+1}(d^m/d^m t)H(X_t)\ge(-1)^{m+1}(d^m/d^m t)H(X_{Gt})$. McKean's conjecture was only considered in the univariate case before: $C_2(1,1)$ and $C_2(2,1)$ were proved by McKean and $C_2(i,1),i=3,4,5$ were proved by Zhang-Anantharam-Geng under the log-concave condition. In this paper, we prove $C_2(1,n)$, $C_2(2,n)$ and observe that McKean's conjecture might not be true for $n>1$ and $m>2$. We further propose a weaker version $C_3(m,n): (-1)^{m+1}(d^m/d^m t)H(X_t)\ge(-1)^{m+1}\frac{1}{n}(d^m/d^m t)H(X_{Gt})$ and prove $C_3(3,2)$, $C_3(3,3)$, $C_3(3,4)$, $C_3(4,2)$ under the log-concave condition. A systematical procedure to prove $C_l(m,n)$ is proposed based on semidefinite programming and the results mentioned above are proved using this procedure.

preprint2020arXiv

Prove Costa's Entropy Power Inequality and High Order Inequality for Differential Entropy with Semidefinite Programming

Costa's entropy power inequality is an important generalization of Shannon's entropy power inequality. Related with Costa's entropy power inequality and a conjecture proposed by McKean in 1966, Cheng-Geng recently conjectured that $D(m,n): (-1)^{m+1}(\partial^m/\partial^m t)H(X_t)\ge0$, where $X_t$ is the $n$-dimensional random variable in Costa's entropy power inequality and $H(X_t)$ the differential entropy of $X_t$. $D(1,n)$ and $D(2,n)$ were proved by Costa as consequences of Costa's entropy power inequality. Cheng-Geng proved $D(3,1)$ and $D(4,1)$. In this paper, we propose a systematical procedure to prove $D(m,n)$ and Costa's entropy power inequality based on semidefinite programming. Using software packages based on this procedure, we prove $D(3,n)$ for $n=2,3,4$ and give a new proof for Costa's entropy power inequality. We also show that with the currently known constraints, $D(5,1)$ and $D(4,2)$ cannot be proved with the procedure.

preprint2016arXiv

A Polynomial-time Algorithm to Compute Generalized Hermite Normal Form of Matrices over Z[x]

In this paper, a polynomial-time algorithm is given to compute the generalized Hermite normal form for a matrix F over Z[x], or equivalently, the reduced Groebner basis of the Z[x]-module generated by the column vectors of F. The algorithm is also shown to be practically more efficient than existing algorithms. The algorithm is based on three key ingredients. First, an F4 style algorithm to compute the Groebner basis is adopted, where a novel prolongation is designed such that the coefficient matrices under consideration have polynomial sizes. Second, fast algorithms to compute Hermite normal forms of matrices over Z are used. Third, the complexity of the algorithm are guaranteed by a nice estimation for the degree and height bounds of the polynomials in the generalized Hermite normal form.

preprint2016arXiv

Binomial Difference Ideals

In this paper, binomial difference ideals are studied. Three canonical representations for Laurent binomial difference ideals are given in terms of the reduced Groebner basis of Z[x]-lattices, regular and coherent difference ascending chains, and partial characters over Z[x]-lattices, respectively. Criteria for a Laurent binomial difference ideal to be reflexive, prime, well-mixed, and perfect are given in terms of their support lattices. The reflexive, well-mixed, and perfect closures of a Laurent binomial difference ideal are shown to be binomial. Most of the properties of Laurent binomial difference ideals are extended to the case of difference binomial ideals. Finally, algorithms are given to check whether a given Laurent binomial difference ideal I is reflexive, prime, well-mixed, or perfect, and in the negative case, to compute the reflexive, well-mixed, and perfect closures of I. An algorithm is given to decompose a finitely generated perfect binomial difference ideal as the intersection of reflexive prime binomial difference ideals.

preprint2016arXiv

Toric Difference Variety

In this paper, the concept of toric difference varieties is defined and four equivalent descriptions for toric difference varieties are presented in terms of difference rational parametrization, difference coordinate rings, toric difference ideals, and group actions by difference tori. Connections between toric difference varieties and affine N[x]-semimodules are established by proving the correspondence between the irreducible invariant difference subvarieties and the faces of the N[x]-submodules and the orbit-face correspondence. Finally, an algorithm is given to decide whether a binomial difference ideal represented by a Z[x]-lattice defines a toric difference variety.

preprint2015arXiv

A Triangular Decomposition Algorithm for Differential Polynomial Systems with Elementary Computation Complexity

In this paper, a new triangular decomposition algorithm is proposed for ordinary differential polynomial systems, which has triple exponential computational complexity. The key idea is to eliminate one algebraic variable from a set of polynomials in one step using the theory of multivariate resultant. This seems to be the first differential triangular decomposition algorithm with elementary computation complexity.

preprint2015arXiv

Binomial Difference Ideal and Toric Difference Variety

In this paper, the concepts of binomial difference ideals and toric difference varieties are defined and their properties are proved. Two canonical representations for Laurent binomial difference ideals are given using the reduced Groebner basis of Z[x]-lattices and regular and coherent difference ascending chains, respectively. Criteria for a Laurent binomial difference ideal to be reflexive, prime, well-mixed, perfect, and toric are given in terms of their support lattices which are Z[x]-lattices. The reflexive, well-mixed, and perfect closures of a Laurent binomial difference ideal are shown to be binomial. Four equivalent definitions for toric difference varieties are presented. Finally, algorithms are given to check whether a given Laurent binomial difference ideal I is reflexive, prime, well-mixed, perfect, or toric, and in the negative case, to compute the reflexive, well-mixed, and perfect closures of I. An algorithm is given to decompose a finitely generated perfect binomial difference ideal as the intersection of reflexive prime binomial difference ideals.

preprint2013arXiv

Sparse Difference Resultant

In this paper, the concept of sparse difference resultant for a Laurent transformally essential system of difference polynomials is introduced and a simple criterion for the existence of sparse difference resultant is given. The concept of transformally homogenous polynomial is introduced and the sparse difference resultant is shown to be transformally homogenous. It is shown that the vanishing of the sparse difference resultant gives a necessary condition for the corresponding difference polynomial system to have non-zero solutions. The order and degree bounds for sparse difference resultant are given. Based on these bounds, an algorithm to compute the sparse difference resultant is proposed, which is single exponential in terms of the number of variables, the Jacobi number, and the size of the Laurent transformally essential system. Furthermore, the precise order and degree, a determinant representation, and a Poisson-type product formula for the difference resultant are given.

preprint2012arXiv

Certified Approximation of Parametric Space Curves with Cubic B-spline Curves

Approximating complex curves with simple parametric curves is widely used in CAGD, CG, and CNC. This paper presents an algorithm to compute a certified approximation to a given parametric space curve with cubic B-spline curves. By certified, we mean that the approximation can approximate the given curve to any given precision and preserve the geometric features of the given curve such as the topology, singular points, etc. The approximated curve is divided into segments called quasi-cubic Bézier curve segments which have properties similar to a cubic rational Bézier curve. And the approximate curve is naturally constructed as the associated cubic rational Bézier curve of the control tetrahedron of a quasi-cubic curve. A novel optimization method is proposed to select proper weights in the cubic rational Bézier curve to approximate the given curve. The error of the approximation is controlled by the size of its tetrahedron, which converges to zero by subdividing the curve segments. As an application, approximate implicit equations of the approximated curves can be computed. Experiments show that the method can approximate space curves of high degrees with high precision and very few cubic Bézier curve segments.

preprint2012arXiv

Certified Rational Parametric Approximation of Real Algebraic Space Curves with Local Generic Position Method

In this paper, an algorithm to compute a certified $G^1$ rational parametric approximation for algebraic space curves is given by extending the local generic position method for solving zero dimensional polynomial equation systems to the case of dimension one. By certified, we mean the approximation curve and the original curve have the same topology and their Hausdauff distance is smaller than a given precision. Thus, the method also gives a new algorithm to compute the topology for space algebraic curves. The main advantage of the algorithm, inhering from the local generic method, is that topology computation and approximation for a space curve is directly reduced to the same tasks for two plane curves. In particular, the error bound of the approximation space curve is obtained from the error bounds of the approximation plane curves explicitly. Nontrivial examples are used to show the effectivity of the method.

preprint2012arXiv

Matrix Formula of Differential Resultant for First Order Generic Ordinary Differential Polynomials

In this paper, a matrix representation for the differential resultant of two generic ordinary differential polynomials $f_1$ and $f_2$ in the differential indeterminate $y$ with order one and arbitrary degree is given. That is, a non-singular matrix is constructed such that its determinant contains the differential resultant as a factor. Furthermore, the algebraic sparse resultant of $f_1, f_2, δf_1, δf_2$ treated as polynomials in $y, y', y"$ is shown to be a non-zero multiple of the differential resultant of $f_1, f_2$. Although very special, this seems to be the first matrix representation for a class of nonlinear generic differential polynomials.

preprint2012arXiv

Sparse Differential Resultant for Laurent Differential Polynomials

In this paper, we first introduce the concept of Laurent differentially essential systems and give a criterion for Laurent differentially essential systems in terms of their supports. Then the sparse differential resultant for a Laurent differentially essential system is defined and its basic properties are proved. In particular, order and degree bounds for the sparse differential resultant are given. Based on these bounds, an algorithm to compute the sparse differential resultant is proposed, which is single exponential in terms of the number of indeterminates, the Jacobi number of the system, and the size of the system.

preprint2011arXiv

Differential Chow Form for Projective Differential Variety

In this paper, a generic intersection theorem in projective differential algebraic geometry is presented. Precisely, the intersection of an irreducible projective differential variety of dimension d>0 and order h with a generic projective differential hyperplane is shown to be an irreducible projective differential variety of dimension d-1 and order h. Based on the generic intersection theorem, the Chow form for an irreducible projective differential variety is defined and most of the properties of the differential Chow form in affine differential case are established for its projective differential counterpart. Finally, we apply the differential Chow form to a result of linear dependence over projective varieties given by Kolchin.

preprint2011arXiv

Intersection Theory in Differential Algebraic Geometry: Generic Intersections and the Differential Chow Form

In this paper, an intersection theory for generic differential polynomials is presented. The intersection of an irreducible differential variety of dimension $d$ and order $h$ with a generic differential hypersurface of order $s$ is shown to be an irreducible variety of dimension $d-1$ and order $h+s$. As a consequence, the dimension conjecture for generic differential polynomials is proved. Based on the intersection theory, the Chow form for an irreducible differential variety is defined and most of the properties of the Chow form in the algebraic case are established for its differential counterpart. Furthermore, the generalized differential Chow form is defined and its properties are proved. As an application of the generalized differential Chow form, the differential resultant of $n+1$ generic differential polynomials in $n$ variables is defined and properties similar to that of the Macaulay resultant for multivariate polynomials are proved.

preprint2011arXiv

Multiplicity Preserving Triangular Set Decomposition of Two Polynomials

In this paper, a multiplicity preserving triangular set decomposition algorithm is proposed for a system of two polynomials. The algorithm decomposes the variety defined by the polynomial system into unmixed components represented by triangular sets, which may have negative multiplicities. In the bivariate case, we give a complete algorithm to decompose the system into multiplicity preserving triangular sets with positive multiplicities. We also analyze the complexity of the algorithm in the bivariate case. We implement our algorithm and show the effectiveness of the method with extensive experiments.

preprint2011arXiv

Root Isolation of Zero-dimensional Polynomial Systems with Linear Univariate Representation

In this paper, a linear univariate representation for the roots of a zero-dimensional polynomial equation system is presented, where the roots of the equation system are represented as linear combinations of roots of several univariate polynomial equations. The main advantage of this representation is that the precision of the roots can be easily controlled. In fact, based on the linear univariate representation, we can give the exact precisions needed for roots of the univariate equations in order to obtain the roots of the equation system to a given precision. As a consequence, a root isolation algorithm for a zero-dimensional polynomial equation system can be easily derived from its linear univariate representation.

preprint2010arXiv

Efficient Characteristic Set Algorithms for Equation Solving in Finite Fields and Applications in Cryptanalysis

Efficient characteristic set methods for computing solutions of polynomial equation systems in a finite field are proposed. The concept of proper triangular sets is introduced and an explicit formula for the number of solutions of a proper and monic (or regular) triangular set is given. An improved zero decomposition algorithm which can be used to reduce the zero set of an equation system in general form to the union of zero sets of monic proper triangular sets is proposed. As a consequence, we can give an explicit formula for the number of solutions of an equation system. Bitsize complexity for the algorithm is given in the case of Boolean polynomials. We also give a multiplication free characteristic set method for Boolean polynomials, where the sizes of the polynomials are effectively controlled. The algorithms are implemented in the case of Boolean polynomials and extensive experiments show that they are quite efficient for solving certain classes of Boolean equations.

preprint2010arXiv

On Functional Decomposition of Multivariate Polynomials with Differentiation and Homogenization

In this paper, we give a theoretical analysis for the algorithms to compute functional decomposition for multivariate polynomials based on differentiation and homogenization which are proposed by Ye, Dai, Lam (1999) and Faug$μ$ere, Perret (2006, 2008, 2009). We show that a degree proper functional decomposition for a set of randomly decomposable quartic homogenous polynomials can be computed using the algorithm with high probability. This solves a conjecture proposed by Ye, Dai, and Lam (1999). We also propose a conjecture such that the decomposition for a set of polynomials can be computed from that of its homogenization with high probability. Finally, we prove that the right decomposition factors for a set of polynomials can be computed from its right decomposition factor space. Combining these results together, we prove that the algorithm can compute a degree proper decomposition for a set of randomly decomposable quartic polynomials with probability one when the base field is of characteristic zero, and with probability close to one when the base field is a finite field with sufficiently large number under the assumption that the conjeture is correct.

preprint2009arXiv

Ambient Isotopic Meshing of Implicit Algebraic Surface with Singularities

A complete method is proposed to compute a certified, or ambient isotopic, meshing for an implicit algebraic surface with singularities. By certified, we mean a meshing with correct topology and any given geometric precision. We propose a symbolic-numeric method to compute a certified meshing for the surface inside a box containing singularities and use a modified Plantinga-Vegter marching cube method to compute a certified meshing for the surface inside a box without singularities. Nontrivial examples are given to show the effectiveness of the algorithm. To our knowledge, this is the first method to compute a certified meshing for surfaces with singularities.

Xiao-Shan Gao

What is connected

Connect this record

See the researcher in context

Building this map preview

27 published item(s)

Data-Dependent Stability Analysis of Adversarial Training

A Robust Classification-autoencoder to Defend Outliers and Adversaries

Achieve Optimal Adversarial Accuracy for Adversarial Deep Learning using Stackelberg Game

Adversarial Parameter Attack on Deep Neural Networks

Bit Complexity of Polynomial GCD on Sparse Representation

Proving Information Inequalities and Identities with Symbolic Computation

Robust and Information-theoretically Safe Bias Classifier against Adversarial Attacks

Skew-sparse matrix multiplication

Lower Bound on Derivatives of Costa's Differential Entropy

Prove Costa's Entropy Power Inequality and High Order Inequality for Differential Entropy with Semidefinite Programming

A Polynomial-time Algorithm to Compute Generalized Hermite Normal Form of Matrices over Z[x]

Binomial Difference Ideals

Toric Difference Variety

A Triangular Decomposition Algorithm for Differential Polynomial Systems with Elementary Computation Complexity

Binomial Difference Ideal and Toric Difference Variety

Sparse Difference Resultant

Certified Approximation of Parametric Space Curves with Cubic B-spline Curves

Certified Rational Parametric Approximation of Real Algebraic Space Curves with Local Generic Position Method

Matrix Formula of Differential Resultant for First Order Generic Ordinary Differential Polynomials

Sparse Differential Resultant for Laurent Differential Polynomials

Differential Chow Form for Projective Differential Variety

Intersection Theory in Differential Algebraic Geometry: Generic Intersections and the Differential Chow Form

Multiplicity Preserving Triangular Set Decomposition of Two Polynomials

Root Isolation of Zero-dimensional Polynomial Systems with Linear Univariate Representation

Efficient Characteristic Set Algorithms for Equation Solving in Finite Fields and Applications in Cryptanalysis

On Functional Decomposition of Multivariate Polynomials with Differentiation and Homogenization

Ambient Isotopic Meshing of Implicit Algebraic Surface with Singularities