Researcher profile

Jinshan Zeng

Jinshan Zeng contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
7 works
0 followers
11 topics
4 close collaborators

Published work

7 published item(s)

preprint | 2022 | arXiv

A Tale of HodgeRank and Spectral Method: Target Attack Against Rank Aggregation Is the Fixed Point of Adversarial Game

Rank aggregation with pairwise comparisons has shown promising results in elections, sports competitions, recommendations, and information retrieval. However, in contrast to the extensive research on the computational and statistical characteristics of such algorithms, little attention has been paid to their security. Driven by huge profits, a potential adversary has strong incentives to manipulate the ranking list, yet the intrinsic vulnerability of rank aggregation methods is not well studied in the literature. To fully understand the possible risks, this paper focuses on a purposeful adversary who seeks to dictate the aggregated results by modifying the pairwise data. From the perspective of dynamical systems, the attack behavior with a target ranking list is a fixed point of the composition of the adversary and the victim. To perform the targeted attack, we formulate the interaction between the adversary and the victim as a game-theoretic framework consisting of two continuous operators, for which a Nash equilibrium is established. Two procedures against HodgeRank and RankCentrality are then constructed to produce the modification of the original data. Furthermore, we prove that the victims will produce the target ranking list once the adversary has complete information. Notably, the proposed methods also allow an adversary who holds only incomplete information or imperfect feedback to perform the purposeful attack. The effectiveness of the suggested target attack strategies is demonstrated by a series of toy simulations and several real-world data experiments. The experimental results show that the proposed methods achieve the attacker's goal in the sense that the leading candidate of the perturbed ranking list is the one designated by the adversary.

preprint | 2022 | arXiv

CodeGen-Test: An Automatic Code Generation Model Integrating Program Test Information

Automatic code generation produces program code from a given natural language description. The current mainstream approach uses neural networks to encode the natural language description, outputs an abstract syntax tree (AST) at the decoder, and then converts the AST into program code. While the generated code largely conforms to the relevant syntax rules, two problems are still ignored. The first is the absence of program testing, an essential step in any complete code implementation; the second is the exclusive focus on syntactic compliance of the generated code, which neglects the more important functional requirements of the program. The paper proposes the CodeGen-Test model, which adds a program testing step and incorporates program testing information to iteratively generate code that meets the functional requirements of the program, thereby improving the quality of code generation. The paper also proposes a new evaluation metric, test accuracy (Test-Acc), defined as the proportion of generated code that passes the program tests. Unlike previous evaluation metrics, which assess the quality of generated code only in terms of character similarity, Test-Acc assesses it in terms of program functionality. The paper evaluates the CodeGen-Test model on the Python dataset "hearthstone legend". The experimental results show that the proposed method can effectively improve the quality of generated code. Compared with the best existing model, CodeGen-Test improves the BLEU value by 0.2%, the ROUGE-L value by 0.3%, and Test-Acc by 6%.
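The Test-Acc idea described in this abstract (the share of generated programs that pass their tests) can be illustrated with a minimal sketch. This is not the paper's evaluation code: the names `compute_test_acc` and `passes_add_tests`, the toy `add` task, and the use of `exec` as a test harness are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def compute_test_acc(programs: Sequence[str],
                     run_tests: Callable[[str], bool]) -> float:
    """Return the fraction of generated programs for which run_tests is True."""
    if not programs:
        return 0.0
    passed = sum(1 for source in programs if run_tests(source))
    return passed / len(programs)


def passes_add_tests(source: str) -> bool:
    """Hypothetical test harness: the program must define add(a, b) = a + b."""
    namespace = {}
    try:
        exec(source, namespace)  # only safe for trusted toy inputs
        return namespace["add"](2, 3) == 5
    except Exception:
        # Syntax errors, missing functions, and runtime errors all count as failures.
        return False


generated = [
    "def add(a, b):\n    return a + b",   # correct
    "def add(a, b):\n    return a - b",   # valid syntax, wrong behavior
    "def add(a, b) return a + b",         # syntax error
    "def add(a, b):\n    return b + a",   # correct
]
score = compute_test_acc(generated, passes_add_tests)
print(f"Test-Acc = {score:.2f}")
```

The second candidate is the case the abstract highlights: it is syntactically valid and textually close to a correct answer, so a character-similarity metric would rate it well, while a test-based metric counts it as a failure.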

preprint | 2022 | arXiv

Exploring Structural Sparsity of Deep Networks via Inverse Scale Spaces

The great success of deep neural networks is built upon their over-parameterization, which smooths the optimization landscape without degrading the generalization ability. Despite these benefits, the huge number of parameters makes deep networks cumbersome in everyday applications. Although techniques such as pruning and distillation have been developed, as backward selection methods they are expensive because they require fully training a dense network, and forward selection methods for learning structural sparsity in deep networks have not yet been explored systematically. To fill this gap, this paper proposes a new approach based on differential inclusions of inverse scale spaces, which generate a family of models from simple to complex ones along the dynamics by coupling a pair of parameters, such that over-parameterized deep models and their structural sparsity can be explored simultaneously. This kind of differential inclusion scheme has a simple discretization, dubbed Deep structure splitting Linearized Bregman Iteration (DessiLBI), whose global convergence in learning deep networks can be established under the Kurdyka-Lojasiewicz framework. Experimental evidence shows that our method achieves performance comparable to or even better than that of competitive optimizers in exploring the sparse structure of several widely used backbones on benchmark datasets. Remarkably, with early stopping, our method unveils "winning tickets" in early epochs: effective sparse network structures whose test accuracy is comparable to that of fully trained over-parameterized models, and which are further transferable to similar alternative tasks. Furthermore, our method is able to grow networks efficiently with adaptive filter configurations, demonstrating good performance at much lower computational cost. Code and models can be downloaded at https://github.com/DessiLBI2020/DessiLBI.

preprint | 2021 | arXiv

On Stochastic Variance Reduced Gradient Method for Semidefinite Optimization

Low-rank stochastic semidefinite optimization has attracted rising attention due to its wide range of applications. The nonconvex reformulation based on low-rank factorization significantly improves computational efficiency but brings new challenges to the analysis. The stochastic variance reduced gradient (SVRG) method has been regarded as one of the most effective methods. SVRG in general consists of two loops: a reference full gradient is first evaluated in the outer loop and then used to yield a variance-reduced estimate of the current gradient in the inner loop. Two options have been suggested for the output of the inner loop: Option I sets the output to the last iterate, and Option II obtains the output by random sampling from all the iterates in the inner loop. However, there is a significant gap between the theory and practice of SVRG when it is adapted to stochastic semidefinite programming (SDP): SVRG works better in practice with Option I, while most existing theoretical results focus on Option II. In this paper, we fill this gap by exploiting a new semi-stochastic variant of the original SVRG with Option I, adapted to semidefinite optimization. Equipped with this, we establish the global linear submanifold convergence (i.e., converging exponentially fast to a submanifold of a global minimum under the orthogonal group action) of the proposed SVRG method, given a provable initialization scheme and under certain smoothness and restricted strong convexity assumptions. Our analysis covers the effects of the mini-batch size and update frequency in the inner loop, as well as two practical step size strategies: the fixed and the stabilized Barzilai-Borwein step sizes. Numerical results in matrix sensing demonstrate the efficiency of the proposed SVRG method, which outperforms its Option II counterpart as well as other methods.
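The two-loop structure and the two output options described in this abstract can be sketched on a much simpler problem. The code below is plain SVRG on a noise-free least-squares objective, not the paper's semi-stochastic variant for semidefinite optimization; the function name `svrg`, the problem sizes, and the step size rule are assumptions chosen for illustration.

```python
import numpy as np


def svrg(A, b, step, n_outer, n_inner, option="I", seed=0):
    """Plain SVRG for min_x (1/2n) * ||A x - b||^2 (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x_ref = np.zeros(d)
    for _ in range(n_outer):
        # Outer loop: evaluate the full gradient at the reference point.
        full_grad = A.T @ (A @ x_ref - b) / n
        x = x_ref.copy()
        iterates = []
        for _ in range(n_inner):
            i = rng.integers(n)
            grad_x = A[i] * (A[i] @ x - b[i])          # sample gradient at x
            grad_ref = A[i] * (A[i] @ x_ref - b[i])    # same sample at x_ref
            # Inner loop: variance-reduced gradient estimate.
            x = x - step * (grad_x - grad_ref + full_grad)
            iterates.append(x)
        if option == "I":
            x_ref = iterates[-1]                       # Option I: last iterate
        else:
            x_ref = iterates[rng.integers(n_inner)]    # Option II: random iterate
    return x_ref


rng = np.random.default_rng(42)
A = rng.standard_normal((50, 5))
x_true = rng.standard_normal(5)
b = A @ x_true                                  # noise-free targets
step = 0.1 / np.max(np.sum(A ** 2, axis=1))     # 0.1 / (largest per-sample smoothness)

err_opt1 = np.linalg.norm(svrg(A, b, step, 50, 1000, option="I") - x_true)
err_opt2 = np.linalg.norm(svrg(A, b, step, 50, 1000, option="II") - x_true)
print(f"Option I error:  {err_opt1:.2e}")
print(f"Option II error: {err_opt2:.2e}")
```

The only difference between the two options is the line that picks the next reference point, which is why the gap between theory (Option II) and practice (Option I) mentioned in the abstract is easy to state but nontrivial to close.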

preprint | 2021 | arXiv

StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke Encoding

The generation of stylish Chinese fonts is an important problem involved in many applications. Most existing generation methods are based on deep generative models, particularly those built on generative adversarial networks (GANs). However, these deep generative models may suffer from mode collapse, which significantly degrades the diversity and quality of the generated results. In this paper, we introduce a one-bit stroke encoding to capture the key mode information of Chinese characters and then incorporate it into CycleGAN, a popular deep generative model for Chinese font generation. The result is an efficient method called StrokeGAN, mainly motivated by the observation that the stroke encoding contains a large amount of mode information about Chinese characters. In order to reconstruct the one-bit stroke encoding of the associated generated characters, we introduce a stroke-encoding reconstruction loss imposed on the discriminator. Equipped with this one-bit stroke encoding and stroke-encoding reconstruction loss, the mode collapse issue of CycleGAN can be significantly alleviated, with improved preservation of strokes and diversity of generated characters. The effectiveness of StrokeGAN is demonstrated by a series of generation tasks over nine datasets with different fonts. The numerical results demonstrate that StrokeGAN generally outperforms state-of-the-art methods in terms of content and recognition accuracies, as well as stroke error, and also generates more realistic characters.

preprint | 2020 | arXiv

DessiLBI: Exploring Structural Sparsity of Deep Networks via Differential Inclusion Paths

Over-parameterization is now ubiquitous in training neural networks, benefiting both optimization in seeking global optima and generalization in reducing prediction error. However, compressive networks are desired in many real-world applications, and direct training of small networks may be trapped in local optima. In this paper, instead of pruning or distilling over-parameterized models into compressive ones, we propose a new approach based on differential inclusions of inverse scale spaces. Specifically, it generates a family of models from simple to complex ones by coupling a pair of parameters, simultaneously training over-parameterized deep models and structural sparsity on the weights of fully connected and convolutional layers. Such a differential inclusion scheme has a simple discretization, proposed as Deep structurally splitting Linearized Bregman Iteration (DessiLBI), for which a global convergence analysis in deep learning is established: from any initialization, the algorithmic iterations converge to a critical point of the empirical risk. Experimental evidence shows that DessiLBI achieves performance comparable to or even better than that of competitive optimizers in exploring the structural sparsity of several widely used backbones on benchmark datasets. Remarkably, with early stopping, DessiLBI unveils "winning tickets" in early epochs: effective sparse structures with test accuracy comparable to that of fully trained over-parameterized models.

preprint | 2020 | arXiv

Fully-Corrective Gradient Boosting with Squared Hinge: Fast Learning Rates and Early Stopping

Boosting is a well-known method for improving the accuracy of weak learners in machine learning. However, its theoretical generalization guarantee is missing in the literature. In this paper, we propose an efficient boosting method with theoretical generalization guarantees for binary classification. The three key ingredients of the proposed boosting method are: a) the fully-corrective greedy (FCG) update in the boosting procedure, b) a differentiable squared hinge (also called truncated quadratic) function as the loss function, and c) an efficient alternating direction method of multipliers (ADMM) algorithm for the associated FCG optimization. The squared hinge loss not only inherits the robustness of the well-known hinge loss for classification with outliers, but also brings benefits for computational implementation and theoretical justification. Under a sparseness assumption, we derive a fast learning rate of order $\mathcal{O}((m/\log m)^{-1/4})$ for the proposed boosting method, which can be further improved to $\mathcal{O}((m/\log m)^{-1/2})$ if an additional noise assumption is imposed, where $m$ is the size of the sample set. Both derived learning rates are the best among the existing generalization results for boosting-type methods for classification. Moreover, an efficient early stopping scheme is provided for the proposed method. A series of toy simulations and real data experiments are conducted to verify the developed theories and demonstrate the effectiveness of the proposed method.
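To make the loss named in this abstract concrete, here is a minimal sketch of the hinge loss and the squared hinge (truncated quadratic) loss as functions of the margin y * f(x). It shows only the standard definitions; the function names are assumptions, and nothing here reproduces the paper's FCG update or ADMM solver.

```python
import numpy as np


def hinge(margin):
    """Hinge loss max(0, 1 - margin); not differentiable at margin = 1."""
    return np.maximum(0.0, 1.0 - margin)


def squared_hinge(margin):
    """Squared hinge loss max(0, 1 - margin)^2; differentiable everywhere."""
    return np.maximum(0.0, 1.0 - margin) ** 2


def squared_hinge_grad(margin):
    """Derivative of the squared hinge loss with respect to the margin."""
    return -2.0 * np.maximum(0.0, 1.0 - margin)


# Margins y * f(x) for a few hypothetical predictions.
margins = np.array([-1.0, 0.0, 0.5, 1.0, 2.0])
hinge_values = hinge(margins)
squared_values = squared_hinge(margins)
squared_grads = squared_hinge_grad(margins)
print(hinge_values)
print(squared_values)
print(squared_grads)
```

Both losses are zero once the margin reaches 1, but the squared hinge has a derivative that goes continuously to zero there, which is the differentiability the abstract refers to.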