Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
35works
0followers
16topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

35 published item(s)

preprint2026arXiv

TMPO: Trajectory Matching Policy Optimization for Diverse and Efficient Diffusion Alignment

Reinforcement learning (RL) has shown extraordinary potential in aligning diffusion models to downstream tasks, yet most of them still suffer from significant reward hacking, which degrades generative diversity and quality by inducing visual mode collapse and amplifying unreliable rewards. We identify the root cause as the mode-seeking nature of these methods, which maximize expected reward without effectively constraining probability distribution over acceptable trajectories, causing concentration on a few high-reward paths. In contrast, we propose Trajectory Matching Policy Optimization (TMPO), which replaces scalar reward maximization with trajectory-level reward distribution matching. Specifically, TMPO introduces a Softmax Trajectory Balance (Softmax-TB) objective to match the policy probabilities of K trajectories to a reward-induced Boltzmann distribution. We prove that this objective inherits the mode-covering property of forward KL divergence, preserving coverage over all acceptable trajectories while optimizing reward. To further reduce multi-trajectory training time on large-scale flow-matching models, TMPO incorporates Dynamic Stochastic Tree Sampling, where trajectories share denoising prefixes and branch at dynamically scheduled steps, reducing redundant computation while improving training effectiveness. Extensive results across diverse alignment tasks such as human preference, compositional generation and text rendering show that TMPO improves generative diversity over state-of-the-art methods by 9.1%, and achieves competitive performance in all downstream and efficiency metrics, attaining the optimal trade-off between reward and diversity.

preprint2024arXiv

FABind: Fast and Accurate Protein-Ligand Binding

Modeling the interaction between proteins and ligands and accurately predicting their binding structures is a critical yet challenging task in drug discovery. Recent advancements in deep learning have shown promise in addressing this challenge, with sampling-based and regression-based methods emerging as two prominent approaches. However, these methods have notable limitations. Sampling-based methods often suffer from low efficiency due to the need for generating multiple candidate structures for selection. On the other hand, regression-based methods offer fast predictions but may experience decreased accuracy. Additionally, the variation in protein sizes often requires external modules for selecting suitable binding pockets, further impacting efficiency. In this work, we propose $\mathbf{FABind}$, an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding. $\mathbf{FABind}$ incorporates a unique ligand-informed pocket prediction module, which is also leveraged for docking pose estimation. The model further enhances the docking process by incrementally integrating the predicted pocket to optimize protein-ligand binding, reducing discrepancies between training and inference. Through extensive experiments on benchmark datasets, our proposed $\mathbf{FABind}$ demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods. Our code is available at https://github.com/QizhiPei/FABind

preprint2023arXiv

Rethinking the Data Annotation Process for Multi-view 3D Pose Estimation with Active Learning and Self-Training

Pose estimation of the human body and hands is a fundamental problem in computer vision, and learning-based solutions require a large amount of annotated data. In this work, we improve the efficiency of the data annotation process for 3D pose estimation problems with Active Learning (AL) in a multi-view setting. AL selects examples with the highest value to annotate under limited annotation budgets (time and cost), but choosing the selection strategy is often nontrivial. We present a framework to efficiently extend existing single-view AL strategies. We then propose two novel AL strategies that make full use of multi-view geometry. Moreover, we demonstrate additional performance gains by incorporating pseudo-labels computed during the AL process, which is a form of self-training. Our system significantly outperforms simulated annotation baselines in 3D body and hand pose estimation on two large-scale benchmarks: CMU Panoptic Studio and InterHand2.6M. Notably, on CMU Panoptic Studio, we are able to reduce the turn-around time by 60% and annotation cost by 80% when compared to the conventional annotation process.

preprint2022arXiv

A Strengthened Branch and Bound Algorithm for the Maximum Common (Connected) Subgraph Problem

We propose a new and strengthened Branch-and-Bound (BnB) algorithm for the maximum common (connected) induced subgraph problem based on two new operators, Long-Short Memory (LSM) and Leaf vertex Union Match (LUM). Given two graphs for which we search for the maximum common (connected) induced subgraph, the first operator of LSM maintains a score for the branching node using the short-term reward of each vertex of the first graph and the long-term reward of each vertex pair of the two graphs. In this way, the BnB process learns to reduce the search tree size significantly and improve the algorithm performance. The second operator of LUM further improves the performance by simultaneously matching the leaf vertices connected to the current matched vertices, and allows the algorithm to match multiple vertex pairs without affecting the solution optimality. We incorporate the two operators into the state-of-the-art BnB algorithm McSplit, and denote the resulting algorithm as McSplit+LL. Experiments show that McSplit+LL outperforms McSplit+RL, a more recent variant of McSplit using reinforcement learning that is superior than McSplit.

preprint2022arXiv

An Efficient Algorithm for the Partitioning Min-Max Weighted Matching Problem

The Partitioning Min-Max Weighted Matching (PMMWM) problem is an NP-hard problem that combines the problem of partitioning a group of vertices of a bipartite graph into disjoint subsets with limited size and the classical Min-Max Weighted Matching (MMWM) problem. Kress et al. proposed this problem in 2015 and they also provided several algorithms, among which MP$_{\text{LS}}$ is the state-of-the-art. In this work, we observe there is a time bottleneck in the matching phase of MP$_{\text{LS}}$. Hence, we optimize the redundant operations during the matching iterations, and propose an efficient algorithm called the MP$_{\text{KM-M}}$ that greatly speeds up MP$_{\text{LS}}$. The bottleneck time complexity is optimized from $O(n^3)$ to $O(n^2)$. We also prove the correctness of MP$_{\text{KM-M}}$ by the primal-dual method. To test the performance on diverse instances, we generate various types and sizes of benchmarks, and carried out an extensive computational study on the performance of MP$_{\text{KM-M}}$ and MP$_{\text{LS}}$. The evaluation results show that our MP$_{\text{KM-M}}$ greatly shortens the runtime as compared with MP$_{\text{LS}}$ while yielding the same solution quality.

preprint2022arXiv

An efficient thermal lattice Boltzmann method for simulating three-dimensional liquid-vapor phase change

In this paper, a multiple-relaxation-time lattice Boltzmann (LB) approach is developed for the simulation of three-dimensional (3D) liquid-vapor phase change based on the pseudopotential model. In contrast to some existing 3D thermal LB models for liquid-vapor phase change, the present approach has two advantages: for one thing, the current approach does not require calculating the gradient of volumetric heat capacity [i.e., $\nabla \left( {ρ{c_v}} \right)$], and for another, the current approach is constructed based on the seven discrete velocities in three dimensions (D3Q7), making the current thermal LB model more efficient and easy to implement. Also, based on the scheme proposed by Zhou and He [Phys Fluids 9:1591-1598, 1997], a pressure boundary condition for the D3Q19 lattice is proposed to model the multiphase flow in open systems. The current method is then validated by considering the temperature distribution in a 3D saturated liquid-vapor system, the $d^2$ law and the droplet evaporation on a heated surface. It is observed that the numerical results fit well with the analytical solutions, the results of the finite difference method and the experimental data. Our numerical results indicate that the present approach is reliable and efficient in dealing with the 3D liquid-vapor phase change.

preprint2022arXiv

Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities

Assembly101 is a new procedural activity dataset featuring 4321 videos of people assembling and disassembling 101 "take-apart" toy vehicles. Participants work without fixed instructions, and the sequences feature rich and natural variations in action ordering, mistakes, and corrections. Assembly101 is the first multi-view action dataset, with simultaneous static (8) and egocentric (4) recordings. Sequences are annotated with more than 100K coarse and 1M fine-grained action segments, and 18M 3D hand poses. We benchmark on three action understanding tasks: recognition, anticipation and temporal segmentation. Additionally, we propose a novel task of detecting mistakes. The unique recording format and rich set of annotations allow us to investigate generalization to new toys, cross-view transfer, long-tailed distributions, and pose vs. appearance. We envision that Assembly101 will serve as a new challenge to investigate various activity understanding problems.

preprint2022arXiv

BandMaxSAT: A Local Search MaxSAT Solver with Multi-armed Bandit

We address Partial MaxSAT (PMS) and Weighted PMS (WPMS), two practical generalizations of the MaxSAT problem, and propose a local search algorithm for these problems, called BandMaxSAT, that applies a multi-armed bandit model to guide the search direction. The bandit in our method is associated with all the soft clauses in the input (W)PMS instance. Each arm corresponds to a soft clause. The bandit model can help BandMaxSAT to select a good direction to escape from local optima by selecting a soft clause to be satisfied in the current step, that is, selecting an arm to be pulled. We further propose an initialization method for (W)PMS that prioritizes both unit and binary clauses when producing the initial solutions. Extensive experiments demonstrate that BandMaxSAT significantly outperforms the state-of-the-art (W)PMS local search algorithm SATLike3.0. Specifically, the number of instances in which BandMaxSAT obtains better results is about twice that obtained by SATLike3.0. Moreover, we combine BandMaxSAT with the complete solver TT-Open-WBO-Inc. The resulting solver BandMaxSAT-c also outperforms some of the best state-of-the-art complete (W)PMS solvers, including SATLike-c, Loandra and TT-Open-WBO-Inc.

preprint2022arXiv

Detecting Textual Adversarial Examples through Randomized Substitution and Vote

A line of work has shown that natural text processing models are vulnerable to adversarial examples. Correspondingly, various defense methods are proposed to mitigate the threat of textual adversarial examples, eg, adversarial training, input transformations, detection, etc. In this work, we treat the optimization process for synonym substitution based textual adversarial attacks as a specific sequence of word replacement, in which each word mutually influences other words. We identify that we could destroy such mutual interaction and eliminate the adversarial perturbation by randomly substituting a word with its synonyms. Based on this observation, we propose a novel textual adversarial example detection method, termed Randomized Substitution and Vote (RS&V), which votes the prediction label by accumulating the logits of k samples generated by randomly substituting the words in the input text with synonyms. The proposed RS&V is generally applicable to any existing neural networks without modification on the architecture or extra training, and it is orthogonal to prior work on making the classification network itself more robust. Empirical evaluations on three benchmark datasets demonstrate that our RS&V could detect the textual adversarial examples more successfully than the existing detection methods while maintaining the high classification accuracy on benign samples.

preprint2022arXiv

Effective Variable Depth Local Search for the Budgeted Maximum Coverage Problem

We address the Budgeted Maximum Coverage Problem (BMCP), which is a natural and more practical extension of the standard 0-1 knapsack problem and the set cover problem. Given m elements with nonnegative weights, n subsets of elements with nonnegative costs, and a total budget, BMCP aims to select some subsets such that the total cost of selected subsets does not exceed the budget, and the total weight of associated elements is maximized. In this paper, we propose a variable depth local search algorithm (VDLS) for the BMCP. VDLS first generates an initial solution by a greedy algorithm, then iteratively improves the solution through a partial depth-first search method, that can improve the solution by simultaneously changing the states (selected or not) of multiple subsets. Such method allows VDLS to explore the solution space widely and deeply, and to yield high-quality solutions. We further propose a neighbour structure to boost the algorithm performance, that is, both subsets have a neighbour relation if they have at least one common associated element. By applying the neighbour structure, VDLS can adjust the selected subsets while losing as few covered elements as possible. Since the existing BMCP benchmarks only have simple structures and small scales, we design 60 new instances with relatively large scales and complex structures to enrich the diversity of the BMCP instances. Experimental results on 30 public instances and 60 new instances we designed demonstrate that VDLS significantly outperforms the existing heuristic and the general CPLEX exact solver, for the BMCP.

preprint2022arXiv

Error-feedback stochastic modeling strategy for time series forecasting with convolutional neural networks

Despite the superiority of convolutional neural networks demonstrated in time series modeling and forecasting, it has not been fully explored on the design of the neural network architecture and the tuning of the hyper-parameters. Inspired by the incremental construction strategy for building a random multilayer perceptron, we propose a novel Error-feedback Stochastic Modeling (ESM) strategy to construct a random Convolutional Neural Network (ESM-CNN) for time series forecasting task, which builds the network architecture adaptively. The ESM strategy suggests that random filters and neurons of the error-feedback fully connected layer are incrementally added to steadily compensate the prediction error during the construction process, and then a filter selection strategy is introduced to enable ESM-CNN to extract the different size of temporal features, providing helpful information at each iterative process for the prediction. The performance of ESM-CNN is justified on its prediction accuracy of one-step-ahead and multi-step-ahead forecasting tasks respectively. Comprehensive experiments on both the synthetic and real-world datasets show that the proposed ESM-CNN not only outperforms the state-of-art random neural networks, but also exhibits stronger predictive power and less computing overhead in comparison to trained state-of-art deep neural network models.

preprint2022arXiv

Generating Pseudo-labels Adaptively for Few-shot Model-Agnostic Meta-Learning

Model-Agnostic Meta-Learning (MAML) is a famous few-shot learning method that has inspired many follow-up efforts, such as ANIL and BOIL. However, as an inductive method, MAML is unable to fully utilize the information of query set, limiting its potential of gaining higher generality. To address this issue, we propose a simple yet effective method that generates psuedo-labels adaptively and could boost the performance of the MAML family. The proposed methods, dubbed Generative Pseudo-label based MAML (GP-MAML), GP-ANIL and GP-BOIL, leverage statistics of the query set to improve the performance on new tasks. Specifically, we adaptively add pseudo labels and pick samples from the query set, then re-train the model using the picked query samples together with the support set. The GP series can also use information from the pseudo query set to re-train the network during the meta-testing. While some transductive methods, such as Transductive Propagation Network (TPN), struggle to achieve this goal.

preprint2022arXiv

HoSIM: Higher-order Structural Importance based Method for Multiple Local Community Detection

Local community detection has attracted much research attention recently, and many methods have been proposed for the single local community detection that finds a community containing the given set of query nodes. However, nodes may belong to several communities in the network, and detecting all the communities for the query node set, termed as the multiple local community detection (MLCD), is more important as it could uncover more potential information. MLCD is also more challenging because when a query node belongs to multiple communities, it always locates in the complicated overlapping region and the marginal region of communities. Accordingly, detecting multiple communities for such nodes by applying seed expansion methods is insufficient. In this work, we address the MLCD based on higher-order structural importance (HoSI). First, to effectively estimate the influence of higher-order structures, we propose a new variant of random walk called Active Random Walk to measure the HoSI score between nodes. Then, we propose two new metrics to evaluate the HoSI score of a subgraph to a node and the HoSI score of a node, respectively. Based on the proposed metrics, we present a novel algorithm called HoSIM to detect multiple local communities for a single query node. HoSIM enforces a three-stage processing, namely subgraph sampling, core member identification, and local community detection. The key idea is utilizing HoSI to find and identify the core members of communities relevant to the query node and optimize the generated communities. Extensive experiments illustrate the effectiveness of HoSIM.

preprint2022arXiv

Hybrid Learning with New Value Function for the Maximum Common Subgraph Problem

Maximum Common induced Subgraph (MCS) is an important NP-hard problem with wide real-world applications. Branch-and-Bound (BnB) is the basis of a class of efficient algorithms for MCS, consisting in successively selecting vertices to match and pruning when it is discovered that a solution better than the best solution found so far does not exist. The method of selecting the vertices to match is essential for the performance of BnB. In this paper, we propose a new value function and a hybrid selection strategy used in reinforcement learning to define a new vertex selection method, and propose a new BnB algorithm, called McSplitDAL, for MCS. Extensive experiments show that McSplitDAL significantly improves the current best BnB algorithms, McSplit+LL and McSplit+RL. An empirical analysis is also performed to illustrate why the new value function and the hybrid selection strategy are effective.

preprint2022arXiv

Integrating Large Circular Kernels into CNNs through Neural Architecture Search

The square kernel is a standard unit for contemporary CNNs, as it fits well on the tensor computation for convolution operation. However, the retinal ganglion cells in the biological visual system have approximately concentric receptive fields. Motivated by this observation, we propose to use circular kernel with a concentric and isotropic receptive field as an option for the convolution operation. We first propose a simple yet efficient implementation of the convolution using circular kernels, and empirically show the significant advantages of large circular kernels over the counterpart square kernels. We then expand the operation space of several typical Neural Architecture Search (NAS) methods with the convolutions of large circular kernels. The searched new neural architectures do contain large circular kernels and outperform the original searched models considerably. Our additional analysis also reveals that large circular kernels could help the model to be more robust to the rotated or sheared images due to their better rotation invariance. Our work shows the potential of designing new convolutional kernels for CNNs, bringing up the prospect of expanding the search space of NAS with new variants of convolutions.

preprint2022arXiv

Propagation with Adaptive Mask then Training for Node Classification on Attributed Networks

Node classification on attributed networks is a semi-supervised task that is crucial for network analysis. By decoupling two critical operations in Graph Convolutional Networks (GCNs), namely feature transformation and neighborhood aggregation, some recent works of decoupled GCNs could support the information to propagate deeper and achieve advanced performance. However, they follow the traditional structure-aware propagation strategy of GCNs, making it hard to capture the attribute correlation of nodes and sensitive to the structure noise described by edges whose two endpoints belong to different categories. To address these issues, we propose a new method called the itshape Propagation with Adaptive Mask then Training (PAMT). The key idea is to integrate the attribute similarity mask into the structure-aware propagation process. In this way, PAMT could preserve the attribute correlation of adjacent nodes during the propagation and effectively reduce the influence of structure noise. Moreover, we develop an iterative refinement mechanism to update the similarity mask during the training process for improving the training performance. Extensive experiments on four real-world datasets demonstrate the superior performance and robustness of PAMT.

preprint2022arXiv

Reinforced Hybrid Genetic Algorithm for the Traveling Salesman Problem

In this paper, we propose a new method called the Reinforced Hybrid Genetic Algorithm (RHGA) for solving the famous NP-hard Traveling Salesman Problem (TSP). Specifically, we combine reinforcement learning with the well-known Edge Assembly Crossover genetic algorithm (EAX-GA) and the Lin-Kernighan-Helsgaun (LKH) local search heuristic. In the hybrid algorithm, LKH can help EAX-GA improve the population by its effective local search, and EAX-GA can help LKH escape from local optima by providing high-quality and diverse initial solutions. We restrict that there is only one special individual among the population in EAX-GA that can be improved by LKH. Such a mechanism can prevent the population diversity, efficiency, and algorithm performance from declining due to the redundant calling of LKH upon the population. As a result, our proposed hybrid mechanism can help EAX-GA and LKH boost each other's performance without reducing the convergence rate of the population. The reinforcement learning technique based on Q-learning further promotes the hybrid genetic algorithm. Experimental results on 138 well-known and widely used TSP benchmarks with the number of cities ranging from 1,000 to 85,900 demonstrate the excellent performance of RHGA.

preprint2022arXiv

Reinforced Lin-Kernighan-Helsgaun Algorithms for the Traveling Salesman Problems

TSP is a classical NP-hard combinatorial optimization problem with many practical variants. LKH is one of the state-of-the-art local search algorithms for the TSP. LKH-3 is a powerful extension of LKH that can solve many TSP variants. Both LKH and LKH-3 associate a candidate set to each city to improve the efficiency, and have two different methods, $α$-measure and POPMUSIC, to decide the candidate sets. In this work, we first propose a Variable Strategy Reinforced LKH (VSR-LKH) algorithm, which incorporates three reinforcement learning methods (Q-learning, Sarsa, Monte Carlo) with LKH, for the TSP. We further propose a new algorithm called VSR-LKH-3 that combines the variable strategy reinforcement learning method with LKH-3 for typical TSP variants, including the TSP with time windows (TSPTW) and Colored TSP (CTSP). The proposed algorithms replace the inflexible traversal operations in LKH and LKH-3 and let the algorithms learn to make a choice at each search step by reinforcement learning. Both LKH and LKH-3, with either $α$-measure or POPMUSIC, can be significantly improved by our methods. Extensive experiments on 236 widely-used TSP benchmarks with up to 85,900 cities demonstrate the excellent performance of VSR-LKH. VSR-LKH-3 also significantly outperforms the state-of-the-art heuristics for TSPTW and CTSP.

preprint2022arXiv

Robust Textual Embedding against Word-level Adversarial Attacks

We attribute the vulnerability of natural language processing models to the fact that similar inputs are converted to dissimilar representations in the embedding space, leading to inconsistent outputs, and we propose a novel robust training method, termed Fast Triplet Metric Learning (FTML). Specifically, we argue that the original sample should have similar representation with its adversarial counterparts and distinguish its representation from other samples for better robustness. To this end, we adopt the triplet metric learning into the standard training to pull words closer to their positive samples (i.e., synonyms) and push away their negative samples (i.e., non-synonyms) in the embedding space. Extensive experiments demonstrate that FTML can significantly promote the model robustness against various advanced adversarial attacks while keeping competitive classification accuracy on original samples. Besides, our method is efficient as it only needs to adjust the embedding and introduces very little overhead on the standard training. Our work shows great potential of improving the textual robustness through robust word embedding.

preprint2022arXiv

Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks

Deep neural networks have shown to be very vulnerable to adversarial examples crafted by adding human-imperceptible perturbations to benign inputs. After achieving impressive attack success rates in the white-box setting, more focus is shifted to black-box attacks. In either case, the common gradient-based approaches generally use the $sign$ function to generate perturbations at the end of the process. However, only a few works pay attention to the limitation of the $sign$ function. Deviation between the original gradient and the generated noises may lead to inaccurate gradient update estimation and suboptimal solutions for adversarial transferability, which is crucial for black-box attacks. To address this issue, we propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM) to improve the transferability of the crafted adversarial examples. Specifically, we use data rescaling to substitute the inefficient $sign$ function in gradient-based attacks without extra computational cost. We also propose a Depth First Sampling method to eliminate the fluctuation of rescaling and stabilize the gradient update. Our method can be used in any gradient-based optimizations and is extensible to be integrated with various input transformation or ensemble methods for further improving the adversarial transferability. Extensive experiments on the standard ImageNet dataset show that our S-FGRM could significantly boost the transferability of gradient-based attacks and outperform the state-of-the-art baselines.

preprint2022arXiv

Stochastic Variance Reduced Ensemble Adversarial Attack for Boosting the Adversarial Transferability

The black-box adversarial attack has attracted impressive attention for its practical use in the field of deep learning security. Meanwhile, it is very challenging as there is no access to the network architecture or internal weights of the target model. Based on the hypothesis that if an example remains adversarial for multiple models, then it is more likely to transfer the attack capability to other models, the ensemble-based adversarial attack methods are efficient and widely used for black-box attacks. However, ways of ensemble attack are rather less investigated, and existing ensemble attacks simply fuse the outputs of all the models evenly. In this work, we treat the iterative ensemble attack as a stochastic gradient descent optimization process, in which the variance of the gradients on different models may lead to poor local optima. To this end, we propose a novel attack method called the stochastic variance reduced ensemble (SVRE) attack, which could reduce the gradient variance of the ensemble models and take full advantage of the ensemble attack. Empirical results on the standard ImageNet dataset demonstrate that the proposed method could boost the adversarial transferability and outperforms existing ensemble attacks significantly. Code is available at https://github.com/JHL-HUST/SVRE.

preprint2022arXiv

Tight Lower Complexity Bounds for Strongly Convex Finite-Sum Optimization

Finite-sum optimization plays an important role in the area of machine learning, and hence has triggered a surge of interest in recent years. To address this optimization problem, various randomized incremental gradient methods have been proposed with guaranteed upper and lower complexity bounds for their convergence. Nonetheless, these lower bounds rely on certain conditions: deterministic optimization algorithm, or fixed probability distribution for the selection of component functions. Meanwhile, some lower bounds even do not match the upper bounds of the best known methods in certain cases. To break these limitations, we derive tight lower complexity bounds of randomized incremental gradient methods, including SAG, SAGA, SVRG, and SARAH, for two typical cases of finite-sum optimization. Specifically, our results tightly match the upper complexity of Katyusha or VRADA when each component function is strongly convex and smooth, and tightly match the upper complexity of SDCA without duality and of KatyushaX when the finite-sum function is strongly convex and the component functions are average smooth.

preprint2022arXiv

Triangle Attack: A Query-efficient Decision-based Adversarial Attack

Decision-based attack poses a severe threat to real-world applications since it regards the target model as a black box and only accesses the hard prediction label. Great efforts have been made recently to decrease the number of queries; however, existing decision-based attacks still require thousands of queries in order to generate good quality adversarial examples. In this work, we find that a benign sample, the current and the next adversarial examples can naturally construct a triangle in a subspace for any iterative attacks. Based on the law of sines, we propose a novel Triangle Attack (TA) to optimize the perturbation by utilizing the geometric information that the longer side is always opposite the larger angle in any triangle. However, directly applying such information on the input image is ineffective because it cannot thoroughly explore the neighborhood of the input sample in the high dimensional space. To address this issue, TA optimizes the perturbation in the low frequency space for effective dimensionality reduction owing to the generality of such geometric property. Extensive evaluations on ImageNet dataset show that TA achieves a much higher attack success rate within 1,000 queries and needs a much less number of queries to achieve the same attack success rate under various perturbation budgets than existing decision-based attacks. With such high efficiency, we further validate the applicability of TA on real-world API, i.e., Tencent Cloud API.

preprint2020arXiv

Adaptive Large Neighborhood Search for Circle Bin Packing Problem

We address a new variant of packing problem called the circle bin packing problem (CBPP), which is to find a dense packing of circle items to multiple square bins so as to minimize the number of used bins. To this end, we propose an adaptive large neighborhood search (ALNS) algorithm, which uses our Greedy Algorithm with Corner Occupying Action (GACOA) to construct an initial layout. The greedy solution is usually in a local optimum trap, and ALNS enables multiple neighborhood search that depends on the stochastic annealing schedule to avoid getting stuck in local minimum traps. Specifically, ALNS perturbs the current layout to jump out of a local optimum by iteratively reassigns some circles and accepts the new layout with some probability during the search. The acceptance probability is adjusted adaptively using simulated annealing that fine-tunes the search direction in order to reach the global optimum. We benchmark computational results against GACOA in heterogeneous instances. ALNS always outperforms GACOA in improving the objective function, and in several cases, there is a significant reduction on the number of bins used in the packing.

preprint2020arXiv

AT-GAN: An Adversarial Generator Model for Non-constrained Adversarial Examples

Despite the rapid development of adversarial machine learning, most adversarial attack and defense researches mainly focus on the perturbation-based adversarial examples, which is constrained by the input images. In comparison with existing works, we propose non-constrained adversarial examples, which are generated entirely from scratch without any constraint on the input. Unlike perturbation-based attacks, or the so-called unrestricted adversarial attack which is still constrained by the input noise, we aim to learn the distribution of adversarial examples to generate non-constrained but semantically meaningful adversarial examples. Following this spirit, we propose a novel attack framework called AT-GAN (Adversarial Transfer on Generative Adversarial Net). Specifically, we first develop a normal GAN model to learn the distribution of benign data, and then transfer the pre-trained GAN model to estimate the distribution of adversarial examples for the target model. In this way, AT-GAN can learn the distribution of adversarial examples that is very close to the distribution of real data. To our knowledge, this is the first work of building an adversarial generator model that could produce adversarial examples directly from any input noise. Extensive experiments and visualizations show that the proposed AT-GAN can very efficiently generate diverse adversarial examples that are more realistic to human perception. In addition, AT-GAN yields higher attack success rates against adversarially trained models under white-box attack setting and exhibits moderate transferability against black-box models.

preprint2020arXiv

Dynamic Graph Correlation Learning for Disease Diagnosis with Incomplete Labels

Disease diagnosis on chest X-ray images is a challenging multi-label classification task. Previous works generally classify the diseases independently on the input image without considering any correlation among diseases. However, such correlation actually exists, for example, Pleural Effusion is more likely to appear when Pneumothorax is present. In this work, we propose a Disease Diagnosis Graph Convolutional Network (DD-GCN) that presents a novel view of investigating the inter-dependency among different diseases by using a dynamic learnable adjacency matrix in graph structure to improve the diagnosis accuracy. To learn more natural and reliable correlation relationship, we feed each node with the image-level individual feature map corresponding to each type of disease. To our knowledge, our method is the first to build a graph over the feature maps with a dynamic adjacency matrix for correlation learning. To further deal with a practical issue of incomplete labels, DD-GCN also utilizes an adaptive loss and a curriculum learning strategy to train the model on incomplete labels. Experimental results on two popular chest X-ray (CXR) datasets show that our prediction accuracy outperforms state-of-the-arts, and the learned graph adjacency matrix establishes the correlation representations of different diseases, which is consistent with expert experience. In addition, we apply an ablation study to demonstrate the effectiveness of each component in DD-GCN.

preprint2020arXiv

Hidden Community Detection on Two-layer Stochastic Models: a Theoretical Perspective

Hidden community is a new graph-theoretical concept recently proposed [4], in which the authors also propose a meta-approach called HICODE (Hidden Community Detection) for detecting hidden communities. HICODE is demonstrated through experiments that it is able to uncover previously overshadowed weak layers and uncover both weak and strong layers at a higher accuracy. However, the authors provide no theoretical guarantee for the performance. In this work, we focus on the theoretical analysis of HICODE on synthetic two-layer networks, where layers are independent of each other and each layer is generated by stochastic block model. We bridge their gap through two-layer stochastic block model networks in the following aspects: 1) we show that partitions that locally optimize modularity correspond to grounded layers, indicating modularity-optimizing algorithms can detect strong layers; 2) we prove that when reducing found layers, HICODE increases absolute modularities of all unreduced layers, showing its layer reduction step makes weak layers more detectable. Our work builds a solid theoretical base for HICODE, demonstrating that it is promising in uncovering both weak and strong layers of communities in two-layer networks.

preprint2020arXiv

Hierarchical hidden community detection for protein complex prediction

Motivation: Discovering functional modules in protein-protein interaction (PPI) networks by optimization methods remains a longstanding challenge in biology. Traditional algorithms simply consider strong protein complexes that can be found in the original network by optimizing some metrics, which causes obstacles for the discovery of weak and hidden complexes shielded by stronger complexes. Also, protein complexes are not only in different density but also in a large range of scale, making it extremely difficult to be detected. Toward this objective, we propose a hierarchical hidden community approach to predict protein complexes. Results: We propose a method called HirHide (Hierarchical Hidden Community Detection), which can be combined with traditional community detection methods to enable them to discover hierarchical hidden communities. It is the first community detection algorithm that can find a hierarchical structure as well as hidden structure. We compare the performance of three traditional methods with their HirHide versions. Experimental results show that the HirHide methods using traditional methods as the base algorithms achieve better performance, sometimes even significantly outperform the baselines.

preprint2020arXiv

Local Generalization and Bucketization Technique for Personalized Privacy Preservation

Anonymization technique has been extensively studied and widely applied for privacy-preserving data publishing. In most previous approaches, a microdata table consists of three categories of attribute: explicit-identifier, quasi-identifier (QI), and sensitive attribute. Actually, different individuals may have different view on the sensitivity of different attributes. Therefore, there is another type of attribute that contains both QI values and sensitive values, namely, semi-sensitive attribute. Based on such observation, we propose a new anonymization technique, called local generalization and bucketization, to prevent identity disclosure and protect the sensitive values on each semi-sensitive attribute and sensitive attribute. The rationale is to use local generalization and local bucketization to divide the tuples into local equivalence groups and partition the sensitive values into local buckets, respectively. The protections of local generalization and local bucketization are independent, so that they can be implemented by appropriate algorithms without weakening other protection, respectively. Besides, the protection of local bucketization for each semi-sensitive attribute and sensitive attribute is also independent. Consequently, local bucketization can comply with various principles in different attributes according to the actual requirements of anonymization. The conducted extensive experiments illustrate the effectiveness of the proposed approach.

preprint2020arXiv

Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks

Deep learning models are vulnerable to adversarial examples crafted by applying human-imperceptible perturbations on benign inputs. However, under the black-box setting, most existing adversaries often have a poor transferability to attack other defense models. In this work, from the perspective of regarding the adversarial example generation as an optimization process, we propose two new methods to improve the transferability of adversarial examples, namely Nesterov Iterative Fast Gradient Sign Method (NI-FGSM) and Scale-Invariant attack Method (SIM). NI-FGSM aims to adapt Nesterov accelerated gradient into the iterative attacks so as to effectively look ahead and improve the transferability of adversarial examples. While SIM is based on our discovery on the scale-invariant property of deep learning models, for which we leverage to optimize the adversarial perturbations over the scale copies of the input images so as to avoid "overfitting" on the white-box model being attacked and generate more transferable adversarial examples. NI-FGSM and SIM can be naturally integrated to build a robust gradient-based attack to generate more transferable adversarial examples against the defense models. Empirical results on ImageNet dataset demonstrate that our attack methods exhibit higher transferability and achieve higher attack success rates than state-of-the-art gradient-based attacks.

preprint2020arXiv

Probability Learning based Tabu Search for the Budgeted Maximum Coverage Problem

Knapsack problems are classic models that can formulate a wide range of applications. In this work, we deal with the Budgeted Maximum Coverage Problem (BMCP), which is a generalized 0-1 knapsack problem. Given a set of items with nonnegative weights and a set of elements with nonnegative profits, where each item is composed of a subset of elements, BMCP aims to pack a subset of items in a capacity-constrained knapsack such that the total weight of the selected items does not exceed the knapsack capacity, and the total profit of the associated elements is maximized. Note that each element is counted once even if it is covered multiple times. BMCP is closely related to the Set-Union Knapsack Problem (SUKP) that is well studied in recent years. As the counterpart problem of SUKP, however, BMCP was introduced early in 1999 but since then it has been rarely studied, especially there is no practical algorithm proposed. By combining the reinforcement learning technique to the local search procedure, we propose a probability learning based tabu search (PLTS) algorithm for addressing this NP-hard problem. The proposed algorithm iterates through two distinct phases, namely a tabu search phase and a probability learning based perturbation phase. As there is no benchmark instances proposed in the literature, we generate 30 benchmark instances with varied properties. Experimental results demonstrate that our PLTS algorithm significantly outperforms the general CPLEX solver for solving the challenging BMCP in terms of the solution quality.

preprint2020arXiv

Robust Local Features for Improving the Generalization of Adversarial Training

Adversarial training has been demonstrated as one of the most effective methods for training robust models to defend against adversarial examples. However, adversarially trained models often lack adversarially robust generalization on unseen testing data. Recent works show that adversarially trained models are more biased towards global structure features. Instead, in this work, we would like to investigate the relationship between the generalization of adversarial training and the robust local features, as the robust local features generalize well for unseen shape variation. To learn the robust local features, we develop a Random Block Shuffle (RBS) transformation to break up the global structure features on normal adversarial examples. We continue to propose a new approach called Robust Local Features for Adversarial Training (RLFAT), which first learns the robust local features by adversarial training on the RBS-transformed adversarial examples, and then transfers the robust local features into the training of normal adversarial examples. To demonstrate the generality of our argument, we implement RLFAT in currently state-of-the-art adversarial training frameworks. Extensive experiments on STL-10, CIFAR-10 and CIFAR-100 show that RLFAT significantly improves both the adversarially robust generalization and the standard generalization of adversarial training. Additionally, we demonstrate that our models capture more local features of the object on the images, aligning better with human perception.

preprint2020arXiv

Single Image Reflection Removal through Cascaded Refinement

We address the problem of removing undesirable reflections from a single image captured through a glass surface, which is an ill-posed, challenging but practically important problem for photo enhancement. Inspired by iterative structure reduction for hidden community detection in social networks, we propose an Iterative Boost Convolutional LSTM Network (IBCLN) that enables cascaded prediction for reflection removal. IBCLN is a cascaded network that iteratively refines the estimates of transmission and reflection layers in a manner that they can boost the prediction quality to each other, and information across steps of the cascade is transferred using an LSTM. The intuition is that the transmission is the strong, dominant structure while the reflection is the weak, hidden structure. They are complementary to each other in a single image and thus a better estimate and reduction on one side from the original image leads to a more accurate estimate on the other side. To facilitate training over multiple cascade steps, we employ LSTM to address the vanishing gradient problem, and propose residual reconstruction loss as further training guidance. Besides, we create a dataset of real-world images with reflection and ground-truth transmission layers to mitigate the problem of insufficient data. Comprehensive experiments demonstrate that the proposed method can effectively remove reflections in real and synthetic images compared with state-of-the-art reflection removal methods.

preprint2020arXiv

Sparse Nonnegative Matrix Factorization for Multiple Local Community Detection

Local community detection consists of finding a group of nodes closely related to the seeds, a small set of nodes of interest. Such group of nodes are densely connected or have a high probability of being connected internally than their connections to other clusters in the network. Existing local community detection methods focus on finding either one local community that all seeds are most likely to be in or finding a single community for each of the seeds. However, a seed member usually belongs to multiple local overlapping communities. In this work, we present a novel method of detecting multiple local communities to which a single seed member belongs. The proposed method consists of three key steps: (1) local sampling with Personalized PageRank (PPR); (2) using the sparseness generated by a sparse nonnegative matrix factorization (SNMF) to estimate the number of communities in the sampled subgraph; (3) using SNMF soft community membership vectors to assign nodes to communities. The proposed method shows favorable accuracy performance and a good conductance when compared to state-of-the-art community detection methods by experiments using a combination of artificial and real-world networks.

preprint2020arXiv

Stochastic Item Descent Method for Large Scale Equal Circle Packing Problem

Stochastic gradient descent (SGD) is a powerful method for large-scale optimization problems in the area of machine learning, especially for a finite-sum formulation with numerous variables. In recent years, mini-batch SGD gains great success and has become a standard technique for training deep neural networks fed with big amount of data. Inspired by its success in deep learning, we apply the idea of SGD with batch selection of samples to a classic optimization problem in decision version. Given $n$ unit circles, the equal circle packing problem (ECPP) asks whether there exist a feasible packing that could put all the circles inside a circular container without overlapping. Specifically, we propose a stochastic item descent method (SIDM) for ECPP in large scale, which randomly divides the unit circles into batches and runs Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm on the corresponding batch function iteratively to speedup the calculation. We also increase the batch size during the batch iterations to gain higher quality solution. Comparing to the current best packing algorithms, SIDM greatly speeds up the calculation of optimization process and guarantees the solution quality for large scale instances with up to 1500 circle items, while the baseline algorithms usually handle about 300 circle items. The results indicate the highly efficiency of SIDM for this classic optimization problem in large scale, and show potential for other large scale classic optimization problems in which gradient descent is used for optimization.