Source author record

Ting Dai

Ting Dai appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Applications Artificial Intelligence Computation and Language Computer Vision Cryptography and Security

Catalog footprint

What is connected

3works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2021arXiv

Repairing Adversarial Texts through Perturbation

It is known that neural networks are subject to attacks through adversarial perturbations, i.e., inputs which are maliciously crafted through perturbations to induce wrong predictions. Furthermore, such attacks are impossible to eliminate, i.e., the adversarial perturbation is still possible after applying mitigation methods such as adversarial training. Multiple approaches have been developed to detect and reject such adversarial inputs, mostly in the image domain. Rejecting suspicious inputs however may not be always feasible or ideal. First, normal inputs may be rejected due to false alarms generated by the detection algorithm. Second, denial-of-service attacks may be conducted by feeding such systems with adversarial inputs. To address the gap, in this work, we propose an approach to automatically repair adversarial texts at runtime. Given a text which is suspected to be adversarial, we novelly apply multiple adversarial perturbation methods in a positive way to identify a repair, i.e., a slightly mutated but semantically equivalent text that the neural network correctly classifies. Our approach has been experimented with multiple models trained for natural language processing tasks and the results show that our approach is effective, i.e., it successfully repairs about 80\% of the adversarial texts. Furthermore, depending on the applied perturbation method, an adversarial text could be repaired in as short as one second on average.

preprint2021arXiv

Rethinking Natural Adversarial Examples for Classification Models

Recently, it was found that many real-world examples without intentional modifications can fool machine learning models, and such examples are called "natural adversarial examples". ImageNet-A is a famous dataset of natural adversarial examples. By analyzing this dataset, we hypothesized that large, cluttered and/or unusual background is an important reason why the images in this dataset are difficult to be classified. We validated the hypothesis by reducing the background influence in ImageNet-A examples with object detection techniques. Experiments showed that the object detection models with various classification models as backbones obtained much higher accuracy than their corresponding classification models. A detection model based on the classification model EfficientNet-B7 achieved a top-1 accuracy of 53.95%, surpassing previous state-of-the-art classification models trained on ImageNet, suggesting that accurate localization information can significantly boost the performance of classification models on ImageNet-A. We then manually cropped the objects in images from ImageNet-A and created a new dataset, named ImageNet-A-Plus. A human test on the new dataset showed that the deep learning-based classifiers still performed quite poorly compared with humans. Therefore, the new dataset can be used to study the robustness of classification models to the internal variance of objects without considering the background disturbance.

preprint2020arXiv

A systematic approach to identify and evaluate missing data patterns and mechanisms in multivariate educational, social, and behavioral research

Methods for addressing missing data have become much more accessible to applied researchers. However, little guidance exists to help researchers systematically identify plausible missing data mechanisms in order to ensure that these methods are appropriately applied. Two considerations motivate the present study. First, psychological research is typically characterized by a large number of potential response variables that may be observed across multiple waves of data collection. This situation makes it more challenging to identify plausible missing data mechanisms than is the case in other fields such as biostatistics where a small number of dependent variables is typically of primary interest and the main predictor of interest is statistically independent of other covariates. Second, there is growing recognition of the importance of systematic approaches to sensitivity analyses for treatment of missing data in psychological science. We develop and apply a systematic approach for reducing a large number of observed patterns and demonstrate how these can be used to explore potential missing data mechanisms within multivariate contexts. A large scale simulation study is used to guide suggestions for which approaches are likely to be most accurate as a function of sample size, number of factors, number of indicators per factor, and proportion of missing data. Three applications of this approach to data examples suggest that the method appears useful in practice.