Source author record

Pengcheng Zhang

Pengcheng Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Software Engineering Computer Vision

Catalog footprint

What is connected

4works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

A Framework and DataSet for Bugs in Ethereum Smart Contracts

Ethereum is the largest blockchain platform that supports smart contracts. Users deploy smart contracts by publishing the smart contract's bytecode to the blockchain. Since the data in the blockchain cannot be modified, even if these contracts contain bugs, it is not possible to patch deployed smart contracts with code updates. Moreover, there is currently neither a comprehensive classification framework for Ethereum smart contract bugs, nor detailed criteria for detecting bugs in smart contracts, making it difficult for developers to fully understand the negative effects of bugs and design new approaches to detect bugs. In this paper, to fill the gap, we first collect as many smart contract bugs as possible from multiple sources and divide these bugs into 9 categories by extending the IEEE Standard Classification for Software Anomalies. Then, we design the criteria for detecting each kind of bugs, and construct a dataset of smart contracts covering all kinds of bugs. With our framework and dataset, developers can learn smart contract bugs and develop new tools to detect and locate bugs in smart contracts. Moreover, we evaluate the state-of-the-art tools for smart contract analysis with our dataset and obtain some interesting findings: 1) Mythril, Slither and Remix are the most worthwhile combination of analysis tools. 2) There are still 10 kinds of bugs that cannot be detected by any analysis tool.

preprint2020arXiv

ADF-GA: Data Flow Criterion Based Test Case Generation for Ethereum Smart Contracts

Testing is an important technique to improve the quality of Ethereum smart contract programs. However, current work on testing smart contract only focus on static problems of smart contract programs. A data flow oriented test case generation approach for dynamic testing of smart contract programs is still missing. To address this problem, this paper proposes a novel test case generation approach, called ADF-GA (All-uses Data Flow criterion based test case generation using Genetic Algorithm), for Solidity based Ethereum smart contract programs. ADF-GA aims to efficiently generate a valid set of test cases via three stages. First, the corresponding program control flow graph is constructed from the source codes. Second, the generated control flow graph is analyzed to obtain the variable information in the Solidity programs, locate the require statements, and also get the definition-use pairs to be tested. Finally, a genetic algorithm is used to generate test cases, in which an improved fitness function is proposed to calculate the definition-use pairs coverage of each test case with program instrumentation. Experimental studies are performed on several representative Solidity programs. The results show that ADF-GA can effectively generate test cases, achieve better coverage, and reduce the number of iterations in genetic algorithm.

preprint2020arXiv

CAGFuzz: Coverage-Guided Adversarial Generative Fuzzing Testing of Deep Learning Systems

Deep Learning systems (DL) based on Deep Neural Networks (DNNs) are more and more used in various aspects of our life, including unmanned vehicles, speech processing, and robotics. However, due to the limited dataset and the dependence on manual labeling data, DNNs often fail to detect their erroneous behaviors, which may lead to serious problems. Several approaches have been proposed to enhance the input examples for testing DL systems. However, they have the following limitations. First, they design and generate adversarial examples from the perspective of model, which may cause low generalization ability when they are applied to other models. Second, they only use surface feature constraints to judge the difference between the adversarial example generated and the original example. The deep feature constraints, which contain high-level semantic information, such as image object category and scene semantics are completely neglected. To address these two problems, in this paper, we propose CAGFuzz, a Coverage-guided Adversarial Generative Fuzzing testing approach, which generates adversarial examples for a targeted DNN to discover its potential defects. First, we train an adversarial case generator (AEG) from the perspective of general data set. Second, we extract the depth features of the original and adversarial examples, and constrain the adversarial examples by cosine similarity to ensure that the semantic information of adversarial examples remains unchanged. Finally, we retrain effective adversarial examples to improve neuron testing coverage rate. Based on several popular data sets, we design a set of dedicated experiments to evaluate CAGFuzz. The experimental results show that CAGFuzz can improve the neuron coverage rate, detect hidden errors, and also improve the accuracy of the target DNN.

preprint2020arXiv

Quality Assurance Technologies of Big Data Applications: A Systematic Literature Review

Big data applications are currently used in many application domains, ranging from statistical applications to prediction systems and smart cities. However, the quality of these applications is far from perfect, leading to a large amount of issues and problems. Consequently, assuring the overall quality for big data applications plays an increasingly important role. This paper aims at summarizing and assessing existing quality assurance (QA) technologies addressing quality issues in big data applications. We have conducted a systematic literature review (SLR) by searching major scientific databases, resulting in 83 primary and relevant studies on QA technologies for big data applications. The SLR results reveal the following main findings: 1) the impact of the big data attributes of volume, velocity, and variety on the quality of big data applications; 2) the quality attributes that determine the quality for big data applications include correctness, performance, availability, scalability, reliability and so on; 3) the existing QA technologies, including analysis, specification, model-driven architecture (MDA), verification, fault tolerance, testing, monitoring and fault & failure prediction; 4) existing strengths and limitations of each kind of QA technology; 5) the existing empirical evidence of each QA technology. This study provides a solid foundation for research on QA technologies of big data applications. However, many challenges of big data applications regarding quality still remain.

Pengcheng Zhang

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

A Framework and DataSet for Bugs in Ethereum Smart Contracts

ADF-GA: Data Flow Criterion Based Test Case Generation for Ethereum Smart Contracts

CAGFuzz: Coverage-Guided Adversarial Generative Fuzzing Testing of Deep Learning Systems

Quality Assurance Technologies of Big Data Applications: A Systematic Literature Review