Source author record

Xiapu Luo

Xiapu Luo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Cryptography and Security Software Engineering Machine Learning Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

22works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

TransLibEval: Demystify Large Language Models' Capability in Third-party Library-targeted Code Translation

In recent years, Large Language Models (LLMs) have been widely studied in the code translation field on the method, class, and even repository levels. However, most of these benchmarks are limited in terms of Third-Party Library (TPL) categories and scales, making TPL-related errors hard to expose and hindering the development of targeted solutions. Considering the high dependence (over 90%) on TPLs in practical programming, demystifying and analyzing LLMs' code translation performance involving various TPLs becomes imperative. To address this gap, we construct TransLibEval, the first benchmark dedicated to library-centric code translation. It consists of 200 real-world tasks across Python, Java, and C++, each explicitly involving TPLs from diverse categories such as data processing, machine learning, and web development, with comprehensive dependency coverage and high-coverage test suites. We evaluate seven recent LLMs of commercial, general, and code-specialized families under six translation strategies of three categories: Direct, IR-guided, and Retrieval-augmented. Experimental results show a dramatic performance drop compared with library-free settings (average CA decline over 60%), while diverse strategies demonstrate heterogeneous advantages. Furthermore, we analyze 4,831 failed cases from GPT-4o, one of the State-of-the-Art (SOTA) LLMs, revealing numerous third-party reference errors that were obscured previously. These findings highlight the unique challenges of library-centric translation and provide practical guidance for improving TPL-aware code intelligence.

preprint2024arXiv

Coverage Goal Selector for Combining Multiple Criteria in Search-Based Unit Test Generation

Unit testing is critical to the software development process, ensuring the correctness of basic programming units in a program (e.g., a method). Search-based software testing (SBST) is an automated approach to generating test cases. SBST generates test cases with genetic algorithms by specifying the coverage criterion (e.g., branch coverage). However, a good test suite must have different properties, which cannot be captured using an individual coverage criterion. Therefore, the state-of-the-art approach combines multiple criteria to generate test cases. Since combining multiple coverage criteria brings multiple objectives for optimization, it hurts the test suites' coverage for certain criteria compared with using the single criterion. To cope with this problem, we propose a novel approach named \textbf{smart selection}. Based on the coverage correlations among criteria and the subsumption relationships among coverage goals, smart selection selects a subset of coverage goals to reduce the number of optimization objectives and avoid missing any properties of all criteria. We conduct experiments to evaluate smart selection on $400$ Java classes with three state-of-the-art genetic algorithms under the $2$-minute budget. On average, smart selection outperforms combining all goals on $65.1\%$ of the classes having significant differences between the two approaches. Secondly, we conduct experiments to verify our assumptions about coverage criteria relationships. Furthermore, we assess the coverage performance of smart selection under varying budgets of $5$, $8$, and $10$ minutes and explore its effect on bug detection, confirming the advantage of smart selection over combining all goals.

preprint2024arXiv

MalModel: Hiding Malicious Payload in Mobile Deep Learning Models with Black-box Backdoor Attack

Mobile malware has become one of the most critical security threats in the era of ubiquitous mobile computing. Despite the intensive efforts from security experts to counteract it, recent years have still witnessed a rapid growth of identified malware samples. This could be partly attributed to the newly-emerged technologies that may constantly open up under-studied attack surfaces for the adversaries. One typical example is the recently-developed mobile machine learning (ML) framework that enables storing and running deep learning (DL) models on mobile devices. Despite obvious advantages, this new feature also inadvertently introduces potential vulnerabilities (e.g., on-device models may be modified for malicious purposes). In this work, we propose a method to generate or transform mobile malware by hiding the malicious payloads inside the parameters of deep learning models, based on a strategy that considers four factors (layer type, layer number, layer coverage and the number of bytes to replace). Utilizing the proposed method, we can run malware in DL mobile applications covertly with little impact on the model performance (i.e., as little as 0.4% drop in accuracy and at most 39ms latency overhead).

preprint2024arXiv

VGX: Large-Scale Sample Generation for Boosting Learning-Based Software Vulnerability Analyses

Accompanying the successes of learning-based defensive software vulnerability analyses is the lack of large and quality sets of labeled vulnerable program samples, which impedes further advancement of those defenses. Existing automated sample generation approaches have shown potentials yet still fall short of practical expectations due to the high noise in the generated samples. This paper proposes VGX, a new technique aimed for large-scale generation of high-quality vulnerability datasets. Given a normal program, VGX identifies the code contexts in which vulnerabilities can be injected, using a customized Transformer featured with a new value-flowbased position encoding and pre-trained against new objectives particularly for learning code structure and context. Then, VGX materializes vulnerability-injection code editing in the identified contexts using patterns of such edits obtained from both historical fixes and human knowledge about real-world vulnerabilities. Compared to four state-of-the-art (SOTA) baselines (pattern-, Transformer-, GNN-, and pattern+Transformer-based), VGX achieved 99.09-890.06% higher F1 and 22.45%-328.47% higher label accuracy. For in-the-wild sample production, VGX generated 150,392 vulnerable samples, from which we randomly chose 10% to assess how much these samples help vulnerability detection, localization, and repair. Our results show SOTA techniques for these three application tasks achieved 19.15-330.80% higher F1, 12.86-19.31% higher top-10 accuracy, and 85.02-99.30% higher top-50 accuracy, respectively, by adding those samples to their original training data. These samples also helped a SOTA vulnerability detector discover 13 more real-world vulnerabilities (CVEs) in critical systems (e.g., Linux kernel) that would be missed by the original model.

preprint2022arXiv

A Survey on EOSIO Systems Security: Vulnerability, Attack, and Mitigation

EOSIO, as one of the most representative blockchain 3.0 platforms, involves lots of new features, e.g., delegated proof of stake consensus algorithm and updatable smart contracts, enabling a much higher transaction per second and the prosperous decentralized applications (DApps) ecosystem. According to the statistics, it has reached nearly 18 billion USD, taking the third place of the whole cryptocurrency market, following Bitcoin and Ethereum. Loopholes, however, are hiding in the shadows. EOSBet, a famous gambling DApp, was attacked twice within a month and lost more than 1 million USD. No existing work has surveyed the EOSIO from a security researcher perspective. To fill this gap, in this paper, we collected all occurred attack events against EOSIO, and systematically studied their root causes, i.e., vulnerabilities lurked in all relying components for EOSIO, as well as the corresponding attacks and mitigations. We also summarized some best practices for DApp developers, EOSIO official team, and security researchers for future directions.

preprint2022arXiv

Aper: Evolution-Aware Runtime Permission Misuse Detection for Android Apps

The Android platform introduces the runtime permission model in version 6.0. The new model greatly improves data privacy and user experience, but brings new challenges for app developers. First, it allows users to freely revoke granted permissions. Hence, developers cannot assume that the permissions granted to an app would keep being granted. Instead, they should make their apps carefully check the permission status before invoking dangerous APIs. Second, the permission specification keeps evolving, bringing new types of compatibility issues into the ecosystem. To understand the impact of the challenges, we conducted an empirical study on 13,352 popular Google Play apps. We found that 86.0% apps used dangerous APIs asynchronously after permission management and 61.2% apps used evolving dangerous APIs. If an app does not properly handle permission revocations or platform differences, unexpected runtime issues may happen and even cause app crashes. We call such Android Runtime Permission issues as ARP bugs. Unfortunately, existing runtime permission issue detection tools cannot effectively deal with the ARP bugs induced by asynchronous permission management and permission specification evolution. To fill the gap, we designed a static analyzer, Aper, that performs reaching definition and dominator analysis on Android apps to detect the two types of ARP bugs. To compare Aper with existing tools, we built a benchmark, ARPfix, from 60 real ARP bugs. Our experiment results show that Aper significantly outperforms two academic tools, ARPDroid and RevDroid, and an industrial tool, Lint, on ARPfix, with an average improvement of 46.3% on F1-score. In addition, Aper successfully found 34 ARP bugs in 214 opensource Android apps, most of which can result in abnormal app behaviors (such as app crashes) according to our manual validation.

preprint2022arXiv

BinarizedAttack: Structural Poisoning Attacks to Graph-based Anomaly Detection

Graph-based Anomaly Detection (GAD) is becoming prevalent due to the powerful representation abilities of graphs as well as recent advances in graph mining techniques. These GAD tools, however, expose a new attacking surface, ironically due to their unique advantage of being able to exploit the relations among data. That is, attackers now can manipulate those relations (i.e., the structure of the graph) to allow some target nodes to evade detection. In this paper, we exploit this vulnerability by designing a new type of targeted structural poisoning attacks to a representative regression-based GAD system termed OddBall. Specially, we formulate the attack against OddBall as a bi-level optimization problem, where the key technical challenge is to efficiently solve the problem in a discrete domain. We propose a novel attack method termed BinarizedAttack based on gradient descent. Comparing to prior arts, BinarizedAttack can better use the gradient information, making it particularly suitable for solving combinatorial optimization problems. Furthermore, we investigate the attack transferability of BinarizedAttack by employing it to attack other representation-learning-based GAD systems. Our comprehensive experiments demonstrate that BinarizedAttack is very effective in enabling target nodes to evade graph-based anomaly detection tools with limited attackers' budget, and in the black-box transfer attack setting, BinarizedAttack is also tested effective and in particular, can significantly change the node embeddings learned by the GAD systems. Our research thus opens the door to studying a new type of attack against security analytic tools that rely on graph data.

preprint2022arXiv

iLibScope: Reliable Third-Party Library Detection for iOS Mobile Apps

Vetting security impacts introduced by third-party libraries in iOS apps requires a reliable library detection technique. Especially when a new vulnerability (or a privacy-invasive behavior) was discovered in a third-party library, there is a practical need to precisely identify the existence of libraries and their versions for iOS apps. However, few studies have been proposed to tackle this problem, and they all suffer from the code duplication problem in different libraries. In this paper, we focus on third-party library detection in iOS apps. Given an app, we aim to identify the integrated libraries and pinpoint their versions (or the version range).To this end, we first conduct an in-depth study on iOS third-party libraries to demystify the code duplication challenge. By doing so, we have two key observations: 1) even though two libraries can share classes, the shared classes cannot be integrated into an app simultaneously without causing a class name conflict; and 2) code duplication between multiple versions of two libraries can vary. Based on these findings, we propose a novel profile-based similarity comparison approach to perform the detection. Specifically, we build a library database consists of original library binaries with distinct versions. After extracting profiles for each library version and the target app, we conduct a similarity comparison to find the best matches. We implemented this approach in iLibScope. We built a benchmark consists of 5,807 apps with 10,495 library integrations and applied our tool to it. Our evaluation shows that iLibScope achieves a recall exceeds 99% and a precision exceeds 97% for library detection. We also applied iLibScope to detect the presence of well-known vulnerable third-party libraries in real-world iOS mobile apps to show the promising usage of our tool. It successfully identified 405 vulnerable library usage from 4,249 apps.

preprint2021arXiv

CHAMP: Characterizing Undesired App Behaviors from User Comments based on Market Policies

Millions of mobile apps have been available through various app markets. Although most app markets have enforced a number of automated or even manual mechanisms to vet each app before it is released to the market, thousands of low-quality apps still exist in different markets, some of which violate the explicitly specified market policies.In order to identify these violations accurately and timely, we resort to user comments, which can form an immediate feedback for app market maintainers, to identify undesired behaviors that violate market policies, including security-related user concerns. Specifically, we present the first large-scale study to detect and characterize the correlations between user comments and market policies. First, we propose CHAMP, an approach that adopts text mining and natural language processing (NLP) techniques to extract semantic rules through a semi-automated process, and classifies comments into 26 pre-defined types of undesired behaviors that violate market policies. Our evaluation on real-world user comments shows that it achieves both high precision and recall ($>0.9$) in classifying comments for undesired behaviors. Then, we curate a large-scale comment dataset (over 3 million user comments) from apps in Google Play and 8 popular alternative Android app markets, and apply CHAMP to understand the characteristics of undesired behavior comments in the wild. The results confirm our speculation that user comments can be used to pinpoint suspicious apps that violate policies declared by app markets. The study also reveals that policy violations are widespread in many app markets despite their extensive vetting efforts. CHAMP can be a \textit{whistle blower} that assigns policy-violation scores and identifies most informative comments for apps.

preprint2020arXiv

A Framework and DataSet for Bugs in Ethereum Smart Contracts

Ethereum is the largest blockchain platform that supports smart contracts. Users deploy smart contracts by publishing the smart contract's bytecode to the blockchain. Since the data in the blockchain cannot be modified, even if these contracts contain bugs, it is not possible to patch deployed smart contracts with code updates. Moreover, there is currently neither a comprehensive classification framework for Ethereum smart contract bugs, nor detailed criteria for detecting bugs in smart contracts, making it difficult for developers to fully understand the negative effects of bugs and design new approaches to detect bugs. In this paper, to fill the gap, we first collect as many smart contract bugs as possible from multiple sources and divide these bugs into 9 categories by extending the IEEE Standard Classification for Software Anomalies. Then, we design the criteria for detecting each kind of bugs, and construct a dataset of smart contracts covering all kinds of bugs. With our framework and dataset, developers can learn smart contract bugs and develop new tools to detect and locate bugs in smart contracts. Moreover, we evaluate the state-of-the-art tools for smart contract analysis with our dataset and obtain some interesting findings: 1) Mythril, Slither and Remix are the most worthwhile combination of analysis tools. 2) There are still 10 kinds of bugs that cannot be detected by any analysis tool.

preprint2020arXiv

AdvMind: Inferring Adversary Intent of Black-Box Attacks

Deep neural networks (DNNs) are inherently susceptible to adversarial attacks even under black-box settings, in which the adversary only has query access to the target models. In practice, while it may be possible to effectively detect such attacks (e.g., observing massive similar but non-identical queries), it is often challenging to exactly infer the adversary intent (e.g., the target class of the adversarial example the adversary attempts to craft) especially during early stages of the attacks, which is crucial for performing effective deterrence and remediation of the threats in many scenarios. In this paper, we present AdvMind, a new class of estimation models that infer the adversary intent of black-box adversarial attacks in a robust and prompt manner. Specifically, to achieve robust detection, AdvMind accounts for the adversary adaptiveness such that her attempt to conceal the target will significantly increase the attack cost (e.g., in terms of the number of queries); to achieve prompt detection, AdvMind proactively synthesizes plausible query results to solicit subsequent queries from the adversary that maximally expose her intent. Through extensive empirical evaluation on benchmark datasets and state-of-the-art black-box attacks, we demonstrate that on average AdvMind detects the adversary intent with over 75% accuracy after observing less than 3 query batches and meanwhile increases the cost of adaptive attacks by over 60%. We further discuss the possible synergy between AdvMind and other defense methods against black-box adversarial attacks, pointing to several promising research directions.

preprint2020arXiv

An Empirical Evaluation of GDPR Compliance Violations in Android mHealth Apps

The purpose of the General Data Protection Regulation (GDPR) is to provide improved privacy protection. If an app controls personal data from users, it needs to be compliant with GDPR. However, GDPR lists general rules rather than exact step-by-step guidelines about how to develop an app that fulfills the requirements. Therefore, there may exist GDPR compliance violations in existing apps, which would pose severe privacy threats to app users. In this paper, we take mobile health applications (mHealth apps) as a peephole to examine the status quo of GDPR compliance in Android apps. We first propose an automated system, named \mytool, to bridge the semantic gap between the general rules of GDPR and the app implementations by identifying the data practices declared in the app privacy policy and the data relevant behaviors in the app code. Then, based on \mytool, we detect three kinds of GDPR compliance violations, including the incompleteness of privacy policy, the inconsistency of data collections, and the insecurity of data transmission. We perform an empirical evaluation of 796 mHealth apps. The results reveal that 189 (23.7\%) of them do not provide complete privacy policies. Moreover, 59 apps collect sensitive data through different measures, but 46 (77.9\%) of them contain at least one inconsistent collection behavior. Even worse, among the 59 apps, only 8 apps try to ensure the transmission security of collected data. However, all of them contain at least one encryption or SSL misuse. Our work exposes severe privacy issues to raise awareness of privacy protection for app users and developers.

preprint2020arXiv

AxeChain: A Secure and Decentralized blockchain for solving Easily-Verifiable problems

While Proof-of-Work (PoW) is the most widely used consensus mechanism for blockchain, it received harsh criticism due to its massive waste of energy for meaningless hash calculation. Some studies have introduced Proof-of-Stake to address this issue. However, such protocols widen the gap between rich and poor and in the worst case lead to an oligopoly, where the rich control the entire network. Other studies have attempted to translate the energy consumption of PoW into useful work, but they have many limitations, such as narrow application scope, serious security issues and impractical incentive model. In this paper, we introduce AxeChain, which can use the computing power of blockchain to solve practical problems raised by users without greatly compromising decentralization or security. AxeChain achieves this by coupling hard problem solving with PoW mining. We model the security of AxeChain and derive a balance curve between power utilization and system security. That is, under the reasonable assumption that the attack power does not exceed 1/3 of the total power, 1/2 of total power can be safely used to solve practical problems. We also design a novel incentive model based on the amount of work involved in problem solving, balancing the interests of both the users and miners. Moreover, our experimental results show that AxeChain provides strong security guarantees, no matter what kind of problem is submitted.

preprint2020arXiv

Characterizing Cryptocurrency Exchange Scams

As the indispensable trading platforms of the ecosystem, hundreds of cryptocurrency exchanges are emerging to facilitate the trading of digital assets. While, it also attracts the attentions of attackers. A number of scam attacks were reported targeting cryptocurrency exchanges, leading to a huge mount of financial loss. However, no previous work in our research community has systematically studied this problem. In this paper, we make the first effort to identify and characterize the cryptocurrency exchange scams. We first identify over 1,500 scam domains and over 300 fake apps, by collecting existing reports and using typosquatting generation techniques. Then we investigate the relationship between them, and identify 94 scam domain families and 30 fake app families. We further characterize the impacts of such scams, and reveal that these scams have incurred financial loss of 520k US dollars at least. We further observe that the fake apps have been sneaked to major app markets (including Google Play) to infect unsuspicious users. Our findings demonstrate the urgency to identify and prevent cryptocurrency exchange scams. To facilitate future research, we have publicly released all the identified scam domains and fake apps to the community.

preprint2020arXiv

Characterizing EOSIO Blockchain

EOSIO has become one of the most popular blockchain platforms since its mainnet launch in June 2018. In contrast to the traditional PoW-based systems (e.g., Bitcoin and Ethereum), which are limited by low throughput, EOSIO is the first high throughput Delegated Proof of Stake system that has been widely adopted by many applications. Although EOSIO has millions of accounts and billions of transactions, little is known about its ecosystem, especially related to security and fraud. In this paper, we perform a large-scale measurement study of the EOSIO blockchain and its associated DApps. We gather a large-scale dataset of EOSIO and characterize activities including money transfers, account creation and contract invocation. Using our insights, we then develop techniques to automatically detect bots and fraudulent activity. We discover thousands of bot accounts (over 30\% of the accounts in the platform) and a number of real-world attacks (301 attack accounts). By the time of our study, 80 attack accounts we identified have been confirmed by DApp teams, causing 828,824 EOS tokens losses (roughly 2.6 million US\$) in total.

preprint2020arXiv

Characterizing Erasable Accounts in Ethereum

Being the most popular permissionless blockchain that supports smart contracts, Ethereum allows any user to create accounts on it. However, not all accounts matter. For example, the accounts due to attacks can be removed. In this paper, we conduct the first investigation on erasable accounts that can be removed to save system resources and even users' money (i.e., ETH or gas). In particular, we propose and develop a novel tool named GLASER, which analyzes the State DataBase of Ethereum to discover five kinds of erasable accounts. The experimental results show that GLASER can accurately reveal 508,482 erasable accounts and these accounts lead to users wasting more than 106 million dollars. GLASER can help stop further economic loss caused by these detected accounts. Moreover, GLASER characterizes the attacks/behaviors related to detected erasable accounts through graph analysis.

preprint2020arXiv

Feature Location Benchmark for Decomposing and Reusing Android Apps

Software reuse enables developers to reuse architecture, programs and other software artifacts. Realizing a systematical reuse in software brings a large amount of benefits for stakeholders, including lower maintenance efforts, lower development costs, and time to market. Unfortunately, currently implementing a framework for large-scale software reuse in Android apps is still a huge problem, regarding the complexity of the task and lacking of practical technical support from either tools or domain experts. Therefore, proposing a feature location benchmark for apps will help developers either optimize their feature location techniques or reuse the assets created in the benchmark for reusing. In this paper, we release a feature location benchmark, which can be used for those developers, who intend to compose software product lines (SPL) and release reuse in apps. The benchmark not only contributes to the research community for reuse research, but also helps participants in industry for optimizing their architecture and enhancing modularity. In addition, we also develop an Android Studio plugin named caIDE for developers to view and operate on the benchmark.

preprint2020arXiv

MadDroid: Characterising and Detecting Devious Ad Content for Android Apps

Advertisement drives the economy of the mobile app ecosystem. As a key component in the mobile ad business model, mobile ad content has been overlooked by the research community, which poses a number of threats, e.g., propagating malware and undesirable contents. To understand the practice of these devious ad behaviors, we perform a large-scale study on the app contents harvested through automated app testing. In this work, we first provide a comprehensive categorization of devious ad contents, including five kinds of behaviors belonging to two categories: \emph{ad loading content} and \emph{ad clicking content}. Then, we propose MadDroid, a framework for automated detection of devious ad contents. MadDroid leverages an automated app testing framework with a sophisticated ad view exploration strategy for effectively collecting ad-related network traffic and subsequently extracting ad contents. We then integrate dedicated approaches into the framework to identify devious ad contents. We have applied MadDroid to 40,000 Android apps and found that roughly 6\% of apps deliver devious ad contents, e.g., distributing malicious apps that cannot be downloaded via traditional app markets. Experiment results indicate that devious ad contents are prevalent, suggesting that our community should invest more effort into the detection and mitigation of devious ads towards building a trustworthy mobile advertising ecosystem.

preprint2020arXiv

Security Analysis of EOSIO Smart Contracts

The EOSIO blockchain, one of the representative Delegated Proof-of-Stake (DPoS) blockchain platforms, has grown rapidly recently. Meanwhile, a number of vulnerabilities and high-profile attacks against top EOSIO DApps and their smart contracts have also been discovered and observed in the wild, resulting in serious financial damages. Most of EOSIO's smart contracts are not open-sourced and they are typically compiled to WebAssembly (Wasm) bytecode, thus making it challenging to analyze and detect the presence of possible vulnerabilities. In this paper, we propose EOSAFE, the first static analysis framework that can be used to automatically detect vulnerabilities in EOSIO smart contracts at the bytecode level. Our framework includes a practical symbolic execution engine for Wasm, a customized library emulator for EOSIO smart contracts, and four heuristics-driven detectors to identify the presence of four most popular vulnerabilities in EOSIO smart contracts. Experiment results suggest that EOSAFE achieves promising results in detecting vulnerabilities, with an F1-measure of 98%. We have applied EOSAFE to all active 53,666 smart contracts in the ecosystem (as of November 15, 2019). Our results show that over 25% of the smart contracts are vulnerable. We further analyze possible exploitation attempts against these vulnerable smart contracts and identify 48 in-the-wild attacks (25 of them have been confirmed by DApp developers), resulting in financial loss of at least 1.7 million USD.

preprint2020arXiv

STAN: Towards Describing Bytecodes of Smart Contract

More than eight million smart contracts have been deployed into Ethereum, which is the most popular blockchain that supports smart contract. However, less than 1% of deployed smart contracts are open-source, and it is difficult for users to understand the functionality and internal mechanism of those closed-source contracts. Although a few decompilers for smart contracts have been recently proposed, it is still not easy for users to grasp the semantic information of the contract, not to mention the potential misleading due to decompilation errors. In this paper, we propose the first system named STAN to generate descriptions for the bytecodes of smart contracts to help users comprehend them. In particular, for each interface in a smart contract, STAN can generate four categories of descriptions, including functionality description, usage description, behavior description, and payment description, by leveraging symbolic execution and NLP (Natural Language Processing) techniques. Extensive experiments show that STAN can generate adequate, accurate, and readable descriptions for contract's bytecodes, which have practical value for users.

preprint2014arXiv

A Sink-driven Approach to Detecting Exposed Component Vulnerabilities in Android Apps

Android apps could expose their components for cooperating with other apps. This convenience, however, makes apps susceptible to the exposed component vulnerability (ECV), in which a dangerous API (commonly known as sink) inside its component can be triggered by other (malicious) apps. In the prior works, detecting these ECVs use a set of sinks pertaining to the ECVs under detection. In this paper, we argue that a more comprehensive and effective approach should start by a systematic selection and classification of vulnerability-specific sinks (VSinks). The set of VSinks is much larger than those used in the previous works. Based on these VSinks, our sink-driven approach can detect different kinds of ECVs in an app in two steps. First, VSinks and their categories are identified through a typical forward reachability analysis. Second, based on each VSink's category, a corresponding detection method is used to identify the ECV via a customized backward dataflow analysis. We also design a semi-auto guided analysis and validation capability for system-only broadcast checking to remove some false positives. We implement our sink-driven approach in a tool called ECVDetector and evaluate it with the top 1K Android apps. Using ECVDetector we successfully identify a total of 49 vulnerable apps across all four ECV categories we have defined. To our knowledge, most of them are previously undisclosed, such as the very popular Go SMS Pro and Clean Master. Moreover, the performance of ECVDetector is high, requiring only 9.257 seconds on average to process each component.

preprint2013arXiv

STor: Social Network based Anonymous Communication in Tor

Anonymity networks hide user identities with the help of relayed anonymity routers. However, the state-of-the-art anonymity networks do not provide an effective trust model. As a result, users cannot circumvent malicious or vulnerable routers, thus making them susceptible to malicious router based attacks (e.g., correlation attacks). In this paper, we propose a novel social network based trust model to help anonymity networks circumvent malicious routers and obtain secure anonymity. In particular, we design an input independent fuzzy model to determine trust relationships between friends based on qualitative and quantitative social attributes, both of which can be readily obtained from existing social networks. Moreover, we design an algorithm for propagating trust over an anonymity network. We integrate these two elements in STor, a novel social network based Tor. We have implemented STor by modifying the Tor's source code and conducted experiments on PlanetLab to evaluate the effectiveness of STor. Both simulation and PlanetLab experiment results have demonstrated that STor can achieve secure anonymity by establishing trust-based circuits in a distributed way. Although the design of STor is based on Tor network, the social network based trust model can be adopted by other anonymity networks.

Xiapu Luo

What is connected

Connect this record

See the researcher in context

Building this map preview

22 published item(s)

TransLibEval: Demystify Large Language Models' Capability in Third-party Library-targeted Code Translation

Coverage Goal Selector for Combining Multiple Criteria in Search-Based Unit Test Generation

MalModel: Hiding Malicious Payload in Mobile Deep Learning Models with Black-box Backdoor Attack

VGX: Large-Scale Sample Generation for Boosting Learning-Based Software Vulnerability Analyses

A Survey on EOSIO Systems Security: Vulnerability, Attack, and Mitigation

Aper: Evolution-Aware Runtime Permission Misuse Detection for Android Apps

BinarizedAttack: Structural Poisoning Attacks to Graph-based Anomaly Detection

iLibScope: Reliable Third-Party Library Detection for iOS Mobile Apps

CHAMP: Characterizing Undesired App Behaviors from User Comments based on Market Policies

A Framework and DataSet for Bugs in Ethereum Smart Contracts

AdvMind: Inferring Adversary Intent of Black-Box Attacks

An Empirical Evaluation of GDPR Compliance Violations in Android mHealth Apps

AxeChain: A Secure and Decentralized blockchain for solving Easily-Verifiable problems

Characterizing Cryptocurrency Exchange Scams

Characterizing EOSIO Blockchain

Characterizing Erasable Accounts in Ethereum

Feature Location Benchmark for Decomposing and Reusing Android Apps

MadDroid: Characterising and Detecting Devious Ad Content for Android Apps

Security Analysis of EOSIO Smart Contracts

STAN: Towards Describing Bytecodes of Smart Contract

A Sink-driven Approach to Detecting Exposed Component Vulnerabilities in Android Apps

STor: Social Network based Anonymous Communication in Tor