Source author record

Mathias Payer

Cryptography and Security, Programming Languages, Software Engineering, cs.CY, Systems and Control

Catalog footprint

8 works

5 topics

4 close collaborators

Preprint, 2022, arXiv

Designing a Provenance Analysis for SGX Enclaves

Intel SGX enables memory isolation and static integrity verification of code and data stored in user-space memory regions called enclaves. SGX effectively shields the execution of enclaves from the underlying untrusted OS. Attackers can neither tamper with nor examine enclaves' content. However, these properties equally challenge defenders, as they are precluded from any provenance analysis to infer intrusions inside SGX enclaves. In this work, we propose SgxMonitor, a novel provenance analysis to monitor and identify anomalous executions of enclave code. To this end, we design a technique to extract contextual runtime information from an enclave and propose a novel model to represent enclaves' intrusions. Our experiments show that SgxMonitor not only incurs an overhead comparable to traditional provenance tools, but also exhibits macro-benchmark overheads and slowdowns that only marginally affect the deployment of real use cases. Our evaluation shows that SgxMonitor successfully identifies enclave intrusions carried out by state-of-the-art attacks while reporting no false positives or false negatives during normal enclave executions, thus supporting the use of SgxMonitor in realistic scenarios.

Preprint, 2022, arXiv

FishFuzz: Throwing Larger Nets to Catch Deeper Bugs

Greybox fuzzing is the de facto standard for discovering bugs during development. Fuzzers execute many inputs to maximize the amount of reached code. Recently, Directed Greybox Fuzzers (DGFs) have proposed an alternative strategy that goes beyond "just" coverage: driving testing toward specific code targets by selecting "closer" seeds. DGFs go through different phases: exploration (i.e., reaching interesting locations) and exploitation (i.e., triggering bugs). In practice, DGFs leverage coverage to directly measure exploration, while exploitation is, at best, measured indirectly by alternating between different targets. Specifically, we observe two limitations in existing DGFs: (i) they lack precision in their distance metric, i.e., they average multiple paths and targets into a single score (to decide which seeds to prioritize), and (ii) they assign energy to seeds in a round-robin fashion without adjusting the priority of the targets (exhaustively explored targets should be dropped). We propose FishFuzz, which draws inspiration from trawl fishing: first casting a wide net, scraping for high coverage, then slowly pulling it in to maximize the harvest. The core of our fuzzer is a novel seed selection strategy that builds on two concepts: (i) a novel multi-distance metric whose precision is independent of the number of targets, and (ii) a dynamic target ranking to automatically discard exhausted targets. This strategy allows FishFuzz to seamlessly scale to tens of thousands of targets and dynamically alternate between exploration and exploitation phases. We evaluate FishFuzz by leveraging all sanitizer labels as targets. Extensively comparing FishFuzz against modern DGFs and coverage-guided fuzzers shows that FishFuzz reaches higher coverage than its direct competitors, reproduces existing bugs (70.2% faster), and finally discovers 25 new bugs (18 CVEs) in 44 programs.
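The two concepts named in the abstract can be illustrated with a minimal sketch. It is not FishFuzz's actual algorithm: the `Seed` class, the `pick_seeds` function, and the distance values are hypothetical, and the sketch only assumes that each seed keeps one distance per target instead of a single averaged score, and that exhausted targets are skipped.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class Seed:
    name: str
    # Hypothetical per-target distances (smaller means closer).
    # They are kept separate rather than averaged into one score.
    distance: Dict[str, float] = field(default_factory=dict)


def pick_seeds(seeds: List[Seed], targets: Iterable[str], exhausted: Set[str]) -> Dict[str, str]:
    """For each still-active target, choose the name of the closest seed."""
    chosen = {}
    for target in targets:
        if target in exhausted:
            continue  # dynamic ranking: exhausted targets are dropped
        candidates = [s for s in seeds if target in s.distance]
        if candidates:
            chosen[target] = min(candidates, key=lambda s: s.distance[target]).name
    return chosen


seeds = [
    Seed("a", {"t1": 3, "t2": 10}),
    Seed("b", {"t1": 7, "t2": 1}),
]
print(pick_seeds(seeds, ["t1", "t2"], exhausted=set()))
print(pick_seeds(seeds, ["t1", "t2"], exhausted={"t1"}))
```

In this toy data, an averaged score would favor seed `b` for every target (mean distance 4 versus 6.5), even though seed `a` is closer to `t1`. Keeping distances per target avoids that loss of precision.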

Preprint, 2022, arXiv

PACSan: Enforcing Memory Safety Based on ARM PA

Memory safety is a key security property that stops memory corruption vulnerabilities. Existing sanitizers enforce checks and catch such bugs during development and testing. However, they either provide only partial memory safety or have overwhelmingly high performance overheads. Our novel sanitizer PACSan enforces spatial and temporal memory safety with no false positives at low performance overheads. PACSan removes the majority of the overheads involved in pointer tracking by sealing metadata in pointers through ARM PA (Pointer Authentication), and performing the memory safety checks when pointers are dereferenced. We have developed a prototype of PACSan and systematically evaluated its security and performance on the Magma, Juliet, Nginx, and SPEC CPU2017 test suites. In our evaluation, PACSan shows no false positives together with negligible false negatives, while introducing stronger security guarantees and lower performance overheads than state-of-the-art sanitizers, including HWASan, ASan, SoftBound+CETS, Memcheck, LowFat, and PTAuth. Specifically, PACSan has 0.84x runtime overhead and 1.92x memory overhead on average. Compared to the widely deployed ASan, PACSan has no false positives and far fewer false negatives, and reduces runtime overhead by 7.172% and memory overhead by 89.063%.

Preprint, 2021, arXiv

Too Quiet in the Library: An Empirical Study of Security Updates in Android Apps' Native Code

Android apps include third-party native libraries to increase performance and to reuse functionality. Native code is directly executed from apps through the Java Native Interface or the Android Native Development Kit. Android developers add precompiled native libraries to their projects, enabling their use. Unfortunately, developers often struggle or simply neglect to update these libraries in a timely manner. This results in the continuous use of outdated native libraries with unpatched security vulnerabilities years after patches became available. To further understand such phenomena, we study the security updates in native libraries in the 200 most popular free apps on Google Play from Sept. 2013 to May 2020. A core difficulty we face in this study is the identification of libraries and their versions. Developers often rename or modify libraries, making their identification challenging. We create an approach called LibRARIAN (LibRAry veRsion IdentificAtioN) that accurately identifies native libraries and their versions as found in Android apps based on our novel similarity metric bin2sim. LibRARIAN leverages different features extracted from libraries based on their metadata and identifying strings in read-only sections. We discovered 53/200 popular apps (26.5%) with vulnerable versions with known CVEs between Sept. 2013 and May 2020, with 14 of those apps remaining vulnerable. We find that app developers took, on average, 528.71 days to apply security patches, while library developers release a security patch after 54.59 days, which makes app updates roughly 10 times slower.

Preprint, 2020, arXiv

Decentralized Privacy-Preserving Proximity Tracing

This document describes and analyzes a system for secure and privacy-preserving proximity tracing at large scale. This system, referred to as DP3T, provides a technological foundation to help slow the spread of SARS-CoV-2 by simplifying and accelerating the process of notifying people who might have been exposed to the virus so that they can take appropriate measures to break its transmission chain. The system aims to minimise privacy and security risks for individuals and communities and guarantee the highest level of data protection. The goal of our proximity tracing system is to determine who has been in close physical proximity to a COVID-19 positive person and thus exposed to the virus, without revealing the contact's identity or where the contact occurred. To achieve this goal, users run a smartphone app that continually broadcasts an ephemeral, pseudo-random ID representing the user's phone and also records the pseudo-random IDs observed from smartphones in close proximity. When a patient is diagnosed with COVID-19, she can upload pseudo-random IDs previously broadcast from her phone to a central server. Prior to the upload, all data remains exclusively on the user's phone. Other users' apps can use data from the server to locally estimate whether the device's owner was exposed to the virus through close-range physical proximity to a COVID-19 positive person who has uploaded their data. In case the app detects a high risk, it will inform the user.
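The broadcast, record, upload, and local-matching flow described above can be sketched as follows. This is a deliberately simplified illustration and not the DP3T specification: the `Phone` class and its methods are hypothetical, ephemeral IDs are plain random bytes, and the server is modeled as a list.

```python
import secrets


class Phone:
    """Hypothetical model of one user's app state."""

    def __init__(self):
        self.broadcast_ids = []     # ephemeral IDs this phone has sent out
        self.observed_ids = set()   # ephemeral IDs heard from nearby phones

    def new_ephemeral_id(self) -> bytes:
        # Assumption: a fresh 16-byte random value stands in for an ephemeral ID.
        eph = secrets.token_bytes(16)
        self.broadcast_ids.append(eph)
        return eph

    def observe(self, eph: bytes) -> None:
        # All observations stay on the phone; nothing is sent anywhere.
        self.observed_ids.add(eph)

    def exposed(self, server_ids) -> bool:
        # Matching happens locally against the IDs published by the server.
        return bool(self.observed_ids & set(server_ids))


alice, bob, carol = Phone(), Phone(), Phone()
bob.observe(alice.new_ephemeral_id())   # Alice and Bob were in proximity
carol.observe(bob.new_ephemeral_id())   # Carol only met Bob

server = list(alice.broadcast_ids)      # Alice is diagnosed and uploads her IDs
print(bob.exposed(server), carol.exposed(server))
```

The server only ever sees the IDs that a diagnosed user chose to upload; who observed them is computed on each phone.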

Preprint, 2020, arXiv

Software Ethology: An Accurate, Resilient, and Cross-Architecture Binary Analysis Framework

When reverse engineering a binary, the analyst must first understand the semantics of the binary's functions through either manual or automatic analysis. Manual semantic analysis is time-consuming, because abstractions provided by high-level languages, such as type information, variable scope, or comments, are lost, and past analyses cannot be applied to the current analysis task. Existing automated binary analysis tools suffer from low accuracy in semantic function identification in the presence of diverse compilation environments. We introduce Software Ethology, a binary analysis approach for determining the semantic similarity of functions. Software Ethology abstracts semantic behavior as classification vectors of program state changes resulting from a function executing with a specified input state, and uses these vectors as a unique fingerprint for identification. All existing semantic identifiers determine function similarity via code measurements, and suffer from high inaccuracy when classifying functions from compilation environments different from their ground truth source. Since Software Ethology does not rely on code measurements, its accuracy is resilient to changes in compiler, compiler version, optimization level, or even different source code implementing equivalent functionality. Tinbergen, our prototype Software Ethology implementation, leverages a virtual execution environment and a fuzzer to generate the classification vectors. In evaluating Tinbergen's feasibility as a semantic function identifier by identifying functions in coreutils-8.30, we achieve a high average accuracy of 0.805. Compared to the state-of-the-art, Tinbergen is 1.5 orders of magnitude faster when training, 50% faster in answering queries, and, when identifying functions in binaries generated from differing compilation environments, is 30%-61% more accurate.
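The idea of fingerprinting a function by its observable behavior rather than by its code can be shown with a small sketch. It is not Tinbergen: the abstract describes binaries run in a virtual execution environment with fuzzer-generated inputs, whereas this hypothetical `fingerprint` helper works on Python functions and a fixed, hand-picked input list.

```python
def fingerprint(func, inputs):
    """Record what func does on each input; the tuple acts as a behavioral fingerprint."""
    vec = []
    for x in inputs:
        try:
            vec.append(("ok", func(x)))
        except Exception as exc:
            # A crash is also observable behavior, so it becomes part of the vector.
            vec.append(("err", type(exc).__name__))
    return tuple(vec)


# Two differently written implementations of the same semantics.
def abs_branch(x):
    return x if x >= 0 else -x


def abs_max(x):
    return max(x, -x)


# A function with different semantics.
def negate(x):
    return -x


INPUTS = [-3, 0, 5]  # assumption: stand-in for generated input states
print(fingerprint(abs_branch, INPUTS) == fingerprint(abs_max, INPUTS))
print(fingerprint(abs_branch, INPUTS) == fingerprint(negate, INPUTS))
```

Because the comparison never inspects the code itself, rewriting `abs_branch` as `abs_max` leaves the fingerprint unchanged, which mirrors the resilience to compiler and optimization changes claimed above.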

Preprint, 2014, arXiv

Lockdown: Dynamic Control-Flow Integrity

Applications written in low-level languages without type or memory safety are especially prone to memory corruption. Attackers gain code execution capabilities through such applications despite all currently deployed defenses by exploiting memory corruption vulnerabilities. Control-Flow Integrity (CFI) is a promising defense mechanism that restricts open control-flow transfers to a static set of well-known locations. We present Lockdown, an approach to dynamic CFI that protects legacy, binary-only executables and libraries. Lockdown adaptively learns the control-flow graph of a running process using information from a trusted dynamic loader. The sandbox component of Lockdown restricts interactions between different shared objects to imported and exported functions by enforcing fine-grained CFI checks. Our prototype implementation shows that dynamic CFI results in low performance overhead.
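The rule that transfers between shared objects are limited to imported and exported functions can be sketched as a simple policy check. This is an illustrative model only, not Lockdown's implementation: the `MODULES` table, the module names, and `transfer_allowed` are hypothetical, and transfers inside one module are simply allowed here.

```python
# Hypothetical symbol tables, as a trusted loader might expose them.
MODULES = {
    "app": {"imports": {"puts"}, "exports": set()},
    "libc": {"imports": set(), "exports": {"puts", "system"}},
}


def transfer_allowed(src: str, dst: str, symbol: str) -> bool:
    """Allow a cross-module transfer only if src imports and dst exports the symbol."""
    if src == dst:
        # Simplification: intra-module transfers are not modeled in this sketch.
        return True
    return symbol in MODULES[src]["imports"] and symbol in MODULES[dst]["exports"]


print(transfer_allowed("app", "libc", "puts"))    # imported by app, exported by libc
print(transfer_allowed("app", "libc", "system"))  # exported by libc, never imported by app
```

Under this policy, a hijacked indirect call in `app` cannot reach `system` even though `libc` exports it, because `app` never declared that import.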

Preprint, 2014, arXiv

Similarity-based matching meets Malware Diversity

Similarity metrics, e.g., signatures as used by anti-virus products, are the dominant technique to detect whether a given binary is malware. The underlying assumption of this approach is that all instances of a piece of malware (or even a malware family) will be similar to each other. Software diversification is a probabilistic technique that uses code and data randomization and expressiveness in the target instruction set to generate large numbers of functionally equivalent but different binaries. Malware diversity builds on software diversity and ensures that any two diversified instances of the same malware have low similarity (according to a set of similarity metrics). An LLVM-based prototype implementation diversifies both code and data of binaries, and our evaluation shows that signatures based on similarity match only one or a few instances in a pool of diversified binaries generated from the same source code.
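To see why diversification defeats similarity-based matching, consider a minimal sketch using Jaccard similarity over byte n-grams. The metric choice, the toy byte strings, and the filler byte `9` are assumptions made for illustration; they are not the metrics or transformations evaluated in the paper.

```python
def ngrams(data: bytes, n: int = 2) -> set:
    """Return the set of all length-n byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: bytes, b: bytes, n: int = 2) -> float:
    """Jaccard similarity of the n-gram sets of a and b (1.0 means identical sets)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)


original = bytes([1, 2, 3, 4])
# Hypothetical "diversified" variant: same bytes in the same order,
# with a filler byte inserted between them.
padded = bytes([1, 9, 2, 9, 3, 9, 4])

print(jaccard(original, original))
print(jaccard(original, padded))
```

The padded variant preserves every original byte in order, yet shares no 2-gram with the original, so a signature built on this metric would not match it.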