Source author record

Aravind Machiry

Aravind Machiry appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Software Engineering Cryptography and Security Programming Languages

Catalog footprint

What is connected

2works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

C to Checked C by 3C

Owing to the continued use of C (and C++), spatial safety violations (e.g., buffer overflows) still constitute one of today's most dangerous and prevalent security vulnerabilities. To combat these violations, Checked C extends C with bounds-enforced checked pointer types. Checked C is essentially a gradually typed spatially safe C - checked pointers are backwards-binary compatible with legacy pointers, and the language allows them to be added piecemeal, rather than necessarily all at once, so that safety retrofitting can be incremental. This paper presents a semi-automated process for porting a legacy C program to Checked C. The process centers on 3C, a static analysis-based annotation tool. 3C employs two novel static analysis algorithms - typ3c and boun3c - to annotate legacy pointers as checked pointers, and to infer array bounds annotations for pointers that need them. 3C performs a root cause analysis to direct a human developer to code that should be refactored; once done, 3C can be re-run to infer further annotations (and updated root causes). Experiments on 11 programs totaling 319KLoC show 3C to be effective at inferring checked pointer types, and experience with previously and newly ported code finds 3C works well when combined with human-driven refactoring.

preprint2022arXiv

Cornucopia: A Framework for Feedback Guided Generation of Binaries

Binary analysis is an important capability required for many security and software engineering applications. Consequently, there are many binary analysis techniques and tools with varied capabilities. However, testing these tools requires a large, varied binary dataset with corresponding source-level information. In this paper, we present Cornucopia, an architecture agnostic automated framework that can generate a plethora of binaries from corresponding program source by exploiting compiler optimizations and feedback-guided learning. Our evaluation shows that Cornucopia was able to generate 309K binaries across four architectures (x86, x64, ARM, MIPS) with an average of 403 binaries for each program and outperforms Bintuner, a similar technique. Our experiments revealed issues with the LLVM optimization scheduler resulting in compiler crashes ($\sim$300). Our evaluation of four popular binary analysis tools Angr, Ghidra, Idapro, and Radare, using Cornucopia generated binaries, revealed various issues with these tools. Specifically, we found 263 crashes in Angr and one memory corruption issue in Idapro. Our differential testing on the analysis results revealed various semantic bugs in these tools. We also tested machine learning tools, Asmvec, Safe, and Debin, that claim to capture binary semantics and show that they perform poorly (For instance, Debin F1 score dropped to 12.9% from reported 63.1%) on Cornucopia generated binaries. In summary, our exhaustive evaluation shows that Cornucopia is an effective mechanism to generate binaries for testing binary analysis techniques effectively.

Aravind Machiry

What is connected

Connect this record

See the researcher in context

Building this map preview

2 published item(s)

C to Checked C by 3C

Cornucopia: A Framework for Feedback Guided Generation of Binaries