Source author record

Zhendong Su

Zhendong Su appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Software Engineering Programming Languages Computation and Language Databases

Catalog footprint

What is connected

8works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

MLIR-Smith: A Novel Random Program Generator for Evaluating Compiler Pipelines

Compilers are essential for the performance and correct execution of software and hold universal relevance across various scientific disciplines. Despite this, there is a notable lack of tools for testing and evaluating them, especially within the adaptable Multi-Level Intermediate Representation (MLIR) context. This paper addresses the need for a tool that can accommodate MLIR's extensibility, a feature not provided by previous methods such as Csmith. Here we introduce MLIR-Smith, a novel random program generator specifically designed to test and evaluate MLIR-based compiler optimizations. We demonstrate the utility of MLIR-Smith by conducting differential testing on MLIR, LLVM, DaCe, and DCIR, which led to the discovery of multiple bugs in these compiler pipelines. The introduction of MLIR-Smith not only fills a void in the realm of compiler testing but also emphasizes the importance of comprehensive testing within these systems. By providing a tool that can generate random MLIR programs, this paper enhances our ability to evaluate and improve compilers and paves the way for future tools, potentially shaping the wider landscape of software testing and quality assurance.

preprint2021arXiv

Testing Machine Translation via Referential Transparency

Machine translation software has seen rapid progress in recent years due to the advancement of deep neural networks. People routinely use machine translation software in their daily lives, such as ordering food in a foreign restaurant, receiving medical diagnosis and treatment from foreign doctors, and reading international political news online. However, due to the complexity and intractability of the underlying neural networks, modern machine translation software is still far from robust and can produce poor or incorrect translations; this can lead to misunderstanding, financial loss, threats to personal safety and health, and political conflicts. To address this problem, we introduce referentially transparent inputs (RTIs), a simple, widely applicable methodology for validating machine translation software. A referentially transparent input is a piece of text that should have similar translations when used in different contexts. Our practical implementation, Purity, detects when this property is broken by a translation. To evaluate RTI, we use Purity to test Google Translate and Bing Microsoft Translator with 200 unlabeled sentences, which detected 123 and 142 erroneous translations with high precision (79.3% and 78.3%). The translation errors are diverse, including examples of under-translation, over-translation, word/phrase mistranslation, incorrect modification, and unclear logic.

preprint2020arXiv

Combining Symbolic Execution and Model Checking to Verify MPI Programs

Message passing is the standard paradigm of programming in high-performance computing. However, verifying Message Passing Interface (MPI) programs is challenging, due to the complex program features (such as non-determinism and non-blocking operations). In this work, we present MPI symbolic verifier (MPI-SV), the first symbolic execution based tool for automatically verifying MPI programs with non-blocking operations. MPI-SV combines symbolic execution and model checking in a synergistic way to tackle the challenges in MPI program verification. The synergy improves the scalability and enlarges the scope of verifiable properties. We have implemented MPI-SV (footnote: https://mpi-sv.github.io) and evaluated it with 111 real-world MPI verification tasks. The pure symbolic execution-based technique successfully verifies 61 out of the 111 tasks (55\%) within one hour, while in comparison, MPI-SV verifies 100 tasks (90\%). On average, compared with pure symbolic execution, MPI-SV achieves 19x speedups on verifying the satisfaction of the critical property and 5x speedups on finding violations.

preprint2020arXiv

Detecting Optimization Bugs in Database Engines via Non-Optimizing Reference Engine Construction

Database Management Systems (DBMS) are used ubiquitously. To efficiently access data, they apply sophisticated optimizations. Incorrect optimizations can result in logic bugs, which cause a query to compute an incorrect result set. We propose Non-Optimizing Reference Engine Construction (NoREC), a fully-automatic approach to detect optimization bugs in DBMS. Conceptually, this approach aims to evaluate a query by an optimizing and a non-optimizing version of a DBMS, to then detect differences in their returned result set, which would indicate a bug in the DBMS. Obtaining a non-optimizing version of a DBMS is challenging, because DBMS typically provide limited control over optimizations. Our core insight is that a given, potentially randomly-generated optimized query can be rewritten to one that the DBMS cannot optimize. Evaluating this unoptimized query effectively corresponds to a non-optimizing reference engine executing the original query. We evaluated NoREC in an extensive testing campaign on four widely-used DBMS, namely PostgreSQL, MariaDB, SQLite, and CockroachDB. We found 159 previously unknown bugs in the latest versions of these systems, 141 of which have been fixed by the developers. Of these, 51 were optimization bugs, while the remaining were error and crash bugs. Our results suggest that NoREC is effective, general and requires little implementation effort, which makes the technique widely applicable in practice.

preprint2020arXiv

Structure-Invariant Testing for Machine Translation

In recent years, machine translation software has increasingly been integrated into our daily lives. People routinely use machine translation for various applications, such as describing symptoms to a foreign doctor and reading political news in a foreign language. However, the complexity and intractability of neural machine translation (NMT) models that power modern machine translation make the robustness of these systems difficult to even assess, much less guarantee. Machine translation systems can return inferior results that lead to misunderstanding, medical misdiagnoses, threats to personal safety, or political conflicts. Despite its apparent importance, validating the robustness of machine translation systems is very difficult and has, therefore, been much under-explored. To tackle this challenge, we introduce structure-invariant testing (SIT), a novel metamorphic testing approach for validating machine translation software. Our key insight is that the translation results of "similar" source sentences should typically exhibit similar sentence structures. Specifically, SIT (1) generates similar source sentences by substituting one word in a given sentence with semantically similar, syntactically equivalent words; (2) represents sentence structure by syntax parse trees (obtained via constituency or dependency parsing); (3) reports sentence pairs whose structures differ quantitatively by more than some threshold. To evaluate SIT, we use it to test Google Translate and Bing Microsoft Translator with 200 source sentences as input, which led to 64 and 70 buggy issues with 69.5\% and 70\% top-1 accuracy, respectively. The translation errors are diverse, including under-translation, over-translation, incorrect modification, word/phrase mistranslation, and unclear logic.

preprint2020arXiv

Testing Database Engines via Pivoted Query Synthesis

Relational databases are used ubiquitously. They are managed by database management systems (DBMS), which allow inserting, modifying, and querying data using a domain-specific language called Structured Query Language (SQL). Popular DBMS have been extensively tested by fuzzers, which have been successful in finding crash bugs. However, approaches to finding logic bugs, such as when a DBMS computes an incorrect result set, have remained mostly untackled. Differential testing is an effective technique to test systems that support a common language by comparing the outputs of these systems. However, this technique is ineffective for DBMS, because each DBMS typically supports its own SQL dialect. To this end, we devised a novel and general approach that we have termed Pivoted Query Synthesis. The core idea of this approach is to automatically generate queries for which we ensure that they fetch a specific, randomly selected row, called the pivot row. If the DBMS fails to fetch the pivot row, the likely cause is a bug in the DBMS. We tested our approach on three widely-used and mature DBMS, namely SQLite, MySQL, and PostgreSQL. In total, we reported 123 bugs in these DBMS, 99 of which have been fixed or verified, demonstrating that the approach is highly effective and general. We expect that the wide applicability and simplicity of our approach will enable the improvement of robustness of many DBMS.

preprint2016arXiv

Toward Rapid Transformation of Ideas into Software

A key mission of computer science is to enable people realize their creative ideas as naturally and painlessly as possible. Software engineering is at the center of this mission -- software technologies enable reification of ideas into working systems. As computers become ubiquitous, both in availability and the aspects of human lives they touch, the quantity and diversity of ideas also rapidly grow. Our programming systems and technologies need to evolve to make this reification process -- transforming ideas to software -- as quick and accessible as possible. The goal of this paper is twofold. First, it advocates and highlights the "transforming ideas to software" mission as a moonshot for software engineering research. This is a long-term direction for the community, and there is no silver bullet that can get us there. To make this mission a reality, as a community, we need to improve the status quo across many dimensions. Thus, the second goal is to outline a number of directions to modernize our contemporary programming technologies for decades to come, describe work that has been undertaken along those vectors, and pinpoint critical challenges.

preprint2012arXiv

Abstracting Runtime Heaps for Program Understanding

Modern programming environments provide extensive support for inspecting, analyzing, and testing programs based on the algorithmic structure of a program. Unfortunately, support for inspecting and understanding runtime data structures during execution is typically much more limited. This paper provides a general purpose technique for abstracting and summarizing entire runtime heaps. We describe the abstract heap model and the associated algorithms for transforming a concrete heap dump into the corresponding abstract model as well as algorithms for merging, comparing, and computing changes between abstract models. The abstract model is designed to emphasize high-level concepts about heap-based data structures, such as shape and size, as well as relationships between heap structures, such as sharing and connectivity. We demonstrate the utility and computational tractability of the abstract heap model by building a memory profiler. We then use this tool to check for, pinpoint, and correct sources of memory bloat from a suite of programs from DaCapo.

Zhendong Su

What is connected

Connect this record

See the researcher in context

Building this map preview

8 published item(s)

MLIR-Smith: A Novel Random Program Generator for Evaluating Compiler Pipelines

Testing Machine Translation via Referential Transparency

Combining Symbolic Execution and Model Checking to Verify MPI Programs

Detecting Optimization Bugs in Database Engines via Non-Optimizing Reference Engine Construction

Structure-Invariant Testing for Machine Translation

Testing Database Engines via Pivoted Query Synthesis

Toward Rapid Transformation of Ideas into Software

Abstracting Runtime Heaps for Program Understanding