Researcher profile

Ali Mesbah

Ali Mesbah contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
6 works
0 followers
6 topics
4 close collaborators

Published work

6 published items

Preprint · 2026 · arXiv

Issue2Test: Generating Reproducing Test Cases from Issue Reports

Automated tools for solving GitHub issues are receiving significant attention by both researchers and practitioners, e.g., in the form of foundation models and LLM-based agents prompted with issues. A crucial step toward successfully solving an issue is creating a test case that accurately reproduces the issue. Such a test case can guide the search for an appropriate patch and help validate whether the patch matches the issue's intent. However, existing techniques for issue reproduction show only moderate success. This paper presents Issue2Test, an LLM-based technique for automatically generating a reproducing test case for a given issue report. Unlike automated regression test generators, which aim at creating passing tests, our approach aims at a test that fails, and that fails specifically for the reason described in the issue. To this end, Issue2Test performs three steps: (1) understand the issue and gather context (e.g., related files and project-specific guidelines) relevant for reproducing it; (2) generate a candidate test case; and (3) iteratively refine the test case based on compilation and runtime feedback until it fails and the failure aligns with the problem described in the issue. We evaluate Issue2Test on the SWT-bench-lite dataset, where it successfully reproduces 32.9% of the issues, achieving a 16.3% relative improvement over the best existing technique. Our evaluation also shows that Issue2Test reproduces 20 issues that four prior techniques fail to address, contributing a total of 60.4% of all issues reproduced by these tools. We envision our approach to contribute to enhancing the overall progress in the important task of automatically solving GitHub issues.
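The three steps in the abstract above can be read as a generate-and-refine loop. The following is a minimal sketch of that control flow only, not the Issue2Test implementation: every callable (`gather_context`, `generate`, `run`, `matches_issue`, `refine`), the `RunResult` type, and the `max_rounds` limit are hypothetical names introduced for illustration, and in a real system the generation and refinement steps would presumably be backed by an LLM.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Outcome of executing a candidate test (hypothetical type)."""
    passed: bool
    output: str  # compiler or runtime output captured from the run


def reproduce_issue(issue, gather_context, generate, run,
                    matches_issue, refine, max_rounds=5):
    """Return a test that fails for the reason the issue describes, or None.

    All callables are assumptions supplied by the caller:
      gather_context(issue) -> context relevant to reproducing the issue
      generate(issue, context) -> candidate test
      run(test) -> RunResult
      matches_issue(issue, output) -> True if the failure fits the issue
      refine(issue, context, test, output) -> revised candidate test
    """
    context = gather_context(issue)          # step 1: understand and gather
    test = generate(issue, context)          # step 2: first candidate
    for _ in range(max_rounds):              # step 3: refine on feedback
        result = run(test)
        # Unlike regression test generation, success here means the test
        # FAILS, and fails for the reason described in the issue.
        if not result.passed and matches_issue(issue, result.output):
            return test
        test = refine(issue, context, test, result.output)
    return None
```

Note the stopping condition: a passing test, or a test that fails for an unrelated reason such as an import error, is sent back for refinement rather than accepted.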

Preprint · 2022 · arXiv

A Controlled Experiment of Different Code Representations for Learning-Based Bug Repair

Training a deep learning model on source code has gained significant traction recently. Since such models reason about vectors of numbers, source code needs to be converted to a code representation before vectorization. Numerous approaches have been proposed to represent source code, from sequences of tokens to abstract syntax trees. However, there is no systematic study to understand the effect of code representation on learning performance. Through a controlled experiment, we examine the impact of various code representations on model accuracy and usefulness in deep learning-based program repair. We train 21 different generative models that suggest fixes for name-based bugs, including 14 different homogeneous code representations, four mixed representations for the buggy and fixed code, and three different embeddings. We assess if fix suggestions produced by the model in various code representations are automatically patchable, meaning they can be transformed to a valid code that is ready to be applied to the buggy code to fix it. We also conduct a developer study to qualitatively evaluate the usefulness of inferred fixes in different code representations. Our results highlight the importance of code representation and its impact on learning and usefulness. Our findings indicate that (1) while code abstractions help the learning process, they can adversely impact the usefulness of inferred fixes from a developer's point of view; this emphasizes the need to look at the patches generated from the practitioner's perspective, which is often neglected in the literature, (2) mixed representations can outperform homogeneous code representations, (3) bug type can affect the effectiveness of different code representations; although current techniques use a single code representation for all bug types, there is no single best code representation applicable to all bug types.
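To make the notion of a code representation concrete, the sketch below derives three views of the same snippet using only the Python standard library: a flat token sequence, an abstracted token sequence in which identifiers are replaced by placeholders, and a list of abstract syntax tree node types. These are illustrative stand-ins, not the 21 representations studied in the paper; the function names and the `VAR<n>` placeholder scheme are assumptions.

```python
import ast
import io
import keyword
import tokenize

# Token types that carry layout rather than content.
_LAYOUT = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
           tokenize.INDENT, tokenize.DEDENT, tokenize.COMMENT}


def token_sequence(source):
    """Return the source as a flat list of token strings."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type not in _LAYOUT]


def abstract_identifiers(tokens):
    """Replace each distinct identifier with a placeholder VAR1, VAR2, ..."""
    mapping = {}
    abstracted = []
    for tok in tokens:
        if tok.isidentifier() and not keyword.iskeyword(tok):
            if tok not in mapping:
                mapping[tok] = f"VAR{len(mapping) + 1}"
            abstracted.append(mapping[tok])
        else:
            abstracted.append(tok)
    return abstracted


def ast_node_types(source):
    """Return AST node type names in the order ast.walk visits them."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


snippet = "total = price * count\n"
tokens = token_sequence(snippet)
print(tokens)
print(abstract_identifiers(tokens))
print(ast_node_types(snippet))
```

The abstracted view shrinks the vocabulary a model has to learn, but a suggested fix expressed in placeholders has to be mapped back to real names before a developer can apply it, which is the trade-off between learnability and usefulness that the abstract describes.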

Preprint · 2022 · arXiv

A Physics-informed Deep Learning Approach for Minimum Effort Stochastic Control of Colloidal Self-Assembly

We propose formulating the finite-horizon stochastic optimal control problem for colloidal self-assembly in the space of probability density functions (PDFs) of the underlying state variables (namely, order parameters). The control objective is formulated in terms of steering the state PDFs from a prescribed initial probability measure towards a prescribed terminal probability measure with minimum control effort. For specificity, we use a univariate stochastic state model from the literature. Both the analysis and the computational steps for control synthesis as developed in this paper generalize for multivariate stochastic state dynamics given by generic nonlinear in state and non-affine in control models. We derive the conditions of optimality for the associated optimal control problem. This derivation yields a system of three coupled partial differential equations together with the boundary conditions at the initial and terminal times. The resulting system is a generalized instance of the so-called Schrödinger bridge problem. We then determine the optimal control policy by training a physics-informed deep neural network, where the "physics" are the derived conditions of optimality. The performance of the proposed solution is demonstrated via numerical simulations on a benchmark colloidal self-assembly problem.

Preprint · 2022 · arXiv

Fragment-Based Test Generation For Web Apps

Automated model-based test generation presents a viable alternative to the costly manual test creation currently employed for regression testing of web apps. However, existing model inference techniques rely on threshold-based whole-page comparison to establish state equivalence, which cannot reliably identify near-duplicate web pages in modern web apps. Consequently, existing techniques produce inadequate models for dynamic web apps, and fragile test oracles, rendering the generated regression test suites ineffective. We propose a model-based test generation technique, FRAGGEN, that eliminates the need for thresholds, by employing a novel state abstraction based on page fragmentation to establish state equivalence. FRAGGEN also uses fine-grained page fragment analysis to diversify state exploration and generate reliable test oracles. Our evaluation shows that FRAGGEN outperforms existing whole-page techniques by detecting more near-duplicates, inferring better web app models and generating test suites that are better suited for regression testing. On a dataset of 86,165 state-pairs, FRAGGEN detected 123% more near-duplicates on average compared to whole-page techniques. The crawl models inferred by FRAGGEN have 62% more precision and 70% more recall on average. FRAGGEN also generates reliable regression test suites with test actions that have nearly 100% success rate on the same version of the web app even if the execution environment is varied. The test oracles generated by FRAGGEN can detect 98.7% of the visible changes in web pages while being highly robust, making them suitable for regression testing.

Preprint · 2022 · arXiv

Safe Exploration and Escape Local Minima with Model Predictive Control under Partially Unknown Constraints

In this paper, we propose a novel model predictive control (MPC) framework for output tracking that deals with partially unknown constraints. The MPC scheme optimizes over a learning and a backup trajectory. The learning trajectory aims to explore unknown and potentially unsafe areas, if and only if this might lead to a potential performance improvement. On the contrary, the backup trajectory lies in the known space, and is intended to ensure safety and convergence. The cost function for the learning trajectory is divided into a tracking and an offset cost, while the cost function for the backup trajectory is only marginally considered and only penalizes the offset cost. We show that the proposed MPC scheme is not only able to safely explore the unknown constraints, but also escape from local minima that may arise from the presence of obstacles. Moreover, we provide formal guarantees for convergence and recursive feasibility of the MPC scheme, as well as closed-loop constraint satisfaction. Finally, the proposed MPC scheme is demonstrated in simulations using an example of autonomous vehicle driving in a partially unknown environment where unknown obstacles are present.

Preprint · 2020 · arXiv

PoCET: a Polynomial Chaos Expansion Toolbox for Matlab

We introduce PoCET: a free and open-source Polynomial Chaos Expansion Toolbox for Matlab, featuring the automatic generation of polynomial chaos expansion (PCE) for linear and nonlinear dynamic systems with time-invariant stochastic parameters or initial conditions, as well as several simulation tools. It offers built-in handling of Gaussian, uniform, and beta probability density functions, projection and collocation-based calculation of PCE coefficients, and the calculation of stochastic moments from a PCE. Efficient algorithms for the calculation of the involved integrals have been designed in order to increase its applicability. PoCET comes with a variety of introductory and instructive examples. Throughout the paper we show how to perform a polynomial chaos expansion on a simple ordinary differential equation using PoCET, as well as how it can be used to solve the more complex task of optimal experimental design.
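As a rough illustration of projection-based PCE and of reading stochastic moments off the coefficients, the sketch below expands the solution of dx/dt = -a·x, x(0) = 1, where the decay rate a = a0 + d·ξ depends on a uniform parameter ξ on [-1, 1]. It is written in plain Python and does not use or mirror the PoCET API; the function names, the midpoint quadrature, the use of the closed-form solution exp(-a·t) in place of an ODE solver, and the values a0 = 1, d = 0.5, t = 1 are all assumptions chosen for the example.

```python
import math


def legendre(k, xi):
    """Evaluate the Legendre polynomial P_k(xi) by three-term recurrence."""
    p_prev, p = 1.0, xi
    if k == 0:
        return p_prev
    for n in range(1, k):
        # (n + 1) P_{n+1} = (2n + 1) xi P_n - n P_{n-1}
        p_prev, p = p, ((2 * n + 1) * xi * p - n * p_prev) / (n + 1)
    return p


def pce_coefficients(f, order, n_quad=2000):
    """Project f(xi), xi ~ Uniform(-1, 1), onto P_0 .. P_order.

    c_k = (2k + 1) / 2 * integral of f(xi) P_k(xi) over [-1, 1],
    approximated here with a composite midpoint rule.
    """
    h = 2.0 / n_quad
    nodes = [-1.0 + (i + 0.5) * h for i in range(n_quad)]
    coeffs = []
    for k in range(order + 1):
        integral = h * sum(f(x) * legendre(k, x) for x in nodes)
        coeffs.append((2 * k + 1) / 2.0 * integral)
    return coeffs


def pce_moments(coeffs):
    """Mean and variance of the expansion, using E[P_k^2] = 1 / (2k + 1)."""
    mean = coeffs[0]
    variance = sum(c * c / (2 * k + 1)
                   for k, c in enumerate(coeffs) if k > 0)
    return mean, variance


# Assumed example values: nominal rate, half-width of uncertainty, time.
a0, d, t = 1.0, 0.5, 1.0
coeffs = pce_coefficients(lambda xi: math.exp(-(a0 + d * xi) * t), order=5)
mean, variance = pce_moments(coeffs)
print(mean, variance)
```

For this example the mean can be checked against the closed form exp(-a0·t)·sinh(d·t)/(d·t). Legendre polynomials are used because they are the orthogonal basis matching a uniform density; a Gaussian parameter would call for Hermite polynomials instead.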