Source author record

Arjun Radhakrishna

Arjun Radhakrishna appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Programming Languages Artificial Intelligence Software Engineering Logic in Computer Science Databases Distributed, Parallel, and Cluster Computing Information Retrieval Machine Learning

Catalog footprint

What is connected

13works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

An Empirical Investigation of Robustness in Large Language Models under Tabular Distortions

We investigate how large language models (LLMs) fail when tabular data in an otherwise canonical representation is subjected to semantic and structural distortions. Our findings reveal that LLMs lack an inherent ability to detect and correct subtle distortions in table representations. Only when provided with an explicit prior, via a system prompt, do models partially adjust their reasoning strategies and correct some distortions, though not consistently or completely. To study this phenomenon, we introduce a small, expert-curated dataset that explicitly evaluates LLMs on table question answering (TQA) tasks requiring an additional error-correction step prior to analysis. Our results reveal systematic differences in how LLMs ingest and interpret tabular information under distortion, with even SoTA models such as GPT-5.2 model exhibiting a drop of minimum 22% accuracy under distortion. These findings raise important questions for future research, particularly regarding when and how models should autonomously decide to realign tabular inputs, analogous to human behavior, without relying on explicit prompts or tabular data pre-processing.

preprint2022arXiv

AutoTSG: Learning and Synthesis for Incident Troubleshooting

Incident management is a key aspect of operating large-scale cloud services. To aid with faster and efficient resolution of incidents, engineering teams document frequent troubleshooting steps in the form of Troubleshooting Guides (TSGs), to be used by on-call engineers (OCEs). However, TSGs are siloed, unstructured, and often incomplete, requiring developers to manually understand and execute necessary steps. This results in a plethora of issues such as on-call fatigue, reduced productivity, and human errors. In this work, we conduct a large-scale empirical study of over 4K+ TSGs mapped to 1000s of incidents and find that TSGs are widely used and help significantly reduce mitigation efforts. We then analyze feedback on TSGs provided by 400+ OCEs and propose a taxonomy of issues that highlights significant gaps in TSG quality. To alleviate these gaps, we investigate the automation of TSGs and propose AutoTSG -- a novel framework for automation of TSGs to executable workflows by combining machine learning and program synthesis. Our evaluation of AutoTSG on 50 TSGs shows the effectiveness in both identifying TSG statements (accuracy 0.89) and parsing them for execution (precision 0.94 and recall 0.91). Lastly, we survey ten Microsoft engineers and show the importance of TSG automation and the usefulness of AutoTSG.

preprint2022arXiv

Landmarks and Regions: A Robust Approach to Data Extraction

We propose a new approach to extracting data items or field values from semi-structured documents. Examples of such problems include extracting passenger name, departure time and departure airport from a travel itinerary, or extracting price of an item from a purchase receipt. Traditional approaches to data extraction use machine learning or program synthesis to process the whole document to extract the desired fields. Such approaches are not robust to format changes in the document, and the extraction process typically fails even if changes are made to parts of the document that are unrelated to the desired fields of interest. We propose a new approach to data extraction based on the concepts of landmarks and regions. Humans routinely use landmarks in manual processing of documents to zoom in and focus their attention on small regions of interest in the document. Inspired by this human intuition, we use the notion of landmarks in program synthesis to automatically synthesize extraction programs that first extract a small region of interest, and then automatically extract the desired value from the region in a subsequent step. We have implemented our landmark-based extraction approach in a tool LRSyn, and show extensive evaluation on documents in HTML as well as scanned images of invoices and receipts. Our results show that our approach is robust to various types of format changes that routinely happen in real-world settings.

preprint2022arXiv

Overwatch: Learning Patterns in Code Edit Sequences

Integrated Development Environments (IDEs) provide tool support to automate many source code editing tasks. Traditionally, IDEs use only the spatial context, i.e., the location where the developer is editing, to generate candidate edit recommendations. However, spatial context alone is often not sufficient to confidently predict the developer's next edit, and thus IDEs generate many suggestions at a location. Therefore, IDEs generally do not actively offer suggestions and instead, the developer is usually required to click on a specific icon or menu and then select from a large list of potential suggestions. As a consequence, developers often miss the opportunity to use the tool support because they are not aware it exists or forget to use it. To better understand common patterns in developer behavior and produce better edit recommendations, we can additionally use the temporal context, i.e., the edits that a developer was recently performing. To enable edit recommendations based on temporal context, we present Overwatch, a novel technique for learning edit sequence patterns from traces of developers' edits performed in an IDE. Our experiments show that Overwatch has 78% precision and that Overwatch not only completed edits when developers missed the opportunity to use the IDE tool support but also predicted new edits that have no tool support in the IDE.

preprint2021arXiv

Conformance Constraint Discovery: Measuring Trust in Data-Driven Systems

The reliability and proper function of data-driven applications hinge on the data's continued conformance to the applications' initial design. When data deviates from this initial profile, system behavior becomes unpredictable. Data profiling techniques such as functional dependencies and denial constraints encode patterns in the data that can be used to detect deviations. But traditional methods typically focus on exact constraints and categorical attributes, and are ill-suited for tasks such as determining whether the prediction of a machine learning system can be trusted or for quantifying data drift. In this paper, we introduce data invariants, a new data-profiling primitive that models arithmetic relationships involving multiple numerical attributes within a (noisy) dataset and which complements the existing data-profiling techniques. We propose a quantitative semantics to measure the degree of violation of a data invariant, and establish that strong data invariants can be constructed from observations with low variance on the given dataset. A concrete instance of this principle gives the surprising result that low-variance components of a principal component analysis (PCA), which are usually discarded, generate better invariants than the high-variance components. We demonstrate the value of data invariants on two applications: trusted machine learning and data drift. We empirically show that data invariants can (1) reliably detect tuples on which the prediction of a machine-learned model should not be trusted, and (2) quantify data drift more accurately than the state-of-the-art methods. Additionally, we show four case studies where an intervention-centric explanation tool uses data invariants to explain causes for tuple non-conformance.

preprint2020arXiv

Information-theoretic User Interaction: Significant Inputs for Program Synthesis

Programming-by-example technologies are being deployed in industrial products for real-time synthesis of various kinds of data transformations. These technologies rely on the user to provide few representative examples of the transformation task. Motivated by the need to find the most pertinent question to ask the user, in this paper, we introduce the {\em significant questions problem}, and show that it is hard in general. We then develop an information-theoretic greedy approach for solving the problem. We justify the greedy algorithm using the conditional entropy result, which informally says that the question that achieves the maximum information gain is the one that we know least about. In the context of interactive program synthesis, we use the above result to develop an {\em{active program learner}} that generates the significant inputs to pose as queries to the user in each iteration. The procedure requires extending a {\em{passive program learner}} to a {\em{sampling program learner}} that is able to sample candidate programs from the set of all consistent programs to enable estimation of information gain. It also uses clustering of inputs based on features in the inputs and the corresponding outputs to sample a small set of candidate significant inputs. Our active learner is able to tradeoff false negatives for false positives and converge in a small number of iterations on a real-world dataset of %around 800 string transformation tasks.

preprint2015arXiv

From Non-preemptive to Preemptive Scheduling using Synchronization Synthesis

We present a computer-aided programming approach to concurrency. The approach allows programmers to program assuming a friendly, non-preemptive scheduler, and our synthesis procedure inserts synchronization to ensure that the final program works even with a preemptive scheduler. The correctness specification is implicit, inferred from the non-preemptive behavior. Let us consider sequences of calls that the program makes to an external interface. The specification requires that any such sequence produced under a preemptive scheduler should be included in the set of such sequences produced under a non-preemptive scheduler. The solution is based on a finitary abstraction, an algorithm for bounded language inclusion modulo an independence relation, and rules for inserting synchronization. We apply the approach to device-driver programming, where the driver threads call the software interface of the device and the API provided by the operating system. Our experiments demonstrate that our synthesis method is precise and efficient, and, since it does not require explicit specifications, is more practical than the conventional approach based on user-provided assertions.

preprint2015arXiv

Optimizing Solution Quality in Synchronization Synthesis

Given a multithreaded program written assuming a friendly, non-preemptive scheduler, the goal of synchronization synthesis is to automatically insert synchronization primitives to ensure that the modified program behaves correctly, even with a preemptive scheduler. In this work, we focus on the quality of the synthesized solution: we aim to infer synchronization placements that not only ensure correctness, but also meet some quantitative objectives such as optimal program performance on a given computing platform. The key step that enables solution optimization is the construction of a set of global constraints over synchronization placements such that each model of the constraints set corresponds to a correctness-ensuring synchronization placement. We extract the global constraints from generalizations of counterexample traces and the control-flow graph of the program. The global constraints enable us to choose from among the encoded synchronization solutions using an objective function. We consider two types of objective functions: ones that are solely dependent on the program (e.g., minimizing the size of critical sections) and ones that are also dependent on the computing platform. For the latter, given a program and a computing platform, we construct a performance model based on measuring average contention for critical sections and the average time taken to acquire and release a lock under a given average contention. We empirically evaluated that our approach scales to typical module sizes of many real world concurrent programs such as device drivers and multithreaded servers, and that the performance predictions match reality. To the best of our knowledge, this is the first comprehensive approach for optimizing the placement of synthesized synchronization.

preprint2015arXiv

Synthesis through Unification

Given a specification and a set of candidate programs (program space), the program synthesis problem is to find a candidate program that satisfies the specification. We present the synthesis through unification (STUN) approach, which is an extension of the counter-example guided inductive synthesis (CEGIS) approach. In CEGIS, the synthesizer maintains a subset S of inputs and a candidate program Prog that is correct for S. The synthesizer repeatedly checks if there exists a counter-example input c such that the execution of Prog is incorrect on c. If so, the synthesizer enlarges S to include c, and picks a program from the program space that is correct for the new set S. The STUN approach extends CEGIS with the idea that given a program Prog that is correct for a subset of inputs, the synthesizer can try to find a program Prog' that is correct for the rest of the inputs. If Prog and Prog' can be unified into a program in the program space, then a solution has been found. We present a generic synthesis procedure based on the STUN approach and specialize it for three different domains by providing the appropriate unification operators. We implemented these specializations in prototype tools, and we show that our tools often per- forms significantly better on standard benchmarks than a tool based on a pure CEGIS approach.

preprint2014arXiv

Regression-free Synthesis for Concurrency

While fixing concurrency bugs, program repair algorithms may introduce new concurrency bugs. We present an algorithm that avoids such regressions. The solution space is given by a set of program transformations we consider in for repair process. These include reordering of instructions within a thread and inserting atomic sections. The new algorithm learns a constraint on the space of candidate solutions, from both positive examples (error-free traces) and counterexamples (error traces). From each counterexample, the algorithm learns a constraint necessary to remove the errors. From each positive examples, it learns a constraint that is necessary in order to prevent the repair from turning the trace into an error trace. We implemented the algorithm and evaluated it on simplified Linux device drivers with known bugs.

preprint2012arXiv

Interface Simulation Distances

The classical (boolean) notion of refinement for behavioral interfaces of system components is the alternating refinement preorder. In this paper, we define a distance for interfaces, called interface simulation distance. It makes the alternating refinement preorder quantitative by, intuitively, tolerating errors (while counting them) in the alternating simulation game. We show that the interface simulation distance satisfies the triangle inequality, that the distance between two interfaces does not increase under parallel composition with a third interface, and that the distance between two interfaces can be bounded from above and below by distances between abstractions of the two interfaces. We illustrate the framework, and the properties of the distances under composition of interfaces, with two case studies.

preprint2011arXiv

Quantitative Synthesis for Concurrent Programs

We present an algorithmic method for the quantitative, performance-aware synthesis of concurrent programs. The input consists of a nondeterministic partial program and of a parametric performance model. The nondeterminism allows the programmer to omit which (if any) synchronization construct is used at a particular program location. The performance model, specified as a weighted automaton, can capture system architectures by assigning different costs to actions such as locking, context switching, and memory and cache accesses. The quantitative synthesis problem is to automatically resolve the nondeterminism of the partial program so that both correctness is guaranteed and performance is optimal. As is standard for shared memory concurrency, correctness is formalized "specification free", in particular as race freedom or deadlock freedom. For worst-case (average-case) performance, we show that the problem can be reduced to 2-player graph games (with probabilistic transitions) with quantitative objectives. While we show, using game-theoretic methods, that the synthesis problem is NEXP-complete, we present an algorithmic method and an implementation that works efficiently for concurrent programs and performance models of practical interest. We have implemented a prototype tool and used it to synthesize finite-state concurrent programs that exhibit different programming patterns, for several performance models representing different architectures.

preprint2010arXiv

GIST: A Solver for Probabilistic Games

Gist is a tool that (a) solves the qualitative analysis problem of turn-based probabilistic games with ω-regular objectives; and (b) synthesizes reasonable environment assumptions for synthesis of unrealizable specifications. Our tool provides the first and efficient implementations of several reduction-based techniques to solve turn-based probabilistic games, and uses the analysis of turn-based probabilistic games for synthesizing environment assumptions for unrealizable specifications.

Arjun Radhakrishna

What is connected

Connect this record

See the researcher in context

Building this map preview

13 published item(s)

An Empirical Investigation of Robustness in Large Language Models under Tabular Distortions

AutoTSG: Learning and Synthesis for Incident Troubleshooting

Landmarks and Regions: A Robust Approach to Data Extraction

Overwatch: Learning Patterns in Code Edit Sequences

Conformance Constraint Discovery: Measuring Trust in Data-Driven Systems

Information-theoretic User Interaction: Significant Inputs for Program Synthesis

From Non-preemptive to Preemptive Scheduling using Synchronization Synthesis

Optimizing Solution Quality in Synchronization Synthesis

Synthesis through Unification

Regression-free Synthesis for Concurrency

Interface Simulation Distances

Quantitative Synthesis for Concurrent Programs

GIST: A Solver for Probabilistic Games