Source author record

Vasco Manquinho

Vasco Manquinho appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Logic in Computer Science Software Engineering cs.CY Databases Machine Learning math.LO math.OC Programming Languages

Catalog footprint

What is connected

10works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

C-Pack of IPAs: A C90 Program Benchmark of Introductory Programming Assignments

Due to the vast number of students enrolled in Massive Open Online Courses (MOOCs), there has been an increasing number of automated program repair techniques focused on introductory programming assignments (IPAs). Such techniques take advantage of previous correct student implementations in order to provide automated, comprehensive, and personalized feedback to students. This paper presents C-Pack-IPAs, a publicly available benchmark of students' programs submitted for 25 different IPAs. C-Pack-IPAs contains semantically correct, semantically incorrect, and syntactically incorrect programs plus a test suite for each IPA. Hence, C-Pack-IPAs can be used to help evaluate the development of novel semantic, as well as syntactic, automated program repair frameworks, focused on providing feedback to novice programmers.

preprint2022arXiv

Exact and approximate determination of the Pareto set using minimal correction subsets

Recently, it has been shown that the enumeration of Minimal Correction Subsets (MCS) of Boolean formulas allows solving Multi-Objective Boolean Optimization (MOBO) formulations. However, a major drawback of this approach is that most MCSs do not correspond to Pareto-optimal solutions. In fact, one can only know that a given MCS corresponds to a Pareto-optimal solution when all MCSs are enumerated. Moreover, if it is not possible to enumerate all MCSs, then there is no guarantee of the quality of the approximation of the Pareto frontier. This paper extends the state of the art for solving MOBO using MCSs. First, we show that it is possible to use MCS enumeration to solve MOBO problems such that each MCS necessarily corresponds to a Pareto-optimal solution. Additionally, we also propose two new algorithms that can find a (1 + {\varepsilon})-approximation of the Pareto frontier using MCS enumeration. Experimental results in several benchmark sets show that the newly proposed algorithms allow finding better approximations of the Pareto frontier than state-of-the-art algorithms, and with guaranteed approximation ratios.

preprint2022arXiv

New Core-Guided and Hitting Set Algorithms for Multi-Objective Combinatorial Optimization

In the last decade, a plethora of algorithms for single-objective Boolean optimization has been proposed that rely on the iterative usage of a highly effective Propositional Satisfiability (SAT) solver. But the use of SAT solvers in Multi-Objective Combinatorial Optimization (MOCO) algorithms is still scarce. Due to this shortage of efficient tools for MOCO, many real-world applications formulated as multi-objective are simplified to single-objective, using either a linear combination or a lexicographic ordering of the objective functions to optimize. In this paper, we extend the state of the art of MOCO solvers with two novel unsatisfiability-based algorithms. The first is a core-guided MOCO solver. The second is a hitting set-based MOCO solver. Experimental results obtained in a wide range of benchmark instances show that our new unsatisfiability-based algorithms can outperform state-of-the-art SAT-based algorithms for MOCO.

preprint2021arXiv

SOAR: A Synthesis Approach for Data Science API Refactoring

With the growth of the open-source data science community, both the number of data science libraries and the number of versions for the same library are increasing rapidly. To match the evolving APIs from those libraries, open-source organizations often have to exert manual effort to refactor the APIs used in the code base. Moreover, due to the abundance of similar open-source libraries, data scientists working on a certain application may have an abundance of libraries to choose, maintain and migrate between. The manual refactoring between APIs is a tedious and error-prone task. Although recent research efforts were made on performing automatic API refactoring between different languages, previous work relies on statistical learning with collected pairwise training data for the API matching and migration. Using large statistical data for refactoring is not ideal because such training data will not be available for a new library or a new version of the same library. We introduce Synthesis for Open-Source API Refactoring (SOAR), a novel technique that requires no training data to achieve API migration and refactoring. SOAR relies only on the documentation that is readily available at the release of the library to learn API representations and mapping between libraries. Using program synthesis, SOAR automatically computes the correct configuration of arguments to the APIs and any glue code required to invoke those APIs. SOAR also uses the interpreter's error messages when running refactored code to generate logical constraints that can be used to prune the search space. Our empirical evaluation shows that SOAR can successfully refactor 80% of our benchmarks corresponding to deep learning models with up to 44 layers with an average run time of 97.23 seconds, and 90% of the data wrangling benchmarks with an average run time of 17.31 seconds.

preprint2015arXiv

DistMS: A Non-Portfolio Distributed Solver for Maximum Satisfiability

The most successful parallel SAT and MaxSAT solvers follow a portfolio approach, where each thread applies a different algorithm (or the same algorithm configured differently) to solve a given problem instance. The main goal of building a portfolio is to diversify the search process being carried out by each thread. As soon as one thread finishes, the instance can be deemed solved. In this paper we present a new open source distributed solver for MaxSAT solving that addresses two issues commonly found in multicore parallel solvers, namely memory contention and scalability. Preliminary results show that our non-portfolio distributed MaxSAT solver outperforms its sequential version and is able to solve more instances as the number of processes increases.

preprint2015arXiv

Exploiting Resolution-based Representations for MaxSAT Solving

Most recent MaxSAT algorithms rely on a succession of calls to a SAT solver in order to find an optimal solution. In particular, several algorithms take advantage of the ability of SAT solvers to identify unsatisfiable subformulas. Usually, these MaxSAT algorithms perform better when small unsatisfiable subformulas are found early. However, this is not the case in many problem instances, since the whole formula is given to the SAT solver in each call. In this paper, we propose to partition the MaxSAT formula using a resolution-based graph representation. Partitions are then iteratively joined by using a proximity measure extracted from the graph representation of the formula. The algorithm ends when only one partition remains and the optimal solution is found. Experimental results show that this new approach further enhances a state of the art MaxSAT solver to optimally solve a larger set of industrial problem instances.

preprint2015arXiv

Generalized Totalizer Encoding for Pseudo-Boolean Constraints

Pseudo-Boolean constraints, also known as 0-1 Integer Linear Constraints, are used to model many real-world problems. A common approach to solve these constraints is to encode them into a SAT formula. The runtime of the SAT solver on such formula is sensitive to the manner in which the given pseudo-Boolean constraints are encoded. In this paper, we propose generalized Totalizer encoding (GTE), which is an arc-consistency preserving extension of the Totalizer encoding to pseudo-Boolean constraints. Unlike some other encodings, the number of auxiliary variables required for GTE does not depend on the magnitudes of the coefficients. Instead, it depends on the number of distinct combinations of these coefficients. We show the superiority of GTE with respect to other encodings when large pseudo-Boolean constraints have low number of distinct coefficients. Our experimental results also show that GTE remains competitive even when the pseudo-Boolean constraints do not have this characteristic.

preprint2014arXiv

Incremental Cardinality Constraints for MaxSAT

Maximum Satisfiability (MaxSAT) is an optimization variant of the Boolean Satisfiability (SAT) problem. In general, MaxSAT algorithms perform a succession of SAT solver calls to reach an optimum solution making extensive use of cardinality constraints. Many of these algorithms are non-incremental in nature, i.e. at each iteration the formula is rebuilt and no knowledge is reused from one iteration to another. In this paper, we exploit the knowledge acquired across iterations using novel schemes to use cardinality constraints in an incremental fashion. We integrate these schemes with several MaxSAT algorithms. Our experimental results show a significant performance boost for these algo- rithms as compared to their non-incremental counterparts. These results suggest that incremental cardinality constraints could be beneficial for other constraint solving domains.

preprint2012arXiv

On When and How to use SAT to Mine Frequent Itemsets

A new stream of research was born in the last decade with the goal of mining itemsets of interest using Constraint Programming (CP). This has promoted a natural way to combine complex constraints in a highly flexible manner. Although CP state-of-the-art solutions formulate the task using Boolean variables, the few attempts to adopt propositional Satisfiability (SAT) provided an unsatisfactory performance. This work deepens the study on when and how to use SAT for the frequent itemset mining (FIM) problem by defining different encodings with multiple task-driven enumeration options and search strategies. Although for the majority of the scenarios SAT-based solutions appear to be non-competitive with CP peers, results show a variety of interesting cases where SAT encodings are the best option.

preprint2011arXiv

Optimally Solving the MCM Problem Using Pseudo-Boolean Satisfiability

In this report, we describe three encodings of the multiple constant multiplication (MCM) problem to pseudo-boolean satisfiability (PBS), and introduce an algorithm to solve the MCM problem optimally. To the best of our knowledge, the proposed encodings and the optimization algorithm are the first formalization of the MCM problem in a PBS manner. This report evaluates the complexity of the problem size and the performance of several PBS solvers over three encodings.

Vasco Manquinho

What is connected

Connect this record

See the researcher in context

Building this map preview

10 published item(s)

C-Pack of IPAs: A C90 Program Benchmark of Introductory Programming Assignments

Exact and approximate determination of the Pareto set using minimal correction subsets

New Core-Guided and Hitting Set Algorithms for Multi-Objective Combinatorial Optimization

SOAR: A Synthesis Approach for Data Science API Refactoring

DistMS: A Non-Portfolio Distributed Solver for Maximum Satisfiability

Exploiting Resolution-based Representations for MaxSAT Solving

Generalized Totalizer Encoding for Pseudo-Boolean Constraints

Incremental Cardinality Constraints for MaxSAT

On When and How to use SAT to Mine Frequent Itemsets

Optimally Solving the MCM Problem Using Pseudo-Boolean Satisfiability