Researcher profile

Jef Wijsen

Jef Wijsen contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2022arXiv

A Dichotomy in Consistent Query Answering for Primary Keys and Unary Foreign Keys

Since 2005, significant progress has been made in the problem of Consistent Query Answering (CQA) with respect to primary keys. In this problem, the input is a database instance that may violate one or more primary key constraints. A repair is defined as a maximal subinstance that satisfies all primary keys. Given a Boolean query q, the question then is whether q holds true in every repair. So far, theoretical research in this field has not addressed the combination of primary key and foreign key constraints, despite the importance of referential integrity in database systems. This paper addresses the problem of CQA with respect to both primary keys and foreign keys. In this setting, it is natural to adopt the notion of symmetric-difference repairs, because foreign keys can be repaired by inserting new tuples. We consider the case where foreign keys are unary, and queries are conjunctive queries without self-joins. In this setting, we characterize the boundary between those CQA problems that admit a consistent first-order rewriting, and those that do not.

preprint2022arXiv

Computing H-Partitions in ASP and Datalog

A $H$-partition of a finite undirected simple graph $G$ is a labeling of $G$'s vertices such that the constraints expressed by the model graph $H$ are satisfied. For every model graph $H$, it can be decided in non-deterministic polynomial time whether a given input graph $G$ admits a $H$-partition. Moreover, it has been shown by Dantas et al. that for most model graphs, this decision problem is in deterministic polynomial time. In this paper, we show that these polynomial-time algorithms for finding $H$-partitions can be expressed in Datalog with stratified negation. Moreover, using the answer set solver Clingo, we have conducted experiments to compare straightforward guess-and-check programs with Datalog programs. Our experiments indicate that in Clingo, guess-and-check programs run faster than their equivalent Datalog programs.

preprint2022arXiv

LinCQA: Faster Consistent Query Answering with Linear Time Guarantees

Most data analytical pipelines often encounter the problem of querying inconsistent data that violate pre-determined integrity constraints. Data cleaning is an extensively studied paradigm that singles out a consistent repair of the inconsistent data. Consistent query answering (CQA) is an alternative approach to data cleaning that asks for all tuples guaranteed to be returned by a given query on all (in most cases, exponentially many) repairs of the inconsistent data. This paper identifies a class of acyclic select-project-join (SPJ) queries for which CQA can be solved via SQL rewriting with a linear time guarantee. Our rewriting method can be viewed as a generalization of Yannakakis's algorithm for acyclic joins to the inconsistent setting. We present LinCQA, a system that can output rewritings in both SQL and non-recursive Datalog rules for every query in this class. We show that LinCQA often outperforms the existing CQA systems on both synthetic and real-world workloads, and in some cases, by orders of magnitude.

preprint2013arXiv

Charting the Tractability Frontier of Certain Conjunctive Query Answering

An uncertain database is defined as a relational database in which primary keys need not be satisfied. A repair (or possible world) of such database is obtained by selecting a maximal number of tuples without ever selecting two distinct tuples with the same primary key value. For a Boolean query q, the decision problem CERTAINTY(q) takes as input an uncertain database db and asks whether q is satisfied by every repair of db. Our main focus is on acyclic Boolean conjunctive queries without self-join. Previous work has introduced the notion of (directed) attack graph of such queries, and has proved that CERTAINTY(q) is first-order expressible if and only if the attack graph of q is acyclic. The current paper investigates the boundary between tractability and intractability of CERTAINTY(q). We first classify cycles in attack graphs as either weak or strong, and then prove among others the following. If the attack graph of a query q contains a strong cycle, then CERTAINTY(q) is coNP-complete. If the attack graph of q contains no strong cycle and every weak cycle of it is terminal (i.e., no edge leads from a vertex in the cycle to a vertex outside the cycle), then CERTAINTY(q) is in P. We then partially address the only remaining open case, i.e., when the attack graph contains some nonterminal cycle and no strong cycle. Finally, we establish a relationship between the complexities of CERTAINTY(q) and evaluating q on probabilistic databases.