Researcher profile

Miika Hannula

Miika Hannula contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
7 works
0 followers
4 topics
4 close collaborators


Published work

7 published items

Preprint · 2022 · arXiv

A Dichotomy in Consistent Query Answering for Primary Keys and Unary Foreign Keys

Since 2005, significant progress has been made in the problem of Consistent Query Answering (CQA) with respect to primary keys. In this problem, the input is a database instance that may violate one or more primary key constraints. A repair is defined as a maximal subinstance that satisfies all primary keys. Given a Boolean query q, the question then is whether q holds true in every repair. So far, theoretical research in this field has not addressed the combination of primary key and foreign key constraints, despite the importance of referential integrity in database systems. This paper addresses the problem of CQA with respect to both primary keys and foreign keys. In this setting, it is natural to adopt the notion of symmetric-difference repairs, because foreign keys can be repaired by inserting new tuples. We consider the case where foreign keys are unary, and queries are conjunctive queries without self-joins. In this setting, we characterize the boundary between those CQA problems that admit a consistent first-order rewriting, and those that do not.
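The repair semantics described in this abstract can be made concrete with a small brute-force sketch. It covers only the classical primary-key case (a repair keeps exactly one tuple per key value), not the foreign keys or symmetric-difference repairs studied in the paper. The single-relation layout, the `key_len` parameter, and the function names are assumptions made for illustration, and the enumeration is exponential in the number of conflicting key groups.

```python
from collections import defaultdict
from itertools import product


def repairs(rows, key_len):
    """Yield every repair of one relation whose primary key is its first key_len columns.

    A repair is a maximal consistent subinstance, so it keeps exactly one
    row from each group of rows that share the same key value.
    """
    blocks = defaultdict(list)
    for row in set(rows):
        blocks[row[:key_len]].append(row)
    for choice in product(*blocks.values()):
        yield set(choice)


def certain(rows, key_len, query):
    """Return True if the Boolean query holds in every repair."""
    return all(query(repair) for repair in repairs(rows, key_len))


# Hypothetical relation lives_in(name, city) with primary key `name`.
lives_in = [("ann", "Paris"), ("ann", "Rome"), ("bob", "Paris")]

# "Somebody lives in Paris" holds in both repairs.
print(certain(lives_in, 1, lambda r: any(city == "Paris" for _, city in r)))  # True
# "ann lives in Paris" fails in the repair that keeps ("ann", "Rome").
print(certain(lives_in, 1, lambda r: ("ann", "Paris") in r))  # False
```

A consistent first-order rewriting, where one exists, avoids this enumeration by answering the same question with a single query on the inconsistent instance.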

Preprint · 2021 · arXiv

An Algorithm for the Discovery of Independence from Data

For years, independence has been considered an important concept in many disciplines. Nevertheless, we present the first research that investigates the discovery problem of independence in data. In its arguably simplest form, independence is a statement between two sets of columns expressing that for every two rows in a table there is also a row in the table that coincides with the first row on the first set of columns and with the second row on the second set of columns. We show that the problem of deciding whether there is an independence statement that holds on a given table is not only NP-complete but $W[3]$-complete in its arguably most natural parameter, namely its arity. We establish the first algorithm to discover all independence statements that hold on a given table. We illustrate in experiments with benchmark data that our algorithm performs well within the limits established by our hardness results. In practice, it is often useful to determine the ratio with which independence statements hold on a given table. For that purpose, we show that our treatment of independence and the design of our algorithm enable us to extend our findings to approximate independence. In our final experiments, we provide some insight into the trade-off between run time and the approximation ratio. Naturally, the smaller the ratio, the more approximate independence statements hold, and the more time it takes to discover all of them. While this research establishes first insight into the computational properties of discovering independence from data, we hope to initiate research into more sophisticated notions of independence, including embedded multivalued dependencies, as well as their context-specific and probabilistic variants.
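The definition of an independence statement given in this abstract translates directly into a check on a table. The sketch below is a naive quadratic test of a single statement, written for illustration only. It is not the discovery algorithm from the paper, and the function names and the representation of rows as tuples with column indices are assumptions.

```python
def project(row, columns):
    """Restrict a row (a tuple) to the given column indices."""
    return tuple(row[c] for c in columns)


def holds_independence(rows, xs, ys):
    """Check whether the column sets xs and ys are independent in the table.

    For every two rows r1, r2 there must be a row that agrees with r1 on xs
    and with r2 on ys.
    """
    seen = {(project(r, xs), project(r, ys)) for r in rows}
    return all(
        (project(r1, xs), project(r2, ys)) in seen
        for r1 in rows
        for r2 in rows
    )


# All four combinations of the two columns occur, so column 0 is independent of column 1.
full = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(holds_independence(full, [0], [1]))  # True

# Dropping (1, 1) breaks it: rows (1, 0) and (0, 1) have no witness row.
print(holds_independence(full[:3], [0], [1]))  # False
```

Discovering all statements that hold would require searching over pairs of column sets, which is where the hardness results mentioned in the abstract apply.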

Preprint · 2021 · arXiv

Controlling Entity Integrity with Key Sets

Codd's rule of entity integrity stipulates that every table has a primary key. Hence, the attributes of the primary key carry unique and complete value combinations. In practice, data cannot always meet such requirements. Previous work proposed the superior notion of key sets for controlling entity integrity. We establish a linear-time algorithm for validating whether a given key set holds on a given data set, and demonstrate its efficiency on real-world data. We establish a binary axiomatization for the associated implication problem, and prove its coNP-completeness. However, the implication of unary by arbitrary key sets has better properties. The fragment enjoys a unary axiomatization and is decidable in quadratic time. Hence, we can minimize overheads before validating key sets. While perfect models do not always exist in general, we show how to compute them for any instance of our fragment. This provides computational support towards the acquisition of key sets.

Preprint · 2021 · arXiv

On the Interaction of Functional and Inclusion Dependencies with Independence Atoms

Infamously, the finite and unrestricted implication problems for the classes of i) functional and inclusion dependencies together, and ii) embedded multivalued dependencies alone are each undecidable. Famously, the restriction of i) to functional and unary inclusion dependencies in combination with the restriction of ii) to multivalued dependencies yield implication problems that are still different in the finite and unrestricted case, but each are finitely axiomatizable and decidable in low-degree polynomial time. An important embedded tractable fragment of embedded multivalued dependencies are independence atoms that stipulate independence between two attribute sets. We establish a series of results for implication problems over subclasses of the combined class of functional and inclusion dependencies as well as independence atoms. One of our main results is that both finite and unrestricted implication problems for the combined class of independence atoms, unary functional and unary inclusion dependencies are axiomatizable and decidable in low-degree polynomial time.

Preprint · 2020 · arXiv

Descriptive complexity of real computation and probabilistic independence logic

We introduce a novel variant of BSS machines called Separate Branching BSS machines (S-BSS in short) and develop a Fagin-type logical characterisation for languages decidable in non-deterministic polynomial time by S-BSS machines. We show that NP on S-BSS machines is strictly included in NP on BSS machines and that every NP language on S-BSS machines is a countable union of closed sets in the usual topology of $\mathbb{R}^n$. Moreover, we establish that on Boolean inputs NP on S-BSS machines without real constants characterises a natural fragment of the complexity class $\exists\mathbb{R}$ (a class of problems polynomial time reducible to the true existential theory of the reals) and hence lies between NP and PSPACE. Finally we apply our results to determine the data complexity of probabilistic independence logic.

Preprint · 2020 · arXiv

On the Complexity of Horn and Krom Fragments of Second-Order Boolean Logic

Second-order Boolean logic is a generalization of QBF, whose constant alternation fragments are known to be complete for the levels of the exponential time hierarchy. We consider two types of restriction of this logic: 1) restrictions to term constructions, 2) restrictions to the form of the Boolean matrix. Of the first sort, we consider two kinds of restrictions: firstly, disallowing nested use of proper function variables, and secondly stipulating that each function variable must appear with a fixed sequence of arguments. Of the second sort, we consider Horn, Krom, and core fragments of the Boolean matrix. We classify the complexity of logics obtained by combining these two types of restrictions. We show that, in most cases, logics with k alternating blocks of function quantifiers are complete for the kth or (k-1)th level of the exponential time hierarchy. Furthermore, we establish NL-completeness for the Krom and core fragments, when k=1 and both restrictions of the first sort are in effect.

Preprint · 2020 · arXiv

Polyteam Semantics

Team semantics is the mathematical framework of modern logics of dependence and independence in which formulae are interpreted by sets of assignments (teams) instead of single assignments as in first-order logic. In order to deepen the fruitful interplay between team semantics and database dependency theory, we define "Polyteam Semantics" in which formulae are evaluated over a family of teams. We begin by defining a novel polyteam variant of dependence atoms and give a finite axiomatisation for the associated implication problem. We relate polyteam semantics to team semantics and investigate in which cases logics over the former can be simulated by logics over the latter. We also characterise the expressive power of poly-dependence logic by properties of polyteams that are downwards closed and definable in existential second-order logic (ESO). The analogous result is shown to hold for poly-independence logic and all ESO-definable properties. We also relate poly-inclusion logic to greatest fixed point logic.
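To illustrate what it means to interpret a formula over a set of assignments rather than a single one, the sketch below evaluates an ordinary dependence atom on one team, using the standard condition that any two assignments agreeing on the determining variables also agree on the determined variable. This is the single-team notion, not the polyteam variant introduced in the paper. Representing a team as a list of dictionaries and the function name are assumptions made for illustration.

```python
def satisfies_dependence(team, xs, y):
    """Check the dependence atom =(xs, y) on a team.

    The team is a collection of assignments (dicts from variable names to
    values). The atom holds if the values of xs determine the value of y
    across the whole team.
    """
    determined = {}
    for assignment in team:
        key = tuple(assignment[x] for x in xs)
        # Record the first y-value seen for this key and compare later ones to it.
        if determined.setdefault(key, assignment[y]) != assignment[y]:
            return False
    return True


team = [
    {"x": 0, "y": 1},
    {"x": 0, "y": 1},
    {"x": 1, "y": 2},
]
print(satisfies_dependence(team, ["x"], "y"))  # True

# Adding an assignment with x = 1 but a different y violates the atom.
print(satisfies_dependence(team + [{"x": 1, "y": 3}], ["x"], "y"))  # False
```

Read as a database table, the team satisfies the atom exactly when the functional dependency from the columns xs to the column y holds, which is the link to dependency theory that the abstract refers to.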