Source author record

Nicole Schweikardt

Nicole Schweikardt appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Logic in Computer Science Databases Artificial Intelligence Computational Complexity Data Structures and Algorithms Formal Languages and Automata Theory Machine Learning

Catalog footprint

What is connected

9works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Structural Indexing of Relational Databases for the Evaluation of Free-Connex Acyclic Conjunctive Queries

We present an index structure to boost the evaluation of free-connex acyclic conjunctive queries (fc-ACQs) over relational databases. The main ingredient of the index associated with a given database $D$ is an auxiliary database $D_{col}$. Our main result states that for any fc-ACQ $Q$ over $D$, we can count the number of answers of $Q$ or enumerate them with constant delay after a preprocessing phase that takes time linear in the size of $D_{col}$. Unlike previous indexing methods based on values or order (e.g., B+ trees), our index is based on structural symmetries among tuples in a database, and the size of $D_{col}$ is related to the number of colors assigned to $D$ by Scheidt and Schweikardt's "relational color refinement" (2025). In the particular case of graphs, this coincides with the minimal size of an equitable partition of the graph. For example, the size of $D_{col}$ is logarithmic in the case of binary trees and constant for regular graphs. Even in the worst-case that $D$ has no structural symmetries among tuples at all, the size of $D_{col}$ is still linear in the size of $D$. Given that the size of $D_{col}$ is bounded by the size of $D$ and can be much smaller (even constant for some families of databases), our index is the first foundational result on indexing internal structural symmetries of a database to evaluate all fc-ACQs with performance potentially strictly smaller than the database size.

preprint2026arXiv

Using Color Refinement to Boost Enumeration and Counting for Acyclic CQs of Binary Schemas

We present an index structure, called the color-index, to boost the evaluation of acyclic conjunctive queries (ACQs) over binary schemas. The color-index is based on the color refinement algorithm, a widely used subroutine for graph isomorphism testing algorithms. Given a database $D$, we use a suitable version of the color refinement algorithm to produce a stable coloring of $D$, an assignment from the active domain of $D$ to a set of colors $C_D$. The main ingredient of the color-index is a particular database $D_c$ whose active domain is $C_D$ and whose size is at most $|D|$. Using the color-index, we can evaluate any free-connex ACQ $Q$ over $D$ with preprocessing time $O(|Q| \cdot |D_c|)$ and constant delay enumeration. Furthermore, we can also count the number of results of $Q$ over $D$ in time $O(|Q| \cdot |D_c|)$. Given that $|D_c|$ could be much smaller than $|D|$ (even constant-size for some families of databases), the color-index is the first index structure for evaluating free-connex ACQs that allows efficient enumeration and counting with performance that may be strictly smaller than the database size.

preprint2021arXiv

Spanner Evaluation over SLP-Compressed Documents

We consider the problem of evaluating regular spanners over compressed documents, i.e., we wish to solve evaluation tasks directly on the compressed data, without decompression. As compressed forms of the documents we use straight-line programs (SLPs) -- a lossless compression scheme for textual data widely used in different areas of theoretical computer science and particularly well-suited for algorithmics on compressed data. In terms of data complexity, our results are as follows. For a regular spanner M and an SLP S that represents a document D, we can solve the tasks of model checking and of checking non-emptiness in time O(size(S)). Computing the set M(D) of all span-tuples extracted from D can be done in time O(size(S) size(M(D))), and enumeration of M(D) can be done with linear preprocessing O(size(S)) and a delay of O(depth(S)), where depth(S) is the depth of S's derivation tree. Note that size(S) can be exponentially smaller than the document's size |D|; and, due to known balancing results for SLPs, we can always assume that depth(S) = O(log(|D|)) independent of D's compressibility. Hence, our enumeration algorithm has a delay logarithmic in the size of the non-compressed data and a preprocessing time that is at best (i.e., in the case of highly compressible documents) also logarithmic, but at worst still linear. Therefore, in a big-data perspective, our enumeration algorithm for SLP-compressed documents may nevertheless beat the known linear preprocessing and constant delay algorithms for non-compressed documents.

preprint2020arXiv

Constant delay enumeration with FPT-preprocessing for conjunctive queries of bounded submodular width

Marx (STOC~2010, J.~ACM 2013) introduced the notion of submodular width of a conjunctive query (CQ) and showed that for any class $Φ$ of Boolean CQs of bounded submodular width, the model-checking problem for $Φ$ on the class of all finite structures is fixed-parameter tractable (FPT). Note that for non-Boolean queries, the size of the query result may be far too large to be computed entirely within FPT time. We investigate the free-connex variant of submodular width and generalise Marx's result to non-Boolean queries as follows: For every class $Φ$ of CQs of bounded free-connex submodular width, within FPT-preprocessing time we can build a data structure that allows to enumerate, without repetition and with constant delay, all tuples of the query result. Our proof builds upon Marx's splitting routine to decompose the query result into a union of results; but we have to tackle the additional technical difficulty to ensure that these can be enumerated efficiently.

preprint2020arXiv

Learning Concepts Described by Weight Aggregation Logic

We consider weighted structures, which extend ordinary relational structures by assigning weights, i.e. elements from a particular group or ring, to tuples present in the structure. We introduce an extension of first-order logic that allows to aggregate weights of tuples, compare such aggregates, and use them to build more complex formulas. We provide locality properties of fragments of this logic including Feferman-Vaught decompositions and a Gaifman normal form for a fragment called FOW1, as well as a localisation theorem for a larger fragment called FOWA1. This fragment can express concepts from various machine learning scenarios. Using the locality properties, we show that concepts definable in FOWA1 over a weighted background structure of at most polylogarithmic degree are agnostically PAC-learnable in polylogarithmic time after pseudo-linear time preprocessing.

preprint2016arXiv

Monadic Datalog Containment on Trees Using the Descendant-Axis

In their AMW14-paper, Frochaux, Grohe, and Schweikardt showed that the query containment problem for monadic datalog on finite unranked labeled trees is Exptime-complete when (a) considering unordered trees using the child-axis, and when (b) considering ordered trees using the axes firstchild, nextsibling, and child. Furthermore, when allowing to use also the descendant-axis, the query containment problem was shown to be solvable in 2-fold exponential time, but it remained open to determine the problems exact complexity in presence of the descendant-axis. The present paper closes this gap by showing that, in the presence of the descendant-axis, the problem is 2Exptime-hard.

preprint2014arXiv

Monadic Datalog Containment on Trees

We show that the query containment problem for monadic datalog on finite unranked labeled trees can be solved in 2-fold exponential time when (a) considering unordered trees using the axes child and descendant, and when (b) considering ordered trees using the axes firstchild, nextsibling, child, and descendant. When omitting the descendant-axis, we obtain that in both cases the problem is EXPTIME-complete.

preprint2013arXiv

A note on monadic datalog on unranked trees

In the article 'Recursive queries on trees and data trees' (ICDT'13), Abiteboul et al., asked whether the containment problem for monadic datalog over unordered unranked labeled trees using the child relation and the descendant relation is decidable. This note gives a positive answer to this question, as well as an overview of the relative expressive power of monadic datalog on various representations of unranked trees.

preprint2011arXiv

A note on the expressive power of linear orders

This article shows that there exist two particular linear orders such that first-order logic with these two linear orders has the same expressive power as first-order logic with the Bit-predicate FO(Bit). As a corollary we obtain that there also exists a built-in permutation such that first-order logic with a linear order and this permutation is as expressive as FO(Bit).