Martin Grohe

Martin Grohe appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Logic in Computer Science, Discrete Mathematics, Computational Complexity, Data Structures and Algorithms, Databases, math.CO, Artificial Intelligence, Machine Learning, math.LO, math.OC, Neural and Evolutionary Computing

Catalog footprint

What is connected

37 works

11 topics

4 close collaborators

preprint, 2026, arXiv

Query Languages for Machine-Learning Models

In this paper, I discuss two logics for weighted finite structures: first-order logic with summation (FO(SUM)) and its recursive extension IFP(SUM). These logics originate from foundational work by Grädel, Gurevich, and Meer in the 1990s. In recent joint work with Standke, Steegmans, and Van den Bussche, we have investigated these logics as query languages for machine learning models, specifically neural networks, which are naturally represented as weighted graphs. I present illustrative examples of queries to neural networks that can be expressed in these logics and discuss fundamental results on their expressiveness and computational complexity.
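
As a rough illustration of the weighted-graph view, the sketch below stores a tiny feedforward network as a dictionary of edge weights and evaluates a node by summing over its incoming edges, which is the kind of aggregation a summation term expresses. The dictionary layout, the node names, the function name `evaluate`, and the use of ReLU at every non-input node are assumptions made for this example; they are not taken from the paper.

```python
def evaluate(weights, bias, inputs, node, cache=None):
    """Value of `node` in a feedforward network stored as a weighted graph.

    weights: dict mapping (source, target) -> edge weight
    bias:    dict mapping non-input node -> bias
    inputs:  dict mapping input node -> input value
    """
    if cache is None:
        cache = {}
    if node in inputs:
        return inputs[node]
    if node not in cache:
        # Weighted sum over all incoming edges of `node`.
        incoming = sum(
            w * evaluate(weights, bias, inputs, source, cache)
            for (source, target), w in weights.items()
            if target == node
        )
        # ReLU activation (an assumption of this sketch).
        cache[node] = max(0.0, bias.get(node, 0.0) + incoming)
    return cache[node]


# Hypothetical network: two inputs, two hidden nodes, one output.
weights = {
    ("x1", "h"): 1.0, ("x2", "h"): -1.0,
    ("x1", "g"): 2.0, ("x2", "g"): 1.0,
    ("h", "out"): 3.0, ("g", "out"): 0.5,
}
bias = {"h": 0.5, "g": 0.0, "out": 1.0}
inputs = {"x1": 1.0, "x2": 2.0}

print(evaluate(weights, bias, inputs, "out"))
```

The recursion follows the edges back to the input nodes, so the same function works for any depth as long as the graph is acyclic.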

preprint, 2026, arXiv

Recursive querying of neural networks via weighted structures

Expressive querying of machine learning models - viewed as a form of intentional data - enables their verification and interpretation using declarative languages, thereby making learned representations of data more accessible. Motivated by the querying of feedforward neural networks, we investigate logics for weighted structures. In the absence of a bound on neural network depth, such logics must incorporate recursion; thereto we revisit the functional fixpoint mechanism proposed by Grädel and Gurevich. We adopt it in a Datalog-like syntax; we extend normal forms for fixpoint logics to weighted structures; and show an equivalent "loose" fixpoint mechanism that allows values of inductively defined weight functions to be overwritten. We propose a "scalar" restriction of functional fixpoint logic, of polynomial-time data complexity, and show it can express all PTIME model-agnostic queries over reduced networks with polynomially bounded weights. In contrast, we show that very simple model-agnostic queries are already NP-complete. Finally, we consider transformations of weighted structures by iterated transductions.

preprint, 2025, arXiv

Compressing CFI Graphs and Lower Bounds for the Weisfeiler-Leman Refinements

The $k$-dimensional Weisfeiler-Leman ($k$-WL) algorithm is a simple combinatorial algorithm that was originally designed as a graph isomorphism heuristic. It naturally finds applications in Babai's quasipolynomial time isomorphism algorithm, practical isomorphism solvers, and algebraic graph theory. However, it also has surprising connections to other areas such as logic, proof complexity, combinatorial optimization, and machine learning. The algorithm iteratively computes a coloring of the $k$-tuples of vertices of a graph. Since Fürer's linear lower bound [ICALP 2001], it has been an open question whether there is a super-linear lower bound for the iteration number for $k$-WL on graphs. We answer this question affirmatively, establishing an $\Omega(n^{k/2})$ lower bound for all $k$.

preprint, 2022, arXiv

Generative Datalog with Continuous Distributions

Arguing for the need to combine declarative and probabilistic programming, Bárány et al. (TODS 2017) recently introduced a probabilistic extension of Datalog as a "purely declarative probabilistic programming language." We revisit this language and propose a more principled approach towards defining its semantics based on stochastic kernels and Markov processes - standard notions from probability theory. This allows us to extend the semantics to continuous probability distributions, thereby settling an open problem posed by Bárány et al. We show that our semantics is fairly robust, allowing both parallel execution and arbitrary chase orders when evaluating a program. We cast our semantics in the framework of infinite probabilistic databases (Grohe and Lindner, ICDT 2020), and show that the semantics remains meaningful even when the input of a probabilistic Datalog program is an arbitrary probabilistic database.

preprint, 2022, arXiv

Graph Similarity Based on Matrix Norms

Quantifying the similarity between two graphs is a fundamental algorithmic problem at the heart of many data analysis tasks for graph-based data. In this paper, we study the computational complexity of a family of similarity measures based on quantifying the mismatch between the two graphs, that is, the "symmetric difference" of the graphs under an optimal alignment of the vertices. An important example is similarity based on graph edit distance. While edit distance calculates the "global" mismatch, that is, the number of edges in the symmetric difference, our main focus is on "local" measures calculating the maximum mismatch per vertex. Mathematically, our similarity measures are best expressed in terms of the adjacency matrices: the mismatch between graphs is expressed as the difference of their adjacency matrices (under an optimal alignment), and we measure it by applying some matrix norm. Roughly speaking, global measures like graph edit distance correspond to entrywise matrix norms like the Frobenius norm and local measures correspond to operator norms like the spectral norm. We prove a number of strong NP-hardness and inapproximability results even for very restricted graph classes such as bounded-degree trees.
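
To make the global versus local distinction concrete, here is a minimal brute-force sketch that tries every vertex alignment of two small graphs and measures the difference of their adjacency matrices. It assumes both graphs have the same number of vertices. It uses half the squared Frobenius norm as the global measure, which equals the number of edges in the symmetric difference, and the maximum absolute row sum (an operator norm, chosen here instead of the spectral norm to avoid numerical libraries) as the local measure. All function names are hypothetical, and the enumeration is only feasible for tiny graphs.

```python
from itertools import permutations


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph on 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1
    return a


def difference(a, b, perm):
    """Entrywise difference of a and b after aligning vertex i with perm[i]."""
    n = len(a)
    return [[a[i][j] - b[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def global_mismatch(d):
    """Half the squared Frobenius norm: edges in the symmetric difference."""
    return sum(x * x for row in d for x in row) // 2


def local_mismatch(d):
    """Maximum absolute row sum: largest number of mismatched edges at a vertex."""
    return max(sum(abs(x) for x in row) for row in d)


def best_alignment(a, b, measure):
    """Minimum of `measure` over all vertex alignments (brute force)."""
    n = len(a)
    return min(measure(difference(a, b, p)) for p in permutations(range(n)))


path = adjacency(4, [(0, 1), (1, 2), (2, 3)])
star = adjacency(4, [(0, 1), (0, 2), (0, 3)])
print(best_alignment(path, star, global_mismatch))  # 2
print(best_alignment(path, star, local_mismatch))   # 2
```

The factorial number of alignments is the reason this is only a toy; the hardness results in the abstract concern exactly this optimisation over alignments.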

preprint, 2022, arXiv

Independence in Infinite Probabilistic Databases

Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as integers, strings, or real numbers, it is often more natural to view PDBs as infinite probability spaces over database instances. In this paper, we lay the mathematical foundations of infinite probabilistic databases. Our focus then is on independence assumptions. Tuple-independent PDBs play a central role in theory and practice of PDBs. Here, we study infinite tuple-independent PDBs as well as related models such as infinite block-independent disjoint PDBs. While the standard model of PDBs focuses on a set-based semantics, we also study tuple-independent PDBs with a bag semantics and independence in PDBs over uncountable fact spaces. We also propose a new approach to PDBs with an open-world assumption, addressing issues raised by Ceylan et al. (Proc. KR 2016) and generalizing their work, which is still rooted in finite tuple-independent PDBs. Moreover, for countable PDBs we propose an approximate query answering algorithm.

preprint, 2022, arXiv

One Model, Any CSP: Graph Neural Networks as Fast Global Search Heuristics for Constraint Satisfaction

We propose a universal Graph Neural Network architecture which can be trained as an end-to-end search heuristic for any Constraint Satisfaction Problem (CSP). Our architecture can be trained unsupervised with policy gradient descent to generate problem-specific heuristics for any CSP in a purely data-driven manner. The approach is based on a novel graph representation for CSPs that is both generic and compact and enables us to process every possible CSP instance with one GNN, regardless of constraint arity, relations or domain size. Unlike previous RL-based methods, we operate on a global search action space and allow our GNN to modify any number of variables in every step of the stochastic search. This enables our method to properly leverage the inherent parallelism of GNNs. We perform a thorough empirical evaluation where we learn heuristics for well-known and important CSPs from random data, including graph coloring, MaxCut, 3-SAT and MAX-k-SAT. Our approach outperforms prior approaches for neural combinatorial optimization by a substantial margin. It can compete with, and even improve upon, conventional search heuristics on test instances that are several orders of magnitude larger and structurally more complex than those seen during training.

preprint, 2022, arXiv

The Logic of Graph Neural Networks

Graph neural networks (GNNs) are deep learning architectures for machine learning problems on graphs. It has recently been shown that the expressiveness of GNNs can be characterised precisely by the combinatorial Weisfeiler-Leman algorithms and by finite variable counting logics. The correspondence has even led to new, higher-order GNNs corresponding to the WL algorithm in higher dimensions. The purpose of this paper is to explain these descriptive characterisations of GNNs.

preprint, 2022, arXiv

Tuple-Independent Representations of Infinite Probabilistic Databases

Probabilistic databases (PDBs) are probability spaces over database instances. They provide a framework for handling uncertainty in databases, as occurs due to data integration, noisy data, data from unreliable sources or randomized processes. Most of the existing theory literature investigated finite, tuple-independent PDBs (TI-PDBs) where the occurrences of tuples are independent events. Only recently, Grohe and Lindner (PODS '19) introduced independence assumptions for PDBs beyond the finite domain assumption. In the finite setting, a major argument for discussing the theoretical properties of TI-PDBs is that they can be used to represent any finite PDB via views. This is no longer the case once the number of tuples is countably infinite. In this paper, we systematically study the representability of infinite PDBs in terms of TI-PDBs and the related block-independent disjoint PDBs. The central question is which infinite PDBs are representable as first-order views over tuple-independent PDBs. We give a necessary condition for the representability of PDBs and provide a sufficient criterion for representability in terms of the probability distribution of a PDB. With various examples, we explore the limits of our criteria. We show that conditioning on first-order properties yields no additional power in terms of expressivity. Finally, we discuss the relation between purely logical and arithmetic reasons for (non-)representability.

preprint, 2021, arXiv

Probabilistic Data with Continuous Distributions

Statistical models of real world data typically involve continuous probability distributions such as normal, Laplace, or exponential distributions. Such distributions are supported by many probabilistic modelling formalisms, including probabilistic database systems. Yet, the traditional theoretical framework of probabilistic databases focusses entirely on finite probabilistic databases. Only recently, we set out to develop the mathematical theory of infinite probabilistic databases. The present paper is an exposition of two recent papers which are cornerstones of this theory. In (Grohe, Lindner; ICDT 2020) we propose a very general framework for probabilistic databases, possibly involving continuous probability distributions, and show that queries have a well-defined semantics in this framework. In (Grohe, Kaminski, Katoen, Lindner; PODS 2020) we extend the declarative probabilistic programming language Generative Datalog, proposed by (Bárány et al. 2017) for discrete probability distributions, to continuous probability distributions and show that such programs yield generative models of continuous probabilistic databases.

preprint, 2021, arXiv

Recent Advances on the Graph Isomorphism Problem

We give an overview of recent advances on the graph isomorphism problem. Our main focus will be on Babai's quasi-polynomial time isomorphism test and subsequent developments that led to the design of isomorphism algorithms with a quasi-polynomial parameterized running time of the form $n^{\text{polylog}(k)}$, where $k$ is a graph parameter such as the maximum degree. A second focus will be the combinatorial Weisfeiler-Leman algorithm.

preprint, 2020, arXiv

Counting Bounded Tree Depth Homomorphisms

We prove that graphs G, G' satisfy the same sentences of first-order logic with counting of quantifier rank at most k if and only if they are homomorphism-indistinguishable over the class of all graphs of tree depth at most k. Here G, G' are homomorphism-indistinguishable over a class C of graphs if for each graph F in C, the number of homomorphisms from F to G equals the number of homomorphisms from F to G'.
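
The definition of homomorphism indistinguishability can be tried out with a small brute-force counter. The sketch below is illustrative only: the function name and the choice of test graphs are assumptions, and the enumeration is exponential in the size of F. It compares a six-cycle with two disjoint triangles, using a star with two leaves (tree depth 2) and a triangle (tree depth 3) as pattern graphs.

```python
from itertools import product


def count_homomorphisms(f_vertices, f_edges, g_vertices, g_edges):
    """Number of maps V(F) -> V(G) that send every edge of F to an edge of G."""
    # Store both orientations so that undirected edges can be looked up directly.
    g_adjacent = set(g_edges) | {(v, u) for u, v in g_edges}
    count = 0
    for image in product(g_vertices, repeat=len(f_vertices)):
        h = dict(zip(f_vertices, image))
        if all((h[u], h[v]) in g_adjacent for u, v in f_edges):
            count += 1
    return count


vertices = list(range(6))
six_cycle = [(i, (i + 1) % 6) for i in range(6)]
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]

star = ([0, 1, 2], [(0, 1), (0, 2)])              # tree depth 2
triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])  # tree depth 3

print(count_homomorphisms(*star, vertices, six_cycle))          # 24
print(count_homomorphisms(*star, vertices, two_triangles))      # 24
print(count_homomorphisms(*triangle, vertices, six_cycle))      # 0
print(count_homomorphisms(*triangle, vertices, two_triangles))  # 12
```

The two graphs agree on the star but are separated by the triangle, a pattern of tree depth 3.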

preprint, 2020, arXiv

Deep Weisfeiler Leman

We introduce the framework of Deep Weisfeiler Leman algorithms (DeepWL), which allows the design of purely combinatorial graph isomorphism tests that are more powerful than the well-known Weisfeiler-Leman algorithm. We prove that, as an abstract computational model, polynomial time DeepWL-algorithms have exactly the same expressiveness as the logic Choiceless Polynomial Time (with counting) introduced by Blass, Gurevich, and Shelah (Ann. Pure Appl. Logic, 1999). It is a well-known open question whether the existence of a polynomial time graph isomorphism test implies the existence of a polynomial time canonisation algorithm. Our main technical result states that for each class of graphs (satisfying some mild closure condition), if there is a polynomial time DeepWL isomorphism test then there is a polynomial time canonisation algorithm for this class. This implies that there is also a logic capturing polynomial time on this class.

preprint, 2020, arXiv

Graph Neural Networks for Maximum Constraint Satisfaction

Many combinatorial optimization problems can be phrased in the language of constraint satisfaction problems. We introduce a graph neural network architecture for solving such optimization problems. The architecture is generic; it works for all binary constraint satisfaction problems. Training is unsupervised, and it is sufficient to train on relatively small instances; the resulting networks perform well on much larger instances (at least 10-times larger). We experimentally evaluate our approach for a variety of problems, including Maximum Cut and Maximum Independent Set. Despite being generic, we show that our approach matches or surpasses most greedy and semi-definite programming based algorithms and sometimes even outperforms state-of-the-art heuristics for the specific problems.

preprint, 2020, arXiv

Infinite Probabilistic Databases

Probabilistic databases (PDBs) are used to model uncertainty in data in a quantitative way. In the standard formal framework, PDBs are finite probability spaces over relational database instances. It has been argued convincingly that this is not compatible with an open world semantics (Ceylan et al., KR 2016) and with application scenarios that are modeled by continuous probability distributions (Dalvi et al., CACM 2009). We recently introduced a model of PDBs as infinite probability spaces that addresses these issues (Grohe and Lindner, PODS 2019). While that work was mainly concerned with countably infinite probability spaces, our focus here is on uncountable spaces. Such an extension is necessary to model typical continuous probability distributions that appear in many applications. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics. It turns out that so-called finite point processes are the appropriate model from probability theory for dealing with probabilistic databases. This model allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and datalog queries.

preprint, 2020, arXiv

word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings of Structured Data

Vector representations of graphs and relational structures, whether hand-crafted feature vectors or learned representations, enable us to apply standard data analysis and machine learning techniques to the structures. A wide range of methods for generating such embeddings have been studied in the machine learning and knowledge representation literature. However, vector embeddings have received relatively little attention from a theoretical point of view. Starting with a survey of embedding techniques that have been used in practice, in this paper we propose two theoretical approaches that we see as central for understanding the foundations of vector embeddings. We draw connections between the various approaches and suggest directions for future research.

preprint, 2016, arXiv

Computing with Tangles

Tangles of graphs were introduced by Robertson and Seymour in the context of their graph minor theory. Tangles may be viewed as describing "k-connected components" of a graph (though in a twisted way). They play an important role in graph minor theory. An interesting aspect of tangles is that they can be defined not only for graphs, but more generally for arbitrary connectivity functions (that is, integer-valued submodular and symmetric set functions). However, tangles are difficult to deal with algorithmically. To start with, it is unclear how to represent them, because they are families of separations and as such may be exponentially large. Our first contribution is a data structure for representing and accessing all tangles of a graph up to some fixed order. Using this data structure, we can prove an algorithmic version of a very general structure theorem due to Carmesin, Diestel, Hamann and Hundertmark (for graphs) and Hundertmark (for arbitrary connectivity functions) that yields a canonical tree decomposition whose parts correspond to the maximal tangles. (This may be viewed as a generalisation of the decomposition of a graph into its 3-connected components.)

preprint, 2016, arXiv

Linear Diophantine Equations, Group CSPs, and Graph Isomorphism

In recent years, we have seen several approaches to the graph isomorphism problem based on "generic" mathematical programming or algebraic (Gröbner basis) techniques. For most of these, lower bounds have been established. In fact, it has been shown that the pairs of nonisomorphic CFI-graphs (introduced by Cai, Fürer, and Immerman in 1992 as hard examples for the combinatorial Weisfeiler-Leman algorithm) cannot be distinguished by these mathematical algorithms. A notable exception were the algebraic algorithms over the field GF(2), for which no lower bound was known. Another, in some way even stronger, approach to graph isomorphism testing is based on solving systems of linear Diophantine equations (that is, linear equations over the integers), which is known to be possible in polynomial time. So far, no lower bounds for this approach were known. Lower bounds for the algebraic algorithms can best be proved in the framework of proof complexity, where they can be phrased as lower bounds for algebraic proof systems such as Nullstellensatz or the (more powerful) polynomial calculus. We give new hard examples for these systems: families of pairs of non-isomorphic graphs that are hard to distinguish by polynomial calculus proofs simultaneously over all prime fields, including GF(2), as well as examples that are hard to distinguish by the systems-of-linear-Diophantine-equations approach. In a previous paper, we observed that the CFI-graphs are closely related to what we call "group CSPs": constraint satisfaction problems where the constraints are membership tests in some coset of a subgroup of a cartesian power of a base group (Z_2 in the case of the classical CFI-graphs). Our new examples are also based on group CSPs (for Abelian groups), but here we extend the CSPs by a few non-group constraints to obtain even harder instances for graph isomorphism.

preprint, 2016, arXiv

Order Invariance on Decomposable Structures

Order-invariant formulas access an ordering on a structure's universe, but the model relation is independent of the ordering used. Order invariance is frequently used for logic-based approaches in computer science. Order-invariant formulas capture unordered problems of complexity classes and they model the independence of the answer to a database query from low-level aspects of databases. We study the expressive power of order-invariant monadic second-order (MSO) and first-order (FO) logic on restricted classes of structures that admit certain forms of tree decompositions (not necessarily of bounded width). While order-invariant MSO is more expressive than MSO and, even, CMSO (MSO with modulo-counting predicates), we show that order-invariant MSO and CMSO are equally expressive on graphs of bounded tree width and on planar graphs. This extends an earlier result for trees due to Courcelle. Moreover, we show that all properties definable in order-invariant FO are also definable in MSO on these classes. These results are applications of a theorem that shows how to lift definability results for order-invariant logics from the bags of a graph's tree decomposition to the graph itself.

preprint, 2016, arXiv

Quasi-4-Connected Components

We introduce a new decomposition of graphs into quasi-4-connected components, where we call a graph quasi-4-connected if it is 3-connected and it only has separations of order 3 that remove a single vertex. Moreover, we give a cubic time algorithm computing the decomposition of a given graph. Our decomposition into quasi-4-connected components refines the well-known decompositions of graphs into biconnected and triconnected components. We relate our decomposition to Robertson and Seymour's theory of tangles by establishing a correspondence between the quasi-4-connected components of a graph and its tangles of order 4.

preprint, 2016, arXiv

Tangled up in Blue (A Survey on Connectivity, Decompositions, and Tangles)

We survey an abstract theory of connectivity, based on symmetric submodular set functions. We start by developing Robertson and Seymour's fundamental duality between branch decompositions (related to the better-known tree decompositions) and so-called tangles, which may be viewed as highly connected regions in a connectivity system. We move on to studying canonical decompositions of connectivity systems into their maximal tangles. Last, but not least, we will discuss algorithmic aspects of the theory.

preprint, 2016, arXiv

Tangles and Connectivity in Graphs

This paper is a short introduction to the theory of tangles, both in graphs and general connectivity systems. An emphasis is put on the correspondence between tangles of order k and k-connected components. In particular, we prove that there is a one-to-one correspondence between the triconnected components of a graph and its tangles of order 3.

preprint, 2015, arXiv

Isomorphism Testing for Graphs of Bounded Rank Width

We give an algorithm that, for every fixed k, decides isomorphism of graphs of rank width at most k in polynomial time. As the clique width of a graph is bounded in terms of its rank width, we also obtain a polynomial time isomorphism test for graph classes of bounded clique width.

preprint, 2015, arXiv

Limitations of Algebraic Approaches to Graph Isomorphism Testing

We investigate the power of graph isomorphism algorithms based on algebraic reasoning techniques like Gröbner basis computation. The idea of these algorithms is to encode two graphs into a system of equations that are satisfiable if and only if the graphs are isomorphic, and then to (try to) decide satisfiability of the system using, for example, the Gröbner basis algorithm. In some cases this can be done in polynomial time, in particular, if the equations admit a bounded degree refutation in an algebraic proof system such as Nullstellensatz or polynomial calculus. We prove linear lower bounds on the polynomial calculus degree over all fields of characteristic different from 2 and also linear lower bounds for the degree of Positivstellensatz calculus derivations. We compare this approach to recently studied linear and semidefinite programming approaches to isomorphism testing, which are known to be related to the combinatorial Weisfeiler-Lehman algorithm. We exactly characterise the power of the Weisfeiler-Lehman algorithm in terms of an algebraic proof system that lies between degree-k Nullstellensatz and degree-k polynomial calculus.

preprint, 2015, arXiv

Pebble Games and Linear Equations

We give a new, simplified and detailed account of the correspondence between levels of the Sherali-Adams relaxation of graph isomorphism and levels of pebble-game equivalence with counting (higher-dimensional Weisfeiler-Lehman colour refinement). The correspondence between basic colour refinement and fractional isomorphism, due to Tinhofer (1986, 1991) and Ramana, Scheinerman and Ullman (1994), is re-interpreted as the base level of Sherali-Adams and generalised to higher levels in this sense by Atserias and Maneva (2012) and Malkin (2014), who prove that the two resulting hierarchies interleave. In carrying this analysis further, we here give (a) a precise characterisation of the level-k Sherali-Adams relaxation in terms of a modified counting pebble game; (b) a variant of the Sherali-Adams levels that precisely matches the k-pebble counting game; (c) a proof that the interleaving between these two hierarchies is strict. We also investigate the variation based on boolean arithmetic instead of real/rational arithmetic and obtain analogous correspondences and separations for plain k-pebble equivalence (without counting). Our results are driven by considerably simplified accounts of the underlying combinatorics and linear algebra.

preprint, 2015, arXiv

Tight Lower and Upper Bounds for the Complexity of Canonical Colour Refinement

An assignment of colours to the vertices of a graph is stable if any two vertices of the same colour have identically coloured neighbourhoods. The goal of colour refinement is to find a stable colouring that uses a minimum number of colours. This is a widely used subroutine for graph isomorphism testing algorithms, since any automorphism needs to be colour preserving. We give an $O((m+n)\log n)$ algorithm for finding a canonical version of such a stable colouring, on graphs with $n$ vertices and $m$ edges. We show that no faster algorithm is possible, under some modest assumptions about the type of algorithm, which captures all known colour refinement algorithms.
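
The notion of a stable colouring can be illustrated with a naive fixed-point iteration. This is a minimal sketch, not the $O((m+n)\log n)$ algorithm from the paper: it recomputes every vertex signature in each round, and the function name and the adjacency-list input format are assumptions made here.

```python
def colour_refinement(adj):
    """Coarsest stable colouring of a graph given as {vertex: [neighbours]}.

    Returns a dict vertex -> colour index. Two vertices share a colour
    exactly when the refinement never separates them.
    """
    colour = {v: 0 for v in adj}  # start with a single colour class
    while True:
        # New signature: old colour plus the multiset of neighbour colours.
        signature = {
            v: (colour[v], tuple(sorted(colour[u] for u in adj[v])))
            for v in adj
        }
        # Number the distinct signatures in sorted order, so colour names
        # depend only on the structure and not on vertex labels.
        palette = {s: i for i, s in enumerate(sorted(set(signature.values())))}
        refined = {v: palette[signature[v]] for v in adj}
        # Classes can only split, so an unchanged count means stability.
        if len(set(refined.values())) == len(set(colour.values())):
            return refined
        colour = refined


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(colour_refinement(path))  # {0: 0, 1: 1, 2: 1, 3: 0}
```

On the path with four vertices, the two end vertices receive one colour and the two inner vertices another, and no further round splits these classes.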

preprint, 2014, arXiv

Deciding first-order properties of nowhere dense graphs

Nowhere dense graph classes, introduced by Nešetřil and Ossona de Mendez, form a large variety of classes of "sparse graphs" including the class of planar graphs, actually all classes with excluded minors, and also bounded degree graphs and graph classes of bounded expansion. We show that deciding properties of graphs definable in first-order logic is fixed-parameter tractable on nowhere dense graph classes. At least for graph classes closed under taking subgraphs, this result is optimal: it was known before that for all classes C of graphs closed under taking subgraphs, if deciding first-order properties of graphs in C is fixed-parameter tractable, then C must be nowhere dense (under a reasonable complexity theoretic assumption). As a by-product, we give an algorithmic construction of sparse neighbourhood covers for nowhere dense graphs. This extends and improves previous constructions of neighbourhood covers for graph classes with excluded minors. At the same time, our construction is considerably simpler than those. Our proofs are based on a new game-theoretic characterisation of nowhere dense graphs that allows for a recursive version of locality-based algorithms on these classes. On the logical side, we prove a "rank-preserving" version of Gaifman's locality theorem.

preprint, 2014, arXiv

Dimension Reduction via Colour Refinement

Colour refinement is a basic algorithmic routine for graph isomorphism testing, appearing as a subroutine in almost all practical isomorphism solvers. It partitions the vertices of a graph into "colour classes" in such a way that all vertices in the same colour class have the same number of neighbours in every colour class. Tinhofer (Disc. App. Math., 1991), Ramana, Scheinerman, and Ullman (Disc. Math., 1994) and Godsil (Lin. Alg. and its App., 1997) established a tight correspondence between colour refinement and fractional isomorphisms of graphs, which are solutions to the LP relaxation of a natural ILP formulation of graph isomorphism. We introduce a version of colour refinement for matrices and extend existing quasilinear algorithms for computing the colour classes. Then we generalise the correspondence between colour refinement and fractional automorphisms and develop a theory of fractional automorphisms and isomorphisms of matrices. We apply our results to reduce the dimensions of systems of linear equations and linear programs. Specifically, we show that any given LP L can efficiently be transformed into a (potentially) smaller LP L' whose number of variables and constraints is the number of colour classes of the colour refinement algorithm, applied to a matrix associated with the LP. The transformation is such that we can easily (by a linear mapping) map both feasible and optimal solutions back and forth between the two LPs. We demonstrate empirically that colour refinement can indeed greatly reduce the cost of solving linear programs.
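
For reference, the LP relaxation mentioned in this abstract can be written out as follows for two graphs on $n$ vertices with adjacency matrices $A$ and $B$. The symbol names are chosen here for illustration and may differ from the paper's notation.

```latex
\begin{align*}
  AX &= XB, \\
  X\mathbf{1} &= \mathbf{1}, \qquad \mathbf{1}^{\top} X = \mathbf{1}^{\top}, \\
  X_{ij} &\ge 0 \quad \text{for all } i, j \in \{1, \dots, n\}.
\end{align*}
```

With the additional integrality constraint $X_{ij} \in \{0,1\}$, the matrix $X$ is a permutation matrix and $AX = XB$ states that it is an isomorphism. Dropping integrality leaves a doubly stochastic $X$, which is a fractional isomorphism.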

preprint, 2014, arXiv

Monadic Datalog Containment on Trees

We show that the query containment problem for monadic datalog on finite unranked labeled trees can be solved in 2-fold exponential time when (a) considering unordered trees using the axes child and descendant, and when (b) considering ordered trees using the axes firstchild, nextsibling, child, and descendant. When omitting the descendant-axis, we obtain that in both cases the problem is EXPTIME-complete.

preprint, 2014, arXiv

Structure Theorem and Isomorphism Test for Graphs with Excluded Topological Subgraphs

We generalize the structure theorem of Robertson and Seymour for graphs excluding a fixed graph $H$ as a minor to graphs excluding $H$ as a topological subgraph. We prove that for a fixed $H$, every graph excluding $H$ as a topological subgraph has a tree decomposition where each part is either "almost embeddable" to a fixed surface or has bounded degree with the exception of a bounded number of vertices. Furthermore, we prove that such a decomposition is computable by an algorithm that is fixed-parameter tractable with parameter $|H|$. We present two algorithmic applications of our structure theorem. To illustrate the mechanics of a "typical" application of the structure theorem, we show that on graphs excluding $H$ as a topological subgraph, Partial Dominating Set (find $k$ vertices whose closed neighborhood has maximum size) can be solved in time $f(H,k)\cdot n^{O(1)}$. More significantly, we show that on graphs excluding $H$ as a topological subgraph, Graph Isomorphism can be solved in time $n^{f(H)}$. This result unifies and generalizes two previously known important polynomial-time solvable cases of Graph Isomorphism: bounded-degree graphs and $H$-minor free graphs. The proof of this result needs a generalization of our structure theorem to the context of invariant treelike decomposition.

preprint, 2013, arXiv

L-Recursion and a new Logic for Logarithmic Space

We extend first-order logic with counting by a new operator that allows it to formalise a limited form of recursion which can be evaluated in logarithmic space. The resulting logic LREC has a data complexity in LOGSPACE, and it defines LOGSPACE-complete problems like deterministic reachability and Boolean formula evaluation. We prove that LREC is strictly more expressive than deterministic transitive closure logic with counting and incomparable in expressive power with symmetric transitive closure logic STC and transitive closure logic (with or without counting). LREC is strictly contained in fixed-point logic with counting FPC. We also study an extension LREC$_=$ of LREC that has nicer closure properties and is more expressive than both LREC and STC, but is still contained in FPC and has a data complexity in LOGSPACE. Our main results are that LREC captures LOGSPACE on the class of directed trees and that LREC$_=$ captures LOGSPACE on the class of interval graphs.

preprint2012arXiv

Where First-Order and Monadic Second-Order Logic Coincide

We study on which classes of graphs first-order logic (FO) and monadic second-order logic (MSO) have the same expressive power. We show that for all classes C of graphs that are closed under taking subgraphs, FO and MSO have the same expressive power on C if, and only if, C has bounded tree depth. Tree depth is a graph invariant that measures the similarity of a graph to a star in a similar way that tree width measures the similarity of a graph to a tree. For classes just closed under taking induced subgraphs, we show an analogous result for guarded second-order logic (GSO), the variant of MSO that not only allows quantification over vertex sets but also over edge sets. A key tool in our proof is a Feferman-Vaught-type theorem that is constructive and still works for unbounded partitions.
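As a small illustration of the tree depth invariant mentioned in this abstract, the sketch below computes it by brute force using the standard recursive characterisation: a single vertex has tree depth 1, a disconnected graph takes the maximum over its components, and a connected graph takes 1 plus the minimum over all single-vertex deletions. The function name `tree_depth` and the input encoding (a vertex list plus an edge list) are assumptions made for this example and do not come from the paper; the running time is exponential, so it is only suitable for tiny graphs.

```python
from functools import lru_cache


def tree_depth(vertices, edges):
    """Brute-force tree depth of an undirected graph (hypothetical helper)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def components(vs):
        # Connected components of the subgraph induced on vs.
        remaining = set(vs)
        comps = []
        while remaining:
            start = remaining.pop()
            comp = {start}
            stack = [start]
            while stack:
                x = stack.pop()
                for y in adj[x]:
                    if y in remaining:
                        remaining.remove(y)
                        comp.add(y)
                        stack.append(y)
            comps.append(frozenset(comp))
        return comps

    @lru_cache(maxsize=None)
    def td(vs):
        if not vs:
            return 0
        comps = components(vs)
        if len(comps) > 1:
            # Disconnected: the deepest component decides.
            return max(td(c) for c in comps)
        if len(vs) == 1:
            return 1
        # Connected: delete the best possible root vertex.
        return 1 + min(td(vs - {v}) for v in vs)

    return td(frozenset(vertices))


# A star with four leaves and a path on seven vertices.
star = tree_depth(range(5), [(0, 1), (0, 2), (0, 3), (0, 4)])
path7 = tree_depth(range(7), [(i, i + 1) for i in range(6)])
```

Stars have tree depth 2 regardless of the number of leaves, while the tree depth of a path grows logarithmically with its length, which matches the abstract's remark that tree depth measures similarity to a star.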

preprint2011arXiv

Counting Homomorphisms and Partition Functions

Homomorphisms between relational structures are not only fundamental mathematical objects, but are also of great importance in an applied computational context. Indeed, constraint satisfaction problems (CSPs), a wide class of algorithmic problems that occur in many different areas of computer science such as artificial intelligence or database theory, may be viewed as asking for homomorphisms between two relational structures [FedVar98]. In a logical setting, homomorphisms may be viewed as witnesses for positive primitive formulas in a relational language. As we shall see, homomorphisms, or more precisely the numbers of homomorphisms between two structures, are also related to a fundamental computational problem of statistical physics. In this article, we are concerned with the complexity of counting homomorphisms from a given structure A to a fixed structure B. Actually, we are mainly interested in a generalization of this problem to weighted homomorphisms (or partition functions). We almost exclusively focus on graphs. The first part of the article is a short survey of what is known about the problem. In the second part, we give a proof of a theorem due to Bulatov and the first author of this paper [BulGro05], which classifies the complexity of partition functions described by matrices with non-negative entries. The proof we give here is essentially the same as the original one, with a few shortcuts due to [Thu09], but it is phrased in a different, more graph theoretical language that may make it more accessible to most readers.
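The link between homomorphism counts and partition functions described in this abstract can be sketched as follows. For a symmetric matrix B, the partition function of a graph G sums, over all maps from the vertices of G to the row indices of B, the product of the entries B[σ(u)][σ(v)] over the edges uv of G. With a 0/1 matrix this is the number of homomorphisms from G to the graph whose adjacency matrix is B. The function name `partition_function` and the input encoding are assumptions made for this example; the enumeration is exponential and is meant only to make the definition concrete, not to reflect the algorithms or the classification discussed in the article.

```python
from itertools import product


def partition_function(edges, n, B):
    """Sum over all maps sigma: {0..n-1} -> {0..k-1} of the edge-weight product.

    edges: list of pairs (u, v) over vertices 0..n-1 (hypothetical encoding)
    B:     symmetric k x k matrix given as a list of lists
    """
    k = len(B)
    total = 0
    for sigma in product(range(k), repeat=n):
        weight = 1
        for u, v in edges:
            weight *= B[sigma[u]][sigma[v]]
        total += weight
    return total


triangle = [(0, 1), (1, 2), (0, 2)]

# Adjacency matrix of K3: counts proper 3-colourings of the triangle.
K3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
colourings = partition_function(triangle, 3, K3)

# Edge with a loop on one endpoint: counts independent sets.
IND = [[1, 1], [1, 0]]
independent_sets = partition_function(triangle, 3, IND)

# A weighted matrix with non-negative entries (an Ising-like example).
ISING = [[2, 1], [1, 2]]
weighted = partition_function(triangle, 3, ISING)
```

For the triangle these evaluate to 6 proper 3-colourings, 4 independent sets, and a weighted sum of 28.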

preprint2011arXiv

Randomisation and Derandomisation in Descriptive Complexity Theory

We study probabilistic complexity classes and questions of derandomisation from a logical point of view. For each logic L we introduce a new logic BPL, bounded error probabilistic L, which is defined from L in a similar way as the complexity class BPP, bounded error probabilistic polynomial time, is defined from PTIME. Our main focus lies on questions of derandomisation, and we prove that there is a query which is definable in BPFO, the probabilistic version of first-order logic, but not in Cinf, finite variable infinitary logic with counting. This implies that many of the standard logics of finite model theory, like transitive closure logic and fixed-point logic, both with and without counting, cannot be derandomised. Similarly, we present a query on ordered structures which is definable in BPFO but not in monadic second-order logic, and a query on additive structures which is definable in BPFO but not in FO. The latter of these queries shows that certain uniform variants of AC0 (bounded-depth polynomial sized circuits) cannot be derandomised. These results are in contrast to the general belief that most standard complexity classes can be derandomised. Finally, we note that BPIFP+C, the probabilistic version of fixed-point logic with counting, captures the complexity class BPP, even on unordered structures.

preprint2010arXiv

Finding topological subgraphs is fixed-parameter tractable

We show that for every fixed undirected graph $H$, there is an $O(|V(G)|^3)$ time algorithm that tests, given a graph $G$, if $G$ contains $H$ as a topological subgraph (that is, a subdivision of $H$ is a subgraph of $G$). This shows that topological subgraph testing is fixed-parameter tractable, resolving a longstanding open question of Downey and Fellows from 1992. As a corollary, for every $H$ we obtain an $O(|V(G)|^3)$ time algorithm that tests if there is an immersion of $H$ into a given graph $G$. This answers another open question raised by Downey and Fellows in 1992.

preprint2010arXiv

Fixed-Point Definability and Polynomial Time on Chordal Graphs and Line Graphs

The question of whether there is a logic that captures polynomial time was formulated by Yuri Gurevich in 1988. It is still wide open and regarded as one of the main open problems in finite model theory and database theory. Partial results have been obtained for specific classes of structures. In particular, it is known that fixed-point logic with counting captures polynomial time on all classes of graphs with excluded minors. The introductory part of this paper is a short survey of the state-of-the-art in the quest for a logic capturing polynomial time. The main part of the paper is concerned with classes of graphs defined by excluding induced subgraphs. Two of the most fundamental such classes are the class of chordal graphs and the class of line graphs. We prove that capturing polynomial time on either of these classes is as hard as capturing it on the class of all graphs. In particular, this implies that fixed-point logic with counting does not capture polynomial time on these classes. Then we prove that fixed-point logic with counting does capture polynomial time on the class of all graphs that are both chordal and line graphs.

preprint2009arXiv

The Complexity of Datalog on Linear Orders

We study the program complexity of datalog on both finite and infinite linear orders. Our main result states that on all linear orders with at least two elements, the nonemptiness problem for datalog is EXPTIME-complete. While containment of the nonemptiness problem in EXPTIME is known for finite linear orders and actually for arbitrary finite structures, it is not obvious for infinite linear orders. It sharply contrasts the situation on other infinite structures; for example, the datalog nonemptiness problem on an infinite successor structure is undecidable. We extend our upper bound results to infinite linear orders with constants. As an application, we show that the datalog nonemptiness problem on Allen's interval algebra is EXPTIME-complete.

37 published item(s)

Query Languages for Machine-Learning Models

Recursive querying of neural networks via weighted structures

Compressing CFI Graphs and Lower Bounds for the Weisfeiler-Leman Refinements

Generative Datalog with Continuous Distributions

Graph Similarity Based on Matrix Norms

Independence in Infinite Probabilistic Databases

One Model, Any CSP: Graph Neural Networks as Fast Global Search Heuristics for Constraint Satisfaction

The Logic of Graph Neural Networks

Tuple-Independent Representations of Infinite Probabilistic Databases

Probabilistic Data with Continuous Distributions

Recent Advances on the Graph Isomorphism Problem

Counting Bounded Tree Depth Homomorphisms

Deep Weisfeiler Leman

Graph Neural Networks for Maximum Constraint Satisfaction

Infinite Probabilistic Databases

word2vec, node2vec, graph2vec, X2vec: Towards a Theory of Vector Embeddings of Structured Data

Computing with Tangles

Linear Diophantine Equations, Group CSPs, and Graph Isomorphism

Order Invariance on Decomposable Structures

Quasi-4-Connected Components

Tangled up in Blue (A Survey on Connectivity, Decompositions, and Tangles)

Tangles and Connectivity in Graphs

Isomorphism Testing for Graphs of Bounded Rank Width

Limitations of Algebraic Approaches to Graph Isomorphism Testing

Pebble Games and Linear Equations

Tight Lower and Upper Bounds for the Complexity of Canonical Colour Refinement

Deciding first-order properties of nowhere dense graphs

Dimension Reduction via Colour Refinement

Monadic Datalog Containment on Trees

Structure Theorem and Isomorphism Test for Graphs with Excluded Topological Subgraphs

L-Recursion and a new Logic for Logarithmic Space

Where First-Order and Monadic Second-Order Logic Coincide

Counting Homomorphisms and Partition Functions

Randomisation and Derandomisation in Descriptive Complexity Theory

Finding topological subgraphs is fixed-parameter tractable

Fixed-Point Definability and Polynomial Time on Chordal Graphs and Line Graphs

The Complexity of Datalog on Linear Orders