Source author record

Jan Van den Bussche

Jan Van den Bussche appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Databases; Logic in Computer Science; Artificial Intelligence; Machine Learning; Computational Complexity; math.LO; Neural and Evolutionary Computing

Catalog footprint: 22 works, 7 topics, 4 close collaborators

preprint, 2026, arXiv

Recursive querying of neural networks via weighted structures

Expressive querying of machine learning models, viewed as a form of intensional data, enables their verification and interpretation using declarative languages, thereby making learned representations of data more accessible. Motivated by the querying of feedforward neural networks, we investigate logics for weighted structures. In the absence of a bound on neural network depth, such logics must incorporate recursion; to this end we revisit the functional fixpoint mechanism proposed by Grädel and Gurevich. We adopt it in a Datalog-like syntax; we extend normal forms for fixpoint logics to weighted structures; and show an equivalent "loose" fixpoint mechanism that allows values of inductively defined weight functions to be overwritten. We propose a "scalar" restriction of functional fixpoint logic, of polynomial-time data complexity, and show it can express all PTIME model-agnostic queries over reduced networks with polynomially bounded weights. In contrast, we show that very simple model-agnostic queries are already NP-complete. Finally, we consider transformations of weighted structures by iterated transductions.

preprint, 2022, arXiv

Expressiveness within Sequence Datalog

Motivated by old and new applications, we investigate Datalog as a language for sequence databases. We reconsider classical features of Datalog programs, such as negation, recursion, intermediate predicates, and relations of higher arities. We also consider new features that are useful for sequences, notably, equations between path expressions, and "packing". Our goal is to clarify the relative expressiveness of all these different features, in the context of sequences. Towards our goal, we establish a number of redundancy and primitivity results, showing that certain features can, or cannot, be expressed in terms of other features. These results paint a complete picture of the expressiveness relationships among all possible Sequence Datalog fragments that can be formed using the six features that we consider.

preprint, 2022, arXiv

Inputs, Outputs, and Composition in the Logic of Information Flows

The logic of information flows (LIF) is a general framework in which tasks of a procedural nature can be modeled in a declarative, logic-based fashion. The first contribution of this paper is to propose semantic and syntactic definitions of inputs and outputs of LIF expressions. We study how the two relate and show that our syntactic definition is optimal in a sense that is made precise. The second contribution is a systematic study of the expressive power of sequential composition in LIF. Our results on composition tie in the results on inputs and outputs, and relate LIF to first-order logic (FO) and bounded-variable LIF to bounded-variable FO.

preprint, 2022, arXiv

On the expressive power of message-passing neural networks as global feature map transformers

We investigate the power of message-passing neural networks (MPNNs) in their capacity to transform the numerical features stored in the nodes of their input graphs. Our focus is on global expressive power, uniformly over all input graphs, or over graphs of bounded degree with features from a bounded domain. Accordingly, we introduce the notion of a global feature map transformer (GFMT). As a yardstick for expressiveness, we use a basic language for GFMTs, which we call MPLang. Every MPNN can be expressed in MPLang, and our results clarify to which extent the converse inclusion holds. We consider exact versus approximate expressiveness; the use of arbitrary activation functions; and the case where only the ReLU activation function is allowed.
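To make the idea of transforming node features concrete, here is a minimal sketch of one generic message-passing layer with ReLU activation over scalar features. It is not MPLang and not taken from the paper; the function names, the sum aggregation, and the weights `w_self`, `w_nbr`, and `bias` are illustrative assumptions.

```python
def relu(x):
    """ReLU activation: clamp negative values to zero."""
    return max(0.0, x)


def message_passing_layer(graph, features, w_self, w_nbr, bias):
    """Apply one message-passing step to a feature map.

    graph:    dict mapping each node to a list of its neighbors
    features: dict mapping each node to a scalar feature
    Returns a new feature map over the same nodes.
    (Hypothetical layer shape, chosen only for illustration.)
    """
    updated = {}
    for node, neighbors in graph.items():
        # Aggregate the incoming messages by summing neighbor features.
        aggregated = sum(features[n] for n in neighbors)
        # Combine the node's own feature with the aggregate, then activate.
        updated[node] = relu(w_self * features[node] + w_nbr * aggregated + bias)
    return updated


# A path graph a - b - c with one scalar feature per node.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": 1.0, "b": 2.0, "c": -3.0}
updated = message_passing_layer(graph, features, w_self=1.0, w_nbr=0.5, bias=0.0)
```

The same layer applies unchanged to any input graph, which is the sense in which a fixed network defines a transformation of feature maps uniformly over all graphs.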

preprint, 2022, arXiv

SHACL: A Description Logic in Disguise

SHACL is a W3C-proposed language for expressing structural constraints on RDF graphs. In recent years, SHACL's popularity has risen quickly. This rise in popularity comes with questions related to its place in the semantic web, particularly about its relation to OWL (the de facto standard for expressing ontological information on the web) and description logics (which form the formal foundations of OWL). We answer these questions by arguing that SHACL is in fact a description logic. On the one hand, our answer is surprisingly simple, some might even say obvious. But, on the other hand, our answer is also controversial. By resolving this issue once and for all, we establish the field of description logics as the solid formal foundations of SHACL.

preprint, 2022, arXiv

Temporal graph patterns by timed automata

Temporal graphs represent graph evolution over time, and have been receiving considerable research attention. Work on expressing temporal graph patterns or discovering temporal motifs typically assumes relatively simple temporal constraints, such as journeys or, more generally, existential constraints, possibly with finite delays. In this paper we propose to use timed automata to express temporal constraints, leading to a general and powerful notion of temporal basic graph pattern (BGP). The new difficulty is the evaluation of the temporal constraint on a large set of matchings. An important benefit of timed automata is that they support an iterative state assignment, which can be useful for early detection of matches and pruning of non-matches. We introduce algorithms to retrieve all instances of a temporal BGP match in a graph, and present results of an extensive experimental evaluation, demonstrating interesting performance trade-offs. We show that an on-demand algorithm that processes total matchings incrementally over time is preferable when dealing with cyclic patterns on sparse graphs. On acyclic patterns or dense graphs, and when connectivity of partial matchings can be guaranteed, the best performance is achieved by maintaining partial matchings over time and allowing automaton evaluation to be fully incremental.

preprint, 2022, arXiv

What Can Database Query Processing Do for Instance-Spanning Constraints?

In the last decade, the term instance-spanning constraint has been introduced in the process mining field to refer to constraints that span multiple process instances of one or several processes. Of particular relevance, in this setting, is checking whether process executions comply with constraints of interest, which at runtime calls for suitable monitoring techniques. Even though event data are often stored in some sort of database, there is a lack of database-oriented approaches to tackle compliance checking and monitoring of (instance-spanning) constraints. In this paper, we fill this gap by showing how well-established technology from database query processing can be effectively used for this purpose. We propose to define an instance-spanning constraint through an ensemble of four database queries that retrieve the satisfying, violating, pending-satisfying, and pending-violating cases of the constraint. In this context, the problem of compliance monitoring then becomes an application of techniques for incremental view maintenance, which is well-developed in database query processing. In this paper, we argue for our approach in detail, and, as a proof of concept, present an experimental validation using the DBToaster incremental database query engine.

preprint, 2020, arXiv

Descriptive complexity of real computation and probabilistic independence logic

We introduce a novel variant of BSS machines called Separate Branching BSS machines (S-BSS in short) and develop a Fagin-type logical characterisation for languages decidable in non-deterministic polynomial time by S-BSS machines. We show that NP on S-BSS machines is strictly included in NP on BSS machines and that every NP language on S-BSS machines is a countable union of closed sets in the usual topology of $\mathbb{R}^n$. Moreover, we establish that on Boolean inputs NP on S-BSS machines without real constants characterises a natural fragment of the complexity class $\exists\mathbb{R}$ (a class of problems polynomial time reducible to the true existential theory of the reals) and hence lies between NP and PSPACE. Finally we apply our results to determine the data complexity of probabilistic independence logic.

preprint, 2020, arXiv

J-Logic: a Logic for Querying JSON

We propose a logical framework, based on Datalog, to study the foundations of querying JSON data. The main feature of our approach, which we call J-Logic, is the emphasis on paths. Paths are sequences of keys and are used to access the tree structure of nested JSON objects. J-Logic also features "packing" as a means to generate a new key from a path or subpath. J-Logic with recursion is computationally complete, but many queries can be expressed without recursion, such as deep equality. We give a necessary condition for queries to be expressible without recursion. Most of our results focus on the deterministic nature of JSON objects as partial functions from keys to values. Predicates defined by J-Logic programs may not properly describe objects, however. Nevertheless we show that every object-to-object transformation in J-Logic can be defined using only objects in intermediate results. Moreover we show that it is decidable whether a positive, nonrecursive J-Logic program always returns an object when given objects as inputs. Regarding packing, we show that packing is unnecessary if the output does not require new keys. Finally, we show the decidability of query containment for positive, nonrecursive J-Logic programs.
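The path-based view of nested objects described above can be illustrated with a small sketch. This is plain Python, not J-Logic syntax: nested dictionaries stand in for JSON objects, a path is a tuple of keys, and `flatten` and `deep_equal` are hypothetical helper names introduced only for this example. Arrays and other non-object values are treated as atomic leaves, which is a simplifying assumption.

```python
def flatten(obj, prefix=()):
    """Map every root-to-leaf key path to the value stored at its end.

    A leaf is any non-dict value, or an empty dict (kept so that an
    empty nested object is not silently lost).
    """
    if not isinstance(obj, dict) or not obj:
        return {prefix: obj}
    flat = {}
    for key, value in obj.items():
        # Extend the current path with this key and descend.
        flat.update(flatten(value, prefix + (key,)))
    return flat


def deep_equal(left, right):
    """Two objects are deeply equal when they have the same paths
    leading to the same leaf values."""
    return flatten(left) == flatten(right)


doc = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
paths = flatten(doc)
same = deep_equal(doc, {"e": 3, "a": {"c": {"d": 2}, "b": 1}})
different = deep_equal({"a": {}}, {})
```

Because each path leads to at most one value, the flattened form is itself a partial function from paths to values, mirroring the treatment of objects as partial functions from keys to values.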

preprint, 2016, arXiv

Mapping-equivalence and oid-equivalence of single-function object-creating conjunctive queries

Conjunctive database queries have been extended with a mechanism for object creation to capture important applications such as data exchange, data integration, and ontology-based data access. Object creation generates new object identifiers in the result, that do not belong to the set of constants in the source database. The new object identifiers can be also seen as Skolem terms. Hence, object-creating conjunctive queries can also be regarded as restricted second-order tuple-generating dependencies (SO tgds), considered in the data exchange literature. In this paper, we focus on the class of single-function object-creating conjunctive queries, or sifo CQs for short. We give a new characterization for oid-equivalence of sifo CQs that is simpler than the one given by Hull and Yoshikawa and places the problem in the complexity class NP. Our characterization is based on Cohen's equivalence notions for conjunctive queries with multiplicities. We also solve the logical entailment problem for sifo CQs, showing that also this problem belongs to NP. Results by Pichler et al. have shown that logical equivalence for more general classes of SO tgds is either undecidable or decidable with as yet unknown complexity upper bounds.

preprint, 2016, arXiv

On the convergence of cycle detection for navigational reinforcement learning

We consider a reinforcement learning framework where agents have to navigate from start states to goal states. We prove convergence of a cycle-detection learning algorithm on a class of tasks that we call reducible. Reducible tasks have an acyclic solution. We also syntactically characterize the form of the final policy. This characterization can be used to precisely detect the convergence point in a simulation. Our result demonstrates that even simple algorithms can be successful in learning a large class of nontrivial tasks. In addition, our framework is elementary in the sense that we only use basic concepts to formally prove convergence.

preprint, 2016, arXiv

On the satisfiability problem for SPARQL patterns

The satisfiability problem for SPARQL patterns is undecidable in general, since the expressive power of SPARQL 1.0 is comparable with that of the relational algebra. The goal of this paper is to delineate the boundary of decidability of satisfiability in terms of the constraints allowed in filter conditions. The classes of constraints considered are bound-constraints, negated bound-constraints, equalities, nonequalities, constant-equalities, and constant-nonequalities. The main result of the paper can be summarized by saying that, as soon as inconsistent filter conditions can be formed, satisfiability is undecidable. The key insight in each case is to find a way to emulate the set difference operation. Undecidability can then be obtained from a known undecidability result for the algebra of binary relations with union, composition, and set difference. When no inconsistent filter conditions can be formed, satisfiability is efficiently decidable by simple checks on bound variables and on the use of literals. The paper also points out that satisfiability for the so-called "well-designed" patterns can be decided by a check on bound variables and a check for inconsistent filter conditions.

preprint, 2015, arXiv

Positive Neural Networks in Discrete Time Implement Monotone-Regular Behaviors

We study the expressive power of positive neural networks. The model uses positive connection weights and multiple input neurons. Different behaviors can be expressed by varying the connection weights. We show that in discrete time, and in the absence of noise, the class of positive neural networks captures the so-called monotone-regular behaviors, which are based on regular languages. A finer picture emerges if one takes into account the delay by which a monotone-regular behavior is implemented. Each monotone-regular behavior can be implemented by a positive neural network with a delay of one time unit. Some monotone-regular behaviors can be implemented with zero delay. And, interestingly, some simple monotone-regular behaviors cannot be implemented with zero delay.

preprint, 2015, arXiv

Putting Logic-Based Distributed Systems on Stable Grounds

In the Declarative Networking paradigm, Datalog-like languages are used to express distributed computations. Whereas recently formal operational semantics for these languages have been developed, a corresponding declarative semantics has been lacking so far. The challenge is to capture precisely the amount of nondeterminism that is inherent to distributed computations due to concurrency, networking delays, and asynchronous communication. This paper shows how a declarative, model-based semantics can be obtained by simply using the well-known stable model semantics for Datalog with negation. We show that the model-based semantics matches previously proposed formal operational semantics.

preprint, 2014, arXiv

FO(C) and Related Modelling Paradigms

Recently, C-Log was introduced as a language for modelling causal processes. Its formal semantics has been defined together with introductory examples, but the study of this language is far from finished. In this paper, we compare C-Log to other declarative modelling languages. More specifically, we compare it to first-order logic (FO), and argue that C-Log and FO are orthogonal and that their integration, FO(C), is a knowledge representation language that allows for clear and succinct models. We compare FO(C) to E-disjunctive logic programming with the stable semantics, and define a fragment on which both semantics coincide. Furthermore, we discuss object-creation in FO(C), relating it to mathematics, business rules systems, and database systems.

preprint, 2014, arXiv

FO(C): A Knowledge Representation Language of Causality

Cause-effect relations are an important part of human knowledge. In real life, humans often reason about complex causes linked to complex effects. By comparison, existing formalisms for representing knowledge about causal relations are quite limited in the kind of specifications of causes and effects they allow. In this paper, we present the new language C-Log, which offers a significantly more expressive representation of effects, including such features as the creation of new objects. We show how C-Log integrates with first-order logic, resulting in the language FO(C). We also compare FO(C) with several related languages and paradigms, including inductive definitions, disjunctive logic programming, business rules and extensions of Datalog.

preprint, 2014, arXiv

Inference in the FO(C) Modelling Language

Recently, FO(C), the integration of C-Log with classical logic, was introduced as a knowledge representation language. Up to this point, no systems exist that perform inference on FO(C), and very little is known about properties of inference in FO(C). In this paper, we study both of the above problems. We define normal forms for FO(C), one of which corresponds to FO(ID). We define transformations between these normal forms, and show that, using these transformations, several inference tasks for FO(C) can be reduced to inference tasks for FO(ID), for which solvers exist. We implemented a prototype of this transformation, and thus present the first system to perform inference in FO(C). We also provide results about the complexity of reasoning in FO(C).

preprint, 2014, arXiv

Relative Expressive Power of Navigational Querying on Graphs

Motivated by both established and new applications, we study navigational query languages for graphs (binary relations). The simplest language has only the two operators union and composition, together with the identity relation. We make more powerful languages by adding any of the following operators: intersection; set difference; projection; coprojection; converse; and the diversity relation. All these operators map binary relations to binary relations. We compare the expressive power of all resulting languages. We do this not only for general path queries (queries where the result may be any binary relation) but also for boolean or yes/no queries (expressed by the nonemptiness of an expression). For both cases, we present the complete Hasse diagram of relative expressiveness. In particular the Hasse diagram for boolean queries contains some nontrivial separations and a few surprising collapses.
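The operators listed above can be sketched over graphs represented as Python sets of node pairs. This is an illustrative encoding rather than the paper's formalism: the function names are assumptions, only the first projection is shown, and coprojection and diversity are taken relative to an explicitly supplied node set. Intersection and set difference are Python's built-in `&` and `-` on sets.

```python
from itertools import product


def identity(nodes):
    """All pairs (x, x) over the node set."""
    return {(x, x) for x in nodes}


def diversity(nodes):
    """All pairs (x, y) of distinct nodes."""
    return {(x, y) for x, y in product(nodes, repeat=2) if x != y}


def compose(r, s):
    """Pairs (x, z) such that r leads from x to some y and s from y to z."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}


def converse(r):
    """Reverse the direction of every pair."""
    return {(y, x) for (x, y) in r}


def projection(r):
    """First projection: (x, x) for every x with an outgoing pair in r."""
    return {(x, x) for (x, _) in r}


def coprojection(r, nodes):
    """(x, x) for every node x with no outgoing pair in r."""
    return identity(nodes) - projection(r)


nodes = {"a", "b", "c"}
edges = {("a", "b"), ("b", "c")}

two_step = compose(edges, edges)            # paths of length two
sinks = coprojection(edges, nodes)          # nodes without outgoing edges
either_way = edges | converse(edges)        # union with the converse
has_three_step = bool(compose(two_step, edges))  # boolean query: nonemptiness
```

A boolean query in this encoding is just the test of whether an expression evaluates to a nonempty set, as in the last line.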

preprint, 2014, arXiv

Similarity and bisimilarity notions appropriate for characterizing indistinguishability in fragments of the calculus of relations

Motivated by applications in databases, this paper considers various fragments of the calculus of binary relations. The fragments are obtained by leaving out, or keeping in, some of the standard operators, along with some derived operators such as set difference, projection, coprojection, and residuation. For each considered fragment, a characterization is obtained for when two given binary relational structures are indistinguishable by expressions in that fragment. The characterizations are based on appropriately adapted notions of simulation and bisimulation.

preprint, 2014, arXiv

Undecidability of satisfiability in the algebra of finite binary relations with union, composition, and difference

We consider expressions built up from binary relation names using the operators union, composition, and set difference. We show that it is undecidable to test whether a given such expression $e$ is finitely satisfiable, i.e., whether there exist finite binary relations that can be substituted for the relation names so that $e$ evaluates to a nonempty result. This result already holds in restriction to expressions that mention just a single relation name, and where the difference operator can be nested at most once.

preprint, 2010, arXiv

Mining tree-query associations in graphs

New applications of data mining, such as in biology, bioinformatics, or sociology, are faced with large datasets structured as graphs. We introduce a novel class of tree-shaped patterns called tree queries, and present algorithms for mining tree queries and tree-query associations in a large data graph. Novel about our class of patterns is that they can contain constants, and can contain existential nodes which are not counted when determining the number of occurrences of the pattern in the data graph. Our algorithms have a number of provable optimality properties, which are based on the theory of conjunctive database queries. We propose a practical, database-oriented implementation in SQL, and show that the approach works in practice through experiments on data about food webs, protein interactions, and citation analysis.

preprint, 2010, arXiv

Relational transducers for declarative networking

Motivated by a recent conjecture concerning the expressiveness of declarative networking, we propose a formal computation model for "eventually consistent" distributed querying, based on relational transducers. A tight link has been conjectured between coordination-freeness of computations, and monotonicity of the queries expressed by such computations. Indeed, we propose a formal definition of coordination-freeness and confirm that the class of monotone queries is captured by coordination-free transducer networks. Coordination-freeness is a semantic property, but the syntactic class that we define of "oblivious" transducers also captures the same class of monotone queries. Transducer networks that are not coordination-free are much more powerful.
