Source author record

Peter Lindner

Peter Lindner appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Databases, Logic in Computer Science, cond-mat.soft, cond-mat.mes-hall

Catalog footprint

What is connected

7 works

4 topics

4 close collaborators


Preprint, 2022, arXiv

Generative Datalog with Continuous Distributions

Arguing for the need to combine declarative and probabilistic programming, Bárány et al. (TODS 2017) recently introduced a probabilistic extension of Datalog as a "purely declarative probabilistic programming language." We revisit this language and propose a more principled approach towards defining its semantics based on stochastic kernels and Markov processes - standard notions from probability theory. This allows us to extend the semantics to continuous probability distributions, thereby settling an open problem posed by Bárány et al. We show that our semantics is fairly robust, allowing both parallel execution and arbitrary chase orders when evaluating a program. We cast our semantics in the framework of infinite probabilistic databases (Grohe and Lindner, ICDT 2020), and show that the semantics remains meaningful even when the input of a probabilistic Datalog program is an arbitrary probabilistic database.

Preprint, 2022, arXiv

Independence in Infinite Probabilistic Databases

Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as integers, strings, or real numbers, it is often more natural to view PDBs as infinite probability spaces over database instances. In this paper, we lay the mathematical foundations of infinite probabilistic databases. Our focus then is on independence assumptions. Tuple-independent PDBs play a central role in theory and practice of PDBs. Here, we study infinite tuple-independent PDBs as well as related models such as infinite block-independent disjoint PDBs. While the standard model of PDBs focuses on a set-based semantics, we also study tuple-independent PDBs with a bag semantics and independence in PDBs over uncountable fact spaces. We also propose a new approach to PDBs with an open-world assumption, addressing issues raised by Ceylan et al. (Proc. KR 2016) and generalizing their work, which is still rooted in finite tuple-independent PDBs. Moreover, for countable PDBs we propose an approximate query answering algorithm.

Preprint, 2022, arXiv

Tuple-Independent Representations of Infinite Probabilistic Databases

Probabilistic databases (PDBs) are probability spaces over database instances. They provide a framework for handling uncertainty in databases, as occurs due to data integration, noisy data, data from unreliable sources or randomized processes. Most of the existing theory literature investigated finite, tuple-independent PDBs (TI-PDBs) where the occurrences of tuples are independent events. Only recently, Grohe and Lindner (PODS '19) introduced independence assumptions for PDBs beyond the finite domain assumption. In the finite, a major argument for discussing the theoretical properties of TI-PDBs is that they can be used to represent any finite PDB via views. This is no longer the case once the number of tuples is countably infinite. In this paper, we systematically study the representability of infinite PDBs in terms of TI-PDBs and the related block-independent disjoint PDBs. The central question is which infinite PDBs are representable as first-order views over tuple-independent PDBs. We give a necessary condition for the representability of PDBs and provide a sufficient criterion for representability in terms of the probability distribution of a PDB. With various examples, we explore the limits of our criteria. We show that conditioning on first order properties yields no additional power in terms of expressivity. Finally, we discuss the relation between purely logical and arithmetic reasons for (non-)representability.
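As a small illustration of the finite, tuple-independent case mentioned in this abstract: when the occurrences of tuples are independent events, the probability of one database instance is the product of the marginals of the facts it contains and the complements of the marginals of the facts it omits. The sketch below is not taken from the paper; the facts and marginal probabilities are hypothetical, and the function names are chosen only for this example.

```python
from itertools import combinations
from math import isclose, prod


def instance_probability(marginals, instance):
    """Probability of one instance in a finite tuple-independent PDB.

    marginals: dict mapping each possible fact to its marginal probability.
    instance:  iterable of the facts present in the instance.
    """
    present = set(instance)
    unknown = present - set(marginals)
    if unknown:
        raise ValueError(f"facts without a marginal: {unknown}")
    # Independence: multiply p for present facts and (1 - p) for absent ones.
    return prod(p if fact in present else 1.0 - p
                for fact, p in marginals.items())


def all_instances(marginals):
    """Enumerate every subset of the fact set (every possible instance)."""
    facts = list(marginals)
    for size in range(len(facts) + 1):
        for subset in combinations(facts, size):
            yield frozenset(subset)


# Hypothetical facts and marginals, for illustration only.
marginals = {
    ("R", "a"): 0.5,
    ("R", "b"): 0.25,
    ("S", "a", "b"): 0.8,
}

empty_probability = instance_probability(marginals, [])
total_probability = sum(instance_probability(marginals, inst)
                        for inst in all_instances(marginals))
```

Here `empty_probability` is 0.5 × 0.75 × 0.2, and `total_probability` sums to 1 (up to floating-point error) because the instances partition the probability space. The enumeration is only feasible for a finite fact set, which is exactly the restriction the paper moves beyond.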

Preprint, 2021, arXiv

Probabilistic Data with Continuous Distributions

Statistical models of real world data typically involve continuous probability distributions such as normal, Laplace, or exponential distributions. Such distributions are supported by many probabilistic modelling formalisms, including probabilistic database systems. Yet, the traditional theoretical framework of probabilistic databases focusses entirely on finite probabilistic databases. Only recently, we set out to develop the mathematical theory of infinite probabilistic databases. The present paper is an exposition of two recent papers which are cornerstones of this theory. In (Grohe, Lindner; ICDT 2020) we propose a very general framework for probabilistic databases, possibly involving continuous probability distributions, and show that queries have a well-defined semantics in this framework. In (Grohe, Kaminski, Katoen, Lindner; PODS 2020) we extend the declarative probabilistic programming language Generative Datalog, proposed by (Bárány et al. 2017) for discrete probability distributions, to continuous probability distributions and show that such programs yield generative models of continuous probabilistic databases.

Preprint, 2020, arXiv

Infinite Probabilistic Databases

Probabilistic databases (PDBs) are used to model uncertainty in data in a quantitative way. In the standard formal framework, PDBs are finite probability spaces over relational database instances. It has been argued convincingly that this is not compatible with an open world semantics (Ceylan et al., KR 2016) and with application scenarios that are modeled by continuous probability distributions (Dalvi et al., CACM 2009). We recently introduced a model of PDBs as infinite probability spaces that addresses these issues (Grohe and Lindner, PODS 2019). While that work was mainly concerned with countably infinite probability spaces, our focus here is on uncountable spaces. Such an extension is necessary to model typical continuous probability distributions that appear in many applications. However, an extension beyond countable probability spaces raises nontrivial foundational issues concerned with the measurability of events and queries and ultimately with the question whether queries have a well-defined semantics. It turns out that so-called finite point processes are the appropriate model from probability theory for dealing with probabilistic databases. This model allows us to construct suitable (uncountable) probability spaces of database instances in a systematic way. Our main technical results are measurability statements for relational algebra queries as well as aggregate queries and datalog queries.

Preprint, 2015, arXiv

Nonequilibrium Structure of Colloidal Dumbbells under Oscillatory Shear

We investigate the nonequilibrium behavior of dense, plastic-crystalline suspensions of mildly anisotropic colloidal hard dumbbells under the action of an oscillatory shear field by employing Brownian dynamics computer simulations. In particular, we extend previous investigations, where we uncovered novel nonequilibrium phase transitions, to other aspect ratios and to a larger nonequilibrium parameter space, that is, a wider range of strains and shear frequencies. We compare and discuss selected results in the context of novel scattering and rheological experiments. Both simulations and experiments demonstrate that the previously found transitions from the plastic crystal phase with increasing shear strain also occur at other aspect ratios. We explore the transition behavior in the strain-frequency phase and summarize it in a nonequilibrium phase diagram. Additionally, the experimental rheology results hint at a slowing down of the colloidal dynamics with higher aspect ratio.

Preprint, 2013, arXiv

Learning about SANS Instruments and Data Reduction from Round Robin Measurements on Samples of Polystyrene Latex

Measurements of a well-characterised standard sample can verify the performance of an instrument. Typically, small-angle neutron scattering instruments are used to investigate a wide range of samples and may often be used in a number of configurations. Appropriate standard samples are useful to test different aspects of the performance of hardware as well as that of the data reduction and analysis software. Measurements on a number of instruments with different intrinsic characteristics and designs in a round robin can not only better characterise the performance for a wider range of conditions but also, perhaps more importantly, reveal the limits of the current state of the art of small-angle scattering. The exercise, followed by detailed analysis, tests the limits of current understanding as well as uncovers often forgotten assumptions, simplifications and approximations that underpin the current practice of the technique. This paper describes measurements of polystyrene latex, radius 72 nm with a number of instruments. Scattering from monodisperse, uniform spherical particles is simple to calculate and displays sharp minima. Such data test the calibrations of intensity, wavelength and resolution as well as the detector response. Smoothing due to resolution, multiple scattering and polydispersity has been determined. Sources of uncertainty are often related to systematic deviations and calibrations rather than random counting errors. The study has prompted development of software to treat modest multiple scattering and to better model the instrument resolution. These measurements also allow checks of data reduction algorithms and have identified how they can be improved. The reproducibility and the reliability of instruments and the accuracy of parameters derived from the data are described.
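The remark that scattering from monodisperse, uniform spheres is simple to calculate and shows sharp minima can be made concrete with the standard form factor of an ideal sphere, P(q) = [3(sin x − x cos x)/x³]² with x = qR. The sketch below is not the paper's analysis software: it assumes perfectly monodisperse spheres and ignores the resolution smearing, multiple scattering and polydispersity that the abstract says smooth the minima. Only the 72 nm radius comes from the abstract; the function names are illustrative.

```python
import math


def sphere_form_factor(q, radius):
    """Normalised form factor P(q) of an ideal uniform sphere, with P(0) = 1."""
    x = q * radius
    if abs(x) < 1e-6:
        return 1.0  # small-x limit, avoids 0/0
    amplitude = 3.0 * (math.sin(x) - x * math.cos(x)) / x**3
    return amplitude**2


def first_minimum_x(tol=1e-12):
    """First positive root of sin(x) - x*cos(x), i.e. tan(x) = x, by bisection."""
    def f(x):
        return math.sin(x) - x * math.cos(x)

    lo, hi = 4.0, 5.0  # f(4) > 0 and f(5) < 0, so the root is bracketed
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


RADIUS_NM = 72.0                   # radius quoted in the abstract
x_min = first_minimum_x()          # about 4.4934
q_min = x_min / RADIUS_NM          # position of the first minimum, in nm^-1
p_at_min = sphere_form_factor(q_min, RADIUS_NM)
```

Under these idealised assumptions the first minimum falls near q ≈ 0.062 nm⁻¹, where P(q) drops to zero. In measured data the minimum is filled in by the effects listed above, which is why its depth and position are sensitive checks of wavelength and resolution calibration.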
