Source author record

Mathias Niepert

Mathias Niepert appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Machine Learning Computation and Language Computer Vision Biomolecules Computation Digital Libraries Logic in Computer Science math.CO physics.geo-ph Quantitative Methods

Catalog footprint

What is connected

23works

11topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

LOGLO-FNO: Efficient Learning of Local and Global Features in Fourier Neural Operators

Modeling high-frequency information is a critical challenge in scientific machine learning. For instance, fully turbulent flow simulations of the Navier-Stokes equations at Reynolds numbers 3500 and above can generate high-frequency signals due to swirling fluid motions caused by eddies and vortices. Faithfully modeling such signals using neural nets depends on the accurate reconstruction of moderate to high frequencies. However, it has been well known that neural nets exhibit spectral or frequency bias towards learning low-frequency components. Meanwhile, Fourier Neural Operators (FNOs) have emerged as a popular class of data-driven models for surrogate modeling and solving PDEs. Although impressive results were achieved on several PDE benchmark problems, FNOs perform poorly in learning non-dominant frequencies characterized by local features. This limitation stems from spectral bias inherent in neural nets and the explicit exclusion of high-frequency modes in FNOs and their variants. Therefore, to mitigate these issues and improve FNO's spectral learning capabilities to represent a broad range of frequency components, we propose two key architectural enhancements: (i) a parallel branch performing local spectral convolution (ii) a high-frequency propagation module. Moreover, we propose a novel frequency-sensitive loss based on radially binned spectral errors. This introduction of a parallel branch for local convolution reduces the trainable parameters by up to 50% while achieving the accuracy of FNO that relies solely on global convolution. Moreover, our findings demonstrate that the proposed model improves stability over longer rollouts. Experiments on six challenging PDEs in fluid mechanics, wave propagation, and biological pattern formation, and the qualitative and spectral analysis of predictions, show the effectiveness of our method over SOTA neural operator families of baselines.

preprint2026arXiv

Protein Fold Classification at Scale: Benchmarking and Pretraining

Classifying protein topology is essential for deciphering biological function, but progress is held back by the lack of large-scale benchmarks that avoid duplicates and by models that do not scale well. We introduce TEDBench, a large-scale, non-redundant benchmark for protein fold classification constructed from the Encyclopedia of Domains (TED) and Foldseek-clustered AlphaFold structures. We show that on TEDBench, current protein representation learning methods either require very large models or fail to deliver strong performance. To address this challenge, we propose Masked Invariant Autoencoders (MiAE), a self-supervised framework for protein structure representation learning. MiAE uses an extremely high masking ratio of up to 90% with an $\mathrm{SE(3)}$-invariant encoder and a lightweight decoder that reconstructs backbone coordinates from the latent representation and mask tokens. MiAE scales well and outperforms supervised counterparts and state-of-the-art baselines on TEDBench, establishing a strong recipe for protein fold classification. To test transfer beyond AlphaFold structures, we further benchmark on a curated dataset from experimental structures of CATH v4.4. TEDBench is available at https://github.com/BorgwardtLab/TEDBench.

preprint2026arXiv

SparseSAM: Structured Sparsification of Activations in Segment Anything Models

The Segment Anything Model (SAM) achieves strong open-vocabulary segmentation, but its ViT-based image encoders dominate inference latency and memory. Existing activation compression methods, such as token merging, reduce the token length to process, yet introduce non-trivial runtime overhead and encounter catastrophic quality drop under high compression. Other methods applying Sparse Attention focus on attention alone, leaving the MLP fully dense and capping achievable speedup. We propose SparseSAM, a (i) training-free structured sparsification framework that jointly accelerates attention and MLP layers while preserving token identity. SparseSAM introduces (ii) Stripe-Sort Attention, which uses a deterministic Z-order permutation to transform dense attention into static hardware-friendly sparse patterns, eliminating dynamic masking overhead. SparseSAM further introduces a (iii) Residual-Consistency MLP that routes only informative tokens through the MLP while propagating remaining tokens through the residual pathway. Across four segmentation benchmarks, SparseSAM loses only 0.004 mIoU at a 0.4 density and 0.021 mIoU at 0.3, a 2.10x reduction in accuracy loss versus token merging advances, while achieving 2x faster inference and 2.8x memory reduction.

preprint2026arXiv

SymDrift: One-Shot Generative Modeling under Symmetries

Generative modeling of physical systems, such as molecules, requires learning distributions that are invariant under global symmetries, such as rotations in three-dimensional space. Equivariant diffusion and flow matching models can incorporate such invariances effectively, even when trained on a non-invariant empirical distribution, but they typically rely on costly multi-step sampling. Recently, drifting models have emerged as an efficient alternative, enabling single-step generation and achieving state-of-the-art performance in generative modeling tasks. However, we show that drifting models face a symmetry-specific challenge, since an equivariant generator does not generally produce the same drifting field as the one obtained from the symmetrized target distribution. Addressing this issue would require expensive symmetrization of the empirical distribution. To avoid this cost, we propose SymDrift, a framework that makes the drifting field itself symmetry-aware. We introduce two complementary strategies: (i) a symmetrized drift in coordinate space based on optimal alignment, and (ii) a $G$-invariant embedding that removes symmetry ambiguity by construction. Empirically, SymDrift outperforms existing one-shot methods on standard benchmarks for conformer and transition state generation, while remaining competitive with significantly more expensive multi-step approaches. By enabling one-shot inference, SymDrift reduces computational overhead by up to 40$\times$ compared to existing baselines, making it promising for high-throughput applications such as virtual drug screening and large-scale reaction network exploration.

preprint2022arXiv

AnnIE: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark

Open Information Extraction (OIE) is the task of extracting facts from sentences in the form of relations and their corresponding arguments in schema-free manner. Intrinsic performance of OIE systems is difficult to measure due to the incompleteness of existing OIE benchmarks: the ground truth extractions do not group all acceptable surface realizations of the same fact that can be extracted from a sentence. To measure performance of OIE systems more realistically, it is necessary to manually annotate complete facts (i.e., clusters of all acceptable surface realizations of the same fact) from input sentences. We propose AnnIE: an interactive annotation platform that facilitates such challenging annotation tasks and supports creation of complete fact-oriented OIE evaluation benchmarks. AnnIE is modular and flexible in order to support different use case scenarios (i.e., benchmarks covering different types of facts). We use AnnIE to build two complete OIE benchmarks: one with verb-mediated facts and another with facts encompassing named entities. Finally, we evaluate several OIE systems on our complete benchmarks created with AnnIE. Our results suggest that existing incomplete benchmarks are overly lenient, and that OIE systems are not as robust as previously reported. We publicly release AnnIE under non-restrictive license.

preprint2022arXiv

BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation

Intrinsic evaluations of OIE systems are carried out either manually -- with human evaluators judging the correctness of extractions -- or automatically, on standardized benchmarks. The latter, while much more cost-effective, is less reliable, primarily because of the incompleteness of the existing OIE benchmarks: the ground truth extractions do not include all acceptable variants of the same fact, leading to unreliable assessment of the models' performance. Moreover, the existing OIE benchmarks are available for English only. In this work, we introduce BenchIE: a benchmark and evaluation framework for comprehensive evaluation of OIE systems for English, Chinese, and German. In contrast to existing OIE benchmarks, BenchIE is fact-based, i.e., it takes into account informational equivalence of extractions: our gold standard consists of fact synsets, clusters in which we exhaustively list all acceptable surface forms of the same fact. Moreover, having in mind common downstream applications for OIE, we make BenchIE multi-faceted; i.e., we create benchmark variants that focus on different facets of OIE evaluation, e.g., compactness or minimality of extractions. We benchmark several state-of-the-art OIE systems using BenchIE and demonstrate that these systems are significantly less effective than indicated by existing OIE benchmarks. We make BenchIE (data and evaluation code) publicly available on https://github.com/gkiril/benchie.

preprint2022arXiv

milIE: Modular & Iterative Multilingual Open Information Extraction

Open Information Extraction (OpenIE) is the task of extracting (subject, predicate, object) triples from natural language sentences. Current OpenIE systems extract all triple slots independently. In contrast, we explore the hypothesis that it may be beneficial to extract triple slots iteratively: first extract easy slots, followed by the difficult ones by conditioning on the easy slots, and therefore achieve a better overall extraction. Based on this hypothesis, we propose a neural OpenIE system, milIE, that operates in an iterative fashion. Due to the iterative nature, the system is also modular -- it is possible to seamlessly integrate rule based extraction systems with a neural end-to-end system, thereby allowing rule based systems to supply extraction slots which milIE can leverage for extracting the remaining slots. We confirm our hypothesis empirically: milIE outperforms SOTA systems on multiple languages ranging from Chinese to Arabic. Additionally, we are the first to provide an OpenIE test dataset for Arabic and Galician.

preprint2021arXiv

Answering Complex Queries in Knowledge Graphs with Bidirectional Sequence Encoders

Representation learning for knowledge graphs (KGs) has focused on the problem of answering simple link prediction queries. In this work we address the more ambitious challenge of predicting the answers of conjunctive queries with multiple missing entities. We propose Bi-Directional Query Embedding (BIQE), a method that embeds conjunctive queries with models based on bi-directional attention mechanisms. Contrary to prior work, bidirectional self-attention can capture interactions among all the elements of a query graph. We introduce a new dataset for predicting the answer of conjunctive query and conduct experiments that show BIQE significantly outperforming state of the art baselines.

preprint2020arXiv

A Comparative Study of Distributional and Symbolic Paradigms for Relational Learning

Many real-world domains can be expressed as graphs and, more generally, as multi-relational knowledge graphs. Though reasoning and learning with knowledge graphs has traditionally been addressed by symbolic approaches, recent methods in (deep) representation learning has shown promising results for specialized tasks such as knowledge base completion. These approaches abandon the traditional symbolic paradigm by replacing symbols with vectors in Euclidean space. With few exceptions, symbolic and distributional approaches are explored in different communities and little is known about their respective strengths and weaknesses. In this work, we compare representation learning and relational learning on various relational classification and clustering tasks and analyse the complexity of the rules used implicitly by these approaches. Preliminary results reveal possible indicators that could help in choosing one approach over the other for particular knowledge graphs.

preprint2020arXiv

Learning Discrete Structures for Graph Neural Networks

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.

preprint2016arXiv

Discriminative Gaifman Models

We present discriminative Gaifman models, a novel family of relational machine learning models. Gaifman models learn feature representations bottom up from representations of locally connected and bounded-size regions of knowledge bases (KBs). Considering local and bounded-size neighborhoods of knowledge bases renders logical inference and learning tractable, mitigates the problem of overfitting, and facilitates weight sharing. Gaifman models sample neighborhoods of knowledge bases so as to make the learned relational models more robust to missing objects and relations which is a common situation in open-world KBs. We present the core ideas of Gaifman models and apply them to large-scale relational learning problems. We also discuss the ways in which Gaifman models relate to some existing relational machine learning approaches.

preprint2016arXiv

Learning Convolutional Neural Networks for Graphs

Numerous important problems can be framed as learning from graph data. We propose a framework for learning convolutional neural networks for arbitrary graphs. These graphs may be undirected, directed, and with both discrete and continuous node and edge attributes. Analogous to image-based convolutional networks that operate on locally connected regions of the input, we present a general approach to extracting locally connected regions from graphs. Using established benchmark data sets, we demonstrate that the learned feature representations are competitive with state of the art graph kernels and that their computation is highly efficient.

preprint2014arXiv

Exchangeable Variable Models

A sequence of random variables is exchangeable if its joint distribution is invariant under variable permutations. We introduce exchangeable variable models (EVMs) as a novel class of probabilistic models whose basic building blocks are partially exchangeable sequences, a generalization of exchangeable sequences. We prove that a family of tractable EVMs is optimal under zero-one loss for a large class of functions, including parity and threshold functions, and strictly subsumes existing tractable independence-based model families. Extensive experiments show that EVMs outperform state of the art classifiers such as SVMs and probabilistic models which are solely based on independence assumptions.

preprint2014arXiv

Lifted Probabilistic Inference for Asymmetric Graphical Models

Lifted probabilistic inference algorithms have been successfully applied to a large number of symmetric graphical models. Unfortunately, the majority of real-world graphical models is asymmetric. This is even the case for relational representations when evidence is given. Therefore, more recent work in the community moved to making the models symmetric and then applying existing lifted inference algorithms. However, this approach has two shortcomings. First, all existing over-symmetric approximations require a relational representation such as Markov logic networks. Second, the induced symmetries often change the distribution significantly, making the computed probabilities highly biased. We present a framework for probabilistic sampling-based inference that only uses the induced approximate symmetries to propose steps in a Metropolis-Hastings style Markov chain. The framework, therefore, leads to improved probability estimates while remaining unbiased. Experiments demonstrate that the approach outperforms existing MCMC algorithms.

preprint2014arXiv

LODE: Linking Digital Humanities Content to the Web of Data

Numerous digital humanities projects maintain their data collections in the form of text, images, and metadata. While data may be stored in many formats, from plain text to XML to relational databases, the use of the resource description framework (RDF) as a standardized representation has gained considerable traction during the last five years. Almost every digital humanities meeting has at least one session concerned with the topic of digital humanities, RDF, and linked data. While most existing work in linked data has focused on improving algorithms for entity matching, the aim of the LinkedHumanities project is to build digital humanities tools that work "out of the box," enabling their use by humanities scholars, computer scientists, librarians, and information scientists alike. With this paper, we report on the Linked Open Data Enhancer (LODE) framework developed as part of the LinkedHumanities project. With LODE we support non-technical users to enrich a local RDF repository with high-quality data from the Linked Open Data cloud. LODE links and enhances the local RDF repository without compromising the quality of the data. In particular, LODE supports the user in the enhancement and linking process by providing intuitive user-interfaces and by suggesting high-quality linking candidates using tailored matching algorithms. We hope that the LODE framework will be useful to digital humanities scholars complementing other digital humanities tools.

preprint2014arXiv

Markov Chains on Orbits of Permutation Groups

We present a novel approach to detecting and utilizing symmetries in probabilistic graphical models with two main contributions. First, we present a scalable approach to computing generating sets of permutation groups representing the symmetries of graphical models. Second, we introduce orbital Markov chains, a novel family of Markov chains leveraging model symmetries to reduce mixing times. We establish an insightful connection between model symmetries and rapid mixing of orbital Markov chains. Thus, we present the first lifted MCMC algorithm for probabilistic graphical models. Both analytical and empirical results demonstrate the effectiveness and efficiency of the approach.

preprint2014arXiv

On the Conditional Independence Implication Problem: A Lattice-Theoretic Approach

A lattice-theoretic framework is introduced that permits the study of the conditional independence (CI) implication problem relative to the class of discrete probability measures. Semi-lattices are associated with CI statements and a finite, sound and complete inference system relative to semi-lattice inclusions is presented. This system is shown to be (1) sound and complete for saturated CI statements, (2) complete for general CI statements, and (3) sound and complete for stable CI statements. These results yield a criterion that can be used to falsify instances of the implication problem and several heuristics are derived that approximate this "lattice-exclusion" criterion in polynomial time. Finally, we provide experimental results that relate our work to results obtained from other existing inference algorithms.

preprint2014arXiv

Tractability through Exchangeability: A New Perspective on Efficient Probabilistic Inference

Exchangeability is a central notion in statistics and probability theory. The assumption that an infinite sequence of data points is exchangeable is at the core of Bayesian statistics. However, finite exchangeability as a statistical property that renders probabilistic inference tractable is less well-understood. We develop a theory of finite exchangeability and its relation to tractable probabilistic inference. The theory is complementary to that of independence and conditional independence. We show that tractable inference in probabilistic models with high treewidth and millions of variables can be understood using the notion of finite (partial) exchangeability. We also show that existing lifted inference algorithms implicitly utilize a combination of conditional independence and partial exchangeability.

preprint2013arXiv

RockIt: Exploiting Parallelism and Symmetry for MAP Inference in Statistical Relational Models

RockIt is a maximum a-posteriori (MAP) query engine for statistical relational models. MAP inference in graphical models is an optimization problem which can be compiled to integer linear programs (ILPs). We describe several advances in translating MAP queries to ILP instances and present the novel meta-algorithm cutting plane aggregation (CPA). CPA exploits local context-specific symmetries and bundles up sets of linear constraints. The resulting counting constraints lead to more compact ILPs and make the symmetry of the ground model more explicit to state-of-the-art ILP solvers. Moreover, RockIt parallelizes most parts of the MAP inference pipeline taking advantage of ubiquitous shared-memory multi-core architectures. We report on extensive experiments with Markov logic network (MLN) benchmarks showing that RockIt outperforms the state-of-the-art systems Alchemy, Markov TheBeast, and Tuffy both in terms of efficiency and quality of results.

preprint2013arXiv

Symmetry-Aware Marginal Density Estimation

The Rao-Blackwell theorem is utilized to analyze and improve the scalability of inference in large probabilistic models that exhibit symmetries. A novel marginal density estimator is introduced and shown both analytically and empirically to outperform standard estimators by several orders of magnitude. The developed theory and algorithms apply to a broad class of probabilistic models including statistical relational models considered not susceptible to lifted probabilistic inference.

preprint2012arXiv

A Delayed Column Generation Strategy for Exact k-Bounded MAP Inference in Markov Logic Networks

The paper introduces k-bounded MAP inference, a parameterization of MAP inference in Markov logic networks. k-Bounded MAP states are MAP states with at most k active ground atoms of hidden (non-evidence) predicates. We present a novel delayed column generation algorithm and provide empirical evidence that the algorithm efficiently computes k-bounded MAP states for meaningful real-world graph matching problems. The underlying idea is that, instead of solving one large optimization problem, it is often more efficient to tackle several small ones.

preprint2012arXiv

Logical Inference Algorithms and Matrix Representations for Probabilistic Conditional Independence

Logical inference algorithms for conditional independence (CI) statements have important applications from testing consistency during knowledge elicitation to constraintbased structure learning of graphical models. We prove that the implication problem for CI statements is decidable, given that the size of the domains of the random variables is known and fixed. We will present an approximate logical inference algorithm which combines a falsification and a novel validation algorithm. The validation algorithm represents each set of CI statements as a sparse 0-1 matrix A and validates instances of the implication problem by solving specific linear programs with constraint matrix A. We will show experimentally that the algorithm is both effective and efficient in validating and falsifying instances of the probabilistic CI implication problem.

preprint2012arXiv

Markov Chains on Orbits of Permutation Groups

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint

Fields this researcher appears in

Artificial Intelligence Machine Learning Computation and Language Computer Vision Biomolecules Computation Digital Libraries Logic in Computer Science math.CO physics.geo-ph Quantitative Methods

Source provenance

Where this author record came from

arxivconfidence 95%

external id: arxiv:2605.06140:author:6:mathias-niepert

Imported May 20, 2026Synced May 21, 2026

arxivconfidence 95%

external id: arxiv:2605.18552:author:3:mathias-niepert

Imported May 20, 2026Synced May 21, 2026

arxivconfidence 95%

external id: arxiv:2605.17633:author:4:mathias-niepert

Imported May 20, 2026Synced May 20, 2026

4 works

Bhushan Kotnis

Researcher

Bhushan Kotnis contributes to research discovery and scholarly infrastructure.

Open to collaborate

4 works

Carolin Lawrence

Researcher

Carolin Lawrence contributes to research discovery and scholarly infrastructure.

Open to collaborate

3 works

Kiril Gashteovski

Researcher

Kiril Gashteovski contributes to research discovery and scholarly infrastructure.

Open to collaborate

2 works

Goran Glavaš

Researcher

Goran Glavaš contributes to research discovery and scholarly infrastructure.

Open to collaborate

Mathias Niepert

What is connected

Connect this record

See the researcher in context

Building this map preview

23 published item(s)

LOGLO-FNO: Efficient Learning of Local and Global Features in Fourier Neural Operators

Protein Fold Classification at Scale: Benchmarking and Pretraining

SparseSAM: Structured Sparsification of Activations in Segment Anything Models

SymDrift: One-Shot Generative Modeling under Symmetries

AnnIE: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark

BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation

milIE: Modular & Iterative Multilingual Open Information Extraction

Answering Complex Queries in Knowledge Graphs with Bidirectional Sequence Encoders

A Comparative Study of Distributional and Symbolic Paradigms for Relational Learning

Learning Discrete Structures for Graph Neural Networks

Discriminative Gaifman Models

Learning Convolutional Neural Networks for Graphs

Exchangeable Variable Models

Lifted Probabilistic Inference for Asymmetric Graphical Models

LODE: Linking Digital Humanities Content to the Web of Data

Markov Chains on Orbits of Permutation Groups

On the Conditional Independence Implication Problem: A Lattice-Theoretic Approach

Tractability through Exchangeability: A New Perspective on Efficient Probabilistic Inference

RockIt: Exploiting Parallelism and Symmetry for MAP Inference in Statistical Relational Models

Symmetry-Aware Marginal Density Estimation

A Delayed Column Generation Strategy for Exact k-Bounded MAP Inference in Markov Logic Networks

Logical Inference Algorithms and Matrix Representations for Probabilistic Conditional Independence

Markov Chains on Orbits of Permutation Groups