Source author record

Theodore Papamarkou

Theodore Papamarkou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, Applications, Computer Vision, math.AT, Methodology, Artificial Intelligence, Computation, Computational Geometry, Information Theory, math.IT, Mathematical Software, Neurons and Cognition

Catalog footprint

What is connected: 12 works, 12 topics, 4 close collaborators

preprint, 2026, arXiv

Position: agentic AI orchestration should be Bayes-consistent

LLMs excel at predictive tasks and complex reasoning tasks, but many high-value deployments rely on decisions under uncertainty, for example, which tool to call, which expert to consult, or how many resources to invest. While the usefulness and feasibility of Bayesian approaches remain unclear for LLM inference, this position paper argues that the control layer of an agentic AI system (that orchestrates LLMs and tools) is a clear case where Bayesian principles should shine. Bayesian decision theory provides a framework for agentic systems that can help to maintain beliefs over task-relevant latent quantities, to update these beliefs from observed agentic and human-AI interactions, and to choose actions. Making LLMs themselves explicitly Bayesian belief-updating engines remains computationally intensive and conceptually nontrivial as a general modeling target. In contrast, this paper argues that coherent decision-making requires Bayesian principles at the orchestration level of the agentic system, not necessarily the LLM agent parameters. This paper articulates practical properties for Bayesian control that fit modern agentic AI systems and human-AI collaboration, and provides concrete examples and design patterns to illustrate how calibrated beliefs and utility-aware policies can improve agentic AI orchestration.
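A minimal sketch of the kind of orchestration-level control the abstract describes, under strong simplifying assumptions: each tool has an unknown success probability with a Beta belief, beliefs are updated from observed outcomes, and the controller picks the tool with the highest expected utility. The tool names, costs, reward, and the `ToolBelief` class are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ToolBelief:
    """Beta(alpha, beta) belief over a tool's unknown success probability."""
    alpha: float = 1.0  # prior pseudo-count of successes (uniform prior)
    beta: float = 1.0   # prior pseudo-count of failures

    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def update(self, success: bool) -> None:
        # Conjugate Bayesian update after one observed outcome.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def expected_utility(belief: ToolBelief, reward: float, cost: float) -> float:
    # Expected reward under the current belief, minus the cost of the call.
    return belief.mean() * reward - cost


def choose_tool(beliefs: dict, reward: float, costs: dict) -> str:
    # Pick the action that maximises expected utility.
    return max(beliefs, key=lambda name: expected_utility(beliefs[name], reward, costs[name]))


# Hypothetical tools and costs, for illustration only.
beliefs = {"search": ToolBelief(), "calculator": ToolBelief()}
costs = {"search": 0.2, "calculator": 0.05}

# Observe three outcomes of the (hypothetical) search tool.
for outcome in (True, True, False):
    beliefs["search"].update(outcome)

choice = choose_tool(beliefs, reward=1.0, costs=costs)
print(choice, beliefs["search"].mean())
```

Here the search tool's posterior mean rises to 0.6, yet the cheaper calculator is still preferred because its expected utility (0.45) exceeds that of search (0.4). The sketch only illustrates how beliefs and costs combine in a utility-aware choice.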

preprint, 2026, arXiv

The Human Brain as a Combinatorial Complex

We propose a framework for constructing combinatorial complexes (CCs) from fMRI time series data that captures both pairwise and higher-order neural interactions through information-theoretic measures, bridging topological deep learning and network neuroscience. Current graph-based representations of brain networks systematically miss the higher-order dependencies that characterize neural complexity, where information processing often involves synergistic interactions that cannot be decomposed into pairwise relationships. Unlike topological lifting approaches that map relational structures into higher-order domains, our method directly constructs CCs from statistical dependencies in the data. Our CCs generalize graphs by incorporating higher-order cells that represent collective dependencies among brain regions, naturally accommodating the multi-scale, hierarchical nature of neural processing. The framework constructs data-driven combinatorial complexes using O-information and S-information measures computed from fMRI signals, preserving both pairwise connections and higher-order cells (e.g., triplets, quadruplets) based on synergistic dependencies. Using NetSim simulations as a controlled proof-of-concept dataset, we demonstrate our CC construction pipeline and show how both pairwise and higher-order dependencies in neural time series can be quantified and represented within a unified structure. This work provides a framework for brain network representation that preserves fundamental higher-order structure invisible to traditional graph methods, and enables the application of topological deep learning (TDL) architectures to neural data.

preprint, 2026, arXiv

TopoU-Net: a U-Net architecture for topological domains

Many modern datasets mix points, edges, regions, groups, objects, events, hyperedges, and relations. Yet neural architectures often force such data into grids, graphs, or sequences, obscuring higher-order structure and making encoder-decoder designs domain-specific. We view U-Net not as a grid-specific architecture, but as a hierarchical encoder-decoder principle: representation spaces, transport maps between levels, and skip connections between matched levels. Combinatorial complexes naturally supply these ingredients through cells, incidences, and ranks. We introduce TopoU-Net, a rank-path U-Net for topological domains. Given a path from an input rank to a bottleneck rank and back, the encoder lifts cochains upward along incidence maps, the decoder transports them downward, and skip connections merge features at matched ranks. Rank replaces spatial scale: choosing paths through nodes, edges, faces, hyperedges, or global cells becomes the central architectural decision. A key quantity is the bottleneck support ratio, the number of cells at the bottleneck relative to the number of cells at the input rank. This ratio is fixed by the complex and chosen path rather than by arbitrary pooling, and it clarifies when skip connections are optional, useful, or structurally important. Across node classification, graph classification, hypergraph node classification, mesh classification, and image reconstruction, TopoU-Net provides a reusable encoder-decoder template for higher-order structured data. Among the evaluated baselines, it achieves the strongest mean accuracy on six of eight node-classification datasets and four of five hypergraph datasets, with the largest gains on heterophilic graphs. Ablations show that removing skip connections is most damaging under severe bottleneck compression.
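The rank-path idea can be illustrated with a toy sketch on a single filled triangle, assuming unweighted incidence matrices, scalar features, and no learned parameters. The helper names `lift`, `lower`, and `bottleneck_support_ratio` are hypothetical, and the additive skip merge is an assumption rather than the paper's layer definition.

```python
def lift(incidence, features):
    """Send features from lower cells to upper cells by summing over incidences.

    incidence[i][j] == 1 when upper cell i contains lower cell j.
    """
    return [sum(row[j] * features[j] for j in range(len(features))) for row in incidence]


def lower(incidence, features):
    """Transport features from upper cells back down (transpose of lift)."""
    n_lower = len(incidence[0])
    return [sum(incidence[i][j] * features[i] for i in range(len(incidence)))
            for j in range(n_lower)]


def bottleneck_support_ratio(n_bottleneck_cells, n_input_cells):
    # Number of cells at the bottleneck rank relative to the input rank.
    return n_bottleneck_cells / n_input_cells


# Filled triangle: nodes 0, 1, 2; edges (0,1), (0,2), (1,2); one face.
edge_node = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]  # rows: edges, columns: nodes
face_edge = [[1, 1, 1]]                         # rows: faces, columns: edges

node_x = [1.0, 2.0, 3.0]

# Encoder: nodes -> edges -> face (the bottleneck rank on this path).
edge_x = lift(edge_node, node_x)
face_x = lift(face_edge, edge_x)

# Decoder: face -> edges -> nodes, adding the skip feature at each matched rank.
edge_dec = [d + s for d, s in zip(lower(face_edge, face_x), edge_x)]
node_dec = [d + s for d, s in zip(lower(edge_node, edge_dec), node_x)]

ratio = bottleneck_support_ratio(len(face_edge), len(node_x))
print(edge_x, face_x, node_dec, ratio)
```

With one face and three nodes the ratio is 1/3. Without the skip terms, every edge would receive the same value from the single bottleneck cell, which gives a small-scale intuition for why skip connections matter more as the ratio shrinks.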

preprint, 2022, arXiv

A Random Persistence Diagram Generator

Topological data analysis (TDA) studies the shape patterns of data. Persistent homology is a widely used method in TDA that summarizes homological features of data at multiple scales and stores them in persistence diagrams (PDs). In this paper, we propose a random persistence diagram generator (RPDG) method that generates a sequence of random PDs from the ones produced by the data. RPDG is underpinned by a model based on pairwise interacting point processes, and a reversible jump Markov chain Monte Carlo (RJ-MCMC) algorithm. A first example, which is based on a synthetic dataset, demonstrates the efficacy of RPDG and provides a comparison with another method for sampling PDs. A second example demonstrates the utility of RPDG to solve a materials science problem given a real dataset of small sample size.

preprint, 2022, arXiv

Depth-2 Neural Networks Under a Data-Poisoning Attack

In this work, we study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup. We focus on doing supervised learning for a class of depth-2 finite-width neural networks, which includes single-filter convolutional networks. In this class of networks, we attempt to learn the network weights in the presence of a malicious oracle doing stochastic, bounded and additive adversarial distortions on the true output during training. For the non-gradient stochastic algorithm that we construct, we prove worst-case near-optimal trade-offs among the magnitude of the adversarial attack, the weight approximation accuracy, and the confidence achieved by the proposed algorithm. As our algorithm uses mini-batching, we analyze how the mini-batch size affects convergence. We also show how to utilize the scaling of the outer layer weights to counter output-poisoning attacks depending on the probability of attack. Lastly, we give experimental evidence demonstrating how our algorithm outperforms stochastic gradient descent under different input data distributions, including instances of heavy-tailed distributions.

preprint, 2022, arXiv

Inferring the spread of COVID-19: the role of time-varying reporting rate in epidemiological modelling

The role of epidemiological models is crucial for informing public health officials during a public health emergency, such as the COVID-19 pandemic. However, traditional epidemiological models fail to capture the time-varying effects of mitigation strategies and do not account for under-reporting of active cases, thus introducing bias in the estimation of model parameters. To infer more accurate parameter estimates and to reduce the uncertainty of these estimates, we extend the SIR and SEIR epidemiological models with two time-varying parameters that capture the transmission rate and the rate at which active cases are reported to health officials. Using two real data sets of COVID-19 cases, we perform Bayesian inference via our SIR and SEIR models with time-varying transmission and reporting rates and via their standard counterparts with constant rates; our approach provides parameter estimates with more realistic interpretation, and one-week ahead predictions with reduced uncertainty. Furthermore, we find consistent under-reporting in the number of active cases in the data that we consider, suggesting that the initial phase of the pandemic was more widespread than previously reported.
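To make the model extension concrete, the following sketch simulates a discrete-time SIR model in which both the transmission rate and the reporting rate depend on time. It is a forward simulation only and does not perform the Bayesian inference described in the abstract. The functions `beta` and `rho`, the daily-step discretisation, and all numerical values are assumptions chosen for illustration.

```python
def simulate_sir(beta, rho, gamma, s0, i0, r0, days):
    """Daily-step SIR with time-varying transmission beta(t) and reporting rho(t)."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    rows = []
    for t in range(days):
        new_infections = beta(t) * s * i / n
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        # Only a fraction rho(t) of active cases is reported to health officials.
        rows.append({"day": t, "S": s, "I": i, "R": r, "reported": rho(t) * i})
    return rows


def beta(t):
    # Hypothetical transmission rate: drops when mitigation starts on day 20.
    return 0.4 if t < 20 else 0.15


def rho(t):
    # Hypothetical reporting rate: improves over time, capped at 80%.
    return min(0.2 + 0.01 * t, 0.8)


rows = simulate_sir(beta, rho, gamma=0.1, s0=9990, i0=10, r0=0, days=60)
print(rows[-1])
```

Because `rho(t)` stays below 1, the reported series always understates the active cases, which is the source of bias that a constant-rate model cannot represent.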

preprint, 2022, arXiv

Simplicial Complex Representation Learning

Simplicial complexes form an important class of topological spaces that are frequently used in many application areas such as computer-aided design, computer graphics, and simulation. Representation learning on graphs, which are just 1-d simplicial complexes, has received great attention in recent years. However, there has not been enough effort to extend representation learning to higher-dimensional simplicial objects due to the additional complexity these objects hold, especially when it comes to learning a representation of an entire simplicial complex. In this work, we propose a method for simplicial complex-level representation learning that embeds a simplicial complex into a universal embedding space in a way that complex-to-complex proximity is preserved. Our method uses our novel geometric message passing schemes to learn an entire simplicial complex representation in an end-to-end fashion. We demonstrate the proposed model on a publicly available mesh dataset. To the best of our knowledge, this work presents the first method for learning simplicial complex-level representation.

preprint, 2020, arXiv

Automated detection of corrosion in used nuclear fuel dry storage canisters using residual neural networks

Nondestructive evaluation methods play an important role in ensuring component integrity and safety in many industries. Operator fatigue can play a critical role in the reliability of such methods. This is important for inspecting high-value assets or assets with a high consequence of failure, such as aerospace and nuclear components. Recent advances in convolutional neural networks can support and automate these inspection efforts. This paper proposes using residual neural networks (ResNets) for real-time detection of corrosion, including iron oxide discoloration, pitting and stress corrosion cracking, in dry storage stainless steel canisters housing used nuclear fuel. The proposed approach crops nuclear canister images into smaller tiles, trains a ResNet on these tiles, and classifies images as corroded or intact using the per-image count of tiles predicted as corroded by the ResNet. The results demonstrate that such a deep learning approach makes it possible to detect the locus of corrosion via smaller tiles and, at the same time, to infer with high accuracy whether an image comes from a corroded canister. Thereby, the proposed approach holds promise to automate and speed up nuclear fuel canister inspections, to minimize inspection costs, and to partially replace human-conducted onsite inspections, thus reducing radiation doses to personnel.
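The tile-then-count decision rule can be sketched as follows. The sketch assumes non-overlapping square tiles and replaces the trained ResNet with a stand-in classifier that thresholds mean tile intensity. The function names, the threshold, and the minimum tile count are hypothetical and only show how per-tile predictions yield both an image-level label and the locus of corrosion.

```python
def crop_tiles(image, tile_size):
    """Yield ((row, col), tile) for non-overlapping square tiles of a 2-D list."""
    for r in range(0, len(image) - tile_size + 1, tile_size):
        for c in range(0, len(image[0]) - tile_size + 1, tile_size):
            yield (r, c), [row[c:c + tile_size] for row in image[r:r + tile_size]]


def stand_in_tile_classifier(tile, threshold=0.5):
    # Placeholder for a trained ResNet: flags a tile when its mean intensity is high.
    values = [v for row in tile for v in row]
    return sum(values) / len(values) > threshold


def classify_image(image, tile_size, min_corroded_tiles, tile_classifier):
    """Label an image as corroded when enough of its tiles are predicted corroded."""
    corroded = [pos for pos, tile in crop_tiles(image, tile_size) if tile_classifier(tile)]
    return len(corroded) >= min_corroded_tiles, corroded


# Toy 4x4 "image" with one bright 2x2 region in the top-left corner.
image = [
    [0.9, 0.8, 0.1, 0.1],
    [0.9, 0.7, 0.1, 0.2],
    [0.1, 0.1, 0.1, 0.1],
    [0.2, 0.1, 0.8, 0.9],
]

is_corroded, locus = classify_image(image, tile_size=2, min_corroded_tiles=1,
                                    tile_classifier=stand_in_tile_classifier)
print(is_corroded, locus)
```

The returned tile positions locate the flagged region, while the count against `min_corroded_tiles` gives the image-level decision.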

preprint, 2020, arXiv

Bayesian neural networks and dimensionality reduction

In conducting non-linear dimensionality reduction and feature learning, it is common to suppose that the data lie near a lower-dimensional manifold. A class of model-based approaches for such problems includes latent variables in an unknown non-linear regression function; this includes Gaussian process latent variable models and variational auto-encoders (VAEs) as special cases. VAEs are artificial neural networks (ANNs) that employ approximations to make computation tractable; however, current implementations lack adequate uncertainty quantification in estimating the parameters, predictive densities, and lower-dimensional subspace, and can be unstable and lack interpretability in practice. We attempt to solve these problems by deploying Markov chain Monte Carlo sampling algorithms (MCMC) for Bayesian inference in ANN models with latent variables. We address issues of identifiability by imposing constraints on the ANN parameters as well as by using anchor points. This is demonstrated on simulated and real data examples. We find that current MCMC sampling schemes face fundamental challenges in neural networks involving latent variables, motivating new research directions.

preprint, 2020, arXiv

Wide Neural Networks with Bottlenecks are Deep Gaussian Processes

There has recently been much work on the "wide limit" of neural networks, where Bayesian neural networks (BNNs) are shown to converge to a Gaussian process (GP) as all hidden layers are sent to infinite width. However, these results do not apply to architectures that require one or more of the hidden layers to remain narrow. In this paper, we consider the wide limit of BNNs where some hidden layers, called "bottlenecks", are held at finite width. The result is a composition of GPs that we term a "bottleneck neural network Gaussian process" (bottleneck NNGP). Although intuitive, the subtlety of the proof is in showing that the wide limit of a composition of networks is in fact the composition of the limiting GPs. We also analyze theoretically a single-bottleneck NNGP, finding that the bottleneck induces dependence between the outputs of a multi-output network that persists through extreme post-bottleneck depths, and prevents the kernel of the network from losing discriminative power at extreme post-bottleneck depths.

preprint, 2016, arXiv

Forward-Mode Automatic Differentiation in Julia

We present ForwardDiff, a Julia package for forward-mode automatic differentiation (AD) featuring performance competitive with low-level languages like C++. Unlike recently developed AD tools in other popular high-level languages such as Python and MATLAB, ForwardDiff takes advantage of just-in-time (JIT) compilation to transparently recompile AD-unaware user code, enabling efficient support for higher-order differentiation and differentiation using custom number types (including complex numbers). For gradient and Jacobian calculations, ForwardDiff provides a variant of vector-forward mode that avoids expensive heap allocation and makes better use of memory bandwidth than traditional vector mode. In our numerical experiments, we demonstrate that for nontrivially large dimensions, ForwardDiff's gradient computations can be faster than a reverse-mode implementation from the Python-based autograd package. We also illustrate how ForwardDiff is used effectively within JuMP, a modeling language for optimization. According to our usage statistics, 41 unique repositories on GitHub depend on ForwardDiff, with users from diverse fields such as astronomy, optimization, finite element analysis, and statistics. This document is an extended abstract that has been accepted for presentation at the AD2016 7th International Conference on Algorithmic Differentiation.
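Forward-mode AD with a custom number type can be illustrated with dual numbers. The sketch below is written in Python for illustration and is not ForwardDiff's implementation or API. The `Dual` class and `derivative` helper are hypothetical and support only addition and multiplication.

```python
class Dual:
    """Number of the form value + deriv * eps, with eps**2 == 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _wrap(other):
        # Treat plain numbers as constants with zero derivative.
        return other if isinstance(other, Dual) else Dual(other, 0.0)

    def __add__(self, other):
        other = Dual._wrap(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._wrap(other)
        # Product rule carried along with the value.
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def derivative(f, x):
    """Differentiate AD-unaware code f at x by seeding the derivative part with 1."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    # Ordinary user code with no knowledge of Dual.
    return x * x * x + 2 * x


print(derivative(f, 3.0))  # f'(x) = 3x^2 + 2, so f'(3) = 29.0
```

The function `f` is written without reference to `Dual`, which mirrors the abstract's point about differentiating AD-unaware user code through a custom number type.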

preprint, 2014, arXiv

The Controlled Thermodynamic Integral for Bayesian Model Comparison

Bayesian model comparison relies upon the model evidence, yet for many models of interest the model evidence is unavailable in closed form and must be approximated. Many of the estimators for evidence that have been proposed in the Monte Carlo literature suffer from high variability. This paper considers the reduction of variance that can be achieved by exploiting control variates in this setting. Our methodology is based on thermodynamic integration and applies whenever the gradient of both the log-likelihood and the log-prior with respect to the parameters can be efficiently evaluated. Results obtained on regression models and popular benchmark datasets demonstrate a significant and sometimes dramatic reduction in estimator variance and provide insight into the wider applicability of control variates to Bayesian model comparison.
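As background, plain thermodynamic integration (without the control variates the paper adds) uses the identity log p(y) = ∫₀¹ E_t[log p(y | θ)] dt, where the expectation is taken under the power posterior proportional to p(y | θ)^t p(θ). The sketch below checks this on a toy conjugate model, assumed here purely for illustration: one observation y ~ N(θ, 1) with prior θ ~ N(0, 1). The expectation is available in closed form for this model, so no Monte Carlo sampling is involved.

```python
import math


def expected_log_likelihood(t, y):
    """E[log p(y | theta)] under the power posterior at temperature t.

    For y ~ N(theta, 1) and theta ~ N(0, 1), the power posterior is Gaussian
    with precision 1 + t and mean t * y / (1 + t).
    """
    post_var = 1.0 / (1.0 + t)
    post_mean = t * y / (1.0 + t)
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * ((y - post_mean) ** 2 + post_var)


def thermodynamic_log_evidence(y, n_temperatures=1001):
    """Trapezoidal rule over an evenly spaced temperature ladder on [0, 1]."""
    ts = [k / (n_temperatures - 1) for k in range(n_temperatures)]
    vals = [expected_log_likelihood(t, y) for t in ts]
    return sum(0.5 * (vals[k] + vals[k + 1]) * (ts[k + 1] - ts[k])
               for k in range(n_temperatures - 1))


def exact_log_evidence(y):
    # Marginally, y ~ N(0, 2) for this toy model.
    return -0.5 * math.log(2.0 * math.pi * 2.0) - y ** 2 / 4.0


y = 1.3
estimate = thermodynamic_log_evidence(y)
exact = exact_log_evidence(y)
print(estimate, exact)
```

In realistic models the expectation at each temperature is estimated by Monte Carlo, and the variance of those estimates is what the control variates described in the abstract aim to reduce.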
