Researcher profile

Sebastian Claici

Sebastian Claici contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
1topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2020arXiv

Model Fusion with Kullback--Leibler Divergence

We propose a method to fuse posterior distributions learned from heterogeneous datasets. Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors and proceeds using a simple assign-and-average approach. The components of the dataset posteriors are assigned to the proposed global model components by solving a regularized variant of the assignment problem. The global components are then updated based on these assignments by their mean under a KL divergence. For exponential family variational distributions, our formulation leads to an efficient non-parametric algorithm for computing the fused model. Our algorithm is easy to describe and implement, efficient, and competitive with state-of-the-art on motion capture analysis, topic modeling, and federated learning of Bayesian neural networks.

preprint2020arXiv

Wasserstein Measure Coresets

The proliferation of large data sets and Bayesian inference techniques motivates demand for better data sparsification. Coresets provide a principled way of summarizing a large dataset via a smaller one that is guaranteed to match the performance of the full data set on specific problems. Classical coresets, however, neglect the underlying data distribution, which is often continuous. We address this oversight by introducing Wasserstein measure coresets, an extension of coresets which by definition takes into account generalization. Our formulation of the problem, which essentially consists in minimizing the Wasserstein distance, is solvable via stochastic gradient descent. This yields an algorithm which simply requires sample access to the data distribution and is able to handle large data streams in an online manner. We validate our construction for inference and clustering.