Source author record

Karsten Kreis

Karsten Kreis appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning, Computer Vision, cond-mat.stat-mech, Cryptography and Security, quant-ph, Graphics, physics.chem-ph

Catalog footprint

What is connected

14 works

7 topics

4 close collaborators


preprint · 2026 · arXiv

Exploring Synthesizable Chemical Space with Iterative Pathway Refinements

A well-known pitfall of molecular generative models is that they are not guaranteed to generate synthesizable molecules. Existing solutions for this problem often struggle to effectively navigate exponentially large combinatorial space of synthesizable molecules and suffer from poor coverage. To address this problem, we introduce ReaSyn, an iterative generative pathway refinement framework that obtains synthesizable analogs to input molecules by projecting them onto synthesizable space. Specifically, we propose a simple synthetic pathway representation that allows for generating pathways in both bottom-up and top-down traversal of synthetic trees. We design ReaSyn so that both bottom-up and top-down pathways can be sampled with a single unified autoregressive model. ReaSyn can thus iteratively refine subtrees of generated synthetic trees in a bidirectional manner. Further, we introduce a discrete flow model that refines the generated pathway at the entire pathway level with edit operations: insertion, deletion, and substitution. The iterative refinement cycle of (1) bottom-up decoding, (2) top-down decoding, and (3) holistic editing constitutes a powerful pathway reasoning strategy, allowing the model to explore the vast space of synthesizable molecules. Experimentally, ReaSyn achieves the highest reconstruction rate and pathway diversity in synthesizable molecule reconstruction and the highest optimization performance in synthesizable goal-directed molecular optimization, and significantly outperforms previous synthesizable projection methods in synthesizable hit expansion. These results highlight ReaSyn's superior ability to navigate combinatorially-large synthesizable chemical space.

preprint · 2024 · arXiv

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

Text-guided diffusion models have revolutionized image and video generation and have also been successfully used for optimization-based 3D object synthesis. Here, we instead focus on the underexplored text-to-4D setting and synthesize dynamic, animated 3D objects using score distillation methods with an additional temporal dimension. Compared to previous work, we pursue a novel compositional generation-based approach, and combine text-to-image, text-to-video, and 3D-aware multiview diffusion models to provide feedback during 4D object optimization, thereby simultaneously enforcing temporal consistency, high-quality visual appearance and realistic geometry. Our method, called Align Your Gaussians (AYG), leverages dynamic 3D Gaussian Splatting with deformation fields as 4D representation. Crucial to AYG is a novel method to regularize the distribution of the moving 3D Gaussians and thereby stabilize the optimization and induce motion. We also propose a motion amplification mechanism as well as a new autoregressive synthesis scheme to generate and combine multiple 4D sequences for longer generation. These techniques allow us to synthesize vivid dynamic scenes, outperform previous work qualitatively and quantitatively and achieve state-of-the-art text-to-4D performance. Due to the Gaussian 4D representation, different 4D animations can be seamlessly combined, as we demonstrate. AYG opens up promising avenues for animation, simulation and digital content creation as well as synthetic data generation.

preprint · 2023 · arXiv

Differentially Private Diffusion Models

While modern machine learning models rely on increasingly large training datasets, data is often limited in privacy-sensitive domains. Generative models trained with differential privacy (DP) on sensitive data can sidestep this challenge, providing access to synthetic data instead. We build on the recent success of diffusion models (DMs) and introduce Differentially Private Diffusion Models (DPDMs), which enforce privacy using differentially private stochastic gradient descent (DP-SGD). We investigate the DM parameterization and the sampling algorithm, which turn out to be crucial ingredients in DPDMs, and propose noise multiplicity, a powerful modification of DP-SGD tailored to the training of DMs. We validate our novel DPDMs on image generation benchmarks and achieve state-of-the-art performance in all experiments. Moreover, on standard benchmarks, classifiers trained on DPDM-generated synthetic data perform on par with task-specific DP-SGD-trained classifiers, which has not been demonstrated before for DP generative models. Project page and code: https://nv-tlabs.github.io/DPDM.
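The DP-SGD mechanism named in the abstract can be illustrated with a minimal sketch: clip each per-example gradient to a fixed L2 norm, sum, add Gaussian noise, and average. This is the generic DP-SGD update only, not the paper's noise multiplicity variant; the function name `dp_sgd_step` and the parameters `clip_norm` and `noise_multiplier` are assumptions made for illustration.

```python
import math
import random


def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return a privatized average gradient (hypothetical helper, generic DP-SGD)."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        # Scale down any gradient whose L2 norm exceeds clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            total[i] += g[i] * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm
    # is added to the clipped sum before averaging.
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [x / len(per_example_grads) for x in noisy]


# Toy gradients: the first has norm 5 and is clipped, the second is untouched.
grads = [[3.0, 4.0], [0.3, 0.4]]
step = dp_sgd_step(grads, clip_norm=1.0, noise_multiplier=0.0, rng=random.Random(0))
noisy_step = dp_sgd_step(grads, clip_norm=1.0, noise_multiplier=1.0, rng=random.Random(0))
```

With `noise_multiplier=0.0` the result is just the average of the clipped gradients, which makes the clipping step easy to check in isolation.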

preprint · 2022 · arXiv

BigDatasetGAN: Synthesizing ImageNet with Pixel-wise Annotations

Annotating images with pixel-wise labels is a time-consuming and costly process. Recently, DatasetGAN showcased a promising alternative - to synthesize a large labeled dataset via a generative adversarial network (GAN) by exploiting a small set of manually labeled, GAN-generated images. Here, we scale DatasetGAN to ImageNet scale of class diversity. We take image samples from the class-conditional generative model BigGAN trained on ImageNet, and manually annotate 5 images per class, for all 1k classes. By training an effective feature segmentation architecture on top of BigGAN, we turn BigGAN into a labeled dataset generator. We further show that VQGAN can similarly serve as a dataset generator, leveraging the already annotated data. We create a new ImageNet benchmark by labeling an additional set of 8k real images and evaluate segmentation performance in a variety of settings. Through an extensive ablation study we show big gains in leveraging a large generated dataset to train different supervised and self-supervised backbone models on pixel-wise tasks. Furthermore, we demonstrate that using our synthesized datasets for pre-training leads to improvements over standard ImageNet pre-training on several downstream datasets, such as PASCAL-VOC, MS-COCO, Cityscapes and chest X-ray, as well as tasks (detection, segmentation). Our benchmark will be made public and maintain a leaderboard for this challenging task. Project Page: https://nv-tlabs.github.io/big-datasetgan/

preprint · 2022 · arXiv

Causal Scene BERT: Improving object detection by searching for challenging groups of data

Modern computer vision applications rely on learning-based perception modules parameterized with neural networks for tasks like object detection. These modules frequently have low expected error overall but high error on atypical groups of data due to biases inherent in the training process. In building autonomous vehicles (AV), this problem is an especially important challenge because their perception modules are crucial to the overall system performance. After identifying failures in AV, a human team will comb through the associated data to group perception failures that share common causes. More data from these groups is then collected and annotated before retraining the model to fix the issue. In other words, error groups are found and addressed in hindsight. Our main contribution is a pseudo-automatic method to discover such groups in foresight by performing causal interventions on simulated scenes. To keep our interventions on the data manifold, we utilize masked language models. We verify that the prioritized groups found via intervention are challenging for the object detector and show that retraining with data collected from these groups helps inordinately compared to adding more IID data. We also plan to release software to run interventions in simulated scenes, which we hope will benefit the causality community.

preprint · 2022 · arXiv

Polymorphic-GAN: Generating Aligned Samples across Multiple Domains with Learned Morph Maps

Modern image generative models show remarkable sample quality when trained on a single domain or class of objects. In this work, we introduce a generative adversarial network that can simultaneously generate aligned image samples from multiple related domains. We leverage the fact that a variety of object classes share common attributes, with certain geometric differences. We propose Polymorphic-GAN which learns shared features across all domains and a per-domain morph layer to morph shared features according to each domain. In contrast to previous works, our framework allows simultaneous modelling of images with highly varying geometries, such as images of human faces, painted and artistic faces, as well as multiple different animal faces. We demonstrate that our model produces aligned samples for all domains and show how it can be used for applications such as segmentation transfer and cross-domain image editing, as well as training in low-data regimes. Additionally, we apply our Polymorphic-GAN on image-to-image translation tasks and show that we can greatly surpass previous approaches in cases where the geometric differences between domains are large.

preprint · 2022 · arXiv

Score-Based Generative Modeling with Critically-Damped Langevin Diffusion

Score-based generative models (SGMs) have demonstrated remarkable synthesis quality. SGMs rely on a diffusion process that gradually perturbs the data towards a tractable distribution, while the generative model learns to denoise. The complexity of this denoising task is, apart from the data distribution itself, uniquely determined by the diffusion process. We argue that current SGMs employ overly simplistic diffusions, leading to unnecessarily complex denoising processes, which limit generative modeling performance. Based on connections to statistical mechanics, we propose a novel critically-damped Langevin diffusion (CLD) and show that CLD-based SGMs achieve superior performance. CLD can be interpreted as running a joint diffusion in an extended space, where the auxiliary variables can be considered "velocities" that are coupled to the data variables as in Hamiltonian dynamics. We derive a novel score matching objective for CLD and show that the model only needs to learn the score function of the conditional distribution of the velocity given data, an easier task than learning scores of the data directly. We also derive a new sampling scheme for efficient synthesis from CLD-based diffusion models. We find that CLD outperforms previous SGMs in synthesis quality for similar network architectures and sampling compute budgets. We show that our novel sampler for CLD significantly outperforms solvers such as Euler–Maruyama. Our framework provides new insights into score-based denoising diffusion models and can be readily used for high-resolution image synthesis. Project page and code: https://nv-tlabs.github.io/CLD-SGM.
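To make "critically damped" concrete, here is a small sketch of a one-dimensional Langevin system with a position `x` coupled to a velocity `v`, integrated with Euler–Maruyama. The specific coefficients (unit time scale, noise only on the velocity, critical damping at friction squared equal to four times the mass) are assumptions chosen for illustration and are not taken from the abstract; `simulate` is a hypothetical helper.

```python
import math
import random


def simulate(mass, friction, steps=20000, dt=1e-3, noise=True, seed=0):
    """Euler-Maruyama integration of a damped (x, v) system; returns the x trajectory."""
    rng = random.Random(seed)
    x, v = 1.0, 0.0
    xs = [x]
    for _ in range(steps):
        # Brownian increment; switched off to inspect the deterministic drift alone.
        dw = rng.gauss(0.0, math.sqrt(dt)) if noise else 0.0
        dx = (v / mass) * dt
        dv = (-x - friction * v / mass) * dt + math.sqrt(2.0 * friction) * dw
        x, v = x + dx, v + dv
        xs.append(x)
    return xs


mass = 1.0
# Assumed critical damping condition: friction**2 == 4 * mass.
critical = simulate(mass, friction=2.0 * math.sqrt(mass), noise=False)
underdamped = simulate(mass, friction=0.2, noise=False)
```

Without noise, the critically damped trajectory decays towards zero without changing sign, while the underdamped one oscillates around zero. That is the behavior the term "critically damped" refers to.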

preprint · 2022 · arXiv

Tackling the Generative Learning Trilemma with Denoising Diffusion GANs

A wide variety of deep generative models has been developed in the past decade. Yet, these models often struggle with simultaneously addressing three key requirements including: high sample quality, mode coverage, and fast sampling. We call the challenge imposed by these requirements the generative learning trilemma, as the existing models often trade some of them for others. Particularly, denoising diffusion models have shown impressive sample quality and diversity, but their expensive sampling does not yet allow them to be applied in many real-world applications. In this paper, we argue that slow sampling in these models is fundamentally attributed to the Gaussian assumption in the denoising step which is justified only for small step sizes. To enable denoising with large steps, and hence, to reduce the total number of denoising steps, we propose to model the denoising distribution using a complex multimodal distribution. We introduce denoising diffusion generative adversarial networks (denoising diffusion GANs) that model each denoising step using a multimodal conditional GAN. Through extensive evaluations, we show that denoising diffusion GANs obtain sample quality and diversity competitive with original diffusion models while being 2000× faster on the CIFAR-10 dataset. Compared to traditional GANs, our model exhibits better mode coverage and sample diversity. To the best of our knowledge, denoising diffusion GAN is the first model that reduces sampling cost in diffusion models to an extent that allows them to be applied to real-world applications inexpensively. Project page and code can be found at https://nvlabs.github.io/denoising-diffusion-gan

preprint · 2021 · arXiv

Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

Although machine learning models trained on massive data have led to breakthroughs in several areas, their deployment in privacy-sensitive domains remains limited due to restricted access to data. Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead. We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy. DP-Sinkhorn minimizes the Sinkhorn divergence, a computationally efficient approximation to the exact optimal transport distance, between the model and data in a differentially private manner and uses a novel technique for controlling the bias-variance trade-off of gradient estimates. Unlike existing approaches for training differentially private generative models, which are mostly based on generative adversarial networks, we do not rely on adversarial objectives, which are notoriously difficult to optimize, especially in the presence of noise imposed by privacy constraints. Hence, DP-Sinkhorn is easy to train and deploy. Experimentally, we improve upon the state-of-the-art on multiple image modeling benchmarks and show differentially private synthesis of informative RGB images. Project page: https://nv-tlabs.github.io/DP-Sinkhorn.
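As a rough illustration of the Sinkhorn divergence mentioned in the abstract, the sketch below runs plain Sinkhorn iterations between two small one-dimensional point sets with uniform weights and combines three entropic transport costs into the usual debiased form. It contains no privacy mechanism and is not the DP-Sinkhorn training procedure; the helper names, the squared-distance cost, and the choice of reporting the transport cost of the regularized plan are assumptions.

```python
import math


def entropic_ot(xs, ys, eps=1.0, iters=200):
    """Transport cost of the entropy-regularized plan between two uniform point sets."""
    n, m = len(xs), len(ys)
    cost = [[(x - y) ** 2 for y in ys] for x in xs]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    a, b = 1.0 / n, 1.0 / m
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternate row and column scalings so the plan matches both marginals.
        u = [a / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return sum(
        u[i] * kernel[i][j] * v[j] * cost[i][j] for i in range(n) for j in range(m)
    )


def sinkhorn_divergence(xs, ys, eps=1.0):
    """Debiased divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    return (
        entropic_ot(xs, ys, eps)
        - 0.5 * entropic_ot(xs, xs, eps)
        - 0.5 * entropic_ot(ys, ys, eps)
    )


same = sinkhorn_divergence([0.0, 1.0], [0.0, 1.0])
far = sinkhorn_divergence([0.0, 1.0], [5.0, 6.0])
```

Subtracting the two self-transport terms removes the bias that entropic regularization introduces, so identical point sets get a divergence of zero.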

preprint · 2021 · arXiv

Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes

Neural signed distance functions (SDFs) are emerging as an effective representation for 3D shapes. State-of-the-art methods typically encode the SDF with a large, fixed-size neural network to approximate complex shapes with implicit surfaces. Rendering with these large networks is, however, computationally expensive since it requires many forward passes through the network for every pixel, making these representations impractical for real-time graphics. We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs, while achieving state-of-the-art geometry reconstruction quality. We represent implicit surfaces using an octree-based feature volume which adaptively fits shapes with multiple discrete levels of detail (LODs), and enables continuous LOD with SDF interpolation. We further develop an efficient algorithm to directly render our novel neural SDF representation in real-time by querying only the necessary LODs with sparse octree traversal. We show that our representation is 2-3 orders of magnitude more efficient in terms of rendering speed compared to previous works. Furthermore, it produces state-of-the-art reconstruction quality for complex shapes under both 3D geometric and 2D image-space metrics.
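The cost the abstract describes, many SDF evaluations per pixel, can be seen in a minimal sphere tracing sketch. Sphere tracing is a common way to render signed distance functions and is used here as an assumption, since the abstract does not name the ray-marching algorithm. An analytic sphere stands in for the neural network, and all names are hypothetical.

```python
import math


def sphere_sdf(p, radius=1.0):
    """Signed distance from point p to a sphere centred at the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def sphere_trace(origin, direction, sdf, max_steps=64, eps=1e-4, t_max=100.0):
    """March along a ray; return (hit distance or None, number of SDF evaluations)."""
    t = 0.0
    evals = 0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)  # with a neural SDF, each call here is a network forward pass
        evals += 1
        if dist < eps:
            return t, evals
        # The SDF value is a safe step size: no surface is closer than dist.
        t += dist
        if t > t_max:
            break
    return None, evals


hit_t, hit_evals = sphere_trace((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), sphere_sdf)
miss_t, miss_evals = sphere_trace((0.0, 5.0, -3.0), (0.0, 0.0, 1.0), sphere_sdf)
```

Every pixel needs one such march, which is why a large network behind `sdf` makes rendering expensive.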

preprint · 2015 · arXiv

Advantages and challenges in coupling an ideal gas to atomistic models in adaptive resolution simulations

In adaptive resolution simulations, molecular fluids are modeled employing different levels of resolution in different subregions of the system. When traveling from one region to the other, particles change their resolution on the fly. One of the main advantages of such approaches is the computational efficiency gained in the coarse-grained region. In this respect the best coarse-grained system to employ in the low resolution region would be the ideal gas, making intermolecular force calculations in the coarse-grained subdomain redundant. In this case, however, a smooth coupling is challenging due to the high energetic imbalance between typical liquids and a system of non-interacting particles. In the present work, we investigate this approach, using as a test case the most biologically relevant fluid, water. We demonstrate that a successful coupling of water to the ideal gas can be achieved with current adaptive resolution methods, and discuss the issues that remain to be addressed.

preprint · 2015 · arXiv

From classical to quantum and back: Hamiltonian coupling of classical and Path Integral models of atoms

In computer simulations, quantum delocalization of atomic nuclei can be modeled making use of the Path Integral (PI) formulation of quantum statistical mechanics. This approach, however, comes with a large computational cost. By restricting the PI modeling to a small region of space, this cost can be significantly reduced. In the present work we derive a Hamiltonian formulation for a bottom-up, theoretically solid simulation protocol that allows molecules to change their resolution from quantum-mechanical to classical and vice versa on the fly, while freely diffusing across the system. This approach renders possible simulations of quantum systems at constant chemical potential. The validity of the proposed scheme is demonstrated by means of simulations of low temperature parahydrogen. Potential future applications include simulations of biomolecules, membranes, and interfaces.

preprint · 2012 · arXiv

Characterizing And Exploiting Hybrid Entanglement

Quantum information theory is a very young area of research offering a lot of challenging open questions to be tackled by ambitious upcoming physicists. One such problem is addressed in this thesis. Recently, several protocols have emerged which exploit both continuous variables and discrete variables. On the one hand, outperforming many of the established pure continuous variable or discrete variable schemes, these hybrid approaches offer new opportunities. However, on the other hand, they also lead to new, intricate, as yet uninvestigated, phenomena. An important ingredient of several of these hybrid protocols is a new kind of entanglement: The hybrid entanglement between continuous variable and discrete variable quantum systems, which is studied in detail in this work. An exhaustive analysis of this kind of entanglement is performed, where the focus is on bipartite entanglement. Nevertheless, also issues regarding multipartite hybrid entanglement are briefly discussed. The quintessence of this thesis is a new classification scheme which distinguishes between effective discrete variable hybrid entanglement and so-called true hybrid entanglement. However, along the way, also other questions are addressed, which have emerged during the studies. For example, entanglement witnessing is discussed not only for hybrid entangled states, but also for fully continuous variable two-mode Schroedinger cat states. Furthermore, subtleties regarding entanglement witnessing in a certain kind of mixed states are examined. Not only theoretical classification and analysis of hybrid entangled states are discussed, but also their generation is presented and a few applications are demonstrated.

preprint · 2012 · arXiv

Classifying, quantifying, and witnessing qudit-qumode hybrid entanglement

Recently, several hybrid approaches to quantum information emerged which utilize both continuous- and discrete-variable methods and resources at the same time. In this work, we investigate the bipartite hybrid entanglement between a finite-dimensional, discrete-variable quantum system and an infinite-dimensional, continuous-variable quantum system. A classification scheme is presented leading to a distinction between pure hybrid entangled states, mixed hybrid entangled states (those effectively supported by an overall finite-dimensional Hilbert space), and so-called truly hybrid entangled states (those which cannot be described in an overall finite-dimensional Hilbert space). Examples for states of each regime are given and entanglement witnessing as well as quantification are discussed. In particular, using the channel map of a thermal photon noise channel, we find that true hybrid entanglement naturally occurs in physically important settings. Finally, extensions from bipartite to multipartite hybrid entanglement are considered.
