Researcher profile

Thomas Müller

Thomas Müller contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
18 works
0 followers
16 topics
4 close collaborators

Published work

18 published items

preprint · 2023 · arXiv

Second Data Release of the COSMOS Lyman-alpha Mapping and Tomographic Observation: The First 3D Maps of the Detailed Cosmic Web at 2.05<z<2.55

We present the second data release of the COSMOS Lyman-Alpha Mapping And Tomography Observations (CLAMATO) Survey conducted with the LRIS spectrograph on the Keck-I telescope. This project used Lyman-alpha forest absorption in the spectra of faint star forming galaxies and quasars at z ~ 2-3 to trace neutral hydrogen in the intergalactic medium. In particular, we use 320 objects over a footprint of ~0.2 deg^2 to reconstruct the absorption field at 2.05 < z < 2.55 at ~2 h^{-1}Mpc resolution. We apply a Wiener filtering technique to the observed data to reconstruct three dimensional maps of the field over a volume of 4.1 x 10^5 comoving cubic Mpc. In addition to the filtered flux maps, for the first time we infer the underlying dark matter field through a forward modeling framework from a joint likelihood of galaxy and Lyman-alpha forest data, finding clear examples of the detailed cosmic web consisting of cosmic voids, sheets, filaments, and nodes. In addition to traditional figures, we present a number of interactive three dimensional models to allow exploration of the data and qualitative comparisons to known galaxy surveys. We find that our inferred over-densities are consistent with those found from galaxy fields. Our reduced spectra, extracted Lyman-alpha forest pixel data, and reconstructed tomographic maps are available publicly at https://doi.org/10.5281/zenodo.7524313

preprint · 2022 · arXiv

Active Few-Shot Learning with FASL

Recent advances in natural language processing (NLP) have led to strong text classification models for many tasks. However, thousands of examples are often still needed to train models of good quality. This makes it challenging to quickly develop and deploy new models for real-world problems and business needs. Few-shot learning and active learning are two lines of research aimed at tackling this problem. In this work, we combine both lines into FASL, a platform that allows training text classification models using an iterative and fast process. We investigate which active learning methods work best in our few-shot setup. Additionally, we develop a model to predict when to stop annotating. This is relevant because in a few-shot setup we do not have access to a large validation set.

preprint · 2022 · arXiv

Few-Shot Learning with Siamese Networks and Label Tuning

We study the problem of building text classifiers with little or no training data, commonly known as zero and few-shot text classification. In recent years, an approach based on neural textual entailment models has been found to give strong results on a diverse range of tasks. In this work, we show that with proper pre-training, Siamese Networks that embed texts and labels offer a competitive alternative. These models allow for a large reduction in inference cost: constant in the number of labels rather than linear. Furthermore, we introduce label tuning, a simple and computationally efficient approach that makes it possible to adapt the models in a few-shot setup by only changing the label embeddings. While giving lower performance than model fine-tuning, this approach has the architectural advantage that a single encoder can be shared by many different tasks.

preprint · 2022 · arXiv

Instant Neural Graphics Primitives with a Multiresolution Hash Encoding

Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate. We reduce this cost with a versatile new input encoding that permits the use of a smaller network without sacrificing quality, thus significantly reducing the number of floating point and memory access operations: a small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through stochastic gradient descent. The multiresolution structure allows the network to disambiguate hash collisions, making for a simple architecture that is trivial to parallelize on modern GPUs. We leverage this parallelism by implementing the whole system using fully-fused CUDA kernels with a focus on minimizing wasted bandwidth and compute operations. We achieve a combined speedup of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds, and rendering in tens of milliseconds at a resolution of $1920 \times 1080$.

preprint · 2022 · arXiv

Object classification on video data of meteors and meteor-like phenomena: algorithm and data

Every moment, countless meteoroids enter our atmosphere unseen. The detection and measurement of meteors offer the unique opportunity to gain insights into the composition of our solar systems' celestial bodies. Researchers, therefore, carry out a wide-area-sky-monitoring to secure 360-degree video material, saving every single entry of a meteor. Existing machine intelligence cannot accurately recognize events of meteors intersecting the earth's atmosphere due to a lack of high-quality training data publicly available. This work presents four reusable open source solutions for researchers trained on data we collected due to the lack of available labeled high-quality training data. We refer to the proposed dataset as the NightSkyUCP dataset, consisting of a balanced set of 10,000 meteor- and 10,000 non-meteor-events. Our solutions apply various machine learning techniques, namely classification, feature learning, anomaly detection, and extrapolation. For the classification task, a mean accuracy of 99.1% is achieved. The code and data are made public at figshare with DOI: 10.6084/m9.figshare.16451625

preprint · 2022 · arXiv

Predicted future fate of COSMOS galaxy protoclusters over 11 Gyr with constrained simulations

Cosmological simulations are crucial tools in studying the Universe, but they typically do not directly match real observed structures. Constrained cosmological simulations, on the other hand, are designed to match the observed distribution of galaxies. Here we present constrained simulations based on spectroscopic surveys at a redshift of z~2.3, corresponding to an epoch of nearly 11 Gyrs ago. This allows us to 'fast-forward' the simulation to our present-day and study the evolution of observed cosmic structures self-consistently. We confirm that several previously-reported protoclusters will evolve into massive galaxy clusters by our present epoch, including the 'Hyperion' structure that we predict will collapse into a giant filamentary supercluster spanning 100 Megaparsecs. We also discover previously unknown protoclusters, with lower final masses than typically detectable by other methods, that nearly double the number of known protoclusters within this volume. Constrained simulations, applied to future high-redshift datasets, represent a unique opportunity for studying early structure formation and matching galaxy properties between high and low redshifts.

preprint · 2022 · arXiv

Stratospheric Balloons as a Complement to the Next Generation of Astronomy Missions

Observations that require large physical instrument dimensions and/or a considerable amount of cryogens, as is the case, for example, for high spatial resolution far infrared astronomy, currently still face technological limits for their execution from space. The high cost and finality of space missions furthermore call for a very low risk approach and entail long development times. For certain spectral regions, prominently including the mid- to far-infrared as well as parts of the ultraviolet, stratospheric balloons offer a flexible and affordable complement to space telescopes, with short development times and comparatively good observing conditions. Yet, the entry burden to use balloon-borne telescopes is high, with research groups typically having to shoulder part of the infrastructure development as well. Aiming to ease access to balloon-based observations, we present the efforts towards a community-accessible balloon-based observatory, the European Stratospheric Balloon Observatory (ESBO). ESBO aims at complementing space-based and airborne capabilities over the next 10-15 years and at adding to the current landscape of scientific ballooning activities by providing a service-centered infrastructure for broader astronomical use, performing regular flights and offering an operations concept that provides researchers with a similar proposal-based access to observation time as practiced on ground-based observatories. We present details on the activities planned towards the goal of ESBO, the current status of the STUDIO (Stratospheric UV Demonstrator of an Imaging Observatory) prototype platform and mission, as well as selected technology developments with extensibility potential to space missions undertaken for STUDIO.

preprint · 2022 · arXiv

Stratospheric balloons as a platform for the next large far infrared observatory

Observations that require large physical instrument dimensions and/or a considerable amount of cryogens, as it is the case for high spatial resolution far infrared (FIR) astronomy, currently still face technological limits for their execution from space. Angular resolution and available observational capabilities are particularly affected. Balloon-based platforms promise to complement the existing observational capabilities by offering means to deploy comparatively large telescopes with comparatively little effort, including other advantages such as the possibility to regularly refill cryogens and to change and/or update instruments. The planned European Stratospheric Balloon Observatory (ESBO) aims at providing these additional large aperture FIR capabilities, exceeding the spatial resolution of Herschel, in the long term. The plans focus on reusable platforms performing regular flights and an operations concept that provides researchers with proposal-based access to observations. It thereby aims at offering a complement to other airborne, ground-based and space-based observatories in terms of access to wavelength regions, spatial resolution capability, and photometric stability. While the FIR capabilities are a main long-term objective, ESBO will offer benefits in other wavelength regimes along the way. Within the ESBO Design Study (ESBO DS), a prototype platform carrying a 0.5 m telescope for ultraviolet and visible light observations is being built and a platform concept for a next-generation FIR telescope is being studied. A flight of the UV/VIS prototype platform is estimated for 2021. In this paper we will outline the scientific and technical motivation for a large aperture balloon-based FIR observatory and the ESBO DS approach towards such an infrastructure. Secondly, we will present the technical motivation, science case, and instrumentation of the 0.5 m UV/VIS platform.

preprint · 2022 · arXiv

Variable Bitrate Neural Fields

Neural approximations of scalar and vector fields, such as signed distance functions and radiance fields, have emerged as accurate, high-quality representations. State-of-the-art results are obtained by conditioning a neural approximation with a lookup from trainable feature grids that take on part of the learning task and allow for smaller, more efficient neural networks. Unfortunately, these feature grids usually come at the cost of significantly increased memory consumption compared to stand-alone neural network models. We present a dictionary method for compressing such feature grids, reducing their memory consumption by up to 100x and permitting a multiresolution representation which can be useful for out-of-core streaming. We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available and with dynamic topology and structure. Our source code will be available at https://github.com/nv-tlabs/vqad.

preprint · 2022 · arXiv

Zero and Few-shot Learning for Author Profiling

Author profiling classifies author characteristics by analyzing how language is shared among people. In this work, we study that task from a low-resource viewpoint: using little or no training data. We explore different zero and few-shot models based on entailment and evaluate our systems on several profiling tasks in Spanish and English. In addition, we study the effect of both the entailment hypothesis and the size of the few-shot training sample. We find that entailment-based models out-perform supervised text classifiers based on roberta-XLM and that we can reach 80% of the accuracy of previous approaches using less than 50% of the training data on average.

preprint · 2021 · arXiv

Measurement-induced dark state phase transitions in long-ranged fermion systems

We identify an unconventional algebraic scaling phase in the quantum dynamics of free fermions with long range hopping, which are exposed to continuous local density measurements. The unconventional phase is characterized by an algebraic entanglement entropy growth, and by a slow algebraic decay of the density-density correlation function, both with a fractional exponent. It occurs for hopping decay exponents $1< p \lesssim 3/2$ independently of the measurement rate. The algebraic phase gives rise to two critical lines, separating it from a critical phase with logarithmic entanglement growth at small, and an area law phase with constant entanglement entropy at large monitoring rates. A perturbative renormalization group analysis suggests that the transitions to the long-range phase are also unconventional, corresponding to a modified sine-Gordon theory. Comparing exact numerical simulations of the monitored wave functions with analytical predictions from a replica field theory approach yields an excellent quantitative agreement. This confirms the view of a measurement-induced phase transition as a quantum phase transition in the dark state of an effective, non-Hermitian Hamiltonian.

preprint · 2020 · arXiv

Development of swarm behavior in artificial learning agents that adapt to different foraging environments

Collective behavior, and swarm formation in particular, has been studied from several perspectives within a large variety of fields, ranging from biology to physics. In this work, we apply Projective Simulation to model each individual as an artificial learning agent that interacts with its neighbors and surroundings in order to make decisions and learn from them. Within a reinforcement learning framework, we discuss one-dimensional learning scenarios where agents need to get to food resources to be rewarded. We observe how different types of collective motion emerge depending on the distance the agents need to travel to reach the resources. For instance, strongly aligned swarms emerge when the food source is placed far away from the region where agents are situated initially. In addition, we study the properties of the individual trajectories that occur within the different types of emergent collective dynamics. Agents trained to find distant resources exhibit individual trajectories with Lévy-like characteristics as a consequence of the collective motion, whereas agents trained to reach nearby resources present Brownian-like trajectories.

preprint · 2020 · arXiv

IDEAS: Immersive Dome Experiences for Accelerating Science

Astrophysics lies at the crossroads of big datasets (such as the Large Synoptic Survey Telescope and Gaia), open source software to visualize and interpret high dimensional datasets (such as Glue, WorldWide Telescope, and OpenSpace), and uniquely skilled software engineers who bridge data science and research fields. At the same time, more than 4,000 planetariums across the globe immerse millions of visitors in scientific data. We have identified the potential for critical synergy across data, software, hardware, locations, and content that -- if prioritized over the next decade -- will drive discovery in astronomical research. Planetariums can and should be used for the advancement of scientific research. Current facilities such as the Hayden Planetarium in New York City, Adler Planetarium in Chicago, Morrison Planetarium in San Francisco, the Iziko Planetarium and Digital Dome Research Consortium in Cape Town, and Visualization Center C in Norrkoping are already developing software which ingests catalogs of astronomical and multi-disciplinary data critical for exploration research primarily for the purpose of creating scientific storylines for the general public. We propose a transformative model whereby scientists become the audience and explorers in planetariums, utilizing software for their own investigative purposes. In this manner, research benefits from the authentic and unique experience of data immersion contained in an environment bathed in context and equipped for collaboration. Consequently, in this white paper we argue that over the next decade the research astronomy community should partner with planetariums to create visualization-based research opportunities for the field. Realizing this vision will require new investments in software and human capital.

preprint · 2020 · arXiv

Neural Control Variates

We propose neural control variates (NCV) for unbiased variance reduction in parametric Monte Carlo integration. So far, the core challenge of applying the method of control variates has been finding a good approximation of the integrand that is cheap to integrate. We show that a set of neural networks can face that challenge: a normalizing flow that approximates the shape of the integrand and another neural network that infers the solution of the integral equation. We also propose to leverage a neural importance sampler to estimate the difference between the original integrand and the learned control variate. To optimize the resulting parametric estimator, we derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice. When applied to light transport simulation, neural control variates are capable of matching the state-of-the-art performance of other unbiased approaches, while providing means to develop more performant, practical solutions. Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.

preprint · 2020 · arXiv

Small Bodies: Near and Far Database for thermal infrared observations of small bodies in the Solar System

In this paper we present the "Small Bodies: Near and Far" Infrared Database, an easy-to-use tool intended to facilitate the modeling of thermal emission of small Solar System bodies. Our database collects thermal emission measurements of small Solar System targets that are otherwise available in scattered sources and gives a complete description of the data, with all information necessary to perform direct scientific analyses and without the need to access additional, external resources. This public database contains representative data of asteroid observations of large surveys (e.g. AKARI, IRAS and WISE) as well as a collection of small body observations of infrared space telescopes (e.g. the Herschel Space Observatory) and provides a web interface to access this data (https://ird.konkoly.hu). We also provide an example for the direct application of the database and show how it can be used to estimate the thermal inertia of specific populations, e.g. asteroids within a given size range. We show how different scalings of thermal inertia with heliocentric distance (i.e. temperature) may affect our interpretation of the data and discuss why the widely-used radiative conductivity exponent ($\alpha = -3/4$) might not be adequate in general, as hinted by previous studies.

preprint · 2020 · arXiv

TAPAS: Weakly Supervised Table Parsing via Pre-training

Answering natural language questions over tables is usually seen as a semantic parsing task. To alleviate the collection cost of full logical forms, one popular approach focuses on weak supervision consisting of denotations instead of logical forms. However, training semantic parsers from weak supervision poses difficulties, and in addition, the generated logical forms are only used as an intermediate step prior to retrieving the denotation. In this paper, we present TAPAS, an approach to question answering over tables without generating logical forms. TAPAS trains from weak supervision, and predicts the denotation by selecting table cells and optionally applying a corresponding aggregation operator to such selection. TAPAS extends BERT's architecture to encode tables as input, initializes from an effective joint pre-training of text segments and tables crawled from Wikipedia, and is trained end-to-end. We experiment with three different semantic parsing datasets, and find that TAPAS outperforms or rivals semantic parsing models by improving state-of-the-art accuracy on SQA from 55.1 to 67.2 and performing on par with the state-of-the-art on WIKISQL and WIKITQ, but with a simpler model architecture. We additionally find that transfer learning, which is trivial in our setting, from WIKISQL to WIKITQ, yields 48.7 accuracy, 4.2 points above the state-of-the-art.

preprint · 2019 · arXiv

Finite non-cyclic $p$-groups whose number of subgroups is minimal

Recent results of Qu and Tărnăuceanu describe explicitly the finite $p$-groups which are not elementary abelian and have the property that the number of their subgroups is maximal among $p$-groups of a given order. We complement these results from the bottom level up by determining completely the non-cyclic finite $p$-groups whose number of subgroups among $p$-groups of a given order is minimal.

preprint · 2019 · arXiv

Trans-Neptunian objects and Centaurs at thermal wavelengths

The thermal emission of transneptunian objects (TNO) and Centaurs has been observed at mid- and far-infrared wavelengths (with the biggest contributions coming from the Spitzer and Herschel space observatories), and the brightest ones also at sub-millimeter and millimeter wavelengths. These measurements made it possible to determine the sizes and albedos for almost 180 objects, and densities for about 25 multiple systems. The derived very low thermal inertias show evidence for a decrease at large heliocentric distances and for high-albedo objects, which indicates porous and low-conductivity surfaces. The radio emissivity was found to be low ($\epsilon_r = 0.70 \pm 0.13$) with possible spectral variations in a few cases. The general increase of density with object size points to different formation locations or times. The mean albedos increase from about 5-6% (Centaurs, Scattered-Disk Objects) to 15% for the Detached objects, with distinct cumulative albedo distributions for hot and cold classicals. The color-albedo separation in our sample is evidence for a compositional discontinuity in the young Solar System. The median albedo of the sample (excluding dwarf planets and the Haumea family) is 0.08, the albedo of Haumea family members is close to 0.5, best explained by the presence of water ice. The existing thermal measurements remain a treasure trove at times where the far-infrared regime is observationally not accessible.