Source author record

Nikolaos Nikolaou

Nikolaos Nikolaou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: astro-ph.IM, Machine Learning, astro-ph.EP, physics.comp-ph, Information Theory, math.IT, physics.data-an

Catalog footprint

What is connected

8 works

7 topics

4 close collaborators


preprint, 2026, arXiv

Adaptive Online Emulation for Accelerating Complex Physical Simulations

Complex physical simulations often require trade-offs between model fidelity and computational feasibility. We introduce Adaptive Online Emulation (AOE), which dynamically learns neural network surrogates during simulation execution to accelerate expensive components. Unlike existing methods requiring extensive offline training, AOE uses Online Sequential Extreme Learning Machines (OS-ELMs) to continuously adapt emulators along the actual simulation trajectory. We employ a numerically stable variant of the OS-ELM using cumulative sufficient statistics to avoid matrix inversion instabilities. AOE integrates with time-stepping frameworks through a three-phase strategy balancing data collection, updates, and surrogate usage, while requiring orders of magnitude less training data than conventional surrogate approaches. Demonstrated on a 1D atmospheric model of exoplanet GJ1214b, AOE achieves an 11.1-fold speedup (91% time reduction) across 200,000 timesteps while maintaining accuracy, potentially making previously intractable high-fidelity time-stepping simulations computationally feasible.
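A minimal sketch of one plausible reading of "cumulative sufficient statistics" in the abstract above: instead of recursively updating an inverse matrix, the emulator accumulates the sums A = ridge·I + Σ hhᵀ and b = Σ h·y over all samples seen so far and re-solves Aβ = b after each chunk. The class name `OnlineELM`, the helper `solve`, the tanh activation, the weight ranges and the ridge term are all assumptions for illustration, not details taken from the paper.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


class OnlineELM:
    """Hypothetical 1D-input ELM trained from running sums (illustrative only)."""

    def __init__(self, n_hidden=8, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Random hidden-layer weights and biases are fixed once and never trained.
        self.w = [rng.uniform(-3.0, 3.0) for _ in range(n_hidden)]
        self.c = [rng.uniform(-3.0, 3.0) for _ in range(n_hidden)]
        # Sufficient statistics: A starts at ridge * I, b starts at zero.
        self.A = [[ridge if i == j else 0.0 for j in range(n_hidden)]
                  for i in range(n_hidden)]
        self.b = [0.0] * n_hidden
        self.beta = [0.0] * n_hidden

    def hidden(self, x):
        return [math.tanh(w * x + c) for w, c in zip(self.w, self.c)]

    def update(self, xs, ys):
        """Fold a new chunk into A and b, then re-solve for the output weights."""
        for x, y in zip(xs, ys):
            h = self.hidden(x)
            for i, hi in enumerate(h):
                self.b[i] += hi * y
                for j, hj in enumerate(h):
                    self.A[i][j] += hi * hj
        self.beta = solve(self.A, self.b)

    def predict(self, x):
        return sum(bi * hi for bi, hi in zip(self.beta, self.hidden(x)))
```

Because only the sums are stored, feeding the data in chunks gives the same output weights as a single batch fit over the same samples in the same order, and no explicit matrix inverse is ever carried between updates.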

preprint, 2022, arXiv

Don't Pay Attention to the Noise: Learning Self-supervised Representations of Light Curves with a Denoising Time Series Transformer

Astrophysical light curves are particularly challenging data objects due to the intensity and variety of noise contaminating them. Yet, despite the astronomical volumes of light curves available, the majority of algorithms used to process them still operate on a per-sample basis. To remedy this, we propose a simple Transformer model -- called Denoising Time Series Transformer (DTST) -- and show that it excels at removing the noise and outliers in datasets of time series when trained with a masked objective, even when no clean targets are available. Moreover, the use of self-attention enables rich and illustrative queries into the learned representations. We present experiments on real stellar light curves from the Transiting Exoplanet Survey Satellite (TESS), showing advantages of our approach compared to traditional denoising techniques.
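The masked objective mentioned in the abstract above can be illustrated with a small sketch: some time steps are hidden from the model, and the loss is scored only at those hidden positions against the original (still noisy) values, so no clean targets are needed. The function names, the placeholder value and the neighbour-averaging stand-in for the Transformer are assumptions for illustration and are not taken from the paper.

```python
import random


def make_mask(n, mask_fraction, rng):
    """Pick a random subset of positions to hide; True means masked."""
    k = max(1, int(round(n * mask_fraction)))
    hidden = set(rng.sample(range(n), k))
    return [i in hidden for i in range(n)]


def apply_mask(series, mask, placeholder=0.0):
    """Replace masked values with a placeholder before feeding the model."""
    return [placeholder if m else v for v, m in zip(series, mask)]


def neighbour_fill(values, mask):
    """Stand-in for the model: fill each masked point from its nearest
    unmasked neighbours. A real DTST would use self-attention instead."""
    n = len(values)
    out = list(values)
    for i in range(n):
        if not mask[i]:
            continue
        left = next((values[j] for j in range(i - 1, -1, -1) if not mask[j]), None)
        right = next((values[j] for j in range(i + 1, n) if not mask[j]), None)
        known = [v for v in (left, right) if v is not None]
        out[i] = sum(known) / len(known) if known else 0.0
    return out


def masked_mse(pred, target, mask):
    """Mean squared error over masked positions only."""
    errs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errs) / len(errs)
```

Since the model never sees the value it is asked to predict, it cannot simply copy the noise at that time step, which is the usual intuition for why masked training can denoise without clean targets.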

preprint, 2022, arXiv

ESA-Ariel Data Challenge NeurIPS 2022: Inferring Physical Properties of Exoplanets From Next-Generation Telescopes

The study of extra-solar planets, or simply exoplanets (planets outside our own Solar System), is fundamentally a grand quest to understand our place in the Universe. Discoveries in the last two decades have re-defined our understanding of planets, and helped us comprehend the uniqueness of our very own Earth. In recent years the focus has shifted from planet detection to planet characterisation, where key planetary properties are inferred from telescope observations using Monte Carlo-based methods. However, the efficiency of sampling-based methodologies is put under strain by the high-resolution observational data from next-generation telescopes, such as the James Webb Space Telescope and the Ariel Space Mission. We are delighted to announce the acceptance of the Ariel ML Data Challenge 2022 as part of the NeurIPS competition track. The goal of this challenge is to identify a reliable and scalable method to perform planetary characterisation. Depending on the chosen track, participants are tasked to provide either quartile estimates or the approximate distribution of key planetary properties. To this end, a synthetic spectroscopic dataset has been generated from the official simulators for the ESA Ariel Space Mission. The aims of the competition are three-fold: 1) to offer a challenging application for comparing and advancing conditional density estimation methods; 2) to provide a valuable contribution towards reliable and efficient analysis of spectroscopic data, enabling astronomers to build a better picture of planetary demographics; and 3) to promote the interaction between ML and exoplanetary science. The competition is open from 15th June and will run until early October. Participants of all skill levels are more than welcome!

preprint, 2022, arXiv

Fast Regression of the Tritium Breeding Ratio in Fusion Reactors

The tritium breeding ratio (TBR) is an essential quantity for the design of modern and next-generation D-T fueled nuclear fusion reactors. Representing the ratio between tritium fuel generated in breeding blankets and fuel consumed during reactor runtime, the TBR depends on reactor geometry and material properties in a complex manner. In this work, we explored the training of surrogate models to produce a cheap but high-quality approximation for a Monte Carlo TBR model in use at the UK Atomic Energy Authority. We investigated possibilities for dimensional reduction of its feature space, reviewed 9 families of surrogate models for potential applicability, and performed hyperparameter optimisation. Here we present the performance and scaling properties of these models, the fastest of which, an artificial neural network, demonstrated $R^2=0.985$ and a mean prediction time of $0.898\ \mu\mathrm{s}$, representing a relative speedup of $8\cdot 10^6$ with respect to the expensive MC model. We further present a novel adaptive sampling algorithm, Quality-Adaptive Surrogate Sampling, capable of interfacing with any of the individually studied surrogates. Our preliminary testing on a toy TBR theory has demonstrated the efficacy of this algorithm for accelerating the surrogate modelling process.
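As a rough consistency check on the figures quoted in the abstract above, and assuming the speedup is the ratio of mean evaluation times, the implied cost of one Monte Carlo evaluation can be estimated as follows. The symbols $S$, $t_{\mathrm{MC}}$ and $t_{\mathrm{surrogate}}$ are introduced here for illustration, and the result is only as precise as the rounded speedup.

```latex
t_{\mathrm{MC}} \approx S \cdot t_{\mathrm{surrogate}}
= 8\cdot 10^{6} \times 0.898\ \mu\mathrm{s}
\approx 7.2\ \mathrm{s}
```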

preprint, 2020, arXiv

Better Boosting with Bandits for Online Learning

Probability estimates generated by boosting ensembles are poorly calibrated because of the margin maximization nature of the algorithm. The outputs of the ensemble need to be properly calibrated before they can be used as probability estimates. In this work, we demonstrate that online boosting is also prone to producing distorted probability estimates. In batch learning, calibration is achieved by reserving part of the training data for training the calibrator function. In the online setting, a decision needs to be made on each round: should the new example(s) be used to update the parameters of the ensemble or those of the calibrator? We proceed to resolve this decision with the aid of bandit optimization algorithms. We demonstrate superior performance to uncalibrated and naively-calibrated online boosting ensembles in terms of probability estimation. Our proposed mechanism can be easily adapted to other tasks (e.g. cost-sensitive classification) and is robust to the choice of hyperparameters of both the calibrator and the ensemble.
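The per-round decision described in the abstract above can be framed as a two-armed bandit. The sketch below uses UCB1, a standard bandit algorithm, purely as an illustration: the abstract does not say which bandit algorithm or reward signal the authors use, so the class `UCB1`, the action names in `ACTIONS` and the idea of a reward in [0, 1] are assumptions.

```python
import math

# Hypothetical action names for the two choices available on each round.
ACTIONS = ("update_ensemble", "update_calibrator")


class UCB1:
    """Standard UCB1 bandit over a fixed number of arms."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms   # how often each arm was played
        self.means = [0.0] * n_arms  # running mean reward per arm

    def select(self):
        # Play every arm once before using the confidence bound.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        scores = [m + math.sqrt(2.0 * math.log(total) / n)
                  for m, n in zip(self.means, self.counts)]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incremental mean update; reward is assumed to lie in [0, 1].
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

On each round one would call `select()`, route the incoming example to the ensemble or the calibrator according to `ACTIONS[arm]`, and then call `update(arm, reward)` with some measure of how much that choice improved the probability estimates.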

preprint, 2020, arXiv

Detrending Exoplanetary Transit Light Curves with Long Short-Term Memory Networks

The precise derivation of transit depths from transit light curves is a key component for measuring exoplanet transit spectra, and hence for the study of exoplanet atmospheres. However, it is still deeply affected by various kinds of systematic errors and noise. In this paper we propose a new detrending method by reconstructing the stellar flux baseline during transit time. We train a probabilistic Long Short-Term Memory (LSTM) network to predict the next data point of the light curve during the out-of-transit phase, and use this model to reconstruct a transit-free light curve (i.e. one including only the systematics) during the in-transit phase. By making no assumption about the instrument, and using only the transit ephemeris, this provides a general way to correct the systematics and perform a subsequent transit fit. The name of the proposed model is TLCD-LSTM, standing for Transit Light Curve Detrending LSTM. Here we present the first results on data from six transit observations of HD 189733b with the IRAC camera on board the Spitzer Space Telescope, and discuss some of its possible further applications.
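The overall workflow in the abstract above (split the light curve using the ephemeris, learn the baseline from out-of-transit points, reconstruct it in transit, then divide) can be sketched as follows. A straight-line least-squares fit is swapped in for the probabilistic LSTM, which is a deliberate simplification. The function names and the parameters `t0`, `period` and `duration` are assumptions for illustration.

```python
def in_transit_mask(times, t0, period, duration):
    """Flag points within half a transit duration of any mid-transit time."""
    half = duration / 2.0
    mask = []
    for t in times:
        # Fold time onto the interval [-period/2, period/2) around mid-transit.
        phase = (t - t0 + period / 2.0) % period - period / 2.0
        mask.append(abs(phase) <= half)
    return mask


def linear_baseline(times, flux, mask):
    """Stand-in for the LSTM: fit a line to out-of-transit points only and
    evaluate it at every time, including in transit."""
    pts = [(t, f) for t, f, m in zip(times, flux, mask) if not m]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_f = sum(f for _, f in pts) / n
    slope = (sum((t - mean_t) * (f - mean_f) for t, f in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))
    return [mean_f + slope * (t - mean_t) for t in times]


def detrend(times, flux, t0, period, duration):
    """Divide the observed flux by the reconstructed transit-free baseline."""
    mask = in_transit_mask(times, t0, period, duration)
    baseline = linear_baseline(times, flux, mask)
    return [f / b for f, b in zip(flux, baseline)]
```

The detrended curve sits near 1 out of transit and dips by the transit depth in transit, which is the quantity a subsequent transit fit would measure.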

preprint, 2020, arXiv

Margin Maximization as Lossless Maximal Compression

The ultimate goal of a supervised learning algorithm is to produce models constructed on the training data that can generalize well to new examples. In classification, functional margin maximization -- correctly classifying as many training examples as possible with maximal confidence -- has been known to construct models with good generalization guarantees. This work gives an information-theoretic interpretation of a margin maximizing model on a noiseless training dataset as one that achieves lossless maximal compression of said dataset -- i.e. extracts from the features all the useful information for predicting the label and no more. The connection offers new insights on generalization in supervised machine learning, showing margin maximization as a special case (that of classification) of a more general principle and explaining the success and potential limitations of popular learning algorithms like gradient boosting. We support our observations with theoretical arguments and empirical evidence and identify interesting directions for future work.

preprint, 2020, arXiv

Pushing the Limits of Exoplanet Discovery via Direct Imaging with Deep Learning

Further advances in exoplanet detection and characterisation require sampling a diverse population of extrasolar planets. One technique to detect these distant worlds is through the direct detection of their thermal emission. The so-called direct imaging technique is suitable for observing young planets far from their star. These are very low signal-to-noise-ratio (SNR) measurements, and limited ground truth hinders the use of supervised learning approaches. In this paper, we combine deep generative and discriminative models to bypass the issues arising when directly training on real data. We use a Generative Adversarial Network to obtain a suitable dataset for training Convolutional Neural Network classifiers to detect and locate planets across a wide range of SNRs. Tested on artificial data, our detectors exhibit good predictive performance and robustness across SNRs. To demonstrate the limits of the detectors, we provide maps of the precision and recall of the model per pixel of the input image. On real data, the models can re-confirm bright source detections.
