Source author record

Laurent Callot

Laurent Callot appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; Artificial Intelligence; Methodology; Applications; Computation; Distributed, Parallel, and Cluster Computing; math.ST; Software Engineering; Statistics Theory

Catalog footprint

What is connected

11 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

TerraFormer: Automated Infrastructure-as-Code with LLMs Fine-Tuned via Policy-Guided Verifier Feedback

Automating Infrastructure-as-Code (IaC) is challenging, and large language models (LLMs) often produce incorrect configurations from natural language (NL). We present TerraFormer, a neuro-symbolic framework for IaC generation and mutation that combines supervised fine-tuning with verifier-guided reinforcement learning, using formal verification tools to provide feedback on syntax, deployability, and policy compliance. We curate two large, high-quality NL-to-IaC datasets, TF-Gen (152k instances) and TF-Mutn (52k instances), via multi-stage verification and iterative LLM self-correction. Evaluations against 17 state-of-the-art LLMs, including ~50x larger models like Sonnet 3.7, DeepSeek-R1, and GPT-4.1, show that TerraFormer improves correctness over its base LLM by 15.94% on IaC-Eval, 11.65% on TF-Gen (Test), and 19.60% on TF-Mutn (Test). It outperforms larger models on both TF-Gen (Test) and TF-Mutn (Test), ranks third on IaC-Eval, and achieves top best-practices and security compliance.

preprint · 2022 · arXiv

Deep Generative model with Hierarchical Latent Factors for Time Series Anomaly Detection

Multivariate time series anomaly detection has become an active area of research in recent years, with Deep Learning models outperforming previous approaches on benchmark datasets. Among reconstruction-based models, most previous work has focused on Variational Autoencoders and Generative Adversarial Networks. This work presents DGHL, a new family of generative models for time series anomaly detection, trained by maximizing the observed likelihood by posterior sampling and alternating back-propagation. A top-down Convolution Network maps a novel hierarchical latent space to time series windows, exploiting temporal dynamics to encode information efficiently. Despite relying on posterior sampling, it is computationally more efficient than current approaches, with up to 10x shorter training times than RNN based models. Our method outperformed current state-of-the-art models on four popular benchmark datasets. Finally, DGHL is robust to variable features between entities and accurate even with large proportions of missing values, settings with increasing relevance with the advent of IoT. We demonstrate the superior robustness of DGHL with novel occlusion experiments in this literature. Our code is available at https://github.com/cchallu/dghl.

preprint · 2022 · arXiv

Deep Learning for Time Series Forecasting: Tutorial and Literature Survey

Deep learning-based forecasting methods have become the methods of choice in many applications of time series prediction or forecasting, often outperforming other approaches. Consequently, over the last few years, these methods have become ubiquitous in large-scale industrial forecasting applications and have consistently ranked among the best entries in forecasting competitions (e.g., M4 and M5). This practical success has further increased the academic interest in understanding and improving deep forecasting methods. In this article we provide an introduction and overview of the field: we present important building blocks for deep forecasting in some depth; using these building blocks, we then survey the breadth of the recent deep forecasting literature.

preprint · 2022 · arXiv

Online Time Series Anomaly Detection with State Space Gaussian Processes

We propose r-ssGPFA, an unsupervised online anomaly detection model for uni- and multivariate time series building on the efficient state space formulation of Gaussian processes. For high-dimensional time series, we propose an extension of Gaussian process factor analysis to identify the common latent processes of the time series, allowing us to detect anomalies efficiently in an interpretable manner. We gain explainability while speeding up computations by imposing an orthogonality constraint on the mapping from the latent to the observed. Our model's robustness is improved by using a simple heuristic to skip Kalman updates when encountering anomalous observations. We investigate the behaviour of our model on synthetic data and show on standard benchmark datasets that our method is competitive with state-of-the-art methods while being computationally cheaper.
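The idea of skipping Kalman updates on anomalous observations can be illustrated with a scalar filter. The following is a minimal sketch under stated assumptions, not the r-ssGPFA model itself: the random-walk state model, the noise variances `q` and `r`, the z-score `threshold`, and the function name `robust_local_level_filter` are all hypothetical choices made for illustration.

```python
import math


def robust_local_level_filter(ys, q=0.01, r=1.0, threshold=3.0, m0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk level with an update-skipping rule.

    Returns (filtered_means, anomaly_flags). All parameters are illustrative
    assumptions, not values taken from the paper.
    """
    m, p = m0, p0
    means, flags = [], []
    for y in ys:
        # Predict step: the random-walk mean is unchanged, the variance grows by q.
        p = p + q
        # Innovation and its variance under the model.
        innovation = y - m
        s = p + r
        z = abs(innovation) / math.sqrt(s)
        is_anomaly = z > threshold
        if not is_anomaly:
            # Standard Kalman update.
            k = p / s
            m = m + k * innovation
            p = (1.0 - k) * p
        # If the observation looks anomalous, the update is skipped and the
        # predicted mean and variance are carried forward unchanged.
        means.append(m)
        flags.append(is_anomaly)
    return means, flags


if __name__ == "__main__":
    series = [0.0] * 20 + [50.0] + [0.0] * 5
    means, flags = robust_local_level_filter(series)
    print("flagged indices:", [i for i, f in enumerate(flags) if f])
```

Skipping the update keeps a single large spike from dragging the filtered level toward it, at the cost of reacting more slowly if the spike turns out to be the start of a genuine level shift.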

preprint · 2022 · arXiv

Robust Projection based Anomaly Extraction (RPE) in Univariate Time-Series

This paper presents a novel, closed-form, data- and computation-efficient online anomaly detection algorithm for time-series data. The proposed method, dubbed RPE, is window-based and, in sharp contrast to existing window-based methods, it is robust to the presence of anomalies in its window and can distinguish anomalies at the time-stamp level. RPE leverages the linear structure of the trajectory matrix of the time series and employs a robust projection step that lets the algorithm handle multiple arbitrarily large anomalies in its window. A closed-form, non-iterative algorithm for the robust projection step is provided, and it is proved that it can identify the corrupted time-stamps. RPE is a strong candidate for applications where large training data is not available, which is a common scenario in the area of time series. An extensive set of numerical experiments shows that RPE can outperform existing approaches by a notable margin.

preprint · 2022 · arXiv

Spliced Binned-Pareto Distribution for Robust Modeling of Heavy-tailed Time Series

This work proposes a novel method to robustly and accurately model time series with heavy-tailed noise in non-stationary scenarios. In many practical applications, time series have heavy-tailed noise that significantly impacts the performance of classical forecasting models; in particular, accurately modeling a distribution over extreme events is crucial to performing accurate time series anomaly detection. We propose a Spliced Binned-Pareto distribution which is both robust to extreme observations and allows accurate modeling of the full distribution. Our method allows the capture of time dependencies in the higher-order moments of the distribution, such as the tail heaviness. We compare the robustness and the accuracy of the tail estimation of our method to other state-of-the-art methods on Twitter mentions count time series.

preprint · 2022 · arXiv

Testing Granger Non-Causality in Panels with Cross-Sectional Dependencies

This paper proposes a new approach for testing Granger non-causality on panel data. Instead of aggregating panel member statistics, we aggregate their corresponding p-values and show that the resulting p-value approximately bounds the type I error by the chosen significance level even if the panel members are dependent. We compare our approach against the most widely used Granger causality algorithm on panel data and show that our approach yields lower FDR at the same power for large sample sizes and panels with cross-sectional dependencies. Finally, we examine COVID-19 data about confirmed cases and deaths measured in countries/regions worldwide and show that our approach is able to discover the true causal relation between confirmed cases and deaths while state-of-the-art approaches fail.
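The abstract does not state which p-value aggregation rule the paper uses. As an illustration of how per-member p-values can be combined without assuming independence, the sketch below uses the Bonferroni combination min(1, n * min p), a standard rule that is valid under arbitrary dependence. It is a swapped-in example and should not be read as the paper's method; the function name `bonferroni_combine` is hypothetical.

```python
def bonferroni_combine(p_values):
    """Combine per-panel-member p-values into one p-value for the global null.

    Uses min(1, n * min(p)), which controls the type I error under arbitrary
    dependence between panel members. This is an illustrative stand-in, not
    necessarily the aggregation rule used in the paper.
    """
    ps = list(p_values)
    if not ps:
        raise ValueError("need at least one p-value")
    if any(p < 0.0 or p > 1.0 for p in ps):
        raise ValueError("p-values must lie in [0, 1]")
    return min(1.0, len(ps) * min(ps))


if __name__ == "__main__":
    # Hypothetical p-values from Granger non-causality tests on three members.
    member_p_values = [0.01, 0.5, 0.9]
    print(bonferroni_combine(member_p_values))
```

The global null of non-causality in every panel member would be rejected when the combined value falls below the chosen significance level.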

preprint · 2020 · arXiv

A simple and effective predictive resource scaling heuristic for large-scale cloud applications

We propose a simple yet effective policy for the predictive auto-scaling of horizontally scalable applications running in cloud environments, where compute resources can only be added with a delay, and where the deployment throughput is limited. Our policy uses a probabilistic forecast of the workload to make scaling decisions dependent on the risk aversion of the application owner. We show in our experiments using real-world and synthetic data that this policy compares favorably to mathematically more sophisticated approaches as well as to simple benchmark policies.
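A scaling decision driven by a probabilistic forecast and a risk-aversion setting might look like the sketch below. It is an illustration under assumptions, not the paper's policy: the forecast is assumed to be available as sample paths evaluated at the time new capacity would come online, risk aversion is expressed as a quantile level, and the names `empirical_quantile`, `target_instances`, and `capacity_per_instance` are hypothetical.

```python
import math


def empirical_quantile(samples, level):
    """Return the smallest sample value whose empirical CDF is >= level."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0.0 < level <= 1.0:
        raise ValueError("level must be in (0, 1]")
    ordered = sorted(samples)
    index = math.ceil(level * len(ordered)) - 1
    return ordered[max(index, 0)]


def target_instances(workload_samples, capacity_per_instance, risk_level):
    """Instances needed to cover the workload quantile chosen by risk_level.

    A higher risk_level (closer to 1) means a more risk-averse owner who
    provisions for more extreme forecast scenarios.
    """
    demand = empirical_quantile(workload_samples, risk_level)
    return max(1, math.ceil(demand / capacity_per_instance))


if __name__ == "__main__":
    # Hypothetical forecast samples of requests per second at the lead time.
    samples = [100, 200, 300, 400]
    for level in (0.5, 0.75, 1.0):
        print(level, target_instances(samples, 100, level))
```

Evaluating the forecast at the provisioning lead time, rather than at the current time, reflects the constraint that compute resources can only be added with a delay.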

preprint · 2020 · arXiv

Improve black-box sequential anomaly detector relevancy with limited user feedback

Anomaly detectors are often designed to catch statistical anomalies. End-users are typically not interested in all of the detected outliers, but only in those relevant to their application. Given an existing black-box sequential anomaly detector, this paper proposes a method to improve its user relevancy using a small amount of human feedback. As our first contribution, the method is agnostic to the detector: it only assumes access to its anomaly scores, without requiring any additional information about its internals. Inspired by the fact that anomalies are of different types, our approach identifies these types and utilizes user feedback to assign relevancy to types. This relevancy score, as our second contribution, is used to adjust the subsequent anomaly selection process. Empirical results on synthetic and real-world datasets show that our approach yields significant improvements in precision and recall over a range of anomaly detectors.

preprint · 2015 · arXiv

Sharp Threshold Detection Based on Sup-norm Error rates in High-dimensional Models

We propose a new estimator, the thresholded scaled Lasso, in high dimensional threshold regressions. First, we establish an upper bound on the $\ell_\infty$ estimation error of the scaled Lasso estimator of Lee et al. (2012). This is a non-trivial task as the literature on high-dimensional models has focused almost exclusively on $\ell_1$ and $\ell_2$ estimation errors. We show that this sup-norm bound can be used to distinguish between zero and non-zero coefficients at a much finer scale than would have been possible using classical oracle inequalities. Thus, our sup-norm bound is tailored to consistent variable selection via thresholding. Our simulations show that thresholding the scaled Lasso yields substantial improvements in terms of variable selection. Finally, we use our estimator to shed further empirical light on the long running debate on the relationship between the level of debt (public and private) and GDP growth.
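The link between a sup-norm bound and variable selection can be made concrete with a small example. If every coefficient estimate is within c of its true value, then true zeros have estimates of magnitude at most c, while true coefficients larger than 2c in magnitude have estimates larger than c, so hard thresholding at c separates the two groups. The sketch below only shows this thresholding step on made-up numbers; it does not implement the scaled Lasso, and the function names and values are hypothetical.

```python
def hard_threshold(estimates, threshold):
    """Set to zero every estimate whose magnitude does not exceed threshold."""
    return [b if abs(b) > threshold else 0.0 for b in estimates]


def selected_support(estimates, threshold):
    """Indices of the coefficients that survive hard thresholding."""
    return [j for j, b in enumerate(estimates) if abs(b) > threshold]


if __name__ == "__main__":
    # Made-up example: true coefficients and estimates with sup-norm error <= 0.1.
    true_beta = [0.0, 0.0, 1.0, -0.8]
    estimates = [0.05, -0.08, 0.93, -0.71]
    c = 0.1
    assert max(abs(e - b) for e, b in zip(estimates, true_beta)) <= c
    print(hard_threshold(estimates, c))
    print(selected_support(estimates, c))
```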

preprint · 2014 · arXiv

Vector Autoregressions with Parsimoniously Time Varying Parameters and an Application to Monetary Policy

This paper proposes a parsimoniously time varying parameter vector autoregressive model (with exogenous variables, VARX) and studies the properties of the Lasso and adaptive Lasso as estimators of this model. The parameters of the model are assumed to follow parsimonious random walks, where parsimony stems from the assumption that increments to the parameters have a non-zero probability of being exactly equal to zero. By varying the degree of parsimony our model can accommodate constant parameters, an unknown number of structural breaks, or parameters with a high degree of variation. We characterize the finite sample properties of the Lasso by deriving upper bounds on the estimation and prediction errors that are valid with high probability; and asymptotically we show that these bounds tend to zero with probability tending to one if the number of non zero increments grows slower than $\sqrt{T}$. By simulation experiments we investigate the properties of the Lasso and the adaptive Lasso in settings where the parameters are stable, experience structural breaks, or follow a parsimonious random walk. We use our model to investigate the monetary policy response to inflation and business cycle fluctuations in the US by estimating a parsimoniously time varying parameter Taylor rule. We document substantial changes in the policy response of the Fed in the 1980s and since 2008.
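A parsimonious random walk, in which each increment is exactly zero with positive probability, can be simulated in a few lines. This is a minimal sketch for a single parameter: the Gaussian distribution of the non-zero increments, the `change_prob` and `scale` parameters, and the function name are assumptions made for illustration and are not taken from the paper.

```python
import random


def simulate_parsimonious_random_walk(T, change_prob, scale=1.0, start=0.0, seed=0):
    """Simulate a parameter path of length T whose increments are often exactly zero.

    With probability change_prob the increment is drawn from N(0, scale^2);
    otherwise it is exactly zero. change_prob = 0 gives a constant parameter,
    small values give a few structural breaks, and change_prob = 1 gives an
    ordinary Gaussian random walk.
    """
    rng = random.Random(seed)
    path = [start]
    for _ in range(T - 1):
        if rng.random() < change_prob:
            increment = rng.gauss(0.0, scale)
        else:
            increment = 0.0
        path.append(path[-1] + increment)
    return path


if __name__ == "__main__":
    path = simulate_parsimonious_random_walk(T=200, change_prob=0.02)
    breaks = sum(1 for a, b in zip(path, path[1:]) if a != b)
    print("number of breaks:", breaks)
```

Varying `change_prob` corresponds to varying the degree of parsimony described in the abstract, from constant parameters to parameters that change in almost every period.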
