Source author record

Georgia Papacharalampous

Georgia Papacharalampous appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Methodology Machine Learning Applications Computation physics.ao-ph

Catalog footprint

What is connected

8works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Variable transformations in consistent loss functions

The empirical use of variable transformations within (strictly) consistent loss functions is widespread, yet a theoretical understanding is lacking. To address this gap, we develop a theoretical framework that establishes formal characterizations of (strict) consistency for such transformed loss functions. Our analysis focuses on two interrelated cases: (a) transformations applied solely to the realization variable and (b) bijective transformations applied jointly to both the realization and prediction variables. These cases extend the well-established framework of transformations applied exclusively to the prediction variable, as formalized by Osband's revelation principle. We further develop analogous characterizations for (strict) identification functions. The resulting theoretical framework is broadly applicable to statistical and machine learning methodologies. For instance, we apply the framework to Bregman and expectile loss functions to interpret empirical findings from models trained with transformed loss functions and systematically construct new identifiable and elicitable functionals, which we term respectively $g$-transformed expectation and $g$-transformed expectile. Applications of the framework to simulated and real-world data illustrate its practical utility in diverse settings. By unifying theoretical insights with practical applications, this work advances principled methodologies for designing loss functions in complex predictive tasks.

preprint2023arXiv

Hydrological post-processing for predicting extreme quantiles

Hydrological post-processing using quantile regression algorithms constitutes a prime means of estimating the uncertainty of hydrological predictions. Nonetheless, conventional large-sample theory for quantile regression does not apply sufficiently far in the tails of the probability distribution of the dependent variable. To overcome this limitation that could be crucial when the interest lies on flood events, hydrological post-processing through extremal quantile regression is introduced here for estimating the extreme quantiles of hydrological model's responses. In summary, the new hydrological post-processing method exploits properties of the Hill's estimator from the extreme value theory to extrapolate quantile regression's predictions to high quantiles. As a proof of concept, the new method is here tested in post-processing daily streamflow simulations provided by three process-based hydrological models for 180 basins in the contiguous United States (CONUS) and is further compared to conventional quantile regression. With this large-scale comparison, it is demonstrated that hydrological post-processing using conventional quantile regression severely underestimates high quantiles (at the quantile level 0.9999) compared to hydrological post-processing using extremal quantile regression, although both methods are equivalent at lower quantiles (at the quantile level 0.9700). Moreover, it is shown that, in the same context, extremal quantile regression estimates the high predictive quantiles with efficiency that is, on average, equivalent in the large-sample study for the three process-based hydrological models.

preprint2022arXiv

Time series features for supporting hydrometeorological explorations and predictions in ungauged locations using large datasets

Regression-based frameworks for streamflow regionalization are built around catchment attributes that traditionally originate from catchment hydrology, flood frequency analysis and their interplay. In this work, we deviated from this traditional path by formulating and extensively investigating the first regression-based streamflow regionalization frameworks that largely emerge from general-purpose time series features for data science and, more precisely, from a large variety of such features. We focused on 28 features that included (partial) autocorrelation, entropy, temporal variation, seasonality, trend, lumpiness, stability, nonlinearity, linearity, spikiness, curvature and others. We estimated these features for daily temperature, precipitation and streamflow time series from 511 catchments, and then merged them within regionalization contexts with traditional topographic, land cover, soil and geologic attributes. Precipitation and temperature features (e.g., the spectral entropy, seasonality strength and lag-1 autocorrelation of the precipitation time series, and the stability and trend strength of the temperature time series) were found to be useful predictors of many streamflow features. The same applies to traditional attributes, such as the catchment mean elevation. Relationships between predictor and dependent variables were also revealed, while the spectral entropy, the seasonality strength and several autocorrelation features of the streamflow time series were found to be more regionalizable than others.

preprint2021arXiv

Global-scale massive feature extraction from monthly hydroclimatic time series: Statistical characterizations, spatial patterns and hydrological similarity

Hydroclimatic time series analysis focuses on a few feature types (e.g., autocorrelations, trends, extremes), which describe a small portion of the entire information content of the observations. Aiming to exploit a larger part of the available information and, thus, to deliver more reliable results (e.g., in hydroclimatic time series clustering contexts), here we approach hydroclimatic time series analysis differently, i.e., by performing massive feature extraction. In this respect, we develop a big data framework for hydroclimatic variable behaviour characterization. This framework relies on approximately 60 diverse features and is completely automatic (in the sense that it does not depend on the hydroclimatic process at hand). We apply the new framework to characterize mean monthly temperature, total monthly precipitation and mean monthly river flow. The applications are conducted at the global scale by exploiting 40-year-long time series originating from over 13 000 stations. We extract interpretable knowledge on seasonality, trends, autocorrelation, long-range dependence and entropy, and on feature types that are met less frequently. We further compare the examined hydroclimatic variable types in terms of this knowledge and, identify patterns related to the spatial variability of the features. For this latter purpose, we also propose and exploit a hydroclimatic time series clustering methodology. This new methodology is based on Breiman's random forests. The descriptive and exploratory insights gained by the global-scale applications prove the usefulness of the adopted feature compilation in hydroclimatic contexts. Moreover, the spatially coherent patterns characterizing the clusters delivered by the new methodology build confidence in its future exploitation...

preprint2021arXiv

Probabilistic water demand forecasting using quantile regression algorithms

Machine and statistical learning algorithms can be reliably automated and applied at scale. Therefore, they can constitute a considerable asset for designing practical forecasting systems, such as those related to urban water demand. Quantile regression algorithms are statistical and machine learning algorithms that can provide probabilistic forecasts in a straightforward way, and have not been applied so far for urban water demand forecasting. In this work, we aim to fill this gap by automating and extensively comparing several quantile-regression-based practical systems for probabilistic one-day ahead urban water demand forecasting. For designing the practical systems, we use five individual algorithms (i.e., the quantile regression, linear boosting, generalized random forest, gradient boosting machine and quantile regression neural network algorithms), their mean combiner and their median combiner. The comparison is conducted by exploiting a large urban water flow dataset, as well as several types of hydrometeorological time series (which are considered as exogenous predictor variables in the forecasting setting). The results mostly favour the practical systems designed using the linear boosting algorithm, probably due to the presence of trends in the urban water flow time series. The forecasts of the mean and median combiners are also found to be skilful in general terms.

preprint2020arXiv

Hydrological time series forecasting using simple combinations: Big data testing and investigations on one-year ahead river flow predictability

Delivering useful hydrological forecasts is critical for urban and agricultural water management, hydropower generation, flood protection and management, drought mitigation and alleviation, and river basin planning and management, among others. In this work, we present and appraise a new simple and flexible methodology for hydrological time series forecasting. This methodology relies on (a) at least two individual forecasting methods and (b) the median combiner of forecasts. The appraisal is made by using a big dataset consisted of 90-year-long mean annual river flow time series from approximately 600 stations. Covering large parts of North America and Europe, these stations represent various climate and catchment characteristics, and thus can collectively support benchmarking. Five individual forecasting methods and 26 variants of the introduced methodology are applied to each time series. The application is made in one-step ahead forecasting mode. The individual methods are the last-observation benchmark, simple exponential smoothing, complex exponential smoothing, automatic autoregressive fractionally integrated moving average (ARFIMA) and Facebook's Prophet, while the 26 variants are defined by all the possible combinations (per two, three, four or five) of the five afore-mentioned methods. The new methodology is identified as well-performing in the long run, especially when more than two individual forecasting methods are combined within its framework. Moreover, the possibility of case-informed integrations of diverse hydrological forecasting methods within systematic frameworks is algorithmically investigated and discussed. The related investigations encompass linear regression analyses, which aim at finding interpretable relationships between the values of a representative forecasting performance metric and the values of selected river flow statistics...

preprint2020arXiv

Quantification of predictive uncertainty in hydrological modelling by harnessing the wisdom of the crowd: A large-sample experiment at monthly timescale

Predictive hydrological uncertainty can be quantified by using ensemble methods. If properly formulated, these methods can offer improved predictive performance by combining multiple predictions. In this work, we use 50-year-long monthly time series observed in 270 catchments in the United States to explore the performances provided by an ensemble learning post-processing methodology for issuing probabilistic hydrological predictions. This methodology allows the utilization of flexible quantile regression models for exploiting information about the hydrological model's error. Its key differences with respect to basic two-stage hydrological post-processing methodologies using the same type of regression models are that (a) instead of a single point hydrological prediction it generates a large number of "sister predictions" (yet using a single hydrological model), and that (b) it relies on the concept of combining probabilistic predictions via simple quantile averaging. A major hydrological modelling challenge is obtaining probabilistic predictions that are simultaneously reliable and associated to prediction bands that are as narrow as possible; therefore, we assess both these desired properties of the predictions by computing their coverage probabilities, average widths and average interval scores. The results confirm the usefulness of the proposed methodology and its larger robustness with respect to basic two-stage post-processing methodologies. Finally, this methodology is empirically proven to harness the "wisdom of the crowd" in terms of average interval score, i.e., the average of the individual predictions combined by this methodology scores no worse -- usually better -- than the average of the scores of the individual predictions.

preprint2020arXiv

Quantification of predictive uncertainty in hydrological modelling by harnessing the wisdom of the crowd: Methodology development and investigation using toy models

We introduce an ensemble learning post-processing methodology for probabilistic hydrological modelling. This methodology generates numerous point predictions by applying a single hydrological model, yet with different parameter values drawn from the respective simulated posterior distribution. We call these predictions "sister predictions". Each sister prediction extending in the period of interest is converted into a probabilistic prediction using information about the hydrological model's errors. This information is obtained from a preceding period for which observations are available, and is exploited using a flexible quantile regression model. All probabilistic predictions are finally combined via simple quantile averaging to produce the output probabilistic prediction. The idea is inspired by the ensemble learning methods originating from the machine learning literature. The proposed methodology offers larger robustness in performance than basic post-processing methodologies using a single hydrological point prediction. It is also empirically proven to "harness the wisdom of the crowd" in terms of average interval score, i.e., the obtained quantile predictions score no worse -- usually better -- than the average score of the combined individual predictions. This proof is provided within toy examples, which can be used for gaining insight on how the methodology works and under which conditions it can optimally convert point hydrological predictions to probabilistic ones. A large-scale hydrological application is made in a companion paper.

Georgia Papacharalampous

What is connected

Connect this record

See the researcher in context

Building this map preview

8 published item(s)

Variable transformations in consistent loss functions

Hydrological post-processing for predicting extreme quantiles

Time series features for supporting hydrometeorological explorations and predictions in ungauged locations using large datasets

Global-scale massive feature extraction from monthly hydroclimatic time series: Statistical characterizations, spatial patterns and hydrological similarity

Probabilistic water demand forecasting using quantile regression algorithms

Hydrological time series forecasting using simple combinations: Big data testing and investigations on one-year ahead river flow predictability

Quantification of predictive uncertainty in hydrological modelling by harnessing the wisdom of the crowd: A large-sample experiment at monthly timescale

Quantification of predictive uncertainty in hydrological modelling by harnessing the wisdom of the crowd: Methodology development and investigation using toy models