Source author record

Ioannis Kosmidis

Ioannis Kosmidis appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Methodology, Applications, math.ST, Statistics Theory, Computation, q-fin.PM, q-fin.RM, q-fin.ST

Catalog footprint

What is connected

10 works

8 topics

4 close collaborators

preprint · 2026 · arXiv

Diaconis-Ylvisaker prior penalized likelihood for $p/n \to κ\in (0,1)$ logistic regression

We characterise the behavior of the maximum Diaconis-Ylvisaker prior penalized likelihood estimator in high-dimensional logistic regression, where the number of covariates is a fraction $κ\in (0,1)$ of the number of observations $n$, as $n \to \infty$. We construct a rescaled estimator with zero asymptotic aggregate bias and define adjusted $Z$-statistics and rescaled penalized likelihood ratio statistics that exhibit the typical null asymptotic distributions, when the covariates are independent multivariate normal with an arbitrary covariance matrix and the linear predictor has asymptotic variance $γ^2$. While the maximum likelihood estimate asymptotically exists only for a narrow range of $(κ, γ)$ values, the maximum Diaconis-Ylvisaker prior penalized likelihood estimate always exists and can be computed directly using standard maximum likelihood routines. Thus, our asymptotic results extend to $(κ, γ)$ values where the maximum likelihood framework breaks down, with no additional implementation or computational cost. We study the estimator's shrinkage properties, compare the proposed estimation and inference procedures with alternatives that also accommodate proportional asymptotics, and formulate a conjecture, supported by strong empirical evidence, that extends our results when the model includes an intercept parameter. Finally, we propose estimation methods for all unknown constants involved in our procedures and demonstrate the theoretical advances through extensive simulation studies and the analysis of digit recognition data.

preprint · 2022 · arXiv

Mean and median bias reduction: A concise review and application to adjacent-categories logit models

The estimation of categorical response models using bias-reducing adjusted score equations has seen extensive theoretical research and applied use. The resulting estimates have been found to have superior frequentist properties to what maximum likelihood generally delivers and to be finite, even in cases where the maximum likelihood estimates are infinite. We briefly review mean and median bias reduction of maximum likelihood estimates via adjusted score equations in an illustration-driven way, and discuss their particular equivariance properties under parameter transformations. We then apply mean and median bias reduction to adjacent-categories logit models for ordinal responses. We show how ready bias reduction procedures for Poisson log-linear models can be used for mean and median bias reduction in adjacent-categories logit models with proportional odds and mean bias-reduced estimation in models with non-proportional odds. As in binomial logistic regression, the reduced-bias estimates are found to be finite even in cases where the maximum likelihood estimates are infinite. We also use the approximation of the bias of transformations of mean bias-reduced estimators to correct for the mean bias of model-based ordinal superiority measures. All developments are motivated and illustrated using real-data case studies and simulations.

preprint · 2022 · arXiv

Parametric bootstrap inference for stratified models with high-dimensional nuisance specifications

Inference about a scalar parameter of interest typically relies on the asymptotic normality of common likelihood pivots, such as the signed likelihood root, the score and Wald statistics. Nevertheless, the resulting inferential procedures are known to perform poorly when the dimension of the nuisance parameter is large relative to the sample size and when the information about the parameters is limited. In many such cases, the use of asymptotic normality of analytical modifications of the signed likelihood root is known to recover inferential performance. It is proved here that parametric bootstrap of standard likelihood pivots results in as accurate inferences as analytical modifications of the signed likelihood root do in stratified models with stratum-specific nuisance parameters. We focus on the challenging case where the number of strata increases as fast as or faster than the stratum sample size. It is also shown that this equivalence holds regardless of whether constrained or unconstrained bootstrap is used. This is in contrast to when the number of strata is fixed or increases slower than the stratum sample size, where we show that constrained bootstrap corrects inference to a higher order than unconstrained bootstrap. Simulation experiments support the theoretical findings and demonstrate the excellent performance of bootstrap in extreme scenarios.

preprint · 2021 · arXiv

Bias Reduction as a Remedy to the Consequences of Infinite Estimates in Poisson and Tobit Regression

Data separation is a well-studied phenomenon that can cause problems in the estimation and inference from binary response models. Complete or quasi-complete separation occurs when there is a combination of regressors in the model whose value can perfectly predict one or both outcomes. In such cases, and such cases only, the maximum likelihood estimates and the corresponding standard errors are infinite. It is less widely known that the same can happen in further microeconometric models. One of the few works in the area is Santos Silva and Tenreyro (2010), who note that the finiteness of the maximum likelihood estimates in Poisson regression depends on the data configuration and propose a strategy to detect and overcome the consequences of data separation. However, their approach can lead to notable bias in the parameter estimates when the regressors are correlated. We illustrate how bias-reducing adjustments to the maximum likelihood score equations can overcome the consequences of separation in Poisson and Tobit regression models.

preprint · 2020 · arXiv

A Bayesian inference approach for determining player abilities in football

We consider the task of determining a football player's ability for a given event type, for example, scoring a goal. We propose an interpretable Bayesian model which is fit using variational inference methods. We implement a Poisson model to capture occurrences of event types, from which we infer player abilities. Our approach also allows the visualisation of differences between players, for a specific ability, through the marginal posterior variational densities. We then use these inferred player abilities to extend the Bayesian hierarchical model of Baio and Blangiardo (2010) which captures a team's scoring rate (the rate at which they score goals). We apply the resulting scheme to the English Premier League, capturing player abilities over the 2013/2014 season, before using output from the hierarchical model to predict whether over or under 2.5 goals will be scored in a given game in the 2014/2015 season. This validates our model as a way of providing insights into team formation and the individual success of sports teams.

preprint · 2020 · arXiv

Jeffreys-prior penalty, finiteness and shrinkage in binomial-response generalized linear models

Penalization of the likelihood by Jeffreys' invariant prior, or by a positive power thereof, is shown to produce finite-valued maximum penalized likelihood estimates in a broad class of binomial generalized linear models. The class of models includes logistic regression, where the Jeffreys-prior penalty is known additionally to reduce the asymptotic bias of the maximum likelihood estimator; and also models with other commonly used link functions such as probit and log-log. Shrinkage towards equiprobability across observations, relative to the maximum likelihood estimator, is established theoretically and is studied through illustrative examples. Some implications of finiteness and shrinkage for inference are discussed, particularly when inference is based on Wald-type procedures. A widely applicable procedure is developed for computation of maximum penalized likelihood estimates, by using repeated maximum likelihood fits with iteratively adjusted binomial responses and totals. These theoretical results and methods underpin the increasingly widespread use of reduced-bias and similarly penalized binomial regression models in many applied fields.
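
The logistic-regression case mentioned in this abstract can be illustrated with a minimal sketch. It is not the procedure developed in the paper (which uses repeated maximum likelihood fits with adjusted responses and totals); instead it assumes a plain quasi-Newton iteration on the Jeffreys-prior adjusted score for the logit link, X^T (y - pi + h (1/2 - pi)), where h holds the hat values. The function names `adjusted_score` and `firth_logistic` and the toy data are hypothetical.

```python
import numpy as np


def adjusted_score(X, y, beta):
    """Return the Jeffreys-prior adjusted score and the Fisher information (logit link)."""
    pi = 1.0 / (1.0 + np.exp(-X @ beta))          # fitted probabilities
    w = pi * (1.0 - pi)                           # working weights
    info = X.T @ (w[:, None] * X)                 # Fisher information X^T W X
    info_inv = np.linalg.inv(info)
    # Hat values h_i = w_i * x_i^T (X^T W X)^{-1} x_i
    h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
    score = X.T @ (y - pi + h * (0.5 - pi))
    return score, info


def firth_logistic(X, y, max_iter=200, tol=1e-10):
    """Sketch: solve the adjusted score equations by Fisher-scoring-type steps."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        score, info = adjusted_score(X, y, beta)
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Toy, completely separated data: y = 1 exactly when x > 0, so the
# maximum likelihood slope estimate is infinite.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

beta_hat = firth_logistic(X, y)
final_score, _ = adjusted_score(X, y, beta_hat)
print(beta_hat)
```

On this separated toy data set the penalized estimates stay finite, which is the finiteness property the abstract describes. The adjusted score can also be written as (y + h/2) - (1 + h) pi, which matches the abstract's description of adjusted binomial responses and totals for the logit case.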

preprint · 2015 · arXiv

Linking the performance of endurance runners to training and physiological effects via multi-resolution elastic net

A multiplicative effects model is introduced for the identification of the factors that are influential to the performance of highly-trained endurance runners. The model extends the established power-law relationship between performance times and distances by taking into account the effect of the physiological status of the runners, and training effects extracted from GPS records collected over the course of a year. In order to incorporate information on the runners' training into the model, the concept of the training distribution profile is introduced and its ability to capture the characteristics of the training session is discussed. The covariates that are relevant to runner performance as response are identified using a procedure termed multi-resolution elastic net. Multi-resolution elastic net allows the simultaneous identification of scalar covariates and of intervals on the domain of one or more functional covariates that are most influential for the response. The results identify a contiguous group of speed intervals between 5.3 and 5.7 m$\cdot$s$^{-1}$ as influential for the improvement of running performance and extend established relationships between physiological status and runner performance. Another outcome of multi-resolution elastic net is a predictive equation for performance based on the minimization of the mean squared prediction error on a test data set across resolutions.

preprint · 2014 · arXiv

Liquidity commonality does not imply liquidity resilience commonality: A functional characterisation for ultra-high frequency cross-sectional LOB data

We present a large-scale study of commonality in liquidity and resilience across assets in an ultra high-frequency (millisecond-timestamped) Limit Order Book (LOB) dataset from a pan-European electronic equity trading facility. We first show that extant work in quantifying liquidity commonality through the degree of explanatory power of the dominant modes of variation of liquidity (extracted through Principal Component Analysis) fails to account for heavy-tailed features in the data, thus producing potentially misleading results. We employ Independent Component Analysis, which not only decorrelates the liquidity measures in the asset cross-section, but also reduces higher-order statistical dependencies. To measure commonality in liquidity resilience, we utilise a novel characterisation as the time required for return to a threshold liquidity level. This reflects a dimension of liquidity that is not captured by the majority of liquidity measures and has important ramifications for understanding supply and demand pressures for market makers in electronic exchanges, as well as regulators and HFTs. When the metric is mapped out across a range of thresholds, it produces the daily Liquidity Resilience Profile (LRP) for a given asset. This daily summary of liquidity resilience behaviour from the vast LOB dataset is then amenable to a functional data representation. This enables the comparison of liquidity resilience in the asset cross-section via functional linear sub-space decompositions and functional regression. The functional regression results presented here suggest that market factors for liquidity resilience (as extracted through functional principal components analysis) can explain between 10 and 40% of the variation in liquidity resilience at low liquidity thresholds, but are less explanatory at more extreme levels, where individual asset factors take effect.
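
The resilience metric described in this abstract, the time required to return to a threshold liquidity level, can be sketched as follows. This is only one possible reading, under stated assumptions: liquidity is a single series where larger values mean more liquidity, an excursion starts at the first observation below the threshold and ends at the first later observation at or above it, and the profile is the mean duration per threshold. The function name `exceedance_durations` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def exceedance_durations(times, liquidity, threshold):
    """Durations of excursions below `threshold` (assumed definition, see lead-in)."""
    times = np.asarray(times, dtype=float)
    below = np.asarray(liquidity, dtype=float) < threshold
    durations = []
    start = None
    for t, is_below in zip(times, below):
        if is_below and start is None:
            start = t                               # excursion begins
        elif not is_below and start is not None:
            durations.append(float(t - start))      # liquidity has recovered
            start = None
    # An excursion still open at the end of the sample is dropped (censored).
    return durations


# Toy series: timestamps in arbitrary units and a made-up liquidity measure.
times = [0, 1, 2, 3, 4, 5, 6, 7]
liquidity = [10, 4, 3, 9, 8, 2, 7, 10]

durations_at_5 = exceedance_durations(times, liquidity, threshold=5.0)

# Mapping the metric across thresholds gives a simple profile
# (here: mean recovery time per threshold).
profile = {
    thr: float(np.mean(exceedance_durations(times, liquidity, thr)))
    for thr in (5.0, 8.0)
}
print(durations_at_5, profile)
```

Summarising each day by such a profile over a grid of thresholds yields one curve per asset per day, which is the kind of object a functional data analysis would then take as input.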

preprint · 2014 · arXiv

Upside and Downside Risk Exposures of Currency Carry Trades via Tail Dependence

Currency carry trade is the investment strategy that involves selling low interest rate currencies in order to purchase higher interest rate currencies, thus profiting from the interest rate differentials. This is a well-known financial puzzle to explain, since assuming foreign exchange risk is uninhibited and the markets have rational risk-neutral investors, one would not expect profits from such strategies. That is, according to uncovered interest rate parity (UIP), changes in the related exchange rates should offset the potential to profit from such interest rate differentials. However, it has been shown empirically that investors can earn profits on average by borrowing in a country with a lower interest rate, exchanging for foreign currency, and investing in a foreign country with a higher interest rate, whilst allowing for any losses from exchanging back to their domestic currency at maturity. This paper explores the financial risk that trading strategies seeking to exploit a violation of the UIP condition are exposed to with respect to multivariate tail dependence present in both the funding and investment currency baskets. It will outline in what contexts these portfolio risk exposures will benefit accumulated portfolio returns and under what conditions such tail exposures will reduce portfolio returns.

preprint · 2012 · arXiv

Some discussions of D. Fearnhead and D. Prangle's Read Paper "Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation"

This report is a collection of comments on the Read Paper of Fearnhead and Prangle (2011), to appear in the Journal of the Royal Statistical Society Series B, along with a reply from the authors.
