Topic overview

econ.EM

637 works · 1300 researchers · 0 institutions

Papers in this area

24 featured work(s)

preprint · 2022 · arXiv

bootUR: An R Package for Bootstrap Unit Root Tests

Unit root tests form an essential part of any time series analysis. We provide practitioners with a single, unified framework for comprehensive and reliable unit root testing in the R package bootUR. The package's backbone is the popular augmented Dickey-Fuller test paired with a union of rejections principle, which can be performed directly on single time series or multiple (including panel) time series. Accurate inference is ensured through the use of bootstrap methods. The package addresses the needs of both novice users, by providing user-friendly and easy-to-implement functions with sensible default options, as well as expert users, by giving full user-control to adjust the tests to one's desired settings. Our parallelized C++ implementation ensures that all unit root tests are scalable to datasets containing many time series.

preprint · 2022 · arXiv

Protection or Peril of Following the Crowd in a Pandemic-Concurrent Flood Evacuation

The decisions of whether and how to evacuate during a climate disaster are influenced by a wide range of factors, including sociodemographics, emergency messaging, and social influence. Further complexity is introduced when multiple hazards occur simultaneously, such as a flood evacuation taking place amid a viral pandemic that requires physical distancing. Such multi-hazard events can necessitate a nuanced navigation of competing decision-making strategies wherein a desire to follow peers is weighed against contagion risks. To better understand these nuances, we distributed an online survey during a pandemic surge in July 2020 to 600 individuals in three midwestern and three southern states in the United States with high risk of flooding. In this paper, we estimate a random parameter logit model in both preference space and willingness-to-pay space. Our results show that the directionality and magnitude of the influence of peers' choices of whether and how to evacuate vary widely across respondents. Overall, the decision of whether to evacuate is positively impacted by peer behavior, while the decision of how to evacuate is negatively impacted by peers. Furthermore, an increase in flood threat level lessens the magnitude of these impacts. These findings have important implications for the design of tailored emergency messaging strategies. Specifically, emphasizing or deemphasizing the severity of each threat in a multi-hazard scenario may assist in: (1) encouraging a reprioritization of competing risk perceptions and (2) magnifying or neutralizing the impacts of social influence, thereby (3) nudging evacuation decision-making toward a desired outcome.

preprint · 2022 · arXiv

Parametric quantile regression for income data

Univariate normal regression models are statistical tools widely applied in many areas of economics. Nevertheless, income data have asymmetric behavior and are best modeled by non-normal distributions. The modeling of income plays an important role in determining workers' earnings, as well as being an important research topic in labor economics. Thus, the objective of this work is to propose parametric quantile regression models based on two important asymmetric income distributions, namely, Dagum and Singh-Maddala distributions. The proposed quantile models are based on reparameterizations of the original distributions by inserting a quantile parameter. We present the reparameterizations, some properties of the distributions, and the quantile regression models with their inferential aspects. We proceed with Monte Carlo simulation studies, considering the maximum likelihood estimation performance evaluation and an analysis of the empirical distribution of two residuals. The Monte Carlo results show that both models meet the expected outcomes. We apply the proposed quantile regression models to a household income data set provided by the National Institute of Statistics of Chile. We show that both proposed models perform well in terms of model fit. Thus, we conclude that the results favor the use of Singh-Maddala and Dagum quantile regression models for positive asymmetric data, such as income data.

preprint · 2021 · arXiv

Big Data meets Causal Survey Research: Understanding Nonresponse in the Recruitment of a Mixed-mode Online Panel

Survey scientists increasingly face the problem of high-dimensionality in their research as digitization makes it much easier to construct high-dimensional (or "big") data sets through tools such as online surveys and mobile applications. Machine learning methods are able to handle such data, and they have been successfully applied to solve predictive problems. However, in many situations, survey statisticians want to learn about causal relationships to draw conclusions and be able to transfer the findings of one survey to another. Standard machine learning methods provide biased estimates of such relationships. We introduce into survey statistics the double machine learning approach, which gives approximately unbiased estimators of causal parameters, and show how it can be used to analyze survey nonresponse in a high-dimensional panel setting.

preprint · 2022 · arXiv

Dynamic demand for differentiated products with fixed-effects unobserved heterogeneity

This paper studies identification and estimation of a dynamic discrete choice model of demand for differentiated products using consumer-level panel data with few purchase events per consumer (i.e., short panel). Consumers are forward-looking and their preferences incorporate two sources of dynamics: last choice dependence due to habits and switching costs, and duration dependence due to inventory, depreciation, or learning. A key distinguishing feature of the model is that consumer unobserved heterogeneity has a Fixed Effects (FE) structure; that is, its probability distribution conditional on the initial values of endogenous state variables is unrestricted. I apply and extend recent results to establish the identification of all the structural parameters as long as the dataset includes four or more purchase events per household. The parameters can be estimated using a sufficient statistic - conditional maximum likelihood (CML) method. An attractive feature of CML in this model is that the sufficient statistic controls for the forward-looking value of the consumer's decision problem such that the method does not require solving dynamic programming problems or calculating expected present values.

preprint · 2022 · arXiv

Two-stage differences in differences

A recent literature has shown that when adoption of a treatment is staggered and average treatment effects vary across groups and over time, difference-in-differences regression does not identify an easily interpretable measure of the typical effect of the treatment. In this paper, I extend this literature in two ways. First, I provide some simple underlying intuition for why difference-in-differences regression does not identify a group$\times$period average treatment effect. Second, I propose an alternative two-stage estimation framework, motivated by this intuition. In this framework, group and period effects are identified in a first stage from the sample of untreated observations, and average treatment effects are identified in a second stage by comparing treated and untreated outcomes, after removing these group and period effects. The two-stage approach is robust to treatment-effect heterogeneity under staggered adoption, and can be used to identify a host of different average treatment effect measures. It is also simple, intuitive, and easy to implement. I establish the theoretical properties of the two-stage approach and demonstrate its effectiveness and applicability using Monte-Carlo evidence and an example from the literature.
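
The two stages described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the function names `two_stage_did` and `simulate`, the toy panel with one observation per group-period cell, and the choice to summarize the second stage as a simple average over treated observations. Standard errors, which would need to account for the first-stage estimation, are not computed.

```python
import numpy as np


def two_stage_did(y, group, period, treated):
    """Two-stage DiD sketch: average effect over treated observations."""
    y = np.asarray(y, dtype=float)
    group = np.asarray(group)
    period = np.asarray(period)
    treated = np.asarray(treated, dtype=bool)

    groups = np.unique(group)
    periods = np.unique(period)

    # One dummy per group and one per period; the first period is dropped
    # so that the columns are not collinear.
    G = (group[:, None] == groups[None, :]).astype(float)
    P = (period[:, None] == periods[None, 1:]).astype(float)
    X = np.hstack([G, P])

    # Stage 1: fit group and period effects on untreated observations only.
    coef, *_ = np.linalg.lstsq(X[~treated], y[~treated], rcond=None)

    # Stage 2: remove the fitted effects from every observation, then average
    # the adjusted outcomes over treated observations (the same number a
    # no-intercept regression of the adjusted outcome on the treatment
    # dummy would return).
    y_adj = y - X @ coef
    return y_adj[treated].mean()


def simulate(noise_sd, seed=0):
    """Hypothetical staggered-adoption panel with heterogeneous effects."""
    rng = np.random.default_rng(seed)
    adoption = {0: 2, 1: 4, 2: None, 3: None}  # None = never treated
    rows = []
    for g, start in adoption.items():
        for t in range(6):
            d = start is not None and t >= start
            # Effect varies by group and by time since adoption.
            effect = (1.0 + g + 0.5 * (t - start)) if d else 0.0
            y = 2.0 * g + 0.3 * t + effect + rng.normal(0.0, noise_sd)
            rows.append((y, g, t, d, effect))
    y, g, t, d, eff = map(np.array, zip(*rows))
    return y, g, t, d.astype(bool), eff


y, g, t, d, eff = simulate(noise_sd=0.1)
print("two-stage estimate:", two_stage_did(y, g, t, d))
print("true average effect on treated cells:", eff[d].mean())
```

The first stage is only identified when every group and every period appears among the untreated observations; the never-treated groups in the toy panel guarantee this.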

preprint · 2026 · arXiv

Causal EpiNets: Precision-corrected Bounds on Individual Treatment Effects using Epistemic Neural Networks

Individual treatment effects are not point-identified from data. The Probability of Necessity and Sufficiency (PNS) circumvents this limitation by characterizing individual-level causality through intersection bounds derived from combined experimental and observational data. In finite samples, however, standard plug-in estimators systematically fail: they violate structural probability constraints and suffer from extremum bias induced by max-min operators, yielding spuriously narrow intervals. We propose a neural framework for finite-sample PNS estimation that resolves both pathologies. We introduce an anchored neural architecture that guarantees structural constraint satisfaction by construction. To correct extremum bias, we employ precision-corrected intersection-bound inference, leveraging Epistemic Neural Networks for scalable, high-dimensional uncertainty quantification. Empirical evaluations confirm that this approach maintains nominal coverage and exact constraint validity in high-dimensional regimes where standard estimators systematically undercover.

preprint · 2026 · arXiv

Covariate Balancing and Riesz Regression Should Be Guided by the Neyman Orthogonal Score in Debiased Machine Learning

This position paper argues that, in debiased machine learning, balancing functions should be derived from the Neyman orthogonal score, not chosen only as functions of covariates. Covariate balancing is effective when the regression error entering the score can be represented by functions of covariates alone, and it is the natural finite-dimensional approximation for targets such as ATT counterfactual means. For ATE estimation under treatment effect heterogeneity, however, the score error generally contains treatment-specific components because the outcome regression is a function of the full regressor $X=(D,Z)$. In that case, balancing common functions of $Z$ can leave the treatment-specific component unbalanced. We therefore advocate regressor balancing, implemented by Riesz regression with basis functions of $X$, as the general balancing principle for DML. The position is not that covariate balancing is invalid, but that covariate balancing should be understood as the special case that is appropriate when the score-relevant regression error is a function of covariates alone.

preprint · 2023 · arXiv

Doubly-Robust Inference for Conditional Average Treatment Effects with High-Dimensional Controls

Plausible identification of conditional average treatment effects (CATEs) may rely on controlling for a large number of variables to account for confounding factors. In these high-dimensional settings, estimation of the CATE requires estimating first-stage models whose consistency relies on correctly specifying their parametric forms. While doubly-robust estimators of the CATE exist, inference procedures based on the second stage CATE estimator are not doubly-robust. Using the popular augmented inverse propensity weighting signal, we propose an estimator for the CATE whose resulting Wald-type confidence intervals are doubly-robust. We assume a logistic model for the propensity score and a linear model for the outcome regression, and estimate the parameters of these models using an $\ell_1$ (Lasso) penalty to address the high-dimensional covariates. Our proposed estimator remains consistent at the nonparametric rate and our proposed pointwise and uniform confidence intervals remain asymptotically valid even if one of the logistic propensity score or linear outcome regression models is misspecified. These results are obtained under similar conditions to existing analyses in the high-dimensional and nonparametric literatures.

preprint · 2021 · arXiv

Deep Structural Estimation: With an Application to Option Pricing

We propose a novel structural estimation framework in which we train a surrogate of an economic model with deep neural networks. Our methodology alleviates the curse of dimensionality and speeds up the evaluation and parameter estimation by orders of magnitude, which significantly enhances one's ability to conduct analyses that require frequent parameter re-estimation. As an empirical application, we compare two popular option pricing models (the Heston and the Bates model with double-exponential jumps) against a non-parametric random forest model. We document that: a) the Bates model produces better out-of-sample pricing on average, but both structural models fail to outperform random forest for large areas of the volatility surface; b) random forest is more competitive at short horizons (e.g., 1-day), for short-dated options (with less than 7 days to maturity), and on days with poor liquidity; c) both structural models outperform random forest in out-of-sample delta hedging; d) the Heston model's relative performance has deteriorated significantly after the 2008 financial crisis.

preprint · 2023 · arXiv

A Sieve-SMM Estimator for Dynamic Models

This paper proposes a Sieve Simulated Method of Moments (Sieve-SMM) estimator for the parameters and the distribution of the shocks in nonlinear dynamic models where the likelihood and the moments are not tractable. An important concern with SMM, which matches sample with simulated moments, is that a parametric distribution is required. However, economic quantities that depend on this distribution, such as welfare and asset prices, can be sensitive to misspecification. The Sieve-SMM estimator addresses this issue by flexibly approximating the distribution of the shocks with a Gaussian and tails mixture sieve. The asymptotic framework provides consistency, rate of convergence and asymptotic normality results, extending existing results to a new framework with more general dynamics and latent variables. An application to asset pricing in a production economy shows a large decline in the estimates of relative risk-aversion, highlighting the empirical relevance of misspecification bias.

preprint · 2014 · arXiv

Efficient Modeling and Forecasting of the Electricity Spot Price

The increasing importance of renewable energy, especially solar and wind power, has led to new forces in the formation of electricity prices. Hence, this paper introduces an econometric model for the hourly time series of electricity prices of the European Power Exchange (EPEX) which incorporates specific features like renewable energy. The model consists of several sophisticated and established approaches and can be regarded as a periodic VAR-TARCH with wind power, solar power, and load as influences on the time series. It is able to map the distinct and well-known features of electricity prices in Germany. An efficient iteratively reweighted lasso approach is used for the estimation. Moreover, it is shown that several existing models are outperformed by the procedure developed in this paper.

preprint · 2026 · arXiv

Quantifying the Risk-Return Tradeoff in Forecasting

Average forecast accuracy is not the same as forecast reliability. I treat forecast loss differentials relative to a benchmark as a return series. I then evaluate these returns using risk-adjusted performance measures from finance, including the Sharpe ratio, Sortino ratio, Omega ratio, and drawdown-based metrics. I also introduce the Edge Ratio capturing a model's propensity to deliver uniquely informative predictions relative to the forecasting frontier. I apply this framework to U.S. macroeconomic forecasting, comparing econometric benchmarks, machine learning models, a foundation model (TabPFN), and the Survey of Professional Forecasters. While it is often feasible to beat professional forecasters in terms of average accuracy, it is much harder to beat them on a risk-adjusted basis. They rarely exhibit catastrophic failures and often achieve high Edge Ratios, plausibly reflecting the value of contextual judgment. Nonetheless, selected machine learning methods deliver attractive risk profiles for specific targets. The framework naturally extends to meta-analyses across targets, horizons, and samples, illustrated with a density forecast evaluation and the M4 competition.
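
To make the idea of treating loss differentials as returns concrete, here is a minimal sketch. The definitions used (return = benchmark loss minus model loss, a zero threshold for the Sortino and Omega ratios, drawdown measured on the cumulative sum of returns) are assumptions chosen for illustration and may differ from the paper's exact conventions; the function name `risk_adjusted_metrics` is hypothetical, and the Edge Ratio is not covered.

```python
import numpy as np


def risk_adjusted_metrics(loss_model, loss_benchmark):
    """Risk-adjusted summaries of a forecast's loss differential series."""
    # Positive "return" means the model beat the benchmark in that period.
    r = np.asarray(loss_benchmark, dtype=float) - np.asarray(loss_model, dtype=float)
    mean = r.mean()

    # Sharpe ratio: mean return per unit of total volatility.
    sharpe = mean / r.std(ddof=1)

    # Sortino ratio: mean return per unit of downside deviation (threshold 0).
    downside = np.sqrt(np.mean(np.minimum(r, 0.0) ** 2))
    sortino = mean / downside if downside > 0 else np.inf

    # Omega ratio: total gains over total losses (threshold 0).
    gains = np.maximum(r, 0.0).sum()
    losses = np.maximum(-r, 0.0).sum()
    omega = gains / losses if losses > 0 else np.inf

    # Maximum drawdown of cumulative returns, with the starting level 0
    # counted as a possible peak.
    cum = np.cumsum(r)
    peak = np.maximum.accumulate(np.concatenate([[0.0], cum]))[1:]
    max_drawdown = (peak - cum).max()

    return {
        "mean": mean,
        "sharpe": sharpe,
        "sortino": sortino,
        "omega": omega,
        "max_drawdown": max_drawdown,
    }


# Toy example: the model beats the benchmark in three of four periods.
metrics = risk_adjusted_metrics(
    loss_model=[0.5, 1.5, 0.5, 0.5],
    loss_benchmark=[1.0, 1.0, 1.0, 1.0],
)
print(metrics)
```

Two models with the same mean differential can differ sharply on these measures if one of them concentrates its losses in a few periods.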

preprint · 2026 · arXiv

LGB+: A Macroeconomic Forecasting Road Test

Needless to say, linear dynamics are pervasive in economic time series, particularly autoregressive ones. While gradient boosting with trees excels at capturing nonlinearities, it is inefficient in small samples when much of the predictive content is linear, expending splits to approximate relationships better captured by simple linear terms. This paper proposes LGB+, a boosting procedure operating on a more inclusive set of basis functions. The idea comes in two flavors. LGB+ evaluates a tree and a linear candidate at each step against out-of-bag data; only the winner advances. The simpler variant, LGB^A+, alternates on a fixed schedule: a block of tree updates, then a greedy linear correction, repeat. Both designs avoid ex ante commitments to any particular functional form or predictor selection. Because the prediction is the sum of a linear and a tree component, forecasts decompose natively into linear and nonlinear contributions, and so do permutation-based variable importance and historical proximity weights. In a quarterly U.S. macroeconomic forecasting exercise, LGB+ delivers strong gains for targets with pronounced autoregressive dynamics or mixed linear-nonlinear signals. Variables dominating the linear channel are those operating through autoregressive persistence or near-accounting relationships to the target (e.g., initial claims for unemployment and building permits for housing starts).

preprint · 2026 · arXiv

The Payment Heterogeneity Index: An Integrated Unsupervised Framework for High-Volume Procurement Oversight and Decision Support

Public procurement is vulnerable to error, fraud, and corruption, particularly as high transaction volumes overwhelm oversight. While research often focuses on tender-stage anomalies, post-award payment monitoring remains underexplored. Since labelled datasets are rare and methods like Benford's Law face restrictive assumptions, there is a need for interpretable, unsupervised frameworks for high-volume procurement oversight and decision support. This paper introduces the Structural Heterogeneity Index (SHI), a composite statistic for one-dimensional samples, and its payment-specific instantiation, the Payment Heterogeneity Index (PHI), characterising payment structure and latent regimes. It incorporates Gaussian Mixture Model (GMM) parameters alongside non-parametric statistics, integrating four interpretable components: modality, asymmetry, tail behaviour, and structural dispersion. Uniquely, the tail-behaviour component captures both distributional heaviness and extreme-value concentration, while structural-dispersion combines the variability, prevalence, and separation of latent payment regimes. Applied to UK municipal procurement data, PHI identifies a financially significant cohort (0.6% of suppliers; 10.1% of high-volume vendors) with structurally distinct payment patterns. Statistical testing further supports these differences, and targeted human verification confirms the plausibility of prioritised cases. Comparative analysis shows PHI reveals regime separation obscured by the Coefficient of Variation ($\rho = 0.310$). PHI provides a transparent, decomposable, and computationally lightweight framework for procurement integrity oversight and targeted audit prioritisation.

preprint · 2026 · arXiv

Sufficient conditions for a Heuristic Rating Estimation Method application

A series of papers has introduced the Heuristic Rating Estimation method, which evaluates a set of alternatives based on pairwise comparisons and the weights of reference alternatives. We formulate the conditions under which the HRE method can be applied correctly. The research considers both arithmetic and geometric algorithms for complete and incomplete pairwise comparison methods. The illustrative examples show that the estimations of inconsistency in the arithmetic variant are optimal.

preprint · 2022 · arXiv

Detecting Structural Breaks in Foreign Exchange Markets by using the group LASSO technique

This article proposes an estimation method to detect breakpoints in linear time series models whose parameters jump only rarely. Its basic idea draws on the group LASSO (group least absolute shrinkage and selection operator). In practice, the method provides estimates of the time-varying parameters of such models. An example shows that our method can detect the date and magnitude of each structural break.

preprint · 2025 · arXiv

Multivariate quantile regression

This paper introduces a new framework for multivariate quantile regression based on the multivariate distribution function, termed multivariate quantile regression (MQR). In contrast to existing approaches (such as directional quantiles, vector quantile regression, or copula-based methods), MQR defines quantiles through the conditional probability structure of the joint conditional distribution function. The method constructs multivariate quantile curves using sequential univariate quantile regressions derived from conditioning mechanisms, allowing for an intuitive interpretation and flexible estimation of marginal effects. The paper develops theoretical foundations of MQR, including asymptotic properties of the estimators. Through simulation exercises, the estimator demonstrates robust finite sample performance across different dependence structures. As an empirical application, the MQR framework is applied to the analysis of exchange rate pass-through in Argentina from 2004 to 2024.

preprint · 2022 · arXiv

Optimal Best Arm Identification in Two-Armed Bandits with a Fixed Budget under a Small Gap

We consider fixed-budget best-arm identification in two-armed Gaussian bandit problems. One of the longstanding open questions is the existence of an optimal strategy under which the probability of misidentification matches a lower bound. We show that a strategy following the Neyman allocation rule (Neyman, 1934) is asymptotically optimal when the gap between the expected rewards is small. First, we review a lower bound derived by Kaufmann et al. (2016). Then, we propose the "Neyman Allocation (NA)-Augmented Inverse Probability weighting (AIPW)" strategy, which consists of the sampling rule using the Neyman allocation with an estimated standard deviation and the recommendation rule using an AIPW estimator. Our proposed strategy is optimal because the upper bound matches the lower bound when the budget goes to infinity and the gap goes to zero.
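
The sampling and recommendation rules named in this abstract can be sketched roughly as follows. This is an illustrative simulation under stated assumptions, not the authors' algorithm: the function names `neyman_allocation` and `na_aipw`, the uniform sampling used until each arm has two observations, the clipping of the sampling probability to [0.1, 0.9], and the floor on the estimated standard deviation are all choices made here for numerical stability.

```python
import numpy as np


def neyman_allocation(sd_a, sd_b):
    """Probability of sampling arm A: proportional to its standard deviation."""
    return sd_a / (sd_a + sd_b)


def na_aipw(means, sds, budget, seed=0):
    """Simulate a two-armed Gaussian bandit; return (recommended arm, AIPW estimates)."""
    rng = np.random.default_rng(seed)
    obs = [[], []]          # rewards observed so far, per arm
    aipw_sum = np.zeros(2)  # running sum of AIPW scores, per arm

    for _ in range(budget):
        if min(len(obs[0]), len(obs[1])) < 2:
            # Not enough data to estimate both variances: sample uniformly.
            p = 0.5
        else:
            sd_hat = [max(np.std(o, ddof=1), 1e-3) for o in obs]
            # Clipping is an assumption of this sketch, not part of the rule.
            p = float(np.clip(neyman_allocation(sd_hat[0], sd_hat[1]), 0.1, 0.9))
        probs = np.array([p, 1.0 - p])

        # Plug-in mean estimates built from past observations only.
        mu_hat = np.array([np.mean(o) if o else 0.0 for o in obs])

        arm = 0 if rng.random() < p else 1
        reward = rng.normal(means[arm], sds[arm])

        # AIPW score: inverse-probability-weighted residual plus plug-in mean.
        pulled = np.zeros(2)
        pulled[arm] = 1.0
        aipw_sum += pulled * (reward - mu_hat) / probs + mu_hat

        obs[arm].append(reward)

    estimates = aipw_sum / budget
    return int(np.argmax(estimates)), estimates


best, est = na_aipw(means=(1.0, 0.0), sds=(1.0, 2.0), budget=2000)
print("recommended arm:", best, "AIPW estimates:", est)
```

Because the sampling probability and the plug-in means depend only on past data, each AIPW score is unbiased for its arm's mean, which is what allows the recommendation to be based on adaptively collected samples.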

preprint · 2025 · arXiv

Extrapolating LATE with Weak IVs

To evaluate the effectiveness of a counterfactual policy, it is often necessary to extrapolate treatment effects on compliers to broader populations. This extrapolation relies on exogenous variation in instruments, which is often weak in practice. This limited variation leads to invalid confidence intervals that are typically too short and cannot be accurately detected by classical methods. For instance, the F-test may falsely conclude that the instruments are strong. Consequently, I develop inference results that are valid even with limited variation in the instruments. These results lead to asymptotically valid confidence sets for various linear functionals of marginal treatment effects, including LATE, ATE, ATT, and policy-relevant treatment effects, regardless of identification strength. This is the first paper to provide weak instrument robust inference results for this class of parameters. Finally, I illustrate my results using data from Agan, Doleac, and Harvey (2023) to analyze counterfactual policies of changing prosecutors' leniency and their effects on reducing recidivism.

preprint · 2026 · arXiv

Optimal Contextual Pricing under Agnostic Non-Lipschitz Demand

We study contextual dynamic pricing with linear valuations and bounded-support agnostic noise, whose induced demand curve may be non-Lipschitz with arbitrary jumps and atoms. Such discontinuities break the cross-context interpolation arguments used by smooth-demand pricing algorithms, while the best previous method achieved only $\tilde O(T^{3/4})$ regret. We propose Conservative-Markdown Redirect-UCB Pricing, a polynomial-time algorithm that combines randomized parameter estimation, conservative residual-grid probing, and confidence-based one-step redirection. Our algorithm achieves $\tilde O(T^{2/3})$ optimal regret, matching the known lower bounds of Kleinberg and Leighton (2003) up to logarithmic factors and improving over the previous upper bound of Xu and Wang (2022). Under stochastic well-conditioned contexts, this closes the long-existing open regret gap in linear-valuation contextual pricing under agnostic non-Lipschitz noise distribution.

preprint · 2022 · arXiv

Kernel Estimation of Spot Volatility with Microstructure Noise Using Pre-Averaging

We first revisit the problem of estimating the spot volatility of an Itô semimartingale using a kernel estimator. We prove a Central Limit Theorem with optimal convergence rate for a general two-sided kernel. Next, we introduce a new pre-averaging/kernel estimator for spot volatility to handle the microstructure noise of ultra high-frequency observations. We prove a Central Limit Theorem for the estimation error with an optimal rate and study the optimal selection of the bandwidth and kernel functions. We show that the pre-averaging/kernel estimator's asymptotic variance is minimal for exponential kernels, hence justifying the need to work with kernels of unbounded support as proposed in this work. We also develop a feasible implementation of the proposed estimators with optimal bandwidth. Monte Carlo experiments confirm the superior performance of the devised method.

preprint · 2026 · arXiv

Hall-Like Transversal Stress and Sandpile Criticality on Real Production Networks

This paper develops a Hall-Sandpile model of economic instability that combines a Hall-like transversal stress mechanism with sandpile threshold dynamics on a real production-network substrate. In analogy with the physical Hall effect, where exposed flows under an external field generate stress in a transversal direction, we model economic shocks as fields that act on flow-intensive, low-redundancy, low-capacity nodes and produce systemic stress through a multiplicative conversion function. The accumulated stress drives a discrete toppling rule and an avalanche dynamics whose effective activation threshold declines with transversal exposure. The model is calibrated on annual World Input-Output Database (WIOD) production networks for 2000-2014 and simulated on the 2014 substrate (2,283 country-sector nodes) under three alternative propagation normalisations to avoid mechanical near-criticality from row-stochastic operators. Controlled Monte Carlo experiments over external field intensity and redundancy stress generate four ordered regimes: stable absorption, latent fragility, critical transition, and avalanche regime. Mean avalanche size and the probabilities of finite-size systemic events $\Pr(S \geq 5)$, $\Pr(S \geq 10)$ and $\Pr(S \geq 20)$ rise jointly with field intensity and redundancy stress. Tail diagnostics show regime-dependent thickening of the avalanche distribution, but the estimated tail indices remain too high to interpret as evidence of universal power-law criticality. The contribution is therefore a finite-size, real-network description of how transversal stress activates structural fragility, not a claim of self-organised criticality in the global economy.

preprint · 2026 · arXiv

Regret Equals Covariance: A Closed-Form Characterization for Stochastic Optimization

Regret is the cost of uncertainty in algorithmic decision-making. Quantifying regret typically requires computationally expensive simulation via Sample Average Approximation (SAA), with complexity $\mathcal{O}(Bn^{2}d^{3})$ in the number of scenarios $B$, variables $n$, and constraints $d$. This paper proves that expected regret in any stochastic optimization problem admits the exact decomposition $\mathrm{Regret}(c) = \mathrm{Cov}(c,\,\pi^{*}(c)) + R(c)$, where $c$ is the vector of uncertain parameters, $\pi^{*}(c)$ is the optimal decision, and $R(c)$ is a residual whose magnitude we bound explicitly under Lipschitz, smooth, and strongly convex conditions. For linear programs and unconstrained quadratic programs, including the classical Markowitz portfolio problem, we prove $R(c)=0$ exactly, so that $\mathrm{Regret}(c) = \mathrm{Cov}(c,\,\pi^{*}(c))$ holds without approximation. When historical cost-decision pairs $\{(c_i, \pi^{*}(c_i))\}$ are available, the covariance can be estimated in $\mathcal{O}(nd^{2})$ time, which is orders of magnitude faster than SAA. The estimation is performed by a single pass through the data. We derive concentration bounds, a central limit theorem, and an asymptotically unbiased residual estimator, and we validate all results on synthetic LP, QP, and integer programming instances and on a rolling-window portfolio experiment using ten years of CRSP equity data.

People in this topic

12 visible researcher(s)