Source author record

Yuta Koike

Yuta Koike appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

math.ST, Statistics Theory, math.PR, econ.EM, Machine Learning, Methodology, q-fin.ST

Catalog footprint

What is connected

17 works

7 topics

4 close collaborators

preprint, 2026, arXiv

A note on connections between the Föllmer process and the denoising diffusion probabilistic model

The Föllmer process is a Brownian motion conditioned to have a pre-specified distribution at time 1. This process can be interpreted as an "augmented" time-compressed version of the reverse stochastic differential equation (SDE) for the denoising diffusion probabilistic model (DDPM). While this fact has been indirectly used to analyze DDPM sampling errors via discretization of the reverse SDE, connections between direct discretization of the Föllmer process and the DDPM sampler have not yet been fully explored. This note aims to clarify this point while surveying relevant results from existing work. We show that discretized Föllmer processes give natural hyper-parameter settings of the DDPM sampler. Moreover, this allows us to systematically recover state-of-the-art results on DDPM sampling error bounds with slight improvements.
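For orientation, one common way to write the Föllmer process is given below. This is a sketch under stated assumptions, not notation taken from the note: the target law $\mu$ on $\mathbb{R}^d$ is assumed to have density $f$ with respect to the standard Gaussian measure $\gamma_d$, $B$ is a standard Brownian motion, and $P_t$ denotes the heat semigroup.

$$
dX_t = \nabla \log P_{1-t} f(X_t)\,dt + dB_t, \qquad X_0 = 0, \qquad P_t f(x) = \mathbb{E}\big[f(x+\sqrt{t}\,Z)\big], \quad Z \sim N(0, I_d).
$$

Under suitable regularity conditions $X_1$ has law $\mu$, which is the sense in which the process is a Brownian motion conditioned to have a pre-specified distribution at time 1.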

preprint, 2026, arXiv

On lead-lag estimation of non-synchronously observed point processes

This paper introduces a new theoretical framework for analyzing lead-lag relationships between point processes, with a special focus on applications to high-frequency financial data. In particular, we are interested in lead-lag relationships between two sequences of order arrival timestamps. The seminal work of Dobrev and Schaumburg proposed model-free measures of cross-market trading activity based on cross-counts of timestamps. While their method is known to yield reliable results, it faces limitations because its original formulation inherently relies on discrete-time observations, an issue we address in this study. Specifically, we formulate the problem of estimating lead-lag relationships in two point processes as that of estimating the shape of the cross-pair correlation function (CPCF) of a bivariate stationary point process, a quantity well-studied in the neuroscience and spatial statistics literature. Within this framework, the prevailing lead-lag time is defined as the location of the CPCF's sharpest peak. Under this interpretation, the peak location in Dobrev and Schaumburg's cross-market activity measure can be viewed as an estimator of the lead-lag time in the aforementioned sense. We further propose an alternative lead-lag time estimator based on kernel density estimation and show that it possesses desirable theoretical properties and delivers superior numerical performance. Empirical evidence from high-frequency financial data demonstrates the effectiveness of our proposed method.
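The idea of locating the lead-lag time at the sharpest peak of a cross-correlation-type curve can be illustrated with a minimal sketch. The code below is not the estimator proposed in the paper: it simply smooths all pairwise timestamp differences with a Gaussian kernel and returns the grid point with the largest value. The function name `lead_lag_peak`, the grid and the bandwidth are hypothetical choices made for illustration.

```python
import numpy as np


def lead_lag_peak(times_a, times_b, grid, bandwidth):
    """Return (peak_location, curve) for a kernel-smoothed count of the
    pairwise differences times_b[j] - times_a[i], evaluated on `grid`.

    Illustrative only: no edge correction or intensity normalisation,
    and memory use grows with len(times_a) * len(times_b).
    """
    a = np.asarray(times_a, dtype=float)
    b = np.asarray(times_b, dtype=float)
    grid = np.asarray(grid, dtype=float)

    # All cross differences b_j - a_i.
    diffs = (b[None, :] - a[:, None]).ravel()

    # Differences far outside the grid contribute essentially nothing.
    lo = grid.min() - 4.0 * bandwidth
    hi = grid.max() + 4.0 * bandwidth
    diffs = diffs[(diffs >= lo) & (diffs <= hi)]

    # Unnormalised Gaussian kernel sum at every grid point.
    z = (grid[:, None] - diffs[None, :]) / bandwidth
    curve = np.exp(-0.5 * z**2).sum(axis=1)
    return grid[np.argmax(curve)], curve


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = np.sort(rng.uniform(0.0, 100.0, size=200))
    b = a + 0.3 + rng.normal(0.0, 0.01, size=a.size)  # b lags a by about 0.3
    peak, _ = lead_lag_peak(a, b, np.linspace(-1.0, 1.0, 201), 0.05)
    print(peak)
```

A positive peak location here means that events in the second series tend to follow events in the first. The bandwidth controls how sharply the peak is resolved and would need to be tuned for real data.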

preprint, 2026, arXiv

Wasserstein bounds for denoising diffusion probabilistic models via the Föllmer process

This paper studies sampling error bounds for denoising diffusion probabilistic models (DDPMs) in the 2-Wasserstein distance. Our contributions are threefold. (i) Under general Lipschitz-type conditions on the score function and for a broad class of variance schedules, including the cosine schedule, we establish sharp upper bounds that are optimal in both the dimension and the number of steps, and recover several sharp error bounds previously obtained in the literature. (ii) We prove that the same Lipschitz-type conditions, which encompass those commonly imposed on the (learned) score, imply a logarithmic Sobolev inequality and hence a quadratic transportation cost inequality for the DDPM. As a consequence, in settings covered by existing work, an optimal Wasserstein bound, up to a logarithmic factor, follows from the recently obtained sharp error bound in the Kullback-Leibler divergence under geometric-type variance schedules. (iii) We show that for general log-concave target distributions, the optimal Wasserstein error bound remains attainable even without a quadratic transportation cost inequality for the target. Our analysis is based on viewing the DDPM sampler as a discretization of the Föllmer process rather than the conventional reverse Ornstein-Uhlenbeck process.

preprint, 2022, arXiv

From $p$-Wasserstein Bounds to Moderate Deviations

We use a new method via $p$-Wasserstein bounds to prove Cramér-type moderate deviations in (multivariate) normal approximations. In the classical setting that $W$ is a standardized sum of $n$ independent and identically distributed (i.i.d.) random variables with sub-exponential tails, our method recovers the optimal range of $0\leq x=o(n^{1/6})$ and the near optimal error rate $O(1)(1+x)(\log n+x^2)/\sqrt{n}$ for $P(W>x)/(1-\Phi(x))\to 1$, where $\Phi$ is the standard normal distribution function. Our method also works for dependent random variables (vectors) and we give applications to the combinatorial central limit theorem, Wiener chaos, homogeneous sums and local dependence. The key step of our method is to show that the $p$-Wasserstein distance between the distribution of the random variable (vector) of interest and a normal distribution grows like $O(p^{\alpha}\Delta)$, $1\leq p\leq p_0$, for some constants $\alpha, \Delta$ and $p_0$. In the above i.i.d. setting, $\alpha=1, \Delta=1/\sqrt{n}, p_0=n^{1/3}$. For this purpose, we obtain general $p$-Wasserstein bounds in (multivariate) normal approximations using Stein's method.

preprint, 2022, arXiv

High-dimensional Data Bootstrap

This article reviews recent progress in high-dimensional bootstrap. We first review high-dimensional central limit theorems for distributions of sample mean vectors over the rectangles, bootstrap consistency results in high dimensions, and key techniques used to establish those results. We then review selected applications of high-dimensional bootstrap: construction of simultaneous confidence sets for high-dimensional vector parameters, multiple hypothesis testing via stepdown, post-selection inference, intersection bounds for partially identified parameters, and inference on best policies in policy evaluation. Finally, we also comment on a couple of future research directions.

preprint, 2022, arXiv

Improved Central Limit Theorem and bootstrap approximations in high dimensions

This paper deals with the Gaussian and bootstrap approximations to the distribution of the max statistic in high dimensions. This statistic takes the form of the maximum over components of the sum of independent random vectors and its distribution plays a key role in many high-dimensional econometric problems. Using a novel iterative randomized Lindeberg method, the paper derives new bounds for the distributional approximation errors. These new bounds substantially improve upon existing ones and simultaneously allow for a larger class of bootstrap methods.
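To make the object concrete, the sketch below computes the max statistic described above and approximates one of its quantiles with a Gaussian multiplier bootstrap. This is one standard bootstrap scheme, used here as an assumption for illustration; the abstract does not say which bootstrap methods the paper covers, and the function names are hypothetical.

```python
import numpy as np


def max_statistic(x):
    """Maximum over components of the normalised sum n^{-1/2} * sum_i x[i]."""
    n = x.shape[0]
    return (x.sum(axis=0) / np.sqrt(n)).max()


def multiplier_bootstrap_quantile(x, level=0.95, n_boot=1000, seed=0):
    """Approximate the `level` quantile of the max statistic by a Gaussian
    multiplier bootstrap: each draw reweights the centred rows of x with
    independent standard normal multipliers."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    centered = x - x.mean(axis=0)             # remove the sample mean
    e = rng.standard_normal((n_boot, n))      # one multiplier vector per draw
    draws = (e @ centered / np.sqrt(n)).max(axis=1)
    return np.quantile(draws, level)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((500, 50))        # n = 500 observations, d = 50
    print(max_statistic(x), multiplier_bootstrap_quantile(x))
```

Comparing `max_statistic(x)` with the bootstrap quantile gives a simple one-sided test that all component means are at most zero, which is the kind of simultaneous inference these approximations are meant to justify.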

preprint, 2022, arXiv

Notes on the dimension dependence in high-dimensional central limit theorems for hyperrectangles

Let $X_1,\dots,X_n$ be independent centered random vectors in $\mathbb{R}^d$. This paper shows that, even when $d$ may grow with $n$, the probability $P(n^{-1/2}\sum_{i=1}^nX_i\in A)$ can be approximated by its Gaussian analog uniformly in hyperrectangles $A$ in $\mathbb{R}^d$ as $n\to\infty$ under appropriate moment assumptions, as long as $(\log d)^5/n\to0$. This improves a result of Chernozhukov, Chetverikov & Kato [Ann. Probab. 45 (2017) 2309-2353] in terms of the dimension growth condition. When $n^{-1/2}\sum_{i=1}^nX_i$ has a common factor across the components, this condition can be further improved to $(\log d)^3/n\to0$. The corresponding bootstrap approximation results are also developed. These results serve as a theoretical foundation of simultaneous inference for high-dimensional models.

preprint, 2021, arXiv

Large-dimensional Central Limit Theorem with Fourth-moment Error Bounds on Convex Sets and Balls

We prove the large-dimensional Gaussian approximation of a sum of $n$ independent random vectors in $\mathbb{R}^d$ together with fourth-moment error bounds on convex sets and Euclidean balls. We show that compared with classical third-moment bounds, our bounds have near-optimal dependence on $n$ and can achieve improved dependence on the dimension $d$. For centered balls, we obtain an additional error bound that has a sub-optimal dependence on $n$, but recovers the known result of the validity of the Gaussian approximation if and only if $d=o(n)$. We discuss an application to the bootstrap. We prove our main results using Stein's method.

preprint, 2020, arXiv

High-dimensional Central Limit Theorems by Stein's Method

We obtain explicit error bounds for the $d$-dimensional normal approximation on hyperrectangles for a random vector that has a Stein kernel, or admits an exchangeable pair coupling, or is a non-linear statistic of independent random variables or a sum of $n$ locally dependent random vectors. We assume the approximating normal distribution has a non-singular covariance matrix. The error bounds vanish even when the dimension $d$ is much larger than the sample size $n$. We prove our main results using the approach of Götze (1991) in Stein's method, together with modifications of an estimate of Anderson, Hall and Titterington (1998) and a smoothing inequality of Bhattacharya and Rao (1976). For sums of $n$ independent and identically distributed isotropic random vectors having a log-concave density, we obtain an error bound that is optimal up to a $\log n$ factor. We also discuss an application to multiple Wiener-Itô integrals.

preprint, 2020, arXiv

Multi-scale analysis of lead-lag relationships in high-frequency financial markets

We propose a novel estimation procedure for scale-by-scale lead-lag relationships of financial assets observed at high-frequency in a non-synchronous manner. The proposed estimation procedure does not require any interpolation processing of original datasets and is applicable to those with highest time resolution available. Consistency of the proposed estimators is shown under the continuous-time framework that has been developed in our previous work Hayashi and Koike (2018). An empirical application to a quote dataset of the NASDAQ-100 assets identifies two types of lead-lag relationships at different time scales.

preprint, 2020, arXiv

New error bounds in multivariate normal approximations via exchangeable pairs with applications to Wishart matrices and fourth moment theorems

We extend Stein's celebrated Wasserstein bound for normal approximation via exchangeable pairs to the multi-dimensional setting. As an intermediate step, we exploit the symmetry of exchangeable pairs to obtain an error bound for smooth test functions. We also obtain a continuous version of the multi-dimensional Wasserstein bound in terms of fourth moments. We apply the main results to multivariate normal approximations to Wishart matrices of size $n$ and degree $d$, where we obtain the optimal convergence rate $\sqrt{n^3/d}$ under only moment assumptions, and to quadratic forms and Poisson functionals, where we strengthen a few of the fourth moment bounds in the literature on the Wasserstein distance.

preprint, 2019, arXiv

De-biased graphical Lasso for high-frequency data

This paper develops a new statistical inference theory for the precision matrix of high-frequency data in a high-dimensional setting. The focus is not only on point estimation but also on interval estimation and hypothesis testing for entries of the precision matrix. To accomplish this purpose, we establish an abstract asymptotic theory for the weighted graphical Lasso and its de-biased version without specifying the form of the initial covariance estimator. We also extend the scope of the theory to the case that a known factor structure is present in the data. The developed theory is applied to the concrete situation where we can use the realized covariance matrix as the initial covariance estimator, and we obtain a feasible asymptotic distribution theory to construct (simultaneous) confidence intervals and (multiple) testing procedures for entries of the precision matrix.

preprint, 2016, arXiv

Quadratic covariation estimation of an irregularly observed semimartingale with jumps and noise

This paper presents a central limit theorem for a pre-averaged version of the realized covariance estimator for the quadratic covariation of a discretely observed semimartingale with noise. The semimartingale possibly has jumps, while the observation times show irregularity, non-synchronicity, and some dependence on the observed process. It is shown that the observation times' effect on the asymptotic distribution of the estimator is only through two characteristics: the observation frequency and the covariance structure of the noise. This is completely different from the case of the realized covariance in a pure semimartingale setting.

preprint, 2015, arXiv

Time endogeneity and an optimal weight function in pre-averaging covariance estimation

We establish a central limit theorem for a class of pre-averaging covariance estimators in a general endogenous time setting. In particular, we show that the time endogeneity has no impact on the asymptotic distribution if certain functionals of observation times are asymptotically well-defined. This contrasts with the case of the realized volatility in a pure diffusion setting. We also discuss an optimal choice of the weight function in the pre-averaging.

preprint, 2013, arXiv

Central limit theorems for pre-averaging covariance estimators under endogenous sampling times

We consider two continuous Itô semimartingales observed with noise and sampled at stopping times in a nonsynchronous manner. In this article we establish a central limit theorem for the pre-averaged Hayashi-Yoshida estimator of their integrated covariance in a general endogenous time setting. In particular, we show that the time endogeneity has no impact on the asymptotic distribution of the pre-averaged Hayashi-Yoshida estimator, which contrasts with the case of the realized volatility in a pure diffusion setting. We also establish a central limit theorem for the modulated realized covariance, which is another pre-averaging based integrated covariance estimator, and demonstrate that the above property seems to be a special feature of the pre-averaging technique.

preprint, 2013, arXiv

Estimation of integrated covariances in the simultaneous presence of nonsynchronicity, microstructure noise and jumps

We propose a new estimator for the integrated covariance of two Itô semimartingales observed at high frequency. This new estimator, which we call the pre-averaged truncated Hayashi-Yoshida estimator, enables us to separate the sum of the co-jumps from the total quadratic covariation even in the case that the sampling schemes of two processes are nonsynchronous and the observation data is polluted by some noise. It is the first estimator which can simultaneously handle these three issues, which are fundamental to empirical studies of high-frequency financial data. We also show the asymptotic mixed normality of this estimator under some mild conditions allowing infinite activity jump processes with finite variations, some dependency between the sampling times and the observed processes as well as a kind of endogenous observation errors. We examine the finite sample performance of this estimator using a Monte Carlo study.

preprint, 2013, arXiv

Limit theorems for the pre-averaged Hayashi-Yoshida estimator with random sampling

We focus on estimating the integrated covariance of two diffusion processes observed in a nonsynchronous manner. The observation data is contaminated by some noise, which is possibly correlated with the returns of the diffusion processes, while the sampling times also possibly depend on the observed processes. In a high-frequency setting, we consider a modified version of the pre-averaged Hayashi-Yoshida estimator, and we show that this kind of estimator is consistent and asymptotically mixed normal, and attains the optimal rate of convergence.
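Several of the abstracts above build on the Hayashi-Yoshida estimator of integrated covariance for nonsynchronous observations. The sketch below implements only the plain Hayashi-Yoshida estimator (the sum of products of increments over overlapping observation intervals), without the pre-averaging or truncation steps the papers add to handle noise and jumps. The function name and argument layout are hypothetical.

```python
import numpy as np


def hayashi_yoshida(tx, x, ty, y):
    """Plain Hayashi-Yoshida covariance estimator.

    tx, ty : increasing observation times of the two processes
    x, y   : observed values at those times

    Sums dx[i] * dy[j] over all pairs whose observation intervals
    (tx[i], tx[i+1]] and (ty[j], ty[j+1]] overlap. No noise or jump
    correction is applied.
    """
    tx, x, ty, y = (np.asarray(v, dtype=float) for v in (tx, x, ty, y))
    dx, dy = np.diff(x), np.diff(y)

    # Intervals overlap iff each one starts before the other ends.
    overlap = (tx[:-1, None] < ty[None, 1:]) & (ty[None, :-1] < tx[1:, None])
    return float(dx @ overlap.astype(float) @ dy)


if __name__ == "__main__":
    # x is observed at times 0 and 2, y at times 0, 1 and 2.
    print(hayashi_yoshida([0, 2], [0, 3], [0, 1, 2], [0, 1, 4]))
```

With synchronous observation times the overlap matrix is diagonal and the estimator reduces to the usual realized covariance, which is why no interpolation or synchronization of the raw data is needed.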

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Source provenance

Where this author record came from

arXiv, confidence 95%

external id: arxiv:2605.18069:author:1:yuta-koike

Imported May 20, 2026; synced May 21, 2026

arXiv, confidence 95%

external id: arxiv:2605.18040:author:1:yuta-koike

Imported May 20, 2026; synced May 21, 2026

4 works

Xiao Fang

Researcher

2 works

Denis Chetverikov

Researcher

2 works

Kengo Kato

Researcher

2 works

Takaki Hayashi

Researcher