Researcher profile

Yuta Koike

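Yuta Koike works on probability and statistics. The works listed below cover high-dimensional central limit theorems and bootstrap approximations, Stein's method, lead-lag estimation for high-frequency financial data, and sampling error bounds for denoising diffusion probabilistic models.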

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
12 works
0 followers
7 topics
4 close collaborators

Published work

12 published items

Preprint, 2026, arXiv

A note on connections between the Föllmer process and the denoising diffusion probabilistic model

The Föllmer process is a Brownian motion conditioned to have a pre-specified distribution at time 1. This process can be interpreted as an "augmented" time-compressed version of the reverse stochastic differential equation (SDE) for the denoising diffusion probabilistic model (DDPM). While this fact has been indirectly used to analyze DDPM sampling errors via discretization of the reverse SDE, connections between direct discretization of the Föllmer process and the DDPM sampler have not yet been fully explored. This note aims to clarify this point while surveying relevant results from existing work. We show that discretized Föllmer processes give natural hyper-parameter settings of the DDPM sampler. Moreover, this allows us to systematically recover state-of-the-art results on DDPM sampling error bounds with slight improvements.
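As a rough illustration of the object described in this abstract: the Föllmer process is commonly written as $dX_t = \nabla\log P_{1-t}f(X_t)\,dt + dB_t$ with $X_0=0$, where $f$ is the density of the target relative to the standard Gaussian and $P_sf(x)=E[f(x+\sqrt{s}Z)]$. The sketch below is a plain Euler-Maruyama simulation for a one-dimensional target $N(0,\sigma^2)$, for which the drift has the closed form $cx/(1-c(1-t))$ with $c=1-1/\sigma^2$. It is not the discretization analyzed in the note; the function names and parameters are hypothetical.

```python
import numpy as np


def follmer_drift_gaussian(t, x, sigma2):
    """Closed-form Follmer drift for the 1-D target N(0, sigma2).

    Assumption for illustration: Gaussian target, so that
    grad log P_{1-t} f(x) = c * x / (1 - c * (1 - t)) with c = 1 - 1/sigma2.
    """
    c = 1.0 - 1.0 / sigma2
    return c * x / (1.0 - c * (1.0 - t))


def simulate_follmer(sigma2, n_paths=20000, n_steps=200, seed=0):
    """Euler-Maruyama simulation on [0, 1], started at 0; returns X_1 samples."""
    rng = np.random.default_rng(seed)
    h = 1.0 / n_steps
    x = np.zeros(n_paths)
    for k in range(n_steps):
        t = k * h
        noise = rng.standard_normal(n_paths)
        x = x + follmer_drift_gaussian(t, x, sigma2) * h + np.sqrt(h) * noise
    return x


if __name__ == "__main__":
    samples = simulate_follmer(sigma2=0.25)
    # The terminal variance should be close to the target variance 0.25.
    print(samples.var())
```

With a Gaussian target, the terminal law of the exact process is $N(0,\sigma^2)$, so the sample variance of the simulated endpoints gives a quick sanity check on the step size.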

Preprint, 2026, arXiv

On lead-lag estimation of non-synchronously observed point processes

This paper introduces a new theoretical framework for analyzing lead-lag relationships between point processes, with a special focus on applications to high-frequency financial data. In particular, we are interested in lead-lag relationships between two sequences of order arrival timestamps. The seminal work of Dobrev and Schaumburg proposed model-free measures of cross-market trading activity based on cross-counts of timestamps. While their method is known to yield reliable results, it faces limitations because its original formulation inherently relies on discrete-time observations, an issue we address in this study. Specifically, we formulate the problem of estimating lead-lag relationships in two point processes as that of estimating the shape of the cross-pair correlation function (CPCF) of a bivariate stationary point process, a quantity well-studied in the neuroscience and spatial statistics literature. Within this framework, the prevailing lead-lag time is defined as the location of the CPCF's sharpest peak. Under this interpretation, the peak location in Dobrev and Schaumburg's cross-market activity measure can be viewed as an estimator of the lead-lag time in the aforementioned sense. We further propose an alternative lead-lag time estimator based on kernel density estimation and show that it possesses desirable theoretical properties and delivers superior numerical performance. Empirical evidence from high-frequency financial data demonstrates the effectiveness of our proposed method.
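The idea of reading a lead-lag time off the peak of a smoothed cross-count curve can be sketched as follows. This is a generic Gaussian-kernel smoothing of pairwise timestamp differences, not the estimator proposed in the paper; the function name, the sign convention (a positive lag means the second series follows the first) and the bandwidth choice are assumptions made for illustration.

```python
import numpy as np


def lead_lag_kde(times_a, times_b, max_lag, bandwidth, grid_size=401):
    """Locate the peak of a kernel-smoothed curve of differences t_b - t_a.

    Returns (lag_grid, curve, peak_lag). The curve is unnormalized:
    only the location of its maximum is used.
    """
    a = np.sort(np.asarray(times_a, dtype=float))
    b = np.sort(np.asarray(times_b, dtype=float))
    # Only pairs whose difference can influence the grid are kept.
    window = max_lag + 3.0 * bandwidth
    lo = np.searchsorted(b, a - window, side="left")
    hi = np.searchsorted(b, a + window, side="right")
    pieces = [b[i:j] - t for t, i, j in zip(a, lo, hi)]
    diffs = np.concatenate(pieces) if pieces else np.empty(0)
    grid = np.linspace(-max_lag, max_lag, grid_size)
    z = (grid[:, None] - diffs[None, :]) / bandwidth
    curve = np.exp(-0.5 * z**2).sum(axis=1) / (bandwidth * np.sqrt(2.0 * np.pi))
    return grid, curve, grid[np.argmax(curve)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: series b echoes series a about 0.5 time units later.
    a = np.cumsum(rng.exponential(1.0, size=1000))
    b = a + 0.5 + rng.normal(0.0, 0.01, size=a.size)
    _, _, peak = lead_lag_kde(a, b, max_lag=2.0, bandwidth=0.05)
    print(peak)
```

The pairwise-difference matrix grows with the number of nearby pairs, so this form only suits small samples; a production version would bin or stream the differences.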

Preprint, 2026, arXiv

Wasserstein bounds for denoising diffusion probabilistic models via the Föllmer process

This paper studies sampling error bounds for denoising diffusion probabilistic models (DDPMs) in the 2-Wasserstein distance. Our contributions are threefold. (i) Under general Lipschitz-type conditions on the score function and for a broad class of variance schedules, including the cosine schedule, we establish sharp upper bounds that are optimal in both the dimension and the number of steps, and recover several sharp error bounds previously obtained in the literature. (ii) We prove that the same Lipschitz-type conditions, which encompass those commonly imposed on the (learned) score, imply a logarithmic Sobolev inequality and hence a quadratic transportation cost inequality for the DDPM. As a consequence, in settings covered by existing work, an optimal Wasserstein bound, up to a logarithmic factor, follows from the recently obtained sharp error bound in the Kullback-Leibler divergence under geometric-type variance schedules. (iii) We show that for general log-concave target distributions, the optimal Wasserstein error bound remains attainable even without a quadratic transportation cost inequality for the target. Our analysis is based on viewing the DDPM sampler as a discretization of the Föllmer process rather than the conventional reverse Ornstein-Uhlenbeck process.

Preprint, 2022, arXiv

From $p$-Wasserstein Bounds to Moderate Deviations

We use a new method via $p$-Wasserstein bounds to prove Cramér-type moderate deviations in (multivariate) normal approximations. In the classical setting that $W$ is a standardized sum of $n$ independent and identically distributed (i.i.d.) random variables with sub-exponential tails, our method recovers the optimal range of $0\leq x=o(n^{1/6})$ and the near optimal error rate $O(1)(1+x)(\log n+x^2)/\sqrt{n}$ for $P(W>x)/(1-\Phi(x))\to 1$, where $\Phi$ is the standard normal distribution function. Our method also works for dependent random variables (vectors) and we give applications to the combinatorial central limit theorem, Wiener chaos, homogeneous sums and local dependence. The key step of our method is to show that the $p$-Wasserstein distance between the distribution of the random variable (vector) of interest and a normal distribution grows like $O(p^\alpha\Delta)$, $1\leq p\leq p_0$, for some constants $\alpha, \Delta$ and $p_0$. In the above i.i.d. setting, $\alpha=1, \Delta=1/\sqrt{n}, p_0=n^{1/3}$. For this purpose, we obtain general $p$-Wasserstein bounds in (multivariate) normal approximations using Stein's method.

Preprint, 2022, arXiv

High-dimensional Data Bootstrap

This article reviews recent progress in high-dimensional bootstrap. We first review high-dimensional central limit theorems for distributions of sample mean vectors over the rectangles, bootstrap consistency results in high dimensions, and key techniques used to establish those results. We then review selected applications of high-dimensional bootstrap: construction of simultaneous confidence sets for high-dimensional vector parameters, multiple hypothesis testing via stepdown, post-selection inference, intersection bounds for partially identified parameters, and inference on best policies in policy evaluation. Finally, we also comment on a couple of future research directions.

Preprint, 2022, arXiv

Improved Central Limit Theorem and bootstrap approximations in high dimensions

This paper deals with the Gaussian and bootstrap approximations to the distribution of the max statistic in high dimensions. This statistic takes the form of the maximum over components of the sum of independent random vectors and its distribution plays a key role in many high-dimensional econometric problems. Using a novel iterative randomized Lindeberg method, the paper derives new bounds for the distributional approximation errors. These new bounds substantially improve upon existing ones and simultaneously allow for a larger class of bootstrap methods.
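For readers unfamiliar with the max statistic, the sketch below computes it and approximates a quantile of its distribution with the standard Gaussian multiplier bootstrap. This is one well-known bootstrap scheme chosen for illustration; it is not claimed to be the specific class of methods or the proof technique of the paper, and the function names and defaults are hypothetical.

```python
import numpy as np


def max_statistic(x):
    """T = max_j n^{-1/2} * sum_i x[i, j] for an n-by-d data array."""
    n = x.shape[0]
    return np.max(x.sum(axis=0) / np.sqrt(n))


def multiplier_bootstrap_quantile(x, level=0.95, n_boot=1000, seed=0):
    """Gaussian multiplier bootstrap quantile for the max statistic.

    Each bootstrap draw reweights the centered rows with i.i.d. N(0, 1)
    multipliers and takes the maximum over the d coordinates.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    centered = x - x.mean(axis=0)
    multipliers = rng.standard_normal((n_boot, n))
    boot_stats = (multipliers @ centered / np.sqrt(n)).max(axis=1)
    return np.quantile(boot_stats, level)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.standard_normal((200, 50))  # synthetic centered data
    print(max_statistic(data), multiplier_bootstrap_quantile(data))
```

Comparing `max_statistic(data)` with the bootstrap quantile gives a simultaneous test that all coordinate means are at most zero, which is the kind of high-dimensional use the abstract alludes to.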

Preprint, 2022, arXiv

Notes on the dimension dependence in high-dimensional central limit theorems for hyperrectangles

Let $X_1,\dots,X_n$ be independent centered random vectors in $\mathbb{R}^d$. This paper shows that, even when $d$ may grow with $n$, the probability $P(n^{-1/2}\sum_{i=1}^nX_i\in A)$ can be approximated by its Gaussian analog uniformly in hyperrectangles $A$ in $\mathbb{R}^d$ as $n\to\infty$ under appropriate moment assumptions, as long as $(\log d)^5/n\to0$. This improves a result of Chernozhukov, Chetverikov & Kato [Ann. Probab. 45 (2017) 2309-2353] in terms of the dimension growth condition. When $n^{-1/2}\sum_{i=1}^nX_i$ has a common factor across the components, this condition can be further improved to $(\log d)^3/n\to0$. The corresponding bootstrap approximation results are also developed. These results serve as a theoretical foundation of simultaneous inference for high-dimensional models.

Preprint, 2021, arXiv

Large-dimensional Central Limit Theorem with Fourth-moment Error Bounds on Convex Sets and Balls

We prove the large-dimensional Gaussian approximation of a sum of $n$ independent random vectors in $\mathbb{R}^d$ together with fourth-moment error bounds on convex sets and Euclidean balls. We show that compared with classical third-moment bounds, our bounds have near-optimal dependence on $n$ and can achieve improved dependence on the dimension $d$. For centered balls, we obtain an additional error bound that has a sub-optimal dependence on $n$, but recovers the known result of the validity of the Gaussian approximation if and only if $d=o(n)$. We discuss an application to the bootstrap. We prove our main results using Stein's method.

Preprint, 2020, arXiv

High-dimensional Central Limit Theorems by Stein's Method

We obtain explicit error bounds for the $d$-dimensional normal approximation on hyperrectangles for a random vector that has a Stein kernel, or admits an exchangeable pair coupling, or is a non-linear statistic of independent random variables or a sum of $n$ locally dependent random vectors. We assume the approximating normal distribution has a non-singular covariance matrix. The error bounds vanish even when the dimension $d$ is much larger than the sample size $n$. We prove our main results using the approach of Götze (1991) in Stein's method, together with modifications of an estimate of Anderson, Hall and Titterington (1998) and a smoothing inequality of Bhattacharya and Rao (1976). For sums of $n$ independent and identically distributed isotropic random vectors having a log-concave density, we obtain an error bound that is optimal up to a $\log n$ factor. We also discuss an application to multiple Wiener-Itô integrals.

Preprint, 2020, arXiv

Multi-scale analysis of lead-lag relationships in high-frequency financial markets

We propose a novel estimation procedure for scale-by-scale lead-lag relationships of financial assets observed at high frequency in a non-synchronous manner. The proposed estimation procedure does not require any interpolation of the original datasets and is applicable to those with the highest time resolution available. Consistency of the proposed estimators is shown under the continuous-time framework developed in our previous work, Hayashi and Koike (2018). An empirical application to a quote dataset of the NASDAQ-100 assets identifies two types of lead-lag relationships at different time scales.

Preprint, 2020, arXiv

New error bounds in multivariate normal approximations via exchangeable pairs with applications to Wishart matrices and fourth moment theorems

We extend Stein's celebrated Wasserstein bound for normal approximation via exchangeable pairs to the multi-dimensional setting. As an intermediate step, we exploit the symmetry of exchangeable pairs to obtain an error bound for smooth test functions. We also obtain a continuous version of the multi-dimensional Wasserstein bound in terms of fourth moments. We apply the main results to multivariate normal approximations to Wishart matrices of size $n$ and degree $d$, where we obtain the optimal convergence rate $\sqrt{n^3/d}$ under only moment assumptions, and to quadratic forms and Poisson functionals, where we strengthen a few of the fourth moment bounds in the literature on the Wasserstein distance.

Preprint, 2019, arXiv

De-biased graphical Lasso for high-frequency data

This paper develops a new statistical inference theory for the precision matrix of high-frequency data in a high-dimensional setting. The focus is not only on point estimation but also on interval estimation and hypothesis testing for entries of the precision matrix. To accomplish this purpose, we establish an abstract asymptotic theory for the weighted graphical Lasso and its de-biased version without specifying the form of the initial covariance estimator. We also extend the scope of the theory to the case that a known factor structure is present in the data. The developed theory is applied to the concrete situation where we can use the realized covariance matrix as the initial covariance estimator, and we obtain a feasible asymptotic distribution theory to construct (simultaneous) confidence intervals and (multiple) testing procedures for entries of the precision matrix.