Source author record

Donggyu Kim

Donggyu Kim appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Methodology, econ.EM, Applications, math.ST, physics.optics, q-fin.ST, Statistics Theory, cond-mat.soft, Machine Learning, physics.ao-ph, physics.flu-dyn, q-fin.PM, q-fin.RM, stat.OT

Catalog footprint

What is connected

16 works

14 topics

4 close collaborators

preprint · 2026 · arXiv

High-Dimensional Time-Varying Coefficient Estimation in Diffusion Models

In this paper, we develop a novel high-dimensional time-varying coefficient estimation method, based on high-dimensional Itô diffusion processes. To account for high-dimensional time-varying coefficients, we first estimate local (or instantaneous) coefficients using a time-localized Dantzig selection scheme under a sparsity condition, which results in biased local coefficient estimators due to the regularization. To handle the bias, we propose a debiasing scheme, which provides well-performing unbiased local coefficient estimators. With the unbiased local coefficient estimators, we estimate the integrated coefficient, and to further account for the sparsity of the coefficient process, we apply thresholding schemes. We call this Thresholding dEbiased Dantzig (TED). We establish asymptotic properties of the proposed TED estimator. In the empirical analysis, TED achieves a higher average out-of-sample $R^2$ across assets than benchmark estimators in most periods. Industry-related factors play a central role in explaining asset returns. The estimated integrated coefficients show pronounced time variation associated with firm-specific events and seasonal patterns.

preprint · 2022 · arXiv

Benchmark Dataset for Precipitation Forecasting by Post-Processing the Numerical Weather Prediction

Precipitation forecasting is an important scientific challenge that has wide-reaching impacts on society. Historically, this challenge has been tackled using numerical weather prediction (NWP) models, grounded in physics-based simulations. Recently, many works have proposed an alternative approach, using end-to-end deep learning (DL) models to replace physics-based NWP models. While these DL methods show improved performance and computational efficiency, they exhibit limitations in long-term forecasting and lack explainability. In this work, we present a hybrid NWP-DL workflow to fill the gap between standalone NWP and DL approaches. Under this workflow, the outputs of NWP models are fed into a deep neural network, which post-processes the data to yield a refined precipitation forecast. The deep model is trained with supervision, using Automatic Weather Station (AWS) observations as ground-truth labels. This can achieve the best of both worlds, and can even benefit from future improvements in NWP technology. To facilitate study in this direction, we present a novel dataset focused on the Korean Peninsula, termed KoMet (Korea Meteorological Dataset), comprising NWP outputs and AWS observations. For the NWP model, the Global Data Assimilation and Prediction Systems-Korea Integrated Model (GDAPS-KIM) is utilized. We provide analysis on a comprehensive set of baseline methods aimed at addressing the challenges of KoMet, including the sparsity of AWS observations and class imbalance. To lower the barrier to entry and encourage further study, we also provide an extensive open-source Python package for data processing and model development. Our benchmark data and code are available at https://github.com/osilab-kaist/KoMet-Benchmark-Dataset.

preprint · 2022 · arXiv

Dynamic Realized Beta Models Using Robust Realized Integrated Beta Estimator

This paper introduces a unified parametric modeling approach for time-varying market betas that can accommodate continuous-time diffusion and discrete-time series models based on a continuous-time series regression model to better capture the dynamic evolution of market betas. We call this the dynamic realized beta (DR Beta). We first develop a non-parametric realized integrated beta estimator using high-frequency financial data contaminated by microstructure noise, which is robust to stylized features such as the time-varying beta and the dependence structure of microstructure noise, and establish the estimator's asymptotic properties. Then, with the robust realized integrated beta estimator, we propose a quasi-likelihood procedure for estimating the model parameters based on the combined high-frequency data and low-frequency dynamic structure. We also establish asymptotic theorems for the proposed estimator and conduct a simulation study to check the estimator's finite-sample performance. The empirical study with the S&P 500 index and the top 50 large trading volume stocks from the S&P 500 illustrates that the proposed DR Beta model effectively accounts for dynamics in the market beta of individual stocks and better predicts future market betas.

preprint · 2022 · arXiv

Next Generation Models for Portfolio Risk Management: An Approach Using Financial Big Data

This paper proposes a dynamic process of portfolio risk measurement to address potential information loss. The proposed model takes advantage of financial big data to incorporate out-of-target-portfolio information that may be missed when one considers the Value at Risk (VaR) measures only from certain assets of the portfolio. We investigate how the curse of dimensionality can be overcome in the use of financial big data and discuss where and when benefits occur from a large number of assets. In this regard, the proposed approach is the first to suggest the use of financial big data to improve the accuracy of risk analysis. We compare the proposed model with benchmark approaches and empirically show that the use of financial big data improves small portfolio risk analysis. Our findings are useful for portfolio managers and financial regulators, who may seek an innovation to improve the accuracy of portfolio risk estimation.

preprint · 2022 · arXiv

Overnight GARCH-Itô Volatility Models

Various parametric volatility models for financial data have been developed to incorporate high-frequency realized volatilities and better capture market dynamics. However, because high-frequency trading data are not available during the close-to-open period, the volatility models often ignore volatility information over the close-to-open period and thus may suffer from loss of important information relevant to market dynamics. In this paper, to account for whole-day market dynamics, we propose an overnight volatility model based on Itô diffusions to accommodate two different instantaneous volatility processes for the open-to-close and close-to-open periods. We develop a weighted least squares method to estimate model parameters for two different periods and investigate its asymptotic properties. We conduct a simulation study to check the finite sample performance of the proposed model and method. Finally, we apply the proposed approaches to real trading data.

preprint · 2022 · arXiv

Volatility Models for Stylized Facts of High-Frequency Financial Data

This paper introduces novel volatility diffusion models to account for the stylized facts of high-frequency financial data such as volatility clustering, intra-day U-shape, and leverage effect. For example, the daily integrated volatility of the proposed volatility process has a realized GARCH structure with an asymmetric effect on log-returns. To further explain the heavy-tailedness of the financial data, we assume that the log-returns have a finite $2b$-th moment for $b \in (1,2]$. Then, we propose a Huber regression estimator which has an optimal convergence rate of $n^{(1-b)/b}$. We also discuss how to adjust bias coming from Huber loss and show its asymptotic properties.
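The Huber loss named in this abstract can be illustrated with a minimal sketch. It is not the paper's regression estimator: the function names `huber_loss` and `huber_location`, the threshold `delta`, and the reweighted-mean scheme are assumptions chosen for illustration, and the example estimates a simple location parameter rather than a volatility model.

```python
import statistics


def huber_loss(residual, delta=1.0):
    """Quadratic for |residual| <= delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def huber_location(data, delta=1.0, n_iter=100):
    """Hypothetical helper: Huber M-estimate of location via reweighted means."""
    mu = statistics.median(data)  # robust starting point
    for _ in range(n_iter):
        # Weight 1 inside the quadratic zone, delta/|r| outside it.
        weights = [1.0 if abs(x - mu) <= delta else delta / abs(x - mu)
                   for x in data]
        mu = sum(w * x for w, x in zip(weights, data)) / sum(weights)
    return mu


sample = [0.0, 0.0, 0.0, 100.0]   # one heavy-tailed observation
print(huber_location(sample))     # stays near the bulk of the data
print(statistics.mean(sample))    # the plain mean is pulled to 25.0
```

The linear tail of the loss caps the influence of any single extreme observation, which is why the estimate stays close to the three zeros while the sample mean does not.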

preprint · 2021 · arXiv

State Heterogeneity Analysis of Financial Volatility Using High-Frequency Financial Data

Recently, to account for low-frequency market dynamics, several volatility models, employing high-frequency financial data, have been developed. However, in financial markets, we often observe that financial volatility processes depend on economic states, so they have a state heterogeneous structure. In this paper, to study state heterogeneous market dynamics based on high-frequency data, we introduce a novel volatility model based on a continuous Ito diffusion process whose intraday instantaneous volatility process evolves depending on the exogenous state variable, as well as its integrated volatility. We call it the state heterogeneous GARCH-Ito (SG-Ito) model. We suggest a quasi-likelihood estimation procedure with the realized volatility proxy and establish its asymptotic behaviors. Moreover, to test the low-frequency state heterogeneity, we develop a Wald test-type hypothesis testing procedure. The results of empirical studies suggest the existence of leverage, investor attention, market illiquidity, stock market comovement, and post-holiday effect in S&P 500 index volatility.
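The realized volatility proxy referred to above is commonly computed as the sum of squared intraday log returns. The following is a minimal sketch under that standard definition; the function name and the sample prices are hypothetical, and the SG-Ito model itself is not implemented here.

```python
import math


def realized_volatility(prices):
    """Sum of squared log returns over one day's intraday price path."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return sum(r * r for r in log_returns)


# Hypothetical intraday prices for a single trading day.
intraday_prices = [100.0, 101.0, 100.5, 102.0]
print(realized_volatility(intraday_prices))
```

Some texts call this quantity realized variance and reserve "realized volatility" for its square root, so the convention should be checked before comparing numbers across sources.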

preprint · 2021 · arXiv

Statistical Analysis of Quantum Annealing

Quantum computers use quantum resources to carry out computational tasks and may outperform classical computers in solving certain computational problems. Special-purpose quantum computers such as quantum annealers employ the quantum adiabatic theorem to solve combinatorial optimization problems. In this paper, we compare classical annealing, such as simulated annealing, with quantum annealing performed by D-Wave machines, both theoretically and numerically. We show that if the classical and quantum annealing are characterized by equivalent Ising models, then solving an optimization problem, i.e., finding the minimal energy of each Ising model, by the two annealing procedures is mathematically identical. For quantum annealing, we also derive a lower bound on the probability of successfully solving an optimization problem by measuring the system at the end of the annealing procedure. Moreover, we present the Markov chain Monte Carlo (MCMC) method to realize quantum annealing by classical computers and investigate its statistical properties. In the numerical section, we discuss the discrepancies between the MCMC-based annealing approaches and the quantum annealing approach in solving optimization problems.
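Classical simulated annealing on an Ising model, one of the procedures compared in this abstract, can be sketched as follows. This is a generic single-spin-flip Metropolis scheme, not the paper's MCMC realization of quantum annealing; the sign convention of the energy, the geometric cooling schedule, the parameter values, and the toy instance are all assumptions made for illustration.

```python
import math
import random


def ising_energy(spins, couplings, fields):
    """H(s) = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i (assumed convention)."""
    n = len(spins)
    pair = sum(couplings[i][j] * spins[i] * spins[j]
               for i in range(n) for j in range(i + 1, n))
    field = sum(fields[i] * spins[i] for i in range(n))
    return -pair - field


def simulated_annealing(couplings, fields, n_sweeps=200,
                        t_start=5.0, t_end=0.05, seed=0):
    """Single-spin-flip Metropolis updates under geometric cooling."""
    rng = random.Random(seed)
    n = len(fields)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    energy = ising_energy(spins, couplings, fields)
    best_spins, best_energy = spins[:], energy
    for sweep in range(n_sweeps):
        # Temperature decays geometrically from t_start to t_end.
        frac = sweep / max(n_sweeps - 1, 1)
        temperature = t_start * (t_end / t_start) ** frac
        for i in range(n):
            spins[i] = -spins[i]  # propose flipping spin i
            new_energy = ising_energy(spins, couplings, fields)
            delta = new_energy - energy
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                energy = new_energy  # accept the move
                if energy < best_energy:
                    best_spins, best_energy = spins[:], energy
            else:
                spins[i] = -spins[i]  # reject: undo the flip
    return best_spins, best_energy


# Toy instance: 6-spin ferromagnetic chain with a small positive field.
n = 6
J = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
h = [0.1] * n
best_spins, best_energy = simulated_annealing(J, h)
print(best_spins, best_energy)
```

The energy is recomputed from scratch after each proposed flip to keep the sketch short; a practical implementation would update it incrementally from the flipped spin's neighbours.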

preprint · 2020 · arXiv

How and When the Cassie-Baxter Droplet Starts to Slide on the Textured Surfaces

The Cassie-Baxter state droplet has many local energy minima on the textured surface, while the size of the energy barrier between them can be affected by gravity. When the droplet cannot find any local energy minimum on the surface, it starts to slide. Based on the Laplace pressure equation, the shape of a two-dimensional Cassie-Baxter droplet on a textured surface is predicted. Then the stability of the droplet is examined by considering the interference between the liquid and the surface microstructure as well as analyzing the free energy change upon de-pinning. Afterward, the theoretical analysis is validated against a line-tension-based front tracking method (LTM) simulation, which seamlessly captures the attachment and detachment between the liquid and the substrate. We address two open debates in sliding research: (i) whether sliding initiates with front-end slip or rear-end slip, and (ii) whether the advancing and receding contact angles measured on a horizontal surface are comparable with the front and rear contact angles of the droplet at the onset of sliding. Additionally, a new droplet translation mechanism promoted by a cycle of condensation and evaporation is suggested.

preprint · 2020 · arXiv

Unified Discrete-Time Factor Stochastic Volatility and Continuous-Time Ito Models for Combining Inference Based on Low-Frequency and High-Frequency

This paper introduces unified models for high-dimensional factor-based Ito processes, which can accommodate both continuous-time Ito diffusion and discrete-time stochastic volatility (SV) models by embedding the discrete SV model in the continuous instantaneous factor volatility process. We call it the SV-Ito model. Based on the series of daily integrated factor volatility matrix estimators, we propose quasi-maximum likelihood and least squares estimation methods. Their asymptotic properties are established. We apply the proposed method to predict the future vast volatility matrix, whose asymptotic behaviors are studied. A simulation study is conducted to check the finite sample performance of the proposed estimation and prediction method. An empirical analysis is carried out to demonstrate the advantage of the SV-Ito model in volatility prediction and portfolio allocation problems.

preprint · 2020 · arXiv

Volatility Analysis with Realized GARCH-Ito Models

This paper introduces a unified approach for modeling high-frequency financial data that can accommodate both the continuous-time jump-diffusion and discrete-time realized GARCH models by embedding the discrete realized GARCH structure in the continuous instantaneous volatility process. The key feature of the proposed model is that the corresponding conditional daily integrated volatility adopts an autoregressive structure where both integrated volatility and jump variation serve as innovations. We call it the realized GARCH-Ito model. Given the autoregressive structure in the conditional daily integrated volatility, we propose a quasi-likelihood function for parameter estimation and establish its asymptotic properties. To improve the parameter estimation, we propose a joint quasi-likelihood function that is built on the marriage of daily integrated volatility estimated by high-frequency data and a nonparametric volatility estimator obtained from option data. We conduct a simulation study to check the finite sample performance of the proposed methodologies and an empirical study with the S&P500 stock index and option data.

preprint · 2016 · arXiv

Asymptotic Theory for Estimating the Singular Vectors and Values of a Partially-observed Low Rank Matrix with Noise

Matrix completion algorithms recover a low rank matrix from a small fraction of the entries, each entry contaminated with additive errors. In practice, the singular vectors and singular values of the low rank matrix play a pivotal role for statistical analyses and inferences. This paper proposes estimators of these quantities and studies their asymptotic behavior. Under the setting where the dimensions of the matrix increase to infinity and the probability of observing each entry is identical, Theorem 4.1 gives the rate of convergence for the estimated singular vectors; Theorem 4.3 gives a multivariate central limit theorem for the estimated singular values. Even though the estimators use only a partially observed matrix, they achieve the same rates of convergence as the fully observed case. These estimators combine to form a consistent estimator of the full low rank matrix that is computed with a non-iterative algorithm. In the cases studied in this paper, this estimator achieves the minimax lower bound in Koltchinskii et al. (2011). The numerical experiments corroborate our theoretical results.

preprint · 2016 · arXiv

Intelligent Initialization and Adaptive Thresholding for Iterative Matrix Completion; Some Statistical and Algorithmic Theory for Adaptive-Impute

Over the past decade, various matrix completion algorithms have been developed. Thresholded singular value decomposition (SVD) is a popular technique in implementing many of them. A sizable number of studies have shown its theoretical and empirical excellence, but choosing the right threshold level remains a key empirical difficulty. This paper proposes a novel matrix completion algorithm which iterates thresholded SVD with theoretically-justified and data-dependent values of thresholding parameters. The estimate of the proposed algorithm enjoys the minimax error rate and shows outstanding empirical performance. The thresholding scheme that we use can be viewed as a solution to a non-convex optimization problem, for which theoretical convergence guarantees are known to be limited. We investigate this problem by introducing a simpler algorithm, generalized-\SI, analyzing its convergence behavior, and connecting it to the proposed algorithm.

preprint · 2016 · arXiv

Optimal large-scale quantum state tomography with Pauli measurements

Quantum state tomography aims to determine the state of a quantum system as represented by a density matrix. It is a fundamental task in modern scientific studies involving quantum systems. In this paper, we study estimation of high-dimensional density matrices based on Pauli measurements. In particular, under an appropriate notion of sparsity, we establish the minimax optimal rates of convergence for estimation of the density matrix under both the spectral and Frobenius norm losses, and show how these rates can be achieved by a common thresholding approach. Numerical performance of the proposed estimator is also investigated.

preprint · 2013 · arXiv

Enrichment of deeply penetrating waves in disordered media

Waves incident on a highly scattering medium are incapable of penetrating deep into the medium due to the diffusion process induced by multiple scattering. This poses a fundamental limitation to optically imaging, sensing, and manipulating targets embedded in opaque scattering layers such as biological tissues. One strategy for mitigating the shallow wave penetration is to exploit eigenmodes with anomalously high transmittance existing in any disordered medium. When waves are coupled to these eigenmodes, strong constructive wave interference enhances deeply penetrating waves. However, finding such eigenmodes has been a challenging task due to the complexity of disordered media. In this Letter, we present an iterative wavefront control method that selectively enriches the coupling of the incident beam to high-transmission eigenmodes. Specifically, we refined the high-transmission eigenmodes from an arbitrary initial wave by either maximizing transmitted wave intensity or minimizing reflected wave intensity. Using the proposed method, we achieved more than a threefold increase in light transmission through a scattering medium exhibiting hundreds of scattering events. Our approach is readily applicable to in vivo applications in which only the detection of reflected waves is available. Enhancing light penetration will lead to improving the working depth of optical imaging and treatment techniques.

preprint · 2013 · arXiv

Pixelation-free and real-time endoscopic imaging through a fiber bundle

Endoscopy has been an indispensable tool in medical diagnostics, and yet the demands for reduced unit diameter and enhanced spatial resolution have steadily been growing for the accurate investigation of distal sites with minimal side effects. However, attempts to make use of thin image-guiding media are accompanied by degradation in spatial resolution, as the micro-optics often induce aberrations. Here, we present a microendoscope that performs real-time correction of severe aberrations induced by image-guiding media such as a bundled fiber. Specifically, we developed a method exploiting the full binary control of a digital micro-mirror device (DMD) for characterizing the input-output response of image-guiding media and subsequently compensating the aberrations. As a proof-of-concept study, we completely eliminated the pixelation artifact, a severe form of aberration, in endoscopic imaging through an image fiber bundle and achieved spatial resolution much better than the diameter of an individual fiber. Our study lays a foundation for applying extremely thin, but highly aberrant image-guiding media for high-resolution microendoscopy.