Researcher profile

Mengjia Yu

Mengjia Yu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2023arXiv

Enrollment Forecast for Clinical Trials at the Portfolio Planning Phase Based on Site-Level Historical Data

Accurate forecast of a clinical trial enrollment timeline at the planning phase is of great importance to both corporate strategic planning and trial operational excellence. While predictions of key milestones such as last subject first dose date can inform strategic decision-making, detailed predictive insights (e.g., median number of enrolled subjects by month for a country) can facilitate the planning of clinical trial operation activities and promote execution excellence. The naive approach often calculates an average enrollment rate from historical data and generates an inaccurate prediction based on a linear trend with the average rate. The traditional statistical approach utilizes the simple Poisson-Gamma model that assumes time-invariant site activation rates and it can fail to capture the underlying nonlinear patterns (e.g., up and down site activation pattern). We present a novel statistical approach based on generalized linear mixed-effects models and the use of non-homogeneous Poisson processes through Bayesian framework to model the country initiation, site activation and subject enrollment sequentially in a systematic fashion. We validate the performance of our proposed enrollment modeling framework based on a set of preselected 25 studies from four therapeutic areas. Our modeling framework shows a substantial improvement in prediction accuracy in comparison to the traditional statistical approach. Furthermore, we show that our modeling and simulation approach calibrates the data variability appropriately and gives correct coverage rates for prediction intervals of various nominal levels. Finally, we demonstrate the use of our approach to generate the predicted enrollment curves through time with confidence bands overlaid.

preprint2021arXiv

Finite sample change point inference and identification for high-dimensional mean vectors

Cumulative sum (CUSUM) statistics are widely used in the change point inference and identification. For the problem of testing for existence of a change point in an independent sample generated from the mean-shift model, we introduce a Gaussian multiplier bootstrap to calibrate critical values of the CUSUM test statistics in high dimensions. The proposed bootstrap CUSUM test is fully data-dependent and it has strong theoretical guarantees under arbitrary dependence structures and mild moment conditions. Specifically, we show that with a boundary removal parameter the bootstrap CUSUM test enjoys the uniform validity in size under the null and it achieves the minimax separation rate under the sparse alternatives when the dimension $p$ can be larger than the sample size $n$. Once a change point is detected, we estimate the change point location by maximizing the $\ell^{\infty}$-norm of the generalized CUSUM statistics at two different weighting scales corresponding to covariance stationary and non-stationary CUSUM statistics. For both estimators, we derive their rates of convergence and show that dimension impacts the rates only through logarithmic factors, which implies that consistency of the CUSUM estimators is possible when $p$ is much larger than $n$. In the presence of multiple change points, we propose a principled bootstrap-assisted binary segmentation (BABS) algorithm to dynamically adjust the change point detection rule and recursively estimate their locations. We derive its rate of convergence under suitable signal separation and strength conditions. The results derived in this paper are non-asymptotic and we provide extensive simulation studies to assess the finite sample performance. The empirical evidence shows an encouraging agreement with our theoretical results.