Source author record

Guosheng Yin

Guosheng Yin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record.

Topics: Methodology, Applications, Computation, Computer Vision, eess.IV, Machine Learning, math.PR, math.ST, q-fin.PM, Statistics Theory

Catalog footprint

What is connected

11 works

10 topics

4 close collaborators


Preprint, 2022, arXiv

Oncology Dose Finding Using Approximate Bayesian Computation Design

In the development of new cancer treatment, an essential step is to determine the maximum tolerated dose (MTD) via phase I clinical trials. Generally speaking, phase I trial designs can be classified as either model-based or algorithm-based approaches. Model-based phase I designs are typically more efficient by using all observed data, while there is a potential risk of model misspecification that may lead to unreliable dose assignment and incorrect MTD identification. In contrast, most of the algorithm-based designs are less efficient in using cumulative information, because they tend to focus on the observed data in the neighborhood of the current dose level for dose movement. To use the data more efficiently yet without any model assumption, we propose a novel approximate Bayesian computation (ABC) approach for phase I trial design. Not only is the ABC design free of any dose--toxicity curve assumption, but it can also aggregate all the available information accrued in the trial for dose assignment. Extensive simulation studies demonstrate its robustness and efficiency compared with other phase I designs. We apply the ABC design to the MEK inhibitor selumetinib trial to demonstrate its satisfactory performance. The proposed design can be a useful addition to the family of phase I clinical trial designs due to its simplicity, efficiency and robustness.
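The abstract does not spell out the design's algorithm, so the following is only a generic ABC rejection sketch of the idea it describes: estimating toxicity probabilities from all accrued data without a parametric dose-toxicity curve. The function names, the prior (sorted uniform draws, which only enforce monotonicity), the absolute-count distance, and the keep-the-closest-draws acceptance rule are all assumptions made for illustration, not the authors' design.

```python
import random

def abc_dose_posterior(n_patients, n_tox, n_draws=5000, keep=200, seed=0):
    """Approximate posterior means of per-dose toxicity probabilities via ABC rejection.

    n_patients[j] and n_tox[j] are the patients treated and toxicities seen at dose j.
    All names and defaults here are hypothetical.
    """
    rng = random.Random(seed)
    n_doses = len(n_patients)
    scored = []
    for _ in range(n_draws):
        # Prior draw: sorted uniforms give a monotone toxicity curve with no parametric model.
        probs = sorted(rng.random() for _ in range(n_doses))
        # Simulate toxicity counts using the observed cohort sizes.
        sim = [sum(rng.random() < p for _ in range(n)) for p, n in zip(probs, n_patients)]
        # Distance between simulated and observed data (an assumed choice).
        dist = sum(abs(s - t) for s, t in zip(sim, n_tox))
        scored.append((dist, probs))
    # Keep the draws whose simulated data are closest to the observed data.
    scored.sort(key=lambda pair: pair[0])
    accepted = [probs for _, probs in scored[:keep]]
    return [sum(p[j] for p in accepted) / len(accepted) for j in range(n_doses)]

def recommend_dose(post_means, target=0.3):
    """Index of the dose whose estimated toxicity probability is closest to the target."""
    return min(range(len(post_means)), key=lambda j: abs(post_means[j] - target))

if __name__ == "__main__":
    post = abc_dose_posterior([3, 3, 3, 0], [0, 0, 2, 0])
    print(post, recommend_dose(post))
```

Because every accepted draw is monotone, the estimated toxicity curve is monotone as well, and data from all dose levels inform each estimate.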

Preprint, 2021, arXiv

PCA Rerandomization

Mahalanobis distance between treatment group and control group covariate means is often adopted as a balance criterion when implementing a rerandomization strategy. However, this criterion may not work well for high-dimensional cases because it balances all orthogonalized covariates equally. Here, we propose leveraging principal component analysis (PCA) to identify proper subspaces in which Mahalanobis distance should be calculated. Not only can PCA effectively reduce the dimensionality for high-dimensional cases while capturing most of the information in the covariates, but it also provides computational simplicity by focusing on the top orthogonal components. We show that our PCA rerandomization scheme has desirable theoretical properties on balancing covariates and thereby on improving the estimation of average treatment effects. We also show that this conclusion is supported by numerical studies using both simulated and real examples.
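A minimal sketch of the idea, assuming NumPy is available: project the covariates onto the top k principal components, compute the Mahalanobis distance between group means in that subspace, and redraw the assignment until the distance falls below a threshold. The function names, the equal-split assignment, and the threshold value are hypothetical; the abstract does not specify how k or the threshold are chosen.

```python
import numpy as np

def pca_mahalanobis(X, assign, k):
    """Mahalanobis distance between group means on the top-k principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered covariates.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T  # n x k matrix of component scores
    n1 = int(assign.sum())
    n0 = len(assign) - n1
    diff = Z[assign == 1].mean(axis=0) - Z[assign == 0].mean(axis=0)
    # Covariance of the difference in means under complete randomization.
    cov = np.atleast_2d(np.cov(Z, rowvar=False)) * (1.0 / n1 + 1.0 / n0)
    return float(diff @ np.linalg.solve(cov, diff))

def rerandomize(X, k, threshold, rng, max_tries=10000):
    """Redraw a balanced assignment until the PCA-based distance is acceptable."""
    n = X.shape[0]
    base = np.array([1] * (n // 2) + [0] * (n - n // 2))
    for _ in range(max_tries):
        assign = rng.permutation(base)
        if pca_mahalanobis(X, assign, k) <= threshold:
            return assign
    raise RuntimeError("no acceptable assignment found")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 10))
    a = rerandomize(X, k=2, threshold=1.0, rng=rng)
    print(a, pca_mahalanobis(X, a, 2))
```

Restricting the distance to k components means the acceptance rule concentrates on the directions of largest covariate variation rather than weighting all orthogonalized covariates equally.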

Preprint, 2021, arXiv

Unit Information Prior for Adaptive Information Borrowing from Multiple Historical Datasets

In clinical trials, there often exist multiple historical studies for the same or related treatment investigated in the current trial. Incorporating historical data in the analysis of the current study is of great importance, as it can help to gain more information, improve efficiency, and provide a more comprehensive evaluation of treatment. Enlightened by the unit information prior (UIP) concept in the reference Bayesian test, we propose a new informative prior called UIP from an information perspective that can adaptively borrow information from multiple historical datasets. We consider both binary and continuous data and also extend the new UIP methods to linear regression settings. Extensive simulation studies demonstrate that our method is comparable to other commonly used informative priors, while the interpretation of UIP is intuitive and its implementation is relatively easy. One distinctive feature of UIP is that its construction only requires summary statistics commonly reported in the literature rather than the patient-level data. By applying our UIP methods to phase III clinical trials for investigating the efficacy of memantine in Alzheimer's disease, we illustrate its ability of adaptively borrowing information from multiple historical datasets in the real application.

Preprint, 2020, arXiv

Adaptive Iterative Hessian Sketch via A-Optimal Subsampling

Iterative Hessian sketch (IHS) is an effective sketching method for modeling large-scale data. It was originally proposed by Pilanci and Wainwright (2016; JMLR) based on randomized sketching matrices. However, it is computationally intensive due to the iterative sketch process. In this paper, we analyze the IHS algorithm under the unconstrained least squares problem setting, then propose a deterministic approach for improving IHS via A-optimal subsampling. Our contributions are three-fold: (1) a good initial estimator based on the A-optimal design is suggested; (2) a novel ridged preconditioner is developed for repeated sketching; and (3) an exact line search method is proposed for determining the optimal step length adaptively. Extensive experimental results demonstrate that our proposed A-optimal IHS algorithm outperforms the existing accelerated IHS methods.
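For orientation, here is a sketch of the baseline randomized IHS update for unconstrained least squares, assuming NumPy and Gaussian sketching matrices. It is not the A-optimal variant proposed in the paper: the initial estimator, ridged preconditioner, and exact line search are not reproduced, and the function name and defaults are hypothetical.

```python
import numpy as np

def iterative_hessian_sketch(A, b, sketch_size, n_iter=10, seed=0):
    """Baseline IHS for min_x ||Ax - b||^2 with a fresh Gaussian sketch per iteration."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_iter):
        # Gaussian sketch scaled so that E[S^T S] is the identity.
        S = rng.normal(size=(sketch_size, n)) / np.sqrt(sketch_size)
        SA = S @ A
        # Exact gradient direction, with the Hessian replaced by its sketched version.
        grad = A.T @ (b - A @ x)
        x = x + np.linalg.solve(SA.T @ SA, grad)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(500, 5))
    b = A @ np.arange(1.0, 6.0) + 0.1 * rng.normal(size=500)
    print(iterative_hessian_sketch(A, b, sketch_size=200, n_iter=30))
```

Drawing and applying a new sketch at every iteration is the repeated cost that the abstract describes as computationally intensive.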

Preprint, 2020, arXiv

Demystify Lindley's Paradox by Interpreting P-value as Posterior Probability

In the hypothesis testing framework, p-value is often computed to determine rejection of the null hypothesis or not. On the other hand, Bayesian approaches typically compute the posterior probability of the null hypothesis to evaluate its plausibility. We revisit Lindley's paradox (Lindley, 1957) and demystify the conflicting results between Bayesian and frequentist hypothesis testing procedures by casting a two-sided hypothesis as a combination of two one-sided hypotheses along the opposite directions. This can naturally circumvent the ambiguities of assigning a point mass to the null and choices of using local or non-local prior distributions. As p-value solely depends on the observed data without incorporating any prior information, we consider non-informative prior distributions for fair comparisons with p-value. The equivalence of p-value and the Bayesian posterior probability of the null hypothesis can be established to reconcile Lindley's paradox. Extensive simulation studies are conducted with multivariate normal data and random effects models to examine the relationship between the p-value and posterior probability.

Preprint, 2020, arXiv

Efficient Unpaired Image Dehazing with Cyclic Perceptual-Depth Supervision

Image dehazing without paired haze-free images is of immense importance, as acquiring paired images often entails significant cost. However, we observe that previous unpaired image dehazing approaches tend to suffer from performance degradation near depth borders, where depth tends to vary abruptly. Hence, we propose to anneal the depth border degradation in unpaired image dehazing with cyclic perceptual-depth supervision. Coupled with the dual-path feature re-using backbones of the generators and discriminators, our model achieves a Peak Signal-to-Noise Ratio (PSNR) of 20.36 on the NYU Depth V2 dataset, significantly outperforming its predecessors with reduced Floating Point Operations (FLOPs).

Preprint, 2018, arXiv

P-value: A Bless or A Curse for Evidence-Based Studies?

As a convention, p-value is often computed in frequentist hypothesis testing and compared with the nominal significance level of 0.05 to determine whether or not to reject the null hypothesis. The smaller the p-value, the more significant the statistical test. We consider both one-sided and two-sided hypotheses in the composite hypothesis setting. For one-sided hypothesis tests, we establish the equivalence of p-value and the Bayesian posterior probability of the null hypothesis, which renders p-value an explicit interpretation of how strongly the data support the null. For two-sided hypothesis tests of a point null, we recast the problem as a combination of two one-sided hypotheses along the opposite directions and put forward the notion of a two-sided posterior probability, which also has an equivalent relationship with the (two-sided) p-value. Extensive simulation studies are conducted to demonstrate the Bayesian posterior probability interpretation for the p-value. Contrary to common criticisms of the use of p-value in evidence-based studies, we justify its utility and reclaim its importance from the Bayesian perspective, and recommend the continual use of p-value in hypothesis testing. After all, p-value is not all that bad.
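The one-sided equivalence can be checked in the simplest textbook case, which is an assumed setting rather than one taken from the paper: a normal mean with known standard deviation, H0: theta <= 0 against H1: theta > 0, and a flat prior on theta. The posterior is then N(xbar, sigma^2/n), so the posterior probability of H0 coincides with the one-sided p-value. Function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_sided_p_value(xbar, sigma, n):
    """P-value of the z-test for H0: theta <= 0 against H1: theta > 0."""
    z = xbar * math.sqrt(n) / sigma
    return 1.0 - normal_cdf(z)

def posterior_prob_null(xbar, sigma, n):
    """P(theta <= 0 | data) under a flat prior, where theta | data ~ N(xbar, sigma^2 / n)."""
    post_sd = sigma / math.sqrt(n)
    return normal_cdf((0.0 - xbar) / post_sd)

if __name__ == "__main__":
    # The two printed values agree up to floating-point rounding.
    print(one_sided_p_value(0.3, 1.0, 25), posterior_prob_null(0.3, 1.0, 25))
```

The agreement follows from the symmetry of the normal distribution: 1 - Phi(z) equals Phi(-z).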

Preprint, 2016, arXiv

Dynamic portfolio selection without risk-free assets

We consider the mean--variance portfolio optimization problem under the game theoretic framework and without risk-free assets. The problem is solved semi-explicitly by applying the extended Hamilton--Jacobi--Bellman equation. Although the coefficient of risk aversion in our model is a constant, the optimal amounts of money invested in each stock still depend on the current wealth in general. The optimal solution is obtained by solving a system of ordinary differential equations whose existence and uniqueness are proved and a numerical algorithm as well as its convergence speed are provided. Different from portfolio selection with risk-free assets, our value function is quadratic in the current wealth, and the equilibrium allocation is linearly sensitive to the initial wealth. Numerical results show that this model performs better than both the classical one and the variance model in a bull market.

Preprint, 2014, arXiv

Bayesian data augmentation dose finding with continual reassessment method and delayed toxicity

A major practical impediment when implementing adaptive dose-finding designs is that the toxicity outcome used by the decision rules may not be observed shortly after the initiation of the treatment. To address this issue, we propose the data augmentation continual reassessment method (DA-CRM) for dose finding. By naturally treating the unobserved toxicities as missing data, we show that such missing data are nonignorable in the sense that the missingness depends on the unobserved outcomes. The Bayesian data augmentation approach is used to sample both the missing data and model parameters from their posterior full conditional distributions. We evaluate the performance of the DA-CRM through extensive simulation studies and also compare it with other existing methods. The results show that the proposed design satisfactorily resolves the issues related to late-onset toxicities and possesses desirable operating characteristics: treating patients more safely and also selecting the maximum tolerated dose with a higher probability. The new DA-CRM is illustrated with two phase I cancer clinical trials.

Preprint, 2014, arXiv

Nonparametric maximum likelihood approach to multiple change-point problems

In multiple change-point problems, different data segments often follow different distributions, for which the changes may occur in the mean, scale or the entire distribution from one segment to another. Without the need to know the number of change-points in advance, we propose a nonparametric maximum likelihood approach to detecting multiple change-points. Our method does not impose any parametric assumption on the underlying distributions of the data sequence, which is thus suitable for detection of any changes in the distributions. The number of change-points is determined by the Bayesian information criterion and the locations of the change-points can be estimated via the dynamic programming algorithm and the use of the intrinsic order structure of the likelihood function. Under some mild conditions, we show that the new method provides consistent estimation with an optimal rate. We also suggest a prescreening procedure to exclude most of the irrelevant points prior to the implementation of the nonparametric likelihood method. Simulation studies show that the proposed method has satisfactory performance of identifying multiple change-points in terms of estimation accuracy and computation time.

Preprint, 2011, arXiv

Bayesian phase I/II adaptively randomized oncology trials with combined drugs

We propose a new integrated phase I/II trial design to identify the most efficacious dose combination that also satisfies certain safety requirements for drug-combination trials. We first take a Bayesian copula-type model for dose finding in phase I. After identifying a set of admissible doses, we immediately move the entire set forward to phase II. We propose a novel adaptive randomization scheme to favor assigning patients to more efficacious dose-combination arms. Our adaptive randomization scheme takes into account both the point estimate and variability of efficacy. By using a moving reference to compare the relative efficacy among treatment arms, our method achieves a high resolution to distinguish different arms. We also consider groupwise adaptive randomization when efficacy is late-onset. We conduct extensive simulation studies to examine the operating characteristics of the proposed design, and illustrate our method using a phase I/II melanoma clinical trial.
