Source author record

Dai Feng

Dai Feng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Machine Learning, math.ST, Methodology, Statistics Theory

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators

preprint · 2022 · arXiv

Target alignment in truncated kernel ridge regression

Kernel ridge regression (KRR) has recently attracted renewed interest due to its potential for explaining the transient effects, such as double descent, that emerge during neural network training. In this work, we study how the alignment between the target function and the kernel affects the performance of KRR. We focus on truncated KRR (TKRR), which uses an additional parameter that controls the spectral truncation of the kernel matrix. We show that for polynomial alignment, there is an over-aligned regime in which TKRR can achieve a faster rate than is achievable by full KRR. The rate of TKRR can improve all the way to the parametric rate, while that of full KRR is capped at a sub-optimal value. This shows that target alignment can be better leveraged by utilizing spectral truncation in kernel methods. We also consider the bandlimited alignment setting and show that the regularization surface of TKRR can exhibit transient effects including multiple descent and non-monotonic behavior. Our results show that there is a strong and quantifiable relation between the shape of the alignment spectrum and the generalization performance of kernel methods, both in terms of rates and in finite samples.
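The spectral truncation mentioned in this abstract can be illustrated with a minimal sketch. The exact estimator is not given here, so the form below is an assumption: keep the r largest eigenpairs of the normalized kernel matrix K/n and shrink each retained component by mu / (mu + lam). The function name `tkrr_fitted_values` and the parameters `lam` and `r` are hypothetical.

```python
import numpy as np

def tkrr_fitted_values(K, y, lam, r):
    """Fitted values of ridge regression on a rank-r truncation of K.

    Assumed form (not taken from the paper): retain the r largest
    eigenpairs of K/n and shrink each one by mu / (mu + lam).
    """
    n = K.shape[0]
    # eigh returns eigenvalues in ascending order; flip to descending.
    mu, U = np.linalg.eigh(K / n)
    mu, U = mu[::-1], U[:, ::-1]
    U_r, mu_r = U[:, :r], mu[:r]
    shrink = mu_r / (mu_r + lam)
    # Project y on the retained eigenvectors, shrink, and map back.
    return U_r @ (shrink * (U_r.T @ y))
```

Under this assumed form, setting r equal to the sample size keeps every eigenpair and recovers the fitted values of full KRR, K (K + n lam I)^{-1} y, so the truncation level r acts as the extra tuning parameter alongside lam.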

preprint · 2021 · arXiv

Nonparametric Analysis of Delayed Treatment Effects using Single-Crossing Constraints

Clinical trials involving novel immuno-oncology (IO) therapies frequently exhibit survival profiles which violate the proportional hazards assumption due to a delay in treatment effect, and in such settings, the survival curves in the two treatment arms may have a crossing before the two curves eventually separate. To flexibly model such scenarios, we describe a nonparametric approach for estimating the treatment arm-specific survival functions which constrains these two survival functions to cross at most once without making any additional assumptions about how the survival curves are related. A main advantage of our approach is that it provides an estimate of a crossing time if such a crossing exists, and moreover, our method generates interpretable measures of treatment benefit including crossing-conditional survival probabilities and crossing-conditional estimates of restricted residual mean life. We demonstrate the use and effectiveness of our approach with a large simulation study and an analysis of reconstructed outcomes from a recent combination-therapy trial.
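The single-crossing constraint in this abstract can be made concrete with a small diagnostic. This is only a check of the constraint on two curves evaluated on a shared time grid, not the nonparametric estimator the abstract describes; the function name `count_crossings` and the tolerance `tol` are hypothetical.

```python
def count_crossings(surv_a, surv_b, tol=1e-12):
    """Count sign changes of surv_a - surv_b along a shared time grid.

    Grid points where the curves coincide (within tol) are skipped, so a
    touch without a change of order is not counted as a crossing.
    """
    signs = []
    for a, b in zip(surv_a, surv_b):
        diff = a - b
        if abs(diff) > tol:
            signs.append(1 if diff > 0 else -1)
    # A crossing is a change of sign between consecutive nonzero differences.
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)
```

A pair of curves satisfies the constraint on the grid when `count_crossings` returns 0 or 1.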

preprint · 2020 · arXiv

DNNSurv: Deep Neural Networks for Survival Analysis Using Pseudo Values

There has been increasing interest in modelling survival data using deep learning methods in medical research. Current approaches have focused on designing special cost functions to handle censored survival data. We propose a very different method with two steps. In the first step, we transform each subject's survival time into a series of jackknife pseudo conditional survival probabilities. In the second step, we use these pseudo probabilities as a quantitative response variable in the deep neural network model. By using the pseudo values, we reduce a complex survival analysis to a standard regression problem, which greatly simplifies the neural network construction. Our two-step approach is simple, yet very flexible in making risk predictions for survival data, which is very appealing from a practical point of view. The source code is freely available at http://github.com/lilizhaoUM/DNNSurv.
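The pseudo-value idea in this abstract can be sketched in a few lines. The sketch assumes the simpler unconditional jackknife pseudo-observation at a single time point, n * S(t) - (n - 1) * S_(-i)(t) with S the Kaplan-Meier estimate, rather than the conditional, interval-based version the abstract refers to. The function names are hypothetical and this is not the code from the linked repository.

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t); events[i] is 1 for an event, 0 if censored."""
    surv = 1.0
    event_times = sorted({ti for ti, ei in zip(times, events) if ei == 1 and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == u and ei == 1)
        surv *= 1.0 - deaths / at_risk
    return surv

def pseudo_values(times, events, t):
    """Jackknife pseudo-observations of S(t), one per subject."""
    times, events = list(times), list(events)
    n = len(times)
    full = kaplan_meier(times, events, t)
    out = []
    for i in range(n):
        # Leave subject i out and re-estimate the survival probability.
        loo_times = times[:i] + times[i + 1:]
        loo_events = events[:i] + events[i + 1:]
        out.append(n * full - (n - 1) * kaplan_meier(loo_times, loo_events, t))
    return out
```

Without censoring, each pseudo-value reduces to the indicator that the subject survived past t. With censoring, the values are no longer restricted to 0 or 1, but they are defined for every subject, which is what lets them serve as an ordinary regression response.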

preprint · 2020 · arXiv

Random Forest (RF) Kernel for Regression, Classification and Survival

Breiman's random forest (RF) can be interpreted as an implicit kernel generator, where the ensuing proximity matrix represents the data-driven RF kernel. The kernel perspective on the RF has been used to develop a principled framework for theoretical investigation of its statistical properties. However, the practical utility of the links between kernels and the RF has not been widely explored or systematically evaluated. The focus of our work is the investigation of the interplay between kernel methods and the RF. We elucidate the performance and properties of the data-driven RF kernels used by regularized linear models in a comprehensive simulation study comprising continuous, binary and survival targets. We show that for continuous and survival targets, the RF kernels are competitive with the RF in higher-dimensional scenarios with a larger number of noisy features. For the binary target, the RF kernel and the RF exhibit comparable performance. As the RF kernel asymptotically converges to the Laplace kernel, we included it in our evaluation. For most simulation setups, the RF and the RF kernel outperformed the Laplace kernel. Nevertheless, in some cases the Laplace kernel was competitive, showing its potential value for applications. We also provide results from real-life data sets for regression, classification and survival to illustrate how these insights may be leveraged in practice. Finally, we discuss further extensions of the RF kernels in the context of interpretable prototype and landmarking classification, regression and survival. We outline a future line of research for kernels furnished by Bayesian counterparts of the RF.
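The proximity matrix mentioned in this abstract can be sketched as follows, assuming the common definition in which the proximity of two samples is the fraction of trees where they fall in the same leaf. The input `leaves` is a hypothetical table of per-tree leaf indices; with scikit-learn, for example, a fitted forest's `apply` method returns such indices. The function name `rf_proximity` is also hypothetical.

```python
def rf_proximity(leaves):
    """Proximity matrix from leaf assignments.

    leaves[i][b] is the index of the leaf that sample i reaches in tree b.
    Entry (i, j) of the result is the fraction of trees in which samples
    i and j share a leaf.
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            K[i][j] = K[j][i] = shared / n_trees
    return K
```

The resulting matrix is symmetric with ones on the diagonal, so it can be passed to a kernel method that accepts a precomputed kernel.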
