Variable Selection and Estimation with Censored Data

Title Variable Selection and Estimation with Censored Data
Author Yi Li
Pages 96
Release 2020

In clinical and epidemiological studies, it is possible to collect a large set of covariates that are potentially prognostic of the event time. For survival data with high-dimensional covariates, selecting the subset of covariates most strongly associated with the outcome has become an important objective. This dissertation focuses on variable selection and estimation with censored data.

In the first part, we consider robust modeling and variable selection for the accelerated failure time (AFT) model with right-censored data. We propose a unified Expectation-Maximization (EM) approach combined with the LASSO penalty to perform variable selection and parameter estimation simultaneously. Our approach can be used with general loss functions, and it reduces to the well-known Buckley-James method when the squared-error loss is used without regularization. To mitigate the effects of outliers and heavy-tailed noise in real applications, we recommend using robust loss functions within the proposed framework. Simulation studies evaluate the performance of the proposed approach with different loss functions, and an application to an ovarian cancer study is provided.

In the second part, we consider group and within-group variable selection for the AFT model with right-censored data. We extend the approach established in the first part by incorporating the group structure among the covariates. The LASSO penalty in the proposed EM approach is replaced by the sparse group LASSO (SGL) penalty in order to select both groups and covariates within a group. We conduct simulation studies to assess the performance of the proposed approach with the SGL penalty, compare it with the approach proposed in the first part, and apply it to the same ovarian cancer data.

In the third part, we consider variable selection with interval-censored data. We study a class of semiparametric linear transformation models, which includes the Cox proportional hazards and proportional odds models as special cases. We propose a penalized nonparametric maximum likelihood estimation (NPMLE) approach to perform variable selection and parameter estimation simultaneously for this class of models. Efficient computation of the penalized NPMLE is achieved by a modified iterative convex minorant (ICM) algorithm combined with the coordinate descent algorithm. The proposed approach is evaluated by simulation studies and applied to the Atherosclerosis Risk in Communities (ARIC) study.
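The two penalties named in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the argument layout, and the square-root group-size weight in the SGL penalty are assumptions (the weight follows a common convention, and the abstract does not state which form is used). `soft_threshold` is the standard closed-form coordinate update for a LASSO-penalized squared-error loss.

```python
import math


def soft_threshold(z, lam):
    """Soft-thresholding operator, the one-coordinate LASSO update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def sgl_penalty(beta, groups, lam1, lam2):
    """Sparse group LASSO penalty (assumed form):
    lam1 * ||beta||_1 + lam2 * sum over groups of sqrt(p_g) * ||beta_g||_2,
    where p_g is the number of coefficients in group g.
    """
    l1_term = sum(abs(b) for b in beta)
    group_term = 0.0
    for idx in groups:  # idx lists the coefficient positions of one group
        group_norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        group_term += math.sqrt(len(idx)) * group_norm
    return lam1 * l1_term + lam2 * group_term
```

The L1 term zeroes out individual coefficients, while the group term can zero out a whole group at once, which is what allows selection at both levels.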

Model Evaluation and Variable Selection for Interval-censored Data

Title Model Evaluation and Variable Selection for Interval-censored Data
Author Tyler Cook
Pages 77
Release 2015

Survival analysis is a popular area of statistics dealing with time-to-event data. This type of data arises in many disciplines, but it is perhaps most commonly encountered in medical studies. Doctors, for example, might be testing different treatments developed to prolong the lifetimes of cancer patients. In practical problems such as clinical trials, the data are often incomplete because patients drop out of the study. This results in censoring, a special characteristic of survival data. There are many different types of censoring. This dissertation focuses on the analysis of interval-censored data, where the failure time is known only to belong to some interval of observation times.

One problem that researchers face when analyzing survival data is how to handle the censoring distribution. It is often assumed that the observation process generating the censoring is independent of the event time of interest, so that the observation process can effectively be ignored. However, this assumption is not always realistic, and one cannot generally test for independent censoring without additional assumptions or information. The researcher is therefore faced with a choice between methods designed for informative censoring and methods designed for noninformative censoring. Chapters 2 and 3 of this dissertation investigate the effectiveness of different methods developed for the analysis of informative case I and case II interval-censored data under both types of censoring. Extensive simulation studies indicate that the methods produce unbiased results in the presence of both informative and noninformative censoring. The efficiency of the informative censoring methods is then compared with approaches created to handle noninformative censoring. The results of these simulation studies can provide guidelines for deciding between models in a practical problem where the dependence of the censoring distribution is uncertain.
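To make the two censoring schemes concrete, the sketch below converts raw observations into the interval known to contain the failure time. It uses the usual textbook definitions: case I (current status) data have one observation time, and case II data have two. The function names and argument layout are hypothetical and are not taken from the dissertation.

```python
import math


def case1_interval(c, failed_by_c):
    """Case I (current status): the subject is examined once, at time c.
    We learn only whether failure occurred by c.
    """
    return (0.0, c) if failed_by_c else (c, math.inf)


def case2_interval(u, v, failed_by_u, failed_by_v):
    """Case II: the subject is examined at two times u < v.
    The failure time lies in (0, u], (u, v], or (v, infinity).
    """
    if not u < v:
        raise ValueError("observation times must satisfy u < v")
    if failed_by_u:
        return (0.0, u)
    if failed_by_v:
        return (u, v)
    return (v, math.inf)
```

An interval with an infinite right endpoint corresponds to a right-censored observation; one starting at zero corresponds to a left-censored observation.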
Another important problem in survival analysis is determining the set of predictors that are significantly related to the failure time being studied. Variable selection has received substantial attention both in classical linear models and in survival analysis, largely because recent technological advances make it easier for researchers in biology to collect huge amounts of genetic data. For example, a researcher with access to expression levels for hundreds of genes may want to identify which of those genes predict tumor development time in cancer patients. One must sift through the large number of genes to find the small set that influences tumor growth. Several methods using penalized likelihood procedures have been proposed to perform parameter estimation and variable selection simultaneously. A number of these techniques have also been extended to right-censored survival data, but little has been done in the context of interval censoring.

In Chapter 4, we propose an imputation approach for variable selection with interval-censored data that utilizes these penalized likelihood procedures. The method uses imputation to create a new dataset of imputed exact failure times and right-censored observations. Variable selection can then be performed on the imputed dataset using any of the popular variable selection techniques created for right-censored data. Comprehensive simulation studies illustrate the effectiveness of this new approach. The method is also attractive because it is easy to implement: it can take advantage of existing software for variable selection with right-censored data.
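The imputation step can be sketched as follows. The abstract does not say how exact failure times are imputed, so this example substitutes simple midpoint imputation purely for illustration; the function name and the `(time, event)` output layout are likewise assumptions.

```python
import math


def impute_midpoint(intervals):
    """Convert (left, right] intervals into (time, event) pairs.

    Finite intervals become an exact failure time at the midpoint (event = 1).
    Intervals with an infinite right endpoint become right-censored
    observations at the left endpoint (event = 0).
    Midpoint imputation is an illustrative stand-in, not the source's scheme.
    """
    imputed = []
    for left, right in intervals:
        if math.isinf(right):
            imputed.append((left, 0))
        else:
            imputed.append(((left + right) / 2.0, 1))
    return imputed
```

The resulting pairs have the shape that software for right-censored data typically expects, so a penalized method for right-censored data could then be applied to them.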

The Statistical Analysis of Interval-censored Failure Time Data

Title The Statistical Analysis of Interval-censored Failure Time Data
Author Jianguo Sun
Publisher Springer
Pages 310
Release 2007-05-26
Genre Mathematics
ISBN 0387371192

This book collects and unifies statistical models and methods that have been proposed for analyzing interval-censored failure time data. It provides the first comprehensive coverage of the topic of interval-censored data and complements the books on right-censored data. The focus of the book is on nonparametric and semiparametric inferences, but it also describes parametric and imputation approaches. This book provides an up-to-date reference for people who are conducting research on the analysis of interval-censored failure time data as well as for those who need to analyze interval-censored data to answer substantive questions.

Sparse Estimation and Inference for Censored Median Regression

Title Sparse Estimation and Inference for Censored Median Regression
Author Justin Hall Shows
Pages 84
Release 2009

Keywords: censored data, median regression, variable selection.

Regression Models

Title Regression Models
Author Richard Breen
Publisher SAGE
Pages 92
Release 1996-01-09
Genre Mathematics
ISBN 9780803957107

This book introduces the regression models needed when the outcome variable observed for a sample is not representative of the population to which the results are to be generalized.

Contemporary Multivariate Analysis and Design of Experiments

Title Contemporary Multivariate Analysis and Design of Experiments
Author Kaitai Fang
Publisher World Scientific
Pages 470
Release 2005
Genre Mathematics
ISBN 9812567763

Index. Subject index -- Author index

Inverse Regression Estimation for Censored Data

Title Inverse Regression Estimation for Censored Data
Author Nivedita Vikas Nadkarni
Pages 98
Release 2007