Multiple Testing Procedures with Applications to Genomics

Title Multiple Testing Procedures with Applications to Genomics
Author Sandrine Dudoit
Publisher Springer
Pages 0
Release 2008-11-01
Genre Science
ISBN 9780387517094

This book establishes the theoretical foundations of a general methodology for multiple hypothesis testing and discusses its software implementation in R and SAS. The methods are applied to a range of problems in biomedical and genomic research, including identification of differentially expressed and co-expressed genes in high-throughput gene expression experiments; tests of association between gene expression measures and biological annotation metadata; sequence analysis; and genetic mapping of complex traits using single nucleotide polymorphisms. The procedures are based on a joint null distribution for the test statistics and provide Type I error control in testing problems involving general data-generating distributions, null hypotheses, and test statistics.

Multiple Testing Procedures with Applications to Genomics

Title Multiple Testing Procedures with Applications to Genomics
Author Sandrine Dudoit
Publisher Springer Science & Business Media
Pages 611
Release 2007-12-18
Genre Science
ISBN 0387493174

Multiple Hypothesis Testing

Title Multiple Hypothesis Testing
Author Houston Nash Gilbert
Publisher
Pages 372
Release 2009
Genre
ISBN

Resampling-Based Multiple Testing

Title Resampling-Based Multiple Testing
Author Peter H. Westfall
Publisher John Wiley & Sons
Pages 382
Release 1993-01-12
Genre Mathematics
ISBN 9780471557616

Combines recent developments in resampling technology (including the bootstrap) with new methods for multiple testing that are easy to use, convenient to report, and widely applicable. Software from SAS Institute is available to execute many of the methods, and programming is straightforward for other applications. Explains how to summarize results using adjusted p-values, which do not require cumbersome table look-ups. Demonstrates how to incorporate logical constraints among hypotheses, further improving power.
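
To make the idea of an adjusted p-value concrete, here is a minimal sketch in Python. It uses Holm's step-down adjustment, which is a simple non-resampling method chosen only for illustration; it is not the resampling-based procedure developed in the book. The function name `holm_adjust` and the input values are hypothetical.

```python
def holm_adjust(pvalues):
    """Return Holm step-down adjusted p-values, in the input order.

    Illustrative only: Holm's method stands in here for the book's
    resampling-based adjustments, which are not reproduced.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by m - k + 1.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
    for p, q in zip(raw, holm_adjust(raw)):
        print(f"raw={p:.3f}  adjusted={q:.3f}")
```

Each adjusted p-value can be compared directly with the chosen familywise error level, so no critical-value table is needed.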

Multiple Testing Procedures Controlling False Discovery Rate with Applications to Genomic Data

Title Multiple Testing Procedures Controlling False Discovery Rate with Applications to Genomic Data
Author Iris Mirales Gauran
Publisher
Pages 320
Release 2018
Genre
ISBN

In recent mutation studies, analyses based on protein domain positions are gaining popularity over traditional gene-centric approaches, because the latter are limited in their ability to account for the functional context that the position of a mutation provides. This presents a large-scale simultaneous inference problem, with hundreds of hypothesis tests to consider at the same time. The overarching objective of this thesis is to propose multiple testing procedures that address the problems posed by discrete genomic data. Specifically, we are interested in identifying significant mutation counts while controlling Type I error at a given level via False Discovery Rate (FDR) procedures. One main assumption is that the mutation counts follow a zero-inflated model, which accounts for both the true zeros in the count model and the excess zeros. The class of models considered is the Zero-inflated Generalized Poisson (ZIGP) distribution.
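
As a point of reference for what an FDR procedure does, the sketch below implements the standard Benjamini-Hochberg step-up adjustment in Python. This is an assumption-laden illustration, not one of the procedures proposed in the thesis: it treats the p-values as continuous and independent (or positively dependent), whereas the thesis targets discrete mutation counts. The function name `benjamini_hochberg` and the inputs are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return (adjusted p-values, rejection flags) under Benjamini-Hochberg.

    Illustrative baseline only; it does not implement the discrete-data
    procedures described in the thesis abstract.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum keeps
    # the adjusted values monotone in the rank.
    for rank in range(m - 1, -1, -1):
        idx = order[rank]
        candidate = pvalues[idx] * m / (rank + 1)
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected


if __name__ == "__main__":
    raw = [0.01, 0.02, 0.03, 0.04, 0.5]  # hypothetical raw p-values
    adj, rej = benjamini_hochberg(raw, alpha=0.1)
    for p, q, r in zip(raw, adj, rej):
        print(f"raw={p:.3f}  adjusted={q:.3f}  reject={r}")
```

With discrete test statistics such a baseline tends to be conservative, which is one reason procedures tailored to discrete data are of interest.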

Large-scale Multiple Hypothesis Testing with Complex Data Structure

Title Large-scale Multiple Hypothesis Testing with Complex Data Structure
Author Xiaoyu Dai
Publisher
Pages 104
Release 2018
Genre Electronic dissertations
ISBN

In the last decade, motivated by a variety of applications in medicine, bioinformatics, genomics, brain imaging, etc., a growing amount of statistical research has been devoted to large-scale multiple testing, where thousands or more tests are conducted simultaneously. However, due to the complexity of real data sets, the assumptions of many existing multiple testing procedures, e.g. that tests are independent and that p-values have continuous null distributions, may not hold. This limits their performance, leading to low detection power and an inflated false discovery rate (FDR). In this dissertation, we study how to better approach multiple testing problems under complex data structures. In Chapter 2, we study multiple testing with discrete test statistics. In Chapter 3, we study discrete multiple testing that incorporates prior ordering information. In Chapter 4, we study multiple testing under a complex dependency structure. We propose novel procedures for each scenario, based on the marginal critical functions (MCFs) of randomized tests, the conditional random field (CRF), or the deep neural network (DNN). The theoretical properties of our procedures are carefully studied, and their performance is evaluated through various simulations and real applications involving the analysis of genetic data from next-generation sequencing (NGS) experiments.

Some New Developments on Multiple Testing Procedures

Title Some New Developments on Multiple Testing Procedures
Author Lilun Du
Publisher
Pages 134
Release 2015
Genre
ISBN

In the context of large-scale multiple testing, hypotheses are often accompanied by certain prior information. In Chapter 2, we present a single-index modulated multiple testing procedure, which maintains control of the false discovery rate while incorporating prior information, by assuming the availability of a bivariate p-value for each hypothesis. To find the optimal rejection region for the bivariate p-value, we propose a criterion based on the ratio of the probability density functions of the bivariate p-value under the true null and the non-null. In the bivariate normal setting, this criterion further motivates us to project the bivariate p-value to a single-index p-value, for a wide range of directions. The true null distribution of the single-index p-value is estimated via parametric and nonparametric approaches, leading to two procedures for estimating and controlling the false discovery rate. To derive the optimal projection direction, we propose a new approach based on power comparison, which is further shown to be consistent under some mild conditions. Multiple testing based on chi-squared test statistics is commonly used in many scientific fields, such as genomics research and brain imaging studies. However, the challenges associated with designing a formal testing procedure when there is a general dependence structure across the chi-squared test statistics have not been well addressed. In Chapter 3, we propose a Factor Connected procedure to fill this gap. We first adopt a latent factor structure to construct a testing framework for approximating the false discovery proportion (FDP) for a large number of highly correlated chi-squared test statistics with finite degrees of freedom k. The testing framework is then connected to simultaneously testing k linear constraints in a large-dimensional linear factor model involving some observable and unobservable common factors, resulting in a consistent estimator of the FDP based on the associated unadjusted p-values.