Dimension Reduction in Statistical Modeling

Title Dimension Reduction in Statistical Modeling PDF eBook
Author Linquan Ma (Ph.D.)
Publisher
Pages 0
Release 2022
Genre
ISBN

When a data object is described by a large number of features, it is often beneficial to reduce the dimension of the data so that statistical analyses achieve better efficiency. Recently, a new dimension reduction method, the envelope method of Cook, Li, and Chiaromonte (2010), was proposed for multivariate regression. It has the potential to gain substantial efficiency over the standard least squares estimator. Chapter 2 proposes an approach for applying the envelope method when the predictors and/or the responses are missing at random. When data are missing, the envelope method applied to the complete cases alone may yield biased and inefficient results. We incorporate the envelope structure into the expectation-maximization (EM) algorithm. Our method is guaranteed to be at least as efficient as, and often more efficient than, the standard EM algorithm. We derive asymptotic properties of our method in both normal and non-normal cases. Chapter 3 extends the envelope model to the mixed effects model for longitudinal data with possibly unbalanced designs and time-varying predictors. We show that our model provides more efficient estimators than the standard estimators in mixed effects models. Chapter 4 proposes a semiparametric variant of the inner envelope model (Su and Cook, 2012) that relies on neither the linear model nor the normality assumption. We show that our proposal leads to globally and locally efficient estimators of the inner envelope spaces. We also present a computationally tractable algorithm to estimate the inner envelope. Instrumental variables (IV) are frequently used in observational studies to recover the effect of an exposure in the presence of unmeasured confounding. A key fact is that the strength of the IV matters: an IV with a stronger association with the exposure yields a more accurate estimate of the causal effect. While it is hard to find a stronger IV, we generalize a sufficient dimension reduction method to remove immaterial IVs.
Chapter 5 investigates two ways of incorporating the envelope method into IV regression. We show that the first-stage envelope method does not yield any efficiency gain over the standard IV estimator; however, it may reduce the finite-sample bias. The second-stage envelope can achieve substantial efficiency gains under certain conditions.
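The envelope refinements above are layered on top of standard IV estimation. As background, here is a minimal two-stage least squares (2SLS) sketch in numpy, with simulated data and no envelope step; the function and the simulation are illustrative, not taken from the dissertation:

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """Standard 2SLS: regress X on the instruments Z (first stage),
    then regress y on the first-stage fitted values (second stage)."""
    # First stage: project the endogenous regressors onto the instrument space.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: OLS of y on the fitted values.
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    return beta

# Simulated example: one endogenous exposure, one instrument,
# and an unmeasured confounder u affecting both exposure and outcome.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 1))                  # instrument
u = rng.normal(size=(n, 1))                  # unmeasured confounder
x = 0.8 * z + u + rng.normal(size=(n, 1))    # exposure, confounded by u
y = 2.0 * x + u + rng.normal(size=(n, 1))    # outcome; true causal effect is 2.0
beta = two_stage_least_squares(y, x, z)      # near 2.0, while naive OLS is biased upward
```

The first-stage coefficient 0.8 controls instrument strength; shrinking it toward zero illustrates the abstract's point that a weaker IV gives a less accurate causal estimate.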

Sufficient Dimension Reduction

Title Sufficient Dimension Reduction PDF eBook
Author Bing Li
Publisher CRC Press
Pages 362
Release 2018-04-27
Genre Mathematics
ISBN 1351645730

Sufficient dimension reduction is a rapidly developing research field with wide applications in regression diagnostics, data visualization, machine learning, genomics, image processing, pattern recognition, and medicine, fields that produce large datasets with many variables. Sufficient Dimension Reduction: Methods and Applications with R introduces the basic theories and the main methodologies, provides practical and easy-to-use algorithms and computer code to implement them, and surveys recent advances at the frontiers of the field.
Features:
Provides comprehensive coverage of this emerging research field.
Synthesizes a wide variety of dimension reduction methods under a few unifying principles, such as projection in Hilbert spaces, kernel mapping, and von Mises expansion.
Reflects the most recent advances, such as nonlinear sufficient dimension reduction, dimension folding for tensorial data, and sufficient dimension reduction for functional data.
Includes a set of computer codes written in R that are easily implemented by readers.
Uses real data sets available online to illustrate the usage and power of the described methods.
Sufficient dimension reduction has undergone momentous development in recent years, partly due to the increased demand for techniques to process high-dimensional data, a hallmark of our age of Big Data. This book will serve as a perfect entry into the field for beginning researchers and a handy reference for advanced ones. The author, Bing Li, obtained his Ph.D. from the University of Chicago. He is currently a Professor of Statistics at the Pennsylvania State University. His research interests cover sufficient dimension reduction, statistical graphical models, functional data analysis, machine learning, estimating equations and quasilikelihood, and robust statistics. He is a fellow of the Institute of Mathematical Statistics and the American Statistical Association.
He is an Associate Editor for The Annals of Statistics and the Journal of the American Statistical Association.
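Sliced inverse regression (SIR; Li, 1991) is one of the classical methods the field builds on: it slices the data by the response, averages the standardized predictors within slices, and extracts the leading eigenvectors of the resulting covariance of slice means. A minimal numpy sketch, not the book's R code, with invented example data:

```python
import numpy as np

def sir_directions(X, y, n_slices=10, d=1):
    """Sliced inverse regression: estimate d directions of the
    central subspace from slice means of the standardized predictors."""
    n, p = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # Whitening transform: Z = (X - mu) Sigma^{-1/2}.
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_half = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ Sigma_inv_half
    # Slice the observations by the order of y; average Z within each slice.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Top eigenvectors of M, mapped back to the original predictor scale.
    _, vecs = np.linalg.eigh(M)
    return Sigma_inv_half @ vecs[:, -d:]

# Example: y depends on 5-dimensional X only through the index x1 + x2.
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 5))
y = (X[:, 0] + X[:, 1]) ** 3 + 0.1 * rng.normal(size=2000)
b = sir_directions(X, y).ravel()
b = b / np.linalg.norm(b)   # close to (1,1,0,0,0)/sqrt(2) up to sign
```

The recovered direction is identified only up to sign and scale, which is why the example normalizes it before inspection.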

Machine Learning Techniques for Multimedia

Title Machine Learning Techniques for Multimedia PDF eBook
Author Matthieu Cord
Publisher Springer Science & Business Media
Pages 297
Release 2008-02-07
Genre Computers
ISBN 3540751718

Processing multimedia content has emerged as a key area for the application of machine learning techniques, where the objectives are to provide insight into the domain from which the data is drawn, to organize that data, and to improve the performance of the processes manipulating it. Arising from the EU MUSCLE network, this multidisciplinary book provides comprehensive coverage of the most important machine learning techniques used in this domain and their applications.

Feature Engineering and Selection

Title Feature Engineering and Selection PDF eBook
Author Max Kuhn
Publisher CRC Press
Pages 266
Release 2019-07-25
Genre Business & Economics
ISBN 1351609467

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques, along with R programs for reproducing the results.
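One common subset-selection technique is greedy forward selection: repeatedly add the predictor that most reduces the residual sum of squares of an ordinary least squares fit. A small self-contained sketch in numpy (the book's own examples use R, and this data set is invented):

```python
import numpy as np

def forward_select(X, y, k):
    """Greedy forward selection: grow a predictor subset of size k,
    each step adding the column that most reduces the OLS residual
    sum of squares."""
    n, p = X.shape
    selected, remaining = [], list(range(p))
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in remaining:
            # Fit OLS with an intercept on the candidate subset.
            A = np.column_stack([np.ones(n), X[:, selected + [j]]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ beta) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Example: of 8 predictors, only columns 0 and 3 carry signal.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 8))
y = 3 * X[:, 0] - 2 * X[:, 3] + 0.5 * rng.normal(size=500)
chosen = forward_select(X, y, k=2)   # picks columns 0 and 3
```

Greedy selection is fast but can miss jointly informative predictors; that trade-off is one reason subset selection deserves the dedicated treatment this book gives it.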

Principal Manifolds for Data Visualization and Dimension Reduction

Title Principal Manifolds for Data Visualization and Dimension Reduction PDF eBook
Author Alexander N. Gorban
Publisher Springer Science & Business Media
Pages 361
Release 2007-09-11
Genre Technology & Engineering
ISBN 3540737502

The book opens with a quotation of Pearson's classical definition of PCA and includes reviews of various methods: NLPCA, ICA, MDS, embedding and clustering algorithms, principal manifolds, and SOM. New approaches to NLPCA, principal manifolds, branching principal components, and topology-preserving mappings are described. The presentation of algorithms is supplemented by case studies. The volume ends with a tutorial, "PCA deciphers genome".
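Pearson's definition of PCA, finding the low-dimensional linear subspace closest to the centered data, can be computed directly from the singular value decomposition. An illustrative sketch, not taken from the volume:

```python
import numpy as np

def pca(X, n_components):
    """Classical (Pearson) PCA via SVD of the centered data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # principal directions (rows)
    scores = Xc @ components.T                   # projected coordinates
    explained = s[:n_components] ** 2 / np.sum(s ** 2)  # variance fractions
    return scores, components, explained

# Example: 3-D data lying close to a 2-D plane, plus small noise.
rng = np.random.default_rng(3)
latent = rng.normal(size=(300, 2))
A = np.array([[1.0, 0.0], [0.5, 1.0], [0.2, -0.3]])
X = latent @ A.T + 0.01 * rng.normal(size=(300, 3))
scores, comps, ratio = pca(X, 2)   # two components capture nearly all variance
```

Nonlinear generalizations surveyed in the book (NLPCA, principal manifolds) replace this flat subspace with a curved one while keeping the same best-approximation idea.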
