Dimension Reduction in Statistical Modeling

Dimension Reduction in Statistical Modeling
Title Dimension Reduction in Statistical Modeling PDF eBook
Author Linquan Ma (Ph.D.)
Release 2022

When a data object is described by a large number of features, it is often beneficial to reduce the dimension of the data so that the statistical analysis is more efficient. The envelope method, a dimension reduction method introduced by Cook, Li, and Chiaromonte (2010) for multivariate regression, has the potential to gain substantial efficiency over the standard least squares estimator. Chapter 2 proposes an approach for using the envelope method when the predictors and/or the responses are missing at random. When data are missing, the envelope method applied to the complete-case observations may lead to biased and inefficient results. We incorporate the envelope structure into the expectation-maximization (EM) algorithm. Our method is guaranteed to be at least as efficient as the standard EM algorithm. We give asymptotic properties of our method under both normal and non-normal cases. Chapter 3 extends the envelope model to the mixed effects model for longitudinal data with possibly unbalanced designs and time-varying predictors. We show that our model provides more efficient estimators than the standard estimators in mixed effects models. Chapter 4 proposes a semiparametric variant of the inner envelope model (Su and Cook, 2012) that relies on neither the linear model nor the normality assumption. We show that our proposal leads to globally and locally efficient estimators of the inner envelope spaces. We also present a computationally tractable algorithm to estimate the inner envelope. Instrumental variables (IVs) are frequently used in observational studies to recover the effect of an exposure in the presence of unmeasured confounding. A key fact is that the strength of an IV matters: an IV with a stronger association with the exposure yields a more accurate estimate of the causal effect. While a stronger IV is hard to find, we generalize a sufficient dimension reduction method to remove immaterial IVs. Chapter 5 investigates two ways of incorporating the envelope method into IV regression. We show that the first-stage envelope method does not yield any efficiency gain over the standard IV estimator; however, it may reduce the finite-sample bias. The second-stage envelope can achieve substantial efficiency gain under certain conditions.

Sufficient Dimension Reduction

Sufficient Dimension Reduction
Title Sufficient Dimension Reduction PDF eBook
Author Bing Li
Publisher CRC Press
Pages 362
Release 2018-04-27
Genre Mathematics
ISBN 1351645730

Sufficient dimension reduction is a rapidly developing research field with wide applications in regression diagnostics, data visualization, machine learning, genomics, image processing, pattern recognition, and medicine, all fields that produce large datasets with many variables. Sufficient Dimension Reduction: Methods and Applications with R introduces the basic theories and the main methodologies, provides practical and easy-to-use algorithms and computer code to implement these methodologies, and surveys recent advances at the frontiers of the field. Features: Provides comprehensive coverage of this emerging research field. Synthesizes a wide variety of dimension reduction methods under a few unifying principles such as projection in Hilbert spaces, kernel mapping, and von Mises expansion. Reflects the most recent advances, such as nonlinear sufficient dimension reduction, dimension folding for tensorial data, and sufficient dimension reduction for functional data. Includes a set of computer codes written in R that readers can easily implement. Uses real data sets available online to illustrate the usage and power of the described methods. Sufficient dimension reduction has undergone momentous development in recent years, partly due to the increased demand for techniques to process high-dimensional data, a hallmark of our age of Big Data. This book will serve as the perfect entry into the field for beginning researchers and a handy reference for advanced ones. The author, Bing Li, obtained his Ph.D. from the University of Chicago. He is currently a Professor of Statistics at the Pennsylvania State University. His research interests cover sufficient dimension reduction, statistical graphical models, functional data analysis, machine learning, estimating equations and quasilikelihood, and robust statistics. He is a fellow of the Institute of Mathematical Statistics and the American Statistical Association. He is an Associate Editor for The Annals of Statistics and the Journal of the American Statistical Association.

Machine Learning Techniques for Multimedia

Machine Learning Techniques for Multimedia
Title Machine Learning Techniques for Multimedia PDF eBook
Author Matthieu Cord
Publisher Springer Science & Business Media
Pages 297
Release 2008-02-07
Genre Computers
ISBN 3540751718

Processing multimedia content has emerged as a key area for the application of machine learning techniques, where the objectives are to provide insight into the domain from which the data is drawn, to organize that data, and to improve the performance of the processes manipulating it. Arising from the EU MUSCLE network, this multidisciplinary book provides comprehensive coverage of the most important machine learning techniques and their application in this domain.

Feature Engineering and Selection

Feature Engineering and Selection
Title Feature Engineering and Selection PDF eBook
Author Max Kuhn
Publisher CRC Press
Pages 266
Release 2019-07-25
Genre Business & Economics
ISBN 1351609467

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques along with R programs for reproducing the results.

Principal Manifolds for Data Visualization and Dimension Reduction

Principal Manifolds for Data Visualization and Dimension Reduction
Title Principal Manifolds for Data Visualization and Dimension Reduction PDF eBook
Author Alexander N. Gorban
Publisher Springer Science & Business Media
Pages 361
Release 2007-09-11
Genre Technology & Engineering
ISBN 3540737502

The book opens with a quotation of Pearson's classical definition of PCA and includes reviews of various methods: NLPCA, ICA, MDS, embedding and clustering algorithms, principal manifolds, and SOM. New approaches to NLPCA, principal manifolds, branching principal components, and topology-preserving mappings are described. The presentation of algorithms is supplemented by case studies. The volume ends with a tutorial, "PCA deciphers genome."

A Survey of Statistical Network Models

A Survey of Statistical Network Models
Title A Survey of Statistical Network Models PDF eBook
Author Anna Goldenberg
Publisher Now Publishers Inc
Pages 118
Release 2010
Genre Computers
ISBN 1601983204

Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active network community and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online networking communities such as Facebook, MySpace, and LinkedIn, along with a host of more specialized professional network communities, have intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics.