Predictive Modeling for High Dimensional Longitudinal Data

Title Predictive Modeling for High Dimensional Longitudinal Data
Author Junjie Liang
Release 2022

Longitudinal studies, which involve repeated observations of a set of individuals taken at irregularly spaced time points, are ubiquitous in many applications. Predictive models for longitudinal data generally need to account for correlation in the data, i.e., correlation among repeated observations of the same individual and/or correlation among groups of individuals. Ignoring either kind of correlation can lead to misleading statistical inferences, yet it can be non-trivial to choose a correlation structure that reflects the correlations present in the data. The relationships between the variables and the outcomes of interest can also be highly complex and non-linear. Furthermore, modern applications often call for longitudinal methods that scale gracefully with an increasing number of variables and millions of data points.

The goal of this dissertation is to address these challenges in longitudinal data analysis using machine learning and representation learning approaches. Specifically, our work is dedicated to redesigning state-of-the-art longitudinal models to fit large-scale, high-dimensional longitudinal settings. We focus on improving mixed effects models and non-parametric models by answering the following research questions: (i) How can we design mixed effects models to handle longitudinal data with thousands of variables and automate the selection between fixed and random effects? (ii) How can we design non-parametric models to handle longitudinal data with time-varying and time-invariant effects and automate the discovery of complex correlation? (iii) How can we design non-parametric models to handle longitudinal data with outcomes that could show state transitions, abrupt discontinuities and complex correlation?

Against this background, this dissertation investigates two lines of approaches, Factorization Machines and Gaussian Processes. We tackle both the theoretical and practical challenges in adapting these approaches to longitudinal settings. For each proposed model, we explore provably efficient algorithms to improve its applicability to high-dimensional data.

Modern Statistics with R

Title Modern Statistics with R
Author Måns Thulin
Publisher CRC Press
Release 2024-08-20
Genre Mathematics
ISBN 9781032512440

The past decades have transformed the world of statistical data analysis, with new methods, new types of data, and new computational tools. Modern Statistics with R introduces you to key parts of this modern statistical toolkit. It teaches you:
Data wrangling - importing, formatting, reshaping, merging, and filtering data in R.
Exploratory data analysis - using visualisations and multivariate techniques to explore datasets.
Statistical inference - modern methods for testing hypotheses and computing confidence intervals.
Predictive modelling - regression models and machine learning methods for prediction, classification, and forecasting.
Simulation - using simulation techniques for sample size computations and evaluations of statistical methods.
Ethics in statistics - ethical issues and good statistical practice.
R programming - writing code that is fast, readable, and (hopefully!) free from bugs.
No prior programming experience is necessary. Clear explanations and examples are provided to accommodate readers at all levels of familiarity with statistical principles and coding practices. A basic understanding of probability theory can enhance comprehension of certain concepts discussed within this book. In addition to plenty of examples, the book includes more than 200 exercises, with fully worked solutions available at: www.modernstatisticswithr.com.

High-dimensional Data Analysis

Title High-dimensional Data Analysis
Author Tony Cai; Xiaotong Shen
Pages 318
ISBN 9787894236326

Over the last few years, significant developments have been taking place in high-dimensional data analysis, driven primarily by a wide range of applications in many fields such as genomics and signal processing. In particular, substantial advances have been made in the areas of feature selection, covariance estimation, classification and regression. This book intends to examine important issues arising from high-dimensional data analysis to explore key ideas for statistical inference and prediction. It is structured around topics on multiple hypothesis testing, feature selection, regression, cla.

High-Dimensional Covariance Estimation

Title High-Dimensional Covariance Estimation
Author Mohsen Pourahmadi
Publisher John Wiley & Sons
Pages 204
Release 2013-06-24
Genre Mathematics
ISBN 1118034295

Methods for estimating sparse and large covariance matrices
Covariance and correlation matrices play fundamental roles in every aspect of the analysis of multivariate data collected from a variety of fields including business and economics, health care, engineering, and environmental and physical sciences. High-Dimensional Covariance Estimation provides accessible and comprehensive coverage of the classical and modern approaches for estimating covariance matrices as well as their applications to the rapidly developing areas lying at the intersection of statistics and machine learning. Recently, the classical sample covariance methodologies have been modified and improved upon to meet the needs of statisticians and researchers dealing with large correlated datasets. High-Dimensional Covariance Estimation focuses on the methodologies based on shrinkage, thresholding, and penalized likelihood with applications to Gaussian graphical models, prediction, and mean-variance portfolio management. The book relies heavily on regression-based ideas and interpretations to connect and unify many existing methods and algorithms for the task. High-Dimensional Covariance Estimation features chapters on: Data, Sparsity, and Regularization; Regularizing the Eigenstructure; Banding, Tapering, and Thresholding; Covariance Matrices; Sparse Gaussian Graphical Models; Multivariate Regression. The book is an ideal resource for researchers in statistics, mathematics, business and economics, computer sciences, and engineering, as well as a useful text or supplement for graduate-level courses in multivariate analysis, covariance estimation, statistical learning, and high-dimensional data analysis.

Modeling Longitudinal Data

Title Modeling Longitudinal Data
Author Robert E. Weiss
Publisher Springer Science & Business Media
Pages 445
Release 2006-12-06
Genre Medical
ISBN 0387283145

The book features many figures and tables illustrating longitudinal data and numerous homework problems. The associated web site contains many longitudinal data sets, examples of computer code, and labs to reinforce the material. Weiss emphasizes continuous data rather than discrete data, graphical and covariance methods, and generalizations of regression rather than generalizations of analysis of variance.

Feature Engineering and Selection

Title Feature Engineering and Selection
Author Max Kuhn
Publisher CRC Press
Pages 266
Release 2019-07-25
Genre Business & Economics
ISBN 1351609467

The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques, along with R programs for reproducing the results.

Mixed Effects Models for Complex Data

Title Mixed Effects Models for Complex Data
Author Lang Wu
Publisher CRC Press
Pages 431
Release 2009-11-11
Genre Mathematics
ISBN 9781420074086

Although standard mixed effects models are useful in a range of studies, other approaches must often be used in conjunction with them when studying complex or incomplete data. Mixed Effects Models for Complex Data discusses commonly used mixed effects models and presents appropriate approaches to address dropouts, missing data, measurement errors, censoring, and outliers. For each class of mixed effects model, the author reviews the corresponding class of regression model for cross-sectional data.

An overview of general models and methods, along with motivating examples
After presenting real data examples and outlining general approaches to the analysis of longitudinal/clustered data and incomplete data, the book introduces linear mixed effects (LME) models, generalized linear mixed models (GLMMs), nonlinear mixed effects (NLME) models, and semiparametric and nonparametric mixed effects models. It also includes general approaches for the analysis of complex data with missing values, measurement errors, censoring, and outliers.

Self-contained coverage of specific topics
Subsequent chapters delve more deeply into missing data problems, covariate measurement errors, and censored responses in mixed effects models. Focusing on incomplete data, the book also covers survival and frailty models, joint models of survival and longitudinal data, robust methods for mixed effects models, marginal generalized estimating equation (GEE) models for longitudinal or clustered data, and Bayesian methods for mixed effects models.

Background material
In the appendix, the author provides background information, such as likelihood theory, the Gibbs sampler, rejection and importance sampling methods, numerical integration methods, optimization methods, bootstrap, and matrix algebra.

Failure to properly address missing data, measurement errors, and other issues in statistical analyses can lead to severely biased or misleading results. This book explores the biases that arise when naïve methods are used and shows which approaches should be used to achieve accurate results in longitudinal data analysis.