Non-linear Latent Factor Models for Revealing Structure in High-dimensional Data

Author: Roland Memisevic

Biomarker Detection Algorithms and Tools for Medical Imaging or Omic Data

Author: Fengfeng Zhou
Publisher: Frontiers Media SA
Pages: 246
Release: 2022-07-13
Genre: Science
ISBN: 2889765709

Analysis of Multivariate and High-Dimensional Data

Author: Inge Koch
Publisher: Cambridge University Press
Pages: 531
Release: 2014
Genre: Business & Economics
ISBN: 0521887933

This modern approach integrates classical and contemporary methods, fusing theory and practice and bridging the gap to statistical learning.

Nonlinear Factor Models for Network and Panel Data

Author: Mingli Chen
Release: 2019

Factor structures, or interactive effects, are convenient devices for incorporating latent variables in panel data models. We consider fixed-effect estimation of nonlinear panel single-index models with factor structures in the unobservables, which include logit, probit, ordered probit, and Poisson specifications. We establish that fixed-effect estimators of the model parameters and of average partial effects are asymptotically normal when the two dimensions of the panel grow large, but that they may suffer from incidental parameter bias. We also show how models with factor structures can capture important features of network data such as reciprocity, degree heterogeneity, homophily in latent variables, and clustering. We illustrate this applicability with an empirical application: the estimation of a gravity equation for international trade between countries, using a Poisson model with multiple factors.
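
As a rough illustration of the class of models described above, here is a minimal sketch, on simulated data, of fixed-effect estimation of a Poisson single-index panel model with a one-factor interactive effect, y_it ~ Poisson(exp(x_it' beta + a_i g_t)), fit by joint gradient ascent on the log-likelihood. The data-generating values, step size, and iteration count are illustrative assumptions, and the estimator shown is a generic one, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, K = 50, 40, 2                        # panel dimensions and number of covariates
X = rng.normal(size=(N, T, K))             # observed regressors x_it
beta_true = np.array([0.5, -0.3])
a_true = rng.normal(scale=0.5, size=N)     # unit-specific loadings (treated as fixed effects)
g_true = rng.normal(scale=0.5, size=T)     # time-specific factors (treated as fixed effects)
Y = rng.poisson(np.exp(X @ beta_true + np.outer(a_true, g_true)))

# Joint gradient ascent on the Poisson log-likelihood over (beta, a, g).
beta = np.zeros(K)
a = 0.1 * rng.normal(size=N)               # small random start so the factor updates can move
g = 0.1 * rng.normal(size=T)
lr = 0.1                                   # illustrative step size
for _ in range(5000):
    mu = np.exp(X @ beta + np.outer(a, g))     # conditional mean exp(x_it' beta + a_i g_t)
    score = Y - mu                             # derivative of the log-likelihood w.r.t. the index
    beta += lr * np.einsum('ntk,nt->k', X, score) / (N * T)
    a += lr * (score @ g) / T
    g += lr * (score.T @ a) / N

print("true beta:", beta_true, " estimated beta:", beta.round(2))
```

Only the product a_i g_t is identified without a normalization, and the incidental parameter bias mentioned in the abstract would still need to be addressed when the panel dimensions are moderate; beta and the average partial effects are the quantities of interest.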

Handbook of Research on Systems Biology Applications in Medicine

Author: Daskalaki, Andriani
Publisher: IGI Global
Pages: 982
Release: 2008-11-30
Genre: Technology & Engineering
ISBN: 1605660779

This book highlights the use of systems approaches, including genomic, cellular, proteomic, metabolomic, bioinformatics, molecular, and biochemical methods, to address fundamental questions in complex diseases such as cancer and diabetes, as well as in ageing. (Provided by publisher.)

Latent Factor Analysis for High-dimensional and Sparse Matrices

Author: Ye Yuan
Publisher: Springer Nature
Pages: 99
Release: 2022-11-15
Genre: Computers
ISBN: 9811967032

Latent factor analysis models are an effective class of machine learning models for analyzing the high-dimensional and sparse matrices encountered in many big-data industrial applications. The performance of a latent factor analysis model relies heavily on well-chosen hyper-parameters. Most hyper-parameters, however, are data-dependent, and tuning them by grid search is laborious and computationally expensive. Efficient hyper-parameter adaptation for latent factor analysis models has therefore become an important question. This is the first book to focus on how particle swarm optimization can be incorporated into latent factor analysis for efficient hyper-parameter adaptation, an approach that offers high scalability in real-world industrial applications. The book will help students, researchers, and engineers understand the basic methodologies of hyper-parameter adaptation via particle swarm optimization in latent factor analysis models, and will enable them to conduct further research and experiments on real-world applications of the methods discussed.
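
To make the idea concrete, here is a minimal sketch, on an assumed toy sparse matrix, of how particle swarm optimization can tune the learning rate and regularization coefficient of a stochastic-gradient latent factor model, using held-out RMSE as the fitness. It is a generic illustration of the approach, not the book's specific algorithm, and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic sparse matrix: observed (row, col, value) triples from a rank-3 model.
m, n, rank, n_obs = 120, 100, 3, 2000
P_true = rng.normal(size=(m, rank))
Q_true = rng.normal(size=(n, rank))
rows = rng.integers(0, m, n_obs)
cols = rng.integers(0, n, n_obs)
vals = np.sum(P_true[rows] * Q_true[cols], axis=1) + 0.1 * rng.normal(size=n_obs)
split = int(0.8 * n_obs)
train = (rows[:split], cols[:split], vals[:split])
valid = (rows[split:], cols[split:], vals[split:])

def fitness(lr, reg, epochs=10, k=3):
    """Train an SGD latent factor model on the known entries; return validation RMSE."""
    P = 0.1 * rng.normal(size=(m, k))
    Q = 0.1 * rng.normal(size=(n, k))
    for _ in range(epochs):
        for i, j, y in zip(*train):
            p_i = P[i].copy()
            err = y - p_i @ Q[j]
            P[i] += lr * (err * Q[j] - reg * p_i)
            Q[j] += lr * (err * p_i - reg * Q[j])
    rv, cv, vv = valid
    return np.sqrt(np.mean((vv - np.sum(P[rv] * Q[cv], axis=1)) ** 2))

# Standard PSO over the 2-D hyper-parameter vector (learning rate, regularization).
n_particles, n_iter = 6, 8
lo, hi = np.array([1e-4, 1e-4]), np.array([0.1, 0.5])
pos = lo + (hi - lo) * rng.random((n_particles, 2))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_val = np.array([fitness(*p) for p in pos])
gbest = pbest[np.argmin(pbest_val)].copy()
w, c1, c2 = 0.7, 1.5, 1.5                  # inertia and acceleration coefficients
for _ in range(n_iter):
    r1, r2 = rng.random((2, n_particles, 2))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    for idx in range(n_particles):
        val = fitness(*pos[idx])
        if val < pbest_val[idx]:
            pbest[idx], pbest_val[idx] = pos[idx].copy(), val
    gbest = pbest[np.argmin(pbest_val)].copy()

print("best (learning rate, regularization):", gbest.round(4),
      "validation RMSE:", round(pbest_val.min(), 3))
```

Each particle is a candidate (learning rate, regularization) pair; the swarm moves toward the pair with the smallest validation error, which is the role hyper-parameter adaptation plays in this setting.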

Latent Structure in Linear Prediction and Corpora Comparison

Author: Seth Colin-Bear Strimas-Mackey
Release: 2022

This work first studies the finite-sample properties of the risk of the minimum-norm interpolating predictor in high-dimensional regression models. If the effective rank of the covariance matrix of the p regression features is much larger than the sample size n, we show that the minimum-norm interpolating predictor is not desirable, as its risk approaches the risk of trivially predicting the response by 0. However, our detailed finite-sample analysis reveals, surprisingly, that this behavior is not present when the regression response and the features are jointly low-dimensional, following a widely used factor regression model. Within this popular model class, and when the effective rank of the covariance matrix is smaller than n, while still allowing for p ≫ n, both the bias and the variance terms of the excess risk can be controlled, and the risk of the minimum-norm interpolating predictor approaches optimal benchmarks. Moreover, through a detailed analysis of the bias term, we exhibit model classes under which our upper bound on the excess risk approaches zero, while the corresponding upper bound in the recent work [arXiv:1906.11300] diverges. Furthermore, we show that the minimum-norm interpolating predictor analyzed under the factor regression model, despite being model-agnostic and devoid of tuning parameters, can have risk similar to predictors based on principal components regression and ridge regression, and can improve over LASSO-based predictors, in the high-dimensional regime.

The second part of this work extends the analysis of the minimum-norm interpolating predictor to a larger class of linear predictors of a real-valued response Y. Our primary contribution is in establishing finite-sample risk bounds for prediction with the ubiquitous Principal Component Regression (PCR) method, under the factor regression model, with the number of principal components adaptively selected from the data, a form of theoretical guarantee that is surprisingly lacking from the PCR literature. To accomplish this, we prove a master theorem that establishes a risk bound for a large class of predictors, including the PCR predictor as a special case. This approach has the benefit of providing a unified framework for the analysis of a wide range of linear prediction methods under the factor regression setting. In particular, we use our main theorem to recover the risk bounds for the minimum-norm interpolating predictor, and for a prediction method tailored to a subclass of factor regression models with identifiable parameters. This model-tailored method can be interpreted as prediction via clusters with latent centers. To address the problem of selecting among a set of candidate predictors, we analyze a simple model selection procedure based on data splitting, providing an oracle inequality under the factor model to prove that the performance of the selected predictor is close to that of the optimal candidate.

In the third part of this work, we shift from the latent factor model to developing methodology in the context of topic models, which also rely on latent structure. We provide a new, principled construction of a distance between two ensembles of independent, but not identically distributed, discrete samples, when each ensemble follows a topic model. Our proposal is a hierarchical Wasserstein distance that can be used for the comparison of corpora of documents, or of any other data sets following topic models. We define the distance by representing a corpus as a discrete measure θ over a set of clusters corresponding to topics. To each cluster we associate its center, which is itself a discrete measure over topics. This allows for summarizing, in a single probabilistic representation, both the relative weight of each topic in the corpus (represented by the components of θ) and the topic heterogeneity within the corpus. The distance between two corpora then follows naturally as a hierarchical Wasserstein distance between the probabilistic representations of the two corpora. We demonstrate that this distance captures differences in the content of the topics between two corpora and in their relative coverage. We provide computationally tractable estimates of the distance, as well as accompanying finite-sample error bounds relative to their population counterparts. We demonstrate the use of the distance with an application to the comparison of news sources.
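
The first two parts above compare the minimum-norm interpolating predictor with principal components regression under a factor regression model. The sketch below simulates such a model and fits both predictors; the dimensions, noise levels, and the simple validation-split rule for choosing the number of components are assumptions made for illustration, not the procedures analyzed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 100, 1000, 5                  # n samples, p features with p >> n, k latent factors
A = rng.normal(size=(k, p))             # loading matrix of the factor model
theta = rng.normal(size=k)              # regression coefficient on the latent factors

def sample(n_samples):
    Z = rng.normal(size=(n_samples, k))                    # latent factors
    X = Z @ A + 0.5 * rng.normal(size=(n_samples, p))      # features: factor structure plus noise
    y = Z @ theta + 0.5 * rng.normal(size=n_samples)       # response driven by the same factors
    return X, y

X, y = sample(n)
X_test, y_test = sample(500)

# Minimum-norm interpolating predictor: beta = X^+ y via the Moore-Penrose pseudoinverse.
beta_mn = np.linalg.pinv(X) @ y

# Principal components regression, with the number of components chosen on a
# validation split (a simple stand-in for adaptive selection).
n_tr = 80
_, _, Vt = np.linalg.svd(X[:n_tr], full_matrices=False)
best_err, beta_pcr = np.inf, None
for r in range(1, 21):
    W = Vt[:r].T                                           # top-r principal directions
    gamma = np.linalg.lstsq(X[:n_tr] @ W, y[:n_tr], rcond=None)[0]
    val_err = np.mean((X[n_tr:] @ W @ gamma - y[n_tr:]) ** 2)
    if val_err < best_err:
        best_err, beta_pcr = val_err, W @ gamma

for name, b in [("min-norm interpolator", beta_mn), ("PCR", beta_pcr)]:
    print(f"{name:>22s}: test MSE = {np.mean((X_test @ b - y_test) ** 2):.3f}")
```

Under such a low-dimensional factor structure, with p much larger than n, the two predictors are expected to perform comparably, consistent with the comparison described above.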
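
For the third part, the following sketch computes a two-level, hierarchical Wasserstein-type distance between two toy corpora, each represented by cluster weights θ and cluster centers that are discrete measures over topics. The topic-level cost matrix, the toy corpora, and the use of a generic linear-programming optimal transport solver are assumptions for illustration; the thesis' construction and estimators are not reproduced here.

```python
import numpy as np
from scipy.optimize import linprog

def ot_distance(a, b, C):
    """Discrete optimal transport cost between weight vectors a, b under cost matrix C."""
    m, n = C.shape
    # Variables: the transport plan T, flattened row-major; objective is <T, C>.
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1.0      # row sums of T must equal a
    for j in range(n):
        A_eq[m + j, j::n] = 1.0               # column sums of T must equal b
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    return res.fun

# Assumed toy inputs: 4 topics; corpus 1 has 2 clusters, corpus 2 has 3 clusters.
C_topics = 1.0 - np.eye(4)                    # assumed 0/1 cost between distinct topics
centers_1 = np.array([[0.7, 0.2, 0.1, 0.0],   # cluster centers: measures over topics
                      [0.1, 0.1, 0.4, 0.4]])
theta_1 = np.array([0.6, 0.4])                # cluster weights of corpus 1
centers_2 = np.array([[0.6, 0.3, 0.1, 0.0],
                      [0.0, 0.2, 0.5, 0.3],
                      [0.25, 0.25, 0.25, 0.25]])
theta_2 = np.array([0.3, 0.5, 0.2])           # cluster weights of corpus 2

# Ground cost between cluster centers: optimal transport over topics.
center_cost = np.array([[ot_distance(c1, c2, C_topics) for c2 in centers_2]
                        for c1 in centers_1])
# Corpus-level distance: optimal transport between the cluster weights.
print("hierarchical distance:", round(ot_distance(theta_1, theta_2, center_cost), 4))
```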