Robustness in Dimensionality Reduction

Author Jiaxi Liang
Pages 161
Release 2016
Genre Algorithms

Dimensionality reduction is widely used in many statistical applications, such as image analysis, microarray analysis, and text mining. This thesis focuses on three problems related to robustness in dimension reduction. The first topic is performance analysis in dimension reduction, that is, quantitatively assessing the performance of an algorithm on a given dataset. A criterion for success is established from a geometric point of view to address this issue. A family of goodness measures, called local rank correlation, is developed to assess the performance of dimensionality reduction methods. The potential application of local rank correlation to selecting the tuning parameters of dimension reduction algorithms is also explored. The second topic is sensitivity analysis in dimension reduction. Two types of influence functions are developed as measures of robustness; based on them, graphical display strategies are developed for visualizing the robustness of a dimension reduction method and for flagging potential outliers. In the third part of the thesis, a novel robust PCA framework, called Performance-Weighted Bagging PCA, is proposed from the perspective of model averaging. It obtains a robust linear subspace by taking a weighted average of a collection of subspaces produced from subsamples. Robustness against outliers is achieved by a proper weighting scheme, and possible choices of weighting scheme are investigated.
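
The model-averaging idea in the third part can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not specify the weighting scheme, so the weight used here (the inverse of the median squared reconstruction error on the full dataset), the median centering, the averaging of projection matrices, and the names `pca_basis` and `weighted_bagging_pca` are all assumptions made for illustration.

```python
import numpy as np


def pca_basis(X, k):
    """Return the top-k principal directions of X as columns (d x k)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T


def weighted_bagging_pca(X, k, n_bags=50, frac=0.5, seed=0):
    """Average subsample PCA subspaces with performance-based weights.

    Assumed weighting (not taken from the source): each subsample's
    subspace is scored by the median squared reconstruction error over
    the full dataset, and its weight is the inverse of that score.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = min(n, max(k + 1, int(frac * n)))   # subsample size
    Xc = X - np.median(X, axis=0)           # robust centering (assumption)
    P_avg = np.zeros((d, d))
    total = 0.0
    for _ in range(n_bags):
        idx = rng.choice(n, size=m, replace=False)
        V = pca_basis(X[idx], k)
        P = V @ V.T                          # projector onto this subspace
        resid = Xc - Xc @ P
        err = np.median(np.sum(resid ** 2, axis=1))
        w = 1.0 / (err + 1e-12)
        P_avg += w * P
        total += w
    P_avg /= total
    # The leading eigenvectors of the averaged projector span the
    # aggregated subspace (eigh returns eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(P_avg)
    return eigvecs[:, -k:][:, ::-1]
```

Under this assumed scheme, subsamples that happen to contain outliers tend to produce subspaces that fit the bulk of the data poorly, so they receive small weights and contribute little to the averaged subspace. Small subsample fractions make outlier-free subsamples more likely.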

Dimensionality Reduction and Robustness Analysis of Large Scale Systems

Author Rudiyanto Gunawan
Pages 110
Release 2000

Robust Methods for Data Reduction

Author Alessio Farcomeni
Publisher CRC Press
Pages 297
Release 2016-01-13
Genre Mathematics
ISBN 1466590637

Robust Methods for Data Reduction gives a non-technical overview of robust data reduction techniques, encouraging the use of these important and useful methods in practical applications. The main areas covered include principal components analysis, sparse principal component analysis, canonical correlation analysis, factor analysis, clustering, dou

Data-Driven Science and Engineering

Author Steven L. Brunton
Publisher Cambridge University Press
Pages 615
Release 2022-05-05
Genre Computers
ISBN 1009098489

A textbook covering data-science and machine learning methods for modelling and control in engineering and science, with Python and MATLAB®.

Data Analytics in Bioinformatics

Author Rabinarayan Satpathy
Publisher John Wiley & Sons
Pages 433
Release 2021-01-20
Genre Computers
ISBN 111978560X

Machine learning techniques are increasingly being used to address problems in computational biology and bioinformatics. Novel machine learning techniques for analyzing high-throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and for future drug discovery. Machine learning techniques such as Markov models, support vector machines, neural networks, and graphical models have been successful in analyzing life science data because of their ability to handle randomness, uncertainty, and noise in data, and to generalize. Machine Learning in Bioinformatics compiles recent approaches in machine learning methods and their applications to contemporary problems in bioinformatics, including classification and prediction of disease, feature selection, dimensionality reduction, gene selection, and classification of microarray data.

Robust Dimensionality Reduction for Data Visualization and Latent Structure Recovery

Author Lan Huong Nguyen
Release 2019

High dimensionality is one of the major challenges in the analysis of modern data sets, as it is now common to have hundreds or even millions of simultaneous measurements collected for a single sample. Visualization of the data becomes difficult if not impossible, while standard statistical methods lose power due to the curse of dimensionality. Even if a large volume of data is available, exploring a high dimensional space exhaustively is computationally impractical. Low-dimensional data representations that remove noise but retain the signal of interest can be instrumental in detecting hidden structures and patterns. Our work focuses on improving current methods to perform dimensionality reduction (DR) and interpret its output. Many datasets are governed by a continuous process, which is often unknown. Estimating data points' natural ordering and their corresponding uncertainties often sheds light on these underlying mechanisms. We develop a Bayesian Unidimensional Scaling (BUDS) technique which extracts a dominant source of variation in high dimensional datasets and produces a visual data summary, facilitating the exploration of a hidden continuum. The method maps multivariate data points to latent one-dimensional coordinates along their inherent trajectory, and provides uncertainty bounds estimated using a Bayesian posterior distribution. We then turn our attention to DR techniques for data visualization. In particular, we study the behavior of t-Stochastic Neighbor Embedding (t-SNE), a technique broadly adopted for visualizing high-dimensional datasets. We show why t-SNE is usually unable to recover large-scale structures. We then propose a new embedding method, Diffusion t-SNE, which introduces a time-step parameter that can generate a multi-view representation of the data, recovering its geometry at different scales. 
We also provide mathematical explanations for why the entropy equalization procedure used in t-SNE results in a loss of information about local variances, leading to data distortions that produce misleading representations with uninformative relative sizes and unidentifiable input data sampling densities and variances. Building upon this analysis, we present a scaling scheme of the pairwise proximities that achieves accurate representations of regional data variances.
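
The entropy equalization step mentioned above can be sketched as follows. In standard t-SNE, each point's Gaussian bandwidth is tuned so that its conditional neighbor distribution has the same entropy (the logarithm of a user-chosen perplexity). The code below is a minimal illustration of that calibration only, not the scaling scheme proposed in the thesis; the function names are hypothetical, and `beta` stands for the precision 1 / (2 sigma^2).

```python
import numpy as np


def conditional_probs(sq_dists_i, beta):
    """Gaussian conditional probabilities p_{j|i} for precision beta."""
    # Shift by the minimum distance for numerical stability.
    logits = -beta * (sq_dists_i - sq_dists_i.min())
    p = np.exp(logits)
    return p / p.sum()


def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability entries."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


def calibrate_beta(sq_dists_i, perplexity, tol=1e-5, max_iter=200):
    """Search for the beta whose entropy equals log(perplexity)."""
    target = np.log(perplexity)
    lo, hi = 0.0, np.inf
    beta = 1.0
    for _ in range(max_iter):
        h = entropy(conditional_probs(sq_dists_i, beta))
        if abs(h - target) < tol:
            break
        if h > target:
            # Distribution too flat: sharpen it by increasing beta.
            lo = beta
            beta = beta * 2.0 if np.isinf(hi) else (lo + hi) / 2.0
        else:
            # Distribution too peaked: flatten it by decreasing beta.
            hi = beta
            beta = (lo + hi) / 2.0
    return beta


def calibrated_entropies(X, perplexity):
    """Calibrate every point; return per-point betas and entropies."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    n = X.shape[0]
    betas = np.empty(n)
    ents = np.empty(n)
    for i in range(n):
        d_i = np.delete(sq[i], i)   # exclude the point itself
        betas[i] = calibrate_beta(d_i, perplexity)
        ents[i] = entropy(conditional_probs(d_i, betas[i]))
    return betas, ents
```

Running this on one tight and one diffuse cluster gives nearly identical entropies for all points while the calibrated `beta` values differ by orders of magnitude. The difference in local scale is absorbed into the bandwidths rather than passed on to the embedding, which is consistent with the loss of local variance information described in the abstract.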

Robust and Constrained Dimension Reduction

Author Jianhui Zhou
Pages 192
Release 2005