Robustness in Dimensionality Reduction
Title | Robustness in Dimensionality Reduction PDF eBook |
Author | Jiaxi Liang |
Publisher | |
Pages | 161 |
Release | 2016 |
Genre | Algorithms |
ISBN |
Dimensionality reduction is widely used in statistical applications such as image analysis, microarray analysis, and text mining. This thesis focuses on three problems related to robustness in dimension reduction. The first topic is performance analysis in dimension reduction, that is, quantitatively assessing the performance of an algorithm on a given dataset. A criterion for success is established from the geometric point of view to address this issue. A family of goodness measures, called local rank correlation, is developed to assess the performance of dimensionality reduction methods. The potential application of the local rank correlation to selecting tuning parameters of dimension reduction algorithms is also explored. The second topic is sensitivity analysis in dimension reduction. Two types of influence functions are developed as measures of robustness; based on these, graphical display strategies are developed for visualizing the robustness of a dimension reduction method and for flagging potential outliers. In the third part of the thesis, a novel robust PCA framework, called Performance-Weighted Bagging PCA, is proposed from the perspective of model averaging. It obtains a robust linear subspace by taking a weighted average of a collection of subspaces produced from subsamples. Robustness against outliers is achieved by a proper weighting scheme, and possible choices of weighting scheme are investigated.
Dimensionality Reduction and Robustness Analysis of Large Scale Systems
Title | Dimensionality Reduction and Robustness Analysis of Large Scale Systems PDF eBook |
Author | Rudiyanto Gunawan |
Publisher | |
Pages | 110 |
Release | 2000 |
Genre | |
ISBN |
Robust Methods for Data Reduction
Title | Robust Methods for Data Reduction PDF eBook |
Author | Alessio Farcomeni |
Publisher | CRC Press |
Pages | 297 |
Release | 2016-01-13 |
Genre | Mathematics |
ISBN | 1466590637 |
Robust Methods for Data Reduction gives a non-technical overview of robust data reduction techniques, encouraging the use of these important and useful methods in practical applications. The main areas covered include principal components analysis, sparse principal component analysis, canonical correlation analysis, factor analysis, clustering, dou
Data-Driven Science and Engineering
Title | Data-Driven Science and Engineering PDF eBook |
Author | Steven L. Brunton |
Publisher | Cambridge University Press |
Pages | 615 |
Release | 2022-05-05 |
Genre | Computers |
ISBN | 1009098489 |
A textbook covering data-science and machine learning methods for modelling and control in engineering and science, with Python and MATLAB®.
Data Analytics in Bioinformatics
Title | Data Analytics in Bioinformatics PDF eBook |
Author | Rabinarayan Satpathy |
Publisher | John Wiley & Sons |
Pages | 433 |
Release | 2021-01-20 |
Genre | Computers |
ISBN | 111978560X |
Machine learning techniques are increasingly being used to address problems in computational biology and bioinformatics. Novel machine learning techniques for analyzing high-throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and for future drug discovery. Machine learning techniques such as Markov models, support vector machines, neural networks, and graphical models have been successful in analyzing life science data because of their ability to handle randomness and uncertainty in noisy data and to generalize. Machine Learning in Bioinformatics compiles recent approaches in machine learning methods and their applications to contemporary problems in bioinformatics, including classification and prediction of disease, feature selection, dimensionality reduction, gene selection, classification of microarray data, and many more.
Robust Dimensionality Reduction for Data Visualization and Latent Structure Recovery
Title | Robust Dimensionality Reduction for Data Visualization and Latent Structure Recovery PDF eBook |
Author | Lan Huong Nguyen |
Publisher | |
Pages | |
Release | 2019 |
Genre | |
ISBN |
High dimensionality is one of the major challenges in the analysis of modern data sets, as it is now common to have hundreds or even millions of simultaneous measurements collected for a single sample. Visualization of the data becomes difficult if not impossible, while standard statistical methods lose power due to the curse of dimensionality. Even if a large volume of data is available, exploring a high-dimensional space exhaustively is computationally impractical. Low-dimensional data representations that remove noise but retain the signal of interest can be instrumental in detecting hidden structures and patterns. Our work focuses on improving current methods for performing dimensionality reduction (DR) and interpreting its output. Many datasets are governed by a continuous process, which is often unknown. Estimating the natural ordering of the data points and the corresponding uncertainties often sheds light on these underlying mechanisms. We develop a Bayesian Unidimensional Scaling (BUDS) technique which extracts a dominant source of variation in high-dimensional datasets and produces a visual data summary, facilitating the exploration of a hidden continuum. The method maps multivariate data points to latent one-dimensional coordinates along their inherent trajectory, and provides uncertainty bounds estimated using a Bayesian posterior distribution. We then turn our attention to DR techniques for data visualization. In particular, we study the behavior of t-Stochastic Neighbor Embedding (t-SNE), a technique broadly adopted for visualizing high-dimensional datasets. We show why t-SNE is usually unable to recover large-scale structures. We then propose a new embedding method, Diffusion t-SNE, which introduces a time-step parameter that can generate a multi-view representation of the data, recovering its geometry at different scales.
We also explain mathematically why the entropy equalization procedure used in t-SNE loses information about local variances. The resulting distortions produce misleading representations in which relative sizes are uninformative and the sampling densities and variances of the input data cannot be identified. Building upon this analysis, we present a scaling scheme for the pairwise proximities that achieves accurate representations of regional data variances.
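The entropy equalization mentioned above can be illustrated with the standard t-SNE perplexity calibration: for each point, a Gaussian precision is tuned by bisection until the conditional neighbour distribution has a fixed entropy. The sketch below uses only the Python standard library. The function names and toy distances are illustrative assumptions, and the code shows the usual calibration step only, not the scaling scheme proposed in the thesis.

```python
import math


def conditional_probs(sq_dists, beta):
    """Gaussian neighbour probabilities p_{j|i} with precision beta = 1 / (2 sigma^2)."""
    d_min = min(sq_dists)  # shift for numerical stability; cancels on normalisation
    weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def calibrate_beta(sq_dists, perplexity, tol=1e-6, max_iter=200):
    """Bisection on beta so that the entropy equals log2(perplexity)."""
    target = math.log2(perplexity)
    lower, upper = 0.0, None
    beta = 1.0
    for _ in range(max_iter):
        h = entropy_bits(conditional_probs(sq_dists, beta))
        if abs(h - target) < tol:
            break
        if h > target:
            # Distribution too flat: sharpen it by increasing beta.
            lower = beta
            beta = beta * 2.0 if upper is None else (beta + upper) / 2.0
        else:
            # Distribution too peaked: flatten it by decreasing beta.
            upper = beta
            beta = (lower + beta) / 2.0
    return beta


# Squared distances from one point to its neighbours, in a tight region and
# in a region that is 100 times more spread out (toy values).
tight = [0.01, 0.02, 0.03, 0.04, 0.05]
spread = [1.0, 2.0, 3.0, 4.0, 5.0]

beta_tight = calibrate_beta(tight, perplexity=3.0)
beta_spread = calibrate_beta(spread, perplexity=3.0)

h_tight = entropy_bits(conditional_probs(tight, beta_tight))
h_spread = entropy_bits(conditional_probs(spread, beta_spread))
```

After calibration both neighbourhoods have the same entropy and nearly identical probability vectors, because the hundredfold difference in scale has been absorbed into the per-point precision. This is one way to see how information about local variance can disappear from the proximities.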
Robust and Constrained Dimension Reduction
Title | Robust and Constrained Dimension Reduction PDF eBook |
Author | Jianhui Zhou |
Publisher | |
Pages | 192 |
Release | 2005 |
Genre | |
ISBN |