High-dimensional Regression Models with Structured Coefficients

Author Yuan Li
Release 2018

Regression models are among the most common tools for statistical inference, especially linear regression models with Gaussian noise. But in many modern scientific applications with large-scale datasets, the number of samples is small relative to the number of model parameters, the so-called high-dimensional setting. Directly applying classical linear regression models to high-dimensional data is ill-posed, so additional assumptions on the regression coefficients are needed to make high-dimensional statistical analysis possible. Regularization methods with sparsity assumptions have received substantial attention over the past two decades, but several questions remain open. First, most of the literature analyzes high-dimensional linear models with Gaussian noise, and it is unclear whether similar results still hold outside the Gaussian setting. To answer this question in the Poisson setting, we study the minimax rates and provide an implementable convex algorithm for high-dimensional Poisson inverse problems under a weak sparsity assumption and physical constraints. Second, much of the theory and methodology for high-dimensional linear regression assumes that the independent variables are mutually independent or only weakly correlated, yet in practice some features may be highly correlated with one another. It is natural to ask whether high-dimensional statistical inference remains possible with highly correlated designs. We therefore provide a graph-based regularization method for high-dimensional regression models with highly correlated designs, along with theoretical guarantees.
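In its simplest linear-Gaussian form, the sparsity-regularized regression described above is the lasso. A minimal coordinate-descent sketch (an illustration of the l1-penalty principle, not the dissertation's Poisson algorithm; function names are hypothetical) shows how the penalty selects a few coefficients even when p exceeds n:

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Coordinate-descent lasso:
    minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-column (1/n) x_j^T x_j
    r = y.astype(float).copy()          # current residual y - X @ b
    for _ in range(n_sweeps):
        for j in range(p):
            r += X[:, j] * b[j]         # drop coordinate j from the fit
            rho = X[:, j] @ r / n       # correlation of x_j with residual
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]
    return b
```

With n = 40 observations and p = 100 features, this recovers a 3-sparse coefficient vector that ordinary least squares cannot even define.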

Tests of Hypotheses on Regression Coefficients in High-Dimensional Regression Models

Author Ye Alex Zhao
Release 2022

Statistical inference in high-dimensional settings has become an important area of research due to the increased production of high-dimensional data in a wide variety of fields, yet few approaches to simultaneous hypothesis testing of high-dimensional regression coefficients have been proposed. In the first project of this dissertation, we introduce a new method for simultaneous tests of the coefficients in a high-dimensional linear regression model. Our test statistic is based on the sum-of-squares of the score function mean with an additional power-enhancement term. We derive the asymptotic distribution and power of the test statistic, conduct Monte Carlo simulations to demonstrate performance improvements over existing methods, and apply the testing procedure to a real data example. In the second project, we propose a test statistic for regression coefficients in high-dimensional generalized linear models. Building on previous testing procedures for high-dimensional linear regression, we extend this approach to GLMs, with specific illustrations for the Poisson and logistic regression scenarios. We establish the asymptotic distribution of the test statistic and illustrate the performance of the proposed method through both simulation results and a real data analysis. The final project introduces two new approaches for testing high-dimensional regression coefficients in the partial linear model setting and, more generally, for linear hypothesis tests in linear models. Our proposed statistic is motivated by the profile least squares method and the decorrelated score method for high-dimensional inference, which we show to be equivalent in these particular cases. We outline the empirical performance of the new test statistic with simulation studies and real data examples. These results indicate generally satisfactory performance under a wide range of settings and applicability to real-world data problems.
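As a rough illustration of the first project's idea (not the authors' exact statistic, which adds a power-enhancement term and a careful asymptotic calibration), a sum-of-squares statistic built from the average score at beta = 0 can be computed as follows:

```python
import numpy as np

def score_sum_of_squares(X, y):
    """Schematic sum-of-squares score statistic for H0: beta = 0 in
    y = alpha + X beta + eps; large values suggest some beta_j != 0."""
    n = X.shape[0]
    u = X.T @ (y - y.mean()) / n        # average score at beta = 0
    return float(n * np.sum(u ** 2))
```

Summing squared score components aggregates weak signals across all p coordinates, which is what gives such statistics power against dense alternatives.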

Robust Penalized Regression for Complex High-dimensional Data

Author Bin Luo
Pages 169
Release 2020
Genre Dimensional analysis

"Robust high-dimensional data analysis has become an important and challenging task in complex Big Data analysis because of high dimensionality and data contamination. One of the most popular procedures is robust penalized regression. In this dissertation, we address three typical robust ultra-high-dimensional regression problems via penalized regression approaches. The first concerns the linear model in the presence of outliers, handling outlier detection, variable selection, and parameter estimation simultaneously. The second concerns robust high-dimensional mean regression under irregular settings such as data contamination, data asymmetry, and heteroscedasticity. The third concerns robust bi-level variable selection for linear regression models with grouping structures in the covariates. In Chapter 1, we introduce the background and challenges through overviews of penalized least squares methods and robust regression techniques. In Chapter 2, we propose a novel approach in a penalized weighted least squares framework that performs simultaneous variable selection and outlier detection. We provide a unified link between the proposed framework and robust M-estimation in general settings, and we establish non-asymptotic oracle inequalities for the joint estimation of both the regression coefficients and the weight vector. In Chapter 3, we establish a framework of robust estimators for high-dimensional regression models using Penalized Robust Approximated quadratic M-estimation (PRAM). This framework allows general settings in which the random errors lack symmetry and homogeneity or the covariates are not sub-Gaussian. Theoretically, we show that, in the ultra-high-dimensional setting, the PRAM estimator achieves local estimation consistency at the minimax rate enjoyed by the LS-Lasso and possesses the local oracle property under mild conditions.
In Chapter 4, we extend the study of Chapter 3 to robust high-dimensional data analysis with structured sparsity. In particular, we propose a framework of high-dimensional M-estimators for bi-level variable selection. This framework encourages bi-level sparsity through a computationally efficient two-stage procedure and produces strongly robust parameter estimators when nonconvex redescending loss functions are applied. In theory, we provide sufficient conditions under which the proposed two-stage penalized M-estimator simultaneously attains local estimation consistency and bi-level variable selection consistency when a suitable nonconvex penalty function is used at the group level. The performance of the proposed estimators is demonstrated in both simulation studies and real examples. In Chapter 5, we provide some discussion and future work."--Abstract from author-supplied metadata
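Simultaneous estimation and outlier detection via penalization can be seen in miniature in a location model: give each observation its own mean-shift parameter and put an l1 penalty on the shifts, so nonzero shifts flag outliers. This toy sketch illustrates the principle only, not the dissertation's penalized weighted least squares estimator:

```python
import numpy as np

def mean_shift_outliers(y, lam, n_iter=100):
    """Location-model illustration of penalized outlier detection:
    min over (mu, gamma) of 0.5 * ||y - mu - gamma||^2 + lam * ||gamma||_1.
    Nonzero entries of gamma flag outlying observations."""
    mu, gamma = float(np.median(y)), np.zeros_like(y, dtype=float)
    for _ in range(n_iter):
        r = y - mu
        # soft-threshold the residuals to update the per-case shifts
        gamma = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)
        # refit the location using the shift-corrected data
        mu = float(np.mean(y - gamma))
    return mu, gamma
```

Observations whose residuals exceed the threshold lam absorb the excess into gamma, so the location estimate mu stays close to the truth despite gross contamination.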

Inference on Structural Changes in High Dimensional Linear Regression Models

Author Hongjin Zhang
Release 2023
Genre Change-point problems

This dissertation is dedicated to constructing asymptotically valid confidence intervals for change points in high-dimensional linear models, where the number of parameters may vastly exceed the sampling period. In Chapter 2, we develop an algorithmic estimator for a single change point and establish the optimal rate of estimation, $O_p(\xi^{-2})$, where $\xi$ represents the jump size under high-dimensional scaling. This optimality ensures the existence of limiting distributions. Asymptotic distributions are derived under both vanishing and non-vanishing regimes of the jump size: in the former case the limit is the argmax of a two-sided Brownian motion, and in the latter the argmax of a two-sided random walk, both with negative drifts. We also describe the relationship between the two distributions, which allows the construction of regime-adaptive (vanishing vs. non-vanishing) confidence intervals. In Chapter 3, we extend the analysis to statistical inference for multiple change points in high-dimensional linear regression models. We develop locally refitted estimators and evaluate their convergence rates both component-wise and simultaneously. Proceeding as in Chapter 2, we achieve an optimal rate of estimation in the component-wise setting, which guarantees the existence of limiting distributions, and we establish a simultaneous rate that is the sharpest available up to a logarithmic factor.
Component-wise and joint limiting distributions are derived under vanishing and non-vanishing regimes of the jump sizes, demonstrating the relationship between the distributions in the two regimes. Lastly, in Chapter 4, we introduce a novel method for finding preliminary change-point estimates via integer linear programming, which has not yet been explored in the literature. Overall, this dissertation provides a comprehensive framework for inference on single and multiple change points in high-dimensional linear models, offering novel and efficient algorithms with strong theoretical guarantees. All theoretical results are supported by Monte Carlo simulations.
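To see single-change-point estimation in miniature, the classical CUSUM argmax for a mean shift in a univariate series (a low-dimensional illustration only, not the dissertation's high-dimensional estimator) can be sketched as:

```python
import numpy as np

def cusum_change_point(y):
    """Estimate a single change point in the mean by maximizing the
    standardized difference of left and right sample means (CUSUM)."""
    n = len(y)
    s = np.cumsum(y)
    total = s[-1]
    best_t, best_val = None, -np.inf
    for t in range(1, n):                       # first t points vs the rest
        gap = abs(s[t - 1] / t - (total - s[t - 1]) / (n - t))
        val = gap * np.sqrt(t * (n - t) / n)    # variance-balancing weight
        if val > best_val:
            best_t, best_val = t, val
    return best_t
```

The argmax of this weighted mean-difference process is the change-point estimate; the limiting argmax-of-Brownian-motion and argmax-of-random-walk distributions mentioned above arise from the behavior of such processes near the true change.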

Sparse Graphical Modeling for High Dimensional Data

Author Faming Liang
Publisher CRC Press
Pages 151
Release 2023-08-02
Genre Mathematics
ISBN 0429584806

A general framework for learning sparse graphical models with conditional independence tests
Complete treatments for different types of data: Gaussian, Poisson, multinomial, and mixed
Unified treatments for data integration, network comparison, and covariate adjustment
Unified treatments for missing data and heterogeneous data
Efficient methods for joint estimation of multiple graphical models
Effective methods for high-dimensional variable selection
Effective methods for high-dimensional inference
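In Gaussian graphical models, edges correspond to nonzero partial correlations, which in low dimensions can be read off the inverse sample covariance. This is a sketch of the principle only; the book's methods use conditional independence tests and scale to high dimensions:

```python
import numpy as np

def partial_correlations(X):
    """Partial correlation matrix from the inverse sample covariance
    (low-dimensional sketch; n must comfortably exceed p)."""
    K = np.linalg.inv(np.cov(X, rowvar=False))  # precision matrix
    d = np.sqrt(np.diag(K))
    P = -K / np.outer(d, d)                     # rho_ij = -K_ij / sqrt(K_ii K_jj)
    np.fill_diagonal(P, 1.0)
    return P
```

For a chain X1 -> X2 -> X3, the partial correlation of X1 and X3 given X2 is zero, so the estimated graph drops the X1-X3 edge.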

Statistical Methods for Complex And/or High Dimensional Data

Author Shanshan Qin
Release 2020

This dissertation focuses on the development and implementation of statistical methods for high-dimensional and/or complex data, with an emphasis on settings where $p$, the number of explanatory variables, is larger than $n$, the number of observations, where the ratio $p/n$ tends to a finite constant, and where the data contain outlying observations. First, we propose a non-negative feature selection and/or feature grouping (nnFSG) method. It handles a general series of sign-constrained high-dimensional regression problems, allowing the regression coefficients to carry a structure of disjoint homogeneity, with sparsity as a special case. To solve the resulting non-convex optimization problem, we provide an algorithm that combines difference-of-convex programming, the augmented Lagrangian, and coordinate descent methods. Furthermore, we show that the nnFSG method recovers the oracle estimate consistently and yields a bound on the mean squared error (MSE). We also examine the performance of the method using finite-sample simulations and a real protein mass spectrum dataset. Next, we consider a high-dimensional multivariate ridge regression model under the regime where both $p$ and $n$ are large with $p/n \rightarrow \kappa (0
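For the ridge regression regime mentioned above, the estimator has a closed form that remains well defined even when $p > n$. A minimal sketch, illustrative only and not the dissertation's multivariate analysis:

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge estimator: argmin ||y - X b||^2 / n + lam * ||b||^2,
    i.e. b = (X^T X / n + lam I)^{-1} X^T y / n; invertible for any p, n."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)
```

Unlike least squares, the penalized normal equations are always solvable because lam * I makes the system positive definite, which is what permits asymptotics with $p/n$ tending to a constant larger than one.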

Component Identification and Estimation in Nonlinear High-dimensional Regression Models by Structural Adaptation

Author Alexander Samarov
Pages 39
Release 2003