Discrepancy-based Algorithms for Best-subset Model Selection
Title | Discrepancy-based Algorithms for Best-subset Model Selection PDF eBook |
Author | Tao Zhang |
Publisher | |
Pages | 142 |
Release | 2013 |
Genre | Akaike Information Criterion |
ISBN |
The selection of a best-subset regression model from a candidate family is a common problem that arises in many analyses. In best-subset model selection, we consider all possible subsets of regressor variables; thus, numerous candidate models may need to be fit and compared. One of the main challenges of best-subset selection arises from the size of the candidate model family: specifically, the probability of selecting an inappropriate model generally increases as the size of the family increases. For this reason, it is usually difficult to select an optimal model when best-subset selection is attempted based on a moderate to large number of regressor variables. Model selection criteria are often constructed to estimate discrepancy measures used to assess the disparity between each fitted candidate model and the generating model. The Akaike information criterion (AIC) and the corrected AIC (AICc) are designed to estimate the expected Kullback-Leibler (K-L) discrepancy. For best-subset selection, both AIC and AICc are negatively biased, and the use of either criterion will lead to overfitted models. To correct for this bias, we introduce a criterion AICi, which has a penalty term evaluated from Monte Carlo simulation. A multistage model selection procedure AICaps, which utilizes AICi, is proposed for best-subset selection.
In the framework of linear regression models, the Gauss discrepancy is another frequently applied measure of proximity between a fitted candidate model and the generating model. Mallows' conceptual predictive statistic (Cp) and the modified Cp (MCp) are designed to estimate the expected Gauss discrepancy. For best-subset selection, Cp and MCp exhibit negative estimation bias. To correct for this bias, we propose a criterion CPSi that again employs a penalty term evaluated from Monte Carlo simulation. We further devise a multistage procedure, CPSaps, which selectively utilizes CPSi.
In this thesis, we consider best-subset selection in two different modeling frameworks: linear models and generalized linear models. Extensive simulation studies are compiled to compare the selection behavior of our methods and other traditional model selection criteria. We also apply our methods to a model selection problem in a study of bipolar disorder.
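The exhaustive search that best-subset selection requires can be sketched as follows. This is a minimal illustration of scoring every subset of regressors by Gaussian AIC, not the thesis's AICi or AICaps procedures (whose Monte Carlo penalty terms are not reproduced here); the function names and the simulated data are illustrative assumptions.

```python
import itertools
import numpy as np

def aic_gaussian(rss, n, k):
    # Gaussian-likelihood AIC with k regression coefficients
    # (plus the error variance): n*log(RSS/n) + 2*(k + 1).
    return n * np.log(rss / n) + 2 * (k + 1)

def best_subset_aic(X, y):
    """Fit every nonempty subset of columns of X by least squares
    and return the (AIC, column-index) pair with the lowest AIC."""
    n, p = X.shape
    best = (np.inf, None)
    for size in range(1, p + 1):
        for cols in itertools.combinations(range(p), size):
            Xs = X[:, cols]
            beta, res, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = res[0] if res.size else np.sum((y - Xs @ beta) ** 2)
            score = aic_gaussian(rss, n, len(cols))
            if score < best[0]:
                best = (score, cols)
    return best

# Simulated data: only the first two regressors enter the generating model.
rng = np.random.default_rng(0)
n, p = 100, 6
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.standard_normal(n)
score, cols = best_subset_aic(X, y)
print(cols)
```

Note that the inner loop runs 2^p - 1 times, which is exactly why the candidate family, and with it the selection bias discussed above, grows so quickly with the number of regressors.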
Computational Subset Model Selection Algorithms and Applications
Title | Computational Subset Model Selection Algorithms and Applications PDF eBook |
Author | |
Publisher | |
Pages | |
Release | 2004 |
Genre | |
ISBN |
This dissertation develops new computationally efficient algorithms for identifying the subset of variables that minimizes any desired information criterion in model selection. In recent years, the statistical literature has placed more and more emphasis on information-theoretic model selection criteria. A model selection criterion chooses the model that "closely" approximates the true underlying model. Recent years have also seen many exciting developments in model selection techniques. As demand increases for data mining of massive datasets with many variables, the demand for model selection techniques has become much stronger. To this end, we introduce a new Implicit Enumeration (IE) algorithm and a hybridization of IE with the Genetic Algorithm (GA) in this dissertation. The proposed Implicit Enumeration algorithm is the first algorithm that explicitly uses an information criterion as the objective function. The algorithm works with a variety of information criteria, including some for which the existing branch and bound algorithms developed by Furnival and Wilson (1974) and Gatu and Kontoghiorghes (2003) are not applicable. It also finds the "best" subset model directly, without the need of finding the "best" subset of each size as the branch and bound techniques do. The proposed methods are demonstrated in multiple regression, multivariate regression, logistic regression, and discriminant analysis problems. The implicit enumeration algorithm converged to the optimal solution on real and simulated data sets with up to 80 predictors, thus having 2^80 (roughly 1.2 × 10^24) possible subset models in the model portfolio. To our knowledge, none of the existing exact algorithms have the capability of optimally solving problems of this size.
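When exact enumeration over 2^p subsets is infeasible, a genetic algorithm can search the same space heuristically by evolving bitmask chromosomes scored by an information criterion. The sketch below is a generic GA over subset masks with a Gaussian AIC objective; it is an assumption-laden illustration of the idea, not the dissertation's IE algorithm or its specific IE/GA hybrid.

```python
import numpy as np

rng = np.random.default_rng(1)

def aic(X, y, mask):
    # Gaussian AIC of the least-squares fit on the columns selected by mask.
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return np.inf                       # empty model is never selected
    Xs = X[:, cols]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = np.sum((y - Xs @ beta) ** 2)
    n = len(y)
    return n * np.log(rss / n) + 2 * (cols.size + 1)

def ga_subset_search(X, y, pop=40, gens=60, p_mut=0.05):
    """Evolve boolean subset masks; lower AIC means fitter."""
    n_feat = X.shape[1]
    population = rng.random((pop, n_feat)) < 0.5
    for _ in range(gens):
        scores = np.array([aic(X, y, m) for m in population])
        order = np.argsort(scores)
        parents = population[order[: pop // 2]]        # truncation selection
        # Uniform crossover between randomly paired parents.
        idx = rng.integers(0, len(parents), (pop, 2))
        cross = rng.random((pop, n_feat)) < 0.5
        children = np.where(cross, parents[idx[:, 0]], parents[idx[:, 1]])
        # Bit-flip mutation, then elitism: slot 0 keeps the best mask so far.
        flip = rng.random((pop, n_feat)) < p_mut
        population = children ^ flip
        population[0] = parents[0]
    scores = np.array([aic(X, y, m) for m in population])
    best = population[np.argmin(scores)]
    return np.flatnonzero(best), scores.min()

# Simulated data: only features 2 and 7 enter the generating model.
X = rng.standard_normal((150, 12))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + rng.standard_normal(150)
cols, score = ga_subset_search(X, y)
print(sorted(cols))
```

Unlike implicit enumeration, this heuristic carries no optimality guarantee, which is the trade-off the hybrid approach described above is designed to manage.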
Machine Learning Under a Modern Optimization Lens
Title | Machine Learning Under a Modern Optimization Lens PDF eBook |
Author | Dimitris Bertsimas |
Publisher | |
Pages | 589 |
Release | 2019 |
Genre | Machine learning |
ISBN | 9781733788502 |
Feature Engineering and Selection
Title | Feature Engineering and Selection PDF eBook |
Author | Max Kuhn |
Publisher | CRC Press |
Pages | 266 |
Release | 2019-07-25 |
Genre | Business & Economics |
ISBN | 1351609467 |
The process of developing predictive models includes many stages. Most resources focus on the modeling algorithms but neglect other critical aspects of the modeling process. This book describes techniques for finding the best representations of predictors for modeling and for finding the best subset of predictors for improving model performance. A variety of example data sets are used to illustrate the techniques along with R programs for reproducing the results.
Expert Clouds and Applications
Title | Expert Clouds and Applications PDF eBook |
Author | I. Jeena Jacob |
Publisher | Springer Nature |
Pages | 785 |
Release | 2022-08-17 |
Genre | Technology & Engineering |
ISBN | 9811925003 |
The book features original papers from the International Conference on Expert Clouds and Applications (ICOECA 2022), organized by GITAM School of Technology, Bangalore, India, during 3–4 February 2022. It covers new research insights on artificial intelligence, big data, cloud computing, sustainability, and knowledge-based expert systems. The book discusses innovative research spanning theoretical, practical, and experimental domains that pertains to expert systems, sustainable clouds, and artificial intelligence technologies.
Basic Guide for Machine Learning Algorithms and Models
Title | Basic Guide for Machine Learning Algorithms and Models PDF eBook |
Author | Ms.G.Vanitha |
Publisher | SK Research Group of Companies |
Pages | 191 |
Release | 2024-07-10 |
Genre | Computers |
ISBN | 936492469X |
Ms.G.Vanitha, Associate Professor, Department of Information Technology, Bishop Heber College, Tiruchirappalli, Tamil Nadu, India. Dr.M.Kasthuri, Associate Professor, Department of Computer Science, Bishop Heber College, Tiruchirappalli, Tamil Nadu, India.
Subspace, Latent Structure and Feature Selection
Title | Subspace, Latent Structure and Feature Selection PDF eBook |
Author | Craig Saunders |
Publisher | Springer Science & Business Media |
Pages | 218 |
Release | 2006-05-16 |
Genre | Computers |
ISBN | 3540341374 |
Many of the papers in this proceedings volume were presented at the PASCAL Workshop entitled Subspace, Latent Structure and Feature Selection Techniques: Statistical and Optimization Perspectives, which took place in Bohinj, Slovenia, during 23–25 February 2005.