Theory of Agglomerative Hierarchical Clustering
Title | Theory of Agglomerative Hierarchical Clustering PDF eBook |
Author | Sadaaki Miyamoto |
Publisher | Springer Nature |
Pages | 117 |
Release | 2022-03-21 |
Genre | Mathematics |
ISBN | 9811904200 |
This book discusses recent theoretical developments in agglomerative hierarchical clustering. The general understanding of agglomerative hierarchical clustering is that its theory was completed long ago and that there is no room for further methodological studies, at least in its fundamental structure. This book has been planned counter to that view: it shows that further theoretical studies are possible and that they are of interest not only methodologically but also for real applications. Compared with traditional textbooks, the present book has several notable features. First, the standard linkage methods and the agglomerative procedure are described by a general algorithm in which dendrogram output is expressed by a recursive subprogram. That subprogram describes an abstract tree structure, which is used for a two-stage linkage method for a greater number of objects. A fundamental theorem for single linkage using a fuzzy graph is proved, which uncovers several theoretical features of single linkage. Other theoretical properties such as dendrogram reversals are discussed. New methods using positive-definite kernels are considered, and some properties of the Ward method using kernels are studied. Overall, the focus is on theoretical features, but the results are also useful for application-oriented users of agglomerative clustering.
Theory of Agglomerative Hierarchical Clustering
Title | Theory of Agglomerative Hierarchical Clustering PDF eBook |
Author | Sadaaki Miyamoto |
Publisher | |
Pages | 109 |
Release | 2022 |
Genre | Business information services |
ISBN | 9789811904219 |
This book discusses recent theoretical developments in agglomerative hierarchical clustering. The general understanding of agglomerative hierarchical clustering is that its theory was completed long ago and that there is no room for further methodological studies, at least in its fundamental structure. This book has been planned counter to that view: it shows that further theoretical studies are possible and that they are of interest not only methodologically but also for real applications. Compared with traditional textbooks, the present book has several notable features. First, the standard linkage methods and the agglomerative procedure are described by a general algorithm in which dendrogram output is expressed by a recursive subprogram. That subprogram describes an abstract tree structure, which is used for a two-stage linkage method for a greater number of objects. A fundamental theorem for single linkage using a fuzzy graph is proved, which uncovers several theoretical features of single linkage. Other theoretical properties such as dendrogram reversals are discussed. New methods using positive-definite kernels are considered, and some properties of the Ward method using kernels are studied. Overall, the focus is on theoretical features, but the results are also useful for application-oriented users of agglomerative clustering.
Data Clustering: Theory, Algorithms, and Applications, Second Edition
Title | Data Clustering: Theory, Algorithms, and Applications, Second Edition PDF eBook |
Author | Guojun Gan |
Publisher | SIAM |
Pages | 430 |
Release | 2020-11-10 |
Genre | Mathematics |
ISBN | 1611976332 |
Data clustering, also known as cluster analysis, is an unsupervised process that divides a set of objects into homogeneous groups. Since the publication of the first edition of this monograph in 2007, development in the area has exploded, especially in clustering algorithms for big data and open-source software for cluster analysis. This second edition reflects these new developments, covers the basics of data clustering, includes a list of popular clustering algorithms, and provides program code that helps users implement clustering algorithms. Data Clustering: Theory, Algorithms and Applications, Second Edition will be of interest to researchers, practitioners, and data scientists as well as undergraduate and graduate students.
Encyclopedia of Distances
Title | Encyclopedia of Distances PDF eBook |
Author | Michel Marie Deza |
Publisher | Springer |
Pages | 731 |
Release | 2014-10-08 |
Genre | Mathematics |
ISBN | 3662443422 |
This updated and revised third edition of the leading reference volume on distance metrics includes new items from very active research areas in the use of distances and metrics such as geometry, graph theory, probability theory and analysis. Among the new topics included are, for example, polyhedral metric space, nearness matrix problems, distances between belief assignments, distance-related animal settings, diamond-cutting distances, natural units of length, Heidegger’s de-severance distance, and brain distances. The publication of this volume coincides with intensifying research efforts into metric spaces and especially distance design for applications. Accurate metrics have become a crucial goal in computational biology, image analysis, speech recognition and information retrieval. Leaving aside the practical questions that arise during the selection of a ‘good’ distance function, this work focuses on providing the research community with an invaluable comprehensive listing of the main available distances. As well as providing standalone introductions and definitions, the encyclopedia facilitates swift cross-referencing with easily navigable bold-faced textual links to core entries. In addition to distances themselves, the authors have collated numerous fascinating curiosities in their Who’s Who of metrics, including distance-related notions and paradigms that enable applied mathematicians in other sectors to deploy research tools that non-specialists justly view as arcane. In expanding access to these techniques, and in many cases enriching the context of distances themselves, this peerless volume is certain to stimulate fresh research.
Clustering Algorithms
Title | Clustering Algorithms PDF eBook |
Author | John A. Hartigan |
Publisher | John Wiley & Sons |
Pages | 374 |
Release | 1975 |
Genre | Mathematics |
ISBN |
Practical Guide to Cluster Analysis in R
Title | Practical Guide to Cluster Analysis in R PDF eBook |
Author | Alboukadel Kassambara |
Publisher | STHDA |
Pages | 168 |
Release | 2017-08-23 |
Genre | Education |
ISBN | 1542462703 |
Although there are several good books on unsupervised machine learning, we felt that many of them are too theoretical. This book provides a practical guide to cluster analysis, elegant visualization and interpretation. It contains 5 parts. Part I provides a quick introduction to R and presents the required R packages, as well as data formats and dissimilarity measures for cluster analysis and visualization. Part II covers partitioning clustering methods, which subdivide the data set into k groups, where k is the number of groups pre-specified by the analyst. Partitioning clustering approaches include the K-means, K-Medoids (PAM) and CLARA algorithms. In Part III, we consider hierarchical clustering, which is an alternative approach to partitioning clustering. The result of hierarchical clustering is a tree-based representation of the objects called a dendrogram. In this part, we describe how to compute, visualize, interpret and compare dendrograms. Part IV describes clustering validation and evaluation strategies, which consist of measuring the goodness of clustering results. The chapters covered here include: Assessing clustering tendency, Determining the optimal number of clusters, Cluster validation statistics, Choosing the best clustering algorithms and Computing p-value for hierarchical clustering. Part V presents advanced clustering methods, including: Hierarchical k-means clustering, Fuzzy clustering, Model-based clustering and Density-based clustering.
Hands-On Machine Learning with R
Title | Hands-On Machine Learning with R PDF eBook |
Author | Brad Boehmke |
Publisher | CRC Press |
Pages | 373 |
Release | 2019-11-07 |
Genre | Business & Economics |
ISBN | 1000730433 |
Hands-On Machine Learning with R provides a practical and applied approach to learning and developing intuition into today’s most popular machine learning methods. This book serves as a practitioner’s guide to the machine learning process and is meant to help the reader learn to apply the machine learning stack within R, which includes using various R packages such as glmnet, h2o, ranger, xgboost, keras, and others to effectively model and gain insight from their data. The book favors a hands-on approach, providing an intuitive understanding of machine learning concepts through concrete examples and just a little bit of theory. Throughout this book, the reader will be exposed to the entire machine learning process including feature engineering, resampling, hyperparameter tuning, model evaluation, and interpretation. The reader will be exposed to powerful algorithms such as regularized regression, random forests, gradient boosting machines, deep learning, generalized low rank models, and more! By favoring a hands-on approach and using real-world data, the reader will gain an intuitive understanding of the architectures and engines that drive these algorithms and packages, understand when and how to tune the various hyperparameters, and be able to interpret model results. By the end of this book, the reader should have a firm grasp of R’s machine learning stack and be able to implement a systematic approach for producing high-quality modeling results. Features: · Offers a practical and applied introduction to the most popular machine learning methods. · Topics covered include feature engineering, resampling, deep learning and more. · Uses a hands-on approach and real-world data.