Model-based clustering and classification for data science with applications in R

Cluster analysis finds groups in data automatically. Most methods have been heuristic and leave open such central questions as: how many clusters are there? Which method should I use? How should I handle outliers? Classification assigns new observations to groups given previously classified observat...

Bibliographic Details
Main Authors: Bouveyron, Charles; Celeux, Gilles (Author); Murphy, T. Brendan (Author); Raftery, Adrian E. (Author)
Format: eBook
Language: English
Published: Cambridge: Cambridge University Press, 2019
Series: Cambridge series in statistical and probabilistic mathematics
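Subjects: Cluster analysis; Mathematical statistics; Statistics / Classification; R (Computer program language)
Online Access: https://doi.org/10.1017/9781108644181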
Collection: Cambridge Books Online (for collection details, see MPG.ReNa)
LEADER 02366nmm a2200325 u 4500
001 EB001871668
003 EBX01000000000000001035039
005 00000000000000.0
007 cr|||||||||||||||||||||
008 190823 ||| eng
020 |a 9781108644181 
050 4 |a QA278.55 
100 1 |a Bouveyron, Charles 
245 0 0 |a Model-based clustering and classification for data science  |b with applications in R  |c Charles Bouveyron, Gilles Celeux, T. Brendan Murphy, Adrian E. Raftery 
260 |a Cambridge  |b Cambridge University Press  |c 2019 
300 |a xvii, 427 pages  |b digital 
653 |a Cluster analysis 
653 |a Mathematical statistics 
653 |a Statistics / Classification 
653 |a R (Computer program language) 
700 1 |a Celeux, Gilles  |e [author] 
700 1 |a Murphy, T. Brendan  |e [author] 
700 1 |a Raftery, Adrian E.  |e [author] 
041 0 7 |a eng  |2 ISO 639-2 
989 |b CBO  |a Cambridge Books Online 
490 0 |a Cambridge series in statistical and probabilistic mathematics 
028 5 0 |a 10.1017/9781108644181 
856 4 0 |u https://doi.org/10.1017/9781108644181  |x Publisher  |3 Full text 
082 0 |a 519.53 
520 |a Cluster analysis finds groups in data automatically. Most methods have been heuristic and leave open such central questions as: how many clusters are there? Which method should I use? How should I handle outliers? Classification assigns new observations to groups given previously classified observations, and also has open questions about parameter tuning, robustness and uncertainty assessment. This book frames cluster analysis and classification in terms of statistical models, thus yielding principled estimation, testing and prediction methods, and sound answers to the central questions. It builds the basic ideas in an accessible but rigorous way, with extensive data examples and R code; describes modern approaches to high-dimensional data and networks; and explains such recent advances as Bayesian regularization, non-Gaussian model-based clustering, cluster merging, variable selection, semi-supervised and robust classification, clustering of functional data, text and images, and co-clustering. Written for advanced undergraduates in data science, as well as researchers and practitioners, it assumes basic knowledge of multivariate calculus, linear algebra, probability and statistics.