MPG.eBooks - Table of Contents: Mastering Spark with R

Mastering Spark with R: the complete guide to large-scale analysis and modeling

"Authors Javier Luraschi, Kevin Kuo, and Edgar Ruiz show you how to combine R with Spark to analyze data at scale. This book covers relevant data science topics, cluster computing, and issues that will interest even the most advanced users."--Back cover

Bibliographic Details
Main Authors:	Luraschi, Javier; Kuo, Kevin (Author); Ruiz, Edgar (Author)
Format:	eBook
Language:	English
Published:	Sebastopol, CA: O'Reilly Media, 2019
Edition:	First edition
Subjects:	Electronic Data Processing / Fast; R (langage De Programmation); Big Data / Fast; Spark (electronic Resource : Apache Software Foundation) / Fast; Big Data / Http://id.loc.gov/authorities/subjects/sh2012003227; Spark (electronic Resource : Apache Software Foundation) / Http://id.loc.gov/authorities/names/no2015027445; R (computer Program Language) / Fast; Données Volumineuses; Electronic Data Processing / Http://id.loc.gov/authorities/subjects/sh85042288; R (computer Program Language) / Http://id.loc.gov/authorities/subjects/sh2002004407
Online Access:	https://learning.oreilly.com/library/view/~/978149...
Collection:	O'Reilly - Collection details see MPG.ReNa

Table of Contents:

Intro; Copyright; Table of Contents; Foreword; Preface; Formatting; Acknowledgments; Conventions Used in This Book; Using Code Examples; O'Reilly Online Learning; How to Contact Us; Chapter 1. Introduction; Overview; Hadoop; Spark; R; sparklyr; Recap; Chapter 2. Getting Started; Overview; Prerequisites; Installing sparklyr; Installing Spark; Connecting; Using Spark; Web Interface; Analysis; Modeling; Data; Extensions; Distributed R; Streaming; Logs; Disconnecting; Using RStudio; Resources; Recap; Chapter 3. Analysis; Overview; Import; Wrangle; Built-in Functions; Correlations; Visualize
Using ggplot2; Using dbplot; Model; Caching; Communicate; Recap; Chapter 4. Modeling; Overview; Exploratory Data Analysis; Feature Engineering; Supervised Learning; Generalized Linear Regression; Other Models; Unsupervised Learning; Data Preparation; Topic Modeling; Recap; Chapter 5. Pipelines; Overview; Creation; Use Cases; Hyperparameter Tuning; Operating Modes; Interoperability; Deployment; Batch Scoring; Real-Time Scoring; Recap; Chapter 6. Clusters; Overview; On-Premises; Managers; Distributions; Cloud; Amazon; Databricks; Google; IBM; Microsoft; Qubole; Kubernetes; Tools; RStudio; Jupyter
Livy; Recap; Chapter 7. Connections; Overview; Edge Nodes; Spark Home; Local; Standalone; YARN; YARN Client; YARN Cluster; Livy; Mesos; Kubernetes; Cloud; Batches; Tools; Multiple Connections; Troubleshooting; Logging; Spark Submit; Windows; Recap; Chapter 8. Data; Overview; Reading Data; Paths; Schema; Memory; Columns; Writing Data; Copying Data; File Formats; CSV; JSON; Parquet; Others; File Systems; Storage Systems; Hive; Cassandra; JDBC; Recap; Chapter 9. Tuning; Overview; Graph; Timeline; Configuring; Connect Settings; Submit Settings; Runtime Settings; sparklyr Settings; Partitioning
Implicit Partitions; Explicit Partitions; Caching; Checkpointing; Memory; Shuffling; Serialization; Configuration Files; Recap; Chapter 10. Extensions; Overview; H2O; Graphs; XGBoost; Deep Learning; Genomics; Spatial; Troubleshooting; Recap; Chapter 11. Distributed R; Overview; Use Cases; Custom Parsers; Partitioned Modeling; Grid Search; Web APIs; Simulations; Partitions; Grouping; Columns; Context; Functions; Packages; Cluster Requirements; Installing R; Apache Arrow; Troubleshooting; Worker Logs; Resolving Timeouts; Inspecting Partitions; Debugging Workers; Recap; Chapter 12. Streaming
Overview; Transformations; Analysis; Modeling; Pipelines; Distributed R; Kafka; Shiny; Recap; Chapter 13. Contributing; Overview; The Spark API; Spark Extensions; Using Scala Code; Recap; Appendix A. Supplemental Code References; Preface; Formatting; Chapter 1; The World's Capacity to Store Information; Daily Downloads of CRAN Packages; Chapter 2; Prerequisites; Chapter 3; Hive Functions; Chapter 4; MLlib Functions; Chapter 6; Google Trends for On-Premises (Mainframes), Cloud Computing, and Kubernetes; Chapter 12; Stream Generator; Installing Kafka; Index; About the Authors; Colophon
Includes bibliographical references and index