Introducing .NET for Apache Spark: Distributed Processing for Massive Datasets

Get started using Apache Spark via C# or F# and the .NET for Apache Spark bindings. This book is an introduction to both Apache Spark and the .NET bindings. Readers new to Apache Spark will get up to speed quickly using Spark for data processing tasks performed against large and very large datasets....

Bibliographic Details
Main Author: Elliott, Ed
Format: eBook
Language: English
Published: [Berkeley] Apress 2021
Collection: O'Reilly - Collection details see MPG.ReNa
Table of Contents:
  • Part I. Getting Started
  • 1. Understanding Apache Spark
  • 2. Setting up Spark
  • 3. Programming with .NET for Apache Spark
  • Part II. The APIs
  • 4. User-Defined Functions
  • 5. The DataFrame API
  • 6. Spark SQL and Hive Tables
  • 7. Spark Machine Learning API
  • Part III. Examples
  • 8. Batch Mode Processing
  • 9. Structured Streaming
  • 10. Troubleshooting
  • 11. Delta Lake
  • Part IV. Appendices
  • Appendix A. Running in the Cloud
  • Appendix B. Implementing .NET for Apache Spark Code