Building Spark Applications

Bibliographic Details
Main Author: Dinu, Jonathan
Format: eBook
Language: English
Published: Addison-Wesley Professional 2015
Edition: 1st edition
Collection: O'Reilly - Collection details see MPG.ReNa
Description
Summary: 13+ Hours of Video Instruction
Overview: Building Spark Applications LiveLessons provides data scientists and developers with a practical introduction to the Apache Spark framework using Python, R, and SQL. It also covers best practices for developing scalable Spark applications for predictive analytics in the context of a data scientist's standard workflow.
Description: In this video training, Jonathan starts with a brief history of Spark and shows you how to get started programming in a Spark environment on a laptop. Taking an application- and code-first approach, he then covers the various APIs in Python, R, and SQL to show how Spark makes large-scale data analysis much more accessible through languages familiar to data scientists and analysts alike. With the basics covered, the videos move into a real-world case study showing you how to explore data, process text, and build models with Spark. Throughout the process, Jonathan exposes the internals of the Spark framework to show you how to write better application code, optimize performance, and set up a cluster to fully leverage the distributed nature of Spark. After watching these videos, data scientists and developers will feel confident building an end-to-end application with Spark to perform machine learning and do data analysis at scale!
About the Instructor: Jonathan Dinu is the founder of Zipfian Academy, an advanced immersive training program for data scientists and data engineers in San Francisco, and served as its CAO/CTO before it was acquired by Galvanize, where he is now the VP of Academic Excellence. He first discovered his love of all things data while studying Computer Science and Physics at UC Berkeley, and in a former life he worked for Alpine Data Labs developing distributed machine learning algorithms for predictive analytics on Hadoop. Jonathan is a dedicated educator, author, and speaker with a passion for sharing the things he has learned in the most creative ways he can. He has run data science workshops at Strata and PyData (among others), built a Data Visualization course with Udacity, and served on the UC Berkeley Extension Data Science Advisory Board. He is currently writing a book on practical Data Science applications using Python. When he is not working with students, you can find him blogging about data, visualization, and education at http://hopelessoptimism.com/.
Skill Level: Beginning/Intermediate
What You Will Learn: How to in ..
Item Description: Not recommended for use on the libraries' public computers
Made available through: Safari, an O'Reilly Media Company
Physical Description: 1 streaming video file, approximately 13 hr., 18 min.