Data science on the Google cloud platform: implementing end-to-end real-time data pipelines: from ingest to machine learning

Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP.


Bibliographic Details
Main Author: Lakshmanan, Valliappa
Format: eBook
Language: English
Published: Sebastopol, CA: O'Reilly Media, 2018
Edition: First edition
Collection: O'Reilly - Collection details see MPG.ReNa
LEADER 03803nmm a2200589 u 4500
001 EB001916592
003 EBX01000000000000001079494
005 00000000000000.0
007 cr|||||||||||||||||||||
008 210123 ||| eng
020 |a 9781491974513 
020 |a 1491974532 
020 |a 9781491974537 
050 4 |a QA76.54 
100 1 |a Lakshmanan, Valliappa 
245 0 0 |a Data science on the Google cloud platform  |b implementing end-to-end real-time data pipelines: from ingest to machine learning  |c Valliappa Lakshmanan 
250 |a First edition 
260 |a Sebastopol, CA  |b O'Reilly Media  |c 2018 
300 |a xiv, 393 pages  |b illustrations 
505 0 |a Making better decisions based on data -- Ingesting data into the cloud -- Creating compelling dashboards -- Streaming data: publication and ingest -- Interactive data exploration -- Bayes classifier on cloud dataproc -- Machine learning: logistic regression on Spark -- Time-windowed aggregate features -- Machine learning classifier using TensorFlow -- Real-time machine learning 
653 |a COMPUTERS / Computer Science / bisacsh 
653 |a Cloud computing / fast 
653 |a Google (Firm) / fast 
653 |a Infonuagique 
653 |a COMPUTERS / Hardware / General / bisacsh 
653 |a Computing platforms / fast 
653 |a COMPUTERS / Data Processing / bisacsh 
653 |a COMPUTERS / Reference / bisacsh 
653 |a Real-time data processing / http://id.loc.gov/authorities/subjects/sh85111765 
653 |a Temps réel (Informatique) 
653 |a Computing platforms / http://id.loc.gov/authorities/subjects/sh2011003111 
653 |a Cloud computing / http://id.loc.gov/authorities/subjects/sh2008004883 
653 |a COMPUTERS / Computer Literacy / bisacsh 
653 |a Real-time data processing / fast 
653 |a Plateformes (Informatique) 
653 |a Google (Firm) / http://id.loc.gov/authorities/names/no00095539 
653 |a COMPUTERS / Machine Theory / bisacsh 
653 |a COMPUTERS / Information Technology / bisacsh 
041 0 7 |a eng  |2 ISO 639-2 
989 |b OREILLY  |a O'Reilly 
500 |a Includes index 
015 |a GBB7B3690 
776 |z 9781491974537 
776 |z 9781491974568 
776 |z 1491974516 
776 |z 1491974567 
776 |z 1491974532 
776 |z 9781491974513 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781491974551/?ar  |x Verlag  |3 Volltext 
082 0 |a 004.33 
082 0 |a 500 
520 |a Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Over the course of the book, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science. You'll learn how to: automate and schedule data ingest using an App Engine application; create and populate a dashboard in Google Data Studio; build a real-time analysis pipeline to carry out streaming analytics; conduct interactive data exploration with Google BigQuery; create a Bayesian model on a Cloud Dataproc cluster; build a logistic regression machine learning model with Spark; compute time-aggregate features with a Cloud Dataflow pipeline; create a high-performing prediction model with TensorFlow; and use your deployed model as a microservice you can access from both batch and real-time pipelines.