MPG.eBooks - Staff View: Doing data science


Doing data science

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that's so clouded in hype? This insightful book, based on Columbia Univ...


Bibliographic Details
Main Author:	Schutt, Rachel
Other Authors:	O'Neil, Cathy
Format:	eBook
Language:	English
Published:	Sebastopol, CA: O'Reilly Media, 2014
Subjects:	Data mining / fast; Data Mining; Big data / fast; Information science / http://id.loc.gov/authorities/subjects/sh85066150; Big data / http://id.loc.gov/authorities/subjects/sh2012003227; Données volumineuses; Visualisation de l'information; Data structures (Computer science) / http://id.loc.gov/authorities/subjects/sh85035862; Data structures (Computer science) / fast; Information science / fast; Sciences de l'information; Data mining / http://id.loc.gov/authorities/subjects/sh97002073; Structures de données (Informatique); Information visualization / http://id.loc.gov/authorities/subjects/sh2002000243; Exploration de données (Informatique); information science / aat; Information visualization / fast
Online Access:	https://learning.oreilly.com/library/view/~/978144...
Collection:	O'Reilly - Collection details see MPG.ReNa


LEADER	05400nmm a2200505 u 4500
001	EB001918440
003	EBX01000000000000001081342
005	00000000000000.0
007	cr|||||||||||||||||||||
008	210123 ||| eng
020			|a 1449358659
050		4	|a QA76.9.D343
100	1		|a Schutt, Rachel
245	0	0	|a Doing data science |c Rachel Schutt, Cathy O'Neil
260			|a Sebastopol, CA |b O'Reilly Media |c 2014
300			|a 1 volume |b illustrations
505	0		|a Copyright -- Table of Contents -- Preface -- Motivation -- Origins of the Class -- Origins of the Book -- What to Expect from This Book -- How This Book Is Organized -- How to Read This Book -- How Code Is Used in This Book -- Who This Book Is For -- Prerequisites -- Supplemental Reading -- About the Contributors -- Conventions Used in This Book -- Using Code Examples -- Safari® Books Online -- How to Contact Us -- Acknowledgments -- Chapter 1. Introduction: What Is Data Science? -- Big Data and Data Science Hype -- Getting Past the Hype -- Why Now? -- Datafication
505	0		|a The Current Landscape (with a Little History) -- Data Science Jobs -- A Data Science Profile -- Thought Experiment: Meta-Definition -- OK, So What Is a Data Scientist, Really? -- In Academia -- In Industry -- Chapter 2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process -- Statistical Thinking in the Age of Big Data -- Statistical Inference -- Populations and Samples -- Populations and Samples of Big Data -- Big Data Can Mean Big Assumptions -- Modeling -- Exploratory Data Analysis -- Philosophy of Exploratory Data Analysis -- Exercise: EDA -- The Data Science Process
505	0		|a A Data Scientist's Role in This Process -- Thought Experiment: How Would You Simulate Chaos? -- Case Study: RealDirect -- How Does RealDirect Make Money? -- Exercise: RealDirect Data Strategy -- Chapter 3. Algorithms -- Machine Learning Algorithms -- Three Basic Algorithms -- Linear Regression -- k-Nearest Neighbors (k-NN) -- k-means -- Exercise: Basic Machine Learning Algorithms -- Solutions -- Summing It All Up -- Thought Experiment: Automated Statistician -- Chapter 4. Spam Filters, Naive Bayes, and Wrangling -- Thought Experiment: Learning by Example
505	0		|a Why Won't Linear Regression Work for Filtering Spam? -- How About k-nearest Neighbors? -- Naive Bayes -- Bayes Law -- A Spam Filter for Individual Words -- A Spam Filter That Combines Words: Naive Bayes -- Fancy It Up: Laplace Smoothing -- Comparing Naive Bayes to k-NN -- Sample Code in bash -- Scraping the Web: APIs and Other Tools -- Jake's Exercise: Naive Bayes for Article Classification -- Sample R Code for Dealing with the NYT API -- Chapter 5. Logistic Regression -- Thought Experiments -- Classifiers -- Runtime -- You -- Interpretability -- Scalability -- M6D Logistic Regression Case Study
653			|a Data mining / fast
653			|a Data Mining
653			|a Big data / fast
653			|a Information science / http://id.loc.gov/authorities/subjects/sh85066150
653			|a Big data / http://id.loc.gov/authorities/subjects/sh2012003227
653			|a Données volumineuses
653			|a Visualisation de l'information
653			|a Data structures (Computer science) / http://id.loc.gov/authorities/subjects/sh85035862
653			|a Data structures (Computer science) / fast
653			|a Information science / fast
653			|a Sciences de l'information
653			|a Data mining / http://id.loc.gov/authorities/subjects/sh97002073
653			|a Structures de données (Informatique)
653			|a Information visualization / http://id.loc.gov/authorities/subjects/sh2002000243
653			|a Exploration de données (Informatique)
653			|a information science / aat
653			|a Information visualization / fast
700	1		|a O'Neil, Cathy
041	0	7	|a eng |2 ISO 639-2
989			|b OREILLY |a O'Reilly
776			|z 9781449358655
856	4	0	|u https://learning.oreilly.com/library/view/~/9781449363871/?ar |x Verlag |3 Volltext
082	0		|a 500
082	0		|a 006.3
520			|a Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that's so clouded in hype? This insightful book, based on Columbia University's Introduction to Data Science class, tells you what you need to know. In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you're familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science. Topics include: statistical inference, exploratory data analysis, and the data science process; algorithms; spam filters, Naive Bayes, and data wrangling; logistic regression; financial modeling; recommendation engines and causality; data visualization; social networks and data journalism; data engineering, MapReduce, Pregel, and Hadoop. Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O'Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.


Similar Items