Advanced data analytics using Python with architectural patterns, text and image classification, and optimization techniques

Understand advanced data analytics concepts such as time series and principal component analysis with ETL, supervised learning, and PySpark using Python. This book covers architectural patterns in data analytics, text and image classification, optimization techniques, natural language processing, an...


Bibliographic Details
Main Authors: Mukhopadhyay, Sayan; Samanta, Pratip (Author)
Format: eBook
Language: English
Published: New York, NY Apress 2023
Edition: Second edition
Collection: O'Reilly - Collection details see MPG.ReNa
Table of Contents:
  • Intro
  • Table of Contents
  • About the Authors
  • About the Technical Reviewer
  • Acknowledgments
  • Introduction
  • Chapter 1: A Birds Eye View to AI System
  • OOP in Python
  • Calling Other Languages in Python
  • Exposing the Python Model as a Microservice
  • High-Performance API and Concurrent Programming
  • Choosing the Right Database
  • Summary
  • Chapter 2: ETL with Python
  • MySQL
  • How to Install MySQLdb?
  • Database Connection
  • INSERT Operation
  • READ Operation
  • DELETE Operation
  • UPDATE Operation
  • COMMIT Operation
  • ROLL-BACK Operation
  • Normal Forms
  • First Normal Form
  • Second Normal Form
  • Third Normal Form
  • Elasticsearch
  • Connection Layer API
  • Neo4j Python Driver
  • neo4j-rest-client
  • In-Memory Database
  • MongoDB (Python Edition)
  • Import Data into the Collection
  • Create a Connection Using pymongo
  • Access Database Objects
  • Insert Data
  • Update Data
  • Remove Data
  • Cloud Databases
  • Pandas
  • ETL with Python (Unstructured Data)
  • Email Parsing
  • Topical Crawling
  • Crawling Algorithms
  • Summary
  • Chapter 3: Feature Engineering and Supervised Learning
  • Dimensionality Reduction with Python
  • Correlation Analysis
  • Principal Component Analysis
  • Mutual Information
  • Classifications with Python
  • Semi-Supervised Learning
  • Decision Tree
  • Which Attribute Comes First?
  • Random Forest Classifier
  • Naïve Bayes Classifier
  • Support Vector Machine
  • Nearest Neighbor Classifier
  • Sentiment Analysis
  • Image Recognition
  • Regression with Python
  • Least Square Estimation
  • Logistic Regression
  • Classification and Regression
  • Intentionally Bias the Model to Over-Fit or Under-Fit
  • Dealing with Categorical Data
  • Summary
  • Chapter 4: Unsupervised Learning: Clustering
  • K-Means Clustering
  • Choosing K: The Elbow Method
  • Silhouette Analysis
  • Distance or Similarity Measure
  • Properties
  • General and Euclidean Distance
  • Squared Euclidean Distance
  • Distance Between String-Edit Distance
  • Levenshtein Distance
  • Needleman-Wunsch Algorithm
  • Similarity in the Context of a Document
  • Types of Similarity
  • Example of K-Means in Images
  • Preparing the Cluster
  • Thresholding
  • Time to Cluster
  • Revealing the Current Cluster
  • Hierarchical Clustering
  • Bottom-Up Approach
  • Distance Between Clusters
  • Single Linkage Method
  • Complete Linkage Method