Advanced data analytics using Python with architectural patterns, text and image classification, and optimization techniques

Understand advanced data analytics concepts such as time series and principal component analysis with ETL, supervised learning, and PySpark using Python. This book covers architectural patterns in data analytics, text and image classification, optimization techniques, natural language processing, an...

Bibliographic Details
Main Authors: Mukhopadhyay, Sayan; Samanta, Pratip (Author)
Format: eBook
Language: English
Published: New York, NY Apress 2023
Edition: Second edition
Series: ITpro collection
Collection: O'Reilly - Collection details see MPG.ReNa
LEADER 05306nmm a2200529 u 4500
001 EB002138036
003 EBX01000000000000001276163
005 00000000000000.0
007 cr|||||||||||||||||||||
008 230102 ||| eng
020 |a 9781484280058 
050 4 |a QA76.73.P98 
100 1 |a Mukhopadhyay, Sayan 
245 0 0 |a Advanced data analytics using Python  |b with architectural patterns, text and image classification, and optimization techniques  |c Sayan Mukhopadhyay, Pratip Samanta 
250 |a Second edition 
260 |a New York, NY  |b Apress  |c 2023 
300 |a 259 pages  |b illustrations 
505 0 |a Intro -- Table of Contents -- About the Authors -- About the Technical Reviewer -- Acknowledgments -- Introduction -- Chapter 1: A Birds Eye View to AI System -- OOP in Python -- Calling Other Languages in Python -- Exposing the Python Model as a Microservice -- High-Performance API and Concurrent Programming -- Choosing the Right Database -- Summary -- Chapter 2: ETL with Python -- MySQL -- How to Install MySQLdb? -- Database Connection -- INSERT Operation -- READ Operation -- DELETE Operation -- UPDATE Operation -- COMMIT Operation -- ROLL-BACK Operation -- Normal Forms 
505 0 |a First Normal Form -- Second Normal Form -- Third Normal Form -- Elasticsearch -- Connection Layer API -- Neo4j Python Driver -- neo4j-rest-client -- In-Memory Database -- MongoDB (Python Edition) -- Import Data into the Collection -- Create a Connection Using pymongo -- Access Database Objects -- Insert Data -- Update Data -- Remove Data -- Cloud Databases -- Pandas -- ETL with Python (Unstructured Data) -- Email Parsing -- Topical Crawling -- Crawling Algorithms -- Summary -- Chapter 3: Feature Engineering and Supervised Learning -- Dimensionality Reduction with Python -- Correlation Analysis 
505 0 |a Principal Component Analysis -- Mutual Information -- Classifications with Python -- Semi-Supervised Learning -- Decision Tree -- Which Attribute Comes First? -- Random Forest Classifier -- Naïve Bayes Classifier -- Support Vector Machine -- Nearest Neighbor Classifier -- Sentiment Analysis -- Image Recognition -- Regression with Python -- Least Square Estimation -- Logistic Regression -- Classification and Regression -- Intentionally Bias the Model to Over-Fit or Under-Fit -- Dealing with Categorical Data -- Summary -- Chapter 4: Unsupervised Learning: Clustering -- K-Means Clustering 
505 0 |a Choosing K: The Elbow Method -- Silhouette Analysis -- Distance or Similarity Measure -- Properties -- General and Euclidean Distance -- Squared Euclidean Distance -- Distance Between String-Edit Distance -- Levenshtein Distance -- Needleman-Wunsch Algorithm -- Similarity in the Context of a Document -- Types of Similarity -- Example of K-Means in Images -- Preparing the Cluster -- Thresholding -- Time to Cluster -- Revealing the Current Cluster -- Hierarchical Clustering -- Bottom-Up Approach -- Distance Between Clusters -- Single Linkage Method -- Complete Linkage Method 
505 0 |a Includes bibliographical references 
653 |a Data mining / fast 
653 |a Python (Computer program language) / fast 
653 |a Python (Computer program language) / http://id.loc.gov/authorities/subjects/sh96008834 
653 |a Programmation (Informatique) 
653 |a Machine learning / fast 
653 |a Apprentissage automatique 
653 |a Data mining / http://id.loc.gov/authorities/subjects/sh97002073 
653 |a Python (Langage de programmation) 
653 |a computer programming / aat 
653 |a Machine learning / http://id.loc.gov/authorities/subjects/sh85079324 
653 |a Computer programming / http://id.loc.gov/authorities/subjects/sh85107310 
653 |a Exploration de données (Informatique) 
700 1 |a Samanta, Pratip  |e author 
041 0 7 |a eng  |2 ISO 639-2 
989 |b OREILLY  |a O'Reilly 
490 0 |a ITpro collection 
500 |a Includes index 
028 5 0 |a 10.1007/978-1-4842-8005-8 
776 |z 9781484280041 
776 |z 1484280040 
776 |z 1484280059 
776 |z 9781484280058 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781484280058/?ar  |x Verlag  |3 Volltext 
082 0 |a 005.13/3 
520 |a Understand advanced data analytics concepts such as time series and principal component analysis with ETL, supervised learning, and PySpark using Python. This book covers architectural patterns in data analytics, text and image classification, optimization techniques, natural language processing, and computer vision in the cloud environment. Generic design patterns in Python programming are clearly explained, emphasizing architectural practices such as hot potato anti-patterns. You'll review recent advances in databases such as Neo4j, Elasticsearch, and MongoDB. You'll then study feature engineering in images and texts while implementing business logic and see how to build machine learning and deep learning models using transfer learning. Advanced Data Analytics Using Python, 2nd edition features a chapter on clustering with a neural network, regularization techniques, and algorithmic design patterns in data analytics with reinforcement learning. Finally, the recommender system in PySpark explains how to optimize models for a specific application.