Practical lakehouse architecture

This concise yet comprehensive guide explains how to adopt a data lakehouse architecture to implement modern data platforms. It reviews the design considerations, challenges, and best practices for implementing a lakehouse and provides key insights into the ways that using a lakehouse can impact your data platform, from managing structured and unstructured data and supporting BI and AI/ML use cases to enabling more rigorous data governance and security measures.

Bibliographic Details
Main Author: Thalpati, Gaurav Ashok
Format: eBook
Language: English
Published: Sebastopol, CA: O'Reilly Media, Inc., 2024
Edition: First edition
Collection: O'Reilly - Collection details see MPG.ReNa
LEADER 04576nmm a2200373 u 4500
001 EB002221872
003 EBX01000000000000001358832
005 00000000000000.0
007 cr|||||||||||||||||||||
008 240801 ||| eng
050 4 |a TK5105.86 
100 1 |a Thalpati, Gaurav Ashok 
245 0 0 |a Practical lakehouse architecture  |c by Gaurav Ashok Thalpati 
250 |a First edition 
260 |a Sebastopol, CA  |b O'Reilly Media, Inc.  |c 2024 
300 |a 250 pages  |b illustrations 
505 0 |a Intro -- Copyright -- Table of Contents -- Preface -- Who Should Read This Book? -- Why I Wrote This Book -- Navigating This Book -- O'Reilly Online Learning -- Conventions Used in This Book -- How to Contact Us -- Acknowledgments -- Chapter 1. Introduction to Lakehouse Architecture -- Understanding Data Architecture -- What Is Data Architecture? -- How Does Data Architecture Help Build a Data Platform? -- Core Components of a Data Platform -- Why Do We Need a New Data Architecture? -- Lakehouse Architecture: A New Pattern -- The Lakehouse: Best of Both Worlds 
505 0 |a Understanding Lakehouse Architecture -- Lakehouse Architecture Characteristics -- Lakehouse Architecture Benefits -- Key Takeaways -- References -- Chapter 2. Traditional Architectures and Modern Data Platforms -- Traditional Architectures: Data Lakes and Data Warehouses -- Data Warehouse Fundamentals -- Data Lake Fundamentals -- Modern Data Platforms -- Finding Answers in the Cloud -- Standalone Approach -- Combined Approach -- Expectations of Modern Data Platforms -- Comparison: Data Warehouse, Data Lake, Lakehouse -- Capabilities and Limitations -- Implementation Activities 
505 0 |a Administration and Management -- Business Outcomes -- Lakehouse Architecture: The Default Choice for Future Data Platforms? -- Key Takeaways -- References -- Chapter 3. Storage: The Heart of the Lakehouse -- Lakehouse Storage: Key Concepts -- Row Versus Columnar Storage -- Storage-based Performance Optimization -- Lakehouse Storage Components -- Cloud Object Storage -- File Formats -- Table Formats -- Key Design Considerations -- Ecosystem Support -- Community Support -- Supported File Formats -- Supported Compute Engines -- Supported Features -- Commercial Product Support 
505 0 |a Current and Future Versions -- Performance Benchmarking -- Comparisons -- Sharing Features -- Key Takeaways -- References -- Chapter 4. Data Catalogs -- Understanding Metadata -- Technical Metadata -- Business Metadata -- How Metastores and Data Catalogs Work Together -- Features of a Data Catalog -- Search, Explore, and Discover Data -- Data Classification -- Data Governance and Security -- Data Lineage -- Unified Data Catalog -- Challenges of Siloed Metadata Management -- What Is a Unified Data Catalog? -- Benefits of a Unified Data Catalog 
505 0 |a Implementing a Data Catalog: Key Design Considerations and Options -- Using Hive metastore -- Using AWS Services -- Using Azure Services -- Using GCP Services -- Using Databricks -- Key Takeaways -- References -- Chapter 5. Compute Engines for Lakehouse Architectures -- Data Computation Benefits of Lakehouse Architecture -- Independent Scaling -- Cross-region, Cross-account Access -- Unified Batch and Real-Time Processing -- Enhanced BI Performance -- Freedom to Choose Different Engine Types -- Cross-zone Analysis -- Compute Engine Options for Lakehouse Platforms -- Open Source Tools 
653 |a Storage area networks (Computer networks) / http://id.loc.gov/authorities/subjects/sh2001003093 
653 |a Cloud computing / http://id.loc.gov/authorities/subjects/sh2008004883 
653 |a Réseaux de stockage (Informatique) 
653 |a Computer network architectures / http://id.loc.gov/authorities/subjects/sh86007468 
653 |a Infonuagique 
653 |a Réseaux d'ordinateurs / Architectures 
041 0 7 |a eng  |2 ISO 639-2 
989 |b OREILLY  |a O'Reilly 
776 |z 9781098153014 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781098153007/?ar  |x Verlag  |3 Volltext 
082 0 |a 331 
082 0 |a 004.6 
520 |a This concise yet comprehensive guide explains how to adopt a data lakehouse architecture to implement modern data platforms. It reviews the design considerations, challenges, and best practices for implementing a lakehouse and provides key insights into the ways that using a lakehouse can impact your data platform, from managing structured and unstructured data and supporting BI and AI/ML use cases to enabling more rigorous data governance and security measures.