Generative AI on AWS: Building Context-Aware Multimodal Reasoning Applications

Bibliographic Details
Main Authors: Fregly, Chris, Barth, Antje (Author), Eigenbrode, Shelbee (Author)
Format: eBook
Language: English
Published: [Sebastopol, California]: O'Reilly Media, Inc., 2023
Edition: First edition
Collection: O'Reilly - Collection details see MPG.ReNa
LEADER 06093nmm a2200469 u 4500
001 EB002187541
003 EBX01000000000000001325026
005 00000000000000.0
007 cr|||||||||||||||||||||
008 231204 ||| eng
020 |a 9781098159191 
050 4 |a Q336 
100 1 |a Fregly, Chris 
245 0 0 |a Generative AI on AWS  |b Building Context-Aware Multimodal Reasoning Applications  |c Chris Fregly, Antje Barth & Shelbee Eigenbrode 
250 |a First edition 
260 |a [Sebastopol, California]  |b O'Reilly Media, Inc.  |c 2023 
300 |a 1 online resource 
505 0 |a Cover -- Copyright -- Table of Contents -- Preface -- Conventions Used in This Book -- Using Code Examples -- O'Reilly Online Learning -- How to Contact Us -- Acknowledgments -- Chris -- Antje -- Shelbee -- Chapter 1. Generative AI Use Cases, Fundamentals, and Project Life Cycle -- Use Cases and Tasks -- Foundation Models and Model Hubs -- Generative AI Project Life Cycle -- Generative AI on AWS -- Why Generative AI on AWS? -- Building Generative AI Applications on AWS -- Summary -- Chapter 2. Prompt Engineering and In-Context Learning -- Prompts and Completions -- Tokens -- Prompt Engineering -- Prompt Structure -- Instruction -- Context -- In-Context Learning with Few-Shot Inference -- Zero-Shot Inference -- One-Shot Inference -- Few-Shot Inference -- In-Context Learning Gone Wrong -- In-Context Learning Best Practices -- Prompt-Engineering Best Practices -- Inference Configuration Parameters -- Summary -- Chapter 3. Large-Language Foundation Models --  
505 0 |a Large-Language Foundation Models -- Tokenizers -- Embedding Vectors -- Transformer Architecture -- Inputs and Context Window -- Embedding Layer -- Encoder -- Self-Attention -- Decoder -- Softmax Output -- Types of Transformer-Based Foundation Models -- Pretraining Datasets -- Scaling Laws -- Compute-Optimal Models -- Summary -- Chapter 4. Memory and Compute Optimizations -- Memory Challenges -- Data Types and Numerical Precision -- Quantization -- fp16 -- bfloat16 -- fp8 -- int8 -- Optimizing the Self-Attention Layers -- FlashAttention -- Grouped-Query Attention -- Distributed Computing -- Distributed Data Parallel -- Fully Sharded Data Parallel -- Performance Comparison of FSDP over DDP -- Distributed Computing on AWS -- Fully Sharded Data Parallel with Amazon SageMaker -- AWS Neuron SDK and AWS Trainium -- Summary -- Chapter 5. Fine-Tuning and Evaluation -- Instruction Fine-Tuning -- Llama 2-Chat -- Falcon-Chat -- FLAN-T5 -- Instruction Dataset -- Multitask Instruction Dataset --  
505 0 |a FLAN: Example Multitask Instruction Dataset -- Prompt Template -- Convert a Custom Dataset into an Instruction Dataset -- Instruction Fine-Tuning -- Amazon SageMaker Studio -- Amazon SageMaker JumpStart -- Amazon SageMaker Estimator for Hugging Face -- Evaluation -- Evaluation Metrics -- Benchmarks and Datasets -- Summary -- Chapter 6. Parameter-Efficient Fine-Tuning -- Full Fine-Tuning Versus PEFT -- LoRA and QLoRA -- LoRA Fundamentals -- Rank -- Target Modules and Layers -- Applying LoRA -- Merging LoRA Adapter with Original Model -- Maintaining Separate LoRA Adapters -- Full Fine-Tuning Versus LoRA Performance -- QLoRA -- Prompt Tuning and Soft Prompts -- Summary -- Chapter 7. Fine-Tuning with Reinforcement Learning from Human Feedback -- Human Alignment: Helpful, Honest, and Harmless -- Reinforcement Learning Overview -- Train a Custom Reward Model -- Collect Training Dataset with Human-in-the-Loop -- Sample Instructions for Human Labelers 
653 |a Intelligence artificielle / Logiciels 
653 |a Logiciels d'application 
653 |a artificial intelligence / aat 
653 |a Application software / http://id.loc.gov/authorities/subjects/sh90001980 
653 |a Amazon Web Services (Firm) / fast 
653 |a Artificial intelligence / http://id.loc.gov/authorities/subjects/sh85008180 
653 |a Application software / fast 
653 |a Artificial intelligence / Computer programs / http://id.loc.gov/authorities/subjects/sh85008181 
653 |a Amazon Web Services (Firm) / http://id.loc.gov/authorities/names/no2015140713 
653 |a Artificial intelligence / Computer programs / fast 
653 |a Intelligence artificielle 
700 1 |a Barth, Antje  |e author 
700 1 |a Eigenbrode, Shelbee  |e author 
041 0 7 |a eng  |2 ISO 639-2 
989 |b OREILLY  |a O'Reilly 
776 |z 9781098159221 
776 |z 1098159195 
776 |z 1098159225 
776 |z 9781098159191 
856 4 0 |u https://learning.oreilly.com/library/view/~/9781098159214/?ar  |x Verlag  |3 Volltext 
082 0 |a 006.3 
520 |a Companies today are moving rapidly to integrate generative AI into their products and services. But there's a great deal of hype (and misunderstanding) about the impact and promise of this technology. With this book, Chris Fregly, Antje Barth, and Shelbee Eigenbrode from AWS help CTOs, ML practitioners, application developers, business analysts, data engineers, and data scientists find practical ways to use this exciting new technology. You'll learn the generative AI project life cycle, including use case definition, model selection, model fine-tuning, retrieval-augmented generation, reinforcement learning from human feedback, and model quantization, optimization, and deployment. And you'll explore different types of models, including large language models (LLMs) and multimodal models such as Stable Diffusion for generating images and Flamingo/IDEFICS for answering questions about images. Apply generative AI to your business use cases. Determine which generative AI models are best suited to your task. Perform prompt engineering and in-context learning. Fine-tune generative AI models on your datasets with low-rank adaptation (LoRA). Align generative AI models to human values with reinforcement learning from human feedback (RLHF). Augment your model with retrieval-augmented generation (RAG). Explore libraries such as LangChain and ReAct to develop agents and actions. Build generative AI applications with Amazon Bedrock.