LEADER |
06093nmm a2200469 u 4500 |
001 |
EB002187541 |
003 |
EBX01000000000000001325026 |
005 |
00000000000000.0 |
007 |
cr||||||||||||||||||||| |
008 |
231204 ||| eng |
020 |
|
|
|a 9781098159191
|
050 |
|
4 |
|a Q336
|
100 |
1 |
|
|a Fregly, Chris
|
245 |
0 |
0 |
|a Generative AI on AWS
|b Building Context-Aware Multimodal Reasoning Applications
|c Chris Fregly, Antje Barth & Shelbee Eigenbrode
|
250 |
|
|
|a First edition
|
260 |
|
|
|a [Sebastopol, California]
|b O'Reilly Media, Inc.
|c 2023
|
300 |
|
|
|a 1 online resource
|
505 |
0 |
|
|a Cover -- Copyright -- Table of Contents -- Preface -- Conventions Used in This Book -- Using Code Examples -- O'Reilly Online Learning -- How to Contact Us -- Acknowledgments -- Chris -- Antje -- Shelbee -- Chapter 1. Generative AI Use Cases, Fundamentals, and Project Life Cycle -- Use Cases and Tasks -- Foundation Models and Model Hubs -- Generative AI Project Life Cycle -- Generative AI on AWS -- Why Generative AI on AWS? -- Building Generative AI Applications on AWS -- Summary -- Chapter 2. Prompt Engineering and In-Context Learning -- Prompts and Completions -- Tokens -- Prompt Engineering -- Prompt Structure -- Instruction -- Context -- In-Context Learning with Few-Shot Inference -- Zero-Shot Inference -- One-Shot Inference -- Few-Shot Inference -- In-Context Learning Gone Wrong -- In-Context Learning Best Practices -- Prompt-Engineering Best Practices -- Inference Configuration Parameters -- Summary -- Chapter 3. Large-Language Foundation Models --
|
505 |
0 |
|
|a Large-Language Foundation Models -- Tokenizers -- Embedding Vectors -- Transformer Architecture -- Inputs and Context Window -- Embedding Layer -- Encoder -- Self-Attention -- Decoder -- Softmax Output -- Types of Transformer-Based Foundation Models -- Pretraining Datasets -- Scaling Laws -- Compute-Optimal Models -- Summary -- Chapter 4. Memory and Compute Optimizations -- Memory Challenges -- Data Types and Numerical Precision -- Quantization -- fp16 -- bfloat16 -- fp8 -- int8 -- Optimizing the Self-Attention Layers -- FlashAttention -- Grouped-Query Attention -- Distributed Computing -- Distributed Data Parallel -- Fully Sharded Data Parallel -- Performance Comparison of FSDP over DDP -- Distributed Computing on AWS -- Fully Sharded Data Parallel with Amazon SageMaker -- AWS Neuron SDK and AWS Trainium -- Summary -- Chapter 5. Fine-Tuning and Evaluation -- Instruction Fine-Tuning -- Llama 2-Chat -- Falcon-Chat -- FLAN-T5 -- Instruction Dataset -- Multitask Instruction Dataset --
|
505 |
0 |
|
|a FLAN: Example Multitask Instruction Dataset -- Prompt Template -- Convert a Custom Dataset into an Instruction Dataset -- Instruction Fine-Tuning -- Amazon SageMaker Studio -- Amazon SageMaker JumpStart -- Amazon SageMaker Estimator for Hugging Face -- Evaluation -- Evaluation Metrics -- Benchmarks and Datasets -- Summary -- Chapter 6. Parameter-Efficient Fine-Tuning -- Full Fine-Tuning Versus PEFT -- LoRA and QLoRA -- LoRA Fundamentals -- Rank -- Target Modules and Layers -- Applying LoRA -- Merging LoRA Adapter with Original Model -- Maintaining Separate LoRA Adapters -- Full Fine-Tuning Versus LoRA Performance -- QLoRA -- Prompt Tuning and Soft Prompts -- Summary -- Chapter 7. Fine-Tuning with Reinforcement Learning from Human Feedback -- Human Alignment: Helpful, Honest, and Harmless -- Reinforcement Learning Overview -- Train a Custom Reward Model -- Collect Training Dataset with Human-in-the-Loop -- Sample Instructions for Human Labelers
|
653 |
|
|
|a Intelligence artificielle / Logiciels
|
653 |
|
|
|a Logiciels d'application
|
653 |
|
|
|a artificial intelligence / aat
|
653 |
|
|
|a Application software / http://id.loc.gov/authorities/subjects/sh90001980
|
653 |
|
|
|a Amazon Web Services (Firm) / fast
|
653 |
|
|
|a Artificial intelligence / http://id.loc.gov/authorities/subjects/sh85008180
|
653 |
|
|
|a Application software / fast
|
653 |
|
|
|a Artificial intelligence / Computer programs / http://id.loc.gov/authorities/subjects/sh85008181
|
653 |
|
|
|a Amazon Web Services (Firm) / http://id.loc.gov/authorities/names/no2015140713
|
653 |
|
|
|a Artificial intelligence / Computer programs / fast
|
653 |
|
|
|a Intelligence artificielle
|
700 |
1 |
|
|a Barth, Antje
|e author
|
700 |
1 |
|
|a Eigenbrode, Shelbee
|e author
|
041 |
0 |
7 |
|a eng
|2 ISO 639-2
|
989 |
|
|
|b OREILLY
|a O'Reilly
|
776 |
|
|
|z 9781098159221
|
776 |
|
|
|z 1098159195
|
776 |
|
|
|z 1098159225
|
776 |
|
|
|z 9781098159191
|
856 |
4 |
0 |
|u https://learning.oreilly.com/library/view/~/9781098159214/?ar
|x Verlag
|3 Volltext
|
082 |
0 |
|
|a 006.3
|
520 |
|
|
|a Companies today are moving rapidly to integrate generative AI into their products and services. But there's a great deal of hype (and misunderstanding) about the impact and promise of this technology. With this book, Chris Fregly, Antje Barth, and Shelbee Eigenbrode from AWS help CTOs, ML practitioners, application developers, business analysts, data engineers, and data scientists find practical ways to use this exciting new technology. You'll learn the generative AI project life cycle, including use case definition, model selection, model fine-tuning, retrieval-augmented generation, reinforcement learning from human feedback, and model quantization, optimization, and deployment. And you'll explore different types of models, including large language models (LLMs) and multimodal models such as Stable Diffusion for generating images and Flamingo/IDEFICS for answering questions about images. -- Apply generative AI to your business use cases -- Determine which generative AI models are best suited to your task -- Perform prompt engineering and in-context learning -- Fine-tune generative AI models on your datasets with low-rank adaptation (LoRA) -- Align generative AI models to human values with reinforcement learning from human feedback (RLHF) -- Augment your model with retrieval-augmented generation (RAG) -- Explore libraries such as LangChain and ReAct to develop agents and actions -- Build generative AI applications with Amazon Bedrock
|