
Large Language Model-Based Solutions
How to Deliver Value with Cost-Effective Generative AI Applications
About this book
Learn to build cost-effective apps using Large Language Models
In Large Language Model-Based Solutions: How to Deliver Value with Cost-Effective Generative AI Applications, Shreyas Subramanian, Principal Data Scientist at Amazon Web Services, delivers a practical guide for developers and data scientists who want to build and deploy cost-effective large language model (LLM)-based solutions. The book covers a wide range of key topics, including model selection, pre- and post-processing of data, prompt engineering, and instruction fine-tuning.
The author explains techniques for optimizing inference, such as model quantization and pruning, and describes affordable architectures for typical generative AI (GenAI) applications, including search systems, agent assists, and autonomous agents. You'll also find:
- Effective strategies to address the challenge of the high computational cost associated with LLMs
- Assistance with the complexities of building and deploying affordable generative AI apps, including tuning and inference techniques
- Selection criteria for choosing a model, with particular consideration given to compact, nimble, and domain-specific models
Perfect for developers and data scientists interested in deploying foundational models, or business leaders planning to scale out their use of GenAI, Large Language Model-Based Solutions will also benefit project leaders and managers, technical support staff, and administrators with an interest or stake in the subject.
Table of contents
- Cover
- Table of Contents
- Title Page
- Introduction
- 1 Introduction
- 2 Tuning Techniques for Cost Optimization
- 3 Inference Techniques for Cost Optimization
- 4 Model Selection and Alternatives
- 5 Infrastructure and Deployment Tuning Strategies
- Conclusion
- Index
- Copyright
- Dedication
- About the Author
- About the Technical Editor
- End User License Agreement