Data Ingestion with Python Cookbook
eBook - ePub

Data Ingestion with Python Cookbook

A practical guide to ingesting, monitoring, and identifying errors in the data ingestion process

  1. 414 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android
eBook - ePub

Data Ingestion with Python Cookbook

A practical guide to ingesting, monitoring, and identifying errors in the data ingestion process

About this book

Deploy your data ingestion pipeline, orchestrate, and monitor efficiently to prevent loss of data and quality

Key Features

  • Harness best practices to create a Python and PySpark data ingestion pipeline
  • Seamlessly automate and orchestrate your data pipelines using Apache Airflow
  • Build a monitoring framework by integrating the concept of data observability into your pipelines

Book Description

Data Ingestion with Python Cookbook offers a practical approach to designing and implementing data ingestion pipelines. It presents real-world examples with the most widely recognized open source tools on the market to answer commonly asked questions and overcome challenges.You'll be introduced to designing and working with or without data schemas, as well as creating monitored pipelines with Airflow and data observability principles, all while following industry best practices. The book also addresses challenges associated with reading different data sources and data formats. As you progress through the book, you'll gain a broader understanding of error logging best practices, troubleshooting techniques, data orchestration, monitoring, and storing logs for further consultation.By the end of the book, you'll have a fully automated set that enables you to start ingesting and monitoring your data pipeline effortlessly, facilitating seamless integration with subsequent stages of the ETL process.

What you will learn

  • Implement data observability using monitoring tools
  • Automate your data ingestion pipeline
  • Read analytical and partitioned data, whether schema or non-schema based
  • Debug and prevent data loss through efficient data monitoring and logging
  • Establish data access policies using a data governance framework
  • Construct a data orchestration framework to improve data quality

Who this book is for

This book is for data engineers and data enthusiasts seeking a comprehensive understanding of the data ingestion process using popular tools in the open source community. For more advanced learners, this book takes on the theoretical pillars of data governance while providing practical examples of real-world scenarios commonly encountered by data engineers.

]]>

Frequently asked questions

Yes, you can cancel anytime from the Subscription tab in your account settings on the Perlego website. Your subscription will stay active until the end of your current billing period. Learn how to cancel your subscription.
At the moment all of our mobile-responsive ePub books are available to download via the app. Most of our PDFs are also available to download and we're working on making the final remaining ones downloadable now. Learn more here.
Perlego offers two plans: Essential and Complete
  • Essential is ideal for learners and professionals who enjoy exploring a wide range of subjects. Access the Essential Library with 800,000+ trusted titles and best-sellers across business, personal growth, and the humanities. Includes unlimited reading time and Standard Read Aloud voice.
  • Complete: Perfect for advanced learners and researchers needing full, unrestricted access. Unlock 1.4M+ books across hundreds of subjects, including academic and specialized titles. The Complete Plan also includes advanced features like Premium Read Aloud and Research Assistant.
Both plans are available with monthly, semester, or annual billing cycles.
We are an online textbook subscription service, where you can get access to an entire online library for less than the price of a single book per month. With over 1 million books across 1000+ topics, we’ve got you covered! Learn more here.
Look out for the read-aloud symbol on your next book to see if you can listen to it. The read-aloud tool reads text aloud for you, highlighting the text as it is being read. You can pause it, speed it up and slow it down. Learn more here.
Yes! You can use the Perlego app on both iOS or Android devices to read anytime, anywhere — even offline. Perfect for commutes or when you’re on the go.
Please note we cannot support devices running on iOS 13 and Android 7 or earlier. Learn more about using the app.
Yes, you can access Data Ingestion with Python Cookbook by Glaucia Esppenchutz in PDF and/or ePUB format, as well as other popular books in Computer Science & Data Modelling & Design. We have over one million books available in our catalogue for you to explore.

Table of contents

  1. Data Ingestion with Python Cookbook
  2. Contributors
  3. Preface
  4. Part 1: Fundamentals of Data Ingestion
  5. 1
  6. 2
  7. 3
  8. 4
  9. 5
  10. 6
  11. 7
  12. Part 2: Structuring the Ingestion Pipeline
  13. 8
  14. 9
  15. 10
  16. 11
  17. 12
  18. Index
  19. Other Books You May Enjoy