eBook - ePub

Repeatability, Reliability, and Scalability through GitOps

Name: Repeatability, Reliability, and Scalability through GitOps
ISBN: 9781801074315

Bryan Feuling

292 pages
English

About this book

Learn how to best use GitOps to automate manual tasks in the continuous delivery and deployment process

Key Features

• Explore the different GitOps schools of thought and understand which GitOps practices will work for you and your team
• Get up and running with the fundamentals of GitOps implementation
• Understand how to effectively automate the deployment and delivery process

Book Description

The world of software delivery and deployment has come a long way in the last few decades. From waterfall methods to Agile practices, every company that develops its own software has to overcome various challenges in delivery and deployment to meet customer and market demands. This book will guide you through common industry practices for software delivery and deployment. Throughout the book, you'll follow the journey of a DevOps team that matures their software release process from quarterly deployments to continuous delivery using GitOps. With the help of hands-on tutorials, projects, and self-assessment questions, you'll build your knowledge of GitOps basics, different types of GitOps practices, and how to decide which GitOps practice is the best for your company. As you progress, you'll cover everything from building declarative language files to the pitfalls in performing continuous deployment with GitOps. By the end of this book, you'll be well-versed with the fundamentals of delivery and deployment, the different schools of GitOps, and how to best leverage GitOps in your teams.

What you will learn

• Explore a variety of common industry tools for GitOps
• Understand continuous deployment, continuous delivery, and why they are important
• Gain a practical understanding of using GitOps as an engineering organization
• Become well-versed with using GitOps and Kubernetes together
• Leverage Git events for automated deployments
• Implement GitOps best practices and find out how to avoid GitOps pitfalls

Who this book is for

This book is for engineering leaders and anyone working in software engineering, DevOps, SRE, build/release, or cloud automation teams. A basic understanding of the DevOps software development life cycle (SDLC) will help you to get the most out of this book.

Information

Publisher

Packt Publishing

Year

2021

eBook ISBN

9781801074315

Topic

Computer Science

Subtopic

Enterprise Applications

Section 1: Fundamentals of GitOps

This section is designed to provide a foundation on which you can build your best practices for continuous deployment, continuous delivery, and GitOps.

This section comprises the following chapters:

Chapter 1, The Fundamentals of Delivery and Deployment
Chapter 2, Exploring Common Industry Delivery and Deployment Practices
Chapter 3, The "What" and "Why" of GitOps

Chapter 1: The Fundamentals of Delivery and Deployment

Any company that builds and maintains applications is automatically concerned with repeatability, reliability, and scalability. In fact, some of the main metrics that are monitored on an application are directly related to these operational concerns. To achieve this trifecta of software administration, it is paramount to understand the basics and history of the industry and to learn from the issues of the past.

In this chapter, and throughout this book, you will embark on a journey with a DevOps team as they attempt to conquer the deployment and delivery world. By experiencing the pains, bottlenecks, and setbacks with the DevOps team, you will understand how the industry has evolved, and what needs to be accomplished in order to succeed.

In this chapter, we're going to cover the following main topics:

How did we get here?
What is a deployment process?
What is a delivery process?
What makes any practice continuous?

How did we get here?

It's 8 a.m. on a Saturday and the release party's post-mortem has finally been completed. Throughout the release, every encountered issue resulted in a Root Cause Analysis (RCA) process. Once each of the RCAs was done, the release team would then create and assign tickets as needed, resulting in action items for the different teams in the Engineering organization. With the post-mortem completed, the release team can hand off the monitoring of the production application to the weekend support team and head home.

The final production servers were upgraded with the new application release at around 3 a.m. that morning, with all of the application health checks successfully passing by 3:30 a.m. And yet, even with the early morning finishing time, this is a significant improvement compared with the release parties of a few years ago. Previously, the applications were released every 6 to 12 months, rather than the quarterly release cadence that the company is currently on.

Their company had hired a consulting agency to advise them on how to improve their application's mean-time-to-market and reduce their production outages in order to meet business initiatives and demands. The outcome suggested by this consulting agency was to release the application more frequently than once or twice a year. As a result, the releases have been quicker and less prone to error, which the business has taken notice of. The release parties still require pulling an all-nighter, but the previous release parties were more like all-weekenders or longer.

The on-call engineering team still has to be brought in for every release, but at least they aren't required to be a part of the release party for the entire time. And the most recent release only required a conference bridge for about 4 hours to solve issues with the underlying code or provide quick fixes. Overall, the operations team, infrastructure team, network team, and security team were able to solve most of the issues that showed up, which accounted for significantly more confidence in the newer release cadence.

The different teams should be able to work through their backlog of issues before the next release. But the team with the largest issue backlog was the systems administrators, who build, integrate, administer, and troubleshoot the many different tools used during the releases.

After 12 straight hours with over 15 members across a host of different teams, the release party was complete. When considering the time associated with the attempt to improve the process throughout the quarter, as well as the actual release itself, it is not difficult to run the mental math on the associated costs. The teams need to figure out a way to make the releases more reliable, repeatable, and scalable.

This scenario is all too familiar to many who were involved in the engineering side of a business during the Waterfall software development life cycle days. When applications were first made available as a Software as a Service (SaaS) solution, the common release cadence was an annual release. Throughout the year, a company would deploy small releases, often called patches, which mainly consisted of hardware, software, or security updates.

Since the yearly update was essentially releasing a brand-new product, the release process required significant involvement from every team across the entire engineering organization. The release was a major event, often taking an entire weekend, or longer, from every team available. Many in the industry had dubbed this event a release party. Each release party included significant amounts of caffeine and food, which accompanied a host of people hunched over their laptops as they watched the output of the release on a massive projector screen. Yet the worst part of this whole scenario was that this was the expected release style for every company at the time.

The quarterly release cadence was a novel idea that revolutionized how companies would develop and test their code. The code changes were smaller in nature and the teams evolved their thinking from a new product every year to a new subversion every quarter. Some user experience changes may be introduced, but most of the user experience in the application would remain the same from release to release. Another major benefit to the increased release frequency was the significant reduction in lead time, which is the time it takes to go from a feature being requested to being available in production.
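To make the lead time definition concrete, here is a minimal Python sketch. The function name `lead_time_days` and the dates are hypothetical and chosen only for illustration: they assume a feature requested just after one release and shipped with the next one, first under an annual cadence and then under a quarterly one.

```python
from datetime import datetime


def lead_time_days(requested_at: datetime, released_at: datetime) -> float:
    """Return the days between a feature request and its production release."""
    if released_at < requested_at:
        raise ValueError("release cannot precede the request")
    return (released_at - requested_at).total_seconds() / 86400


# Hypothetical dates: the same request under two different release cadences.
requested = datetime(2021, 1, 4)
annual = lead_time_days(requested, datetime(2022, 1, 3))
quarterly = lead_time_days(requested, datetime(2021, 4, 5))

print(f"annual cadence: {annual:.0f} days, quarterly cadence: {quarterly:.0f} days")
```

With these assumed dates, the quarterly cadence cuts the wait for the same feature to roughly a quarter of what it was.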

Alongside the release parties were two very important processes when issues would arise during the release:

Root-cause analysis (RCA)
Post-mortem

An RCA would occur anytime there was a significant issue in production that would halt or severely affect the functionality or availability of the application. Often, the RCA process would start with the teams analyzing what was wrong, fixing the issue, validating that the fix worked, and then documenting how the issue arose and what the root cause was. Every release party would result in at least one RCA, and the number of RCAs would increase exponentially relative to the total number of production servers involved in the release party.

The post-mortem was a retrospective process after the release was completed and the teams were confident in production operating as expected. The release captain would gather any and all information related to RCAs, bugs, errors, and so on, and create the required documentation and tickets. At the end of the post-mortem, a weekend support team would be briefed on the release party outcome and any items needing to be monitored.

The desire to automate the release of the application had been a central focus of every engineering organization for years. Automation was seen as the best way to build reliability and repeatability into the release process, and most of the common tools in use today were created with the intention and purpose of release automation. These tools, and really the underlying processes they address, are concerned with two major concepts in the software development life cycle: deliveries and deployments.

What is a deployment process?

10 p.m. on Friday was when everything started falling apart. The Q2 release party had started a few hours earlier with the entire operations team and a few members of the infrastructure and network teams in attendance. Routing customer traffic away from the initial test server to allow for the upgrade went as expected. This process had recently been automated through some network management scripts that the systems administration and network teams had worked on. The idea was that all new traffic should be routed away from the initial server while allowing the customer sessions that were currently using the server to continue until they disconnected. After all the user sessions were completed, the server was removed from the load balancer and the release process could start.
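The draining behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Server` and `LoadBalancer` classes, their method names, and the server names are hypothetical and are not taken from the team's network management scripts.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    accepting_new: bool = True
    active_sessions: set = field(default_factory=set)


class LoadBalancer:
    """Toy in-memory load balancer (hypothetical, for illustration only)."""

    def __init__(self, servers):
        self.servers = {s.name: s for s in servers}

    def route(self, session_id):
        """Send a new session to the accepting server with the fewest sessions."""
        candidates = [s for s in self.servers.values() if s.accepting_new]
        if not candidates:
            raise RuntimeError("no server is accepting new sessions")
        target = min(candidates, key=lambda s: len(s.active_sessions))
        target.active_sessions.add(session_id)
        return target.name

    def start_draining(self, name):
        """Stop sending new sessions to the server; existing ones continue."""
        self.servers[name].accepting_new = False

    def end_session(self, name, session_id):
        self.servers[name].active_sessions.discard(session_id)

    def remove_if_drained(self, name):
        """Remove the server only once it is draining and has no sessions left."""
        server = self.servers[name]
        if server.accepting_new or server.active_sessions:
            return False
        del self.servers[name]
        return True


lb = LoadBalancer([Server("test-01"), Server("prod-02")])
first = lb.route("s1")                            # lands on test-01
lb.start_draining("test-01")
second = lb.route("s2")                           # new traffic avoids test-01
removed_early = lb.remove_if_drained("test-01")   # s1 is still connected
lb.end_session("test-01", "s1")
removed = lb.remove_if_drained("test-01")         # safe to upgrade now
```

The important design choice is that removal is refused while any session is still active, so no customer is disconnected by the upgrade.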

The infrastructure team had a bootstrap script already built out to automatically configure the server. Sometimes this process involved tearing down the whole server and rebuilding it, while other times the release required some simple software updates to be completed before the hardware was ready for the application release. The new release wouldn't require an entire rebuild of the server this time. However, since the last release was 3 months ago, they did have to patch the server, add a new application stack version, and make sure that other configuration requirements were set accordingly. The entirety of the infrastructure process took about an hour for the first server, which would then be repeated for the other servers so that the bootstrapping time would be reduced for the rest of the fleet. As more customers were acquired, the total number of servers in production had grown. To avoid downtime for the production environment, these servers were grouped together into pools, which could then be individually targeted for stopping, upgrading, and restarting as needed.
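A pool-by-pool upgrade of this kind might look like the following Python sketch. It is not the team's actual tooling: the `rolling_upgrade` function, the `upgrade` and `is_healthy` callbacks, and the pool and server names are all assumptions made for the example.

```python
from typing import Callable, Dict, List


def rolling_upgrade(
    pools: Dict[str, List[str]],
    upgrade: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Upgrade one pool at a time and stop at the first unhealthy server.

    Returns the names of the pools that were fully upgraded.
    """
    completed = []
    for pool_name, servers in pools.items():
        for server in servers:
            upgrade(server)
            if not is_healthy(server):
                # Halt here so the remaining pools keep serving the old version.
                return completed
        completed.append(pool_name)
    return completed


# Hypothetical fleet: server b2 fails its health check after the upgrade.
pools = {"pool-a": ["a1", "a2"], "pool-b": ["b1", "b2"], "pool-c": ["c1"]}
touched = []
finished = rolling_upgrade(pools, touched.append, lambda server: server != "b2")
```

In this run only `pool-a` completes, and `pool-c` is never touched, which is what keeps part of production available while the failure is investigated.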

After the initial server was bootstrapped by the infrastructure team and validated through some basic quality and security tests on the server, the operations team would then start the application release process. It was just after 7 p.m. when the operations team started the release process, also known as a deployment, by copying the ZIP file from the production network share to the server. The file was then expanded into a mess of files and folders, which contained system services, application files, and a rather daunting INSTALL_README.txt file. This README file detailed all of the required install steps and validation checks that the engineering team documented for the operations team to execute.

With the install instruction file open on one screen and the terminal open on another, the install process could start. That is when everything went wrong.

The deployment testing in the staging environment had surfaced some issues caused by missing requirements, but those had been documented and added to the install process. What the operations team didn't know was that the server bootstrapping script had reset all of the network configuration files, and all of the application traffic heading out of the server was being redirected back to itself. As the deployment went through, the application ZIP file was able to get pushed to the server, the filesystem was set up as needed, and the required system services began running. The script used to test the health of the application showed all successful log messages. However, when the script to test the interaction between the application and the database was run, the terminal output showed only connection errors. It took the team over an hour to get everything copied over, stood up, and tested before the network errors were discovered. The release party had come to a grinding halt.
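A cheap pre-flight check run before the deployment could have surfaced this kind of problem earlier. The Python sketch below assumes one particular way the misconfiguration might show up, namely the database hostname resolving to a loopback address. The text does not describe the exact mechanism, so treat both that assumption and the `resolves_to_loopback` helper as hypothetical.

```python
import ipaddress
import socket


def resolves_to_loopback(host: str) -> bool:
    """Return True if any address the host resolves to is a loopback address."""
    for info in socket.getaddrinfo(host, None):
        address = info[4][0]
        # Strip an IPv6 zone index such as "%eth0" before parsing the address.
        if ipaddress.ip_address(address.split("%")[0]).is_loopback:
            return True
    return False


# Numeric addresses are used here so the example does not depend on DNS.
loopback_case = resolves_to_loopback("127.0.0.1")
external_case = resolves_to_loopback("192.0.2.10")  # documentation-only range
```

A check like this only covers name resolution. Routing or firewall rules that redirect traffic would need a real connection test against the database instead.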

The operations team was in full-on panic mode and the first RCA process had started. If they could not figure out why the server was not able to talk to external machines within the next hour, they would need to tear down the whole server and start over again. While one person from the operations team collaborated with the network and infrastructure teams, another operations team member would retrace every action taken since the infrastructure team had finished their tasks. After 30 minutes of analyzing all traffic related to the new server and the desired databases, the network team could not find any reason why the server could not reach the database. The infrastructure team was checking to see if the server had been properly added to the domain and that no other machines were using the same hostname or IP address. The operations team had engaged the on-call engineering team and started a troubleshooting conference bridge for the data center support team to join.

It wasn't until a few minutes after midnight that the network team found the networking loopback issue on the server. The outcome of the RCA process found that the server bootstrapping script was the culprit, which was then altered to avoid the issue in the future. The server was now passing all health checks and the operations team could move on to the next server in the pool. Within an hour, the rest of the server pool had been fully upgraded without an issue. Almost two hours later, all server pools were upgraded and reporting healthy. The post-mortem process could begin now that the new application version was out in production and operating as expected.

A release party would always start with an initial test release into production, known as a deployment. At a high level, a deployment process is solely concerned with copying an artifact from a designated location to some endpoint or host. In the case of the quarterly release party, a deployment consisted of pushing or pulling the artifact to a designated test server in the production fleet. This was a common method used to avoid production downtime by preventing unknown production-specific nuances from negatively affecting an environment-wide deployment. The log output and application metrics for the...

Repeatability, Reliability, and Scalability through GitOps
Foreword
Preface
Section 1: Fundamentals of GitOps
Chapter 1: The Fundamentals of Delivery and Deployment
Chapter 2: Exploring Common Industry Delivery and Deployment Practices
Chapter 3: The "What" and "Why" of GitOps
Section 2: GitOps Types, Benefits, and Drawbacks
Chapter 4: The Original GitOps – Continuous Deployment in Kubernetes
Chapter 5: The Purist GitOps – Continuous Deployment Everywhere
Chapter 6: Verified GitOps – Continuous Delivery Declaratively Defined
Chapter 7: Best Practices for Delivery, Deployment, and GitOps
Section 3: Hands-On Practical GitOps
Chapter 8: Practicing the Basics – Declarative Language File Building
Chapter 9: Originalist GitOps in Practice – Continuous Deployment
Chapter 10: Verified GitOps Setup – Continuous Delivery GitOps with Harness
Chapter 11: Pitfall Examples – Experiencing Issues with GitOps
Chapter 12: What's Next?
Other Books You May Enjoy
