eBook - ePub

The DevOps 2.5 Toolkit

Name: The DevOps 2.5 Toolkit
Author: Viktor Farcic

Viktor Farcic

Buch teilen

322 Seiten
English
ePUB (handyfreundlich)
Über iOS und Android verfügbar

eBook - ePub

The DevOps 2.5 Toolkit

Viktor Farcic

Angaben zum Buch

Buchvorschau

Inhaltsverzeichnis

Quellenangaben

Über dieses Buch

An advanced exploration of the skills and knowledge required for operating Kubernetes clusters, with a focus on metrics gathering and alerting, with the goal of making clusters and applications inside them autonomous through self-healing and self-adaptation.Key Features• The sixth book of DevOps expert Viktor Farcic's bestselling DevOps Toolkit series, with an overview of advanced core Kubernetes techniques, -oriented towards monitoring and alerting.• Takes a deep dive into monitoring, alerting, logging, auto-scaling, and other subjects aimed at making clusters resilient, self-sufficient, and self-adaptive• Discusses how to customise and create dashboards and alertsBook DescriptionBuilding on The DevOps 2.3 Toolkit: Kubernetes, and The DevOps 2.4 Toolkit: Continuous Deployment to Kubernetes, Viktor Farcic brings his latest exploration of the Docker technology as he records his journey to monitoring, logging, and autoscaling Kubernetes.The DevOps 2.5 Toolkit: Monitoring, Logging, and Auto-Scaling Kubernetes: Making Resilient, Self-Adaptive, And Autonomous Kubernetes Clusters is the latest book in Viktor Farcic's series that helps you build a full DevOps Toolkit. This book helps readers develop the necessary skillsets needed to be able to operate Kubernetes clusters, with a focus on metrics gathering and alerting with the goal of making clusters and applications inside them autonomous through self-healing and self-adaptation.Work with Viktor and dive into the creation of self-adaptive and self-healing systems within Kubernetes.What you will learn• Autoscaling Deployments and Statefulsets based on resource usage• Autoscaling nodes of a Kubernetes cluster• Debugging issues discovered through metrics and alerts• Extending HorizontalPodAutoscaler with custom metrics• Visualizing metrics and alerts• Collecting and querying logsWho this book is forReaders with an advanced-level understanding of Kubernetes and hands-on experience.

Häufig gestellte Fragen

Wie kann ich mein Abo kündigen?

Gehe einfach zum Kontobereich in den Einstellungen und klicke auf „Abo kündigen“ – ganz einfach. Nachdem du gekündigt hast, bleibt deine Mitgliedschaft für den verbleibenden Abozeitraum, den du bereits bezahlt hast, aktiv. Mehr Informationen hier.

(Wie) Kann ich Bücher herunterladen?

Derzeit stehen all unsere auf Mobilgeräte reagierenden ePub-Bücher zum Download über die App zur Verfügung. Die meisten unserer PDFs stehen ebenfalls zum Download bereit; wir arbeiten daran, auch die übrigen PDFs zum Download anzubieten, bei denen dies aktuell noch nicht möglich ist. Weitere Informationen hier.

Welcher Unterschied besteht bei den Preisen zwischen den Aboplänen?

Mit beiden Aboplänen erhältst du vollen Zugang zur Bibliothek und allen Funktionen von Perlego. Die einzigen Unterschiede bestehen im Preis und dem Abozeitraum: Mit dem Jahresabo sparst du auf 12 Monate gerechnet im Vergleich zum Monatsabo rund 30 %.

Was ist Perlego?

Wir sind ein Online-Abodienst für Lehrbücher, bei dem du für weniger als den Preis eines einzelnen Buches pro Monat Zugang zu einer ganzen Online-Bibliothek erhältst. Mit über 1 Million Büchern zu über 1.000 verschiedenen Themen haben wir bestimmt alles, was du brauchst! Weitere Informationen hier.

Unterstützt Perlego Text-zu-Sprache?

Achte auf das Symbol zum Vorlesen in deinem nächsten Buch, um zu sehen, ob du es dir auch anhören kannst. Bei diesem Tool wird dir Text laut vorgelesen, wobei der Text beim Vorlesen auch grafisch hervorgehoben wird. Du kannst das Vorlesen jederzeit anhalten, beschleunigen und verlangsamen. Weitere Informationen hier.

Ist The DevOps 2.5 Toolkit als Online-PDF/ePub verfügbar?

Ja, du hast Zugang zu The DevOps 2.5 Toolkit von Viktor Farcic im PDF- und/oder ePub-Format sowie zu anderen beliebten Büchern aus Informatique & Développement de logiciels. Aus unserem Katalog stehen dir über 1 Million Bücher zur Verfügung.

Information

Verlag

Packt Publishing

Jahr

2019

ISBN

9781838642631

Auflage

Thema

Informatique

Thema

Développement de logiciels

Collecting and Querying Metrics and Sending Alerts

Insufficient facts always invite danger.

- Spock

So far, we explored how to leverage some of Kubernetes core features. We used HorizontalPodAutoscaler and Cluster Autoscaler. While the former relies on Metrics Server, the latter is not based on metrics, but on Scheduler's inability to place Pods within the existing cluster capacity. Even though Metrics Server does provide some basic metrics, we are in desperate need for more.

We have to be able to monitor our cluster and Metrics Server is just not enough. It contains a limited amount of metrics, it keeps them for a very short period, and it does not allow us to execute anything but simplest queries. I can't say that we are blind if we rely only on Metrics Server, but that we are severely impaired. Without increasing the number of metrics we're collecting, as well as their retention, we get only a glimpse into what's going on in our Kubernetes clusters.

Being able to fetch and store metrics cannot be the goal by itself. We also need to be able to query them in search for a cause of an issue. For that, we need metrics to be "rich" with information, and we need a powerful query language.

Finally, being able to find the cause of a problem is not worth much without being able to be notified that there is an issue in the first place. That means that we need a system that will allow us to define alerts that, when certain thresholds are reached, will send us notifications or, when appropriate, send them to other parts of the system that can automatically execute steps that will remedy issues.

If we accomplish that, we'll be a step closer to having not only a self-healing (Kubernetes already does that) but also a self-adaptive system that will react to changed conditions. We might go even further and try to predict that "bad things" will happen in the future and be proactive in resolving them before they even arise.

All in all, we need a tool, or a set of tools, that will allow us to fetch and store "rich" metrics, that will allow us to query them, and that will notify us when an issue happens or, even better, when a problem is about to occur.

We might not be able to build a self-adapting system in this chapter, but we can try to create a foundation. But, first things first, we need a cluster that will allow us to "play" with some new tools and concepts.

Creating a cluster

We'll continue using definitions from the vfarcic/k8s-specs (https://github.com/vfarcic/k8s-specs) repository. To be on the safe side, we'll pull the latest version first.

All the commands from this chapter are available in the 03-monitor.sh (https://gist.github.com/vfarcic/718886797a247f2f9ad4002f17e9ebd9) Gist.

 1 cd k8s-specs 2 3 git pull

In this chapter, we'll need a few things that were not requirements before, even though you probably already used them.

We'll start using UIs so we'll need NGINX Ingress Controller that will route traffic from outside the cluster. We'll also need environment variable LB_IP with the IP through which we can access worker nodes. We'll use it to configure a few Ingress resources.

The Gists used to test the examples in this chapters are below. Please use them as they are, or as inspiration to create your own cluster or to confirm whether the one you already have meets the requirements. Due to new requirements (Ingress and LB_IP), all the cluster setup Gists are new.

A note to Docker for Desktop users
You'll notice LB_IP=[...] command at the end of the Gist. You'll have to replace [...] with the IP of your cluster. Probably the easiest way to find it is through the ifconfig command. Just remember that it cannot be localhost, but the IP of your laptop (for example, 192.168.0.152).

A note to minikube and Docker for Desktop users
We have to increase memory to 3 GB. Please have that in mind in case you were planning only to skim through the Gist that matches your Kubernetes flavor.

The Gists are as follows.

gke-monitor.sh: GKE with 3 n1-standard-1 worker nodes, nginx Ingress, tiller, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/10e14bfbec466347d70d11a78fe7eec4).
eks-monitor.sh: EKS with 3 t2.small worker nodes, nginx Ingress, tiller, Metrics Server, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/211f8dbe204131f8109f417605dbddd5).
aks-monitor.sh: AKS with 3 Standard_B2s worker nodes, nginx Ingress, and tiller, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/5fe5c238047db39cb002cdfdadcfbad2).
docker-monitor.sh: Docker for Desktop with 2 CPUs, 3 GB RAM, nginx Ingress, tiller, Metrics Server, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/4d9ab04058cf00b9dd0faac11bda8f13).
minikube-monitor.sh: minikube with 2 CPUs, 3 GB RAM, ingress, storage-provisioner, default-storageclass, and metrics-server addons enabled, tiller, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/892c783bf51fc06dd7f31b939bc90248).

Now that we have a cluster, we'll need to choose the tools we'll use to accomplish our goals.

Choosing the tools for storing and querying metrics and alerting

HorizontalPodAutoscaler (HPA) and Cluster Autoscaler (CA) provide essential, yet very rudimentary mechanisms to scale our Pods and clusters.

While they do scaling decently well, they do not solve our need to be alerted when there's something wrong, nor do they provide enough information required to find the cause of an issue. We'll need to expand our setup with additional tools that will allow us to store and query metrics as well as to receive notifications when there is an issue.

If we focus on tools that we can install and manage ourselves, there is very little doubt about what to use. If we look at the list of Cloud Native Computing Foundation (CNCF) projects (https://www.cncf.io/projects/), only two graduated so far (October 2018). Those are Kubernetes and Prometheus (https://prometheus.io/). Given that we are looking for a tool that will allow us to store and query metrics and that Prometheus fulfills that need, the choice is straightforward. That is not to say that there are no other similar tools worth considering. There are, but they are all service based. We might explore them later but, for now, we're focused on those that we can run inside our cluster. So, we'll add Prometheus to the mix and try to answer a simple question. What is Prometheus?

Prometheus is a database (of sorts) designed to fetch (pull) and store highly dimensional time series data.

Time series are identified by a metric name and a set of key-value pairs. Data is stored both in memory and on disk. Former allows fast retrieval of information, while the latter exists for fault tolerance.

Prometheus' query language allows us to easily find data that can be used both for graphs and, more importantly, for alerting. It does not attempt to provide "great" visualization experience. For that, it integrates with Grafana (https://grafana.com/).

Unlike most other similar tools, we do not push data to Prometheus. Or, to be more precise, that is not the common way of getting metrics. Instead, Prometheus is a pull-based system that periodically fetches metrics from exporters. There are many third-party exporters we can use. But, in our case, the most crucial exporter is baked into Kubernetes. Prometheus can pull data from an exporter that transforms information from Kube API. Through it, we can fetch (almost) everything we might need. Or, at least, that's where the bulk of the information will be coming from.

Finally, storing metrics in Prometheus would not be of much use if w...