
The DevOps 2.5 Toolkit

Viktor Farcic


About the book

An advanced exploration of the skills and knowledge required for operating Kubernetes clusters, with a focus on metrics gathering and alerting, and with the goal of making clusters and the applications inside them autonomous through self-healing and self-adaptation.

Key Features
  • The sixth book of DevOps expert Viktor Farcic's bestselling DevOps Toolkit series, with an overview of advanced core Kubernetes techniques oriented towards monitoring and alerting.
  • Takes a deep dive into monitoring, alerting, logging, auto-scaling, and other subjects aimed at making clusters resilient, self-sufficient, and self-adaptive.
  • Discusses how to customise and create dashboards and alerts.

Book Description
Building on The DevOps 2.3 Toolkit: Kubernetes and The DevOps 2.4 Toolkit: Continuous Deployment to Kubernetes, Viktor Farcic brings his latest exploration of the Docker technology as he records his journey to monitoring, logging, and autoscaling Kubernetes. The DevOps 2.5 Toolkit: Monitoring, Logging, and Auto-Scaling Kubernetes: Making Resilient, Self-Adaptive, And Autonomous Kubernetes Clusters is the latest book in Viktor Farcic's series that helps you build a full DevOps Toolkit. It helps readers develop the skills needed to operate Kubernetes clusters, with a focus on metrics gathering and alerting, and with the goal of making clusters and the applications inside them autonomous through self-healing and self-adaptation. Work with Viktor and dive into the creation of self-adaptive and self-healing systems within Kubernetes.

What you will learn
  • Autoscaling Deployments and StatefulSets based on resource usage
  • Autoscaling nodes of a Kubernetes cluster
  • Debugging issues discovered through metrics and alerts
  • Extending HorizontalPodAutoscaler with custom metrics
  • Visualizing metrics and alerts
  • Collecting and querying logs

Who this book is for
Readers with an advanced-level understanding of Kubernetes and hands-on experience.


Information

Year: 2019
ISBN: 9781838642631

Collecting and Querying Metrics and Sending Alerts

Insufficient facts always invite danger.
- Spock
So far, we explored how to leverage some of Kubernetes' core features. We used HorizontalPodAutoscaler and Cluster Autoscaler. While the former relies on Metrics Server, the latter is not based on metrics, but on the Scheduler's inability to place Pods within the existing cluster capacity. Even though Metrics Server does provide some basic metrics, we are in desperate need of more.
We have to be able to monitor our cluster, and Metrics Server is just not enough. It contains a limited number of metrics, it keeps them for a very short period, and it does not allow us to execute anything but the simplest queries. I can't say that we are blind if we rely only on Metrics Server, but we are severely impaired. Without increasing the number of metrics we're collecting, as well as their retention, we get only a glimpse into what's going on in our Kubernetes clusters.
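To put "limited" into perspective, the commands below are, roughly, all we can get out of Metrics Server. This is only a sketch, and it assumes Metrics Server is already running in the cluster.

```bash
# Metrics Server exposes only current CPU and memory usage; there is no
# history and no query language beyond what kubectl itself offers.
kubectl top nodes

kubectl top pods --all-namespaces
```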
Being able to fetch and store metrics cannot be the goal in itself. We also need to be able to query them in search of the cause of an issue. For that, we need metrics that are "rich" with information, and we need a powerful query language.
Finally, being able to find the cause of a problem is not worth much if we are not notified that there is an issue in the first place. That means that we need a system that allows us to define alerts that, when certain thresholds are reached, send us notifications or, when appropriate, send them to other parts of the system that can automatically execute steps to remedy the issue.
If we accomplish that, we'll be a step closer to having not only a self-healing (Kubernetes already does that) but also a self-adaptive system that will react to changed conditions. We might go even further and try to predict that "bad things" will happen in the future and be proactive in resolving them before they even arise.
All in all, we need a tool, or a set of tools, that will allow us to fetch and store "rich" metrics, that will allow us to query them, and that will notify us when an issue happens or, even better, when a problem is about to occur.
We might not be able to build a self-adapting system in this chapter, but we can try to create a foundation. But, first things first, we need a cluster that will allow us to "play" with some new tools and concepts.

Creating a cluster

We'll continue using definitions from the vfarcic/k8s-specs (https://github.com/vfarcic/k8s-specs) repository. To be on the safe side, we'll pull the latest version first.
All the commands from this chapter are available in the 03-monitor.sh (https://gist.github.com/vfarcic/718886797a247f2f9ad4002f17e9ebd9) Gist.
```bash
cd k8s-specs

git pull
```
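If you skipped the previous chapters, the repository might not be on your laptop yet. A minimal sketch of getting it, assuming you want it cloned into the current directory, follows.

```bash
# Clone the repository with the definitions used throughout the book,
# and enter the newly created directory.
git clone https://github.com/vfarcic/k8s-specs.git

cd k8s-specs
```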
In this chapter, we'll need a few things that were not requirements before, even though you probably already used them.
We'll start using UIs, so we'll need an NGINX Ingress Controller that will route traffic coming from outside the cluster. We'll also need the environment variable LB_IP holding the IP through which we can access the worker nodes. We'll use it to configure a few Ingress resources.
The Gists used to test the examples in this chapter are below. Please use them as they are, or as inspiration to create your own cluster, or to confirm whether the one you already have meets the requirements. Due to the new requirements (Ingress and LB_IP), all the cluster setup Gists are new.
A note to Docker for Desktop users
You'll notice the LB_IP=[...] command at the end of the Gist. You'll have to replace [...] with the IP of your cluster. Probably the easiest way to find it is through the ifconfig command. Just remember that it cannot be localhost; it has to be the IP of your laptop (for example, 192.168.0.152).
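As a sketch, assuming the laptop's address turns out to be 192.168.0.152 (yours will almost certainly differ), the commands might look as follows.

```bash
# List the network interfaces and pick the LAN address of your laptop
# (it must not be 127.0.0.1/localhost).
ifconfig

# Replace the address below with the one reported for your machine.
LB_IP=192.168.0.152
```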
A note to minikube and Docker for Desktop users
We have to increase the cluster's memory to 3 GB. Please keep that in mind in case you were planning only to skim through the Gist that matches your Kubernetes flavor.
The Gists are as follows.
  • gke-monitor.sh: GKE with 3 n1-standard-1 worker nodes, nginx Ingress, tiller, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/10e14bfbec466347d70d11a78fe7eec4).
  • eks-monitor.sh: EKS with 3 t2.small worker nodes, nginx Ingress, tiller, Metrics Server, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/211f8dbe204131f8109f417605dbddd5).
  • aks-monitor.sh: AKS with 3 Standard_B2s worker nodes, nginx Ingress, and tiller, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/5fe5c238047db39cb002cdfdadcfbad2).
  • docker-monitor.sh: Docker for Desktop with 2 CPUs, 3 GB RAM, nginx Ingress, tiller, Metrics Server, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/4d9ab04058cf00b9dd0faac11bda8f13).
  • minikube-monitor.sh: minikube with 2 CPUs, 3 GB RAM, ingress, storage-provisioner, default-storageclass, and metrics-server addons enabled, tiller, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/892c783bf51fc06dd7f31b939bc90248).
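Whichever Gist you chose, it might be worth confirming that the cluster meets the requirements before proceeding. The commands below are only a sketch; the exact namespaces and resource names depend on the Kubernetes flavor and the Gist used, so adjust them if needed.

```bash
# All the worker nodes should be in the Ready state.
kubectl get nodes

# The NGINX Ingress Controller should be up and running
# (its namespace depends on how it was installed).
kubectl get pods --all-namespaces | grep -i ingress

# LB_IP should hold the IP through which the worker nodes are accessible.
echo $LB_IP
```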
Now that we have a cluster, we'll need to choose the tools we'll use to accomplish our goals.

Choosing the tools for storing and querying metrics and alerting

HorizontalPodAutoscaler (HPA) and Cluster Autoscaler (CA) provide essential, yet very rudimentary mechanisms to scale our Pods and clusters.
While they do their scaling jobs decently well, they do not solve our need to be alerted when something is wrong, nor do they provide the information required to find the cause of an issue. We'll need to expand our setup with additional tools that will allow us to store and query metrics, as well as to receive notifications when there is an issue.
If we focus on tools that we can install and manage ourselves, there is very little doubt about what to use. If we look at the list of Cloud Native Computing Foundation (CNCF) projects (https://www.cncf.io/projects/), only two have graduated so far (October 2018). Those are Kubernetes and Prometheus (https://prometheus.io/). Given that we are looking for a tool that will allow us to store and query metrics, and that Prometheus fulfills that need, the choice is straightforward. That is not to say that there are no other similar tools worth considering. There are, but they are all service-based. We might explore them later but, for now, we're focused on those that we can run inside our cluster. So, we'll add Prometheus to the mix and try to answer a simple question. What is Prometheus?
Prometheus is a database (of sorts) designed to fetch (pull) and store highly dimensional time series data.
Time series are identified by a metric name and a set of key-value pairs. Data is stored both in memory and on disk. The former allows fast retrieval of information, while the latter exists for fault tolerance.
Prometheus' query language allows us to easily find data that can be used both for graphs and, more importantly, for alerting. It does not attempt to provide a "great" visualization experience. For that, it integrates with Grafana (https://grafana.com/).
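To give a feel for those queries, the sketch below sends a PromQL expression to Prometheus' HTTP API. It assumes Prometheus is reachable through an address stored in the environment variable PROM_ADDR (a placeholder we have not defined yet), and the metric is only an example of the kind of data we'll be querying. Note how the time series is identified by a metric name combined with key-value label pairs.

```bash
# A hypothetical PromQL query executed through Prometheus' HTTP API.
# It sums the CPU usage rate of all containers outside the kube-system
# Namespace, grouped by the pod label.
curl -G "http://$PROM_ADDR/api/v1/query" \
    --data-urlencode \
    'query=sum(rate(container_cpu_usage_seconds_total{namespace!="kube-system"}[5m])) by (pod)'
```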
Unlike most other similar tools, we do not push data to Prometheus. Or, to be more precise, that is not the common way of getting metrics. Instead, Prometheus is a pull-based system that periodically fetches metrics from exporters. There are many third-party exporters we can use. But, in our case, the most crucial exporter is baked into Kubernetes. Prometheus can pull data from an exporter that transforms information from the Kube API. Through it, we can fetch (almost) everything we might need. Or, at least, that's where the bulk of the information will be coming from.
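As an illustration of that pull model, the snippet below is a minimal sketch of a Prometheus scrape configuration that discovers targets through the Kube API. It is only an assumption of how such a configuration might look, not the one we'll end up using later in the chapter.

```bash
# A trimmed-down scrape configuration. Prometheus' kubernetes_sd_configs
# discovers targets (nodes, in this case) by querying the Kube API, and
# Prometheus then pulls (scrapes) metrics from each of them.
cat <<'EOF' >prom-scrape-sketch.yml
scrape_configs:
  - job_name: kubernetes-nodes
    kubernetes_sd_configs:
      - role: node
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
EOF
```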
Finally, storing metrics in Prometheus would not be of much use if w...
