The DevOps 2.5 Toolkit
eBook - ePub

The DevOps 2.5 Toolkit

Viktor Farcic

Partager le livre
  1. 322 pages
  2. English
  3. ePUB (adapté aux mobiles)
  4. Disponible sur iOS et Android
eBook - ePub

The DevOps 2.5 Toolkit

Viktor Farcic

DĂ©tails du livre
Aperçu du livre
Table des matiĂšres
Citations

À propos de ce livre

An advanced exploration of the skills and knowledge required for operating Kubernetes clusters, with a focus on metrics gathering and alerting, with the goal of making clusters and applications inside them autonomous through self-healing and self-adaptation.Key Features‱ The sixth book of DevOps expert Viktor Farcic's bestselling DevOps Toolkit series, with an overview of advanced core Kubernetes techniques, -oriented towards monitoring and alerting.‱ Takes a deep dive into monitoring, alerting, logging, auto-scaling, and other subjects aimed at making clusters resilient, self-sufficient, and self-adaptive‱ Discusses how to customise and create dashboards and alertsBook DescriptionBuilding on The DevOps 2.3 Toolkit: Kubernetes, and The DevOps 2.4 Toolkit: Continuous Deployment to Kubernetes, Viktor Farcic brings his latest exploration of the Docker technology as he records his journey to monitoring, logging, and autoscaling Kubernetes.The DevOps 2.5 Toolkit: Monitoring, Logging, and Auto-Scaling Kubernetes: Making Resilient, Self-Adaptive, And Autonomous Kubernetes Clusters is the latest book in Viktor Farcic's series that helps you build a full DevOps Toolkit. This book helps readers develop the necessary skillsets needed to be able to operate Kubernetes clusters, with a focus on metrics gathering and alerting with the goal of making clusters and applications inside them autonomous through self-healing and self-adaptation.Work with Viktor and dive into the creation of self-adaptive and self-healing systems within Kubernetes.What you will learn‱ Autoscaling Deployments and Statefulsets based on resource usage‱ Autoscaling nodes of a Kubernetes cluster‱ Debugging issues discovered through metrics and alerts‱ Extending HorizontalPodAutoscaler with custom metrics‱ Visualizing metrics and alerts‱ Collecting and querying logsWho this book is forReaders with an advanced-level understanding of Kubernetes and hands-on experience.

Foire aux questions

Comment puis-je résilier mon abonnement ?
Il vous suffit de vous rendre dans la section compte dans paramĂštres et de cliquer sur « RĂ©silier l’abonnement ». C’est aussi simple que cela ! Une fois que vous aurez rĂ©siliĂ© votre abonnement, il restera actif pour le reste de la pĂ©riode pour laquelle vous avez payĂ©. DĂ©couvrez-en plus ici.
Puis-je / comment puis-je télécharger des livres ?
Pour le moment, tous nos livres en format ePub adaptĂ©s aux mobiles peuvent ĂȘtre tĂ©lĂ©chargĂ©s via l’application. La plupart de nos PDF sont Ă©galement disponibles en tĂ©lĂ©chargement et les autres seront tĂ©lĂ©chargeables trĂšs prochainement. DĂ©couvrez-en plus ici.
Quelle est la différence entre les formules tarifaires ?
Les deux abonnements vous donnent un accĂšs complet Ă  la bibliothĂšque et Ă  toutes les fonctionnalitĂ©s de Perlego. Les seules diffĂ©rences sont les tarifs ainsi que la pĂ©riode d’abonnement : avec l’abonnement annuel, vous Ă©conomiserez environ 30 % par rapport Ă  12 mois d’abonnement mensuel.
Qu’est-ce que Perlego ?
Nous sommes un service d’abonnement Ă  des ouvrages universitaires en ligne, oĂč vous pouvez accĂ©der Ă  toute une bibliothĂšque pour un prix infĂ©rieur Ă  celui d’un seul livre par mois. Avec plus d’un million de livres sur plus de 1 000 sujets, nous avons ce qu’il vous faut ! DĂ©couvrez-en plus ici.
Prenez-vous en charge la synthÚse vocale ?
Recherchez le symbole Écouter sur votre prochain livre pour voir si vous pouvez l’écouter. L’outil Écouter lit le texte Ă  haute voix pour vous, en surlignant le passage qui est en cours de lecture. Vous pouvez le mettre sur pause, l’accĂ©lĂ©rer ou le ralentir. DĂ©couvrez-en plus ici.
Est-ce que The DevOps 2.5 Toolkit est un PDF/ePUB en ligne ?
Oui, vous pouvez accĂ©der Ă  The DevOps 2.5 Toolkit par Viktor Farcic en format PDF et/ou ePUB ainsi qu’à d’autres livres populaires dans Informatique et DĂ©veloppement de logiciels. Nous disposons de plus d’un million d’ouvrages Ă  dĂ©couvrir dans notre catalogue.

Informations

Année
2019
ISBN
9781838642631

Collecting and Querying Metrics and Sending Alerts

Insufficient facts always invite danger.
- Spock
So far, we explored how to leverage some of Kubernetes core features. We used HorizontalPodAutoscaler and Cluster Autoscaler. While the former relies on Metrics Server, the latter is not based on metrics, but on Scheduler's inability to place Pods within the existing cluster capacity. Even though Metrics Server does provide some basic metrics, we are in desperate need for more.
We have to be able to monitor our cluster and Metrics Server is just not enough. It contains a limited amount of metrics, it keeps them for a very short period, and it does not allow us to execute anything but simplest queries. I can't say that we are blind if we rely only on Metrics Server, but that we are severely impaired. Without increasing the number of metrics we're collecting, as well as their retention, we get only a glimpse into what's going on in our Kubernetes clusters.
Being able to fetch and store metrics cannot be the goal by itself. We also need to be able to query them in search for a cause of an issue. For that, we need metrics to be "rich" with information, and we need a powerful query language.
Finally, being able to find the cause of a problem is not worth much without being able to be notified that there is an issue in the first place. That means that we need a system that will allow us to define alerts that, when certain thresholds are reached, will send us notifications or, when appropriate, send them to other parts of the system that can automatically execute steps that will remedy issues.
If we accomplish that, we'll be a step closer to having not only a self-healing (Kubernetes already does that) but also a self-adaptive system that will react to changed conditions. We might go even further and try to predict that "bad things" will happen in the future and be proactive in resolving them before they even arise.
All in all, we need a tool, or a set of tools, that will allow us to fetch and store "rich" metrics, that will allow us to query them, and that will notify us when an issue happens or, even better, when a problem is about to occur.
We might not be able to build a self-adapting system in this chapter, but we can try to create a foundation. But, first things first, we need a cluster that will allow us to "play" with some new tools and concepts.

Creating a cluster

We'll continue using definitions from the vfarcic/k8s-specs (https://github.com/vfarcic/k8s-specs) repository. To be on the safe side, we'll pull the latest version first.
All the commands from this chapter are available in the 03-monitor.sh (https://gist.github.com/vfarcic/718886797a247f2f9ad4002f17e9ebd9) Gist.
 1 cd k8s-specs 2 3 git pull
In this chapter, we'll need a few things that were not requirements before, even though you probably already used them.
We'll start using UIs so we'll need NGINX Ingress Controller that will route traffic from outside the cluster. We'll also need environment variable LB_IP with the IP through which we can access worker nodes. We'll use it to configure a few Ingress resources.
The Gists used to test the examples in this chapters are below. Please use them as they are, or as inspiration to create your own cluster or to confirm whether the one you already have meets the requirements. Due to new requirements (Ingress and LB_IP), all the cluster setup Gists are new.
A note to Docker for Desktop users
You'll notice LB_IP=[...] command at the end of the Gist. You'll have to replace [...] with the IP of your cluster. Probably the easiest way to find it is through the ifconfig command. Just remember that it cannot be localhost, but the IP of your laptop (for example, 192.168.0.152).
A note to minikube and Docker for Desktop users
We have to increase memory to 3 GB. Please have that in mind in case you were planning only to skim through the Gist that matches your Kubernetes flavor.
The Gists are as follows.
  • gke-monitor.sh: GKE with 3 n1-standard-1 worker nodes, nginx Ingress, tiller, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/10e14bfbec466347d70d11a78fe7eec4).
  • eks-monitor.sh: EKS with 3 t2.small worker nodes, nginx Ingress, tiller, Metrics Server, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/211f8dbe204131f8109f417605dbddd5).
  • aks-monitor.sh: AKS with 3 Standard_B2s worker nodes, nginx Ingress, and tiller, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/5fe5c238047db39cb002cdfdadcfbad2).
  • docker-monitor.sh: Docker for Desktop with 2 CPUs, 3 GB RAM, nginx Ingress, tiller, Metrics Server, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/4d9ab04058cf00b9dd0faac11bda8f13).
  • minikube-monitor.sh: minikube with 2 CPUs, 3 GB RAM, ingress, storage-provisioner, default-storageclass, and metrics-server addons enabled, tiller, and cluster IP stored in environment variable LB_IP (https://gist.github.com/vfarcic/892c783bf51fc06dd7f31b939bc90248).
Now that we have a cluster, we'll need to choose the tools we'll use to accomplish our goals.

Choosing the tools for storing and querying metrics and alerting

HorizontalPodAutoscaler (HPA) and Cluster Autoscaler (CA) provide essential, yet very rudimentary mechanisms to scale our Pods and clusters.
While they do scaling decently well, they do not solve our need to be alerted when there's something wrong, nor do they provide enough information required to find the cause of an issue. We'll need to expand our setup with additional tools that will allow us to store and query metrics as well as to receive notifications when there is an issue.
If we focus on tools that we can install and manage ourselves, there is very little doubt about what to use. If we look at the list of Cloud Native Computing Foundation (CNCF) projects (https://www.cncf.io/projects/), only two graduated so far (October 2018). Those are Kubernetes and Prometheus (https://prometheus.io/). Given that we are looking for a tool that will allow us to store and query metrics and that Prometheus fulfills that need, the choice is straightforward. That is not to say that there are no other similar tools worth considering. There are, but they are all service based. We might explore them later but, for now, we're focused on those that we can run inside our cluster. So, we'll add Prometheus to the mix and try to answer a simple question. What is Prometheus?
Prometheus is a database (of sorts) designed to fetch (pull) and store highly dimensional time series data.
Time series are identified by a metric name and a set of key-value pairs. Data is stored both in memory and on disk. Former allows fast retrieval of information, while the latter exists for fault tolerance.
Prometheus' query language allows us to easily find data that can be used both for graphs and, more importantly, for alerting. It does not attempt to provide "great" visualization experience. For that, it integrates with Grafana (https://grafana.com/).
Unlike most other similar tools, we do not push data to Prometheus. Or, to be more precise, that is not the common way of getting metrics. Instead, Prometheus is a pull-based system that periodically fetches metrics from exporters. There are many third-party exporters we can use. But, in our case, the most crucial exporter is baked into Kubernetes. Prometheus can pull data from an exporter that transforms information from Kube API. Through it, we can fetch (almost) everything we might need. Or, at least, that's where the bulk of the information will be coming from.
Finally, storing metrics in Prometheus would not be of much use if w...

Table des matiĂšres