Overview
This chapter begins by describing the evolution of software development and delivery, from running software on bare-metal machines through to the modern approach of containerization. We will also take a look at the underlying Linux technologies that enable containerization. By the end of the chapter, you will be able to run a basic Docker container from an image. You will also be able to package a custom application to make your own Docker image. Next, we will take a look at how we can control the resource limits and group for a container. Finally, the chapter closes by explaining why we need a tool such as Kubernetes, along with a short introduction to its strengths.
Introduction
About a decade ago, there was a lot of discussion over software development paradigms such as service-oriented architecture, agile development, and software design patterns. In hindsight, those were all great ideas, but only a few of them were adopted in practice at the time.
One of the major reasons for the lack of adoption of these paradigms was that the underlying infrastructure couldn't offer the resources or capabilities for abstracting fine-grained software components and managing an optimal software development life cycle. Hence, a lot of duplicated effort was still required to resolve common issues of software development, such as managing software dependencies and consistent environments, software testing, packaging, upgrading, and scaling.
In recent years, with Docker at the forefront, containerization technology has provided a new encapsulation mechanism that allows you to bundle your application, its runtime, and its dependencies, and it also offers a new perspective on software development. By using containerization technology, the underlying infrastructure gets abstracted away so that applications can be moved seamlessly among heterogeneous environments. However, as the number of containers grows, you may need orchestration tools to help you manage their interactions with each other and to optimize the utilization of the underlying hardware.
That's where Kubernetes comes into play. Kubernetes provides a variety of options to automate the deployment, scaling, and management of containerized applications. It has seen explosive adoption in recent years and has become the de facto standard in the container orchestration field.
As this is the first chapter of this book, we will start with a brief history of software development over the past few decades, and then illustrate the origins of containers and Kubernetes. We will focus on explaining what problems they can solve and on three key reasons why their adoption has risen considerably in recent years.
The Evolution of Software Development
Along with the evolution of virtualization technology, it has become common for companies to use virtual machines (VMs) to manage their software products, either in the public cloud or in an on-premises environment. This brings huge benefits such as automatic machine provisioning, better hardware resource utilization, resource abstraction, and more. More critically, for the first time, it separates computing, network, and storage resources, freeing software development from the tedium of hardware management. Virtualization also makes it possible to manipulate the underlying infrastructure programmatically. As a result, system administrators and developers can better streamline the workflow of software maintenance and development. This was a big step in the history of software development.
However, in the past decade, the scope and life cycle of software development have changed vastly. Earlier, it was not uncommon for software to be developed in big monolithic chunks with a slow release cycle. Nowadays, to keep up with rapidly changing business requirements, a piece of software may need to be broken down into individual fine-grained subcomponents, and each component may need its own release cycle so that it can be released as often as possible to get feedback from the market earlier. Moreover, we may want each component to be scalable and cost-effective.
So, how does this impact application development and deployment? In comparison to the bare-metal era, adopting VMs doesn't help much since VMs don't change the granularity of how different components are managed; the entire software is still deployed on a single machine, only it is a virtual one instead of a physical one. Making a number of interdependent components work together is still not an easy task.
A straightforward idea here is to add an abstraction layer between the machines and the applications running on them, so that application developers only need to focus on the business logic when building their applications. Some examples of this are Google App Engine (GAE) and Cloud Foundry.
The first issue with these solutions is the lack of a consistent development experience across different environments. Developers develop and test applications on their own machines with their local dependencies (at both the programming language and operating system level), while in a production environment, the application has to rely on another set of dependencies underneath. And we still haven't talked about the software components that need the cooperation of different developers in different teams.
The second issue is that the hard boundary between applications and the underlying infrastructure can prevent applications from being highly performant, especially if the application is sensitive to storage, compute, or network resources. For instance, you may want the application to be deployed across multiple availability zones (isolated locations within a cloud region, each backed by its own data center facilities), or you may want some applications to coexist, or not to coexist, with other particular applications. Alternatively, you may want some applications to run only on particular hardware (for example, solid-state drives). In such cases, it becomes hard to focus on the functionality of the app without exposing the topological characteristics of the infrastructure to the applications running on top of it.
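To make the placement concerns above more concrete, here is a minimal Python sketch. It is not how any particular platform implements placement; the names Node, PlacementRule, and eligible_nodes, as well as the node and zone names, are hypothetical and exist only for illustration. The sketch assumes each machine advertises its zone, disk type, and the applications it already runs, and it filters machines against an application's requirements:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """A machine and the infrastructure attributes it exposes (hypothetical model)."""
    name: str
    zone: str
    disk: str                      # for example "ssd" or "hdd"
    apps: frozenset = frozenset()  # applications already running on this node


@dataclass(frozen=True)
class PlacementRule:
    """What an application asks of the node it lands on (hypothetical model)."""
    required_disk: Optional[str] = None
    avoid_apps: frozenset = frozenset()  # must not share a node with any of these
    near_apps: frozenset = frozenset()   # must share a node with all of these


def eligible_nodes(nodes, rule):
    """Return the nodes that satisfy every constraint in the rule."""
    result = []
    for node in nodes:
        if rule.required_disk and node.disk != rule.required_disk:
            continue  # wrong hardware
        if rule.avoid_apps & node.apps:
            continue  # would coexist with an application it must avoid
        if not rule.near_apps <= node.apps:
            continue  # a required neighbor is missing
        result.append(node)
    return result


nodes = [
    Node("node-a", zone="zone-1", disk="ssd", apps=frozenset({"cache"})),
    Node("node-b", zone="zone-2", disk="ssd", apps=frozenset({"batch"})),
    Node("node-c", zone="zone-2", disk="hdd", apps=frozenset({"cache"})),
]

# An application that needs SSD storage and must not run next to "batch".
rule = PlacementRule(required_disk="ssd", avoid_apps=frozenset({"batch"}))
candidates = eligible_nodes(nodes, rule)
print([n.name for n in candidates])
print(sorted({n.zone for n in candidates}))  # zones the application could span
```

Note that the rule can only be evaluated because each node exposes its zone, disk type, and current workload, which is exactly the kind of topological detail that a hard boundary between applications and infrastructure would hide.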
In fact, in the life cycle of software development, there is no clear boundary between the infrastructure and applications. What we want to achieve is to manage the applications automatically, while making optimal use of the infrastructure.
So, how could we achieve this? Docker (which we will introduce later in this chapter) solves the first issue by leveraging Linux containerization technologies to encapsulate the application and its dependencies. It also introduces the concept of Docker images to make the software aspect of the application runtime environment lightweight, reproducible, and portable.
The second issue is more complicated. That's where Kubernetes comes in. Kubernetes leverages a battle-tested design rationale called the Declarative API to abstract the infrastructure as well as each phase of application delivery such as deployment, upgrades, redundancy, scaling, and more. It also offers a series of building blocks for users to choose, orchestrate, and compose into the eventual application. We will gradually move on to study Kubernetes, which is the core of this book, toward the end of this chapter.
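The declarative idea can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not the Kubernetes API: the functions reconcile and apply_actions, the application names, and the representation of state as a mapping from application name to replica count are all hypothetical. The user only declares the desired state; a loop compares it with the observed state and works out the actions needed to close the gap:

```python
def reconcile(desired, actual):
    """Return the actions that move `actual` toward `desired`.

    Both arguments map an application name to a replica count.
    """
    actions = []
    for app, want in sorted(desired.items()):
        have = actual.get(app, 0)
        if have < want:
            actions.append(("start", app, want - have))
        elif have > want:
            actions.append(("stop", app, have - want))
    # Anything running that is no longer declared gets stopped entirely.
    for app in sorted(set(actual) - set(desired)):
        actions.append(("stop", app, actual[app]))
    return actions


def apply_actions(actions, actual):
    """Apply the actions to a copy of the observed state and return the copy."""
    state = dict(actual)
    for verb, app, count in actions:
        delta = count if verb == "start" else -count
        state[app] = state.get(app, 0) + delta
        if state[app] == 0:
            del state[app]
    return state


desired = {"web": 3, "api": 2}     # what the user declares
actual = {"web": 1, "worker": 1}   # what is currently running

actions = reconcile(desired, actual)
print(actions)

actual = apply_actions(actions, actual)
print(actual == desired)           # the observed state now matches the declaration
print(reconcile(desired, actual))  # nothing left to do
```

The point of the design is that the user never issues the "start" and "stop" steps directly. Running the comparison again after the state has converged yields no further actions, so the same declaration can be re-applied safely.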
Note
Unless stated otherwise, the term "container" is used interchangeably with "Linux container" throughout this book.
Virtual Machines versus Containers
A virtual machine (VM), as the name implies, aims to emulate a physical computer system. Technically, VMs are provisioned by a hypervisor; in the arrangement shown here, the hypervisor runs on the host OS. The following diagram illustrates this concept:
Figure 1.1: Running applications on VMs
Here, the VMs have full OS stacks, and the OS running on the VM (called the Guest OS) must rely on t...