When the number of servers enters the hundreds, never mind thousands (or even greater!), this whole organic process breaks down. What was once an interesting challenge becomes laborious and tedious, even stressful. The learning curve for new team members is steep. A new hire may be faced with a disparate environment with lots of different technologies to learn, and possibly a long period of training before they can become truly effective. Long-serving team members can end up being silos of knowledge, and should they depart the business, their loss can cause continuity issues. Problems and outages become more numerous as the non-standard environment grows in an uncontrolled manner, and troubleshooting becomes a lengthy endeavor, which is hardly ideal when trying to achieve a 99.99% service uptime agreement, where every second of downtime matters! Hence, in the next section, we will look at how to address these challenges with an SOE.
From this, we realize our requirement for standardization. Building a suitable SOE is all about the following:
- Realizing economies of scale
- Being efficient in day-to-day operations
- Making it easy for all involved to get up to speed quickly
- Being aligned with the growing needs of the business
After all, if an environment is concise in its definition, then it is easier for everyone involved to understand and work with. This, in turn, means tasks are completed more quickly and with greater ease. In short, standardization can bring cost savings and improved reliability.
It must be stressed that this is a concept and not an absolute. There is no right or wrong way to build such an environment, though there are best practices. Throughout this chapter, we will explore the concept further and help you to identify core best practices associated with SOEs so that you can make informed decisions when defining your own.
Let's proceed to explore this in more detail. Every enterprise has certain demands of their IT environments, whether they are based on Linux, Windows, FreeBSD, or any other technology. Sometimes, these are well understood and documented, and sometimes, they are simply implicit; that is to say, everyone assumes the environment meets these standards, but there is no official definition. These requirements often include the following:
- Security
- Reliability
- Scalability
- Longevity
- Supportability
- Ease of use
These, of course, are all high-level requirements, and very often, they intersect with each other. Let's explore these in more detail.
Security in an environment is established by several factors. Let's look at some questions to understand the factors involved:
- Is the configuration secure?
- Have we allowed the use of weak passwords?
- Is the superuser, root, allowed to log in remotely?
- Are we logging and auditing all connections?
Now, in a non-standard environment, how can you truly say that these requirements are all enforced across all of your Linux servers? To do so requires a great deal of faith that they have all been built the same way, that they had the same security parameters applied, and that no one has ever revisited the environment to change anything. In short, without that faith, it requires fairly frequent auditing to ensure compliance.
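To make the idea of auditing concrete, here is a minimal sketch of a check that could be run against the SSH daemon configuration of each server. `PermitRootLogin` and `PasswordAuthentication` are real `sshd_config` keywords, but the required values in `BASELINE` are an assumed policy for illustration, and the function names are hypothetical. The parser is deliberately simplified: it ignores `Match` blocks and treats an unset keyword as a finding rather than looking up the daemon's default.

```python
# Assumed policy for illustration only; adjust to your own standard.
BASELINE = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
}


def parse_sshd_config(text):
    """Return keyword -> value, keeping the first occurrence of each keyword.

    sshd uses the first value it obtains for a keyword, and keywords are
    case-insensitive, so keys are lowercased and later duplicates ignored.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip keywords that have no argument
        key, value = parts[0].lower(), parts[1].strip()
        settings.setdefault(key, value)
    return settings


def audit(text, baseline=BASELINE):
    """Return a list of (keyword, expected, actual) for each deviation."""
    settings = parse_sshd_config(text)
    findings = []
    for key, expected in baseline.items():
        actual = settings.get(key)
        if actual is None or actual.lower() != expected:
            findings.append((key, expected, actual))
    return findings
```

In use, the text of `/etc/ssh/sshd_config` from each server would be passed to `audit`, and any non-empty result flagged for review. Having to run checks like this repeatedly across a non-standard estate is exactly the overhead that a common build source reduces.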
However, where the environment has been standardized, and all servers have been built from a common source or using a common automation tool (we shall demonstrate this later in this book), it is much easier to say with confidence that your Linux estate is secure.
A standards-based environment isn't implicitly secure, of course. If there is an issue that results in a vulnerability in the build process for this environment, automation means this vulnerability will be replicated across the entire environment! It is important to be aware of the security requirements of your environment and to implement these with care, maintaining and auditing your environment continuously to ensure security levels are maintained.
Security is also enforced by patches, which ensure you are not running any software with vulnerabilities that could allow an attacker to compromise your servers. Some Linux distributions have longer lives than others. For example, Red Hat Enterprise Linux (and derivatives such as CentOS) and the Ubuntu LTS releases all have long, predictable life cycles and make good candidates for your Linux estate.
As such, they should be part of your standards. By contrast, if a bleeding-edge Linux distribution such as Fedora has been used because, perhaps, it had the latest packages required at the time, you can be sure that the life cycle will be short and that updates will cease in the not too distant future, leaving you open to potential unpatched vulnerabilities and facing the need to upgrade to a newer release of Fedora.
Even if the upgrade to a newer version of Fedora is performed, sometimes packages get orphaned; that is to say, they do not get included in the newer release. This might be because they have been superseded by a different package. Whatever the cause, upgrading one distribution to another could cause a false sense of security and should be avoided unless thoroughly researched. In this way, standardization helps to ensure good security practices.
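One way to keep the choice of distribution within your standards is to check each server against an approved list. The following is a minimal sketch that parses the contents of `/etc/os-release` (its `ID` and `VERSION_ID` fields are standard) and compares them with a list of approved releases. The `APPROVED` table and the function names are hypothetical, and the versions shown are placeholders rather than recommendations.

```python
import shlex

# Hypothetical approved releases: distribution ID -> allowed version prefixes.
APPROVED = {
    "rhel": {"8"},
    "ubuntu": {"20.04", "22.04"},
}


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be quoted shell-style, so let shlex remove the quotes.
        parts = shlex.split(value)
        fields[key] = parts[0] if parts else ""
    return fields


def is_approved(text, approved=APPROVED):
    """Return True if the ID and VERSION_ID match an approved release."""
    fields = parse_os_release(text)
    version = fields.get("VERSION_ID", "")
    allowed = approved.get(fields.get("ID", ""), set())
    # Accept an exact match ("22.04") or a minor release of it ("8.4" for "8").
    return any(version == v or version.startswith(v + ".") for v in allowed)
```

A server reporting `ID=fedora` would fail this check under the assumed list, which flags it for migration before its updates cease.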
Many enterprises expect their IT operations to be up and running 99.99% of the time (or better). Part of the route to achieving this is robust software, application of relevant bug fixes, and well-defined troubleshooting procedures. This ensures that in the worst-case scenario of an outage, the downtime is kept to a minimum.
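To put a figure on what such a target permits, the following short sketch converts an availability percentage into a downtime budget. The function name and the assumption of a 365-day year are ours, purely for illustration.

```python
def allowed_downtime_minutes(availability_percent, period_days=365):
    """Return the minutes of downtime permitted over the given period."""
    total_minutes = period_days * 24 * 60
    unavailable_fraction = (100 - availability_percent) / 100
    return round(total_minutes * unavailable_fraction, 2)


# 99.99% over a 365-day year leaves 52.56 minutes of downtime in total.
print(allowed_downtime_minutes(99.99))
```

In other words, a 99.99% target leaves less than an hour of downtime across the whole year, which is why lengthy troubleshooting in a non-standard environment is so costly.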
Standardization again helps here. As we discussed in the preceding section on security, a good choice of underlying operating system ensures that you have ongoing access to bug fixes and updates. If you know that your business needs vendor backing to ensure business continuity, then the selection of a Linux operating system with a support contract (available with Red Hat or Canonical, for example) makes sense.
Equally, when servers are all built to a well-defined and understood standard, making changes to them should yield predictable results as everyone knows what they are working with. If all servers are bui...