Introduction
This book is about how to get a handle on system logs. More precisely, it is about how to get useful information out of logs of all kinds. Logs, while often underappreciated, are a very useful source of information for computer system resource management (printers, disk systems, battery backup systems, operating systems, etc.), user and application management (login and logout, application access, etc.), and security. Some types of information fall into more than one of these categories: user login and logout messages, for example, are relevant to both user management and security. A few examples will show how useful log data can be.
Various disk storage products log messages when hardware errors occur. Having access to this information often means that small problems are resolved before they become big nightmares.
As a second example, let's briefly consider how user management and security logs can be used together to shed light on a user's activity. When a user logs onto a Windows environment, this action is logged somewhere as a logon record. We will call this user management log data. Anytime this user accesses various parts of the network, a firewall is more than likely in use. This firewall also records network access, in the form of whether or not it allowed network packets to flow from the source (the user's workstation) to a particular part of the network. We will call this security log data. Now, let's say your company is developing a new product and you want to know who attempts to access your R&D server. Of course, you can use firewall access control lists (ACLs) to control this, but you want to take it a step further. The logon data for a user can be matched up with the firewall record showing that the user attempted to access the server. If this occurred outside of normal business hours, you might have reason to speak with the employee to better understand their intent. While this example is a little bit out there, it does drive home an important point: if you have access to the right information, you are able to do some sophisticated things.
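The matching described in the R&D example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records are taken to be already parsed into dictionaries, and the field names, addresses, `RD_SERVER`, and the 08:00 to 18:00 business window are all hypothetical. Weekends and time zones are ignored for brevity.

```python
from datetime import datetime, time

# Hypothetical, already-parsed records; all field names and values are
# assumptions made for this illustration.
logons = [
    {"user": "alice", "workstation": "10.0.0.15", "time": datetime(2024, 3, 4, 8, 55)},
    {"user": "bob", "workstation": "10.0.0.22", "time": datetime(2024, 3, 4, 22, 10)},
]
firewall = [
    {"src": "10.0.0.15", "dst": "10.0.9.5", "action": "allow", "time": datetime(2024, 3, 4, 10, 30)},
    {"src": "10.0.0.22", "dst": "10.0.9.5", "action": "deny", "time": datetime(2024, 3, 4, 22, 41)},
    {"src": "10.0.0.22", "dst": "10.0.3.7", "action": "allow", "time": datetime(2024, 3, 4, 22, 45)},
]

RD_SERVER = "10.0.9.5"                       # assumed address of the R&D server
BUSINESS_START, BUSINESS_END = time(8, 0), time(18, 0)  # assumed business hours


def after_hours(ts):
    """Return True if the timestamp falls outside the assumed business window."""
    return not (BUSINESS_START <= ts.time() < BUSINESS_END)


def after_hours_rd_access(logons, firewall):
    """Pair each after-hours firewall record for the R&D server with the
    most recent logon seen on the source workstation."""
    findings = []
    for fw in firewall:
        if fw["dst"] != RD_SERVER or not after_hours(fw["time"]):
            continue
        # Logons on the same workstation at or before the firewall event.
        candidates = [
            rec for rec in logons
            if rec["workstation"] == fw["src"] and rec["time"] <= fw["time"]
        ]
        if candidates:
            latest = max(candidates, key=lambda rec: rec["time"])
            findings.append((latest["user"], fw["time"], fw["action"]))
    return findings


for user, when, action in after_hours_rd_access(logons, firewall):
    print(f"{user} tried to reach the R&D server at {when} (firewall action: {action})")
```

Joining on the workstation address is the simplest possible link between the two log sources. In a real environment, address reuse (for example through DHCP) and shared machines would make the match less certain.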
But getting that information takes some time and some work. At first glance (and maybe the second one too) it can seem an overwhelming task: the sheer volume of data alone can be daunting. But we think we can help "de-whelm" you. We'll present an overall strategy for handling your logs. We'll show you some different log types and formats. The point of using different log types and formats is twofold. First, it will get you accustomed to looking at log messages and data so you become more familiar with them. Second, it will help you establish a mindset of understanding basic logging formats so you can more easily identify and deal with new or previously unseen log data in your environment. It's a fact of life that different vendors will implement log messages in different formats, but at the end of the day it's all about how you deal with and manage log data. The faster you can understand and integrate new log data into your overall logging system, the faster you will begin to gain value from it.
The remainder of this chapter is geared toward providing a foundation for the concepts that will be presented throughout the rest of this book. The ideas around log data, people, process, and technology will be explored, with some real-world examples sprinkled in to ensure you see the real value in log data.
Log Data Basics
So far we have been referring to logging and log data without providing a concrete description of what these things are. Let's now define, in no uncertain terms, the basics of logging and log data.
What Is Log Data?
At the heart of log data are, simply, log messages, or logs. A log message is what a computer system, device, software, etc. generates in response to some sort of stimulus. What exactly the stimulus is depends greatly on the source of the log message. For example, Unix systems will have user login and logout messages; firewalls will have ACL accept and deny messages; and disk storage systems will generate log messages when failures occur or, in some cases, when the system perceives an impending failure.
Log data is the intrinsic meaning that a log message has. Put another way, log data is the information pulled out of a log message to tell you why the log message was generated. For example, a Web server will often log whenever someone accesses a resource (image, file, etc.) on a Web page. If the user accessing the page had to authenticate herself, the log message would contain the user's name. This is an example of log data: you can use the username to determine who accessed a resource.
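To make the distinction between a log message and log data concrete, here is a minimal Python sketch that pulls the username and resource out of a Web server log message. It assumes the message is in the widely used Common Log Format; the sample line, the regular expression, and the `extract_log_data` helper are illustrative assumptions, not something defined by any particular Web server.

```python
import re

# A made-up log message in Common Log Format (host, ident, user, timestamp,
# request line, status code, response size).
line = '192.0.2.10 - jsmith [10/Oct/2023:13:55:36 -0700] "GET /images/logo.png HTTP/1.1" 200 2326'

# One named group per field so the extracted log data is self-describing.
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def extract_log_data(message):
    """Return the fields of a Common Log Format message as a dict,
    or None if the message does not match the expected layout."""
    match = CLF.match(message)
    if match is None:
        return None
    data = match.groupdict()
    # A "-" in the user field means the request was not authenticated.
    if data["user"] == "-":
        data["user"] = None
    return data


data = extract_log_data(line)
print(data["user"], data["resource"])  # prints "jsmith /images/logo.png"
```

The raw line is the log message. The dictionary returned by `extract_log_data` is the log data: it answers who accessed which resource.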
The term logs is really used to indicate a collection of log messages that will be used collectively to paint a picture of some occurrence.
Log messages can be classified into the following general categories:
Informational: Messages of this type are designed to let users and administrators know that something benign has occurred. For example, Cisco IOS will generate messages when the system is rebooted. Care must be taken, however. If a reboot occurs outside of normal maintenance or business hours, for example, you might have reason to be alarmed. Subsequent chapters in this book will provide you with the skills and techniques to detect when something like this occurs.
Debug: Debug messages are generally generated by software systems to help software developers troubleshoot and identify problems with running application code.
Warning: Warning messages are concerned with situations where something a system needs may be missing, but its absence will not impact system operation. For example, if a program isn't given the proper number of command line arguments but can still run without them, the program might log this simply as a warning to the user or operator.
Error: Error log messages are used to relay errors that occur at various levels in a computer system. For example, an operating system might generate an error log when it cannot synchronize buffers to disk. Unfortunately, many error messages only give you a starting point as to why they occurred. Further investigation is often required in order to get at the root cause of the error.
Chapters 7, 8, 9, 10, 11, 12, 13, 15, and 16 in this book will provide you with ways to deal with this.
Alert: An alert is meant to indicate that something interesting has happened. Alerts, in general, are the domain of security devices and security-related systems, but this is not a hard and fast rule. An Intrusion Prevention System (IPS) may sit in-line on a computer network, examining all inbound traffic. It will make a determination on whether or not a given network connection is allowed through based on the contents of the packet data. If the IPS encounters a connection that might be malicious, it can take any number of pre-configured actions. The determination, along with the action taken...