Software Telemetry

Reliable logging and monitoring

Jamie Riedesel

  1. 560 pages
  2. English
  3. ePUB

About This Book

Software Telemetry shows you how to efficiently collect, store, and analyze system and application log data so you can monitor and improve your systems.

Summary
In Software Telemetry you will learn how to:
  • Manage toxic telemetry and confidential records
  • Master multi-tenant techniques and transformation processes
  • Update to improve the statistical validity of your metrics and dashboards
  • Make software telemetry emissions easier to parse
  • Build easily-auditable logging systems
  • Prevent and handle accidental data leaks
  • Maintain processes for legal compliance
  • Justify increased spend on telemetry software
Software Telemetry teaches you best practices for operating and updating telemetry systems. These vital systems trace, log, and monitor infrastructure by observing and analyzing the events generated by the system. This practical guide is filled with techniques you can apply to any size of organization, with troubleshooting techniques for every eventuality, and methods to ensure your compliance with standards like GDPR.

About the technology
Take advantage of the data generated by your IT infrastructure! Telemetry systems provide feedback on what's happening inside your data center and applications, so you can efficiently monitor, maintain, and audit them. This practical book guides you through instrumenting your systems, setting up centralized logging, doing distributed tracing, and other invaluable telemetry techniques.

About the book
Manage the pillars of observability (logs, metrics, and traces) in an end-to-end telemetry system that integrates with your existing infrastructure. You'll discover how software telemetry benefits both small startups and legacy enterprises. And at a time when data audits are increasingly common, you'll appreciate the thorough coverage of legal compliance processes, so there's no reason to panic when a discovery request arrives.

What's inside
  • Multi-tenant techniques and transformation processes
  • Toxic telemetry and confidential records
  • Updates to improve the statistical validity of your metrics and dashboards
  • Revisions that make software telemetry emissions easier to parse

About the reader
For software developers and infrastructure engineers supporting and building telemetry systems.

About the author
Jamie Riedesel is a staff engineer at Dropbox with over twenty years of experience in IT.

Table of Contents
1 Introduction
PART 1 TELEMETRY SYSTEM ARCHITECTURE
2 The Emitting stage: Creating and submitting telemetry
3 The Shipping stage: Moving and storing telemetry
4 The Shipping stage: Unifying diverse telemetry formats
5 The Presentation stage: Displaying telemetry
6 Marking up and enriching telemetry
7 Handling multitenancy
PART 2 USE CASES REVISITED: APPLYING ARCHITECTURE CONCEPTS
8 Growing cloud-based startup
9 Nonsoftware business
10 Long-established business IT
PART 3 TECHNIQUES FOR HANDLING TELEMETRY
11 Optimizing for regular expressions at scale
12 Standardized logging and event formats
13 Using more nonfile emitting techniques
14 Managing cardinality in telemetry
15 Ensuring telemetry integrity
16 Redacting and reprocessing telemetry
17 Building policies for telemetry retention and aggregation
18 Surviving legal processes

Information

Publisher: Manning
Year: 2021
ISBN: 9781638356479

1 Introduction

This chapter covers
  • What telemetry systems are
  • What telemetry means to different technical groups
  • Challenges unique to telemetry systems
Telemetry is the feedback you get from your production systems that tells you what’s going on in there—feedback that improves your ability to make decisions about your production systems. For NASA, the production system might be a rover on Mars, but most of the rest of us have our production systems right here on Earth (and sometimes in orbit around Earth). Whether it’s the amount of power left in a rover’s batteries or the number of containers live in production right now, everything is telemetry. Modern computing systems, especially those operating at scale, live and breathe telemetry, which is how we can manage systems that large at all. Telemetry is ubiquitous in our industry:
  • If you’ve ever looked at a graph describing site hits over time, you’ve used telemetry.
  • If you’ve ever written a logging statement in code and later looked up those statements in a log-searching tool such as Kibana or Loggly, you’ve used telemetry.
  • If you’ve ever researched application performance in Datadog, you’ve used telemetry.
  • If you’ve ever configured the Apache web server to send logs to a relational database, you’ve used telemetry.
  • If you’ve ever written a Jenkinsfile to send continuous integration test results to another system that could display them better, you’ve used telemetry.
  • If you’ve ever configured GitHub to send webhooks for repository events, you’ve used telemetry.
As figure 1.1 shows, Software Telemetry is about the systems that bring you telemetry and display it in a way that will help you make decisions. Telemetry comes from all kinds of things, from the power distribution units your servers (or your cloud provider’s servers) are plugged into to your running code at the top of the technical pyramid. Taking that telemetry from whatever emitted it and transforming it so that your telemetry can be displayed usefully is the job of the telemetry system. Software Telemetry is all about that system and how to make it durable.
Figure 1.1 Where telemetry systems fit inside your overall technical infrastructure. Everything we run gives us some indication of how it is running. Those indications (dotted lines here) are telemetry, and this book is about handling that telemetry.
Telemetry is a broad topic and one that is rapidly changing. Between 2010 and 2020, our industry saw the emergence of metrics (adding to the monitoring that operations groups were already doing) and distributed tracing, which combined with logs into the three Pillars of Observability. We saw two new styles of telemetry systems in the past decade; who knows what we will see between 2020 and 2030? This book will teach you the fundamentals of how all telemetry systems operate, including ones you haven’t seen yet, which will prepare you to modernize your current systems and adapt to new styles of telemetry. Any time you teach information passing and translation, which is what telemetry systems do, you unavoidably have to cover how people pass information. This book will teach you both the technical details of maintaining and upgrading telemetry systems and the conversations you need to have with your co-workers while you revise and refine your telemetry systems.
All telemetry systems have a similar architecture. Figure 1.2 is an architecture you will see often as you move through parts 1 and 2 of this book.
Figure 1.2 Architecture common to all telemetry systems, though some stages are often combined in smaller architectures. The Emitting stage receives telemetry from your production systems and delivers it to the Shipping stage. The Shipping stage processes and ultimately stores telemetry. The Presentation stage is where people search and work with telemetry. The Emitting and Shipping stages can apply context-related markup to telemetry; the Shipping and Presentation stages can further enrich telemetry by pulling out the details encoded within.
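The three stages in figure 1.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the architecture of any particular product: the names emit, Shipper, and present are hypothetical, the host name web-01 is made up, an in-memory list stands in for real storage, and the "user=" convention in the message text is invented for the example.

```python
import json
import time


def emit(message, level="INFO"):
    """Emitting stage: wrap a production event and add context markup."""
    return {
        "timestamp": time.time(),
        "level": level,
        "message": message,
        "host": "web-01",  # hypothetical host name, added as markup
    }


class Shipper:
    """Shipping stage: process telemetry and ultimately store it."""

    def __init__(self):
        self.storage = []  # assumption: a list stands in for real storage

    def ship(self, event):
        enriched = dict(event)
        # Enrichment: pull a detail encoded in the message into its own field.
        parts = event["message"].split("user=")
        if len(parts) == 2 and parts[1].split():
            enriched["user"] = parts[1].split()[0]
        self.storage.append(json.dumps(enriched))


def present(storage, **filters):
    """Presentation stage: let people search the stored telemetry."""
    results = []
    for raw in storage:
        event = json.loads(raw)
        if all(event.get(key) == value for key, value in filters.items()):
            results.append(event)
    return results


shipper = Shipper()
shipper.ship(emit("login succeeded user=alice"))
shipper.ship(emit("login failed user=bob", level="WARN"))
matches = present(shipper.storage, user="bob")
```

In this sketch, markup happens where the event is created (the host field) and enrichment happens later (the user field), which mirrors the split described in the caption. Smaller architectures often fold two of these stages into one component.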
Telemetry is data that production systems emit to provide feedback about what is happening inside. Telemetry systems are the systems that handle, transform, store, and present telemetry data. This book is all about the systems, so let’s take a look at the four major telemetry styles in use today:
  • Centralized logging—The first telemetry system created, which happened in the early 1980s. This style takes text-based logging output from production systems and centralizes it to ease searching. Note that this technique is the only one widely supported by hardware.
  • Metrics—Grew out of the monitoring systems used by Operations teams and was renamed metrics when software engineers adopted the technique. This system, which emerged in the early 2010s, focuses on numbers rather than text to describe what is happening. Metrics allow much longer timeframes to be kept online and searchable compared to centralized logging.
  • Distributed tracing—Focuses directly on tracking events across many components of a distributed system. (Large monoliths count as a large distributed system, by the way.) This style emerged in the late 2010s and is undergoing rapid development.
  • Security Information Event Management (SIEM)—A specialized telemetry system for use by Security and Compliance teams, and a specialization of centralized logging and metrics. The technique was in use long before the term was formalized in the mid-2000s.
These telemetry styles are used throughout this book, so you will see them mentioned a lot. Section 1.1 provides longer definitions and histories of these telemetry styles and shows how each style conforms to the architecture in figure 1.2.
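To make the difference between the first three styles concrete, the following sketch records one and the same event three ways: as a text log line, as a numeric counter, and as a trace span. It is a simplified illustration, not how any real logging, metrics, or tracing library works; handle_request, the metric name http.requests.200, and the span fields are all hypothetical. SIEM is left out because it builds on logging and metrics.

```python
import time
import uuid
from collections import Counter

log_lines = []       # centralized logging: text, searched later
metrics = Counter()  # metrics: numbers, cheap to keep for a long time
spans = []           # distributed tracing: timed events linked by a trace ID


def handle_request(path, trace_id=None):
    """Pretend to serve a request and emit all three kinds of telemetry."""
    trace_id = trace_id or uuid.uuid4().hex
    start = time.perf_counter()
    status = 200  # assumption: the request always succeeds in this sketch
    duration_ms = (time.perf_counter() - start) * 1000

    # Log: human-readable text describing what happened.
    log_lines.append(f"GET {path} status={status}")
    # Metric: a number that only says how often it happened.
    metrics[f"http.requests.{status}"] += 1
    # Trace span: what happened, how long it took, and which request it belongs to.
    spans.append({"trace_id": trace_id, "name": f"GET {path}",
                  "duration_ms": duration_ms})
    return trace_id


handle_request("/home", trace_id="abc123")
handle_request("/cart")
```

The counter keeps a single number no matter how many requests arrive, while the log list grows by one line per request. That difference in storage cost is one reason metrics can stay online and searchable over much longer timeframes than logs.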
Note: In the past couple of years, the concept of Pillars of Observability has emerged. The word observability was first used to define a specific style of telemetry and evolved as a sophistication of the metrics style. Today, however, observability is generally considered to be a practice rather than a telemetry style. The three pillars are logs, metrics, and traces. If you use all three styles, you are best equipped to observe how your system is operating. This book is about supporting the systems that provide your observability. The SIEM systems used by Security teams are a form of observability, telling you who did what, when they did it, how they did it, and what happened when they did, just like the Pillars.
Because people matter as much as the telemetry data being handled by our telemetry systems, section 1.2 breaks down the many teams inside a technical organization as well as the telemetry systems each team prefers to use. These teams are referenced frequently in the rest of this book.
Finally, telemetry systems face more disasters than production systems do. Section 1.3 covers some of these disasters in brief. Part 3 of this book has several chapters that are useful for making your telemetry systems durable.

1.1 Defining the styles of telemetry

The list of telemetry styles in the introduction to this chapter gives a thumbnail of what each style does and will be a good reference as you move through this book. This section provides far more detailed definitions of the four telemetry styles and gives real-world examples of them.

1.1.1 Defining centralized logging

Centralized logging brings logging data genera...
