front matter
preface
I came to software telemetry the way most of us do: as a producer through the use of print statements in my code and as a consumer by reading the logs and metrics produced by the code I was using. In spite of my computer science degree, I did not go into software engineering right out of college. No, I went into what was then called IT or operations, and I stayed there until I had clocked 14 years of experience. That brought me to 2011, which was a new era in a lot of ways.
That year, I left my job in higher education to join a 20-person legal technology startup as its only operations person. That year also fell in the middle of a revolution in software telemetry, when the monitoring systems long used by operations teams and systems administrators started to be extended for use directly by software. The metrics style of telemetry was born. Over the next decade, we saw two more styles of telemetry emerge as databases became capable enough to support them: observability (which did not last long on its own) and distributed tracing.
When I had the idea for this book in 2019, I had watched the feedback software engineers use evolve over two and a half decades. In the beginning, it was common for developers to watch log files inside a telnet session directly in production, and by 2019, all that telemetry was instead accessed through browser-based applications. Telemetry (the feedback engineers use to understand their environments) was an understood concept centering on the three Pillars of Observability: logs, metrics, and traces. And I, who was still on the systems or platform side of the infrastructure, realized that all these new telemetry methods had the same core concepts, and the same core vulnerabilities. I looked for, and I found, plenty of resources on specific technologies such as Kafka, Prometheus, application monitoring, and how to do centralized logging. But no resources discussed the ecosystem of telemetry systems that were available.
That lack is terrible. Telemetry systems underpin the efficient functioning of software development organizations, because these systems tell you how your code (and the systems that run your code) is operating. There are so many competing demands on our telemetry systems now. I set out to write a book to help you navigate these competing concerns, improve cost management, and get better at operating these mission-critical systems. This book is about improving what you're already doing and better adapting to new telemetry technologies as they emerge.
This book is about improving what you already have, because every software ecosystem has at least some telemetry at its core. Whether you're working on a planet-scale Software as a Service (SaaS) application that deploys to wider percentages of your global data centers as part of your canary deploy process, or a time-card entry system for your city government that you update every couple of months, you're using telemetry. This book is both for companies in which software is the business and organizations in which software merely enables the mission.
If your ecosystem is a fleet of serverless functions running in your cloud provider's platform, or if you're running a VMware ESX cluster down the hall, you need software telemetry in much the same ways, even if the tools you use are quite different. Telemetry is a vast topic, with no one product (or even technique) suiting everyone's needs. After reading this book, I want you to better understand what your needs are and how to go about meeting them.
Figure FM.1 Where telemetry systems fit alongside production systems. All production systems emit telemetry; telemetry is how we know they're working right. This book is about the systems that handle that telemetry and transform it so that people can view it.
As an industry, we've come a long way from the beginning of the digital age, when a blinking indicator light on the room-size computer was our only feedback that it was actually processing something. (Blinking too fast or too slow meant something was wrong.) The figure shows where telemetry systems fit into a modern web development stack, which is connected to everything.
We're not done innovating our feedback systems, not by a long shot. Expect fun and interesting things to come onto the market over the next 10 years. This book should set you up to operate those systems when they arrive.