Observability is the ability to understand a system’s internal behavior from the outputs it exposes externally.
To assess the current state of a system, it is common to use the data that the system itself generates (telemetry), such as logs, metrics and traces.
Logs are text messages that aid in understanding the system’s behavior. Services or components emit them in response to events, and each message is tied to the time at which its event took place.
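As a rough illustration of a service emitting timestamped log messages, the sketch below uses Python’s standard logging module to write one JSON object per event. The service name "checkout", the field names and the messages are assumptions made for this example, not part of any particular product.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),   # when the event took place
            "level": record.levelname,         # severity of the event
            "service": record.name,            # which component emitted it
            "message": record.getMessage(),    # what happened
        })


def build_logger(service_name, stream):
    """Create a logger for one (hypothetical) service that writes to `stream`."""
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False                   # keep output in our handler only
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()                         # stands in for a log file or collector
log = build_logger("checkout", buffer)
log.info("order %s accepted", "A-1001")
log.error("payment provider timed out")
print(buffer.getvalue(), end="")
```

Writing logs as structured records rather than free text is one common choice, since it lets the same messages be filtered by service, level or time later on.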
Metrics are counts, numerical values or measurements that are often calculated or aggregated over a particular time period.
They may originate from different sources, such as infrastructure, hosts, services, cloud platforms and external sources.
Examples of metrics include a system’s error rate and the request rate for a specific service.
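To make the two example metrics concrete, here is a minimal sketch that aggregates a list of requests over a time window. The Request type, the 10-second window and the convention that status codes of 500 or above count as errors are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    timestamp: float  # seconds since the start of observation
    status: int       # HTTP status code returned to the caller


def summarize(requests, window_start, window_seconds):
    """Aggregate request rate and error rate over one time window."""
    window_end = window_start + window_seconds
    in_window = [r for r in requests if window_start <= r.timestamp < window_end]
    total = len(in_window)
    # Assumption: a 5xx status code counts as an error.
    errors = sum(1 for r in in_window if r.status >= 500)
    return {
        "request_rate": total / window_seconds,           # requests per second
        "error_rate": errors / total if total else 0.0,   # fraction of failed requests
    }


sample = [
    Request(1.0, 200),
    Request(2.5, 200),
    Request(4.0, 500),
    Request(7.0, 200),
    Request(12.0, 503),  # falls outside the first 10-second window
]
stats = summarize(sample, window_start=0.0, window_seconds=10.0)
print(stats)  # {'request_rate': 0.4, 'error_rate': 0.25}
```

Four requests fall inside the window and one of them failed, which gives 0.4 requests per second and an error rate of 25%.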
Traces, also known as distributed traces, record the paths that requests take as they flow through the applications.
Example of a trace view:
Image taken from https://opentelemetry.io/docs/concepts/observability-primer/
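The path a request takes can also be sketched in code. The example below models a trace as a set of spans, each pointing to its parent, and prints them as an indented tree. The Span type, the service names and the timings are hypothetical and are not taken from the OpenTelemetry API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    name: str
    start_ms: int
    end_ms: int


def render_trace(spans):
    """Return the spans of one trace as indented lines, parents before children."""
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines = []

    def visit(span, depth):
        duration = span.end_ms - span.start_ms
        lines.append(f"{'  ' * depth}{span.name} ({duration} ms)")
        # Show child operations in the order in which they started.
        for child in sorted(children.get(span.span_id, []), key=lambda s: s.start_ms):
            visit(child, depth + 1)

    for root in children.get(None, []):
        visit(root, 0)
    return lines


trace = [
    Span("a", None, "frontend: GET /checkout", 0, 120),
    Span("c", "a", "payment-service: charge", 45, 110),
    Span("b", "a", "cart-service: load cart", 10, 40),
    Span("d", "c", "database: insert payment", 50, 70),
]
for line in render_trace(trace):
    print(line)
```

Each level of indentation corresponds to one hop of the request, so the output shows both which components were involved and how long each one took.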
Logs, metrics and traces enable the assessment of:
- Whether a service is functioning correctly.
- Where it runs.
- How it has behaved in the past and how it has interacted with other parts of the system.
Together, they help make a system observable.
The purpose of observability is to understand what is happening inside a system in order to detect and solve problems, and thus keep systems efficient and reliable. For this reason, greater observability makes it easier to move from diagnosing a problem to finding its causes and defining a solution.
Observability targets the unknown problems that arise in modern applications: dynamic applications whose scale and complexity change constantly, and which produce problems that are neither anticipated nor monitored. It makes it possible to understand problems continuously and automatically as they arise, and even to solve them proactively before they affect users.
Other reasons why observability is important:
- It reduces the time spent solving problems.
- It makes it possible to develop predictive capabilities that help prevent recurring problems.
- It improves user satisfaction.
- It increases operational efficiency.
- It supports the development of higher-quality software.