Caleb received a Slack alert about an issue with one of the company’s backend applications. How do Caleb and his team track down the error in a complex maze of backend services?

Let’s take a look at the general steps Caleb’s team might go through when an issue arises:

1) Evaluate usage and performance data.
2) Identify the cause of the issue.
3) Apply the appropriate solution, restoring system performance.

Observability is the degree to which a system’s information can be used to locate and fix a problem. In a system with high observability, a team can more easily trace, diagnose, and fix the problem. With poor observability, the data does little to help.

To improve a system’s observability:

  • Make sure the team is aligned with service level objectives
  • Create meaningful alerts
  • Optimize application logging by ensuring messages are informational and descriptive
  • Automate work processes
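
As a rough illustration of the logging point above, the sketch below contrasts a vague log message with a descriptive one using Python’s standard logging module. The service name, the describe_failure helper, and all field names and values are hypothetical, chosen only for this example.

```python
import logging

logging.basicConfig(
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
    level=logging.INFO,
)
logger = logging.getLogger("payments")  # hypothetical service name


def describe_failure(order_id: str, upstream: str, status: int, elapsed_ms: int) -> str:
    """Build a message that says what failed, for which record, and how."""
    return (
        f"charge failed order_id={order_id} upstream={upstream} "
        f"status={status} elapsed_ms={elapsed_ms}"
    )


# Vague: tells the on-call engineer almost nothing.
logger.error("Something went wrong")

# Descriptive: names the operation, the affected record, and the upstream response.
logger.error(describe_failure("A-1001", "card-gateway", 503, 2140))
```

With the second message, someone responding to an alert can search the logs by order ID or upstream service instead of guessing where the failure happened.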

Maintaining an observable system enables teams to proactively monitor for and track down errors.
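
One possible way to tie alerts to a service level objective is to compare the observed error rate against the error budget the objective allows. The sketch below assumes a hypothetical 99.9% success target and made-up request counts; a real team would pick its own target and measurement window.

```python
def error_rate(failed: int, total: int) -> float:
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    return failed / total if total else 0.0


def should_alert(failed: int, total: int, slo_target: float = 0.999) -> bool:
    """Return True when the error rate exceeds the budget implied by the SLO.

    slo_target is an assumed success-rate objective (99.9% by default).
    """
    error_budget = 1.0 - slo_target
    return error_rate(failed, total) > error_budget


# 5 failures in 1,000 requests is a 0.5% error rate, above the 0.1% budget.
print(should_alert(failed=5, total=1000))

# 1 failure in 10,000 requests is a 0.01% error rate, within the budget.
print(should_alert(failed=1, total=10000))
```

An alert defined this way fires only when users are affected beyond what the team has agreed is acceptable, which helps keep alerts meaningful.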


What might be the effects of a system with very low observability?

The primary effect would be a large increase in the time it takes to solve problems. Time spent fixing bugs could surpass time spent delivering new features, and new features would be delayed while teams deal with large amounts of unexpected work. Observability is key to cutting down the time spent on these problems!

We’ve learned about the important roles monitoring and observability play in our system. But how do we know we are doing a good job? Next, we will discuss how to measure the quality of our system’s monitoring.
