Clark added a new service to handle more traffic for his growing business. But he doesn’t know how his new application is doing! What information could help him gain insight into his system?
Clark can measure how his application is operating with the help of metrics. A metric expresses a value relevant to the system at a specific point in time. Here are some key metrics for monitoring system health and performance:
Latency: the time between the start of an event, such as serving a request, and its completion. This metric is a key indicator of performance.
Traffic/Connections: Traffic is the amount of system usage over time. An abnormal amount of traffic can require scaling to maintain performance.
Errors: An error is an invalid state our system has reached. Examples include exceeding a memory limit or reading a corrupted data file. The rate of errors returned by a service can indicate deeper issues.
Saturation: Saturation describes the load on our system’s resources. Our system reaching its limits can result in poor performance.
Tracking these metrics can give teams a broad view of system health and help diagnose issues.
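To make the four metrics above concrete, here is a minimal Python sketch of how they might be computed from a batch of recorded requests. Everything in it is an assumption for illustration: the `Request` record, the `summarize` function, and the choice of CPU usage as the saturated resource are hypothetical and not part of any particular monitoring tool.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Request:
    """Hypothetical record of one served request."""
    start: float  # time the request began, in seconds
    end: float    # time the request completed, in seconds
    ok: bool      # False if the request ended in an error


def summarize(requests, window_seconds, cpu_used, cpu_capacity):
    """Compute latency, traffic, error rate, and saturation for one window.

    cpu_used and cpu_capacity stand in for whichever resource is being
    watched; memory or disk would work the same way.
    """
    if not requests:
        # No requests in the window: latency and error rate are undefined.
        avg_latency = None
        error_rate = None
    else:
        # Latency: completion time minus start time, averaged over requests.
        avg_latency = mean(r.end - r.start for r in requests)
        # Errors: share of requests that failed.
        error_rate = sum(1 for r in requests if not r.ok) / len(requests)

    return {
        "avg_latency_s": avg_latency,
        # Traffic: requests served per second over the window.
        "traffic_rps": len(requests) / window_seconds,
        "error_rate": error_rate,
        # Saturation: fraction of the resource's capacity in use.
        "saturation": cpu_used / cpu_capacity,
    }


# Example usage with made-up numbers.
sample = [
    Request(start=0.0, end=0.25, ok=True),
    Request(start=1.0, end=1.5, ok=True),
    Request(start=2.0, end=2.75, ok=False),
    Request(start=3.0, end=3.5, ok=True),
]
print(summarize(sample, window_seconds=4, cpu_used=3, cpu_capacity=4))
```

An average is used for latency only to keep the sketch short. In practice, teams often track percentiles as well, since a handful of very slow requests can hide behind a healthy-looking average.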
Can you think of other system metrics we can monitor?
See the answer! Some metrics we can retrieve include, but are not limited to:
We now have an understanding of the kind of data that needs to be monitored. Can metrics be tied to how we think about our business’ performance?