Description: Sudden drop in the number of requests
Detects on: An abrupt drop (within an interval of approximately 3-5 minutes) in the calls-per-second metric (call rate). The drop has to be at least 5 calls/sec and at least 30% relative to the value before.
Assumptions: Requires the service/application to have been running for 60 minutes and a stable baseline in the call rate for the last 30 minutes. When the call rate drops to 0, the baseline assumption is not taken into account and an issue is raised regardless.
Severity: RED - relative drop > 90%, YELLOW - otherwise
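The thresholds of this rule can be sketched in a few lines of Python. This is only an illustration of the numbers stated above, not Instana's implementation: the function name `classify_call_drop` and the `baseline_stable` flag (which stands in for both the running-time and the stable-baseline assumptions) are hypothetical, and the sketch assumes that the 5 calls/sec and 30% thresholds still apply when the call rate drops to 0.

```python
from typing import Optional


def classify_call_drop(before: float, after: float,
                       baseline_stable: bool = True) -> Optional[str]:
    """Return "RED", "YELLOW", or None for a drop in call rate (calls/sec).

    `before` and `after` are the call rates on either side of the
    3-5 minute interval. All names here are illustrative assumptions.
    """
    drop = before - after
    if before <= 0 or drop <= 0:
        return None  # no drop at all

    # A drop to 0 is reported regardless of the baseline assumption.
    if after > 0 and not baseline_stable:
        return None

    relative_drop = drop / before
    # Both thresholds must hold: at least 5 calls/sec and at least 30%.
    if drop < 5 or relative_drop < 0.30:
        return None

    return "RED" if relative_drop > 0.90 else "YELLOW"
```

For example, a drop from 100 to 60 calls/sec (40 calls/sec, 40%) would be YELLOW, while a drop from 100 to 5 calls/sec (95%) would be RED.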
Description: Sudden increase in latency for a fraction of requests
Detects on: An abrupt increase (within an interval of approximately 3-5 minutes) in the 99th percentile of the latency metric. The increase has to be at least 1000 ms and at least 100% relative to the value before.
Assumptions: Requires the service/application to have been running for 30 minutes and a stable baseline in the latency for the last 30 minutes.
Severity: YELLOW
Description: Sudden increase in average latency
Detects on: An abrupt increase (within an interval of approximately 3-5 minutes) in the 50th percentile (median) of the latency metric. The increase has to be at least 1000 ms and at least 100% relative to the value before.
Assumptions: Requires the service/application to have been running for 30 minutes and a stable baseline in the latency for the last 30 minutes.
Remark: If "Sudden increase in latency for a fraction of requests" has already been detected for a service/application and "Sudden increase in average latency" is detected afterwards, the former issue is replaced by the latter.
Severity: YELLOW
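The two latency rules share the same thresholds and differ only in the percentile they watch. The following Python sketch illustrates that, together with the replacement described in the remark. It is an assumption-laden illustration rather than Instana's actual logic: the function names are hypothetical, the running-time and baseline assumptions are left out, and the replacement is modeled simply as the median-based issue taking precedence.

```python
from typing import Optional

P99_ISSUE = "Sudden increase in latency for a fraction of requests"
P50_ISSUE = "Sudden increase in average latency"


def is_sudden_latency_increase(before_ms: float, after_ms: float) -> bool:
    """True if latency rose by at least 1000 ms and at least 100%."""
    increase = after_ms - before_ms
    # "+100% relative" means the increase is at least as large as the
    # previous value, i.e. the latency has at least doubled.
    return increase >= 1000 and increase >= before_ms


def active_latency_issue(p50_before: float, p50_after: float,
                         p99_before: float, p99_after: float) -> Optional[str]:
    """Return the latency issue that would be shown, if any."""
    # The median-based issue replaces the 99th-percentile issue.
    if is_sudden_latency_increase(p50_before, p50_after):
        return P50_ISSUE
    if is_sudden_latency_increase(p99_before, p99_after):
        return P99_ISSUE
    return None
```

With a median going from 200 ms to 250 ms and a 99th percentile going from 800 ms to 2000 ms, only the "fraction of requests" issue would be reported. If the median also jumped from 300 ms to 1500 ms, the "average latency" issue would be reported instead.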
Description: Sudden increase in error rate
Detects on: An abrupt increase (within an interval of approximately 3-5 minutes) in the error rate metric. The increase has to be at least +5% error rate in absolute values and +10 percentage points relative to the value before.
Assumptions: Requires the service/application to have been running for 10 minutes and a stable baseline in the error rate for the last 10 minutes.
Severity: YELLOW
Description: Increasing trend in error rate
Detects on: An increasing trend in the error rate metric. The increase has to be at least 10 percentage points relative to the value before and last for 30 minutes.
Assumptions: Requires the service/application to have been running for 30 minutes.
Severity: YELLOW
Description: Error rate too high
Detects on: An average error rate above 50% over the last 4 minutes (interruptions of arbitrary length in the metric value, e.g. due to the service not being called, are allowed and ignored).
Severity: RED - error rate is above 80%, YELLOW - otherwise
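As a rough illustration of this rule, the Python sketch below averages the error-rate samples of the window and skips gaps. It is not Instana's implementation: the function name, the use of `None` to mark a gap, and the reading of "ignored" as "left out of the average" are assumptions made for the example.

```python
from typing import Optional, Sequence


def classify_high_error_rate(
        samples: Sequence[Optional[float]]) -> Optional[str]:
    """Return "RED", "YELLOW", or None for a window of error rates.

    `samples` holds error rates between 0.0 and 1.0 covering the last
    4 minutes; None marks an interruption (e.g. service not called).
    """
    present = [s for s in samples if s is not None]
    if not present:
        return None  # nothing but gaps, nothing to evaluate

    average = sum(present) / len(present)
    if average > 0.80:
        return "RED"
    if average > 0.50:
        return "YELLOW"
    return None
```

For instance, `classify_high_error_rate([0.6, 0.55, None, 0.7])` averages the three available samples and would yield YELLOW.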
What are Instana's built-in rules for service quality detection?