What is the rate of node_context_switches_total, and why rate(node_context_switches_total[5m]) > 1000?
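A context switch is the action of storing the state of a process or thread so that it can be restored and resumed later. According to the Prometheus documentation and the metric description, node_context_switches_total is the total number of context switches.
It is a counter, so rate(node_context_switches_total[5m]) gives the average number of context switches per second over the last 5 minutes, and the comparison > 1000 is true when that average exceeds 1000 per second.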
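To make the relationship between the counter and its rate concrete, here is a minimal Python sketch. It assumes a Linux host, where the cumulative context-switch count is exposed on the `ctxt` line of /proc/stat. The function names `parse_ctxt` and `simple_rate` and the sample values are made up for illustration. The calculation is only an approximation of what rate() does: it accounts for counter resets, but it does not reproduce the extrapolation to the window edges that Prometheus applies.

```python
from typing import List, Tuple


def parse_ctxt(stat_text: str) -> int:
    """Return the cumulative context-switch count from /proc/stat-style text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def simple_rate(samples: List[Tuple[float, float]]) -> float:
    """Average per-second increase over (timestamp_seconds, counter_value) samples.

    A drop in the counter is treated as a reset to zero, so the value seen
    after the drop is counted as the increase for that step.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return increase / elapsed


if __name__ == "__main__":
    # Hypothetical samples covering a 5-minute (300 s) window.
    samples = [(0.0, 1_000_000.0), (150.0, 1_180_000.0), (300.0, 1_360_000.0)]
    per_second = simple_rate(samples)
    print(f"{per_second:.1f} context switches/s")  # 1200.0 context switches/s
    print(per_second > 1000)  # True
```

With these made-up samples the counter grows by 360,000 in 300 seconds, so the result is 1200 per second, which is above the 1000 threshold.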
A typical alert looks like:
- alert: ContextSwitching
  expr: rate(node_context_switches_total[5m]) > 1000
  for: 30m
  labels:
    severity: warning
  annotations:
    summary: "Context switching (instance {{ $labels.instance }})"
    description: "Context switching is growing on node (> 1000 / s)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
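The `for: 30m` clause means the alert does not fire the first time the expression is true. It stays pending until the expression has been true at every evaluation for 30 minutes. The following Python sketch illustrates that behaviour under simplifying assumptions: the function name `alert_state`, the 60-second evaluation interval, and the values are hypothetical, and this is not how Prometheus itself is implemented.

```python
from typing import List, Tuple


def alert_state(
    evaluations: List[Tuple[float, float]],
    threshold: float = 1000.0,
    for_seconds: float = 30 * 60,
) -> str:
    """Return 'inactive', 'pending' or 'firing' after replaying the evaluations.

    Each evaluation is (timestamp_seconds, rate_value). The alert is pending
    while the value is above the threshold, and firing once it has stayed
    above the threshold for at least for_seconds without interruption.
    """
    active_since = None
    state = "inactive"
    for ts, value in evaluations:
        if value > threshold:
            if active_since is None:
                active_since = ts  # the expression just became true
            state = "firing" if ts - active_since >= for_seconds else "pending"
        else:
            active_since = None  # any false evaluation resets the timer
            state = "inactive"
    return state


if __name__ == "__main__":
    # Hypothetical evaluations every 60 s, always above the threshold.
    steady = [(t * 60.0, 1200.0) for t in range(31)]
    print(alert_state(steady))  # firing

    # Same series, but one dip below the threshold at minute 20.
    dipped = [(ts, 900.0 if ts == 20 * 60.0 else v) for ts, v in steady]
    print(alert_state(dipped))  # pending
```

In the second case the dip resets the timer, so by minute 30 the expression has only been true for 9 minutes and the alert is still pending.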