Alert when docker container pod is in Error or CrashLoopBackOff kubernetes



Prometheus collects a wide range of metrics. For example, you can use the metric kube_pod_container_status_restarts_total to monitor restarts, which will reflect your problem.

It contains labels which you can use in the alert:

  • container=container-name
  • namespace=pod-namespace
  • pod=pod-name
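For illustration, a sample of this metric as scraped from kube-state-metrics might look like the following (the container, namespace, and pod names here are hypothetical):

```
kube_pod_container_status_restarts_total{container="my-app", namespace="default", pod="my-app-5d9c7b6c4-abcde"} 7
```

The value is a counter of restarts for that container, which is why the alerts below either compare it against a threshold or take its rate.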

So all you need to do is configure your alertmanager.yml with the correct SMTP settings and a receiver, and define rules like this:

    global:
      # The smarthost and SMTP sender used for mail notifications.
      smtp_smarthost: 'localhost:25'
      smtp_from: 'alertmanager@example.org'
      smtp_auth_username: 'alertmanager'
      smtp_auth_password: 'password'

    receivers:
    - name: 'team-X-mails'
      email_configs:
      - to: 'team-X+alerts@example.org'

    # Only one default receiver
    route:
      receiver: team-X-mails

    # Example group with one alert
    groups:
    - name: example-alert
      rules:
      # Alert about restarts
      - alert: RestartAlerts
        expr: sum(kube_pod_container_status_restarts_total) by (pod) > 5
        for: 10m
        annotations:
          summary: "More than 5 restarts in pod {{ $labels.pod }}"
          description: "{{ $labels.container }} restarted {{ $value }} times in pod {{ $labels.namespace }}/{{ $labels.pod }}"
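One caveat: the groups: section is a Prometheus alerting rule file, not part of alertmanager.yml. Alertmanager only handles routing and notification; the rules themselves are evaluated by Prometheus. Assuming you save the rules as rules.yml (the file names and the Alertmanager address below are assumptions for your setup), you would wire them together in prometheus.yml:

```yaml
# prometheus.yml (fragment): load the alert rules and point at Alertmanager
rule_files:
  - rules.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
```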


I'm using this one:

    - alert: PodCrashLooping
      annotations:
        description: Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is restarting {{ printf "%.2f" $value }} times / 5 minutes.
        summary: Pod is crash looping.
      expr: rate(kube_pod_container_status_restarts_total{job="kube-state-metrics",namespace=~".*"}[5m]) * 60 * 5 > 0
      for: 5m
      labels:
        severity: critical
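If you want to alert on the CrashLoopBackOff state itself rather than infer it from the restart rate, kube-state-metrics also exposes kube_pod_container_status_waiting_reason, which is 1 while a container is waiting with the given reason. A minimal sketch (the alert name and the for: duration are assumptions to tune for your environment):

```yaml
# Fires when a container has been stuck waiting in CrashLoopBackOff.
- alert: PodWaitingCrashLoopBackOff
  expr: kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} > 0
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is in CrashLoopBackOff.
```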