How to trigger an alert in Kubernetes using Prometheus Alertmanager
To send an alert to your Gmail account, you need to set up the Alertmanager configuration in a file, say alertmanager.yaml:
cat <<EOF > alertmanager.yaml
route:
  group_by: [Alertname]
  # Send all notifications to me.
  receiver: email-me
receivers:
- name: email-me
  email_configs:
  - to: $GMAIL_ACCOUNT
    from: $GMAIL_ACCOUNT
    smarthost: smtp.gmail.com:587
    auth_username: "$GMAIL_ACCOUNT"
    auth_identity: "$GMAIL_ACCOUNT"
    auth_password: "$GMAIL_AUTH_TOKEN"
EOF
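The heredoc above substitutes $GMAIL_ACCOUNT and $GMAIL_AUTH_TOKEN from your shell environment, so export them first. The values below are placeholders; for Gmail you would typically use an app password here, not your account password:

```shell
# Placeholder values -- replace with your own address and Gmail app password.
export GMAIL_ACCOUNT="you@gmail.com"
export GMAIL_AUTH_TOKEN="your-app-password"
```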
Now, as you're using kube-prometheus, you will already have a secret named alertmanager-main that holds the default Alertmanager configuration. You need to recreate the alertmanager-main secret with the new configuration using the following command:
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
Now your Alertmanager is set to send an email whenever it receives an alert from Prometheus.
Next, you need to set up an alert that will trigger the email. You can use the DeadMansSwitch alert, which fires unconditionally and is used to verify that your alerting pipeline works:
groups:
- name: meta
  rules:
  - alert: DeadMansSwitch
    expr: vector(1)
    labels:
      severity: critical
    annotations:
      description: This is a DeadMansSwitch meant to ensure that the entire Alerting pipeline is functional.
      summary: Alerting DeadMansSwitch
After that, the DeadMansSwitch alert will fire and should send an email to your mailbox.
EDIT:
The DeadMansSwitch alert should go in a ConfigMap which your Prometheus is reading. I will share the relevant snippets from my Prometheus setup here:
"spec": { "alerting": { "alertmanagers": [ { "name": "alertmanager-main", "namespace": "monitoring", "port": "web" } ] }, "baseImage": "quay.io/prometheus/prometheus", "replicas": 2, "resources": { "requests": { "memory": "400Mi" } }, "ruleSelector": { "matchLabels": { "prometheus": "prafull", "role": "alert-rules" } },
The above config is from my prometheus.json file; it names the Alertmanager to use, and its ruleSelector selects rules based on the prometheus and role labels. So I have my rule ConfigMap like:
kind: ConfigMap
apiVersion: v1
metadata:
  name: prometheus-rules
  namespace: monitoring
  labels:
    role: alert-rules
    prometheus: prafull
data:
  alert-rules.yaml: |+
    groups:
    - name: alerting_rules
      rules:
      - alert: LoadAverage15m
        expr: node_load15 >= 0.50
        labels:
          severity: major
        annotations:
          summary: "Instance {{ $labels.instance }} - high load average"
          description: "{{ $labels.instance }} (measured by {{ $labels.job }}) has high load average ({{ $value }}) over 15 minutes."
Replace the example rule with the DeadMansSwitch rule in the above ConfigMap.
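After the replacement, the ConfigMap would look like this (same metadata and labels as above, only the rule swapped for the DeadMansSwitch):

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: prometheus-rules
  namespace: monitoring
  labels:
    role: alert-rules
    prometheus: prafull
data:
  alert-rules.yaml: |+
    groups:
    - name: meta
      rules:
      - alert: DeadMansSwitch
        expr: vector(1)
        labels:
          severity: critical
        annotations:
          description: This is a DeadMansSwitch meant to ensure that the entire Alerting pipeline is functional.
          summary: Alerting DeadMansSwitch
```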
If you are using kube-prometheus, by default it has the alertmanager-main secret and a Prometheus kind set up.
Step 1: You have to remove the existing alertmanager-main secret:
kubectl delete secret alertmanager-main -n monitoring
Step 2: As Praful explained, create the secret with the new configuration:
cat <<EOF > alertmanager.yaml
route:
  group_by: [Alertname]
  # Send all notifications to me.
  receiver: email-me
receivers:
- name: email-me
  email_configs:
  - to: $GMAIL_ACCOUNT
    from: $GMAIL_ACCOUNT
    smarthost: smtp.gmail.com:587
    auth_username: "$GMAIL_ACCOUNT"
    auth_identity: "$GMAIL_ACCOUNT"
    auth_password: "$GMAIL_AUTH_TOKEN"
EOF
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
Step 3: You have to add a new Prometheus rule:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  creationTimestamp: null
  labels:
    prometheus: k8s
    role: alert-rules
  name: prometheus-podfail-rules
spec:
  groups:
  - name: ./podfail.rules
    rules:
    - alert: PodFailAlert
      expr: sum(kube_pod_container_status_restarts_total{container="ffmpeggpu"}) BY (container) > 10
NB: The label should be role: alert-rules, which is what the ruleSelector of the Prometheus kind specifies. To check that, use:
kubectl get prometheus k8s -n monitoring -o yaml
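In the output, look for a ruleSelector like the one below (the label values may differ in your cluster; prometheus: k8s is the name kube-prometheus uses by default):

```yaml
spec:
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
```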