Is there a way to filter a metric by a string value, part of which comes from the result of another query, in a Prometheus query?
With PromQL, you won't be able to do this the way you describe it. Moreover, I am not sure the last schedule time is always the same as the job start time, for example if there is a slowdown or a reschedule somewhere.
You can follow the approach indicated in this article. An alternative is to use the job metrics to determine:
the timestamp of the last failed job per cronjob
- record: job_cronjob:kube_job_status_start_time:last_failed
  expr: max((kube_job_status_start_time AND kube_job_status_failed == 1) * ON(job,namespace) GROUP_LEFT(label_cronjob) kube_job_labels{label_cronjob!=""}) BY(label_cronjob)
the timestamp of the last successful job per cronjob
- record: job_cronjob:kube_job_status_start_time:last_suceeded
  expr: max((kube_job_status_start_time AND kube_job_status_succeeded == 1) * ON(job,namespace) GROUP_LEFT(label_cronjob) kube_job_labels{label_cronjob!=""}) BY(label_cronjob)
Then alert if the last failed job is more recent than the last successful one:
- alert: CronJobStatusFailed
  expr: job_cronjob:kube_job_status_start_time:last_failed > job_cronjob:kube_job_status_start_time:last_suceeded
  for: 1m
  annotations:
    description: '{{ $labels.label_cronjob}} last run has failed.'
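
To sanity-check the logic of these rules outside Prometheus, here is a minimal Python sketch that replays it on made-up job samples. The `JobSample` type, the helper names, and the sample data are all hypothetical; the sketch only mirrors the intent of the two recording rules (latest start time per cronjob) and of the alert comparison, not how Prometheus evaluates them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSample:
    """One Job as seen through the kube_job_* metrics (hypothetical sample data)."""
    cronjob: str       # value of label_cronjob on kube_job_labels
    start_time: float  # kube_job_status_start_time (Unix seconds)
    failed: bool       # kube_job_status_failed == 1
    succeeded: bool    # kube_job_status_succeeded == 1


def last_start_by_cronjob(jobs, predicate):
    """Mimic max(...) BY(label_cronjob): latest start time of matching jobs."""
    latest = {}
    for job in jobs:
        if predicate(job):
            previous = latest.get(job.cronjob, job.start_time)
            latest[job.cronjob] = max(previous, job.start_time)
    return latest


def failed_cronjobs(jobs):
    """Mimic the alert expression: last_failed > last_succeeded."""
    last_failed = last_start_by_cronjob(jobs, lambda j: j.failed)
    last_succeeded = last_start_by_cronjob(jobs, lambda j: j.succeeded)
    # A PromQL comparison only returns label sets present on both sides,
    # so a cronjob that has never succeeded is not reported here either.
    return sorted(
        name for name, ts in last_failed.items()
        if name in last_succeeded and ts > last_succeeded[name]
    )


jobs = [
    JobSample("backup", 1000.0, failed=False, succeeded=True),
    JobSample("backup", 2000.0, failed=True, succeeded=False),
    JobSample("report", 1000.0, failed=True, succeeded=False),
    JobSample("report", 2000.0, failed=False, succeeded=True),
    JobSample("cleanup", 1500.0, failed=True, succeeded=False),
]
alerts = failed_cronjobs(jobs)
print(alerts)  # ['backup']
```

Note the `cleanup` case: because the alert compares two recorded series, a cronjob whose jobs have only ever failed has no `last_suceeded` series to compare against and will not fire with the rule as written.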