NRQL Query to get latest K8s pod with feature xyz and alert

This question is more for the NRQL-ninjas out there. :smiley:

I’m running a postgres operator in K8s. This operator has the ability to do scheduled backups. For this it spawns a backup-pod which either gets the status Succeeded/Completed or Failed/Error after it’s finished. But this pods don’t get cleaned up. So after each backup there is a new backup-pod.

After a few days it looks like this

$ kubectl get pods
[...]
postgres-keycloak-repo1-full-27608265-5hcmz   0/1     Error       0          45h
postgres-keycloak-repo1-full-27608265-c2l8w   0/1     Error       0          45h
postgres-keycloak-repo1-full-27608265-jkpcv   0/1     Error       0          45h
postgres-keycloak-repo1-full-27608265-49rdz   0/1     Error       0          45h
postgres-keycloak-repo1-full-27608265-8qgld   0/1     Error       0          45h
postgres-keycloak-repo1-full-27608265-qvhbd   0/1     Error       0          45h
postgres-keycloak-repo1-full-27608265-97zzm   0/1     Error       0          45h
postgres-keycloak-repo1-full-27608380-qz9qg   0/1     Completed   0          43h
postgres-keycloak-repo1-full-27609180-sdzrd   0/1     Completed   0          30h
postgres-keycloak-repo1-full-27610620-hvcg9   0/1     Completed   0          6h26m
[...]

Now I want to monitor this backup with newrelic. My idea was to get the latest pod and check for the status. When it’s Succeeded it’s ok and when it’s Failed its bad and should trigger an alert.

So far I wrote a query that returns the number of latest pods with the status Failed

SELECT count(*) FROM (
FROM K8sPodSample SELECT latest(createdAt), latest(podName), (latest(timestamp)/1000 - latest(createdAt))/3600 as 'age / h', latest(status) as status WHERE namespace = 'keycloak' AND podName LIKE '%postgres-keycloak-repo1-full%' FACET podName ORDER BY createdAt LIMIT 1) WHERE status = 'Failed'

But I can’t create an alert from this, as alerts always seem to be based on a TIMESERIES.

Maybe there is a better way to get this alert working? I’m glad for all input.

Yours
Michael

1 Like

I’m running into this exact same issue. I went down a path like this:

SELECT count(*)
FROM K8sPodSample 
WHERE createdKind = 'Job' 
AND podName LIKE ‘example-job-%’
AND status = 'Failed'
AND createdAt = (
  SELECT max(createdAt)
  FROM K8sPodSample 
  WHERE createdKind = 'Job' 
  AND podName LIKE ‘example-job-%’
)

It seems to work as I’d expect in Query Builder but when I try to deploy it (we use Terraform) I get the following error so it appears the you cannot use nested queries like this in Alerts.

Unable to parse NRQL query: Validation failed. Error at line 1 position 257, SELECT is not permitted here: BAD_USER_INPUT

@michael.schmid1 - Were you ever able to solve this or find a satisfactory workaround?

@Matt.Nohelty2 and @michael.schmid1 Following up since its been awhile. Were either of you able to find a solution?

This category is not monitored by our support team but hoping if you haven’t found the answer another community member will be able to assist.