Hi,
I would like to configure an alert in New Relic to detect and notify me when an AWS Auto Scaling Group (ASG) is thrashing, i.e., continuously creating new short-lived instances because the instances fail to start up properly for some reason (e.g., a failing UserData script) or because the Load Balancer health check fails to reach them (e.g., due to a misconfigured Security Group).
So far I’ve been able to visualise ASG activity (i.e., the spawning of new instances) by querying the “InfrastructureEvent” event type:
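To make the condition concrete, here is a rough Python sketch of what I mean by "thrashing": an ASG launching many instances within a short sliding window. The function name, the 30-minute window and the threshold of 5 launches are made-up placeholders for illustration, not anything New Relic or AWS provides.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def thrashing_groups(launches, window=timedelta(minutes=30), threshold=5):
    """Return the ASG names with at least `threshold` launches inside any `window`.

    `launches` is an iterable of (asg_name, launched_at) pairs.
    The default window and threshold are arbitrary placeholder values.
    """
    by_group = defaultdict(list)
    for group, launched_at in launches:
        by_group[group].append(launched_at)

    flagged = set()
    for group, times in by_group.items():
        times.sort()
        start = 0
        for end, t in enumerate(times):
            # Advance the left edge until the span fits inside `window`.
            while t - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(group)
                break
    return flagged
```

A healthy group that launches an instance every few hours would not be flagged, while one that replaces an instance every five minutes would be.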
SELECT count(*) FROM InfrastructureEvent WHERE changeType = 'added' FACET `provider.autoScalingGroupName` SINCE 3 hours ago UNTIL now TIMESERIES
But I haven’t been able to set up an alert from that query, as Alerts doesn’t seem to accept InfrastructureEvent as a data source.
Any idea how to achieve this?
Thanks a lot, any help greatly appreciated!