Now in beta: Track your service level objectives

We’re happy to announce a new capability to track service levels for your applications on New Relic One!

This capability is available in beta to all full platform users. You can use it to create service level indicators and objectives for any entity type, and get a nice compact view of their compliance so the whole team can discuss together.

When you use it from an APM service, New Relic can automatically suggest typical service levels, tuned to the last few weeks of data. You can also create your own service level indicators based on any NRDB event.

Learn all about New Relic Service Levels on docs.newrelic.com.

Please remember that our Global Technical Support team does not offer direct support for beta releases. So don’t hesitate to post your questions and suggestions here on the Explorer’s Hub, or use the “Help us improve” button on New Relic One.

Cheers!

Do you have alerting capabilities for SLO compliance and error budget burn rates on the roadmap?
Any plans to make this feature (Service Levels) available to basic users in the future?

Thank you!

Hello @gygabyte, thanks for the questions!

Alerts are on the roadmap, indeed.

May I ask your opinion on the SLO compliance alert? We’re not sure that it should follow the typical incident flow, so perhaps it could be a report you get once a day when you’re out of compliance? How do you envision it? Who would be the recipients?

Regarding the visibility to basic users, that’s not in the plans for the moment.

Cheers!

Thanks for your reply.

I am not too familiar with the incident concept in NR, but it seems to me that an SLO that is not met should be treated as a major event. Typically in SRE the SLO is a critical KPI. The resolution of one of these incidents can be a lot of things (establishing a new SLO, an enhancement/fix ticket, etc.), so it's not the typical outage situation, I would agree.

Ideally, though, there should also be a way to alert before going out of compliance, i.e. a threshold that is treated as a warning. I'm not sure exactly how that would translate into NR's current monitoring/alerting capabilities.
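
To illustrate the "warn before going out of compliance" idea, here is a minimal Python sketch. It is not a New Relic feature or API: the function names, the 80% warning threshold, and the event counts are all assumptions made up for the example. It measures how much of the error budget has been consumed and raises a warning once a chosen fraction is used.

```python
import math


def budget_consumed(good_events: int, total_events: int, slo_target: float) -> float:
    """Fraction of the error budget used so far (1.0 means fully spent)."""
    if total_events == 0:
        return 0.0
    # Number of bad events the SLO tolerates over the period.
    allowed_bad = total_events * (1.0 - slo_target)
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        # A 100% target leaves no budget at all.
        return math.inf if actual_bad else 0.0
    return actual_bad / allowed_bad


def slo_state(good_events: int, total_events: int, slo_target: float,
              warn_at: float = 0.8) -> str:
    """Classify the SLO; warn_at is a hypothetical warning threshold."""
    consumed = budget_consumed(good_events, total_events, slo_target)
    if consumed > 1.0:
        return "out_of_compliance"
    if consumed >= warn_at:
        return "warning"
    return "ok"


# Hypothetical numbers: 100,000 requests against a 99% SLO.
print(slo_state(99_950, 100_000, 0.99))
print(slo_state(99_100, 100_000, 0.99))
print(slo_state(98_000, 100_000, 0.99))
```

With these made-up counts, 50 bad requests leave the SLO healthy, 900 use roughly 90% of the budget and trigger the warning, and 2,000 exceed the budget.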

I'm trying to create Service Levels in a workload via Terraform. On terraform apply, I get the following error when creating Service Levels in a workload I just created:
Error: Could not validate account access to the following guids:
323xxxx:69746:MzIzODM2MnxOUjF8V09SS0xPQUxxxxxxx: Invalid entity guid
The first number is the account, which is valid.
The last field is the entity GUID of the workload, which is valid.
I'm not sure what the number 69746 represents. It seems like a permission issue. I can create these manually through the GUI without any problem.

Hello @Jerry.Johnson, thanks for reaching out and for using Service Levels! :slightly_smiling_face:

Based on the message, I suspect that the issue could be due to one of two factors.

The first could be that the GUID of the Workload you are trying to create the SLI for is not correct. Could you please confirm that in your Terraform resource the argument guid only contains the GUID of the Workload itself? From the message you shared, it seems that “guid” is a combination of an account ID, the SLI ID and the Workload GUID, separated by colons.
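
As a rough illustration of the difference between the two values, here is a small Python sketch. It assumes the composite value has the three-field layout suspected above (account ID, SLI ID, Workload GUID, separated by colons); the function name and the sample values are hypothetical and not part of any New Relic tooling.

```python
def workload_guid_from_id(value: str) -> str:
    """Return the bare entity GUID from either a GUID or a composite ID.

    Assumes a composite ID looks like "<accountId>:<sliId>:<entityGuid>",
    which is only a guess based on the error message in this thread.
    """
    parts = value.split(":")
    if len(parts) == 1:
        # Already a bare GUID, nothing to strip.
        return value
    if len(parts) == 3:
        return parts[2]
    raise ValueError(f"Unexpected identifier format: {value!r}")


# Made-up sample values, not real account or entity identifiers.
print(workload_guid_from_id("1234567:69746:EXAMPLEWORKLOADGUID"))
print(workload_guid_from_id("EXAMPLEWORKLOADGUID"))
```

In the Terraform configuration itself, the equivalent fix would be to pass the Workload's GUID rather than its resource ID to the guid argument.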

Please, see the official New Relic Terraform documentation for more details: registry.terraform.io

The other reason, as you pointed out, could be that the New Relic API key that you are using to create the Terraform resources doesn’t have access to the account where the Workload lives. I’d suggest double-checking that if you get a chance. More details are on docs.newrelic.com.

Please, don’t hesitate to reach out if you require any further assistance.

Thanks for the quick response. In fact, I was using the id instead of the GUID, so the issue is resolved.

Hi,
Is there a way to reflect service level performance in the workload status? I.e. if SLOs aren’t being met, can the workload status be changed to ‘disrupted’? I can’t find a way of selecting Service Levels in the workload status calculation.
Cheers.

Hello @steve.ohara , thank you for this question!

Right now the workload status can’t be derived from related SLOs. But this is something that makes total sense, and we definitely want to add this capability to workloads in the upcoming months. At that point, the status of a workload will be determined by
a) the related SLOs,
b) the roll up of child entities’ status, or
c) the static status set manually by the owner.

Here’s a question for you: when you think of the workload status coming from SLO data, would you expect that status to show the current SLO attainment (meaning, “is the service burning error budget too quickly now?”), or would you expect to see the compliance over the whole period?

Cheers

Thanks,
I can see reasons for both, but I think the current SLO attainment would be most useful. If it were tied to the whole period, the workload status would be affected for a long time after any issues are fixed. For example, for a service level with an SLO of 99% and a period of 7 days, an issue with the service could make the workload show as ‘disrupted’ for a week after less than 2 hours of not meeting the SLO. Although the workload really hasn’t met the linked SLO for the period, anyone looking at it on the 7th day would think there’s still an ongoing issue.
Cheers
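
For reference, the "less than 2 hours" figure follows from the error budget arithmetic. Here is a minimal Python sketch of that calculation; the function name is made up for the example, and it assumes a time-based SLO where the budget is simply the non-compliant share of the period.

```python
def error_budget_hours(slo_target: float, period_days: float) -> float:
    """Hours of non-compliance an SLO tolerates over the period."""
    return period_days * 24 * (1.0 - slo_target)


# 99% SLO over a 7-day period: 168 hours * 1% = about 1.68 hours.
budget = error_budget_hours(0.99, 7)
print(f"{budget:.2f} hours")
```

So roughly 1 hour and 41 minutes of a failing service is enough to put the whole 7-day period out of compliance.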

Unlike custom dashboards, workloads are more streamlined for a specific real-time monitoring use-case. As a result, I would think that the current SLO attainment would be more relevant here.

Thanks for sharing @steve.ohara and @rishav.dhar, your thoughts are 100% aligned with our current thinking. Our approach is that the health status of a workload (or any other entity type) should indicate if there are ongoing issues, and we’ll find other ways to show whether the SLOs are out of compliance for the period.