Relic Solution: Understanding Single Location Timeout Failures in Synthetics

It’s a tale as old as time. Your Synthetics monitor is working perfectly. Every 5 minutes, it’s running a check to make sure that your site is operating properly. The cycle repeats itself for days, weeks, or months without issue and your site is running without missing a beat. One day, this comes to a screeching halt and you see a number of failures come from a single location. They all come bearing the same error message:

TimeoutError: Page load timed-out (unable to finish all network requests on time)

Let’s take a step back to establish a concrete example here. You have a ping monitor that runs checks from 5 different locations at a 5-minute frequency: San Francisco, Tokyo, Portland, Seoul, and Dublin. You see a handful of failures come through during a 30-minute window, and they’re all timeout errors from the Portland location. This might seem like a strange circumstance. If the site was really having problems, wouldn’t you see failures from all locations?

At this point, you’ve probably checked your site and found that things were working just fine during the period when the failures occurred. Could this be an issue with New Relic?

It’s certainly a possibility, and it can be worth asking us if you’re concerned. That said, these occurrences are quite rare in practice. If a location goes down for all users, a message may be posted to status.newrelic.com. Because this isn’t often the case, it helps to understand what specifically causes an error like this one.

Understanding the TimeoutError

This error indicates that a Simple or Scripted Browser monitor started loading the target page, but the page’s load event did not fire within 60 seconds. The page may still appear rendered in the failed check’s screenshot. Even so, the load event was not observed within the default time limit. One or more blocking resource requests may have held up the page load, possibly because of an underlying network issue.

This error can also occur when the parent page’s load event is observed. If the page embeds an iframe and a resource that the iframe’s document depends on fails to load, the iframe document’s own load event may go unobserved, and the check can still time out.

I’m seeing failures thrown from bam.nr-data.net or Google Analytics - is that causing this failure?

Great question! A failure thrown from one of these services wouldn’t necessarily be solely responsible for a failed check. A check can still succeed if a resource fails to load. That said, it can be a contributing factor if the failure holds up the page enough to result in a timeout.

I don’t want to see failures if it takes more than 60 seconds for my site to load. Can I reconfigure my monitor so that it has 90 seconds to perform a check instead?

This depends on the type of monitor. Currently, you can only adjust this setting in a scripted browser monitor. If you are using one, place the following command before the first $browser.get to customize the value (in milliseconds):

$browser.manage().timeouts().pageLoadTimeout(ms: number)
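
As a rough illustration, here is how that command might look in a complete script that allows 90 seconds for the page load. This is only a sketch: the URL, the constant names, and the loadWithTimeout helper are made up for the example, and it assumes that $browser commands return promises, as they do in the Selenium WebDriverJS API that scripted browser monitors expose.

```javascript
// Sketch only: TARGET_URL, PAGE_LOAD_TIMEOUT_MS, and loadWithTimeout are
// hypothetical names chosen for this example.
const PAGE_LOAD_TIMEOUT_MS = 90 * 1000; // 90 seconds, expressed in milliseconds
const TARGET_URL = 'https://example.com/'; // placeholder URL

// Set the page load timeout first, then request the page.
// The timeout has to be in place before the first get() call.
function loadWithTimeout(browser, url, timeoutMs) {
  return browser.manage().timeouts().pageLoadTimeout(timeoutMs)
    .then(function () {
      return browser.get(url);
    });
}

// $browser only exists inside the Synthetics runtime, so guard the call
// to keep the sketch loadable elsewhere.
if (typeof $browser !== 'undefined') {
  loadWithTimeout($browser, TARGET_URL, PAGE_LOAD_TIMEOUT_MS);
}
```

Keep in mind that a longer timeout only changes when the check gives up. It doesn’t make the page load any faster for real users in that location.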

If it’s not New Relic and it’s not my site, then what else could be causing this?

A failure like this may indicate an issue between your servers and ours during the period in which the check failed. These failures are often the result of transient network issues. We would expect a timeout or lag that caused a failed check to be a genuine reflection of what users in the affected location experienced at that time. You may also be able to get more information by using a Simple or Scripted Browser monitor to see whether specific page elements started to become slow or unavailable before the timeouts began.
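
If you want to look for slow page elements from inside a scripted browser monitor, one possible approach is to read the browser’s Resource Timing entries after the page loads and log the slowest ones. The sketch below is an assumption-heavy example rather than an official recipe: findSlowResources, the 5000 ms threshold, and the URL are all made up for illustration, and it only reports on checks where the page finished loading.

```javascript
// Sketch only: findSlowResources, SLOW_THRESHOLD_MS, and the URL are
// hypothetical names and values chosen for this example.
const SLOW_THRESHOLD_MS = 5000;

// Given Resource Timing-style entries ({ name, duration }), return the ones
// at or above the threshold, slowest first.
function findSlowResources(entries, thresholdMs) {
  return entries
    .filter(function (entry) { return entry.duration >= thresholdMs; })
    .sort(function (a, b) { return b.duration - a.duration; })
    .map(function (entry) {
      return { name: entry.name, duration: Math.round(entry.duration) };
    });
}

// $browser only exists inside the Synthetics runtime, so guard the call.
if (typeof $browser !== 'undefined') {
  $browser.get('https://example.com/')
    .then(function () {
      // Runs inside the page and returns plain objects to the script.
      return $browser.executeScript(function () {
        return performance.getEntriesByType('resource').map(function (e) {
          return { name: e.name, duration: e.duration };
        });
      });
    })
    .then(function (entries) {
      console.log(JSON.stringify(findSlowResources(entries, SLOW_THRESHOLD_MS)));
    });
}
```

Logging this on every successful check gives you something to compare against when a location starts to slow down, before the checks begin to time out.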

When we detect a failure, two further Synthetics checks are queued immediately. If both of these checks also fail, the result is highlighted as a failure in the Synthetics UI. If you’re interested in learning more about this, you can refer to this resource: Relic Solution: Understanding the “Three Strikes” Behavior in Synthetics.
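
To make the retry rule concrete, here is a small model of it. This is a toy illustration of the behavior described above, not New Relic’s actual implementation: isHighlightedAsFailure and runCheck are hypothetical names, and it assumes that a check is not highlighted as a failure if either retry passes.

```javascript
// Toy model only: isHighlightedAsFailure and runCheck are hypothetical names.
// runCheck() performs one check and returns true if it passed.
function isHighlightedAsFailure(runCheck) {
  // First strike: if the initial check passes, no retries are needed.
  if (runCheck()) {
    return false;
  }
  // Two further checks are queued right away, so both of them run.
  const firstRetryPassed = runCheck();
  const secondRetryPassed = runCheck();
  // Highlighted as a failure only when both retries fail as well.
  return !firstRetryPassed && !secondRetryPassed;
}

// A location that fails every time ends up highlighted as a failure.
console.log(isHighlightedAsFailure(function () { return false; })); // true
```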
