Synthetics Troubleshooting Framework: Timeouts and Error Messages

Timeouts by monitor

Timeout thresholds

Ping Monitors

  • 30s timeout for connect per HTTP request. Not Configurable.

    • Error message: NetworkError: Connect to my.host.com:80 [/255.255.255.255] failed: connect timed out
  • 65s timeout for overall job. This accounts for two 30s HTTP requests with a 5 second buffer. Not Configurable.

  • 30s timeout for SSL Verify. This runs as a completely separate request as an openssl command. Time spent on this is not reflected in the original request. Not Configurable.

    • Error message: SSLTimedOut error
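The 65s job limit follows directly from the per-request limits listed above. A minimal sketch of that arithmetic, using hypothetical constant names (the SSL verification runs as a separate request and is not part of this sum):

```javascript
// Hypothetical names; the values come from the ping monitor limits listed above.
const CONNECT_TIMEOUT_S = 30; // connect timeout per HTTP request
const REQUESTS_PER_JOB = 2;   // two HTTP requests per ping job
const BUFFER_S = 5;           // buffer added on top of the two requests

const jobTimeoutSeconds = CONNECT_TIMEOUT_S * REQUESTS_PER_JOB + BUFFER_S;
console.log(jobTimeoutSeconds); // 65
```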

Simple Browser Monitors

  • 60s timeout to observe the page load event. Not Configurable.
    • Error message: TimeoutError: Page load timed-out (unable to finish all network requests on time)

Scripted Browser Monitors

  • 180s (3 min) default global timeout for script execution. If the script has not completed after 180s, the job is terminated.

    • Error message: Job timed-out after 180s

    • This timeout can be configured when using the Containerized Private Minion by providing the MINION_CHECK_TIMEOUT on minion startup. This value must be an integer between 0 seconds (excluded) and 900 seconds (included).
  • 60s timeout to observe the page load event. Can be increased up to 180 seconds using the $browser.manage().timeouts().pageLoadTimeout(ms: number) helper function.

    • Error message: TimeoutError: Page load timed-out (unable to finish all network requests on time)

Key waiting methods that have configurable timeouts:

  • $browser.waitForAndFindElement(locator, timeout_ms)
  • $browser.waitForPendingRequests(timeout_ms)
  • $browser.wait(time_ms)
  • $browser.manage().timeouts() functions
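As a rough illustration of how these timeouts fit together, the sketch below raises the page load timeout and then waits for an element with an explicit limit. The function name, URL, selector, and timeout values are placeholders, and the browser object is passed in as a parameter so the snippet can also be run outside Synthetics.

```javascript
// Sketch only: in a Synthetics script, pass the globals $browser and $driver.By.
// The URL, selector, and timeout values are placeholders.
function loadAndFind(browser, By) {
  // Raise the page load timeout from the 60s default to 120s (limit: 180s).
  return browser.manage().timeouts().pageLoadTimeout(120000)
    .then(() => browser.get('https://example.com'))
    // Wait up to 15s for the element to appear.
    .then(() => browser.waitForAndFindElement(By.css('#login'), 15000));
}
```

Inside a scripted browser monitor this would be called as loadAndFind($browser, $driver.By). Keep the sum of all waits well under the 180s job limit, since the global timeout still applies.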

API Monitors

  • 180s (3 min) global timeout for script execution. If the script has not completed after 180s, the job is terminated.

    • Error message: Job timed-out after 180s
    • This timeout can be configured when using the Containerized Private Minion by providing the MINION_CHECK_TIMEOUT on minion startup. This value must be an integer between 0 seconds (excluded) and 900 seconds (included).
  • Timeouts specific to the Node.js request library are described in the request/request README on GitHub (Simplified HTTP request client).
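Before starting a minion with a custom MINION_CHECK_TIMEOUT, it can help to sanity-check the value against the documented range. The helper below is hypothetical and not part of the minion; it only encodes the rule stated above (an integer greater than 0 and at most 900 seconds).

```javascript
// Hypothetical helper: checks a candidate MINION_CHECK_TIMEOUT value
// against the documented range, 0 (excluded) to 900 (included) seconds.
function isValidCheckTimeout(value) {
  const seconds = Number(value);
  return Number.isInteger(seconds) && seconds > 0 && seconds <= 900;
}

console.log(isValidCheckTimeout('300')); // true
console.log(isValidCheckTimeout('0'));   // false
console.log(isValidCheckTimeout('901')); // false
```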

Error messages, what they mean, and potential solutions

Intermittent scripted browser monitor timeout errors

Common scenario: A script has been working for a long time. Suddenly, it starts failing intermittently. A wave of errors comes through, but the behavior eventually self-corrects.

Example error messages:

  • NetworkError: Connect to my.host.com:80 [/255.255.255.255] failed: connect timed out
  • TimeoutError: Page load timed-out (unable to finish all network requests on time)

Potential causes:

  1. Issue with the site being monitored
  2. Issue with New Relic or one of New Relic’s providers
  3. Network issues between New Relic and the endpoint being monitored

Solution + troubleshooting steps:

  1. Verify that there isn’t a known or visible issue with the site being monitored
  2. Check the New Relic status page at status.newrelic.com to ensure that there are not any ongoing incidents that may be impacting Synthetics

If both of these steps have been followed, then the errors are most likely the result of network issues somewhere between our servers and the endpoint being monitored. There are a number of network hops between these two points, and one or more of them could be contributing to slowdowns that result in timeout errors. One example of a potential contributing issue is a problem with a CDN.

This may seem like a reason for concern, but these failures can often be ignored. It’s unlikely that a user can do anything to remedy these sorts of network issues, assuming they’re outside of both the user’s environment and the New Relic environment. Even when the issue is not on our end or on the user’s side, the monitor failure likely reflects the real-world experience of a person trying to access the application from the corresponding geographic location. For more context, see the following resource: Relic Solution: Understanding Single Location Timeout Failures in Synthetics.

Users who are interested in going a step further in debugging network performance issues can find a resource on how to do that here: Relic Solution: Network debugging tools available in Synthetics

Single location timeout failures

These errors are similar to intermittent timeout errors and generally come with similar timeout error messages. In this case, failures may come from a single location while other locations continue to monitor without failures. These most commonly impact scripted browser monitors, but they can impact ping and other monitor types as well.

Example error messages:

  • NetworkError: Connect to my.host.com:80 [/255.255.255.255] failed: connect timed out

  • TimeoutError: Page load timed-out (unable to finish all network requests on time)

Potential causes:

  1. Issue with the site being monitored
  2. Issue with the specific AWS location monitoring the endpoint
  3. Transient network issues

Solution + troubleshooting steps:

  1. Verify that there isn’t a known or visible issue with the site being monitored
  2. Check the New Relic status page at status.newrelic.com to ensure that there are not any ongoing incidents that may be impacting Synthetics
  3. As mentioned previously, you can troubleshoot along the network path with a resource like this one: Relic Solution: Network debugging tools available in Synthetics

More information on these types of errors can be found here: Relic Solution: Understanding Single Location Timeout Failures in Synthetics

Non-intermittent scripted browser monitor timeout errors

Symptom: Monitor consistently times out when it runs or when it is validated. In this case, the behavior does not self-correct or resolve by itself.

Example error messages:

  • NetworkError: Connect to my.host.com:80 [/255.255.255.255] failed: connect timed out
  • TimeoutError: Page load timed-out (unable to finish all network requests on time)

Potential causes:

  1. Syntax error in the script (in the case of a scripted browser monitor)
  2. Issue with the endpoint being monitored

Non-intermittent timeout errors frequently stem from issues with script syntax. Keep in mind that a timeout error can be symptomatic of an underlying issue with how the script is written. Otherwise, it may be an issue where there are too many steps within the script. As mentioned previously, monitors must be able to execute within 180 seconds if they’re running on a public location without additional configuration. It could also be an issue where the endpoint being monitored isn’t configured to handle checks from Synthetics monitors.
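When a script may simply have too many steps, one way to see where the 180 seconds go is to time each step. The helper below is a hypothetical sketch (timedStep, JOB_LIMIT_MS, and scriptStart are not part of the Synthetics API); it wraps a step, logs its duration, and reports how much of the job budget remains.

```javascript
// Hypothetical helper: runs one script step and logs how long it took,
// so slow steps stand out against the 180s job limit.
const JOB_LIMIT_MS = 180000;
const scriptStart = Date.now();

function timedStep(name, stepFn) {
  const stepStart = Date.now();
  return Promise.resolve()
    .then(stepFn)
    .then((result) => {
      const now = Date.now();
      const remaining = JOB_LIMIT_MS - (now - scriptStart);
      console.log(`${name}: ${now - stepStart}ms, ${remaining}ms of budget left`);
      return result; // pass the step's result through to the next step
    });
}
```

In a monitor, a step could be wrapped as timedStep('open login page', () => $browser.get(url)), where url is whatever page the script loads. The logged durations show up in the script log of the check result.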

Ping monitor timeout failures

Symptom: Ping monitor times out (either intermittently or constantly)

Example error messages:

  • ERROR: Job timed out after 65 seconds

Ping monitors have a non-configurable timeout of 65 seconds. During this time, a GET and a HEAD request are performed. Each request has 30 seconds to execute, and a 5 second buffer is added on top. If monitor timeouts are happening frequently, it is likely the result of one of the following:

  1. Server latency
  2. Misconfigured endpoint

Users should ensure that the endpoint is publicly available and can receive HTTP requests. If timeout errors persist beyond this, then the issue may stem from server latency. In that case, it’s worth checking the path between the New Relic servers and the endpoint being monitored.

Syntax errors

Syntax errors point to issues with how scripts are written. This will commonly be the result of elements being selected incorrectly (or not at all), steps being incorrectly sequenced, statements being used incorrectly, etc.

Examples of syntax error messages include:

  • Error: no such element
  • Error: unknown error: Element is not clickable at point

Syntax errors can often be identified based on corresponding failure messages. Specifically, failure messages will generally point to specific steps in the script or certain elements that can’t be identified or interacted with.

New Relic Support isn’t able to provide assistance with custom scriptwriting and troubleshooting. That said, there are a variety of other places where users can find assistance with issues like this.

Main scriptwriting docs: https://docs.newrelic.com/docs/synthetics/new-relic-synthetics/scripting-monitors/write-scripted-browsers

Explorer’s Hub: https://discuss.newrelic.com/c/full-stack-observability/synthetic

LevelUp Relic Solution posts: https://discuss.newrelic.com/tags/c/proven-practices/level-up-relic-solutions/synthetic

Explanations and solutions to common scripted monitor errors: https://docs.newrelic.com/docs/synthetics/new-relic-synthetics/troubleshooting/simple-scripted-or-scripted-api-non-ping-errors

Selenium WebDriver docs: http://seleniumhq.github.io/selenium/docs/api/javascript/module/selenium-webdriver/index.html

Miscellaneous errors

Error: Unknown error: Element is not clickable at point (X, Y). Other element would receive the click: <div>…

Cause: This error occurs when a user attempts to .click() an element that is being blocked by some kind of overlaying DOM element (in the example here, the div).

Solution: This really depends on the site. When the .click() is fired, you will need to make sure the element being clicked is not behind another element. If it’s an overlay that needs to be manually closed, you can script that in prior to the click.
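As one possible shape for scripting the overlay dismissal, the sketch below looks for a close button and clicks it only if it exists, then clicks the intended target. The function name and both selectors are placeholders for whatever your site uses, and the browser object is passed in as a parameter so the snippet can also be run outside Synthetics.

```javascript
// Sketch only: in a Synthetics script, pass $browser and $driver.By.
// Both selectors are placeholders for your site's own elements.
function dismissOverlayThenClick(browser, By) {
  return browser.findElements(By.css('.overlay-close'))
    .then((closeButtons) => {
      // findElements resolves to an empty array when nothing matches,
      // so the script does not fail if the overlay is absent.
      if (closeButtons.length > 0) {
        return closeButtons[0].click();
      }
    })
    .then(() => browser.findElement(By.css('#submit')).click());
}
```

If the overlay animates out, a short wait for the target element may still be needed before the final click.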

Error: StaleElementReferenceError

Cause: This error comes about when there is a delay between the execution of an element locator and an action being executed on that element. If the DOM has changed between when the element locator was generated and the action was executed against the element, this will occur because the actual element has changed.

Solution: This is generally caused by the condition of a waitForAndFindElement being fulfilled while the site is still loading. To address it, you can generally either replace the waitForAndFindElement(...).then(element => element.click()) block with a plain findElement(...).click(), or keep the waitForAndFindElement and call findElement(...).click() a second time inside the .then block.
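The second approach (keep the wait, then look the element up again right before clicking) could be sketched as follows. The function name, selector, and timeout are placeholders, and the browser object is passed in as a parameter so the snippet can also be run outside Synthetics.

```javascript
// Sketch only: in a Synthetics script, pass $browser and $driver.By.
// The selector and timeout are placeholders.
function waitThenClickFresh(browser, By) {
  const locator = By.css('#submit');
  return browser.waitForAndFindElement(locator, 10000)
    // Ignore the reference returned by the wait and look the element up
    // again immediately before clicking, so the reference is current.
    .then(() => browser.findElement(locator).click());
}
```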

Error: HMAC MISMATCH ERROR: HMAC mismatch between saved HMAC and calculated HMAC at job run time

Cause: This error means the pass phrases configured on the private minion and on the monitor don’t match.

Solution: Ensure the pass phrases between private minion and Synthetics monitor match. See public docs for more info:

https://docs.newrelic.com/docs/synthetics/new-relic-synthetics/private-locations/verified-script-execution-private-locations

https://docs.newrelic.com/docs/synthetics/new-relic-synthetics/troubleshooting/private-location-hmac-errors
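To see why a pass phrase mismatch produces this error, the sketch below computes an HMAC of the same script with two different pass phrases using Node's crypto module. The hash algorithm (SHA-256), the exact input that gets signed, and all names here are assumptions for illustration, not New Relic's documented implementation.

```javascript
const crypto = require('crypto');

// Illustration only: the hash algorithm and the exact input New Relic signs
// are assumptions here, not the documented implementation.
function hmacFor(script, passphrase) {
  return crypto.createHmac('sha256', passphrase).update(script).digest('hex');
}

const script = "$browser.get('https://example.com');";
const savedHmac = hmacFor(script, 'passphrase-on-monitor');
const minionHmac = hmacFor(script, 'passphrase-on-minion');

console.log(savedHmac === minionHmac); // false: the pass phrases differ
console.log(savedHmac === hmacFor(script, 'passphrase-on-monitor')); // true
```

The same script only yields the same HMAC when both sides use an identical pass phrase, which is why the fix is to make the two values match exactly.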

Other error resources

Non-scripted monitor errors: Non-scripted monitor errors | New Relic Documentation

Non-ping monitor errors (scripted browser, simple browser, API test): Simple, scripted, or scripted API (non-ping) errors | New Relic Documentation

Isolated monitor failures: Troubleshoot isolated monitor failures | New Relic Documentation
