Kubernetes Troubleshooting Framework: Log Forwarding

This guide is intended to help you troubleshoot logs from the configuration and data collection end: for example, missing logs and/or log messages, unparsed logs, configuration questions, etc. If the issue resides in the UI, there is a separate troubleshooting guide.

Before reading further:

  • Ensure that you have installed and configured a compatible Log Forwarder. You can find a list of the different log forwarding options and how to install New Relic’s various plugins in our documentation here. Troubleshooting for individual forwarders can be found below.
  • Logs in Context is different from Log Forwarding. Setting up Logs in Context for your application simply adjusts your application’s logger to format logs as JSON with New Relic’s logging metadata. This metadata establishes context between your application logs and other features of your New Relic APM agent. Check out our Configure logs in context with APM agents doc for more detailed information.

General

  • New Relic deploys its own pod on each of the nodes in your cluster when using the Kubernetes Logging Plugin. If you do not have a newrelic-logging pod on your node, then you will not see logs for the pods on that node.

  • By default, each pod on your node should write its logs to STDOUT. These STDOUT logs get written as log files in the /var/log/containers directory on your node. The newrelic-logging pod has access to the filesystem on the node and is able to read the logs from the /var/log/containers directory. It forwards these logs to New Relic.
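
For example, you can compare your node count to the number of newrelic-logging pods (the namespace may vary depending on how you installed, so the example searches all namespaces):

    kubectl get nodes
    kubectl get pods --all-namespaces -o wide | grep newrelic-logging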

  • Ensure you are using the latest version of the plugin and update it if not.

  • If you’re using Helm to install, make sure you have the most recent version of Helm installed.
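
For example, a quick way to check your Helm client version and refresh your chart repositories (this assumes Helm 3):

    helm version --short
    helm repo update
    helm list --all-namespaces | grep newrelic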

  • If you’re installing manually, ensure that you put your manifest files in your current working directory.

    • Note that if installing manually you’ll need to apply any changes to manifest files by using “kubectl apply -f .”
  • Check that your installation has properly installed the manifest files necessary for generating the New Relic Logging Pod for log forwarding:

    • fluent-conf.yml - ConfigMap for newrelic-logging
    • new-relic-fluent-plugin.yml - DaemonSet for newrelic-logging
    • rbac.yml - ClusterRoleBinding for newrelic-logging
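
One way to confirm these resources actually made it into the cluster is to list them and grep for the expected names (this assumes the default names, which include newrelic-logging; yours may differ if you customized the manifests):

    kubectl get daemonsets --all-namespaces | grep newrelic-logging
    kubectl get configmaps --all-namespaces | grep -E 'newrelic|fluent'
    kubectl get clusterrolebindings | grep newrelic-logging
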
  • If you’ve installed logging as part of the nri-bundle then you’ll likely have one large manifest that bundles all of the logging manifests into a single newrelic-manifest.yaml file. For the next few bullet points you will need to check through this manifest file to verify the info.
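
If everything is bundled into one file, a quick grep can help you locate the logging-related pieces before you start checking values (the file name matches the one mentioned above; adjust it if yours is named differently):

    grep -n -E 'kind:|newrelic-logging|LICENSE_KEY|ENDPOINT|PATH' newrelic-manifest.yaml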

  • Ensure the newrelic-logging DaemonSet (new-relic-fluent-plugin.yml) has the correct environment variable values set.

    • This DaemonSet is what generates the newrelic-logging pod.
    • Make sure the LICENSE_KEY env variable is the correct license key for your account.
    • Check that ENDPOINT is set to “https://log-api.newrelic.com/log/v1” (or “https://log-api.eu.newrelic.com/log/v1” if you are in the EU).
    • Check that PATH is set to “/var/log/containers/*.log”. This can be changed, but we don’t recommend doing so. A pod’s logs are written to STDOUT, which in turn is written to that directory on the node.
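
A quick way to check all three of these from the command line (run it against whatever namespace you installed into):

    kubectl describe daemonset newrelic-logging | grep -E 'LICENSE_KEY|ENDPOINT|PATH'
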
  • Ensure the newrelic-logging ConfigMap (fluent-conf.yml) looks correct.

    • This ConfigMap configures Fluent Bit, which handles the log forwarding for the plugin.
    • This file gets its environment variable values from the DaemonSet mentioned above.
    • We generally don’t recommend customizing this file unless you absolutely need to.
    • It follows the same general structure as a standard Fluent Bit config file.
    • Check out Fluent Bit’s docs on their Kubernetes filter for more info on the default values set and the available config options.
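
For orientation, the fluent-bit.conf inside the ConfigMap follows the usual Fluent Bit section layout, roughly like the trimmed-down sketch below. This is not the full shipped config; check the ConfigMap in your installed version for the exact values:

    [SERVICE]
        Log_Level    ${LOG_LEVEL}
        Parsers_File parsers.conf

    [INPUT]
        Name    tail
        Tag     kube.*
        Path    ${PATH}

    [FILTER]
        Name    kubernetes
        Match   kube.*

    [OUTPUT]
        Name        newrelic
        Match       *
        licenseKey  ${LICENSE_KEY}
        endpoint    ${ENDPOINT}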

Troubleshooting

  • The main cause of issues is usually customization of the default values in the manifest files for the newrelic-logging pod. That said, we know the defaults don’t always fit everyone’s needs, so please consider the following:
    • Double-check your Tag and Match values. These config options are what associate the different parts of Fluent Bit’s config (see the Tag/Match example after the snippet below).
    • Remember that the environment variable values for the ConfigMap are set in the DaemonSet. If you need to change their values or add any, do so in the DaemonSet.
    • If you change the settings for the default [INPUT] in the ConfigMap, reference Fluent Bit’s tail docs for more info on the values you change.
    • If you add additional [INPUT] sections, ensure the logs you want to tail aren’t already being written to /var/log/containers on the node. Otherwise you’ll get duplicate log entries.
    • You can add additional [FILTER] sections with record_modifier if you want to add attributes to your logs. Pay attention to Match when doing this so you can associate them with the right [INPUT] Tag. A default example in the ConfigMap is the cluster_name:
[FILTER]
    Name   record_modifier
    Match  *
    Record cluster_name ${CLUSTER_NAME}
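
To make the Tag/Match association concrete, here is a hypothetical pairing: the [INPUT] assigns a Tag, and a [FILTER] only applies to records whose tag matches its Match value (the record_modifier attribute shown is purely illustrative):

    [INPUT]
        Name   tail
        Tag    kube.*
        Path   /var/log/containers/*.log

    [FILTER]
        Name   record_modifier
        Match  kube.*
        Record team payments
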
  • It’s OK to add additional parsers, but we don’t recommend editing the default parsers unless you absolutely have to. Do so at your own risk and keep a copy of the defaults nearby.

  • If you write logs to a file on your pod instead of STDOUT, you will likely run into issues. The newrelic-logging pod cannot access log files inside other pods since it can only see the filesystem on the node.

    • If you are writing your logs to a file on the pod, you will need to set up a sidecar. We recommend looking up how to deploy Fluent Bit directly as a sidecar rather than trying to edit the New Relic Kubernetes plugin to “make” it a sidecar.
    • It should be noted that a pod logging to STDOUT is the supported configuration. If you choose to go the sidecar route, you will need to configure that on your end (a rough sketch is shown below).
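
If you do go the sidecar route, the general shape is an app container and a Fluent Bit container sharing a volume that holds the log file. The sketch below only illustrates the pattern under assumed names (my-app, the image names, and the /var/log/app path are all hypothetical); you would also need to mount a Fluent Bit config that tails the shared file and ships it to New Relic:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-app
    spec:
      containers:
        - name: app
          image: my-app:latest              # hypothetical app image that writes /var/log/app/app.log
          volumeMounts:
            - name: app-logs
              mountPath: /var/log/app
        - name: fluent-bit                  # sidecar that tails the shared log file
          image: fluent/fluent-bit:latest
          volumeMounts:
            - name: app-logs
              mountPath: /var/log/app
              readOnly: true
      volumes:
        - name: app-logs
          emptyDir: {}
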
  • It can be helpful to look at the actual log files that are being written to the /var/log/containers directory on your node.

    • You can use the CLI and run kubectl logs <podname>, replacing <podname> with the actual pod’s name.
    • If you prefer to see the logs directly, you can also SSH into the Node itself and navigate to the /var/log/containers directory.
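
For example, from a shell on the node (substitute an actual log file name in the second command):

    ls -l /var/log/containers/
    tail -n 50 /var/log/containers/<pod-log-file>.log
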
  • Using the same method, you can see the Fluent Bit logs for the newrelic-logging pod as well: kubectl logs newrelic-logging-xxxxxx. You can check these logs for errors in the Fluent Bit process.

    • By default, the log level is set to “info”. If you need more verbose logging, you can edit your DaemonSet’s LOG_LEVEL environment variable.
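
For example (the pod name suffix will differ in your cluster; kubectl set env on the DaemonSet is one way to bump the log level, or you can edit the manifest and re-apply it):

    kubectl get pods --all-namespaces | grep newrelic-logging
    kubectl logs newrelic-logging-xxxxxx
    kubectl set env daemonset/newrelic-logging LOG_LEVEL=debug
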
  • If you are missing Kubernetes metadata in your logs make sure you haven’t changed the default values in the Kubernetes [FILTER] in your ConfigMap. It should look like this:

[FILTER]
    Name      kubernetes
    Match     kube.*
    Kube_URL  https://kubernetes.default.svc.cluster.local:443
    Merge_Log Off
  • There should be a Kubernetes API server in every Kubernetes Cluster. New Relic queries this API server to pull Kubernetes metadata and adds it to the logs.

  • You may also see https://kubernetes.default.svc:443 as the Kube_URL value. This also works.
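
You can confirm the API service exists from inside the cluster; the kubernetes service in the default namespace is what both of those hostnames resolve to:

    kubectl get service kubernetes --namespace default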

  • If you have set up logging via the Kubernetes Integration for Infrastructure then you can query the K8sPodSample event type using NRQL to get more information about your pods and the newrelic-logging pod. Note that this event type is only available when using the integration.

    • SELECT uniqueCount(nodeName) FROM K8sPodSample SINCE 1 day AGO
      • Shows the number of unique nodes that the integration is monitoring.
    • SELECT uniqueCount(podName) FROM K8sPodSample WHERE podName LIKE '%newrelic-logging%' SINCE 1 day AGO
      • Should show the same number as the node count, since there should be one newrelic-logging pod on every node.

Here’s a complete list of the event types for the Kubernetes Integration as well.
