We have Kafka on RHEL - Data not Ingesting

We are using Cassandra and Kafka. We are able to see ingested metrics for Cassandra, but not for Kafka.
Can you please assist?
We are using New Relic for APM.

Hello @Charles.Netshivhera,

We’ll need to see the kafka-config.yml file and a set of verbose logs from the Infra agent.

To Collect Verbose Debug logs for Infrastructure on Linux:

  1. Create a directory outside of the system logs to hold the verbose logs: `sudo mkdir /var/log/newrelic`
  2. In the newrelic-infra.yml file, add the lines:
  • verbose: 1
  • log_file: /var/log/newrelic/newrelic-infra.log
  3. Restart the Infrastructure agent.
  4. Let it run for about 5 minutes.
  5. In the newrelic-infra.yml file, set verbose: 0 to disable verbose logging mode. I would recommend keeping the new logging location.
  6. Restart the Infrastructure agent again.
  7. Attach the file /var/log/newrelic/newrelic-infra.log to your response. Attaching it instead of pasting it into the ticket makes it easier to read, as well as to download and parse through.
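Steps 2 and 5 amount to toggling two lines in the agent config; a minimal sketch (your existing settings, such as the license key, stay as they are):

```yaml
# newrelic-infra.yml — fragment only; leave your other settings unchanged
# Step 2: enable verbose logging and send it to the new location
verbose: 1
log_file: /var/log/newrelic/newrelic-infra.log
# Step 5: once the logs are collected, set verbose back to 0 but keep log_file
```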

Hi
I am receiving an error that says new users cannot upload docs

Hi @Charles.Netshivhera, if you have collected the logs, could you check the log file for lines that include "level=error"?
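A quick way to pull those lines out (the path assumes the log_file location set earlier in this thread; adjust if yours differs):

```shell
# Filter the Infrastructure agent log for error-level entries.
LOG=/var/log/newrelic/newrelic-infra.log
if [ -f "$LOG" ]; then
  grep 'level=error' "$LOG"
fi
```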

time="2020-06-22T09:35:34+02:00" level=error msg="processing update for kernel module" error="Unable to get module info for 'falcon_lsm_pinned_9606': exit status 1" plugin=KernelModules

Hi @Charles.Netshivhera

The falcon_lsm kernel module is part of CrowdStrike’s endpoint security software and requires additional permissions not afforded to an ordinary process. However, the only information the Infrastructure agent attempts to gather from the loaded kernel modules is their name, version, and description for submission to the Infrastructure Inventory, and its omission shouldn’t affect the rest of the performance metrics it attempts to gather.

Are there any level=error lines in your logs that have the string kafka in them?

time="2020-07-06T10:43:39+02:00" level=error msg="Integration command failed" error="exit status 1" instance=kafka-metrics-zookeeper-discovery integration=com.newrelic.kafka prefix=integration/com.newrelic.kafka stderr="[ERR] Integration failed: zookeeper connection must not be nil for 'All' mode\n" working-dir=/var/db/newrelic-infra/newrelic-integrations

time="2020-06-22T09:34:25+02:00" level=error msg="processing update for kernel module" error="Unable to get module info for 'falcon_lsm_pinned_9606': exit status 1" plugin=KernelModules

Hi @Charles.Netshivhera, if you have the topic_mode field set to "all" in the kafka-config.yml file, a Zookeeper connection is required: at a minimum, the zookeeper_hosts field must be set, as topics are looked up via Zookeeper.

You can refer to the example config file for clarification on this - kafka-config.yml.sample
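For illustration, a minimal sketch of the Zookeeper-related fields that "all" mode needs — the cluster name, hostname, and port below are placeholders, not values from your environment:

```yaml
# kafka-config.yml — illustrative fragment; hostname/port are placeholders
integration_name: com.newrelic.kafka
instances:
  - name: kafka-metrics-zookeeper-discovery
    command: metrics
    arguments:
      cluster_name: my-cluster
      # topic_mode "all" looks topics up via Zookeeper, so this must be set:
      zookeeper_hosts: '[{"host": "zookeeper.example.com", "port": 2181}]'
      topic_mode: all
```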

topic_mode is disabled (commented out) in kafka-config.yml:

    # local_only_collection: false
    # collect_broker_topic_data: true
    # topic_mode: "all"
    # collect_topic_size: false

Another kafka-config.yml from another node:

    integration_name: com.newrelic.kafka

    instances:
      # This instance gives an example of autodiscovery of brokers with zookeeper
      - name: kafka-metrics-zookeeper-discovery
        command: metrics
        arguments:
          # A cluster name is required to uniquely identify this collection result in Insights
          cluster_name: Prod-Kafka

          # Override the kafka API version to target. Defaults to 1.0.0, which will work
          # for all post-1.0.0 versions. Older versions of the API may be missing features.
          kafka_version: "1.0.0"

          autodiscover_strategy: bootstrap

          # Bootstrap broker arguments. These configure a connection to a single broker.
          # The rest of the brokers in the cluster will be discovered using that connection.
          bootstrap_broker_host: nodeIP
          bootstrap_broker_kafka_port: 9092
          bootstrap_broker_kafka_protocol: PLAINTEXT # Currently support PLAINTEXT and SSL
          bootstrap_broker_jmx_port: 9999

          # JMX user and password default to default_jmx_user and default_jmx_password if unset
          bootstrap_broker_jmx_user: admin
          bootstrap_broker_jmx_password: password

          # local_only_collection: false

          # See above for more information on topic collection
          collect_broker_topic_data: true
          topic_mode: "regex"
          collect_topic_size: false

Do you have any comments or suggestions?

Hi @Charles.Netshivhera

It appears that you have the topic_mode set to regex, but no corresponding topic_regex option. Can you try changing topic_mode to all and restarting the agent?
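For reference, the two consistent combinations of topic settings would look like this (the regex pattern is only an example, not from your environment):

```yaml
# Option A: collect every topic (the change suggested above)
topic_mode: all

# Option B: stay in regex mode, but supply the missing pattern
# topic_mode: regex
# topic_regex: 'prod-.*'
```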

I am getting:

time="2020-07-14T09:50:01+02:00" level=debug msg="Sending events to metrics-ingest" component=MetricsIngestSender key=4463778918856679882 numEvents=92 postCount=40 timestamps="[2020-07-14 09:50:01 +0200 SAST]"
time="2020-07-14T09:50:01+02:00" level=debug msg="preparing metrics post" component=MetricsIngestSender postCount=40
time="2020-07-14T09:50:02+02:00" level=debug msg="patch sender found no deltas to send" component=PatchSender entityKey=ZAPRNBRAPP1120.corp.dsarena.com
time="2020-07-14T09:50:02+02:00" level=debug msg="patch sender found no deltas to send" component=PatchSender entityKey=
time="2020-07-14T09:50:02+02:00" level=debug msg="metrics post succeeded" component=MetricsIngestSender postCount=40
time="2020-07-14T09:50:06+02:00" level=debug msg="[{"cpu":"cpu-total","user":6785.3,"system":4095.93,"idle":3066591.5,"nice":23.72,"iowait":155.72,"irq":0,"softirq":115.43,"steal":0,"guest":0,"guestNice":0,"stolen":0}]" component=SystemSampler location=raw structure=CpuTimes
time="2020-07-14T09:50:06+02:00" level=debug msg="delta storage" component=SystemSampler elapsedMs=20018 totalReadTime=0 totalReads=0 totalWriteTime=25 totalWrites=20
time="2020-07-14T09:50:06+02:00" level=debug msg="{"eventType":"SystemSample","timestamp":0,"entityKey":"","cpuPercent":0.7038712921063526,"cpuUserPercent":0.3519356460533549,"cpuSystemPercent":0.35193564605299765,"cpuIOWaitPercent":0,"cpuIdlePercent":99.29612870789366,"cpuStealPercent":0,"loadAverageOneMinute":0.02,"loadAverageFiveMinute":0.06,"loadAverageFifteenMinute":0.05,"memoryTotalBytes":24035250176,"memoryFreeBytes":21214425088,"memoryUsedBytes":2820825088,"swapTotalBytes":8589930496,"swapFreeBytes":8589930496,"swapUsedBytes":0,"diskUsedBytes":9022648320,"diskUsedPercent":12.176825957284052,"diskFreeBytes":62268416000,"diskFreePercent":84.03648655872267,"diskTotalBytes":74096881664,"diskUtilizationPercent":0.008742132081126985,"diskReadUtilizationPercent":0,"diskWriteUtilizationPercent":0.008742132081126985,"diskReadsPerSecond":0,"diskWritesPerSecond":0.9991008092716555}" component=SystemSampler location=final structure=SystemSample
time="2020-07-14T09:50:06+02:00" level=debug msg="[{"mtu":65536,"name":"lo","hardwareaddr":"","flags":["up","loopback"],"addrs":[{"addr":"127.0.0.1/8"}]},{"mtu":1500,"name":"ens192","hardwareaddr":"00:50:56:ac:bf:2a","flags":["up","broadcast","multicast"],"addrs":[{"addr":"22.242.240.224/23"}]}]" component=NetworkSampler location=raw structure=NetInterfaces
time="2020-07-14T09:50:06+02:00" level=debug msg="[{"name":"ens192","bytesSent":3381613934,"bytesRecv":2703098418,"packetsSent":26959380,"packetsRecv":24059067,"errin":0,"errout":0,"dropin":0,"dropout":0,"fifoin":0,"fifoout":0},{"name":"lo","bytesSent":251537408,"bytesRecv":251537408,"packetsSent":2092296,"packetsRecv":2092296,"errin":0,"errout":0,"dropin":0,"dropout":0,"fifoin":0,"fifoout":0}]" component=NetworkSampler location=raw structure=IOCounters
time="2020-07-14T09:50:06+02:00" level=debug msg="{"eventType":"NetworkSample","timestamp":0,"entityKey":"","interfaceName":"ens192","hardwareAddress":"00:50:56:ac:bf:2a","ipV4Address":"22.242.240.224/23","state":"up","receiveBytesPerSecond":9465.931863727454,"receivePacketsPerSecond":95.39078156312625,"receiveErrorsPerSecond":0,"receiveDroppedPerSecond":0,"transmitBytesPerSecond":14247.094188376754,"transmitPacketsPerSecond":114.62925851703406,"transmitErrorsPerSecond":0,"transmitDroppedPerSecond":0}" component=NetworkSampler location=final structure=NetworkSample
time="2020-07-14T09:50:06+02:00" level=debug msg="Sending events to metrics-ingest" component=MetricsIngestSender key=4463778918856679882 numEvents=2 postCount=41 timestamps="[2020-07-14 09:50:06 +0200 SAST]"
time="2020-07-14T09:50:06+02:00" level=debug msg="preparing metrics post" component=MetricsIngestSender postCount=41
time="2020-07-14T09:50:07+02:00" level=debug msg="metrics post succeeded" component=MetricsIngestSender postCount=41
time="2020-07-14T09:50:11+02:00" level=debug msg="[{"cpu":"cpu-total","user":6785.37,"system":4096,"idle":3066651.28,"nice":23.72,"iowait":155.73,"irq":0,"softirq":115.43,"steal":0,"guest":0,"guestNice":0,"stolen":0}]" component=SystemSampler location=raw structure=CpuTimes
time="2020-07-14T09:50:11+02:00" level=debug msg="delta storage" component=SystemSampler elapsedMs=20018 totalReadTime=0 totalReads=0 totalWriteTime=25 totalWrites=20
time="2020-07-14T09:50:11+02:00" level=debug msg="{"eventType":"SystemSample","timestamp":0,"entityKey":"","cpuPercent":0.25029200734252754,"cpuUserPercent":0.11680293675946678,"cpuSystemPercent":0.11680293676022559,"cpuIOWaitPercent":0.016686133822835167,"cpuIdlePercent":99.74970799265748,"cpuStealPercent":0,"loadAverageOneMinute":0.02,"loadAverageFiveMinute":0.06,"loadAverageFifteenMinute":0.05,"memoryTotalBytes":24035250176,"memoryFreeBytes":21214498816,"memoryUsedBytes":2820751360,"swapTotalBytes":8589930496,"swapFreeBytes":8589930496,"swapUsedBytes":0,"diskUsedBytes":9022648320,"diskUsedPercent":12.176825957284052,"diskFreeBytes":62268416000,"diskFreePercent":84.03648655872267,"diskTotalBytes":74096881664,"diskUtilizationPercent":0.008742132081126985,"diskReadUtilizationPercent":0,"diskWriteUtilizationPercent":0.008742132081126985,"diskReadsPerSecond":0,"diskWritesPerSecond":0.9991008092716555}" component=SystemSampler location=final structure=SystemSample
time="2020-07-14T09:50:11+02:00" level=debug msg="[{"mtu":65536,"name":"lo","hardwareaddr":"","flags":["up","loopback"],"addrs":[{"addr":"127.0.0.1/8"}]},{"mtu":1500,"name":"ens192","hardwareaddr":"00:50:56:ac:bf:2a","flags":["up","broadcast","multicast"],"addrs":[{"addr":"22.242.240.224/23"}]}]" component=NetworkSampler location=raw structure=NetInterfaces
time="2020-07-14T09:50:11+02:00" level=debug msg="[{"name":"ens192","bytesSent":3381666409,"bytesRecv":2703140830,"packetsSent":26959867,"packetsRecv":24059514,"errin":0,"errout":0,"dropin":0,"dropout":0,"fifoin":0,"fifoout":0},{"name":"lo","bytesSent":251537408,"bytesRecv":251537408,"packetsSent":2092296,"packetsRecv":2092296,"errin":0,"errout":0,"dropin":0,"dropout":0,"fifoin":0,"fifoout":0}]" component=NetworkSampler location=raw structure=IOCounters
time="2020-07-14T09:50:11+02:00" level=debug msg="{"eventType":"NetworkSample","timestamp":0,"entityKey":"","interfaceName":"ens192","hardwareAddress":"00:50:56:ac:bf:2a","ipV4Address":"22.242.240.224/23","state":"up","receiveBytesPerSecond":8484.096819363873,"receivePacketsPerSecond":89.41788357671535,"receiveErrorsPerSecond":0,"receiveDroppedPerSecond":0,"transmitBytesPerSecond":10497.099419883978,"transmitPacketsPerSecond":97.41948389677937,"transmitErrorsPerSecond":0,"transmitDroppedPerSecond":0}" component=NetworkSampler location=final structure=NetworkSample
time="2020-07-14T09:50:11+02:00" level=debug msg="starting harvest" component=Plugins id=metadata/host_aliases
time="2020-07-14T09:50:11+02:00" level=debug msg="completed harvest, emitting" component=Plugins dataset="[{ZAPRNBRAPP1120.corp.dsarena.com hostname} {zaprnbrapp1120.corp.dsarena.com hostname_short}]" id=metadata/host_aliases
time="2020-07-14T09:50:11+02:00" level=debug msg="completed emitting" component=Plugins id=metadata/host_aliases
time="2020-07-14T09:50:11+02:00" level=debug msg="Updating identity" component=Agent new=ZAPRNBRAPP1120.corp.dsarena.c

but I don't see any ingested metrics:
https://infrastructure.newrelic.com/accounts/1718962/integrations/onHostIntegrations/accounts/7/kafka/dashboard

It looks like there is ingestion, but the dashboards are not populated with data.

Hi @Charles.Netshivhera

It looks like your integration is able to connect to your brokers but is receiving empty payloads when it tries to get metrics. The Kafka integration gathers metrics from the JMX port on your brokers; can you confirm that JMX is enabled and the JMX port is network accessible from where you have the Kafka integration installed?

Sorry for the late reply - it is not installed nor enabled.

Hello @Charles.Netshivhera, as per the compatibility and requirements for the Kafka integration, JMX needs to be enabled on all brokers, Java consumers, and Java producers that you want monitored. - Kafka monitoring integration - Compatibility and requirements
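One common way to enable remote JMX on a broker is through the environment variables Kafka's start scripts read (JMX_PORT and KAFKA_JMX_OPTS); this is a sketch only, and the port and security flags must be adapted to your deployment:

```shell
# Hypothetical example: enable remote JMX before starting a Kafka broker.
# The port must match bootstrap_broker_jmx_port in kafka-config.yml (9999 here).
export JMX_PORT=9999
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false"
# Then (re)start the broker so the settings take effect, e.g.:
# bin/kafka-server-start.sh config/server.properties
```

Disabling authentication and SSL, as above, is only appropriate on a trusted network; otherwise configure JMX credentials and set bootstrap_broker_jmx_user / bootstrap_broker_jmx_password to match.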

Could you enable JMX and let us know if you can see metrics populating as expected?