Kafka Integration keeps getting disconnected. Any suggestions?

We cannot get this integration to work. We do have a support ticket open, but we are at a loss as to what to do next. (We are not Kafka experts, either.)

We get these messages when running the Kafka integration:

client/metadata got error from broker -1 while fetching metadata: EOF

We have a 3-node ZooKeeper cluster and a 3-node Kafka cluster:

  • RHEL 7
  • Infrastructure agent version 1.17.1
  • Kafka integration version 2.15.0
  • Kafka version 0.10.2.1 (the 2.12-0.10.2.1 package, i.e. the Scala 2.12 build)

Full Error Message (stderr newlines expanded for readability):

time="2021-06-02T19:48:36Z" level=error msg="Integration command failed" error="exit status 1" instance=kafka-consumer-offsets integration=com.newrelic.kafka prefix=config/kafka working-dir=/var/db/newrelic-infra/newrelic-integrations

stderr:
19:48:35.268171 [DEBUG] store file (/tmp/nr-integrations/com.newrelic.kafka.json) is older than 1m0s, skipping loading from disk.
[DEBUG] Connected to 10.56.31.50:2181
[DEBUG] authenticated: id=105544128853686523, timeout=4000
[DEBUG] re-submitting 0 credentials after reconnect
[DEBUG] Connected to broker at KFK01.:9092 (unregistered)
[DEBUG] Connected to broker at KFK02.:9092 (unregistered)
[DEBUG] Connected to broker at KFK03.:9092 (unregistered)
[DEBUG] recv loop terminated: err=EOF
[DEBUG] send loop terminated: err=<nil>
[DEBUG] Creating a new client to brokers: [KFK01.:9092 KFK02.:9092 KFK03.:9092]
[DEBUG] [Initializing new client]
[DEBUG] client/metadata fetching metadata for all topics from broker KFK03.:9092
[DEBUG] Connected to broker at KFK03.:9092 (unregistered)
[DEBUG] client/metadata got error from broker -1 while fetching metadata: EOF
[DEBUG] Closed connection to broker KFK03.:9092
[DEBUG] client/metadata fetching metadata for all topics from broker KFK01.:9092
[DEBUG] Connected to broker at KFK01.:9092 (unregistered)
[DEBUG] client/metadata got error from broker -1 while fetching metadata: EOF
[DEBUG] Closed connection to broker KFK01.:9092
[DEBUG] client/metadata fetching metadata for all topics from broker KFK02.:9092
[DEBUG] Connected to broker at KFK02.:9092 (unregistered)
[DEBUG] client/metadata got error from broker -1 while fetching metadata: EOF
[DEBUG] Closed connection to broker KFK02.:9092
[DEBUG] [client/metadata no available broker to send metadata request to]
[DEBUG] client/brokers resurrecting 3 dead seed brokers
[DEBUG] client/metadata retrying after 250ms... (3 attempts remaining)

(The identical connect/EOF cycle against KFK03, KFK01, and KFK02 then repeats for each remaining retry, with 2 attempts and 1 attempt remaining, until:)

[DEBUG] [client/metadata no available broker to send metadata request to]
[DEBUG] client/brokers resurrecting 3 dead seed brokers
[DEBUG] [Closing Client]
[ERR] Integration failed: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

Configuration File:

---
integration_name: com.newrelic.kafka
instances:
  - name: kafka-metrics-zookeeper-discovery
    command: metrics
    arguments:
      cluster_name: "kfka_us"
      default_jmx_port: 7199
      autodiscover_strategy: "zookeeper"
      zookeeper_hosts: '[{"host": "xxx.xxx.xxx.0", "port": 2181}, {"host": "xxx.xxx.xxx.1", "port": 2181}, {"host": "xxx.xxx.xxx.2", "port": 2181}]'
      zookeeper_path: "/opt/zookeeper/conf"
      topic_mode: "all"
      collect_topic_size: true
      timeout: 20000
    labels:
      env: 'prod'
      role: kafka
      name: 'ZK01'
      version: '0.10.2.1'
  - name: kafka-consumer-offsets
    command: consumer_offset
    arguments:
      cluster_name: "kfka_us"
      autodiscover_strategy: "zookeeper"
      zookeeper_hosts: '[{"host": "xxx.xxx.xxx.0", "port": 2181}, {"host": "xxx.xxx.xxx.1", "port": 2181}, {"host": "xxx.xxx.xxx.2", "port": 2181}]'
      consumer_group_regex: '.*'
    labels:
      env: 'prod'
      role: kafka
      name: 'ZK01'
      version: '0.10.2.1'

@Larry.Collicott Thank you for mentioning that you are working with our support team via a ticket. You can certainly still post on the forum, but please note that our support team will only work via the ticket; otherwise there could be duplication of effort. Fellow community members are welcome to provide their insights on this thread.

Yes, I understand. It’s been a mystery ticket for just over two weeks, so I thought maybe some community insight might exist out there. Fingers crossed.


Great thinking @Larry.Collicott! I’m very interested to see if any of the community members have thoughts on this.

Yeah, @nmcnamara I’m hoping we don’t have to reinvent the wheel on this integration.

Setting up the Kafka integration is not very straightforward, since Kafka itself is less than straightforward and has so many different configurations of its own. It would be nice if there were a firm, definitive document that stated something simple like this:

  • A minimal kafka-config.yml format that will always give you data if…
  • …you use these variables/ports/authentication methods on either your broker nodes or ZooKeeper nodes.

This integration is by far the most difficult to get working, and we’ve already struggled through it on two other Kafka installations, so we’ve got hundreds of hours of experience screaming at these config files.
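
For illustration, the minimal shape we have in mind would be something like the sketch below. This is only a sketch, not a tested config: it assumes bootstrap discovery against a single reachable broker, PLAINTEXT listeners with no authentication, and JMX enabled on the broker; the hostname, cluster name, and JMX port are placeholders, and the argument names follow the integration’s bundled sample config as we understand it.

integration_name: com.newrelic.kafka
instances:
  - name: kafka-metrics-minimal
    command: metrics
    arguments:
      # Placeholder cluster name; only used to group entities in New Relic.
      cluster_name: "my_cluster"
      # Ask one reachable broker for the cluster topology instead of walking ZooKeeper.
      autodiscover_strategy: "bootstrap"
      bootstrap_broker_host: "kafka-broker-1.example.com"   # placeholder hostname
      bootstrap_broker_kafka_port: 9092
      bootstrap_broker_kafka_protocol: "PLAINTEXT"          # assumes no TLS/SASL
      # JMX must be enabled on the broker for metrics collection.
      bootstrap_broker_jmx_port: 9999                       # placeholder JMX port
      topic_mode: "all"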


Does your producer connect, send a message, and then disconnect? It’s important not to do that; if that’s the case, you need to keep your producer connected to Kafka.

To my knowledge (and I am no Kafka expert), yes, the producers stay connected.

I just realized that I never shared the solution I used for this: setting the Kafka API version to “1.0.0”.

      # Override the Kafka API version to target. Defaults to 1.0.0, which will
      # work for all post-1.0.0 versions. Older versions of the API may be
      # missing features.
      kafka_version: "1.0.0"
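
In context it goes under the arguments block of each instance in kafka-config.yml, roughly like this (abbreviated from the config I posted above):

instances:
  - name: kafka-consumer-offsets
    command: consumer_offset
    arguments:
      cluster_name: "kfka_us"
      autodiscover_strategy: "zookeeper"
      zookeeper_hosts: '[{"host": "xxx.xxx.xxx.0", "port": 2181}, …]'
      consumer_group_regex: '.*'
      # Pin the Kafka API version the integration speaks to the brokers.
      kafka_version: "1.0.0"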