Writing Custom Kafka Metric in New Relic

Hi all,

I am new to New Relic and am struggling to write queries that fetch custom metrics.
Has anyone successfully written a query to fetch custom Kafka metrics that are not provided out of the box in New Relic?

I have tried everything I could but have not been able to get any results.
To elaborate, I need the metrics below. If you could help me get any one of them, I'll try to write the remaining queries.

1- Total number of Partitions in the cluster
2- Active Controller Count
3- Partition Count per Broker
4- Offline Partition Count
5- Under Minimum ISR
6- Leader Partition Count per Broker, and a few more

Hi, @zubairuddin: You may view the available metrics for the Kafka monitoring integration here: Kafka monitoring integration | New Relic Documentation.

It does not appear that the metrics you want are captured by the integration; if they aren’t sent to New Relic, you won’t be able to query them. You may be able to configure the Flex integration to get the metrics from Kafka and send them to New Relic.

I have tried running the Flex integration, and it works fine with the uptime example.
When I add similar code to get the number of Kafka topics, it either gives no output or gives incorrect output.

sudo vi flex-uptime.yml

integrations:
  - name: nri-flex
    config:
      name: linuxUptimeIntegration
      apis:
        - name: Uptime
          commands:
            - run: 'cat /proc/uptime'
              split: horizontal
              split_by: \s+
              set_header: [uptimeSeconds,idletimeSeconds]

  - name: nri-flex
    config:
      name: kafkaTopics
      apis:
        - name: TopicCount
          commands:
            - run: 'kafka-topics --bootstrap-server kafka1:9092 --list | wc -l'
              split: horizontal
              set_header: [totalTopicCount]

Also, I am not able to see the default consumer-related metrics that come out of the box (e.g. Consumer Max Lag).
Instead of showing all 3 brokers at the bottom of the widget, it shows KafkaConsumerSample.

integrations:
  - name: nri-kafka
    env:
      CLUSTER_NAME: "NewRelicKafka"
      KAFKA_VERSION: "2.8.0"
      AUTODISCOVER_STRATEGY: "bootstrap"
      BOOTSTRAP_BROKER_HOST: "kafka1"
      BOOTSTRAP_BROKER_KAFKA_PORT: 9092
      BOOTSTRAP_BROKER_KAFKA_PROTOCOL: PLAINTEXT
      BOOTSTRAP_BROKER_JMX_PORT: 9999
      BOOTSTRAP_BROKER_JMX_USER: admin
      BOOTSTRAP_BROKER_JMX_PASSWORD: password
      LOCAL_ONLY_COLLECTION: false
      COLLECT_BROKER_TOPIC_DATA: false
      TOPIC_MODE: "all"
      #TOPIC_REGEX: 'topic\d+'
      COLLECT_TOPIC_SIZE: true
      COLLECT_TOPIC_OFFSET: true
      METRICS: "true"
    interval: 15s
    inventory_source: config/kafka
    labels:
      env: production
      role: kafka
I have updated the Flex integration file with a timeout. Below is the Flex YAML file:

integrations:
  - name: nri-flex
    config:
      name: kafkaTopics
      apis:
        - name: TCount
          commands:
            - timeout: 300000
              run: 'kafka-topics --bootstrap-server kafka1:9092 --list | wc -l'
              set_header: [totalTopicCnt]

When I run
sudo /var/db/newrelic-infra/newrelic-integrations/bin/nri-flex --config_path topic-metric.yml

{
  "name": "com.newrelic.nri-flex",
  "protocol_version": "3",
  "integration_version": "1.4.4",
  "data": [
    {
      "metrics": [
        {
          "event_type": "flexStatusSample",
          "flex.Hostname": "kafka1",
          "flex.IntegrationVersion": "1.4.4",
          "flex.counter.ConfigsProcessed": 1,
          "flex.counter.EventCount": 0,
          "flex.counter.EventDropCount": 0,
          "flex.time.elaspedMs": 23360,
          "flex.time.endMs": 1646915414502,
          "flex.time.startMs": 1646915391142
        }
      ],
      "inventory": {},
      "events": []
    }
  ]
}

I am not getting "totalTopicCnt" anywhere.

In the ERROR MSG field, the number 53 is the total number of topics, which is the value I am looking for under the column "totalTopicCount". Unfortunately, that column is not even visible in "select * from TCountSample".

Later I changed my Linux command to "cat count", but even this simple command does not work.
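
For the topic count, one thing that may be worth trying is to give Flex an explicit split, as the uptime example does. The sketch below is untested. It assumes that `wc -l` prints a single number, so that a horizontal split on whitespace yields exactly one column to match the one header. Flex names the event after the API, so with this config the value would be expected in TopicCountSample.

```yaml
integrations:
  - name: nri-flex
    config:
      name: kafkaTopics
      apis:
        - name: TopicCount
          commands:
            - run: 'kafka-topics --bootstrap-server kafka1:9092 --list | wc -l'
              # Assumption: the Kafka CLI can be slow, so keep the longer timeout (milliseconds).
              timeout: 300000
              # Treat the output as a row of whitespace-separated columns.
              split: horizontal
              split_by: \s+
              # One column in the output, so one header name.
              set_header: [totalTopicCount]
```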

Hi Team,

I somehow managed to get the number of topics using Flex. Now, when I run the query below, it always returns an extra "TIMESTAMP" column that I am not able to get rid of.

SELECT TopicCount as 'TotalTopicCount' from TopicTestSample limit 1

When I try to create an alert for when the number of topics exceeds a particular value, it does not accept the query below.

SELECT TopicCount as 'TotalTopicCount' from TopicTestSample limit 1

Hi @zubairuddin, I am not seeing any data when I query TopicTestSample. Are you still seeing that data in the system?