Sending multiline logs using the Fluent Bit plugin

Hi,
I have multiline JSON logs and want to send each record as an individual log entry, but New Relic is sending each line as a separate log.
I tried two approaches:
First Case:
I used the default logging mechanism of the infra agent:
Logs.yml
logs:
  - name: "test_log"
    file: /home/azureuser/mfhost2.json
Second Case:
I tried using the Fluent Bit plugin:
Fluentbit.yml
logs:
  - name: external-fluentbit-config-and-parsers-file
    fluentbit:
      config_file: /etc/newrelic-infra/logging.d/fluentbit.config
      parsers_file: /etc/newrelic-infra/logging.d/parsers.conf

fluentbit.config
[INPUT]
    Name                tail
    Path                /home/azureuser/mfhost.json
    Path_Key            filePath
    Key                 message
    Tag                 tail_test
    Multiline           On
    Parser_Firstline    MULTILINE_MATCH

[FILTER]
    Name             parser
    Match            tail_test
    Key_Name         message
    Parser           MULTILINE_MATCH
    Reserve_Data     On
    Preserve_Key     On

parsers.conf
[PARSER]
    Name        MULTILINE_MATCH
    Format      regex
    Regex       /{[\s\S]* "ProviderId": "(?<provider_id>[a-f0-9\-]*)",[\s\S]* "EventId": (?<event_id>[^ ]*),[\s\S]* "Keywords": (?<keywords>[^ ]*),[\s\S]* "Level": "(?<level>[^ ]*)",[\s\S]* "Message": "(?<message>.*)",[\s\S]* "Opcode": (?<opcode>[^ ]*),[\s\S]* "Task": (?<task>[^ ]*),[\s\S]* "Version": (?<version>[^ ]*),[\s\S]* "Payload": {[\s\S]*( "elapsedTime": (?<elapsed_time>[^ ]*),[\s\S]*)?( "error": (?<error>.*),[\s\S]*)?( "scopedActivityId": (?<scoped_activity_id>[^ ]*),[\s\S]*)?( "duration": (?<duration>[^ ]*),[\s\S]*)?( "host": (?<host>[^ ]*),[\s\S]*)?( "operation": (?<operation>[^ ]*),[\s\S]*)? "MachineName": "(?<machine_name>[^ ]*)",[\s\S]* "ActivityId": "(?<activity_id>[^ ]*)",[\s\S]* "ApplicationId": "(?<application_id>[^ ]*)",[\s\S]* "TenantId": "(?<tenant_id>[^ ]*)",[\s\S]* "UserId": "(?<user_id>[^ ]*)"[\s\S]* },[\s\S]* "EventName": "(?<event_name>[^ ]*)",[\s\S]* "Timestamp": "(?<timestamp>[^ ]*)",[\s\S]* "ProcessId": (?<process_id>[^ ]*),[\s\S]* "ThreadId": (?<thread_id>[^ ]*)[\s\S]*},[\s\S]*/

I also tried a different regex: /{(?[^{}]*[{][^{}]*[}][,][^{}]*)},/

mfhost.json
{
"ProviderId": "xyz",
"EventId": 123,
"Keywords": 12,
"Level": "Info",
"Message": "abc",
"Opcode": 0,
"Task": 123,
"Version": 0,
"Payload": {
"elapsedTime": 123,
"MachineName": "abc",
"ActivityId": "abc",
"ApplicationId": "abc",
"TenantId": "abc",
"UserId": "abc"
},
"EventName": "abc",
"Timestamp": "abc",
"ProcessId": 123,
"ThreadId": 123
},

I need some help troubleshooting this issue.

@utsav.tulsyan welcome!

Can you share a sample of your JSON with at least 2 log messages, please? I'm curious to try this because JSON is usually captured and parsed without you needing to change anything on your side. There might be an issue we need to account for with the nesting, though, but we'd need to see a bigger sample.


Hi @zackm ,
Thanks a lot for your response. My log file mfhost.json would look something like this with 2 records:

{
"ProviderId": "xyz",
"EventId": 123,
"Keywords": 12,
"Level": "Info",
"Message": "abc",
"Opcode": 0,
"Task": 123,
"Version": 0,
"Payload": {
"elapsedTime": 123,
"MachineName": "abc",
"ActivityId": "abc",
"ApplicationId": "abc",
"TenantId": "abc",
"UserId": "abc"
},
"EventName": "abc",
"Timestamp": "abc",
"ProcessId": 123,
"ThreadId": 123
},
{
"ProviderId": "xyz",
"EventId": 123,
"Keywords": 12,
"Level": "Info",
"Message": "abc",
"Opcode": 0,
"Task": 123,
"Version": 0,
"Payload": {
"elapsedTime": 123,
"MachineName": "abc",
"ActivityId": "abc",
"ApplicationId": "abc",
"TenantId": "abc",
"UserId": "abc"
},
"EventName": "abc",
"Timestamp": "abc",
"ProcessId": 123,
"ThreadId": 123
},

I hope this helps.

Thanks for that! I'll try to work out a regex for this tomorrow to see if we can get a working parser.

Is it safe to assume that you cannot change your log pattern to flatten the JSON payload into one element per line? The reason you're getting multiple entries right now is that we use the tail input from Fluent Bit on the file attribute in the YAML config. If you flatten it out like this, it "just works":

{"ProviderId": "xyz", "EventId": 123, "Keywords": 12, "Level": "Info", "Message": "abc", "Opcode": 0, "Task": 123, "Version": 0, "Payload": {"elapsedTime": 123, "MachineName": "abc", "ActivityId": "abc", "ApplicationId": "abc", "TenantId": "abc", "UserId": "abc" }, "EventName": "abc", "Timestamp": "abc", "ProcessId": 123, "ThreadId": 123 },
{"ProviderId": "xyz", "EventId": 123, "Keywords": 12, "Level": "Info", "Message": "abc", "Opcode": 0, "Task": 123, "Version": 0, "Payload": {"elapsedTime": 123, "MachineName": "abc", "ActivityId": "abc", "ApplicationId": "abc", "TenantId": "abc", "UserId": "abc" }, "EventName": "abc", "Timestamp": "abc", "ProcessId": 123, "ThreadId": 123 },
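
If flattening ever becomes an option, a small one-off conversion script would do it. Below is a rough, untested Python sketch; it assumes the file is a sequence of { ... }, blocks exactly like your samples, and the output filename is just a placeholder I made up:

import json
import re

# Rough sketch only: read the multiline log, drop the trailing comma after
# the last record, wrap everything in [ ] so it parses as one JSON array,
# then re-emit each record on its own line.
# Paths are placeholders; adjust them to your actual files.
with open("/home/azureuser/mfhost.json") as src:
    raw = src.read().strip()

wrapped = "[" + re.sub(r",\s*$", "", raw) + "]"
records = json.loads(wrapped)

with open("/home/azureuser/mfhost_flat.json", "w") as out:
    for record in records:
        out.write(json.dumps(record) + "\n")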

Hi @zackm, flattening is not an option right now; it would be awesome if you could get the Fluent Bit parser working for this.

@utsav.tulsyan - following up: I'm still banging away at this, and I'm honestly not sure whether this pattern of parsing is supported by Fluent Bit. The industry standard for JSON in logs is to flatten it, and I'm unable to find any reason why Fluent Bit is not capturing the entire block correctly. For context, this is the current/latest config and parser setup I have been testing, along with a validated regex. Note that I removed the "Timestamp" attribute just for testing, so I didn't have to keep updating the epoch with each new test and could avoid any potential issues with "old" logs.

Config File:

[INPUT]
    Name                tail
    Path                /root/test.log
    Path_Key            filePath
    Key                 message
    Multiline           On
    Parser_Firstline    FIRST_LINE
    Parser_1            JSON_MATCH

This config uses the Parser_Firstline pattern to find the start of our expected log entry, and the Parser_1 pattern to break the rest of the block into proper key-value pairs. You can read more about this in the docs.

Parsers File:

[PARSER]
    Name        FIRST_LINE
    Format      regex
    Regex       /^{.*/

[PARSER]
    Name        JSON_MATCH
    Format      regex
    Regex       /{."ProviderId": "(?<provider_id>[^ ].*)",."EventId": (?<event_id>[^ ].*),."Keywords": (?<keywords>[^ ].*),."Level": "(?<level>[^ ].*)",."Message": "(?<message>[^ ].*)",."Opcode": (?<op_code>[^ ].*),."Task": (?<task>[^ ].*),."Version": (?<version>[^ ].*),."Payload": {."elapsedTime": (?<elapsed_time>[^ ].*),."MachineName": "(?<machine_name>[^ ].*)",."ActivityId": "(?<activity_id>[^ ].*)",."ApplicationId": "(?<application_id>[^ ].*)",."TenantId": "(?<tenant_id>[^ ].*)",."UserId": "(?<user_id>[^ ].*)".},."EventName": "(?<event_name>[^ ].*)",."ProcessId": (?<process_id>[^ ].*),."ThreadId": (?<thread_id>[^ ]\d+).}../m

Regex confirmation on Rubular
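
To sanity check the two-stage idea locally, here is a rough Python sketch of what Parser_Firstline plus Parser_1 are meant to do: group lines into a block starting at each line that matches FIRST_LINE, then run a record-level regex over the joined block. Treat it as an approximation only; Fluent Bit uses the Onigmo (Ruby-style) engine, so Python needs (?P<name>...) group syntax and re.DOTALL in place of the /m flag, and the RECORD pattern below is a truncated stand-in for JSON_MATCH that only captures the first two fields.

import re

# Approximation of the FIRST_LINE parser: a new record starts on any line
# that begins with "{".
FIRST_LINE = re.compile(r"^{")

# Truncated stand-in for the JSON_MATCH parser, translated to Python's
# named-group syntax; only ProviderId and EventId are captured here.
RECORD = re.compile(
    r'{.*"ProviderId": "(?P<provider_id>[^"]*)",.*"EventId": (?P<event_id>\d+),',
    re.DOTALL,
)

def group_records(lines):
    # Concatenate lines into blocks, starting a new block whenever a line
    # matches FIRST_LINE.
    block = []
    for line in lines:
        if FIRST_LINE.match(line) and block:
            yield "".join(block)
            block = []
        block.append(line)
    if block:
        yield "".join(block)

with open("/root/test.log") as log:  # same path as the tail input above
    for record in group_records(log):
        match = RECORD.search(record)
        if match:
            print(match.groupdict())  # e.g. {'provider_id': 'xyz', 'event_id': '123'}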

I'll try to find some time Monday to keep looking at this, but I am not positive that it will work. An issue on the Fluent Bit GitHub repository may be warranted to rule out any potential problems there as well.

Hey @zackm, I tried a few more regex patterns but none of them are working, even though Rubular and Fluentular handle them correctly. I'll raise an issue on the Fluent Bit repo. I hope I can find a solution to this; it would be amazing if New Relic supported this out of the box.


@utsav.tulsyan Please let the community know if you do find a solution. Great to get your feedback :slight_smile: