So this is a topic near and dear to my heart since I spend a pretty terrible amount of time bouncing across platforms and trying to sort things out for people. One of the things I have been excited by in relation to NR since we started using it is that it seems like there is good potential for scraping data out of all kinds of places and bundling it up in here in a nice cohesive package. So far I have lots of big ideas but very little in the way of concrete progress to refer to and wanted to see what everyone else who has a situation anything like mine is doing.
So we currently lean heavily on New Relic, Splunk, SolarWinds, Stackdriver, and Netcool, with ServiceNow as the ticketing for all that, to monitor an environment where most of our systems have made the jump to GCP and AWS, with some legacy systems living in colos and on vCenter VMs. We’re also tracking network gear spread all around the world. Lots of the servers were classic lift-and-shift monoliths, but some teams are living deeply in that SRE/DevOps universe where they need to monitor their automated deployment pipelines in Kubernetes and such. Everyone is at a different stage of technological maturity and time is always a constraint. It’s essentially the wild west and my 3-man team’s job is to herd these cats as best we can.
So just to throw an example out there, I recently put together a script that I can use to crawl through a team’s subaccount and check whether any of their servers still have leftover process monitors on any SolarWinds SAM templates. Since the Infra agent is already grabbing that data, I figured it was a fairly easy target: set up Infra alerts to match all the process monitors and be one step closer to removing their systems from the second tool. Right now I’m a bit bogged down in communicating with all the users and working through change control before I get to pull the trigger on forcing the change, but it’s an encouraging step forward for me.
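This is not the actual script, just a minimal sketch of the comparison step behind that kind of check. It assumes you have already exported two things by whatever means you prefer: the process names monitored per host in the SAM templates, and the process names the Infra agent reports per host. The function name, the dict shapes, and the sample data are all made up for illustration.

```python
def compare_process_monitors(sam_monitors, infra_processes):
    """Split each host's SAM process monitors into those the Infra agent
    already reports ("covered") and those it does not ("missing").

    Both arguments are hypothetical dicts of hostname -> iterable of
    process names; how you export them from each tool is up to you.
    """
    # Normalise the Infra side once so lookups are case-insensitive.
    infra = {
        host.lower(): {name.lower() for name in names}
        for host, names in infra_processes.items()
    }

    report = {}
    for host, monitors in sam_monitors.items():
        wanted = {name.lower() for name in monitors}
        seen = infra.get(host.lower(), set())
        report[host.lower()] = {
            "covered": sorted(wanted & seen),   # safe to recreate as Infra alerts
            "missing": sorted(wanted - seen),   # needs a closer look first
        }
    return report


# Made-up sample data, purely for illustration.
sam = {"APP01": ["nginx", "java"], "db01": ["postgres"]}
infra_seen = {"app01": ["nginx", "sshd"], "db01": ["postgres", "cron"]}

report = compare_process_monitors(sam, infra_seen)
for host, result in sorted(report.items()):
    print(host, result)
```

Anything that lands in "missing" is a monitor you would want to sort out before pulling the host out of SAM; a host with an empty "missing" list is a candidate for cutover.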
I’m also trying to explore what options I have with NR One for visualizing info from other tools. Just as an example, I want to be able to pull up a given server or APM app and cook up some wizardry to see some of the info I need from its CMDB CI record. The goal is just to save our users some clicks when hunting down info about a host that the person who got the ticket maybe wasn’t as familiar with as we would have hoped. I’ve done similar sorts of correlations in Orion so I know how I want it to look under the covers; I just need to skill up on JS and the nerdpacks and see how I can make it happen there.
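The correlation itself is mostly a keyed lookup, whatever ends up rendering it. Here is a rough sketch of that join, assuming the CI records have already been pulled out of the CMDB as a list of dicts. The field names (`name`, `support_group`, `owned_by`, `environment`) and the helper names are assumptions for illustration, not a confirmed schema, and the hostname matching is deliberately naive.

```python
def short_name(hostname):
    """Lowercase a hostname and drop any domain suffix so that
    'APP01.corp.example.com' and 'app01' match. Not meant for IP addresses."""
    return hostname.strip().lower().split(".")[0]


def index_cis(ci_records):
    """Build a lookup of short hostname -> CI record (a dict).
    If two CIs share a short name, the later one wins."""
    return {short_name(ci["name"]): ci for ci in ci_records}


def enrich_host(hostname, ci_index,
                fields=("support_group", "owned_by", "environment")):
    """Return the selected CI fields for a host, or None if no CI matches.
    The default field names are hypothetical."""
    ci = ci_index.get(short_name(hostname))
    if ci is None:
        return None
    return {field: ci.get(field) for field in fields}


# Made-up CI records, purely for illustration.
cis = [
    {"name": "APP01.corp.example.com", "support_group": "Web Ops",
     "owned_by": "jdoe", "environment": "Production"},
    {"name": "db01", "support_group": "DBA", "owned_by": "asmith"},
]
ci_index = index_cis(cis)

print(enrich_host("app01", ci_index))
print(enrich_host("unknown-host", ci_index))
```

Returning None for an unmatched host keeps the "no CI found" case explicit, which is usually worth surfacing to whoever picked up the ticket.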
Anyway, those are some of my struggles and successes. Hopefully the community has some neat tricks up their sleeves, or just wants to commiserate on the challenges we have?