Feature Idea: Using Mongoose Cursors: Memory leaking very quickly

feature-idea

#1

Setup

Node v7.10.0
Mongoose v4.10.4
NewRelic Agent v1.39.1

Issue

  • One of our Express app routes opens a Mongoose cursor (http://mongoosejs.com/docs/api.html#querycursor_QueryCursor) to fetch documents (possibly ~100k), pipes them into a CSV stream, and then pipes that into the Express response.
  • This works fine with NEW_RELIC_ENABLED=false: memory goes from 100 MB to 130 MB and stays there while the CSV is streamed (which can take up to a minute).
  • When NewRelic is enabled, memory goes from 100 MB to 500 MB (the limit set by our provider, Heroku) in less than 10 seconds while the CSV is being streamed, and the app then crashes.
  • We tried disabling instrumentation for that route (NEW_RELIC_IGNORING_RULES='.*format=csv.*'), but it leaks just the same.
  • According to your documentation, there is a known issue with MongoDB cursors (https://docs.newrelic.com/docs/agents/nodejs-agent/troubleshooting/troubleshooting-large-memory-usage-nodejs#mongo), but I’m not sure it is related, since Mongoose closes the cursor properly (https://github.com/Automattic/mongoose/blob/4.10.4/lib/querycursor.js) and the memory leak happens as soon as the cursor starts streaming anyway.
  • Our only solution right now is to disable NewRelic completely.
  • Is there a way to fix this issue, or at least to make NewRelic ignore that transaction / the Mongoose cursor methods?
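
For readers unfamiliar with the setup, the pipeline described in the list above can be sketched with Node's built-in streams. This is a minimal sketch and not the poster's code: fakeCursor, toCsv, collectingSink and exportCsv are hypothetical names, the array-backed readable stands in for the Mongoose cursor, and the collecting writable stands in for the Express response. It assumes a Node version that ships stream/promises, which is newer than the Node v7.10.0 listed under Setup.

```javascript
const { Readable, Transform, Writable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical stand-in for a Mongoose QueryCursor: an object-mode readable
// that yields plain documents one at a time.
function fakeCursor(docs) {
  return Readable.from(docs);
}

// Turns each document into one CSV line and emits a header before the first
// row. Quoting and escaping are deliberately left out of this sketch.
function toCsv(fields) {
  let headerSent = false;
  return new Transform({
    writableObjectMode: true,
    transform(doc, _enc, cb) {
      let out = '';
      if (!headerSent) {
        out += fields.join(',') + '\n';
        headerSent = true;
      }
      out += fields.map((f) => String(doc[f])).join(',') + '\n';
      cb(null, out);
    },
  });
}

// Hypothetical stand-in for the Express response: collects what is written.
function collectingSink() {
  const chunks = [];
  const sink = new Writable({
    write(chunk, _enc, cb) {
      chunks.push(chunk.toString());
      cb();
    },
  });
  sink.text = () => chunks.join('');
  return sink;
}

// cursor -> CSV transform -> response, with backpressure handled by pipeline().
async function exportCsv(docs, fields) {
  const sink = collectingSink();
  await pipeline(fakeCursor(docs), toCsv(fields), sink);
  return sink.text();
}
```

In a real route the first argument to pipeline() would be the cursor and the last the response object; with backpressure respected, only a few documents should be buffered at any time, which matches the flat memory profile reported with the agent disabled.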

Thanks


We take feature ideas seriously and our product managers review every one when plotting their roadmaps. However, there is no guarantee this feature will be implemented. This post ensures the idea is put on the table and discussed though. So please vote and share your extra details with our team.


#2

Any hint on this one? We will have to drop NewRelic, as streaming large numbers of MongoDB documents to CSV is a requirement for us.


#3

@loris I’ve been looking into this and I’d like to gather a bit more information as I investigate, so I’m going to open a ticket for you.


#4

We keep bumping into this exact issue all the time with cursors and streams and have had to run parts of our app uninstrumented because of it. Any insights that you can share publicly would help greatly here.

Thanks


#5

How do you keep the part uninstrumented? We tried ignoring transactions but NewRelic would keep on leaking


#6

We’re lucky in that most of our long cursors are in cron jobs on which, being independent processes, we can control whether NewRelic is required or not (they’re all custom transactions).

We’ve also had little luck with ignoring transactions on other points of the app, but that’s for another topic :slight_smile:


#7

Thanks for the updates, @loris and @jmnsf!

I looked into that support ticket mentioned above in hopes that I could add something here that would help clarify the questions in this thread. The answer I found was that an update was able to help you here, @loris—is that right?

Let us know what else you both need! :blush: Talk soon!


#8

This problem is still present in 2.5.0.


#9

Same here.

For us, the leak only happens when we call a promise inside the ‘on data’ event handler.
Nulling the promises at the end of the stream and after the cursor is closed doesn’t help.
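
To make the pattern in this post concrete, here is a minimal sketch of a 'data' handler that starts a promise for every document. It only illustrates the shape of the code being discussed and is not claimed to avoid or reproduce the agent-related leak. someAsyncThing and processStream are hypothetical names, and the readable built from an array stands in for a cursor stream.

```javascript
const { Readable } = require('stream');

// Hypothetical async work done for each document.
function someAsyncThing(doc) {
  return new Promise((resolve) => setImmediate(() => resolve(doc.id * 2)));
}

// Consumes a readable with a 'data' handler that starts one promise per
// document. The stream is paused while the promise is pending, so at most
// one promise is in flight at a time.
function processStream(stream) {
  return new Promise((resolve, reject) => {
    const results = [];
    let pending = 0;
    let ended = false;
    const finish = () => {
      if (ended && pending === 0) resolve(results);
    };
    stream.on('data', (doc) => {
      stream.pause();
      pending += 1;
      someAsyncThing(doc)
        .then((value) => {
          results.push(value);
          pending -= 1;
          stream.resume();
          finish();
        })
        .catch(reject);
    });
    stream.on('end', () => {
      ended = true;
      finish();
    });
    stream.on('error', reject);
  });
}
```

A call such as processStream(Readable.from([{ id: 1 }, { id: 2 }])) resolves with the per-document results once the stream has ended and the last promise has settled.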


#10

@jmnsf @contas Mongoose is not supported at this time, though you could try our instrumentDatastore() API. One thing to note if you use this to instrument Mongoose: you’ll want to stub out the MongoDB instrumentation. Here are a few docs to get you started:

Compatibility and requirements for the Node.js agent

Intro to Datastore Instrumentation - Tutorial

newrelic.instrumentDatastore() API

We’ll also start a feature idea poll here for Mongoose support with the Node agent.


#11

Probably also worth noting that the instrumentDatastore API allows you to package/distribute the instrumentation as a separate module. So if someone wants to be a rock star and publish instrumentation for Mongoose using the instrumentDatastore API to npm I’ll be happy to hook you up with some New Relic swag :grin:


#12

I feel there’s a misunderstanding here: this isn’t a feature request, it’s a memory leak caused by the NewRelic agent that makes mongoose cursors unusable. We don’t care about mongoose instrumentation, we care that this doesn’t happen:

The moment the agent is enabled, our background workers start leaking memory. We’ve isolated the issue to cursors/streams: the more objects are iterated, the worse it gets. I must emphasize the seriousness of this problem as there is no other way to efficiently iterate a large number of records from MongoDB, and we’ve had to resort to disabling NR altogether to restore our app to normal behaviour.
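
One way to put numbers on "the more objects are iterated, the worse it gets" is to sample the process's resident set size at fixed intervals during iteration and compare a run with the agent enabled against one with it disabled. The following is a minimal sketch under stated assumptions: sampleMemory and fakeDocs are hypothetical names, and the source is assumed to be consumable as an async iterable, which may not hold for the Mongoose version discussed in this thread.

```javascript
// Iterates any async iterable (a hypothetical stand-in for a cursor) and
// records the resident set size, in MB, after every `every` documents.
async function sampleMemory(iterable, every) {
  const samples = [];
  let count = 0;
  for await (const _doc of iterable) {
    count += 1;
    if (count % every === 0) {
      const rssMb = Math.round(process.memoryUsage().rss / (1024 * 1024));
      samples.push({ count, rssMb });
    }
  }
  return samples;
}

// Hypothetical document source used only to exercise the helper.
async function* fakeDocs(n) {
  for (let i = 0; i < n; i += 1) {
    yield { i };
  }
}
```

Calling sampleMemory(fakeDocs(1000), 250) resolves with four samples; a steadily climbing rssMb column in the agent-enabled run, against a flat one otherwise, would support the observation above.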


#13

Can confirm last message, I don’t know why my post has been flagged as a “Feature idea”, this is clearly a bug report. Right now, we have to stop using NewRelic on all our apps which use Mongoose cursors.


#14

Apologies, there’s a bit of subtlety here which I think is leading to the bug vs feature discussion. The mongoose module is built on top of mongodb. New Relic instruments the mongodb module, not mongoose. There is apparently a bad interaction between the mongoose cursor implementation and New Relic’s instrumentation of mongodb, resulting in the issue being observed. The exact cause is not currently known, but we do not see memory leak issues with the mongodb instrumentation by itself (just as you obviously don’t observe significant leaks with mongoose cursors by themselves).

Adding instrumentation for mongoose (and disabling the mongodb instrumentation) would likely be the fastest way to address this issue.

That said, we’re looking into reproducing the issue internally to see if there is a workaround or other solution we can offer in the near term.

UPDATE: I realized after my response that we did investigate the issue as reported by the OP ( @loris ) and the conclusion was that it would require instrumenting mongoose to resolve. @jmnsf support will be in contact with you to collect some additional info to confirm you are seeing the same issue.


#15

@loris @contas found a workaround to this, hinted by @jstuckey. If you use a stream straight from the MongoDB driver, the memory issue goes away.

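const through2 = require('through2'); // helper for building transform streams
const Model = mongoose.model('Model');
// Stream raw documents straight from the MongoDB driver, bypassing the Mongoose cursor
const stream = Model.collection.find().stream();

stream.pipe(through2.obj(async (doc, _enc, cb) => {
  try {
    const model = Model.hydrate(doc); // make it a mongoose instance
    await someAsyncThing(model); // do whatever async work
    cb();
  } catch (err) {
    cb(err); // pass failures to the stream instead of leaving a rejected promise
  }
}));

stream.on('end', () => magic()); // run magic() once the driver stream has ended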

This is working for us.


#16

@jmnsf that’s great news and thank you very much for sharing your workaround with the community here. Also thanks for the help on getting data you’ve been providing via the ticket that was opened.

We’re continuing to dig into the issue on the engineering side with an aim of providing a more general solution (that would avoid the need for a code change on your end). But it’s great to know you were able to get unblocked with this workaround and hopefully this information will be helpful to others as a temporary solution.


#17

With the release of v2.7.0, there is a workaround for this issue. By marking a method as opaque in the instrumentation of mongoose, the segments associated with the next method calls will be omitted. Here is an example instrumentation of mongoose’s eachAsync method:

var mongo = require('mongodb')
var nr = require('newrelic')
var mongoInstrumentation = require('newrelic/lib/instrumentation/mongodb')

// Stub out the agent's built-in mongodb instrumentation
nr.instrumentDatastore('mongodb', function stub() {})
nr.instrumentDatastore('mongoose', function onRequire(shim, mongoose, name) {
  require.cache[require.resolve('mongodb')] = mongoInstrumentation(nr.agent, mongo, 'mongodb', shim);
  shim.setDatastore('Mongoose')
  shim.wrapReturn(
    mongoose.Query.prototype,
    'cursor',
    function cursorFactoryWrapper(shim, fn, fnName, cursor) {
      // Cursor methods to record as opaque queries
      var methods = ['eachAsync']
      methods.forEach(function(methodName) {
        shim.recordQuery(cursor, methodName, function wrapCursorMethod() {
          return {query: methodName, opaque: true, promise: true}
        })
      })
    }
  )
})

More methods can be added to the methods array to apply this functionality to other methods on the cursor.