https://stackoverflow.com/questions/44248108/mongodb-error-getmore-command-failed-cursor-not-found
EDIT - Query performance:
As @NeilLunn pointed out in his comments, you should not be filtering the documents manually, but use .find(...) for that instead:

db.snapshots.find({
    roundedDate: { $exists: true },
    stream: { $exists: true },
    sid: { $exists: false }
})
Also, using .bulkWrite(), available from MongoDB 3.2, will be far more performant than doing individual updates.
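As a rough sketch of what that could look like (the update payload here just reuses the sid format from the retry example further down; adjust it to your actual needs):

const operations = db.snapshots.find({
    roundedDate: { $exists: true },
    stream: { $exists: true },
    sid: { $exists: false }
}).toArray().map(doc => ({
    // One updateOne operation per matching document, sent in a single
    // bulkWrite instead of one round trip per update.
    updateOne: {
        filter: { _id: doc._id },
        update: { $set: { sid: `${ doc.stream.valueOf() }-${ doc.roundedDate }` } }
    }
}));

// Note: toArray() loads all matching documents into memory; for very large
// result sets you would want to build and flush these operations in chunks.
db.snapshots.bulkWrite(operations);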
It is possible that, with that, you are able to execute your query within the 10-minute lifetime of the cursor. If it still takes longer than that, your cursor will expire and you will have the same problem anyway, which is explained below:
What is going on here:
Error: getMore command failed may be due to a cursor timeout, which is related to two cursor attributes:

- Timeout limit, which is 10 minutes by default. From the docs:

By default, the server will automatically close the cursor after 10 minutes of inactivity, or if client has exhausted the cursor.

- Batch size, which is 101 documents or 16 MB for the first batch, and 16 MB, regardless of the number of documents, for subsequent batches (as of MongoDB 3.4). From the docs:

find() and aggregate() operations have an initial batch size of 101 documents by default. Subsequent getMore operations issued against the resulting cursor have no default batch size, so they are limited only by the 16 megabyte message size.
You are probably consuming those initial 101 documents and then getting a 16 MB batch, which is the maximum, with a lot more documents. As it is taking more than 10 minutes to process them, the cursor on the server times out and, by the time you are done processing the documents in the second batch and request a new one, the cursor is already closed:

As you iterate through the cursor and reach the end of the returned batch, if there are more results, cursor.next() will perform a getMore operation to retrieve the next batch.
Possible solutions:
I see 5 possible ways to solve this, 3 good ones, with their pros and cons, and 2 bad ones:
1. 👍 Reducing the batch size to keep the cursor alive.
2. 👍 Remove the timeout from the cursor.
3. 👍 Retry when the cursor expires.
4. 👎 Query the results in batches manually.
5. 👎 Get all the documents before the cursor expires.
Note they are not numbered following any specific criteria. Read through them and decide which one works best for your particular case.
1. 👍 Reducing the batch size to keep the cursor alive
One way to solve this is to use cursor.batchSize to set the batch size on the cursor returned by your find query to match what you can process within those 10 minutes:

const cursor = db.collection.find()
    .batchSize(NUMBER_OF_DOCUMENTS_IN_BATCH);
However, keep in mind that setting a very conservative (small) batch size will probably work, but will also be slower, as now you need to access the server more times.

On the other hand, setting it to a value too close to the number of documents you can process in 10 minutes means that it is possible that if some iterations take a bit longer to process for any reason (other processes may be consuming more resources), the cursor will expire anyway and you will get the same error again.
2. 👍 Remove the timeout from the cursor
Another option is to use cursor.noCursorTimeout to prevent the cursor from timing out:

const cursor = db.collection.find().noCursorTimeout();
This is considered a bad practice as you would need to close the cursor manually or exhaust all its results so that it is automatically closed:

After setting the noCursorTimeout option, you must either close the cursor manually with cursor.close() or by exhausting the cursor's results.
As you want to process all the documents in the cursor, you wouldn't need to close it manually, but it is still possible that something else goes wrong in your code and an error is thrown before you are done, thus leaving the cursor open.

If you still want to use this approach, use a try-catch to make sure you close the cursor if anything goes wrong before you consume all its documents.
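A minimal sketch of what that could look like (the processing logic inside the loop is just a placeholder):

const cursor = db.collection.find().noCursorTimeout();

try {
    while (cursor.hasNext()) {
        const doc = cursor.next();
        // ... process doc ...
    }
} catch (err) {
    // Something went wrong before the cursor was exhausted, so close it
    // explicitly to avoid leaving it open on the server.
    cursor.close();
    throw err;
}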
Note I don't consider this a bad solution (hence the 👍), as even though it is considered a bad practice...:

- It is a feature supported by the driver. If it were that bad, given there are alternative ways to get around timeout issues, as explained in the other solutions, it wouldn't be supported.

- There are ways to use it safely; it's just a matter of being extra cautious with it.

- I assume you are not running this kind of query regularly, so the chances that you start leaving open cursors everywhere are low. If this is not the case, and you really need to deal with these situations all the time, then it does make sense not to use noCursorTimeout.
3. 👍 Retry when the cursor expires
Basically, you put your code in a try-catch and, when you get the error, you get a new cursor skipping the documents that you have already processed:

let processed = 0;
let updated = 0;

while (true) {
    const cursor = db.snapshots.find().sort({ _id: 1 }).skip(processed);

    try {
        while (cursor.hasNext()) {
            const doc = cursor.next();

            ++processed;

            if (doc.stream && doc.roundedDate && !doc.sid) {
                db.snapshots.update({
                    _id: doc._id
                }, { $set: {
                    sid: `${ doc.stream.valueOf() }-${ doc.roundedDate }`
                }});

                ++updated;
            }
        }

        break; // Done processing all, exit outer loop
    } catch (err) {
        if (err.code !== 43) {
            // Something else than a timeout went wrong. Abort loop.
            throw err;
        }
    }
}
Note you need to sort the results for this solution to work.

With this approach you are minimizing the number of requests to the server by using the maximum possible batch size of 16 MB, without having to guess how many documents you will be able to process in 10 minutes beforehand. Therefore, it is also more robust than the previous approach.
4. 👎 Query the results in batches manually
Basically, you use skip(), limit() and sort() to do multiple queries with a number of documents you think you can process in 10 minutes.
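For reference, a minimal sketch of what that manual batching could look like (BATCH_SIZE is a hypothetical placeholder for however many documents you think you can process in 10 minutes):

const BATCH_SIZE = 1000; // Hypothetical value; tune it to your processing speed.
let skipped = 0;

while (true) {
    // Each query gets its own short-lived cursor, so no single cursor has to
    // survive the whole run.
    const batch = db.snapshots.find()
        .sort({ _id: 1 })
        .skip(skipped)
        .limit(BATCH_SIZE)
        .toArray();

    if (batch.length === 0) break; // No documents left.

    batch.forEach(doc => {
        // ... process doc ...
    });

    skipped += batch.length;
}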
I consider this a bad solution because the driver already has the option to set the batch size, so there's no reason to do this manually; just use solution 1 and don't reinvent the wheel.
Also, it is worth mentioning that it has the same drawbacks as solution 1.
5. 👎 Get all the documents before the cursor expires
Probably your code is taking some time to execute due to results processing, so you could retrieve all the documents first and then process them:

const results = db.snapshots.find().toArray();
This will retrieve all the batches one after another and close the cursor. Then, you can loop through all the documents inside results and do what you need to do.
However, if you are having timeout issues, chances are that your result set is quite large, so pulling everything into memory may not be the most advisable thing to do.
Note about snapshot mode and duplicate documents
It is possible that some documents are returned multiple times if intervening write operations move them due to a growth in document size. To solve this, use cursor.snapshot(). From the docs:
Append the snapshot() method to a cursor to toggle the “snapshot” mode. This ensures that the query will not return a document multiple times, even if intervening write operations result in a move of the document due to the growth in document size.
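In this case, that would be something along these lines (a sketch, appending snapshot() to the query from this question):

const cursor = db.snapshots.find({
    roundedDate: { $exists: true },
    stream: { $exists: true },
    sid: { $exists: false }
}).snapshot();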
However, keep in mind its limitations:

- It doesn't work with sharded collections.

- It doesn't guarantee isolation from insertions or deletions.
Note that with solution 5 the time window for a move of documents that may cause duplicate document retrieval is narrower than with the other solutions, so you may not need snapshot().
In your particular case, as the collection is called snapshot, it is probably not likely to change, so you probably don't need snapshot(). Moreover, you are doing updates on documents based on their data and, once the update is done, that same document will not be updated again even though it is retrieved multiple times, as the if condition will skip it.
Note about open cursors
To see a count of open cursors use db.serverStatus().metrics.cursor.
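For example, in the mongo shell (the field values shown here are just illustrative):

db.serverStatus().metrics.cursor
// {
//     "timedOut" : NumberLong(1),
//     "open" : {
//         "noTimeout" : NumberLong(0),
//         "pinned" : NumberLong(0),
//         "total" : NumberLong(2)
//     }
// }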