moveChunk failed to engage TO-shard in the data transfer: can't accept new chunks because


It's not common to see this kind of issue, but I have seen it occur sporadically.

The best remedial action here is to step down the primary of the referenced TO shard, which will clear out the background deletes. The delete threads only exist on the current primary (the deletes themselves are replicated from that primary via the oplog as they are processed). When you step it down, it becomes a secondary, the threads can no longer write, and you get a new primary with no pending deletes. You may wish to restart the former primary after the step down to clear out old cursors, but it's not usually urgent.
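As a rough sketch of the step-down described above, the following could be run from a mongo shell connected directly to the current primary of the TO shard (not through mongos). The helper name `stepDownPrimary` and the 60-second value are illustrative assumptions, not anything taken from the error message:

```javascript
// Ask the node behind `adminDb` to step down and stay ineligible for
// election for `seconds` seconds. `adminDb` is assumed to be the admin
// database on the TO shard's current primary, e.g. db.getSiblingDB("admin").
function stepDownPrimary(adminDb, seconds) {
  return adminDb.runCommand({ replSetStepDown: seconds });
}

// Usage in a mongo shell connected to that primary:
//   stepDownPrimary(db.getSiblingDB("admin"), 60);
// The built-in shell helper rs.stepDown(60) issues the same command.
```

Depending on the server version, the shell connection may be closed during the step-down, so an error followed by an automatic reconnect is not necessarily a sign that the command failed. Checking rs.status() afterwards should confirm that a different member is now primary.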

Once you do this, you will be left with a large number of orphaned documents, which can be addressed with the cleanupOrphaned command. I would recommend running it at low-traffic times (if you have such times).
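A minimal sketch of the orphan cleanup, assuming a MongoDB version where the command accepts `startingFromKey` and returns `stoppedAtKey` while ranges remain. The function name `cleanupAllOrphans` and the namespace "mydb.mycoll" are placeholders. It is meant to be run against the admin database on the primary of each affected shard, not through mongos:

```javascript
// Repeatedly run cleanupOrphaned until the whole shard key range is covered.
// `adminDb` is assumed to be the admin database of the shard's primary;
// `ns` is the sharded collection's namespace, e.g. "mydb.mycoll" (placeholder).
function cleanupAllOrphans(adminDb, ns) {
  var nextKey = {};
  var passes = 0;
  while (nextKey != null) {
    var result = adminDb.runCommand({
      cleanupOrphaned: ns,
      startingFromKey: nextKey
    });
    if (result.ok != 1) {
      // Stop on failure or timeout rather than looping forever.
      throw new Error("cleanupOrphaned failed: " + JSON.stringify(result));
    }
    passes++;
    // stoppedAtKey is absent once the last range has been processed,
    // which ends the loop.
    nextKey = result.stoppedAtKey;
  }
  return passes;
}

// Usage in a mongo shell connected to the shard's primary:
//   cleanupAllOrphans(db.getSiblingDB("admin"), "mydb.mycoll");
```

Each pass cleans one contiguous range of orphans, so the loop can take a while on a shard with many pending deletes, which is another reason to schedule it for a quiet period.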

For reference, if this is a recurring problem, then it is likely that the primaries are struggling a little in terms of load. To avoid deletes queuing up, you can set the _waitForDelete option for the balancer to true (it is false by default) as follows:

    use config
    db.settings.update(
       { "_id" : "balancer" },
       { $set : { "_waitForDelete" : true } },
       { upsert : true }
    )

This will mean that each migration is slower (perhaps significantly so) but will not cause the background deletes to accumulate.