Find duplicate records in MongoDB

Group the documents by name with an aggregation, then keep only the names whose count is greater than 1:

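db.collection.aggregate([
    // Count how many documents share each name
    {"$group": {"_id": "$name", "count": {"$sum": 1}}},
    // Keep only non-null names that occur more than once
    {"$match": {"_id": {"$ne": null}, "count": {"$gt": 1}}},
    // Return the name under its own key instead of _id
    {"$project": {"name": "$_id", "_id": 0}}
]);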

To sort the results from most to least duplicates:

db.collection.aggregate([
    {"$group": {"_id": "$name", "count": {"$sum": 1}}},
    {"$match": {"_id": {"$ne": null}, "count": {"$gt": 1}}},
    // Sort before $project, because $project drops the count field
    {"$sort": {"count": -1}},
    {"$project": {"name": "$_id", "_id": 0}}
]);

To check a field other than "name", replace "$name" with "$" followed by that field's name, for example "$column_name".
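If you need this for several fields, the pipeline can be built by a small function. This is only a sketch: duplicatePipeline is a made-up helper name, "email" is just an example field, and it assumes a simple top-level field.

```javascript
// Hypothetical helper (not part of MongoDB): builds the same pipeline
// as above for an arbitrary top-level field name.
function duplicatePipeline(field) {
  return [
    // Count how many documents share each value of the field
    { $group: { _id: "$" + field, count: { $sum: 1 } } },
    // Keep only non-null values that occur more than once
    { $match: { _id: { $ne: null }, count: { $gt: 1 } } },
    // Most duplicated values first
    { $sort: { count: -1 } },
    // Return the value under the field's own name instead of _id
    { $project: { [field]: "$_id", _id: 0 } }
  ];
}

// Usage in the mongo shell, assuming a collection called "collection":
// db.collection.aggregate(duplicatePipeline("email"));
```

The function only returns a plain array, so you can print it and inspect the stages before running it against real data.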


You can get the list of duplicate names as a single array with the following aggregation pipeline:

Group all the records that have the same name.
Match the groups that contain more than one record.
Group again to collect all the duplicate names into one array.

The Code:

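db.collection.aggregate([
    // Group by name and count the records in each group
    {$group: {"_id": "$name", "name": {$first: "$name"}, "count": {$sum: 1}}},
    // Keep only the groups with more than one record
    {$match: {"count": {$gt: 1}}},
    {$project: {"name": 1, "_id": 0}},
    // Collect every duplicate name into a single array
    {$group: {"_id": null, "duplicateNames": {$push: "$name"}}},
    {$project: {"_id": 0, "duplicateNames": 1}}
])

Output: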

{ "duplicateNames" : [ "ksqn291", "ksqn29123213Test" ] }
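To see what the pipeline computes without a database, the same three steps can be mirrored in plain JavaScript. This is an illustration only, not how MongoDB runs the aggregation: findDuplicateNames and the sample documents are made up, and the sketch skips documents that have no name, which may differ from how the pipeline treats them.

```javascript
// Plain-JavaScript mirror of the group / match / collect steps.
// Hypothetical helper for illustration; it does not talk to MongoDB.
function findDuplicateNames(docs) {
  // Step 1: group by name and count the records in each group
  const counts = new Map();
  for (const doc of docs) {
    // Assumption: documents without a name are ignored in this sketch
    if (doc.name === undefined || doc.name === null) continue;
    counts.set(doc.name, (counts.get(doc.name) || 0) + 1);
  }
  // Steps 2 and 3: keep groups with more than one record and
  // collect their names into a single array
  const duplicateNames = [];
  for (const [name, count] of counts) {
    if (count > 1) duplicateNames.push(name);
  }
  return { duplicateNames };
}

// Made-up sample documents
const sample = [
  { name: "ksqn291" },
  { name: "unique" },
  { name: "ksqn291" },
  { name: "ksqn29123213Test" },
  { name: "ksqn29123213Test" },
  { other: 1 }
];
console.log(findDuplicateNames(sample));
```

A Map keeps insertion order, so the names come out in the order they were first seen. The order of names in the real pipeline's output is not something to rely on.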


The answer anhic gave can be very inefficient if you have a large database and the name attribute is present in only some of the documents.

To improve efficiency, add a $match stage at the start of the aggregation so that documents without a name are filtered out before they are grouped.

db.collection.aggregate([
    // Drop documents with a missing or null name before grouping
    {"$match": {"name": {"$ne": null}}},
    {"$group": {"_id": "$name", "count": {"$sum": 1}}},
    {"$match": {"count": {"$gt": 1}}},
    {"$project": {"name": "$_id", "_id": 0}}
])
