Finding duplicate values in a SQL table
SELECT name, email, COUNT(*)FROM usersGROUP BY name, emailHAVING COUNT(*) > 1
Simply group on both of the columns.
Note: the older ANSI standard is to have all non-aggregated columns in the GROUP BY but this has changed with the idea of "functional dependency":
In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.
Support is not consistent:
- Recent PostgreSQL supports it.
- SQL Server (as at SQL Server 2017) still requires all non-aggregated columns in the GROUP BY.
- MySQL is unpredictable and you need
sql_mode=only_full_group_by
:- GROUP BY lname ORDER BY showing wrong results;
- Which is the least expensive aggregate function in the absence of ANY() (see comments in accepted answer).
- Oracle isn't mainstream enough (warning: humour, I don't know about Oracle).
try this:
declare @YourTable table (id int, name varchar(10), email varchar(50))INSERT @YourTable VALUES (1,'John','John-email')INSERT @YourTable VALUES (2,'John','John-email')INSERT @YourTable VALUES (3,'fred','John-email')INSERT @YourTable VALUES (4,'fred','fred-email')INSERT @YourTable VALUES (5,'sam','sam-email')INSERT @YourTable VALUES (6,'sam','sam-email')SELECT name,email, COUNT(*) AS CountOf FROM @YourTable GROUP BY name,email HAVING COUNT(*)>1
OUTPUT:
name email CountOf---------- ----------- -----------John John-email 2sam sam-email 2(2 row(s) affected)
if you want the IDs of the dups use this:
SELECT y.id,y.name,y.email FROM @YourTable y INNER JOIN (SELECT name,email, COUNT(*) AS CountOf FROM @YourTable GROUP BY name,email HAVING COUNT(*)>1 ) dt ON y.name=dt.name AND y.email=dt.email
OUTPUT:
id name email----------- ---------- ------------1 John John-email2 John John-email5 sam sam-email6 sam sam-email(4 row(s) affected)
to delete the duplicates try:
DELETE d FROM @YourTable d INNER JOIN (SELECT y.id,y.name,y.email,ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank FROM @YourTable y INNER JOIN (SELECT name,email, COUNT(*) AS CountOf FROM @YourTable GROUP BY name,email HAVING COUNT(*)>1 ) dt ON y.name=dt.name AND y.email=dt.email ) dt2 ON d.id=dt2.id WHERE dt2.RowRank!=1SELECT * FROM @YourTable
OUTPUT:
id name email----------- ---------- --------------1 John John-email3 fred John-email4 fred fred-email5 sam sam-email(4 row(s) affected)