Finding duplicate values in a SQL table Finding duplicate values in a SQL table sql sql

Finding duplicate values in a SQL table


SELECT    name, email, COUNT(*)FROM    usersGROUP BY    name, emailHAVING     COUNT(*) > 1

Simply group on both of the columns.

Note: the older ANSI standard is to have all non-aggregated columns in the GROUP BY but this has changed with the idea of "functional dependency":

In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.

Support is not consistent:


try this:

declare @YourTable table (id int, name varchar(10), email varchar(50))INSERT @YourTable VALUES (1,'John','John-email')INSERT @YourTable VALUES (2,'John','John-email')INSERT @YourTable VALUES (3,'fred','John-email')INSERT @YourTable VALUES (4,'fred','fred-email')INSERT @YourTable VALUES (5,'sam','sam-email')INSERT @YourTable VALUES (6,'sam','sam-email')SELECT    name,email, COUNT(*) AS CountOf    FROM @YourTable    GROUP BY name,email    HAVING COUNT(*)>1

OUTPUT:

name       email       CountOf---------- ----------- -----------John       John-email  2sam        sam-email   2(2 row(s) affected)

if you want the IDs of the dups use this:

SELECT    y.id,y.name,y.email    FROM @YourTable y        INNER JOIN (SELECT                        name,email, COUNT(*) AS CountOf                        FROM @YourTable                        GROUP BY name,email                        HAVING COUNT(*)>1                    ) dt ON y.name=dt.name AND y.email=dt.email

OUTPUT:

id          name       email----------- ---------- ------------1           John       John-email2           John       John-email5           sam        sam-email6           sam        sam-email(4 row(s) affected)

to delete the duplicates try:

DELETE d    FROM @YourTable d        INNER JOIN (SELECT                        y.id,y.name,y.email,ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank                        FROM @YourTable y                            INNER JOIN (SELECT                                            name,email, COUNT(*) AS CountOf                                            FROM @YourTable                                            GROUP BY name,email                                            HAVING COUNT(*)>1                                        ) dt ON y.name=dt.name AND y.email=dt.email                   ) dt2 ON d.id=dt2.id        WHERE dt2.RowRank!=1SELECT * FROM @YourTable

OUTPUT:

id          name       email----------- ---------- --------------1           John       John-email3           fred       John-email4           fred       fred-email5           sam        sam-email(4 row(s) affected)


Try this:

SELECT name, emailFROM usersGROUP BY name, emailHAVING ( COUNT(*) > 1 )