How to delete duplicate rows in SQL Server?
I like CTEs and ROW_NUMBER
as the two combined allow us to see which rows are deleted (or updated), therefore just change the DELETE FROM CTE...
to SELECT * FROM CTE
:
WITH CTE AS( SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7], RN = ROW_NUMBER()OVER(PARTITION BY col1 ORDER BY col1) FROM dbo.Table1)DELETE FROM CTE WHERE RN > 1
DEMO (result is different; I assume that it's due to a typo on your part)
COL1 COL2 COL3 COL4 COL5 COL6 COL7john 1 1 1 1 1 1sally 2 2 2 2 2 2
This example determines duplicates by a single column col1
because of the PARTITION BY col1
. If you want to include multiple columns simply add them to the PARTITION BY
:
ROW_NUMBER()OVER(PARTITION BY Col1, Col2, ... ORDER BY OrderColumn)
I would prefer CTE for deleting duplicate rows from sql server table
strongly recommend to follow this article ::http://codaffection.com/sql-server-article/delete-duplicate-rows-in-sql-server/
by keeping original
WITH CTE AS(SELECT *,ROW_NUMBER() OVER (PARTITION BY col1,col2,col3 ORDER BY col1,col2,col3) AS RNFROM MyTable)DELETE FROM CTE WHERE RN<>1
without keeping original
WITH CTE AS(SELECT *,R=RANK() OVER (ORDER BY col1,col2,col3)FROM MyTable) DELETE CTEWHERE R IN (SELECT R FROM CTE GROUP BY R HAVING COUNT(*)>1)
Without using CTE
and ROW_NUMBER()
you can just delete the records just by using group by with MAX
function here is and example
DELETEFROM MyDuplicateTableWHERE ID NOT IN(SELECT MAX(ID)FROM MyDuplicateTableGROUP BY DuplicateColumn1, DuplicateColumn2, DuplicateColumn3)