Delete duplicate records in SQL Server?
You can do this with window functions. It will order the dupes by empId, and delete all but the first one.
delete x from ( select *, rn=row_number() over (partition by EmployeeName order by empId) from Employee ) xwhere rn > 1;
Run it as a select to see what would be deleted:
select *from ( select *, rn=row_number() over (partition by EmployeeName order by empId) from Employee ) xwhere rn > 1;
Assuming that your Employee table also has a unique column (ID
in the example below), the following will work:
delete from Employee where ID not in( select min(ID) from Employee group by EmployeeName );
This will leave the version with the lowest ID in the table.
Edit
Re McGyver's comment - as of SQL 2012
MIN
can be used with numeric, char, varchar, uniqueidentifier, or datetime columns, but not with bit columns
For 2008 R2 and earlier,
MIN can be used with numeric, char, varchar, or datetime columns, but not with bit columns (and it also doesn't work with GUID's)
For 2008R2 you'll need to cast the GUID
to a type supported by MIN
, e.g.
delete from GuidEmployeeswhere CAST(ID AS binary(16)) not in( select min(CAST(ID AS binary(16))) from GuidEmployees group by EmployeeName );
You could try something like the following:
delete T1from MyTable T1, MyTable T2where T1.dupField = T2.dupFieldand T1.uniqueField > T2.uniqueField
(this assumes that you have an integer based unique field)
Personally though I'd say you were better off trying to correct the fact that duplicate entries are being added to the database before it occurs rather than as a post fix-it operation.