Removing duplicate rows (based on values from multiple columns) from SQL table
Tags: sql-server


1) Use a CTE with ROW_NUMBER() to pick, for each (ARDivisionNo, CustomerNo) pair, the record with the highest ShipToCode:

WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo
                              ORDER BY ShipToCode DESC) AS [rn]
    FROM t
)
SELECT * FROM cte WHERE [rn] = 1;

2) To delete the duplicates, replace the SELECT with a DELETE and change the WHERE clause to rn > 1. Deleting from the CTE removes the matching rows from the underlying table t:

WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo
                              ORDER BY ShipToCode DESC) AS [rn]
    FROM t
)
DELETE FROM cte WHERE [rn] > 1;

SELECT * FROM t;
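
Before running the DELETE, it can help to preview which rows have rn > 1 and confirm that the count matches expectations. The sketch below is only an illustration: it uses Python's built-in sqlite3 module (assuming a bundled SQLite of 3.25 or later, which added window functions) as a stand-in for SQL Server, and the sample rows are made up. The SELECT itself uses the same ROW_NUMBER() syntax as above.

```python
import sqlite3

# Hypothetical sample data: column names follow the question, values are invented.
rows = [
    ("00", "A100", "0001"),
    ("00", "A100", "0002"),
    ("00", "A100", "0003"),
    ("00", "B200", "0001"),
    ("01", "A100", "0002"),
    ("01", "A100", "0005"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ARDivisionNo TEXT, CustomerNo TEXT, ShipToCode TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# Rows that a DELETE ... WHERE rn > 1 would remove.
doomed = conn.execute(
    """
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo
                                  ORDER BY ShipToCode DESC) AS rn
        FROM t
    )
    SELECT ARDivisionNo, CustomerNo, ShipToCode
    FROM cte
    WHERE rn > 1
    ORDER BY ARDivisionNo, CustomerNo, ShipToCode
    """
).fetchall()

# Sanity check: one row survives per (ARDivisionNo, CustomerNo) pair,
# so the number of doomed rows should be total rows minus distinct pairs.
total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
groups = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT ARDivisionNo, CustomerNo FROM t)"
).fetchone()[0]

print(doomed)
print(total, groups, len(doomed) == total - groups)
```

If the count is off, the PARTITION BY columns probably do not match what you consider a duplicate.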


You didn't specify your version of SQL Server, but ROW_NUMBER is available in SQL Server 2005 and later:

select *
from (
    select ...
        ,row_number()
            over (partition by ARDivisionNo, CustomerNo
                  order by ShipToCode desc) as rn
    from tab
) as dt
where rn = 1


ROW_NUMBER() is great for this:

;WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo
                              ORDER BY ShipToCode DESC) AS RN
    FROM AR_Customer_ShipTo
)
SELECT *
FROM cte
WHERE RN = 1

You mention removing the duplicates; if you want to DELETE them, you can simply run:

;WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ARDivisionNo, CustomerNo
                              ORDER BY ShipToCode DESC) AS RN
    FROM AR_Customer_ShipTo
)
DELETE cte
WHERE RN > 1

The ROW_NUMBER() function assigns a sequential number to each row. PARTITION BY is optional; it restarts the numbering for each distinct value of a given field or group of fields. For example, if you PARTITION BY Some_Date, the numbering starts over at 1 for each unique date value. ORDER BY defines the order in which rows are numbered within each partition, and it is required in the ROW_NUMBER() function.
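
To make the effect of PARTITION BY concrete, here is a small sketch that numbers the same rows with and without it. It runs against an in-memory SQLite database through Python's sqlite3 module (assuming SQLite 3.25 or later), not SQL Server; the table name demo, the Item column, and the data are hypothetical, with Some_Date borrowed from the example above.

```python
import sqlite3

# Hypothetical table and data, used only to show how the numbering behaves.
rows = [
    ("2024-01-01", "a"),
    ("2024-01-01", "b"),
    ("2024-01-02", "c"),
    ("2024-01-02", "d"),
    ("2024-01-02", "e"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (Some_Date TEXT, Item TEXT)")
conn.executemany("INSERT INTO demo VALUES (?, ?)", rows)

numbered = conn.execute(
    """
    SELECT Some_Date,
           Item,
           -- No PARTITION BY: one running count over the whole result set.
           ROW_NUMBER() OVER (ORDER BY Some_Date, Item) AS rn_all,
           -- PARTITION BY Some_Date: the count restarts at 1 for each date.
           ROW_NUMBER() OVER (PARTITION BY Some_Date ORDER BY Item) AS rn_per_date
    FROM demo
    ORDER BY Some_Date, Item
    """
).fetchall()

for row in numbered:
    print(row)
```

The rn_all column counts from 1 to 5 across all rows, while rn_per_date runs 1, 2 for the first date and 1, 2, 3 for the second. Filtering on rn_per_date = 1 is what keeps a single row per group in the queries above.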