MySQL delete duplicate records but keep latest


Imagine your table test contains the following data:

    select id, email
      from test;

    ID   EMAIL
    ---  -----
    1    aaa
    2    bbb
    3    ccc
    4    bbb
    5    ddd
    6    eee
    7    aaa
    8    aaa
    9    eee
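To try the queries in this answer, the sample data can be recreated with a script along these lines. This is only a sketch: the column types and the primary key are assumptions, since the question shows the data but not the table definition.

```sql
-- Hypothetical schema: the column types and key are assumptions,
-- only the column names and the rows come from the question.
CREATE TABLE test (
    id    INT         NOT NULL PRIMARY KEY,
    email VARCHAR(20) NOT NULL
);

-- The nine sample rows shown above.
INSERT INTO test (id, email) VALUES
    (1, 'aaa'), (2, 'bbb'), (3, 'ccc'),
    (4, 'bbb'), (5, 'ddd'), (6, 'eee'),
    (7, 'aaa'), (8, 'aaa'), (9, 'eee');
```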

So, we need to find every repeated email and delete all of its rows except the one with the latest (highest) id.
In this case aaa, bbb and eee are repeated, so we want to delete IDs 1 and 7 (aaa), 2 (bbb) and 6 (eee).

To accomplish this, first we need to find all the repeated emails:

    select email
      from test
     group by email
    having count(*) > 1;

    EMAIL
    -----
    aaa
    bbb
    eee

Then, from this dataset, we need to find the latest id for each one of these repeated emails:

    select max(id) as lastId, email
      from test
     where email in (
           select email
             from test
            group by email
           having count(*) > 1
     )
     group by email;

    LASTID  EMAIL
    ------  -----
    8       aaa
    4       bbb
    9       eee

Finally, we can delete every row for these emails whose id is smaller than LASTID. So the solution is:

    delete test
      from test
     inner join (
           select max(id) as lastId, email
             from test
            where email in (
                  select email
                    from test
                   group by email
                  having count(*) > 1
            )
            group by email
     ) duplic on duplic.email = test.email
     where test.id < duplic.lastId;

I don't have MySQL installed on this machine right now, but this should work.
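Since a delete is hard to undo, it may be worth previewing the affected rows first. One way to do that, sketched here, is to run the same join as a select instead of a delete; the expected rows in the comment assume the sample data from the question.

```sql
-- Same join and filter as the delete above, but only listing the rows.
select test.id, test.email, duplic.lastId
  from test
 inner join (
       select max(id) as lastId, email
         from test
        where email in (
              select email
                from test
               group by email
              having count(*) > 1
        )
        group by email
 ) duplic on duplic.email = test.email
 where test.id < duplic.lastId
 order by test.id;

-- With the sample data this lists the rows to be removed:
--   id 1 (aaa, lastId 8), id 2 (bbb, lastId 4),
--   id 6 (eee, lastId 9), id 7 (aaa, lastId 8)
```

If the list matches what you expect, the delete with the same join and where clause removes exactly those rows.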

Update

The above delete works, but I found a more optimized version:

    delete test
      from test
     inner join (
           select max(id) as lastId, email
             from test
            group by email
           having count(*) > 1
     ) duplic on duplic.email = test.email
     where test.id < duplic.lastId;

You can see that it deletes the oldest duplicates, i.e. 1, 7, 2, 6:

    select * from test;

    +----+-------+
    | id | email |
    +----+-------+
    |  3 | ccc   |
    |  4 | bbb   |
    |  5 | ddd   |
    |  8 | aaa   |
    |  9 | eee   |
    +----+-------+
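As a quick sanity check after the delete, the duplicate-finding query from the first step can be run again. This is a sketch of that check; the `copies` alias is just an illustrative name.

```sql
-- Re-run the duplicate check; after a successful cleanup
-- no email should appear more than once.
select email, count(*) as copies
  from test
 group by email
having count(*) > 1;

-- An empty result set means no duplicated emails remain.
```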

Another version is the delete provided by Rene Limon:

    delete from test
     where id not in (
           select max(id)
             from test
            group by email
     )


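Try this method, a self-join that deletes every row for which another row with the same email and a higher id exists, so only the latest row per email is kept:

    DELETE t1
      FROM test t1, test t2
     WHERE t1.id < t2.id
       AND t1.email = t2.email;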


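The correct way is to wrap the subquery in a derived table (MySQL rejects a subquery that selects directly from the table being deleted from), and the derived table must have an alias:

    DELETE FROM `tablename`
     WHERE id NOT IN (
           SELECT * FROM (
                  SELECT MAX(id)
                    FROM tablename
                   GROUP BY name
           ) AS keep_ids
     )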