Trouble wrapping head around complex SQL delete query
Begin by identifying the registrations of the other clients of a registration. Here's a view:
create view groups as select a.Client_id , c.Registration_idfrom AssociatedClient as a join AssociatedClient as b on a.Registration_id = b.Registration_id join AssociatedClient as c on b.Client_id = c.Client_id;
That gives us:
select Client_id , min(Registration_id) as first , max(Registration_id) as last , count(distinct Registration_id) as regs , count(*) as palsfrom groups group by Client_id;Client_id first last regs pals ---------- ---------- ---------- ---------- ----------2 2 8 4 5 3 2 8 4 18 4 5 5 1 1 5 2 8 4 5 7 10 10 1 1 8 9 9 1 1
You dont' need a view, of course; it's just for convenience. You could just use a virtual table. But inspect it carefully to convince yourself it produces the right range of "pal registrations" for each client. Note that the view does not reference Registration
. That's significant because it produces the same results even after we use it to delete from Registration
, so we can use it for the second delete statement.
Now we have a list of clients and their "pal registrations". What's the date of each pal's last registration?
select g.Client_id, max(Registration_date) as last_regfrom groups as g join Registration as ron g.Registration_id = r.Idgroup by g.Client_id;g.Client_id last_reg ----------- ----------2 2011-10-143 2011-10-144 2011-10-075 2011-10-147 2011-10-178 2011-10-14
Which ones have a latest date before a time certain?
select g.Client_id, max(Registration_date) as last_regfrom groups as g join Registration as ron g.Registration_id = r.Idgroup by g.Client_idhaving max(Registration_date) < '2011-10-08';g.Client_id last_reg ----------- ----------4 2011-10-07
IIUC that would mean that client #4 should be deleted, and anything he registered for should be deleted. Registrations would be
select * from Registrationwhere Id in ( select Registration_id from groups as g where Client_id in ( select g.Client_id from groups as g join Registration as r on g.Registration_id = r.Id group by g.Client_id having max(Registration_date) < '2011-10-08' ));Id Registration_date---------- -----------------5 2011-10-07
And, sure enough, client #4 is in Registration #5, and is the only client subject to deletion by this test.
From there you can work out the delete
statements. I think the rule is "delete the client and anything he registered for". If so, I'd probably write the Registration IDs to a temporary table, and write the deletes for both Registration
and AssociatedClient
by joining to it.
You want to know all registrations that need to be kept.So your first query returns registrations within 5 previous years :
SELECT IdFROM RegistrationWHERE Registration_date >= '2011-10-08'
then all registrations with clients related to the previous query :
SELECT a2.Registration_id as IdFROM AssociatedClient AS a1 INNER JOIN AssociatedClient AS a2 ON a1.Client_id = a2.Client_id WHERE a1.Registration_id IN ( SELECT Id FROM Registration WHERE Registration_date >= '2011-10-08' )
Then you have all registrations that you must not delete by combining the previous queries in an UNION
, and you want all clients that are not part of this query :
SELECT Client_idFROM AssociatedClientWHERE Registration_id NOT IN ( SELECT Id FROM Registration WHERE Registration_date >= '2011-10-08' UNION SELECT a2.Registration_id as Id FROM AssociatedClient AS a1 INNER JOIN AssociatedClient AS a2 ON a1.Client_id = a2.Client_id WHERE a1.Registration_id IN ( SELECT Id FROM Registration WHERE Registration_date >= '2011-10-08' ) )
you can see the results in this SQL fiddle
Then you can delete the lines of clients without registration correspondig to the criterias using the following query :
DELETE FROM AssociatedClientWHERE Client_id IN (<previous query>);
and all registrations not present in AssociatedClient :
DELETE FROM RegistrationWHERE Id NOT IN (SELECT Registration_id FROM AssociatedClient)
Use temporary tables.
INSERT INTO LockedClient(client_id) --select clients that should not be deletedSELECT DISTINCT ac.client_id FROM AssociatedClient acJOIN Registration r ON r.Id = ac.IDWHERE TIMESTAMPDIFF(YEAR, Reg.`Registration_date`, NOW()) >= 5;DELETE * FROM Registration r -- now delete all except locked clientsJOIN AssociatedClient ac ON ac.registration_id = r.idLEFT JOIN LockedClient lc ON lc.client_id = ac.client_idWHERE TIMESTAMPDIFF(YEAR, Reg.`Registration_date`, NOW()) >= 5 AND lc.client_id IS NULL