SQL - remove duplicates from left join
This will give the results you requested and should have the best performance.
SELECT OBJECTID , PKID , t2.SUBDIVISIO, , t2.ASBUILT1FROM dbo.table1 AS t1OUTER APPLY ( SELECT TOP 1 * FROM dbo.table2 AS t2 WHERE t1.PKID = t2.FKID ) AS t2
Your original query is producing arbitrary values for the two columns (the use of top
with no order by
). You can get the same effect with this:
SELECT t1.OBJECTID, t1.PKID, t2.ProjectName, t2.AsbuiltFROM dbo.table1 t1 LEFT JOIN (SELECT FKID, min(ProjectName) as ProjectName, MIN(asBuilt) as AsBuilt FROM dbo.table2 group by fkid ) t2 ON t1.PKID = t2.FKID
This version replaces the distinct
with a group by
.
To get a truly random row in SQL Server (which your syntax suggests you are using), try this:
SELECT t1.OBJECTID, t1.PKID, t2.ProjectName, t2.AsbuiltFROM dbo.table1 t1 LEFT JOIN (SELECT FKID, ProjectName, AsBuilt, ROW_NUMBER() over (PARTITION by fkid order by newid()) as seqnum FROM dbo.table2 ) t2 ON t1.PKID = t2.FKID and t2.seqnum = 1
This assumes version 2005 or greater.
If you want described result, you need to use INNER JOIN
and following query will satisfy your need:
SELECT t1.OID, t1.PKID, MAX(t2.ProjectName) AS ProjectName, MAX(t2.Asbuilt) AS AsbuiltFROM table1 t1JOIN table2 t2 ON t1.PKID = t2.FKIDGROUP BY t1.OID, t1.PKID
If you want to see all rows from left table (table1) whether it has pair in right table or not, then use LEFT JOIN
and same query will gave you desired result.
EDITED
This construction has good performance, and you dont need to use subqueries.