How to combine these two mysql queries to preserve the speed of it?
IN ( SELECT ... )
is usually vary inefficient. Avoid it.
All the answers so far are working harder than they need to. It seems like the JOINs
are unnecessary until after the UNION
. See more Notes below.
SELECT Ads.AdId FROM Ads, JOIN ( ( SELECT AdId FROM AdsGeometry WHERE ST_CONTAINS(GeomFromText('Polygon(( -4.9783515930176 36.627100703563, -5.0075340270996 36.61222072018, -4.9896812438965 36.57638676015, -4.965991973877 36.579419508882, -4.955005645752 36.617732160006, -4.9783515930176 36.627100703563 ))'), AdsGeometry.GeomPoint) AND AdId > 1000000 ) UNION DISTINCT ( SELECT ads_AdId AS AdId FROM AdsHierarchy WHERE locations_LocationId = 148022797 AND ads_AdId > 1000000 ) ) AS nt ON Ads.AdId = nt.AdId ORDER BY Ads.ModifiedDate ASC
Notes:
- Both
AdsGeometry
andAdsHierarchy
have adId (by different names); there is no need to do theJOIN
in the inner queries, except possibly to verify that it exists inAds
. Is that an issue? My query will take care of in the outerSELECT's JOIN
, anyway. UNION DISTINCT
is needed because the twoSELECTs
may fetch the same ids.- Move the
> 1000000
inside to cut down on the number of values gathered by theUNION
. - The
UNION
will always (in older versions of MySQL) or sometimes (in newer versions) create a temp table. You are stuck with that. IN ( SELECT ... )
usually optimizes terribly; avoid it.- I added some parentheses; It is possible (but not necessary at the moment) to add
ORDER BY
, etc, to theUNION
; the parens make it clear what it would belong to. - The only reason for the outer query is to get
ModifiedDate
for ordering. You could speed things up by removing that requirement. (TheUNION
probably creates a tmp table; thisORDER BY
probably creates another.)
Both your UNION
queries does extra work by searching for the results already found by doing
SELECT Ads.AdId FROM Ads WHERE AdId IN ...
orSELECT Ads.AdId FROM Ads, (SELECT Ads.AdId ...) AS nt WHERE Ads.AdId = nt.AdId
Also SELECT Ads.AdId FROM Ads, ... GROUP BY Ads
will probably be more efficient, as well as easier to understand if written as SELECT DISTINCT Ads.AdID FROM Ads, ...
Thus, this should give a better query:
SELECT DISTINCT AdId FROM (SELECT Ads.AdId FROM Ads INNER JOIN AdsGeometry ON AdsGeometry.AdId = Ads.AdId WHERE ST_CONTAINS(GeomFromText('Polygon(( -4.9783515930176 36.627100703563, -5.0075340270996 36.61222072018, -4.9896812438965 36.57638676015, -4.965991973877 36.579419508882, -4.955005645752 36.617732160006, -4.9783515930176 36.627100703563 ))'), AdsGeometry.GeomPoint) UNION ALL SELECT Ads.AdId FROM Ads INNER JOIN AdsHierarchy ON Ads.AdId = AdsHierarchy.ads_AdId WHERE AdsHierarchy.locations_LocationId = 148022797) AS sub WHERE AdId > 100000
Some databases have different performance for INNER JOIN and LEFT OUTER JOIN.Just try next request and if it is slow please add EXPLAIN before SELECT and provide results.
SELECT DISTINCT Ads.AdIdFROM AdsLEFT OUTER JOIN AdsGeometry ag ON ag.AdId = Ads.AdIdLEFT OUTER JOIN AdsHierarchy ah ON ah.ads_AdId = Ads.AdIdWHERE ah.locations_LocationId = 148022797 OR (ST_CONTAINS(GeomFromText('Polygon(( -4.9783515930176 36.627100703563, -5.0075340270996 36.61222072018, -4.9896812438965 36.57638676015, -4.965991973877 36.579419508882, -4.955005645752 36.617732160006, -4.9783515930176 36.627100703563 ))'), ag.GeomPoint))