LEFT OUTER JOIN with subquery syntax
Lets build this up slowly.
First, lets see about getting just the information about stars:
SELECT name AS starName, (class + 7) * intensity * 1000000 AS starTemp FROM StarsWHERE starId < 100
(this should look might familiar!)
We get a list of all stars whose starId
is less than 100 (the WHERE
clause), grabbing the name and calculating temperature. At this point, we don't need a disambiguating reference to source.
Next, we need to add planet information. What about an INNER JOIN
(note that the actual keyword INNER
is optional)?
SELECT Stars.name as starName, (Stars.class + 7) * Stars.intensity * 1000000 AS starTemp, Planets.name as planetNameFROM StarsINNER JOIN Planets ON Planets.starId = Stars.starIdWHERE Stars.starId < 100
The ON
clause is using an =
(equals) condition to link planets to the star they orbit; otherwise, we'd be saying they were orbiting more than one star, which is very unusual! Each star is listed once for every planet it has, but that's expected.
...Except now we have a problem: Some of our stars from the first query disappeared! The (INNER) JOIN
is causing only stars with at least one planet to be reported. But we still need to report stars without any planets! So what about a LEFT (OUTER) JOIN
?
SELECT Stars.name as starName, (Stars.class + 7) * Stars.intensity * 1000000 AS starTemp, Planets.name as planetNameFROM StarsLEFT JOIN Planets ON Planets.starId = Stars.starIdWHERE Stars.starId < 100
... And we have all the stars back, with planetName
being null
(and only appearing once) if there are no planets for that star. Good so far!
Now we need to add the planet temperature. Should be simple:
SELECT Stars.name as starName, (Stars.class + 7) * Stars.intensity * 1000000 AS starTemp, Planets.name as planetName, starTemp - (50 * Planets.orbitDistance) as planetTempFROM StarsLEFT JOIN Planets ON Planets.starId = Stars.starIdWHERE Stars.starId < 100
...except that on most RDBMSs, you'll get a syntax error stating the system can't find starTemp
. What's going on? The problem is that the new column alias (name) isn't (usually) available until after the SELECT
part of the statement runs. Which means we need to put in the calculation again:
SELECT Stars.name as starName, (Stars.class + 7) * Stars.intensity * 1000000 AS starTemp, Planets.name as planetName, ((Stars.class + 7) * Stars.intensity * 1000000) - (50 * Planets.orbitDistance) as planetTempFROM StarsLEFT JOIN Planets ON Planets.starId = Stars.starIdWHERE Stars.starId < 100
(note that the db may actually be smart enough to perform the starTemp
calculation only once per-line, but when writing you have to mention it twice in this context).
Well, that's slightly messy, but it works. Hopefully, you'll remember to change both references if that's necessary...
Thankfully, we can move the Stars
portion of this into a subquery. We'll only have to list the calculation for starTemp
once!
SELECT Stars.starName, Stars.starTemp, Planets.name as planetName, Stars.starTemp - (50 * Planets.orbitDistance) as planetTempFROM (SELECT starId, name AS starName, (class + 7) * intensity * 1000000 AS starTemp FROM Stars WHERE starId < 100) StarsLEFT JOIN Planets ON Planets.starId = Stars.starId
Yeah, that looks like how I'd write it. Should work on essentially any RDBMS.
Note that the parenthesis in Stars.starTemp - (50 * Planets.orbitDistance)
is only there for clarity for the reader, the meaning of the math would remain unchanged if they were removed. Regardless of how well you know operator-precedence rules, always put in parenthesis when mixing operations. This becomes especially beneficial when dealing with OR
s and AND
s in JOIN
and WHERE
conditions - many people lose track of what's going to be effected.
Also note that the implicit-join syntax (the comma-separated FROM
clause) is considered bad practice in general, or outright deprecated on some platforms (queries will still run, but the db may scold you). It also makes certain things - like LEFT JOIN
s - difficult to do, and increases the possibility of accidently sabotaging yourself. So please, avoid it.
SELECT * FROM (SELECT [...]) as Alias1LEFT OUTER JOIN (SELECT [...]) as Alias2ON Alias1.id = Alias2.id
WITH( SELECT stars.name AS starname, ((star.class+7)*star.intensity)*1000000) AS startemp, stars.starid FROM stars) AS star_tempsSELECT planets.name AS planetname, (startemp-50*planets.orbitdistance) AS planettemp star_temps.starname, star_temps.startempFROM star_temps LEFT OUTER JOIN planets USING (star_id)WHERE star_temps.starid < 100;
Alternatively, one can construct a sub-query (I've used a common table expression) to accomplish the same task as demonstrated below:
SELECT planets.name AS planetname, (startemp-50*planets.orbitdistance) AS planettemp star_temps.starname, star_temps.startempFROM (SELECT stars.name AS starname, ((star.class+7)*star.intensity)*1000000) AS startemp, stars.starid FROM stars ) AS star_tempsLEFT OUTER JOIN planets USING (star_id)WHERE star_temps.starid < 100;