Performance of SQL "EXISTS" usage variants Performance of SQL "EXISTS" usage variants sql sql

Performance of SQL "EXISTS" usage variants


The truth about the EXISTS clause is that the SELECT clause is not evaluated in an EXISTS clause - you could try:

SELECT *   FROM tableA  WHERE EXISTS (SELECT 1/0                  FROM tableB                 WHERE tableA.x = tableB.y)

...and should expect a divide by zero error, but you won't because it's not evaluated. This is why my habit is to specify NULL in an EXISTS to demonstrate that the SELECT can be ignored:

SELECT *   FROM tableA  WHERE EXISTS (SELECT NULL                 FROM tableB                 WHERE tableA.x = tableB.y)

All that matters in an EXISTS clause is the FROM and beyond clauses - WHERE, GROUP BY, HAVING, etc.

This question wasn't marked with a database in mind, and it should be because vendors handle things differently -- so test, and check the explain/execution plans to confirm. It is possible that behavior changes between versions...


Definitely #1. It "looks" scary, but realize the optimizer will do the right thing and is expressive of intent. Also ther is a slight typo bonus should one accidently think EXISTS but type IN. #2 is acceptable but not expressive. The third option stinks in my not so humble opinion. It's too close to saying "if 'no value' exists" for comfort.

In general it's important to not be scared to write code that mearly looks inefficient if it provides other benefits and does not actually affect performance.

That is, the optimizer will almost always execute your complicated join/select/grouping wizardry to save a simple EXISTS/subquery the same way.

After having given yourself kudos for cleverly rewriting that nasty OR out of a join you will eventually realize the optimizer still used the same crappy execution plan to resolve the much easier to understand query with embedded OR anyway.

The moral of the story is know your platforms optimizer. Try different things and see what is actually being done because the rampant knee jerks assumptions regarding 'decorative' query optimization are almost always incorrect and irrelevant from my experience.


I realize this is an old post, but I thought it important to add clarity about why one might choose one format over another.

First, as others have pointed out, the database engine is supposed to ignore the Select clause. Every version of SQL Server has/does, Oracle does, MySQL does and so on. In many, many moons of database development, I have only ever encountered one DBMS that did not properly ignore the Select clause: Microsoft Access. Specifically, older versions of MS Access (I can't speak to current versions).

Prior to my discovery of this "feature", I used to use Exists( Select *.... However, i discovered that MS Access would stream across every column in the subquery and then discard them (Select 1/0 also would not work). That convinced me switch to Select 1. If even one DBMS was stupid, another could exist.

Writing Exists( Select 1... is as abundantly clear in conveying intent (It is frankly silly to claim "It's too close to saying "if 'no value' exists" for comfort.") and makes the odds of a DBMS doing something stupid with the Select statement nearly impossible. Select Null would serve the same purpose but is simply more characters to write.

I switched to Exists( Select 1 to make absolutely sure the DBMS couldn't be stupid. However, that was many moons ago, and today I would expect that most developers would expect seeing Exists( Select * which will work exactly the same.

That said, I can provide one good reason for avoiding Exists(Select * even if your DBMS evaluates it properly. It is much easier to find and trounce all uses of Select * if you don't have to skip every instance of its use in an Exists clause.