SQL returns less results when using UNION?

sql-server


UNION will eliminate duplicate rows, whether the duplicates occur across the two result sets or, as in your case, within just one of them. Try using UNION ALL instead.
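As a rough illustration of the difference, here is a minimal sketch that uses inline VALUES lists in place of real tables. The customers and suppliers names and the state values are made up for the example and do not come from the original question.

```sql
-- Hypothetical data: 'NY' appears twice within the first result set alone.

-- UNION removes duplicates from the combined result: CA, NY, TX (3 rows).
SELECT state FROM (VALUES ('NY'), ('NY'), ('CA')) AS customers(state)
UNION
SELECT state FROM (VALUES ('TX')) AS suppliers(state)
ORDER BY state;

-- UNION ALL keeps every row from both sides: CA, NY, NY, TX (4 rows).
SELECT state FROM (VALUES ('NY'), ('NY'), ('CA')) AS customers(state)
UNION ALL
SELECT state FROM (VALUES ('TX')) AS suppliers(state)
ORDER BY state;
```

Note that the second 'NY' disappears under UNION even though both copies come from the same side of the operator.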


I believe you need to use UNION ALL. UNION suppresses duplicates by default.


A little background here. In relational theory, a table is a set of rows, in the mathematical sense of a set.

Two important properties of mathematical sets are:

  • A set is unordered
  • A set has no duplicates

The first point is why row order is not reliable unless you add an ORDER BY clause. The second is one reason you should always have a primary key, to ensure that each row is indeed distinct.

The UNION operation is a set operation and (a) combines two sets, (b) produces a set. In order to maintain the integrity of a set the UNION will remove duplicate rows.

Unfortunately, there is no real way of guaranteeing whether two rows are supposed to be the same, so SQL will simply compare the values in the SELECT clauses. If those values are the same, then SQL decides there is a duplicate.

This can lead to an exaggerated example:

SELECT state FROM customers
UNION
SELECT state FROM customers;

In principle, the state value is produced twice, duplicate values are removed, and what you have is a long-winded way of saying SELECT DISTINCT.
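Because only the selected values are compared, adding a distinguishing column changes what counts as a duplicate. The following is a small sketch with made-up inline data (the id and state columns and their values are assumptions for illustration only):

```sql
-- Hypothetical customers: ids 1 and 2 are different people in the same state.

-- Only state is selected, so the two 'NY' rows compare equal: 1 row returned.
SELECT state FROM (VALUES (1, 'NY')) AS a(id, state)
UNION
SELECT state FROM (VALUES (2, 'NY')) AS b(id, state);

-- With id included, the rows differ, so both survive: 2 rows returned.
SELECT id, state FROM (VALUES (1, 'NY')) AS a(id, state)
UNION
SELECT id, state FROM (VALUES (2, 'NY')) AS b(id, state);
```

If you find UNION dropping rows you wanted, one option is to include a key column in the SELECT list; the other is UNION ALL.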

The set behaviour is also why you can’t have an ORDER BY clause in the individual SELECT statements of a UNION: once you order a data set, it is no longer a true set.

(Yes, you can have an ORDER BY clause, but it is attached to the result of the whole UNION rather than to any one SELECT statement.)
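A short sketch of where the ORDER BY goes, again with hypothetical inline data standing in for real tables:

```sql
-- The single ORDER BY follows the last SELECT and sorts the combined result,
-- not just the second query.
SELECT state FROM (VALUES ('TX'), ('CA')) AS customers(state)
UNION
SELECT state FROM (VALUES ('NY'), ('CA')) AS suppliers(state)
ORDER BY state;   -- CA, NY, TX
```

In SQL Server, moving the ORDER BY up so that it sits directly before the UNION keyword should be rejected as a syntax error, which is consistent with the ordering belonging to the final result only.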

Of course, SQL isn’t just about pure mathematics, so it includes two features which break the set:

  • You can order a set with an ORDER BY clause.

    However, you can only do this after the rest of the work is complete, which is why it’s the last clause.

  • You can include duplicates using the UNION ALL operator.