Using a UNION in nested queries with SUM calculations
In this article we describe a situation where some care is required in
writing the SQL to perform an outer aggregate SUM calculation around an inner union.
This is a situation which we illustrate using Access, but applies equally well to SQL Server.
Simple Sum
Consider performing a simple sum over a single table of data as shown below.
Then we use a very simple aggregate Sum function.
Giving the following correct answer.
Top of Page
Nested query with two tables
Now introduce a second table as shown below.
We now want to now perform the aggregate over all records in both tables, so we use a nested query,
where the inner part of the query is a UNION of the data in both tables.
Giving the following unexpected result!
In order to investigate this in more detail, it is worth viewing the result of
running the inner part of the nested query without the aggregate.
Top of Page
UNION ALL
To simplify things, let us start by looking at the union in more detail;
this also acts as a nice example of how to break down a problem in SQL into simpler steps.
Which displays only 2 records.
Now we can see that the problem is that the default UNION clause has eliminated any duplicate records,
although this is often a desirable behaviour, in this situation it is the source of our problems.
Changing the SQL to perform a UNION ALL including any duplicate records works as follows.
This gives the following result.
Top of Page
Corrected SQL
Now returning to our original aggregate in our nested query we can modify the SQL as follows :-
Which gives the following result.
Top of Page
Conclusion
In many circumstances the power of a UNION to exclude duplicate records is very useful,
for example when bringing together a list of all customers from active and historical tables into a single list;
in which case eliminating duplicates gives the desired results.
However, when working with aggregate functions such as a SUM over a UNION we
normally need to ensure that a UNION ALL is used; in the above example it is
only because some of the transaction dates and amounts are the same,
that duplicates get removed. This is an example of a subtle problem which
very much depends on the data values.
Top of Page
Download Code
The demonstration files for this article can
be downloaded from this link
MUST+SQL allows you to upsize into multiple schemas splitting up an application into manageable chunks or
bringing together multiple databases and simplifying security. Click below to improve your
development productivity.
Top of Page