Set Operators (Transact-SQL)
SQL Server provides the
following set operators. Set operators combine results from two or more queries
into a single result set.
UNION
Combines
the results of two or more queries into a single result set that includes all the
rows that belong to all queries in the union. The UNION operation is different
from using joins that combine columns from two tables.
The
following are basic rules for combining the result sets of two queries by using
UNION:
·
Each query must have the same number of columns
·
Each column must have compatible data types
·
Column names for the final result set are taken from
the first query
·
ORDER BY and COMPUTE clauses can only be issued for
the overall result set and not within each individual result set
·
GROUP BY and HAVING clauses can only be issued for
each individual result set and not for the overall result set
·
SELECT INTO clause can only be issued for first
result set not for the overall result set.
Syntax:
-- Syntax for SQL
Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse
{<query_specification> | (
<query_expression> ) }
UNION
[ ALL ]
<query_specification | (
<query_expression> )
[
UNION [ ALL ] <query_specification> | ( <query_expression> )
[ ...n ] ]
<query_specification> |
(<query_expression>)
Is a query specification or query expression that returns data to be combined with the data from another query specification or query expression. The definitions of the columns that are part of a UNION operation do not have to be the same, but they must be compatible through implicit conversion. When data types differ, the resulting data type is determined based on the rules for data type precedence. When the types are the same but differ in precision, scale, or length, the result is determined based on the same rules for combining expressions.
Is a query specification or query expression that returns data to be combined with the data from another query specification or query expression. The definitions of the columns that are part of a UNION operation do not have to be the same, but they must be compatible through implicit conversion. When data types differ, the resulting data type is determined based on the rules for data type precedence. When the types are the same but differ in precision, scale, or length, the result is determined based on the same rules for combining expressions.
Columns of the xml data
type must be equivalent. All columns must be either typed to an XML schema or
untyped. If typed, they must be typed to the same XML schema collection.
UNION
specifies that multiple result sets are to be combined and returned as a single result set.
specifies that multiple result sets are to be combined and returned as a single result set.
ALL
incorporates all rows into the results. This includes duplicates. If not specified, duplicate rows are removed.
incorporates all rows into the results. This includes duplicates. If not specified, duplicate rows are removed.
Using SELECT INTO with UNION
SELECT
statement specifies that the
table named ProductResults
holds the final result set of the
union of the designated columns of the ProductModel
and Gloves
tables.-- Uses AdventureWorks
SELECT ProductModelID, Name
INTO dbo.ProductResults
FROM Production.ProductModel
WHERE ProductModelID NOT IN (3, 4)
UNION
SELECT ProductModelID, Name
FROM dbo.Gloves;
GO
Using UNION of two SELECT statements
with ORDER BY
/* INCORRECT */
-- Uses AdventureWorks
SELECT ProductModelID, Name
FROM Production.ProductModel
WHERE ProductModelID NOT IN (3, 4)
ORDER BY Name
UNION
SELECT ProductModelID, Name
FROM dbo.Gloves;
GO
/* CORRECT */
-- Uses AdventureWorks
SELECT ProductModelID, Name
FROM Production.ProductModel
WHERE ProductModelID NOT IN (3, 4)
UNION
SELECT ProductModelID, Name
FROM dbo.Gloves
ORDER BY Name;
GO
EXCEPT
and INTERSECT
Returns distinct rows by
comparing the results of two queries.
EXCEPT returns distinct rows
from the left input query that aren’t output by the right input query.
INTERSECT returns distinct rows
that are output by both the left and right input queries operator.
The basic rules for combining
the result sets of two queries that use EXCEPT or INTERSECT are the following:
- The
number and the order of the columns must be the same in all queries.
- The
data types must be compatible.
-- Syntax for SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse
{ <query_specification> | ( <query_expression> ) }
{ EXCEPT | INTERSECT }
{ <query_specification> | ( <query_expression> ) }
Arguments
<query_specification>
| ( <query_expression>
)
Is a query specification or query expression that returns data to be compared with the data from another query specification or query expression. The definitions of the columns that are part of an EXCEPT or INTERSECT operation do not have to be the same, but they must be comparable through implicit conversion. When data types differ, the type that is used to perform the comparison and return results is determined based on the rules fordata type precedence.
Is a query specification or query expression that returns data to be compared with the data from another query specification or query expression. The definitions of the columns that are part of an EXCEPT or INTERSECT operation do not have to be the same, but they must be comparable through implicit conversion. When data types differ, the type that is used to perform the comparison and return results is determined based on the rules fordata type precedence.
When
the types are the same but differ in precision, scale, or length, the result is
determined based on the same rules for combining expressions.
The
query specification or expression cannot return xml, text, ntext, image,
or non binary CLR user-defined type columns because these data types are not
comparable.
EXCEPT
Returns any distinct values from the query to the left of the EXCEPT operator that are not also returned from the right query.
Returns any distinct values from the query to the left of the EXCEPT operator that are not also returned from the right query.
INTERSECT
Returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operator.
Returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operator.
Remarks
When
the data types of comparable columns that are returned by the queries to the
left and right of the EXCEPT or INTERSECT operators are character data types
with different collations, the required comparison is performed according to
the rules of collation precedence.
If this conversion cannot be performed, the SQL Server Database Engine returns
an error.
When
comparing column values for determining DISTINCT rows, two NULL values are
considered equal.
The
column names of the result set that are returned by EXCEPT or INTERSECT are the
same names as those returned by the query on the left side of the operator.
Column
names or aliases in ORDER BY clauses must reference column names returned by
the left-side query.
The
nullability of any column in the result set returned by EXCEPT or INTERSECT is
the same as the nullability of the
corresponding
column that is returned by the query on the left side of the operator.
If
EXCEPT or INTERSECT is used together with other operators in an expression, it
is evaluated in the context of the following precedence:
1.
Expressions in parentheses
2.
The INTERSECT operator
3.
EXCEPT and UNION evaluated from left to right based
on their position in the expression
If
EXCEPT or INTERSECT is used to compare more than two sets of queries, data type
conversion is determined by comparing two queries at a time, and following the
previously mentioned rules of expression evaluation.
EXCEPT
and INTERSECT cannot be used in distributed partitioned view definitions, query
notifications.
EXCEPT
and INTERSECT may be used in distributed queries, but are only executed on the
local server and not pushed to the linked server. Therefore, using EXCEPT and
INTERSECT in distributed queries may affect performance.
Fast
forward-only and static cursors are fully supported in the result set when they
are used with an EXCEPT or INTERSECT operation. If a keyset-driven or dynamic
cursor is used together with an EXCEPT or INTERSECT operation, the cursor of
the result set of the operation is converted to a static cursor.
When
an EXCEPT operation is displayed by using the Graphical Showplan feature in SQL
Server Management Studio, the operation appears as a left anti semi join,
and an INTERSECT operation appears as a left semi join.
Tip
If you don't have the exact same columns in all
queries use a default value or a NULL value such as:
SELECT firstName, lastName, company FROM
businessContacts
UNION
SELECT firstName, lastName, NULL FROM nonBusinessContacts
or
SELECT firstName, lastName, createDate FROM
businessContacts
UNION ALL
SELECT firstName, lastName, getdate() FROM nonBusinessContacts
Performance Considerations:
UNION
vs. UNION ALL ExamplesLet's take a look at a few simple examples of how these commands work and how they differ. As you will see the final result sets will differ, but there is some interesting info on how SQL Server actually completes the process.
UNION ALL
In this first example we are using the UNION ALL operator against the Employee table from the AdventureWorks database. This is probably not something you would do, but this helps illustrate the differences of these two operators.
There are 290 rows in table dbo. Employee.
SELECT * FROM HumanResources.Employee
UNION ALL
SELECT * FROM HumanResources.Employee
UNION ALL
SELECT * FROM HumanResources.Employee
When this query is run the result set has 870 rows. This is the 290 rows returned 3 times. The data is just put together one dataset on top of the other dataset.
Here is the execution plan for this query. We can see
that the table was queried 3 times and SQL Server did a Concatenation step to
concatenate all of the data.
UNION
In this next example we are using the UNION operator against the
Employee table again from the AdventureWorks database.
SELECT * FROM HumanResources.Employee
UNION
SELECT * FROM HumanResources.Employee
UNION
SELECT * FROM HumanResources.Employee
When this query is run the result set has 290 rows. Even though we
combined the data three times the UNION operator removed the duplicate records
and therefore returns just the 290 unique rows.
Here is the execution plan for this query. We can see
that SQL Server first queried 2 of the tables, then did a Merge Join operation
to combine the first two tables and then it did another Merge Join along with
querying the third table in the query. So we can see there was much more worked
that had to be performed to get this result set compared to the UNION ALL.
UNION vs. UNION ALL Examples With Sort on Clustered
Index Column
If we take this a step further and do a SORT of the
data using the Clustered Index column we get these execution plans. From this
we can see that the execution plan that SQL Server is using is identical for
each of these operations even though the final result sets will still contain
870 rows for the UNION ALL and 290 rows for the UNION.
UNION ALL
UNION
UNION vs. UNION ALL Examples With Sort on Non-indexed
Column
Here is another example doing the same thing, but this
time doing a SORT on a non indexed column. As you can see the execution plans
are again identical for these two queries, but this time instead of using a
MERGE JOIN, a CONCATENATION and SORT operations are used.
UNION ALL
UNION
No comments:
Post a Comment