Create a unique
constraint to ensure no duplicate values are entered in specific columns that
do not participate in a primary key.
While both a
unique constraint and a primary key enforce uniqueness, you should attach a
unique constraint instead of a primary key constraint to a table if:
- You want to enforce
uniqueness in a column or combination of columns. You can attach multiple
unique constraints to a table, whereas you can attach only one primary key constraint to a table.
- You want to enforce
uniqueness in a column that allows null values. You can attach unique
constraints to columns that allow null values, whereas you can attach
primary key constraints only to columns that do not allow null values.
When you attach a unique constraint to a column allowing null values, you ensure that at most one row will
have a null value in the constrained column.
A UNIQUE constraint can be
referenced by a FOREIGN KEY constraint.
Unique
Constraints vs. Unique Indexes:
Uniqueness can
be implemented by primary keys, unique
constraints, and unique indexes. While primary keys are well understood,
confusion exists around the use of unique constraints and unique indexes. This
month, Greg explores both.
This month's column arose from one of my
recent consulting engagements. During an architectural discussion I had with
the client, an interesting question came up: "When would you ever use a
unique index?" After thinking about it for a moment, I realized that I
almost never would use one, so I raised the topic on our internal company
technical discussion forum. Here’s a summary of the key thoughts that came out
of that discussion.
Enforcing Uniqueness:
You can constrain a column (or group of
columns) to be unique in three basic ways: through primary keys, unique
constraints, and unique indexes.
In database terminology, a candidate key is a key that’s capable of
uniquely identifying a row. The key cannot
be NULL and can be constructed from one or more columns. One of the
candidate keys can be nominated as the primary key for the table, as shown in
Listing 1. Note that I have a strong preference for naming constraints rather
than leaving the naming choice to SQL Server, which tends to pick obscure
names.
Listing
1: Making a Candidate Key the Primary key:
Create table reference.countries
( Country_id nvarchar(3) NOT NULL --ISO 3 country
code
constraint pk_countries primary
key,
Countryname nvarchar(255) NOT NULL
)
If the key needs to be nullable, you can
define a unique constraint instead. In SQL Server, only one row can then be
NULL. That is at odds with many other database engines, where a unique
constraint indicates only that a key is unique if it is present. The unique
constraint can be placed within the definition of a column, as Listing 2 shows.
Listing 2:
Placing a Unique Constraint in
the Column defination:
Create table reference.countries
( Country_id nvarchar(3) NOT NULL --ISO 3 country
code
constraint pk_countries primary
key,
Countryname nvarchar(255) NOT NULL
constraint uk_countries_contryname UNIQUE
)
You can also place the unique constraint
on an entire key or set of columns by defining it at the table level, as
Listing 3 shows.
Listing
3: Defining a Unique Constraint at the Table level:
Create table reference.regions
(
Regionid int identity(1,1) Constraint pk_regions primary
key,
Statecode nvarchar(2) not null,
regioncode nvarchar(10) not null
Constraint
uk_regions_statecode UNIQUE(statecode,regioncode)
)
When you define a unique constraint, SQL Server internally creates an
index to support the constraint. You can see the index that’s created by
executing the code shown in Listing 4. Figure 1 shows the
output of this query.
Listing
4: Code to See the Index Created to Support a Unique Constraint:
select
type_desc,
is_primary_key,
is_unique_constraint
from sys.indexes where name ='uk_countries_contryname'
type_desc
is_primary_key is_unique_constraint
------------------------------------------------------------
-------------- --------------------
NONCLUSTERED
0 1
Figure 1: Example of Index that Supports
a Unique Constraint
Note that SQL Server automatically
created this nonclustered index. It is flagged as a unique constraint and given
the same name as our unique constraint.
You can also force the values in a key
to be unique by defining a unique index on the key. This approach is very
similar to defining a unique constraint on a table, but it’s not identical. In
the earlier Reference.Regions example, you could have declared the table and
the unique index as Listing 5 shows.
Listing
5: Declaring the Reference.Regions Table and a Unique index:
Create table reference.regions
(
Regionid int identity(1,1) Constraint pk_regions primary
key,
Statecode nvarchar(2) not null,
regioncode nvarchar(10) not null
)
Create unique index
uq_regions_statecode_regioncode
ON reference.regions(statecode,regioncode)
If you again query the sys.indexes view,
you’ll notice that the index created is very similar to the one in Figure 1,
but it doesn’t have the is_unique_constraint
flag set to 1.
Uniqueness and Performance:
So far, I've described uniqueness as a
business requirement, not a performance-related issue. So let’s spend a moment
considering performance-related impacts of uniqueness.
I find that one of the most common
database schema design errors is to ignore placing unique constraints where
they should be used. For example, consider our table of countries defined
earlier. Common sense tells us that CountryName would also be unique, but it’s
very common to see a primary key assigned to the table but no unique constraint
added to columns such as CountryName.
There are two potential consequences of
this error. The first is a business issue in that we could create two countries
with the same name. The other is a performance issue.
Imagine a table holding individual sales
for each country. It might be defined as in
Listing 6.
Listing
6: Sample Sales.SalesDetails table:
CREATE TABLE sales.salesdetails
(
salesdetailid int identity(1,1)
Constraint
pk_salesdetails primary key,
Amount decimal(18,2) not null,
Salesdate datetime not null,
Countryid nvarchar(3) not null
Constraint fk_salesdetails_countries
References refernce.country(countryid)
)
Note that I've included only a few
relevant columns for simplicity. Now imagine a query that summarizes sales by
country, such as that in Listing 7.
Listing
7: Query Summarizing Sales by country:
SELECT c.countryname,sum(sd.amount) as totalamount
FROM reference.countries as c
Inner join sales.salesdetails
as sd
On c.countryid = sd.countryid
Group by c.countryname
Order by c.countryname
If you were writing a query from
scratch, it might occur to you to group the Sales.SalesDetails rows by
CountryID and then to use the summarized rows to join to the
Reference.Countries table. However, most end users who are using a query tool
will sort their results by CountryName, not by CountryID. They probably won't
even return the CountryID, so they wouldn’t think to group by it instead of by
CountryName.
If you have a unique constraint on the
Reference.Countries table, SQL Server knows that grouping the sales rows by
CountryID is equivalent to grouping the joined table by CountryName. The
problem is that without the unique constraint, SQL Server then has to join
every row of the Sales.SalesDetails table to the Reference.Countries table and
then sort the entire result set by the CountryName. This is a very expensive
sorting operation.
So adding the unique constraint has a
major impact on the performance of this query, not just on the business
implications of avoiding duplicated country names. You could gain the same
performance benefit by adding a unique index on the CountryName column. What is
of further interest in this situation is that the sys.dm_db_index_usage_stats
Dynamic Management View (DMV) will not show that index ever being used, even
though its presence has materially affected the execution plan that’s
generated.
Constraints vs. Indexes:
There is no difference between Unique
Index and Unique Constraint. Even though syntax are different the effect is the
same.
Given that this unique requirement could
have been implemented via either a constraint or an index, which is preferable?
In general, I prefer to have this declared as a constraint every time. Indexes
are mostly used externally to enhance performance or internally by SQL Server
to implement constraints. Uniqueness is normally a business-related requirement
that you do not want to have any chance of being separated from the table
definition.
However, there are a few situations that
might lead you to consider a unique index rather than (or in addition to) a
constraint:
• Beginning
in SQL Server 2008, you have the ability to create filtered indexes.This
feature could help you get around the limitation where SQL Server lets you have
only one NULL key in a unique constraint.
• You
might want to add columns to the index to avoid lookups and to help create
covering indexes. You do this via the INCLUDE clause of the CREATE INDEX
statement. You currently have no way of specifying that you want columns
included in the internal index that SQL Server creates to support unique
constraints.
• When you create a unique constraint, you have no
way to specify that the index created to support it should be ascending or
descending. In a few (albeit rare) situations, this might concern you.
No comments:
Post a Comment