Pages

Thursday, 26 September 2013

Unique Constraint

Create a unique constraint to ensure no duplicate values are entered in specific columns that do not participate in a primary key.

While both a unique constraint and a primary key enforce uniqueness, you should attach a unique constraint instead of a primary key constraint to a table if:

  • You want to enforce uniqueness in a column or combination of columns. You can attach multiple unique constraints to a table, whereas you can attach only one primary key constraint to a table.

  • You want to enforce uniqueness in a column that allows null values. You can attach unique constraints to columns that allow null values, whereas you can attach primary key constraints only to columns that do not allow null values. When you attach a unique constraint to a column allowing null values, you ensure that at most one row will have a null value in the constrained column.

A UNIQUE constraint can be referenced by a FOREIGN KEY constraint.



Unique Constraints vs. Unique Indexes:

Uniqueness can be implemented by primary keys, unique constraints, and unique indexes. While primary keys are well understood, confusion exists around the use of unique constraints and unique indexes. This month, Greg explores both.
This month's column arose from one of my recent consulting engagements. During an architectural discussion I had with the client, an interesting question came up: "When would you ever use a unique index?" After thinking about it for a moment, I realized that I almost never would use one, so I raised the topic on our internal company technical discussion forum. Here’s a summary of the key thoughts that came out of that discussion.
Enforcing Uniqueness:
You can constrain a column (or group of columns) to be unique in three basic ways: through primary keys, unique constraints, and unique indexes.

In database terminology, a candidate key is a key that’s capable of uniquely identifying a row. The key cannot be NULL and can be constructed from one or more columns. One of the candidate keys can be nominated as the primary key for the table, as shown in Listing 1. Note that I have a strong preference for naming constraints rather than leaving the naming choice to SQL Server, which tends to pick obscure names.
Listing 1: Making a Candidate Key the Primary key:
Create table reference.countries
( Country_id nvarchar(3) NOT NULL  --ISO 3 country code
     constraint pk_countries primary key,
  Countryname nvarchar(255) NOT NULL
)
If the key needs to be nullable, you can define a unique constraint instead. In SQL Server, only one row can then be NULL. That is at odds with many other database engines, where a unique constraint indicates only that a key is unique if it is present. The unique constraint can be placed within the definition of a column, as Listing 2 shows.
Listing 2: Placing a Unique Constraint in the Column defination:
 Create table reference.countries
( Country_id nvarchar(3) NOT NULL  --ISO 3 country code
     constraint pk_countries primary key,
  Countryname nvarchar(255) NOT NULL
  constraint uk_countries_contryname UNIQUE
)
You can also place the unique constraint on an entire key or set of columns by defining it at the table level, as Listing 3 shows.
Listing 3: Defining a Unique Constraint at the Table level:
Create table reference.regions
(
Regionid int identity(1,1) Constraint pk_regions primary key,
Statecode nvarchar(2) not null,
regioncode nvarchar(10) not null
Constraint uk_regions_statecode UNIQUE(statecode,regioncode)
)
When you define a unique constraint, SQL Server internally creates an index to support the constraint. You can see the index that’s created by executing the code shown in Listing 4. Figure 1 shows the output of this query.
Listing 4: Code to See the Index Created to Support a Unique Constraint:
select
type_desc,
is_primary_key,
is_unique_constraint
from sys.indexes where name ='uk_countries_contryname' 

type_desc                                                    is_primary_key is_unique_constraint
------------------------------------------------------------ -------------- --------------------
NONCLUSTERED                                                 0              1

Figure 1: Example of Index that Supports a Unique Constraint

Note that SQL Server automatically created this nonclustered index. It is flagged as a unique constraint and given the same name as our unique constraint.

You can also force the values in a key to be unique by defining a unique index on the key. This approach is very similar to defining a unique constraint on a table, but it’s not identical. In the earlier Reference.Regions example, you could have declared the table and the unique index as Listing 5 shows.
Listing 5: Declaring the Reference.Regions Table and a Unique index:
Create table reference.regions
(
Regionid int identity(1,1) Constraint pk_regions primary key,
Statecode nvarchar(2) not null,
regioncode nvarchar(10) not null
)

Create unique index uq_regions_statecode_regioncode
ON reference.regions(statecode,regioncode)
If you again query the sys.indexes view, you’ll notice that the index created is very similar to the one in Figure 1, but it doesn’t have the is_unique_constraint flag set to 1.
Uniqueness and Performance:
So far, I've described uniqueness as a business requirement, not a performance-related issue. So let’s spend a moment considering performance-related impacts of uniqueness.
I find that one of the most common database schema design errors is to ignore placing unique constraints where they should be used. For example, consider our table of countries defined earlier. Common sense tells us that CountryName would also be unique, but it’s very common to see a primary key assigned to the table but no unique constraint added to columns such as CountryName.

There are two potential consequences of this error. The first is a business issue in that we could create two countries with the same name. The other is a performance issue.
Imagine a table holding individual sales for each country. It might be defined as in
Listing 6.




Listing 6: Sample Sales.SalesDetails table:
CREATE TABLE sales.salesdetails
(
salesdetailid int identity(1,1)
Constraint pk_salesdetails primary key,
Amount decimal(18,2) not null,
Salesdate datetime not null,
Countryid nvarchar(3) not null
 Constraint fk_salesdetails_countries
References refernce.country(countryid)
)
Note that I've included only a few relevant columns for simplicity. Now imagine a query that summarizes sales by country, such as that in Listing 7.
Listing 7: Query Summarizing Sales by country:
SELECT c.countryname,sum(sd.amount) as totalamount
FROM reference.countries as c
Inner join sales.salesdetails as sd
On c.countryid = sd.countryid
Group by c.countryname
Order by c.countryname
If you were writing a query from scratch, it might occur to you to group the Sales.SalesDetails rows by CountryID and then to use the summarized rows to join to the Reference.Countries table. However, most end users who are using a query tool will sort their results by CountryName, not by CountryID. They probably won't even return the CountryID, so they wouldn’t think to group by it instead of by CountryName.
If you have a unique constraint on the Reference.Countries table, SQL Server knows that grouping the sales rows by CountryID is equivalent to grouping the joined table by CountryName. The problem is that without the unique constraint, SQL Server then has to join every row of the Sales.SalesDetails table to the Reference.Countries table and then sort the entire result set by the CountryName. This is a very expensive sorting operation.

So adding the unique constraint has a major impact on the performance of this query, not just on the business implications of avoiding duplicated country names. You could gain the same performance benefit by adding a unique index on the CountryName column. What is of further interest in this situation is that the sys.dm_db_index_usage_stats Dynamic Management View (DMV) will not show that index ever being used, even though its presence has materially affected the execution plan that’s generated.









Constraints vs. Indexes:
There is no difference between Unique Index and Unique Constraint. Even though syntax are different the effect is the same.

Given that this unique requirement could have been implemented via either a constraint or an index, which is preferable? In general, I prefer to have this declared as a constraint every time. Indexes are mostly used externally to enhance performance or internally by SQL Server to implement constraints. Uniqueness is normally a business-related requirement that you do not want to have any chance of being separated from the table definition.
However, there are a few situations that might lead you to consider a unique index rather than (or in addition to) a constraint:
• Beginning in SQL Server 2008, you have the ability to create filtered indexes.This feature could help you get around the limitation where SQL Server lets you have only one NULL key in a unique constraint.
• You might want to add columns to the index to avoid lookups and to help create covering indexes. You do this via the INCLUDE clause of the CREATE INDEX statement. You currently have no way of specifying that you want columns included in the internal index that SQL Server creates to support unique constraints.

• When you create a unique constraint, you have no way to specify that the index created to support it should be ascending or descending. In a few (albeit rare) situations, this might concern you.

No comments:

Post a Comment