The simplest definition of a transaction is that it is a single unit of
work; a task or set of tasks that together form an "all-or-nothing"
operation. If some event interrupts a transaction in the middle, so that not
all of it was completed, the system should treat the transaction as if it never
occurred at all. A transaction can be short, like changing the price of one
book in the inventory, or long, like updating the quantity sold of every
inventory item at the beginning of an accounting period.
Transactions have four basic properties, called the ACID properties, which guarantee the validity of the data after the completion of any transaction.
Atomicity:
A transaction is treated as a single unit of work. Either it
completes entirely, or the system has no "memory" of it happening at
all. This applies to transactions of any size, whether two rows are being
inserted, or 10 million rows are being updated.
Consistency:
The consistency property ensures that any transaction will
bring the database from one valid state to another. Any data written to the
database must be valid according to all defined rules, including but not
limited to constraints, cascades, triggers,
and any combination thereof. This does not guarantee correctness of the
transaction in all ways the application programmer might have wanted (that is
the responsibility of application-level code) but merely that any programming
errors do not violate any defined rules.
Isolation:
The isolation property ensures that the
concurrent execution of transactions results in a system state that would be
obtained if transactions were executed serially, i.e. one after the other.
Providing isolation is the main goal of concurrency control. Depending on concurrency
control method, the effects of an incomplete transaction might not even be
visible to another transaction.
Durability:
The durability property ensures that once a transaction has been committed, it will
remain so, even in the event of power loss, crashes,
or errors. In a relational database, for instance, once a group of SQL
statements execute, the results need to be stored permanently (even if the
database crashes immediately thereafter). To defend against power loss,
transactions (or their effects) must be recorded in a non-volatile memory.
Example:
The database table has two columns, A and B. An integrity constraint requires that the value in A and the value in B must sum to 100.
CREATE TABLE acidtest (A INTEGER, B INTEGER, CHECK (A + B = 100));
Atomicity failure:
Assume that a transaction attempts to subtract 10 from A and
add 10 to B. This is a valid transaction, since the data continue to satisfy
the constraint after it has executed. However, assume that after removing 10
from A, the transaction is unable to modify B. If the database retained A's new
value, atomicity and the constraint would both be violated. Atomicity requires
that both parts of this transaction, or neither, be complete.
Consistency failure:
Consistency is a very general term which demands that the
data must meet all validation rules. In the previous example, the validation is
a requirement that A + B = 100. Also, it may be inferred that both A and B must
be integers. A valid range for A and B may also be inferred. All validation
rules must be checked to ensure consistency.
Assume that a transaction attempts to subtract 10 from A
without altering B. Because consistency is checked after each transaction, it
is known that A + B = 100 before the transaction begins. If the transaction
removes 10 from A successfully, atomicity will be achieved. However, a
validation check will show that A + B = 90, which is inconsistent with the
rules of the database. The entire transaction must be cancelled and the
affected rows rolled back to their pre-transaction state. If there had been
other constraints, triggers, or cascades, every single change operation would
have been checked in the same way as above before the transaction was
committed.
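To make the example concrete in SQL Server's dialect, here is a minimal sketch using the acidtest table defined above. The two-column UPDATE preserves the constraint, while the single-column UPDATE violates it, so SQL Server raises an error and rolls back that statement in its entirety:
INSERT INTO acidtest VALUES ( 60, 40 ) ;
-- Valid: both columns change in a single atomic statement, so A + B remains 100
UPDATE acidtest SET A = A - 10, B = B + 10 ;
-- Invalid: A + B would become 90; the CHECK constraint rejects the statement
-- and the attempted change is rolled back
UPDATE acidtest SET A = A - 10 ;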
Isolation failure:
To demonstrate isolation, we assume two transactions execute at the same
time, each attempting to modify the same data. One of the two must wait until
the other completes in order to maintain isolation. Consider two transactions. T1 transfers 10 from A to B. T2 transfers 10 from B to A. Combined, there are four actions:
- T1 subtracts 10 from A.
- T1 adds 10 to B.
- T2 subtracts 10 from B.
- T2 adds 10 to A.
By interleaving the transactions, the actual order of actions might be:
- T1 subtracts 10 from A.
- T2 subtracts 10 from B.
- T2 adds 10 to A.
- T1 adds 10 to B.
This ordering happens to leave A and B with the same totals as a serial execution, but consider what happens if T1 fails while adding 10 to B. By that time, T2 has already added 10 to A, so the system cannot simply restore A to its pre-T1 value without wiping out T2's work. Isolation requires that the combined result be one that some serial execution of the two transactions could have produced.
Durability failure:
Assume that a transaction transfers 10 from A to B. It
removes 10 from A. It then adds 10 to B. At this point, a "success"
message is sent to the user. However, the changes are still queued in the disk buffer
waiting to be committed to the disk. Power fails and the changes are lost. The
user assumes (understandably) that the changes have been made.
Transaction scope
SQL Server supports several different ways to define the beginning and end of a transaction. Two methods are available by default, and two are only available under specific conditions.
The default types of transactions are auto-commit transactions and explicit transactions.
An auto-commit transaction is any single data modification operation. In other words, any INSERT, UPDATE or DELETE statement (as well as others, such as MERGE and BULK INSERT), by itself, is automatically a transaction. If we modify one row, or one million rows, in a single UPDATE statement, SQL Server will consider the UPDATE operation to be an atomic operation, and will modify either all the rows or none of the rows. If there is a server failure in the middle of the modification operation then, when SQL Server recovers, it will be as if no modifications ever happened. With an auto-commit transaction, there is no way to force a rollback manually. A transaction rollback will only occur when there is a system failure.
An explicit transaction uses the BEGIN TRANSACTION (or BEGIN TRAN) statement to indicate the beginning of the transaction, and either a COMMIT TRANSACTION or a ROLLBACK TRANSACTION statement to indicate the end. In between, the transaction can include any number of statements.
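As a minimal sketch (the Accounts table and its columns here are hypothetical, purely for illustration), an explicit transaction transferring money between two rows might look like this, with a TRY...CATCH block ensuring that any error rolls back both updates:
BEGIN TRY
    BEGIN TRAN ;
    UPDATE Accounts SET Balance = Balance - 10 WHERE AccountID = 1 ;
    UPDATE Accounts SET Balance = Balance + 10 WHERE AccountID = 2 ;
    COMMIT TRAN ;
END TRY
BEGIN CATCH
    -- An error in either UPDATE lands here; undo any open transaction
    IF @@TRANCOUNT > 0
        ROLLBACK TRAN ;
END CATCH ;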
The non-default types of transactions are implicit transactions and batch-scoped transactions.
For implicit transactions, a session must be in implicit transaction mode, invoked with a SET option: SET IMPLICIT_TRANSACTIONS ON. In implicit transaction mode, the start of any transaction is implied. In other words, any data manipulation language (DML) statement (such as INSERT, UPDATE, DELETE and even SELECT) will automatically start a transaction. Although, in this mode, the start of the transaction is implied, the end of the transaction must be explicit, and the transaction is not finished until we issue either a ROLLBACK TRAN or COMMIT TRAN. This mode is mainly for use by developers who have come to SQL Server from other database management systems, such as Oracle or DB2, which deal with transactions in a different way. However, I strongly recommend that you get used to working with SQL Server's default transaction management options because all the documentation and all books and magazine articles about SQL Server assume you are using that mode. If you must use implicit transaction mode for compatibility with other systems or applications, you'll probably be better off not mixing and matching the two modes, but rather having all your sessions and all your transactions using implicit transaction mode.
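A brief sketch of the mode in action (MyTable is a hypothetical table name); note that even a SELECT opens a transaction, which @@TRANCOUNT confirms, and nothing ends it until we say so:
SET IMPLICIT_TRANSACTIONS ON ;
SELECT COUNT(*) FROM MyTable ;   -- implicitly starts a transaction
SELECT @@TRANCOUNT ;             -- returns 1: a transaction is now open
COMMIT TRAN ;                    -- the end must be explicit
SET IMPLICIT_TRANSACTIONS OFF ;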
Introduced in SQL Server 2005, we invoke batch-scoped transactions by requesting the option Multiple Active Result Sets (or MARS) in the client connection string. In those connections, SQL Server will roll back any batch that includes a BEGIN TRAN but does not include a COMMIT TRAN. The purpose of MARS is to avoid a problem called "application deadlock," which we'll discuss in Chapter 4, in the section on sharing locks across connections.
Transaction isolation
Every transaction runs in one
particular transaction isolation level, which determines how sensitive your application is to changes made by other users'
transactions, and how long SQL Server must hold locks to protect against
these changes. The ANSI SQL standard defines four levels of isolation for
transactions. SQL Server supports all four of these levels, listed in order of
increasing restrictiveness, in terms of the read phenomena permitted:
• READ UNCOMMITTED – allows dirty reads, non-repeatable reads and phantom reads
• READ COMMITTED – prevents dirty reads, allows non-repeatable reads and phantom reads
• REPEATABLE READ – prevents dirty reads and non-repeatable reads but allows phantom reads
• SERIALIZABLE – prevents all read phenomena.
With the exception of READ
UNCOMMITTED, each of these isolation levels is pessimistic in nature. In other
words, when transactions are operating in one of these modes, SQL Server will
acquire shared and exclusive locks in order to prevent data being read that is
currently being modified by another transaction, and to prevent other
transactions modifying data that is currently being read. In addition, SQL
Server 2005 (and later) offers a new optimistic isolation level, called
SNAPSHOT isolation, plus an optimistic alternative to READ COMMITTED isolation
(READ_COMMITTED_SNAPSHOT), both of which can ensure consistent results without
the need to acquire shared locks, and so can enhance concurrency.
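Both optimistic options are enabled with database-level settings. As a sketch (YourDatabase is a placeholder name):
ALTER DATABASE YourDatabase SET ALLOW_SNAPSHOT_ISOLATION ON ;  -- enables SNAPSHOT isolation
ALTER DATABASE YourDatabase SET READ_COMMITTED_SNAPSHOT ON ;   -- optimistic READ COMMITTED
We'll see the second option in action in the READ_COMMITTED_SNAPSHOT section below.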
Controlling the isolation level
SQL Server's default isolation level is READ COMMITTED, but
an application can override this setting by using the following SET command:
SET TRANSACTION ISOLATION LEVEL [READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SNAPSHOT | SERIALIZABLE]
The SET command will control the
isolation level for all queries submitted over the current connection, until
the isolation level is changed, or the connection is terminated. In other
words, every time an application makes a new connection (or we open a new query
editor window in SSMS), it starts a new session in the SQL Server instance, and
any transactions within that new session will use the READ COMMITTED isolation
level, by default.
You may be wondering if there is a
way to change SQL Server's isolation level server-wide, so that, by default,
every connection uses an isolation level other than READ COMMITTED. The answer
is no; the isolation level must be set
at the connection level, or within a query. We can control the isolation
level for individual queries by using lock hints.
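For example, a table hint can override the session's isolation level for a single query; a brief sketch against a hypothetical table:
SELECT * FROM MyTable WITH ( READUNCOMMITTED ) ;  -- this query alone allows dirty reads
SELECT * FROM MyTable WITH ( HOLDLOCK ) ;         -- this query alone gets SERIALIZABLE semantics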
Preventable read phenomena
The easiest way to define the
differences between the various ANSI isolation levels is to describe the set of
behaviors that are either permitted or forbidden, depending on which isolation
level is in use. The three behaviors, also called "preventable read phenomena," are:
• Dirty reads
• Non-repeatable reads
• Phantom reads.
Dirty reads:
This behavior occurs when a
transaction reads uncommitted data. If one transaction has changed data but not
committed the change, and another transaction is allowed to read that changed
data, then there is a strong possibility that the data will be read in an
inconsistent state.
For example, consider a stock
management application for a factory that receives and distributes
SuperWidgets. A number of sales clerks log deliveries and shipments, updating
the SuperWidgets inventory item, as appropriate.
Currently, there are only 25 widgets
in the stock inventory database, but a new shipment of 50 widgets is just in,
so Clerk A starts a transaction and issues an UPDATE to reflect a new stock
level of 75. At that point, a Clerk B receives an order for 60 widgets and
checks the inventory. If Clerk B's transaction permits dirty reads, Clerk B
would see 75 widgets and so could authorize the sale, for next-day delivery to
a customer. Meanwhile, just as Clerk A prepares to confirm the stock update
transaction, he receives a message that a fault has been detected with the
batch of widgets, and that they need to be returned to the manufacturer. As a
result, he cancels (rolls back) the transaction; Clerk A has authorized an
order that the company cannot fulfill, due to insufficient stock.
By default, SQL Server does not allow
dirty reads. Keep in mind that the transaction updating the data has no control
over whether or not another transaction can read its data before it's
committed. The decision regarding
whether or not to read "dirty" data lies entirely in the hands of the
reading transaction.
Non-repeatable reads:
This behavior is also called inconsistent
analysis. A read is non-repeatable if a query might get different values when
reading the same data in two separate reads within the same transaction. This
can happen when a separate transaction updates the same data, after the first
read but before the second read.
In the receiving room example,
suppose that a manager comes in to do a spot check of the current inventory.
She walks up to each clerk, asking the total number of widgets received that
day, and adding the numbers on her calculator. When she's done, she wants to
double-check the result, so she goes back to the first clerk. However, if Clerk
A received more widgets between the manager's first and second inquiries, the
total will be different each time and the reads are non-repeatable.
Phantom reads:
This behavior occurs when membership
in a set changes. It can happen only when a query with a predicate, such as
WHERE count_of_widgets < 10, is involved. A phantom occurs if two SELECT
operations using the same predicate in the same transaction return a different
number of rows. For example, let's say that our manager is still doing spot checks
of inventory. This time, she goes around the receiving room and notes which clerks
have fewer than ten widgets. After she completes the list, she goes back around
to offer advice to everyone with a low total. However, imagine that during her
first walkthrough the manager failed to include in her list a clerk who had
just returned from a break, and had fewer than ten widgets. This additional
clerk (or row) is a phantom.
To see the behavior in each ANSI isolation level, we'll look at some example code. First, create a table called IsolationTest in a database called IsolationDB and populate the table with a few rows, by running the code in Listing 1-1. I'll refer to the IsolationTest table in examples for each of the four ANSI isolation levels.
-- Listings from "Concurrency in SQL Server"
-- Listing 1-1: Create a database and table for running
-- the isolation level exercises
USE master
GO
IF EXISTS ( SELECT 1
            FROM sys.databases
            WHERE name = 'IsolationDB' )
    DROP DATABASE IsolationDB ;
GO
CREATE DATABASE IsolationDB ;
GO
USE IsolationDB ;
GO
CREATE TABLE IsolationTest
    (
      col1 INT PRIMARY KEY ,
      col2 VARCHAR(20)
    ) ;
GO
INSERT INTO IsolationTest
VALUES ( 10, 'The first row' ) ;
INSERT INTO IsolationTest
VALUES ( 20, 'The second row' ) ;
INSERT INTO IsolationTest
VALUES ( 30, 'The third row' ) ;
INSERT INTO IsolationTest
VALUES ( 40, 'The fourth row' ) ;
INSERT INTO IsolationTest
VALUES ( 50, 'The fifth row' ) ;
GO
READ UNCOMMITTED:
The READ UNCOMMITTED isolation level allows a transaction to read any data
currently on a data or index page, regardless of whether or not the
transaction that wrote that data has been committed.
Transactions running at the READ
UNCOMMITTED level do not issue shared
locks to prevent other transactions from modifying data read by the current
transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current
transaction from reading rows that have been modified but not committed by
other transactions.
For example, although another user might have a transaction in
progress that performs data modifications, and that transaction is holding
exclusive locks on the data, a
transaction using the READ UNCOMMITTED isolation level can read the data anyway
(a dirty read), and possibly take further actions based on the values read.
The potential problem with dirty
reads is that the user who started the modification transaction might then decide to roll it back so,
logically, those changes never occurred. If we act based on a data value that
essentially never existed, then that decision or action might not be valid.
Let's see how the READ UNCOMMITTED
isolation level behaves. In Listing 1-2, run Step 1 to begin a transaction
(without committing it) and then open a new query window to run Step 2. Use the
IsolationDB database for each connection.
-- Step 1:
-- Start a transaction but don't commit it
USE IsolationDB ;
GO
BEGIN TRAN
UPDATE IsolationTest
SET col2 = 'New Value' ;
--<EXECUTE>
-- Step 2:
-- Start a new connection and change your isolation level
USE IsolationDB ;
GO
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ;
SELECT *
FROM IsolationTest ;
--<EXECUTE>
-- Step 3:
-- Return to the connection from Step 1 and issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
-- Step 4:
-- Rerun the SELECT statement in the connection from Step 2
SELECT *
FROM IsolationTest ;
-- <EXECUTE>
In the results returned from the
query in Step 2, notice that all the values in col2 are the string 'New Value',
even though the transaction in the first connection has not yet committed. In
fact, the transaction might never commit. If we take some action, based on an
understanding that all the values are the same, we could regret it if the
changes turned out not to be permanent. Back in the first connection, roll back
the transaction, as shown in Step 3.
For Step 4, rerun the SELECT
statement in the second connection to see that all the values in col2 have
reverted to their original values. If you're following along with these
examples, make sure you close all your connections before proceeding, so that
we're sure that SQL Server will release any outstanding locks.
Risks:
Dirty Reads:
There are many risks that you run when using the read-uncommitted isolation
level or nolock table hints. The first risk is “Dirty Reads”. A dirty read
occurs when your query reads data that another transaction has modified but
not yet committed. Any data that has been changed, but not yet committed, is
considered to be “dirty”. For some applications this may be acceptable, but
for many applications it can cause a major problem for the users that rely on
the system.
A typical example that is used
throughout the SQL Server community to illustrate this issue is: Suppose you
need to get some repairs done on your wife’s car and you pull up to an ATM
to withdraw the money needed. Your account currently has $300 in it.
You put your debit card into the machine, enter your PIN, and tell the machine
you want to withdraw $200. Your wife then calls you and tells you that the car
is working fine now and you don’t have to worry about getting repairs. You
immediately cancel your transaction and drive off. Little did you know, the
weekly balance report that you requested from your bank began processing after
you entered your withdrawal amount but before you cancelled your transaction.
The report is then emailed to you saying you have a balance of $100.
Immediately you are now concerned and wondering what happened to your other
$200. The issue is that the bank report read ‘dirty’ data that had not yet been
committed and then passed along that ‘dirty’ information to you. This is a
primary example of when reading ‘dirty’ data is not a good idea.
Non-Repeatable Reads:
Non-repeatable reads become an issue when your application has code that must
read the same data multiple times and expects consistent results upon each
read. When using the read-uncommitted isolation level or nolock table hints,
shared (S) locks are not requested on the tables you are reading. As a result,
writers are able to access those tables and modify the data within them in the
middle of your read operations. This can drastically skew your expected results.
To help illustrate this issue: Suppose
I go to the bank to deposit some money into my account. I currently have a
Checking account with a $100 balance and a Savings account with a $100 balance.
I plan to deposit $300 into my checking account. Meanwhile, my wife is at home
on her mobile application checking our account balance. The mobile application
runs a stored procedure which reads the AccountBalance table one time for each
account owned, and then reads the table one final time to give the sum of all
the accounts. The application shows my wife we have $100 in the Checking
account, $100 in the Savings account, and our two accounts have a total value
of $500. Based on the information she received, my wife is completely confused
by this inaccurate data.
What happened is the stored procedure
read the table multiple times expecting the data would be the same in each
read. However, an outside transaction (the deposit) was able to modify the data
in between the multiple reads that the stored procedure issued. The developer
of the stored procedure never anticipated this, but he used a nolock hint on the
table so that the application would not run into any blocking and results would
return more quickly. The developer did indeed prevent blocking, but at the expense
of creating the risk of the application returning inaccurate data.
Same Data is Read Twice:
There are rare occasions when the
same data can be read twice when using the read-uncommitted isolation level or
nolock hint. To illustrate this issue we have to give a little background
first. Clustered indexes are created on SQL Server tables to physically order
the data within the table based on the clustering key. The leaf level of the
index consists of the data pages, which contain the actual data for the table.
Data pages can hold 8 KB of data.
Scenario: You have an ETL process
that will Extract all records from a table, perform some type of
transformation, and then load that data into another table. There are two types
of scans that occur in SQL Server to read data: allocation scans and range
scans. Range scans occur when you have a specific filter (where clause) for the
data you are reading, and an index can be used to help seek out those specific
records. When you do not have a filter, an allocation scan is used to scan all
of the data pages that have been allocated to that table. Provided you are not
doing any sort operations, your query will read the data pages in the
order it finds them on the disk. For simplicity, let’s assume there is no
fragmentation so your data pages are in order 1-10. So far your process has
read pages 1-6. Remember your process is not requesting shared (S) locks so you
are not blocking other users. Meanwhile, another process begins which inserts
records into your table. This process attempts to insert records onto Page 3,
but the page is full and the record will not fit. As a result, the page has to
be split: half of the records will remain on Page 3 and the other records
will be moved to a new page, which will be Page 11. Your process has already read
the data that was on Page 3, but now half of that data has been moved to page
11. As a result, as your process continues it will read Page 11 which contains
data that has already been read. If there is no type of checks on the
destination table, you will end up with bad duplicate data.
This is an example of how changing
the locking mechanism can cause you to read duplicate data into your process
and reduce the integrity of the data.
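The page split itself is hard to reproduce on demand, but we can at least watch the conditions that lead to it. A hedged diagnostic sketch (MyTable is a placeholder): this query reports the page count and fragmentation of a table, both of which change when inserts split pages:
SELECT index_id, page_count, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('MyTable'),
        NULL, NULL, 'LIMITED') ;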
Phantom Reads:
A phantom read occurs when you read
data that is there one minute, and gone the next. A phantom read can become
problematic when you have a process that performs some type of operation based
on the data that it has read. To illustrate this issue, suppose you have stock
in CompanyA and CompanyA decides to pay all of its stock holders a dividend.
There is a stored procedure that first reads a table with all of the current
stock holders and builds a temp table based on that list. Then the stored
procedure initiates a dividend payout process for all of the accounts in the
temp table. In the middle of this procedure running, a stockholder sells all
of his shares in the company. The sale of the stock did not complete until
after the temp table was built, but before the dividend process completed. This
will cause an issue with the dividend process because it will not be able to
pay the dividend since the person no longer owns the stock.
Using the read uncommitted isolation
level or nolock hint allowed the user to sell his stock at the same time the
dividend process was running. If this isolation level or hint had not been used,
the user would have been blocked and unable to sell his stock until after the
dividend payout completed.
Advantages and Disadvantages:
• The primary advantages of read uncommitted isolation are the reduced
potential for blocking and deadlocking due to incompatible locks (including
unnecessary blocking due to lock escalation), and possibly increased
performance (by avoiding the need to acquire and release shared locks).
• The most obvious potential drawback of read uncommitted isolation is (as
the name suggests) that we might read uncommitted data (even data that is
never committed, in the case of a transaction rollback). In a database where
rollbacks are relatively rare, the question of reading uncommitted data might
be seen as a mere timing issue, since the data in question will surely be
committed at some stage, and probably quite soon. We have already seen
timing-related inconsistencies in the row-counting example (which was
operating at a higher isolation level) so one might well question how much of
a concern it is to read data "too soon."
Clearly the answer depends on local
priorities and context, but an informed decision to use read uncommitted
isolation certainly seems possible. There is more to think about though. The
SQL Server implementation of the read uncommitted isolation level includes some
subtle behaviours that we need to be aware of before making that "informed
choice."
Misconceptions
Using nolock table hints for update or delete statements
Sometimes I see developers create
code which performs Update or Delete statements and use the nolock table hint.
It is important to note that this hint only works for read-only operations.
Anytime you are modifying data, such as in an Update or Delete statement, this
hint is completely ignored by SQL Server. When performing these types of
operations, locks have to be generated to ensure that transactions are atomic,
consistent, isolated, and durable.
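A brief sketch of what this looks like (MyTable is a placeholder); per the explanation above, the hint on the target table is ignored, and SQL Server still takes the exclusive locks the modification needs:
UPDATE MyTable WITH ( NOLOCK )  -- ignored on the target of a modification
SET col2 = 'x'
WHERE col1 = 10 ;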
Nolock Table Hint
The nolock table hint behind the
scenes performs the exact same action as running under the read-uncommitted
isolation level. The only difference between the two is that the
read-uncommitted isolation level determines the locking mechanism for the
entire connection and the nolock table hint determines the locking mechanism
for the table that you give the hint to. For example, if you have a stored
procedure that will read data from 20 tables, the read-uncommitted isolation
level would cause every table to be accessed without requesting shared (S)
locks. If you only need a subset of those 20 tables to not acquire shared (S)
locks, but have other tables that you need to only read committed data, then
you would want to use the nolock hint instead.
The syntax for using a nolock hint is below:
SELECT * FROM DatabaseName.dbo.TableName WITH (NOLOCK)
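For instance, in the 20-table scenario just described, the hint can be applied selectively in a join (table and column names here are hypothetical), so that only one of the tables is read without shared locks:
SELECT o.OrderID, c.CustomerName
FROM Orders AS o WITH ( NOLOCK )  -- read without requesting shared locks
JOIN Customers AS c               -- read at the session's isolation level
    ON c.CustomerID = o.CustomerID ;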
READ COMMITTED
READ COMMITTED is SQL Server's
default isolation level. It ensures that an operation will never read data another
transaction has changed but not committed. However, because SQL Server holds
locks for SELECT operations for only a short time, if a transaction running
with READ COMMITTED isolation re-reads data, that data might have changed, or
new rows might appear that meet the criteria of the original query.
So READ COMMITTED behavior has two
aspects. Firstly, it prevents dirty reads but, secondly, it still allows
non-repeatable reads and phantom reads.
To see the first aspect, we can
simply repeat the previous example, but with the second connection using the
default READ COMMITTED isolation level, rather than READ UNCOMMITTED, as shown
by Steps 1 through 4, in Listing 1-3. The second connection blocks on its
SELECT statement; it can't read the changes the first connection has made but
not yet committed (or rolled back). Once we roll back the transaction, in Step
3, the query in Step 2 completes and returns the original data.
-- Step 1:
-- Start a transaction but don't commit it
USE IsolationDB ;
GO
BEGIN TRAN
UPDATE IsolationTest
SET col2 = 'New Value' ;
--<EXECUTE>
-- Step 2:
-- Start a new connection and change your isolation level
USE IsolationDB ;
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;
SELECT *
FROM IsolationTest ;
--<EXECUTE>
-- You should notice that the process blocks, and returns
-- no data or messages!
-- To finish up, perform the following two steps:
-- Step 3:
-- Return to the connection from Step 1 and issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
-- Step 4:
-- Rerun the SELECT statement in the connection from Step 2
SELECT *
FROM IsolationTest ;
-- <EXECUTE>
-- Verify that the data is available
To see the second aspect of READ
COMMITTED behavior (specifically, non-repeatable reads), close all the
connections from the previous example, and open two new connections, using
IsolationDB again. Listing 1-4 shows the code. In the first connection, Step 1
will make sure the isolation level is the default READ COMMITTED, and then it
will start a transaction that reads data from the IsolationTest table to
compute an average.
In the second connection, Step 2 will
UPDATE the table. Assuming that the query in Step 1 has finished processing,
the UPDATE will succeed, even though the first connection is still inside a
transaction. Note that the UPDATE is an auto-commit transaction and so SQL
Server will commit the UPDATE and release the locks as soon as it completes.
In Step 3, return to the first
connection and run the same SELECT statement. The average value is now
different and we have a non-repeatable read. The default READ COMMITTED
isolation level prevents other connections from reading data being modified,
but only prevents other connections from changing data being read, while the
read operation is in progress. Once it is complete, other transactions can
change the data, even if the reading transaction is still open. As a result,
there is no guarantee that we'll see the same data if we rerun the SELECT
within the transaction.
-- Step 1:
-- Read data in the default isolation level
USE IsolationDB
SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;
BEGIN TRAN
SELECT AVG(col1)
FROM IsolationTest ;
--<EXECUTE>
-- Step 2:
-- In a new connection, update the table:
USE IsolationDB ;
UPDATE IsolationTest
SET col1 = 500
WHERE col1 = 50 ;
--<EXECUTE>
-- Step 3:
-- Go back to the first connection and
-- run the same SELECT statement:
SELECT AVG(col1)
FROM IsolationTest ;
--<EXECUTE>
-- Step 4:
-- issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
The isolation level READ COMMITTED
guarantees only that a transaction will not read uncommitted data. However, the
ANSI SQL specification does not specify any particular mechanism that a
database system should use to implement READ COMMITTED isolation, and so
prevent dirty reads.
As of SQL Server 2005, SQL Server
provides two different ways of preventing a transaction from reading dirty
data in the READ COMMITTED isolation level. The default method we have just
seen, using pessimistic concurrency, locks the data modified inside a
transaction, and the locks keep other processes from accessing that data. It
also takes shared locks for short periods to prevent data being modified while
it is being read.
READ_COMMITTED_SNAPSHOT:
In SQL Server 2005, the alternative form of READ COMMITTED, called
READ_COMMITTED_SNAPSHOT, uses optimistic concurrency.
As one would expect, its behavior is the same as the default in terms of the
read phenomena, i.e. it prevents dirty reads, but allows non-repeatable reads
and phantom reads. However, this optimistic implementation of the READ
COMMITTED level prevents dirty reads, without blocking other transactions.
-- Step 1:
-- First close all other connections to make sure no one
-- is using the IsolationDB database
-- Step 2:
-- Change the database option to enable "read committed snapshot"
ALTER DATABASE IsolationDB SET READ_COMMITTED_SNAPSHOT ON ;
--<EXECUTE>
-- Step 3:
-- Start a transaction but don't commit it
USE IsolationDB ;
GO
BEGIN TRAN
UPDATE IsolationTest
SET col2 = 'New Value' ;
--<EXECUTE>
-- Step 4:
-- Start a new connection and change your isolation level
USE IsolationDB ;
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;
SELECT *
FROM IsolationTest ;
--<EXECUTE>
-- You should notice that the second connection is not blocked, but
-- it does not return the changed data. The results you get are the
-- original committed data, before the UPDATE in Step 3 was performed.
-- To finish up, perform the following steps:
-- Step 5:
-- Return to the connection from Step 3 and issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
-- Step 6:
-- Now close all other connections to make sure no one
-- is using the IsolationDB database
-- Step 7:
-- Change the database option to disable "read committed snapshot"
ALTER DATABASE IsolationDB SET READ_COMMITTED_SNAPSHOT OFF ;
--<EXECUTE>
REPEATABLE READ
The REPEATABLE READ isolation level
adds to the properties of READ COMMITTED by ensuring that if a transaction
re-reads data, or if a query is reissued within the same transaction, then the
same data will be returned. In other words, issuing the same query twice within
a transaction won't pick up any changes to data values that were made by
another transaction. A second transaction cannot modify the data that a first
transaction has read, as long as that first transaction has not yet committed
or rolled back.
To see REPEATABLE READ behavior, close all the connections to the IsolationDB
database, and open two new ones. Steps 1 through 3 in Listing 1-6 will issue
the same two queries as in Listing 1-4, but this time, the first connection
will set the isolation level to REPEATABLE READ in Step 1.
In Step 2, the second connection will
have to use a slightly different UPDATE statement, because the value of 50 for
col1 no longer exists. This UPDATE will block when it tries to modify the
IsolationTest table. In Step 3, the first connection will get the same result
when it reissues its original SELECT.
-- Step 1:
-- Read data in the Repeatable Read isolation level
USE IsolationDB ;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ ;
BEGIN TRAN
SELECT AVG(col1)
FROM IsolationTest ;
--<EXECUTE>
-- Step 2:
-- In the second connection, update the table:
USE IsolationDB ;
UPDATE IsolationTest
SET col1 = 5000
WHERE col1 = 500 ;
--<EXECUTE>
-- You should notice that the UPDATE process blocks,
-- and returns no data or messages
-- Step 3:
-- Go back to the first connection and
-- run the same SELECT statement:
SELECT AVG(col1)
FROM IsolationTest ;
--<EXECUTE>
-- Step 4:
-- issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
Preventing non-repeatable reads, or
allowing the first connection to make sure it will reread the same data, is a
desirable safeguard, but it comes at a price. The cost of this extra safeguard
is that SQL Server holds all the shared locks in a transaction until the
completion (COMMIT or ROLLBACK) of the transaction.
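We can observe those retained locks directly. As a hedged diagnostic sketch: while the REPEATABLE READ transaction in Listing 1-6 is still open, running the following from another connection should list the shared (S) locks the reading session continues to hold:
SELECT request_session_id, resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID('IsolationDB') ;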
However, REPEATABLE READ isolation
doesn't prevent all possible read phenomena. It protects only the data that has
already been read. Listing 1-7 demonstrates what this protection means. Close
all connections, and open two new ones connecting to IsolationDB. In the first
connection, start a transaction in REPEATABLE READ isolation level and look for
all rows that meet a certain condition, as shown in Step 1. In the second
connection, Step 2 will insert a new row. Now go back to the first connection,
and re-execute the SELECT in Step 3.
-- Close all connections and open two new ones
-- Step 1:
USE IsolationDB ;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT *
FROM IsolationTest
WHERE col1 BETWEEN 20 AND 40
--<EXECUTE>
-- Step 2:
-- In the second connection, insert new data
USE IsolationDB ;
INSERT INTO IsolationTest
VALUES ( 25, 'New Row' ) ;
--<EXECUTE>
-- Step 3:
-- Go back to the first connection and rerun the SELECT
SELECT *
FROM IsolationTest
WHERE col1 BETWEEN 20 AND 40 ;
--<EXECUTE>
-- Notice one additional row
-- Step 4:
-- issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
Upon the second execution of the same
SELECT statement, the new row appears, called a phantom. The row didn't even
exist the first time we ran the SELECT statement, so it wasn't locked. We can
prevent phantoms with the SERIALIZABLE isolation level.
SERIALIZABLE:
The SERIALIZABLE isolation level
ensures that, if a query is reissued, no data will have changed and no new rows
will appear. In other words, we won't see phantoms if the same query is issued
twice within a transaction. In Listing 1-8, we rerun the example from Listing
1-7, inserting a row with a col1 value of 35, but this time setting the
isolation level to SERIALIZABLE in the first connection. The second connection
will block when we try to do the INSERT, and the first connection will read
exactly the same rows each time.
-- Open two new connections
-- Step 1:
-- In the first connection, start a transaction
USE IsolationDB ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
BEGIN TRAN
SELECT *
FROM IsolationTest
WHERE col1 BETWEEN 20 AND 40 ;
--<EXECUTE>
-- Step 2:
-- In the second connection, insert new data
USE IsolationDB ;
INSERT INTO IsolationTest
VALUES ( 35, 'Another New Row' ) ;
-- Notice that the INSERT will block
--<EXECUTE>
-- Step 3:
-- Go back to the first connection and rerun the SELECT
SELECT *
FROM IsolationTest
WHERE col1 BETWEEN 20 AND 40 ;
--<EXECUTE>
-- Notice no new rows
-- Step 4:
-- issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
Again, we pay a price to prevent
phantoms. In addition to locking all the data that has been read, enforcing the SERIALIZABLE
isolation level, and so preventing phantoms, requires that SQL Server also lock
data that doesn't exist (see the Key-range Locks section in Chapter 3). The SERIALIZABLE
level gets its name from the fact that running multiple SERIALIZABLE transactions
at the same time is the equivalent of running them one at a time – that is,
serially.
SNAPSHOT:
SQL Server 2005 introduced an entirely new isolation level, called SNAPSHOT
isolation. The only implementation of snapshot isolation uses optimistic
concurrency, so we'll save the discussion of this level until Chapter 6, on
optimistic concurrency.