Thursday, 9 June 2016

Transaction Isolation Levels

The simplest definition of a transaction is that it is a single unit of work; a task or set of tasks that together form an "all-or-nothing" operation. If some event interrupts a transaction in the middle, so that not all of it was completed, the system should treat the transaction as if it never occurred at all. A transaction can be short, like changing the price of one book in the inventory, or long, like updating the quantity sold of every inventory item at the beginning of an accounting period.
Transactions have four basic properties, called the ACID properties, which guarantee the validity of the data after the completion of any transaction.
Atomicity:
A transaction is treated as a single unit of work. Either it completes entirely, or the system has no "memory" of it happening at all. This applies to transactions of any size, whether two rows are being inserted, or 10 million rows are being updated.
Consistency:
The consistency property ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including but not limited to constraints, cascades, triggers, and any combination thereof. This does not guarantee correctness of the transaction in all ways the application programmer might have wanted (that is the responsibility of application-level code) but merely that any programming errors do not violate any defined rules.
Isolation:
The isolation property ensures that the concurrent execution of transactions results in a system state that would be obtained if the transactions were executed serially, i.e. one after the other. Providing isolation is the main goal of concurrency control. Depending on the concurrency control method, the effects of an incomplete transaction might not even be visible to another transaction.
Durability:
The durability property means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. In a relational database, for instance, once a group of SQL statements executes, the results need to be stored permanently (even if the database crashes immediately thereafter). To defend against power loss, transactions (or their effects) must be recorded in non-volatile storage.


Example:
The database table has two columns, A and B. An integrity constraint requires that the value in A and the value in B must sum to 100.
CREATE TABLE acidtest (A INTEGER, B INTEGER CHECK (A + B = 100));

Atomicity failure:

Assume that a transaction attempts to subtract 10 from A and add 10 to B. This is a valid transaction, since the data continue to satisfy the constraint after it has executed. However, assume that after removing 10 from A, the transaction is unable to modify B. If the database retained A's new value, atomicity and the constraint would both be violated. Atomicity requires that both parts of this transaction, or neither, be complete.
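As a minimal sketch (not part of the original example), the transfer can be written as an explicit T-SQL transaction against the acidtest table. Because the CHECK constraint is evaluated at the end of each statement, both columns must be changed in a single UPDATE; the TRY...CATCH block rolls the whole transaction back if any part fails, preserving atomicity.

BEGIN TRY
    BEGIN TRANSACTION ;
    -- Both columns change in one statement, so A + B = 100 still
    -- holds when the constraint is checked.
    UPDATE acidtest
    SET    A = A - 10 ,
           B = B + 10 ;
    COMMIT TRANSACTION ;
END TRY
BEGIN CATCH
    -- Any failure undoes the entire transaction.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION ;
END CATCH ;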

Consistency failure:

Consistency is a very general term which demands that the data must meet all validation rules. In the previous example, the validation is a requirement that A + B = 100. Also, it may be inferred that both A and B must be integers. A valid range for A and B may also be inferred. All validation rules must be checked to ensure consistency.
Assume that a transaction attempts to subtract 10 from A without altering B. Because consistency is checked after each transaction, it is known that A + B = 100 before the transaction begins. If the transaction removes 10 from A successfully, atomicity will be achieved. However, a validation check will show that A + B = 90, which is inconsistent with the rules of the database. The entire transaction must be cancelled and the affected rows rolled back to their pre-transaction state. If there had been other constraints, triggers, or cascades, every single change operation would have been checked in the same way as above before the transaction was committed.
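To make the failure concrete, here is a sketch, assuming the acidtest table contains a row that currently satisfies the constraint. The single-column change violates the CHECK constraint, so SQL Server rejects the statement with a constraint-violation error and the row is left unchanged.

-- Fails: the new values would violate CHECK (A + B = 100),
-- so the statement is rolled back and the data stays consistent.
UPDATE acidtest
SET    A = A - 10 ;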

Isolation failure:

To demonstrate isolation, we assume two transactions execute at the same time, each attempting to modify the same data. One of the two must wait until the other completes in order to maintain isolation.
Consider two transactions. T1 transfers 10 from A to B. T2 transfers 10 from B to A. Combined, there are four actions:
  • T1 subtracts 10 from A.
  • T1 adds 10 to B.
  • T2 subtracts 10 from B.
  • T2 adds 10 to A.
If these operations are performed in order, isolation is maintained, although T2 must wait. Consider what happens if T1 fails half-way through. The database eliminates T1's effects, and T2 sees only valid data.
By interleaving the transactions, the actual order of actions might be:
  • T1 subtracts 10 from A.
  • T2 subtracts 10 from B.
  • T2 adds 10 to A.
  • T1 adds 10 to B.
Again, consider what happens if T1 fails halfway through. By the time T1 fails, T2 has already modified A; it cannot be restored to the value it had before T1 without leaving an invalid database. This is known as a write-write failure, because two transactions attempted to write to the same data field. In a typical system, the problem would be resolved by reverting to the last known good state, canceling the failed transaction T1, and restarting the interrupted transaction T2 from the good state.

Durability failure:

Assume that a transaction transfers 10 from A to B. It removes 10 from A. It then adds 10 to B. At this point, a "success" message is sent to the user. However, the changes are still queued in the disk buffer waiting to be committed to the disk. Power fails and the changes are lost. The user assumes (understandably) that the changes have been made.


Transaction scope
SQL Server supports several different ways to define the beginning and end of a transaction. Two methods are available by default, and two are only available under specific conditions.
The default types of transactions are auto-commit transactions and explicit transactions.
An auto-commit transaction is any single data modification operation. In other words, any INSERT, UPDATE or DELETE statement (as well as others, such as MERGE and BULK INSERT), by itself, is automatically a transaction. If we modify one row, or one million rows, in a single UPDATE statement, SQL Server will consider the UPDATE operation to be an atomic operation, and will modify either all the rows or none of the rows. If there is a server failure in the middle of the modification operation then, when SQL Server recovers, it will be as if no modifications ever happened. With an auto-commit transaction, there is no way to force a rollback manually. A transaction rollback will only occur when there is a system failure.
An explicit transaction uses the BEGIN TRANSACTION (or BEGIN TRAN) statement to indicate the beginning of the transaction, and either a COMMIT TRANSACTION or a ROLLBACK TRANSACTION statement to indicate the end. In between, the transaction can include any number of statements.
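For example, a minimal explicit transaction might look like the following sketch; the Inventory table and its columns are hypothetical.

BEGIN TRANSACTION ;
-- Both statements succeed or fail together.
UPDATE Inventory SET QuantityOnHand = QuantityOnHand - 10 WHERE ItemID = 1 ;
UPDATE Inventory SET QuantityOnHand = QuantityOnHand + 10 WHERE ItemID = 2 ;
COMMIT TRANSACTION ;  -- or ROLLBACK TRANSACTION to undo both updates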
The non-default types of transactions are implicit transactions and batch-scoped transactions.
For implicit transactions, a session must be in implicit transaction mode, invoked with a SET option: SET IMPLICIT_TRANSACTIONS ON. In implicit transaction mode, the start of any transaction is implied. In other words, any data manipulation language (DML) statement (such as INSERT, UPDATE, DELETE and even SELECT) will automatically start a transaction. Although, in this mode, the start of the transaction is implied, the end of the transaction must be explicit, and the transaction is not finished until we issue either a ROLLBACK TRAN or COMMIT TRAN. This mode is mainly for use by developers who have come to SQL Server from other database management systems, such as Oracle or DB2, which deal with transactions in a different way. However, I strongly recommend that you get used to working with SQL Server's default transaction management options because all the documentation and all books and magazine articles about SQL Server assume you are using that mode. If you must use implicit transaction mode for compatibility with other systems or applications, you'll probably be better off not mixing and matching the two modes, but rather having all your sessions and all your transactions using implicit transaction mode.
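A brief sketch of implicit transaction mode (the table name is hypothetical): the UPDATE below silently starts a transaction, which stays open until we end it explicitly.

SET IMPLICIT_TRANSACTIONS ON ;
-- This statement implicitly starts a transaction...
UPDATE SomeTable SET SomeCol = 'x' WHERE SomeKey = 1 ;
-- ...which remains open until we end it explicitly.
COMMIT TRAN ;  -- or ROLLBACK TRAN
SET IMPLICIT_TRANSACTIONS OFF ;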
Batch-scoped transactions, introduced in SQL Server 2005, are invoked by requesting the option Multiple Active Result Sets (or MARS) in the client connection string. In those connections, SQL Server will roll back any batch that includes a BEGIN TRAN but does not include a COMMIT TRAN. The purpose of MARS is to avoid a problem called "application deadlock," which we'll discuss in Chapter 4, in the section on sharing locks across connections.






Transaction isolation

Every transaction runs in one particular transaction isolation level, which determines how sensitive your application is to changes made by other users' transactions, and how long SQL Server must hold locks to protect against these changes. The ANSI SQL standard defines four levels of isolation for transactions. SQL Server supports all four of these levels, listed in order of increasing restrictiveness, in terms of the read phenomena permitted:

  • READ UNCOMMITTED – allows dirty reads, non-repeatable reads, and phantom reads.

  • READ COMMITTED – prevents dirty reads; allows non-repeatable reads and phantom reads.

  • REPEATABLE READ – prevents dirty reads and non-repeatable reads, but allows phantom reads.

  • SERIALIZABLE – prevents all three read phenomena.

With the exception of READ UNCOMMITTED, each of these isolation levels is pessimistic in nature. In other words, when transactions are operating in one of these modes, SQL Server will acquire shared and exclusive locks in order to prevent data being read that is currently being modified by another transaction, and to prevent other transactions modifying data that is currently being read. In addition, SQL Server 2005 (and later) offers a new optimistic isolation level, called SNAPSHOT isolation, plus an optimistic alternative to READ COMMITTED isolation (READ_COMMITTED_SNAPSHOT), both of which can ensure consistent results without the need to acquire shared locks, and so can enhance concurrency.



Controlling the isolation level
SQL Server's default isolation level is READ COMMITTED, but an application can override this setting by using the following SET command:
SET TRANSACTION ISOLATION LEVEL [READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SNAPSHOT | SERIALIZABLE]
The SET command will control the isolation level for all queries submitted over the current connection, until the isolation level is changed, or the connection is terminated. In other words, every time an application makes a new connection (or we open a new query editor window in SSMS), it starts a new session in the SQL Server instance, and any transactions within that new session will use the READ COMMITTED isolation level, by default.
You may be wondering if there is a way to change SQL Server's isolation level server-wide so that, by default, every connection uses an isolation level other than READ COMMITTED. The answer is no; the isolation level must be set at the connection level, or within a query. We can control the isolation level for individual queries by using lock hints, as the sketch below shows.
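For example, here is a minimal sketch of a table hint that raises the isolation level for a single query. REPEATABLEREAD is a standard SQL Server table hint; the IsolationTest table is created in Listing 1-1 below.

-- The hint applies only to this query, regardless of the
-- session's isolation level.
SELECT *
FROM   IsolationTest WITH ( REPEATABLEREAD ) ;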
Preventable read phenomena
The easiest way to define the differences between the various ANSI isolation levels is to describe the set of behaviors that are either permitted or forbidden, depending on which isolation level is in use. The three behaviors, also called "preventable read phenomena," are:
• Dirty reads
• Non-repeatable reads
• Phantom reads.



Dirty reads:
This behavior occurs when a transaction reads uncommitted data. If one transaction has changed data but not committed the change, and another transaction is allowed to read that changed data, then there is a strong possibility that the data will be read in an inconsistent state.
For example, consider a stock management application for a factory that receives and distributes SuperWidgets. A number of sales clerks log deliveries and shipments, updating the SuperWidgets inventory item, as appropriate.
Currently, there are only 25 widgets in the stock inventory database, but a new shipment of 50 widgets is just in, so Clerk A starts a transaction and issues an UPDATE to reflect a new stock level of 75. At that point, Clerk B receives an order for 60 widgets and checks the inventory. If Clerk B's transaction permits dirty reads, Clerk B would see 75 widgets and so could authorize the sale for next-day delivery to a customer. Meanwhile, just as Clerk A prepares to confirm the stock update transaction, he receives a message that a fault has been detected with the batch of widgets, and that they need to be returned to the manufacturer. As a result, he cancels (rolls back) the transaction; Clerk B has authorized an order that the company cannot fulfill, due to insufficient stock.
By default, SQL Server does not allow dirty reads. Keep in mind that the transaction updating the data has no control over whether or not another transaction can read its data before it's committed. The decision regarding whether or not to read "dirty" data lies entirely in the hands of the reading transaction.

Non-repeatable reads:
This behavior is also called inconsistent analysis. A read is non-repeatable if a query might get different values when reading the same data in two separate reads within the same transaction. This can happen when a separate transaction updates the same data, after the first read but before the second read.
In the receiving room example, suppose that a manager comes in to do a spot check of the current inventory. She walks up to each clerk, asking the total number of widgets received that day, and adding the numbers on her calculator. When she's done, she wants to double-check the result, so she goes back to the first clerk. However, if Clerk A received more widgets between the manager's first and second inquiries, the total will be different each time and the reads are non-repeatable.

Phantom reads:
This behavior occurs when membership in a set changes. It can happen only when a query with a predicate, such as WHERE count_of_widgets < 10, is involved. A phantom occurs if two SELECT operations using the same predicate in the same transaction return a different number of rows. For example, let's say that our manager is still doing spot checks of inventory. This time, she goes around the receiving room and notes which clerks have fewer than ten widgets. After she completes the list, she goes back around to offer advice to everyone with a low total. However, imagine that during her first walkthrough the manager failed to include in her list a clerk who had just returned from a break, and had fewer than ten widgets. This additional clerk (or row) is a phantom.
To see the behavior in each ANSI isolation level, we'll look at some example code. First, create a table called IsolationTest in a database called IsolationDB and populate the table with a few rows, by running the code in Listing 1-1. I'll refer to the IsolationTest table in examples for each of the four ANSI isolation levels.
-- Listings from "Concurrency in SQL Server"

-- Listing 1-1: Create a database and table for running the isolation level exercises
-- Create a database and table for testing the isolation levels
USE master
GO
IF EXISTS ( SELECT  1
            FROM    sys.databases
            WHERE   name = 'IsolationDB' )
    DROP DATABASE IsolationDB ;
GO
CREATE DATABASE IsolationDB ;
GO
USE IsolationDB ;
GO
CREATE TABLE IsolationTest
    (
      col1 INT PRIMARY KEY ,
      col2 VARCHAR(20)
    ) ;
GO

INSERT  INTO IsolationTest
VALUES  ( 10, 'The first row' ) ;
INSERT  INTO IsolationTest
VALUES  ( 20, 'The second row' ) ;
INSERT  INTO IsolationTest
VALUES  ( 30, 'The third row' ) ;
INSERT  INTO IsolationTest
VALUES  ( 40, 'The fourth row' ) ;
INSERT  INTO IsolationTest
VALUES  ( 50, 'The fifth row' ) ;
GO




READ UNCOMMITTED:
The READ UNCOMMITTED isolation level allows a transaction to read any data currently on a data or index page, regardless of whether or not the transaction that wrote that data has committed.
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions.
For example, although another user might have a transaction in progress that performs data modifications, and that transaction is holding exclusive locks on the data, a transaction using the READ UNCOMMITTED isolation level can read the data anyway (a dirty read), and possibly take further actions based on the values read.
The potential problem with dirty reads is that the user who started the modification transaction might then decide to roll it back so, logically, those changes never occurred. If we act based on a data value that essentially never existed, then that decision or action might not be valid.
Let's see how the READ UNCOMMITTED isolation level behaves. In Listing 1-2, run Step 1 to begin a transaction (without committing it) and then open a new query window to run Step 2. Use the IsolationDB database for each connection.
-- Step 1:
-- Start a transaction but don't commit it
USE IsolationDB ;
GO
BEGIN TRAN
UPDATE IsolationTest
SET col2 = 'New Value' ;
--<EXECUTE>

-- Step 2:
-- Start a new connection and change your isolation level
USE IsolationDB ;
GO
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ;
SELECT *
FROM IsolationTest ;
--<EXECUTE>

-- Step 3:
-- Return to the connection from Step 1 and issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>

-- Step 4:
-- Rerun the SELECT statement in the connection from Step 2
SELECT *
FROM IsolationTest ;
-- <EXECUTE>
In the results returned from the query in Step 2, notice that all the values in col2 are the string 'New Value', even though the transaction in the first connection has not yet committed. In fact, the transaction might never commit. If we take some action, based on an understanding that all the values are the same, we could regret it if the changes turned out not to be permanent. Back in the first connection, roll back the transaction, as shown in Step 3.
For Step 4, rerun the SELECT statement in the second connection to see that all the values in col2 have reverted to their original values. If you're following along with these examples, make sure you close all your connections before proceeding, so that we're sure that SQL Server will release any outstanding locks.




Risks:
Dirty Reads:
There are many risks that you run when using the read-uncommitted isolation level or nolock table hints. The first risk is "dirty reads." A dirty read occurs when your query reads a data page that is different in memory than the page on disk. Any data that has been changed, but not yet committed, is considered to be "dirty." For some applications this may be acceptable, but for many applications, this can cause a major problem for the users that rely on the system.
A typical example used throughout the SQL Server community to illustrate this issue is: suppose you need to get some repairs done on your wife's car and you pull up to an ATM to withdraw the money needed. Your account currently has $300 in it. You put your debit card into the machine, enter your PIN, and tell the machine you want to withdraw $200. Your wife then calls you and tells you that the car is working fine now and you don't have to worry about getting repairs. You immediately cancel your transaction and drive off. Little did you know, the weekly balance report that you requested from your bank began processing after you entered your withdrawal amount but before you cancelled your transaction. The report is then emailed to you saying you have a balance of $100. Immediately you are concerned and wondering what happened to your other $200. The issue is that the bank report read "dirty" data that had not yet been committed and then passed along that "dirty" information to you. This is a prime example of when reading "dirty" data is not a good idea.
Non-Repeatable Reads:
Non-repeatable reads become an issue when your application runs code that needs to read the same data multiple times and get the same, consistent results upon each read. When using the read-uncommitted isolation level or nolock table hints, shared (S) locks are not requested on the tables you are reading. As a result, writers are able to access those tables and modify the data within them in the middle of your read operations. This can drastically skew your expected results.
To help illustrate this issue: suppose I go to the bank to deposit some money into my account. I currently have a Checking account with a $100 balance and a Savings account with a $100 balance. I plan to deposit $300 into my Checking account. Meanwhile, my wife is at home on her mobile application checking our account balance. The mobile application runs a stored procedure which reads the AccountBalance table one time for each account that you own, and then reads the table one final time to give you the sum of all your accounts. The application shows my wife we have $100 in the Checking account, $100 in the Savings account, and a total value of $500 across our two accounts. My wife is completely confused by this inaccurate data.
What happened is that the stored procedure read the table multiple times, expecting the data to be the same in each read. However, an outside transaction (the deposit) was able to modify the data in between the multiple reads that the stored procedure issued. The developer of the stored procedure never anticipated this; he used a nolock hint on the table so that the application would not run into any blocking and results would return more quickly. The developer did indeed prevent blocking, but at the expense of creating the risk of the application returning inaccurate data.
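A hypothetical sketch of the reads the stored procedure might issue; the AccountBalance table and its columns are illustrative, not from a real system. Because nolock takes no shared (S) locks, the deposit can commit between the statements, so the final SUM can disagree with the two individual reads.

SELECT Balance
FROM   AccountBalance WITH ( NOLOCK )
WHERE  AccountType = 'Checking' ;       -- returns 100

SELECT Balance
FROM   AccountBalance WITH ( NOLOCK )
WHERE  AccountType = 'Savings' ;        -- returns 100

-- ...the $300 deposit commits here...

SELECT SUM(Balance)
FROM   AccountBalance WITH ( NOLOCK ) ; -- returns 500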
Same Data is Read Twice:
There are rare occasions when the same data can be read twice when using the read-uncommitted isolation level or nolock hint. To illustrate this issue, we have to give a little background first. Clustered indexes are created on SQL Server tables to physically order the data within the table based on the cluster key. The leaf level of the index consists of the data pages, which contain the actual data for the table. Each data page can hold 8 KB worth of data.
Scenario: you have an ETL process that will extract all records from a table, perform some type of transformation, and then load that data into another table. There are two types of scans that SQL Server uses to read data: allocation scans and range scans. Range scans occur when you have a specific filter (WHERE clause) for the data you are reading, and an index can be used to help seek out those specific records. When you do not have a filter, an allocation scan is used to scan all of the data pages that have been allocated to that table. Provided you are not doing any sort operations, your process will read the data pages in the order it finds them on disk. For simplicity, let's assume there is no fragmentation, so your data pages are in order 1-10. So far your process has read pages 1-6. Remember, your process is not requesting shared (S) locks, so you are not blocking other users. Meanwhile, another process begins, inserting records into your table. This process attempts to insert records onto page 3, but the page is full and the record will not fit. As a result, the page has to be split: half of the records will remain on page 3, and the other records will be moved to a new page, page 11. Your process has already read the data that was on page 3, but now half of that data has been moved to page 11. As a result, as your process continues, it will read page 11, which contains data that has already been read. If there are no checks on the destination table, you will end up with bad duplicate data.

This is an example of how changing the locking mechanism can cause you to read duplicate data into your process and reduce the integrity of the data.
Phantom Reads:
A phantom read occurs when you read data that is there one minute and gone the next. A phantom read can become problematic when you have a process that performs some type of operation based on the data that it has read. To illustrate this issue, suppose you have stock in CompanyA and CompanyA decides to pay all of its stockholders a dividend. There is a stored procedure that first reads a table with all of the current stockholders and builds a temp table based on that list. Then the stored procedure initiates a dividend payout process for all of the accounts in the temp table. In the middle of this procedure running, a stockholder sells all of his shares in the company. The sale of the stock did not complete until after the temp table was built, but before the dividend process completed. This will cause an issue with the dividend process, because it will not be able to pay the dividend since the person no longer owns the stock.
Using the read-uncommitted isolation level or nolock hint allowed the user to sell his stock at the same time the dividend process was running. If this isolation level or hint had not been used, the user would have been blocked and unable to sell his stock until after the dividend payout completed.
Advantages and Disadvantages:
  • The primary advantages of read uncommitted isolation are the reduced potential for blocking and deadlocking due to incompatible locks (including unnecessary blocking due to lock escalation), and possibly increased performance (by avoiding the need to acquire and release shared locks).

  • The most obvious potential drawback of read uncommitted isolation is (as the name suggests) that we might read uncommitted data (even data that is never committed, in the case of a transaction rollback). In a database where rollbacks are relatively rare, the question of reading uncommitted data might be seen as a mere timing issue, since the data in question will surely be committed at some stage, and probably quite soon. We have already seen timing-related inconsistencies in the row-counting example (which was operating at a higher isolation level) so one might well question how much of a concern it is to read data "too soon."
Clearly the answer depends on local priorities and context, but an informed decision to use read uncommitted isolation certainly seems possible. There is more to think about though. The SQL Server implementation of the read uncommitted isolation level includes some subtle behaviours that we need to be aware of before making that "informed choice."

Misconceptions
Using nolock table hints for update or delete statements
Sometimes I see developers write code that performs UPDATE or DELETE statements with the nolock table hint. It is important to note that this hint only works for read-only operations. Any time you are modifying data, such as in an UPDATE or DELETE statement, this hint is completely ignored by SQL Server. When performing these types of operations, locks have to be taken to ensure that transactions are atomic, consistent, isolated, and durable.

Nolock Table Hint
Behind the scenes, the nolock table hint performs exactly the same action as running under the read-uncommitted isolation level. The only difference between the two is scope: the read-uncommitted isolation level determines the locking mechanism for the entire connection, while the nolock table hint determines the locking mechanism only for the table to which you apply the hint. For example, if you have a stored procedure that reads data from 20 tables, the read-uncommitted isolation level would cause every table to be accessed without requesting shared (S) locks. If you need only a subset of those 20 tables to skip shared (S) locks, while reading only committed data from the others, then you would want to use the nolock hint instead.
The syntax for using a nolock hint is below:
SELECT * FROM DatabaseName.dbo.TableName WITH (NOLOCK)



READ COMMITTED
READ COMMITTED is SQL Server's default isolation level. It ensures that an operation will never read data another transaction has changed but not committed. However, because SQL Server holds locks for SELECT operations for only a short time, if a transaction running with READ COMMITTED isolation re-reads data, that data might have changed, or new rows might appear that meet the criteria of the original query.
So READ COMMITTED behavior has two aspects: first, it prevents dirty reads; second, it still allows non-repeatable reads and phantom reads.
To see the first aspect, we can simply repeat the previous example, but with the second connection using the default READ COMMITTED isolation level, rather than READ UNCOMMITTED, as shown by Steps 1 through 4, in Listing 1-3. The second connection blocks on its SELECT statement; it can't read the changes the first connection has made but not yet committed (or rolled back). Once we roll back the transaction, in Step 3, the query in Step 2 completes and returns the original data.
-- Step 1:
-- Start a transaction but don't commit it
USE IsolationDB ;
GO
BEGIN TRAN
UPDATE IsolationTest
SET col2 = 'New Value' ;
--<EXECUTE>

-- Step 2:
-- Start a new connection and change your isolation level
USE IsolationDB ;
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;
SELECT *
FROM IsolationTest ;
--<EXECUTE>
-- You should notice that the process blocks, and returns
-- no data or messages!
-- To finish up, perform the following two steps:

-- Step 3:
-- Return to the connection from Step 1 and issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>

-- Step 4:
-- Rerun the SELECT statement in the connection from Step 2
SELECT *
FROM IsolationTest ;
-- <EXECUTE>
-- Verify that the data is available
To see the second aspect of READ COMMITTED behavior (specifically, non-repeatable reads), close all the connections from the previous example, and open two new connections, using IsolationDB again. Listing 1-4 shows the code. In the first connection, Step 1 will make sure the isolation level is the default READ COMMITTED, and then it will start a transaction that reads data from the IsolationTest table to compute an average.
In the second connection, Step 2 will UPDATE the table. Assuming that the query in Step 1 has finished processing, the UPDATE will succeed, even though the first connection is still inside a transaction. Note that the UPDATE is an auto-commit transaction and so SQL Server will commit the UPDATE and release the locks as soon as it completes.
In Step 3, return to the first connection and run the same SELECT statement. The average value is now different and we have a non-repeatable read. The default READ COMMITTED isolation level prevents other connections from reading data being modified, but only prevents other connections from changing data being read, while the read operation is in progress. Once it is complete, other transactions can change the data, even if the reading transaction is still open. As a result, there is no guarantee that we'll see the same data if we rerun the SELECT within the transaction.

-- Step 1:
-- Read data in the default isolation level
USE IsolationDB ;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;
BEGIN TRAN
SELECT AVG(col1)
FROM IsolationTest ;
--<EXECUTE>

-- Step 2:
-- In a new connection, update the table:
USE IsolationDB ;
UPDATE IsolationTest
SET col1 = 500
WHERE col1 = 50 ;
--<EXECUTE>

-- Step 3:
-- Go back to the first connection and
-- run the same SELECT statement:
SELECT AVG(col1)
FROM IsolationTest ;
--<EXECUTE>
-- Step 4:
-- issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
The isolation level READ COMMITTED guarantees only that a transaction will not read uncommitted data. However, the ANSI SQL specification does not specify any particular mechanism that a database system should use to implement READ COMMITTED isolation, and so prevent dirty reads.
As of SQL Server 2005, SQL Server provides two different ways of preventing a transaction from reading dirty data in the READ COMMITTED isolation level. The default method we have just seen, using pessimistic concurrency, locks the data modified inside a transaction, and the locks keep other processes from accessing that data. It also takes shared locks for short periods to prevent data being modified while it is being read.
READ_COMMITTED_SNAPSHOT:
In SQL Server 2005, the alternative form of READ COMMITTED, called READ_COMMITTED_SNAPSHOT, uses optimistic concurrency. As one would expect, its behavior is the same as the default in terms of the read phenomena, i.e. it prevents dirty reads, but allows non-repeatable reads and phantom reads. However, this optimistic implementation of the READ COMMITTED level prevents dirty reads without blocking other transactions. The steps below demonstrate the behavior.
-- Step 1:
-- First close all other connections to make sure no one is using
-- the IsolationDB database
-- Step 2:
-- Change the database option to enable "read committed snapshot"
ALTER DATABASE IsolationDB SET READ_COMMITTED_SNAPSHOT ON ;
--<EXECUTE>
-- Step 3:
-- Start a transaction but don't commit it
USE IsolationDB ;
GO
BEGIN TRAN
UPDATE IsolationTest
SET col2 = 'New Value' ;
--<EXECUTE>
-- Step 4:
-- Start a new connection and change your isolation level
USE IsolationDB ;
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED ;
SELECT *
FROM IsolationTest ;
--<EXECUTE>
-- You should notice that the second connection is not blocked, but
-- it does not return the changed data. The results you get are the
-- original committed data, before the UPDATE in Step 3 was performed
-- To finish up, perform the following steps:
-- Step 5:
-- Return to the connection from Step 3 and issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
-- Step 6:
-- Now close all other connections to make sure no one is using
-- the IsolationDB database

-- Step 7:
-- Change the database option to disable "read committed snapshot"
ALTER DATABASE IsolationDB SET READ_COMMITTED_SNAPSHOT OFF ;
--<EXECUTE>


REPEATABLE READ
The REPEATABLE READ isolation level adds to the properties of READ COMMITTED by ensuring that if a transaction re-reads data, or if a query is reissued within the same trans­action, then the same data will be returned. In other words, issuing the same query twice within a transaction won't pick up any changes to data values that were made by another transaction. A second transaction cannot modify the data that a first transaction has read, as long as that first transaction has not yet committed or rolled back.
To see REPEATABLE READ behavior, close all the connections to the IsolationDB database, and open two new ones. Steps 1 through 3 in Listing 1-6 will issue the same two queries as in Listing 1-4, but this time, the first connection will set the isolation level to REPEATABLE READ in Step 1.

In Step 2, the second connection will have to use a slightly different UPDATE statement, because the value of 50 for col1 no longer exists. This UPDATE will block when it tries to modify the IsolationTest table. In Step 3, the first connection will get the same result when it reissues its original SELECT.
-- Step 1:
-- Read data in the Repeatable Read isolation level
USE IsolationDB ;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ ;
BEGIN TRAN
SELECT AVG(col1)
FROM IsolationTest ;
--<EXECUTE>

-- Step 2:
-- In the second connection, update the table:
USE IsolationDB ;
UPDATE IsolationTest
SET col1 = 5000
WHERE col1 = 500 ;
--<EXECUTE>
-- You should notice that the UPDATE process blocks,
-- and returns no data or messages

-- Step 3:
-- Go back to the first connection and
-- run the same SELECT statement:
SELECT AVG(col1)
FROM IsolationTest ;
--<EXECUTE>

-- Step 4:
-- issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
Preventing non-repeatable reads, or allowing the first connection to make sure it will reread the same data, is a desirable safeguard, but it comes at a price. The cost of this extra safeguard is that SQL Server holds all the shared locks in a transaction until the completion (COMMIT or ROLLBACK) of the transaction.
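If you want to see those locks, here is a sketch: while the Step 1 transaction in Listing 1-6 is still open, run the following from a third connection (sys.dm_tran_locks is the locking DMV available in SQL Server 2005 and later).

SELECT resource_type ,
       request_mode ,
       request_status
FROM   sys.dm_tran_locks
WHERE  resource_database_id = DB_ID('IsolationDB') ;
-- The shared (S) locks taken by the Step 1 SELECT remain
-- until that transaction commits or rolls back.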
However, REPEATABLE READ isolation doesn't prevent all possible read phenomena. It protects only the data that has already been read. Listing 1-7 demonstrates what this protection means. Close all connections, and open two new ones connecting to IsolationDB. In the first connection, start a transaction in REPEATABLE READ isolation level and look for all rows that meet a certain condition, as shown in Step 1. In the second connection, Step 2 will insert a new row. Now go back to the first connection, and re-execute the SELECT in Step 3.





-- Close all connections and open two new ones
-- Step 1:
USE IsolationDB ;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT *
FROM IsolationTest
WHERE col1 BETWEEN 20 AND 40
--<EXECUTE>

-- Step 2:
-- In the second connection, insert new data
USE IsolationDB ;
INSERT INTO IsolationTest
VALUES ( 25, 'New Row' ) ;
--<EXECUTE>

-- Step 3:
-- Go back to the first connection and rerun the SELECT
SELECT *
FROM IsolationTest
WHERE col1 BETWEEN 20 AND 40 ;
--<EXECUTE>
-- Notice one additional row

-- Step 4:
-- issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>
Upon the second execution of the same SELECT statement, the new row appears, called a phantom. The row didn't even exist the first time we ran the SELECT statement, so it wasn't locked. We can prevent phantoms with the SERIALIZABLE isolation level.

SERIALIZABLE:
The SERIALIZABLE isolation level ensures that, if a query is reissued, no data will have changed and no new rows will appear. In other words, we won't see phantoms if the same query is issued twice within a transaction. In Listing 1-8, we rerun the example from Listing 1-7, inserting a row with a col1 value of 35, but this time setting the isolation level to SERIALIZABLE in the first connection. The second connection will block when we try to do the INSERT, and the first connection will read exactly the same rows each time.

-- Open two new connections
-- Step 1:
-- In the first connection, start a transaction
USE IsolationDB ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
BEGIN TRAN
SELECT *
FROM IsolationTest
WHERE col1 BETWEEN 20 AND 40 ;
--<EXECUTE>
-- Step 2:
-- In the second connection, insert new data
USE IsolationDB
INSERT INTO IsolationTest
VALUES ( 35, 'Another New Row' ) ;
-- Notice that the INSERT will block
--<EXECUTE>
-- Step 3:
-- Go back to the first connection and rerun the SELECT
SELECT *
FROM IsolationTest
WHERE col1 BETWEEN 20 AND 40 ;
--<EXECUTE>
-- Notice no new rows

-- Step 4:
-- issue a ROLLBACK
ROLLBACK TRANSACTION ;
--<EXECUTE>

Again, we pay a price to prevent phantoms. In addition to locking all the data that has been read, enforcing the SERIALIZABLE isolation level, and so preventing phantoms, requires that SQL Server also lock data that doesn't exist (see the Key-range Locks section in Chapter 3). The SERIALIZABLE level gets its name from the fact that running multiple SERIALIZABLE transactions at the same time is the equivalent of running them one at a time – that is, serially.
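As a sketch, you can observe these key-range locks for yourself: while the Step 1 transaction in Listing 1-8 is still open, query the sys.dm_tran_locks DMV from another connection and look for RangeS-S entries in the request_mode column.

SELECT resource_type ,
       request_mode ,    -- look for 'RangeS-S' entries
       request_status
FROM   sys.dm_tran_locks
WHERE  resource_database_id = DB_ID('IsolationDB') ;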


SNAPSHOT:
SQL Server 2005 introduced an entirely new isolation level, called SNAPSHOT isolation. The only implementation of snapshot isolation uses optimistic concurrency, so we'll save the discussion of this level until Chapter 6, on optimistic concurrency.
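As a preview sketch only (the details wait for Chapter 6): SNAPSHOT isolation must first be enabled at the database level before any session can request it.

ALTER DATABASE IsolationDB SET ALLOW_SNAPSHOT_ISOLATION ON ;
GO
-- Then, in any session connected to that database:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT ;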


