Tuesday, April 27, 2010

Processing Data Queues in SQL Server with READPAST and UPDLOCK

Problem
One common processing problem that a DBA can encounter is processing rows from a table used as a data queue. Queue processing is functionality where one or more processes INSERTs rows into a database table representing a work queue with each row representing a business action that must be performed. At the same time, one or more processes SELECTs records from the same queue table in order to execute the business action required by the application while later deleting the processed row so it is not processed again. Typically, the reading processes use polling to interrogate the queuing table for any new rows that require execution of a business action. If done incorrectly, processing data queues can produce unexpected results and/or performance issues.

Solution
The following examples set up a process queue to be processed by two separate processes reading the same queue. This can be extended beyond two processes, but for this example we want to show you how two processes can work against one work queue. We’ll start with examples to illustrate issues that can be encountered.

First, let’s create a sample table and populate it with 10 records to be processed.

-- create an example queue table
CREATE TABLE DBO.QUEUE (
QUEUEID INT IDENTITY( 1 , 1 ) NOT NULL PRIMARY KEY,
SOMEACTION VARCHAR(100))

GO

-- seed the queue table with 10 rows
DECLARE @counter INT

SELECT
@counter = 1

WHILE (@counter <= 10)
BEGIN
INSERT INTO
DBO.QUEUE
(SOMEACTION)
SELECT 'some action ' + CAST(@counter AS VARCHAR)

SELECT @counter = @counter + 1
END

Encountering unexpected results
Open 2 separate query windows and issue the following statements in each session:

DECLARE @queueid INT

BEGIN TRAN
TRAN1

SELECT TOP 1 @queueid = QUEUEID
FROM DBO.QUEUE

PRINT 'processing queueid # ' + CAST(@queueid AS VARCHAR)

-- account for delay in processing time
WAITFOR DELAY '00:00:10'

DELETE FROM DBO.QUEUE
WHERE QUEUEID = @queueid

COMMIT

As you will see, each session processed the same row! This is obviously unacceptable processing behavior, but what can we do about it?

We can eliminate this behavior by adding the UPDLOCK hint to the SELECT statement. The UPDLOCK hint tells the SQL Server query engine “Don’t allow any other reader of this row to acquire an UPDLOCK (“U” lock) because I will be promoting this lock to an exclusive “X” lock later in my processing”. It effectively reserves the row for your processing. However, as you will see, this can cause a new problem to arise.

Encountering blocking
The SELECT statement has been modified to use the UPDLOCK hint.

Open 2 separate query windows and issue the following statements again.

DECLARE @queueid INT

BEGIN TRAN
TRAN1

SELECT TOP 1 @queueid = QUEUEID
FROM DBO.QUEUE WITH (updlock)

PRINT 'processing queueid # ' + CAST(@queueid AS VARCHAR)

-- account for delay in processing time
WAITFOR DELAY '00:00:10'

DELETE FROM DBO.QUEUE
WHERE QUEUEID = @queueid

COMMIT

As you can see from the modified example, each session now processes separate rows. Good so far. However, the 2nd session took longer to execute than it did in the first example even though it now processes a separate row. Why is this? It’s because an UPDLOCK (“U”) lock has been placed on the row processed by the first session and the 2nd session is forced to wait on this lock to be released before it is allowed to retrieve the next row for processing. This is highly inefficient since multiple consumers of the queue must all wait until any locks are released. So, how do we get around this?

To get around the blocking encountered in the previous example, a READPAST hint can be used in conjunction with the UPDLOCK hint. The READPAST hint tells the SQL Server query engine “If you encounter any rows that are locked, just skip them… I want whatever is not currently being processed by anyone”.

Incorporating the READPAST query hint
The SELECT statement has been modified to use the READPAST hint in addition to the UPDLOCK hint.

Open 2 separate query windows and issue the following statements again.

DECLARE @queueid INT

BEGIN TRAN
TRAN1

SELECT TOP 1 @queueid = QUEUEID
FROM DBO.QUEUE WITH (updlock, readpast)

PRINT 'processing queueid # ' + CAST(@queueid AS VARCHAR)

-- account for delay in processing time
WAITFOR DELAY '00:00:10'

DELETE FROM DBO.QUEUE
WHERE QUEUEID = @queueid

COMMIT


As you can see from this latest example, each session now processes separate rows and the 2nd session is no longer blocked as evidenced by the execution time (both sessions should complete at roughly the same time).

Using the UPDLOCK hint in conjunction with the READPAST hint gives the best performance for processing queues while eliminating unexpected results and blocking.

Putting it all together
Here is an example of the above code that takes it a step further and processes all of the records in the queue. To run this drop table dbo.queue and then recreate it by running the code in the very first code block above that creates the table and loads the data.

Open 2 separate query windows and issue the following statements again.

SET NOCOUNT ON
DECLARE
@queueid INT

WHILE
(SELECT COUNT(*) FROM DBO.QUEUE WITH (updlock, readpast)) >= 1

BEGIN

BEGIN TRAN
TRAN1

SELECT TOP 1 @queueid = QUEUEID
FROM DBO.QUEUE WITH (updlock, readpast)

PRINT 'processing queueid # ' + CAST(@queueid AS VARCHAR)

-- account for delay in processing time
WAITFOR DELAY '00:00:05'

DELETE FROM DBO.QUEUE
WHERE QUEUEID = @queueid
COMMIT
END

Next Steps

  • When processing data queues, use the UPDLOCK hint along with the READPAST hint to get maximum throughput of your data queues.
  • Read more information about UPDLOCK and READPAST in the SQL Server 2000 and 2005 Books Online under Locking Hints.
  • Read more about Lock Compatibility in the SQL Server 2000 and 2005 Books Online
  • Thank you to Armando Prato for providing this tip!