Tuesday, March 19, 2013

A sample queue with READPAST and OUTPUT

Anyone that has taken a basic data structures class in a language like C++ understands the simple joy of constructing linked lists, queues, trees, and other elementary data storage algorithms. We can still do this in SQL, but there's frequently many solutions to a given problem.

This solution to a simple queue uses READPAST and OUTPUT, and is intended more as an example of their usage than as a fully designed queue system.

The basic goal with our queue is to be able to add a work item of a certain type, and then through another process, read the next work item from the queue and mark it as retrieved. FIFO (First in, First out) applies.

Here we go. First, we create a schema. Schemas are horribly underused in my opinion, and I like to use them for logical encapsulation.

create schema QueueSample authorization dbo
go


Then we create the table.

create table QueueSample.WorkQueue
(
 WorkQueueID  bigint not null identity(1,1) primary key clustered,
 JobType   uniqueidentifier not null,
 JobInfo   xml null,

 CreatedDate  datetime not null default(getdate()),
 CreatedBy  uniqueidentifier null,
 ModifiedDate datetime null,
 ModifiedBy  uniqueidentifier null,
 RetrievedDate datetime null,
 RetrievedBy  uniqueidentifier null

) on [primary]
go


The key columns for this table are the JobType, which is some manner of GUID that might refer to a process type, and JobInfo, which is an untyped xml column that should store whatever work-related information is needed to process the job. Xml is good for the generic exercise, but most real implementations should examine this a little closer.

create view QueueSample.WorkQueue_Pending
as
 select *
 from QueueSample.WorkQueue
 where RetrievedDate is null
go


The above view lets us look for jobs that need to be done (i.e., haven't been retrieved yet). I'm using select * because I treat this type of view as a "view abstraction layer", but listing the columns would be the best choice for most shops.

create proc QueueSample.ViewQueue
(
 @JobType uniqueidentifier = null
)
-- Look at pending jobs (jobs not yet retrieved)
as
begin

 select JobType, JobInfo,
  CreatedDate, CreatedBy,
  ModifiedDate, ModifiedBy
 from QueueSample.WorkQueue_Pending
 where @JobType is null
   or JobType = @JobType

end --proc
go


Our first sproc, ViewQueue, queries for jobs that need to be done. It can take a JobType parameter, or can return with no filter if that's null. Nothing fancy here so let's move on.

create proc QueueSample.QueueJob
(
 @JobType  uniqueidentifier,
 @JobInfo  xml = null,
 @CreatedBy  uniqueidentifier = null
)
-- Add a new unit of work to the queue
as
begin

 insert into QueueSample.WorkQueue
   ( JobType, JobInfo, CreatedBy )
 output inserted.WorkQueueID
 values ( @JobType, @JobInfo, @CreatedBy )

end --proc
go


QueueJob adds an item to the queue, and makes use of the OUTPUT clause to solve a common problem. In many use cases, it is important to return the new identity or other generated key to the calling process. This usually ends up using @@identity or scope_identity() as a second statement right after the insert, or some sort of select that looks for the record that just got inserted, or something even crazier. In newer versions of SQL, 2005 and later, the OUTPUT clause solves this pattern much simpler. It uses inserted and deleted mock tables, much like triggers do, and can be put on INSERT, UPDATE, MERGE, or DELETE.

create proc QueueSample.GetNextJob
(
 @JobType  uniqueidentifier,
 @RetrievedBy uniqueidentifier = null
)
-- get the next unit of work from the queue for a specific job type
as
begin

 update top(1) QueueSample.WorkQueue_Pending with (readpast)
 set  RetrievedDate = getdate(),
   RetrievedBy = @RetrievedBy
 output deleted.JobInfo,
   deleted.CreatedBy,
   deleted.CreatedDate,
   deleted.ModifiedBy,
   deleted.ModifiedDate,
   inserted.RetrievedDate
 where JobType = @JobType

end --proc
go


GetNextJob is called to fetch the next work item (of a particular JobType). Note the use of OUTPUT once again, to return the needed information to run the job. We've updated and returned the correct information, all in a single statement. Be sure to do good error handling if such a clause is inside a transaction - if the transaction is rolled back, the output will still be there as it was generated already.

The other interesting piece is the READPAST hint. This hint tells SQL to step past any locked rows. Such a consideration is important if more than one process might be calling the procedure. It helps avoid lock waits, similar to NOLOCK - but safer, because it won't risk returning such information but rather just skips it.

That's it. Here is a script that tests all of this code.

declare @JobType_1 uniqueidentifier = 'AC351A46-49E4-4F0C-BF8C-F45255012150',
  @JobType_2 uniqueidentifier = 'FB0F15E0-2A97-46A3-951B-6655E4D7A06A'

exec QueueSample.QueueJob @JobType_1, '<info><key>4</key></info>'
exec QueueSample.QueueJob @JobType_1, '<info><key>5</key></info>'
exec QueueSample.QueueJob @JobType_2, '<info><anotherkey>422</anotherkey></info>'
exec QueueSample.QueueJob @JobType_1, '<info><key>6</key></info>'
exec QueueSample.QueueJob @JobType_2, '<info><anotherkey>893</anotherkey></info>'
exec QueueSample.QueueJob @JobType_1, '<info><key>8</key></info>'

exec QueueSample.ViewQueue
exec QueueSample.ViewQueue @JobType_1
exec QueueSample.GetNextJob @JobType_1
exec QueueSample.ViewQueue @JobType_1

exec QueueSample.ViewQueue @JobType_2
exec QueueSample.GetNextJob @JobType_2
exec QueueSample.ViewQueue @JobType_2

exec QueueSample.ViewQueue
go


And, finally, some cleanup code.

drop proc QueueSample.ViewQueue
drop proc QueueSample.GetNextJob
drop proc QueueSample.QueueJob
drop view QueueSample.WorkQueue_Pending
drop table QueueSample.WorkQueue
drop schema QueueSample
go

No comments:

Post a Comment