Project Description
Grid computing framework for Azure for situations where the workers have an affinity to a particular type of work.

New work is routed to existing workers that are already handling such work.

Also supported is the ability to broadcast to all registered workers.

===============================
Note: With the Nov 2009 SDK of Azure, this framework is not necessary. The same functionality can be achieved by direct role-to-role communication now enabled by Azure.
================================

Normally, Azure worker roles receive their 'work' via one or more queues. Additional worker role instances can be added at any time. If the workers are stateless, they can share the same queues, and the workload is balanced evenly across the worker role instances.

In some scenarios, workers may be required to have an affinity to a certain type of work. For example, let's say your 'work' consists of sending messages to a number of endpoints using .Net Services. To send a message, a worker first establishes a connection to the endpoint and then sends the message. As new messages come in, it may be desirable for each message to be given to a worker that already has a connection to that message's target endpoint. The AffinityGrid framework is designed to handle such situations in an Azure context.

AffinityGrid may also be used for publish-subscribe scenarios using the broadcast functionality.

Note: The AffinityGrid framework defines only how work is distributed to Workers. It does not define what the Workers do or, in particular, how the results of completed work are communicated back. Users of this framework will have to define that for themselves via the 'data' that the Workers receive.

AffinityGrid Basics

The framework consists of a Dispatcher and Workers. Work is distributed via the Dispatcher. There may be only one instance of the Dispatcher for a given group. A group identifier allows multiple Dispatcher-Worker systems to co-exist within the same storage.

The Dispatcher has a well-known Azure queue in the group. All communication to the Dispatcher is via this queue. Senders must know the group id.

At creation time, each Worker assigns itself a globally unique identifier (guid). It then creates a receive queue based on this guid and the group id. It then informs the Dispatcher for its group about its existence. The Dispatcher records the existence of each such Worker.

This approach will give rise to numerous garbage queues over time as the system is stopped and started. To prevent this, the Dispatcher runs a cleanup job (at a configurable time) to remove queues that happen to be in the current group but don't belong to active Workers.
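The per-Worker queue naming and the cleanup pass can be sketched as follows. This is only an illustration in Python, not the framework's .NET code: the "<group>-w-<guid>" naming scheme and the function names are assumptions, since the actual queue name format is not described here.

```python
import re
import uuid


def new_worker_id() -> str:
    # A GUID rendered as 32 lowercase hex characters, so that it is also
    # legal inside an Azure queue name.
    return uuid.uuid4().hex


def worker_queue_name(group_id: str, worker_id: str) -> str:
    # Hypothetical naming scheme: "<group>-w-<guid>".
    return f"{group_id}-w-{worker_id}"


def find_garbage_queues(group_id, all_queue_names, active_worker_ids):
    """Return worker queues of this group that no active Worker owns."""
    # Only names of the exact form "<group>-w-<32 hex chars>" are candidates,
    # so the Dispatcher queue and other groups' queues are never selected.
    pattern = re.compile(re.escape(f"{group_id}-w-") + r"[0-9a-f]{32}")
    live = {worker_queue_name(group_id, w) for w in active_worker_ids}
    return sorted(
        q for q in all_queue_names
        if pattern.fullmatch(q) and q not in live
    )
```

The cleanup job would then delete every queue returned by find_garbage_queues. Matching on the full name pattern, rather than on the group prefix alone, avoids touching queues of a different group whose id happens to start with the same characters.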

Work given to the Dispatcher must have a workType identifier. This is part of the 'work' message that is sent to the Dispatcher.

As work is received, the Dispatcher doles it out to Workers based on a WorkDistributionStrategy. This is a pluggable component. The current version comes with a basic RoundRobbin WorkDistributionStrategy that keeps track of which Worker is currently doing what work and chooses that Worker for additional work of the same type. Users may define and plug in their own WorkDistributionStrategy-derived class for work distribution that is better suited to their specific needs.
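The idea behind such a strategy can be sketched in a few lines. The Python class below is a hypothetical stand-in and does not reproduce the framework's WorkDistributionStrategy interface; the class and method names are assumptions. It assigns a new workType to the next Worker in rotation and sends all later work of that type to the same Worker.

```python
class RoundRobinAffinityStrategy:
    """Illustrative only: round robin for new work types, sticky afterwards."""

    def __init__(self):
        self._workers = []        # registered worker ids, in arrival order
        self._assignments = {}    # work_type -> worker id currently handling it
        self._next = 0            # rotation counter for new work types

    def add_worker(self, worker_id):
        if worker_id not in self._workers:
            self._workers.append(worker_id)

    def remove_worker(self, worker_id):
        if worker_id in self._workers:
            self._workers.remove(worker_id)
        # Forget assignments of the departed worker so its work types
        # are handed out again on the next request.
        self._assignments = {
            t: w for t, w in self._assignments.items() if w != worker_id
        }

    def choose_worker(self, work_type):
        if not self._workers:
            return None
        if work_type in self._assignments:
            return self._assignments[work_type]   # affinity hit
        worker = self._workers[self._next % len(self._workers)]
        self._next += 1
        self._assignments[work_type] = worker
        return worker
```

With two Workers w1 and w2 registered, work types "a" and "b" land on w1 and w2 respectively, and a second item of type "a" goes back to w1.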

To send data to all registered workers, independent of the work assigned to them, use the Broadcast method. The dispatcher will convey the broadcast message, as-is, to all confirmed workers. If no worker is registered, the broadcast message is discarded. This functionality is useful for publish-subscribe scenarios.
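The broadcast semantics described above amount to a simple fan-out. The following Python sketch uses in-memory deques in place of Azure queues; the function name and the return value are assumptions made for illustration.

```python
from collections import deque


def broadcast(worker_queues, message):
    """Append the message, unchanged, to every registered worker's queue.

    worker_queues maps worker id -> deque (a stand-in for that worker's
    receive queue). Returns the number of workers reached; with no
    registered workers the message is simply dropped and 0 is returned.
    """
    for queue in worker_queues.values():
        queue.append(message)
    return len(worker_queues)
```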

Another possible use of the broadcast functionality is in the case of multiple WebRoles. Say your WebRoles 'listen' for events and display them to users via some push-to-browser mechanism (such as Silverlight duplex binding). At 'application start' time, create a worker for the specific group and start it. The DoBroadcast override will receive the broadcast events. Note that in Azure the Application_Start event is not processed. You will have to check at BeginRequest time to see if the worker exists and create it if it does not.

Usage

Determine a group id for your AffinityGrid Dispatcher-Worker system. The group id should conform to Azure queue naming conventions.
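Azure queue names must be 3 to 63 characters long, may contain only lowercase letters, digits and hyphens, must begin and end with a letter or digit, and may not contain consecutive hyphens. A candidate group id can be checked against these rules with a small helper such as the Python sketch below. The function names are hypothetical, and the length reserved for the framework's own queue name suffix is an assumption to be adjusted to the real naming scheme.

```python
import re

# One or more alphanumeric runs joined by single hyphens.
_QUEUE_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_queue_name(name: str) -> bool:
    return 3 <= len(name) <= 63 and _QUEUE_NAME.fullmatch(name) is not None


def fits_with_suffix(group_id: str, suffix_len: int) -> bool:
    # The group id is only part of each queue name, so it must leave room
    # for whatever the framework appends (suffix_len is an assumed figure).
    return is_valid_queue_name(group_id) and len(group_id) + suffix_len <= 63
```

For example, "my-grid-01" passes, while "Grid1" (uppercase), "grid--1" (consecutive hyphens) and "-grid" (leading hyphen) do not.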

Create an instance of Dispatcher with that group id and host it in a WorkerRole that is only allowed a single instance. The Dispatcher constructor takes an instance of the WorkDispatcherSettings class - peruse that for additional details.

Subclass the abstract class Worker and minimally implement the following abstract methods:
  • DoWork - called when work of the given type is to be started or performed
  • StopWork - called when work of the given type is to be stopped
  • StopAllWork - called when the worker should stop all assigned work - it will most likely be assigned new work
  • DoBroadcast - called when a broadcast message arrives; broadcasts are delivered to all workers regardless of the type of work assigned
The group id is required for Worker instantiation. WorkerRoles hosting Workers can have multiple instances. One such WorkerRole may contain multiple Workers.
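The shape of such a subclass can be illustrated as follows. This is a Python analogue written for illustration only: the real Worker is a .NET abstract class, and the method signatures, the snake_case names and the EndpointWorker class shown here are assumptions, not the framework's API. The example follows the endpoint scenario from the introduction, using a plain list as a stand-in for an open connection.

```python
from abc import ABC, abstractmethod


class Worker(ABC):
    """Stand-in for the framework's abstract Worker (signatures assumed)."""

    def __init__(self, group_id):
        self.group_id = group_id

    @abstractmethod
    def do_work(self, work_type, data): ...

    @abstractmethod
    def stop_work(self, work_type): ...

    @abstractmethod
    def stop_all_work(self): ...

    @abstractmethod
    def do_broadcast(self, data): ...


class EndpointWorker(Worker):
    """Keeps one 'connection' per work type and reuses it for later work."""

    def __init__(self, group_id):
        super().__init__(group_id)
        self.connections = {}   # work_type -> messages sent over that connection
        self.events = []        # broadcast messages received

    def do_work(self, work_type, data):
        # Open the connection on first use, then reuse it (the affinity benefit).
        self.connections.setdefault(work_type, []).append(data)

    def stop_work(self, work_type):
        self.connections.pop(work_type, None)

    def stop_all_work(self):
        self.connections.clear()

    def do_broadcast(self, data):
        self.events.append(data)
```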

Use the Utility helper class to:
  • Create messages for work; stop work; broadcast; diagnostics; etc.
  • Determine a Dispatcher's receive queue name given its group id
  • etc.
The creation of work messages requires three parameters:
  • group id
  • workType - a string that represents the type of the work. It should not contain the Utility.DELIMITER character.
  • data - an arbitrary string that is meant to denote the actual 'work'. This value is passed unchanged to a Worker via the DoWork override method.
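The restriction on workType follows from how a delimited message is typically packed and unpacked. The Python sketch below shows one way this could work; the "|" delimiter, the message layout and the function names are assumptions (the actual value of Utility.DELIMITER and the wire format are not given here), and the group id is left out because the sketch covers only the message body.

```python
DELIMITER = "|"  # assumed value; the framework's Utility.DELIMITER may differ


def create_work_message(work_type: str, data: str) -> str:
    if DELIMITER in work_type:
        raise ValueError("workType must not contain the delimiter")
    return f"{work_type}{DELIMITER}{data}"


def parse_work_message(body: str):
    # Split on the FIRST delimiter only, so 'data' may itself contain the
    # delimiter and still reaches the Worker unchanged.
    work_type, sep, data = body.partition(DELIMITER)
    if not sep:
        raise ValueError("not a work message")
    return work_type, data
```

Because only the first delimiter is significant, the data string needs no escaping, which is consistent with data being passed through unchanged.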
See the test project AffinityTest and the classes AffinityGridTest and TestWorker for examples of how to use this framework.

Last edited Nov 18, 2009 at 7:14 PM by fwaris, version 15