The Twelve Days of AWS: SQS


Simple Queue Service (SQS) is an indispensable tool for many batch operations and concurrent events within, and even outside of, the AWS infrastructure. There are two types of queues: standard and FIFO. Standard queues deliver messages in whatever order gives the best throughput, so ordering is best effort and a message can occasionally be delivered more than once. Messages can be sent and received in batches, and a standard queue can be used to trigger Lambdas. FIFO (First In, First Out) queues, as the name suggests, release messages in the order they entered, which guarantees a specific order. That guarantee comes with limitations, such as lower throughput, and when this was written FIFO queues could not be used to trigger Lambdas (AWS has since added that support).
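As a rough illustration of the two queue types, here is a minimal Python sketch. It assumes a boto3 SQS client (for example `boto3.client("sqs")`) is passed in as `sqs`; the helper names `create_queue` and `send`, and the queue names, are made up for this example.

```python
import json


def create_queue(sqs, name, fifo=False):
    """Create a standard or FIFO queue and return its URL."""
    kwargs = {"QueueName": name}
    if fifo:
        # FIFO queue names must end in ".fifo".
        if not name.endswith(".fifo"):
            kwargs["QueueName"] = name + ".fifo"
        # Content-based deduplication lets SQS derive the deduplication ID
        # from a hash of the message body.
        kwargs["Attributes"] = {
            "FifoQueue": "true",
            "ContentBasedDeduplication": "true",
        }
    return sqs.create_queue(**kwargs)["QueueUrl"]


def send(sqs, queue_url, payload, group_id=None):
    """Send one JSON message; FIFO queues also need a message group ID."""
    kwargs = {"QueueUrl": queue_url, "MessageBody": json.dumps(payload)}
    if group_id is not None:
        # Ordering is guaranteed within a message group on a FIFO queue.
        kwargs["MessageGroupId"] = group_id
    return sqs.send_message(**kwargs)
```

With a real client, `create_queue(sqs, "jobs")` would give a standard queue and `create_queue(sqs, "jobs", fifo=True)` a FIFO queue named `jobs.fifo`.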

One use of SQS that I implemented recently is parsing large files from S3: a Lambda splits the job into smaller subtasks and puts each one on the queue as an SQS message. This in turn fires off another Lambda, which does the heavy lifting on that part of the larger task. This can be very useful if you are hitting resource limits and you don’t need things to happen in a specific order.
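The splitting step might look something like the sketch below. This is not the original implementation: the byte-range scheme, the message fields (`bucket`, `key`, `start`, `end`) and the function names are all assumptions for illustration, and `sqs` is assumed to be a boto3 SQS client.

```python
import json

MAX_BATCH = 10  # send_message_batch accepts at most 10 entries per call


def build_subtasks(bucket, key, size, chunk_size):
    """Describe a file of `size` bytes as inclusive byte-range subtasks."""
    return [
        {
            "bucket": bucket,
            "key": key,
            "start": start,
            "end": min(start + chunk_size, size) - 1,
        }
        for start in range(0, size, chunk_size)
    ]


def enqueue_subtasks(sqs, queue_url, subtasks):
    """Send one SQS message per subtask, in batches of up to 10."""
    sent = 0
    for offset in range(0, len(subtasks), MAX_BATCH):
        batch = subtasks[offset:offset + MAX_BATCH]
        entries = [
            # Ids only need to be unique within a single batch request.
            {"Id": str(offset + i), "MessageBody": json.dumps(task)}
            for i, task in enumerate(batch)
        ]
        response = sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)
        failed = response.get("Failed", [])
        if failed:
            raise RuntimeError(f"{len(failed)} subtask messages were not queued")
        sent += len(entries)
    return sent
```

Each worker Lambda would then read its range from the message body and fetch only that part of the file.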

Another example is the chaining of events across Drupal and AWS using the aws_sqs module, which is designed to run Drupal queues in AWS. The Queue API in Drupal, unlike the Batch API, does not need to run tasks immediately: entities are entered into a queue and processed later, whenever possible. The aws_sqs module provides an SQS-based implementation of the Queue API backend, and is targeted specifically for use within Drupal, with a specific message format. As I was not using the Queue API, and needed extra parameters in the message that the Queue API does not support, I created a custom implementation of the module's class file to include JSON data, and then fired off an SQS message on the completion of a task in Drupal. This then triggered a Lambda to run Data Pipelines in AWS.
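On the AWS side, the receiving Lambda could be sketched roughly as follows, here in Python. The JSON field `pipeline_id` and the helper names are hypothetical, since the actual message format is not shown here; `datapipeline` is assumed to be a boto3 Data Pipeline client.

```python
import json


def pipeline_ids(event):
    """Pull the pipeline IDs out of an SQS-triggered Lambda event."""
    ids = []
    for record in event.get("Records", []):
        # Each SQS record carries the message text in its "body" field.
        body = json.loads(record["body"])
        ids.append(body["pipeline_id"])  # hypothetical field name
    return ids


def make_handler(datapipeline):
    """Build a Lambda handler that activates one pipeline per message."""
    def handler(event, context):
        activated = []
        for pipeline_id in pipeline_ids(event):
            datapipeline.activate_pipeline(pipelineId=pipeline_id)
            activated.append(pipeline_id)
        return {"activated": activated}
    return handler


# In a deployed function (assumes boto3 is available in the runtime):
# import boto3
# handler = make_handler(boto3.client("datapipeline"))
```

Passing the client in keeps the handler easy to exercise locally with a stand-in object.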

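SQS messages are persisted for a specified amount of time (the retention period), and are designed to be polled by the systems that use them. Once a message is received it is hidden from other consumers for a visibility timeout, and the consumer deletes it after processing, so normally only one recipient processes a given message. This is perfect for distributed systems: if any part of the system is down, the unprocessed message becomes visible again, so the system has a chance to recover and continue. Standard queues guarantee at-least-once delivery, so an occasional duplicate is still possible and consumers should tolerate one.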
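A minimal polling sketch in Python: with the SQS API a consumer receives messages, processes them, and then deletes each one explicitly using its receipt handle. The function name `drain_once` and the `process` callback are made up for this example, and `sqs` is assumed to be a boto3 SQS client.

```python
def drain_once(sqs, queue_url, process, wait_seconds=20, visibility_timeout=60):
    """Receive up to 10 messages, process each, delete the ones that succeed."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=wait_seconds,          # long polling, up to 20 seconds
        VisibilityTimeout=visibility_timeout,  # hidden from others meanwhile
    )
    handled = 0
    # "Messages" is absent from the response when the queue had nothing to give.
    for message in response.get("Messages", []):
        try:
            process(message["Body"])
        except Exception:
            # Not deleted: the message becomes visible again after the
            # visibility timeout and can be retried by any consumer.
            continue
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        handled += 1
    return handled
```

Deleting only after successful processing is what lets a crashed worker's messages be picked up again later.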

If you do need several recipients to get the same message, you would turn to SNS (Simple Notification Service).