AWS Integration & Messaging - Part 1: Understanding SQS Queues for Decoupling Applications

In this blog, we will dive into AWS Integration & Messaging, specifically Amazon SQS (Simple Queue Service). We'll learn how SQS enables application decoupling, ASG integration in SQS for dynamic scaling, Message Visibility Timeout to ensure efficient message processing, SQS FIFO Queue, including managing duplication issues.

Decoupling Applications with AWS Integration & Messaging

AWS Integration & Messaging includes services that help applications, systems, and components communicate in a distributed setup. These services support asynchronous communication, event-driven workflows, and data streaming, allowing for scalable, reliable, and independent interactions.

In case of synchronous communication between applications, it can be problematic if there are sudden spikes of traffic. In that case asynchronous or event based comes in to decouple your applications. Decoupling applications mean designing your applications in a way that reduces dependencies between their different components. That is, each part of the application can function independently, making it easier to update, scale, or replace without affecting other parts.

Applications can be decoupled using—

SQS: Queue model
SNS: Pub/Sub model
Kinesis: Real-time streaming model

What do they solve?

Scalability Issues:

Messaging services decouple components, allowing independent scaling.
Tight Coupling:

These services enable asynchronous and loosely coupled communication.
Data Processing Delays:

Processing large volumes of real-time data can lead to latency. Kinesis handles high-throughput, low-latency data streaming.
Reliability and Fault Tolerance:

SQS guarantees message delivery; SNS ensures notifications reach all subscribers

Amazon SQS

Amazon Simple Queue Service (SQS) is a fully managed message queuing service provided by AWS. It allows application components to be decoupled, enabling them to communicate asynchronously by-passing messages through a queue.

Let's understand the process. A producer (e.g., a web server, application, or service) creates a message and sends it to an SQS queue since producer doesn’t directly interact with the consumer.

SQS queue acts as a buffer, temporarily storing messages sent by producers. Messages can be retained in the queue for a configured duration (up to 14 days). A consumer polls the SQS queue to retrieve messages. Polling can be short-polling (immediate response) or long-polling (waits for new messages to arrive). Once a message is received, it becomes "invisible" to other consumers for a specified period. Consumer processes the retrieved message and deletes it from queue once it is processed.

Types of SQS Queues

Standard Queue (Default):

It is the default option designed for high throughput, capable of handling an unlimited number of transactions. It follows a best-effort ordering approach, meaning messages may occasionally be delivered out of order. Additionally, it guarantees at-least-once delivery, where a message might be delivered multiple times. This type of queue is ideal for applications that can tolerate occasional duplicate messages, such as data processing pipelines.
FIFO Queue (First-In-First-Out):

This type of queue ensures the strict ordering of messages, meaning they are delivered in the exact order they are sent. It also supports exactly-once processing, ensuring each message is processed only once. This queue is ideal for applications where maintaining a strict sequence is crucial, such as financial transaction systems.

Integrating SQS with ASG

Amazon SQS can be integrated with an Auto Scaling Group (ASG) to process messages dynamically based on demand. This integration ensures that the number of instances in the ASG scales up or down automatically, depending on the volume of messages in the SQS queue.

Let’s analyze the diagram workflow and understand the process.

Producers send messages to the SQS queue, which holds messages until they are retrieved by consumer applications running on instances in the Auto Scaling Group.
Instances in the ASG are configured to poll the SQS queue, retrieve messages, and process them.
Amazon CloudWatch continuously monitors the SQS queue and provides metrics based on which ASG will scale the instances.
If the workload increases (more messages in the queue), the ASG adjusts the number of instances to match the demand.

Decouple between Application Tiers using SQS

Let's explore how SQS can help an application allow its components, such as the front end and back end, to function independently. In other words, it helps to decouple the application tiers.

The front-end web application integrated with an ASG receives incoming requests from users or external systems. Instead of directly handling these requests, the application packages each request into a message and sends it to the SQS Queue.

The SQS queue serves as a message buffer or middleware between the front-end and back-end. Messages are stored in the queue until they are retrieved by the back-end application. This decoupling ensures that the front-end does not depend on capacity of the back-end processing application.

The back-end application retrieves messages from the SQS queue using the ReceiveMessage API. This application could be responsible for tasks like video processing, as shown in the diagram. After retrieving a message, the back-end processes it and stores it in S3 bucket.

This architecture is a classic example of decoupling between application tiers using SQS, Auto Scaling, and Amazon S3 for efficient, reliable, and scalable application design.

SQS – Message Visibility Timeout

To understand it in simple terms, Visibility Timeout in Amazon SQS works like a “busy signal” for messages. When a message is taken by one consumer, it becomes temporarily invisible to other consumers. This gives the first consumer enough time to finish the task (processing the message). If the task isn't completed before the timeout, the message reappears for others to pick up and try again. This ensures that only one consumer processes a message at a time, preventing duplicate processing.

Let’s take a look at the diagram workflow to understand the process.

First off let's breakdown the steps for better understanding:

Message Retrieval:

First, the ReceiveMessage request retrieves a message from the SQS queue. Once retrieved, it enters the visibility timeout period. During this time, the message is not returned to subsequent ReceiveMessage requests.
Visibility Timeout Period:

While the message is in the visibility timeout period, it is invisible to other consumers. This gives the current consumer time to process the message.
Timeout Expiration:

If the visibility timeout expires and the consumer has not deleted the message from the queue, the message becomes visible again. At this point, another ReceiveMessage request can retrieve the same message, allowing it to be processed again.

Consequences of Visibility Timeout

The Visibility Timeout should be set in a reasonable range. By default, the “message visibility timeout” is 30 seconds. However, if is not then following two consequences may occur.

Visibility Timeout Too High:

If the timeout is set too long, messages stay hidden even after the consumer has finished processing them. This causes delays in processing the next messages, and if the consumer crashes, reprocessing will take longer.
Visibility Timeout Too Low:

If the timeout is too short then, message may become visible again before the consumer finishes processing it. Another consumer may retrieve the same message, causing duplicate processing.

Amazon SQS – FIFO Queue

To understand this type of SQS Queue, let's consider an example. Suppose you're managing an order processing system for an online store. Customers place orders, and each order needs to be processed in the exact order it was received. In this case, you use a FIFO (First-In, First-Out) queue to ensure that orders are processed in the correct sequence, meaning the first customer in line is the first one served.

Similar is the case for Amazon SQS FIFO Queues. They are designed for applications that require strict ordering of messages. This is useful for applications like order processing or anything that depends on preserving message order.

Let's understand the workflow of the FIFO Queue.

When a Producer sends a message to the FIFO queue, it includes a message group ID. This Id determines the logic of the messages. Messages within the same group are guaranteed to be processed in order. FIFO ensures that first sent message is processed first.

Consumer requests messages from the FIFO queue and receives the poll messages in the exact order they were placed in according to message group ID.

Some Attributes of SQS FIFO Queue:

Limited throughput: 300 msg/s without batching, 3000 msg/s with batching
Exactly-once send capability (by removing duplicates)
Messages are processed in order by the consumer

Duplication Issues in SQS — FIFO Queue:

FIFO queues are designed to ensure exactly once processing, but you might still encounter duplicate messages if a message is not processed within the visibility timeout. To manage this, a de-duplication interval of 5 minutes is used.

There are two methods to achieve message de-duplication:

Content-based Deduplication:

SQS calculates a SHA-256 hash of the message body. If an identical message is sent within the 5 minutes, it will be treated as a duplicate and not processed again.
Explicit Deduplication ID:

A custom Message Deduplication ID with each message can be set. If a message with the same de-duplication ID is sent within the 5-minute window, it will be treated as a duplicate and prevent it from being processed again.

Both methods help ensure that duplicate messages are not processed, improving the reliability of the system.

This is the end of this blog.

In the next blog, we will discuss part 2 of AWS Integration & Messaging, focusing on Amazon SNS, which is used to send one message to many receivers. Stay tuned for more details in the upcoming post.