AWS Solutions Architect Associate (SAA-C02) Review Material – Message Queues and Streams (Kinesis, SQS, SNS, & Amazon MQ)

Kinesis

  • A fully managed service that allows you to ingest, buffer, and process streaming data in real-time.
  • Can handle any amount of streaming data and process data from hundreds of thousands of sources with very low latencies
  • A Producer/Consumer model
  • Kinesis Capabilities:
    1. Kinesis Data Streams (KDS)
      • A serverless streaming data service that makes it easy to capture, process, and store data streams at any scale.
      • Producers write data using the AWS SDK, the Kinesis Producer Library (KPL), or the Kinesis Agent
      • Producers emit data records that contain partition keys. Partition keys ultimately determine which shard ingests the data record for a data stream
      • Shard
        • The base throughput unit of an Amazon Kinesis data stream.
        • Can ingest up to 1,000 data records per second, or 1 MB/sec.
        • Grouped into data streams, which retain data for 24 hours by default, or optionally up to 365 days.
        • Scaling is done by adding or removing shards.
      • Consumers can be an app (SDK, KCL) or other AWS services such as Lambda, Kinesis Data Firehose, or Kinesis Data Analytics
      • KCL:
        • Uses a lease to lock a shard to a consumer.
        • Two consumers cannot hold the lease on the same shard at the same time.
      • Consumed records have a sequence number, which is assigned by Kinesis Data Streams
      • 2 types of consumption mechanisms:
        1. Shared – 2 MB/sec/shard shared across all consumers
        2. Enhanced (fan-out) – 2 MB/sec/shard/consumer
      • Billed per shard (in provisioned capacity mode)
      • Records are immutable.
      • Real-time, with replay capability
      • Kinesis Agent is a stand-alone Java software application that offers an easy way to collect and send data to Kinesis Data Streams.
    2. Kinesis Data Firehose (KDF)
      • A fully managed service that makes it easy to capture, transform, and load massive volumes of streaming data into a data store or analytics tool.
      • Can optionally invoke a Lambda function to transform data before delivery
      • Writes to destinations in batches, i.e. near real-time
      • Supported destinations include Amazon S3, Amazon Redshift, Amazon OpenSearch Service, HTTP endpoints, Datadog, New Relic, MongoDB, and Splunk.
      • Failed data can be sent to an S3 bucket
    3. Kinesis Data Analytics (KDA)
      • A fully managed service for analyzing streaming data in real-time.
      • You get a console-based editor to build SQL queries.
      • Kinesis Data Analytics Studio supports sub-second queries with built-in visualizations.
      • Automatic Scaling
      • Real-time
      • Integrates with KDS and KDF
      • Use cases: Real-time dashboard, metrics, time-series.
    4. Kinesis Video Streams
      • Makes it easy to securely stream video from connected devices to AWS for analytics, machine learning (ML), playback, and other processing.
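
The way a partition key selects a shard in Kinesis Data Streams can be sketched in a few lines. Kinesis hashes the partition key with MD5 into a 128-bit integer, and each shard owns a contiguous range of that hash key space. The sketch below assumes the shards split the space evenly and that the digest is read as a big-endian integer. The helper names (shard_ranges, hash_key, shard_for) are hypothetical and not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def shard_ranges(shard_count):
    """Split the hash key space into evenly sized, contiguous (start, end) ranges."""
    step = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * step
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * step - 1
        ranges.append((start, end))
    return ranges


def hash_key(partition_key):
    """MD5 of the UTF-8 partition key, read as an unsigned integer (byte order assumed)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key, ranges):
    """Return the index of the shard whose hash key range contains the key's hash."""
    h = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= h <= end:
            return index
    raise ValueError("hash key outside all shard ranges")


if __name__ == "__main__":
    ranges = shard_ranges(4)
    for key in ("device-1", "device-2", "device-3"):
        print(key, "-> shard", shard_for(key, ranges))
```

Because the mapping depends only on the key, all records with the same partition key land on the same shard, which is what preserves their relative order. It also means a single very busy key can overload one shard.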

SQS

  • A fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. 
  • A Producer/Consumer model.
  • Nearly unlimited throughput per queue (Standard queues)
  • Consumers must delete each message after processing it.
  • Has 2 types:
    • Standard
    • FIFO
  • Best-effort ordering, and messages can be duplicated, unless using FIFO.
  • A single receive request can return up to 10 messages.
  • FIFO Queue:
    • Ordered
    • No duplicates
    • Limited throughput (compared to Standard)
    • Queue name must end with .fifo
    • With batching, supports up to 3,000 messages/sec (300 API calls/sec, each carrying a batch of up to 10 messages)
    • Without batching, supports up to 300 messages/sec
    • S3 cannot send event notifications to this type of queue
  • Has access policy (like S3)
  • Message Group ID
    1. A tag that specifies that a message belongs to a specific message group.
    2. Messages in the same message group are always processed one by one, in strict order relative to the message group.
  • Message Deduplication ID
    • A token that is used for deduplication of sent messages.
    • Messages sent with the same message deduplication ID are accepted successfully but aren’t delivered during the 5-minute deduplication interval.
  • Request-Response Model:
    • Used when a producer requires responses from consumers
    • A producer sends a message containing a ‘Correlation Id’ and a ‘Response Queue Name’ to the request queue.
    • A consumer will respond by sending a message containing the same ‘Correlation Id’ to the queue specified in the request.
    • Implemented through SQS Temporary Queue Client.
  • Important Configurations:
    • Visibility Timeout – the period during which a received message is hidden from other consumers. 0 seconds – 12 hours. Can be changed programmatically (ChangeMessageVisibility API)
    • Delivery Delay – time to delay the first delivery of each message added to the queue. 0 – 15 minutes
    • Receive Message Wait Time – the maximum time that polling will wait for messages to become available to receive. 0 – 20 seconds
    • Message Retention Period – time that Amazon SQS retains a message that does not get deleted. 1 minute to 14 days
    • Maximum message size – maximum message size for your queue. 1 – 256 KB.
    • Dead Letter Queue (DLQ)
      • Messages that can’t be consumed successfully can be sent to a DLQ
      • A DLQ is another SQS queue. The source queue’s redrive policy points to it, and the DLQ’s redrive allow policy controls which source queues may use it
      • Maximum Receives determines when a message is sent to the DLQ. If the ReceiveCount exceeds this value, the message goes to the DLQ
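
How the visibility timeout and Maximum Receives interact is easier to see in a small in-memory simulation. This is only a sketch, not the SQS API: ToyQueue and Message are hypothetical names, time is a plain number passed in as `now`, and the queue moves a message to its dead-letter list when one more receive would exceed the limit.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Message:
    message_id: int
    body: str
    receive_count: int = 0
    invisible_until: float = 0.0  # hidden from consumers until this time


@dataclass
class ToyQueue:
    visibility_timeout: float = 30.0
    max_receive_count: int = 3
    messages: dict = field(default_factory=dict)
    dead_letters: list = field(default_factory=list)
    _ids: count = field(default_factory=count)

    def send(self, body):
        message_id = next(self._ids)
        self.messages[message_id] = Message(message_id, body)
        return message_id

    def receive(self, now, max_messages=10):
        """Return visible messages and hide each one for the visibility timeout."""
        delivered = []
        for message in list(self.messages.values()):
            if len(delivered) == max_messages:
                break
            if message.invisible_until > now:
                continue  # still hidden by an earlier receive
            if message.receive_count >= self.max_receive_count:
                # One more receive would exceed the limit: move it to the DLQ.
                self.dead_letters.append(self.messages.pop(message.message_id))
                continue
            message.receive_count += 1
            message.invisible_until = now + self.visibility_timeout
            delivered.append(message)
        return delivered

    def delete(self, message_id):
        """Consumers call this after successful processing."""
        self.messages.pop(message_id, None)


if __name__ == "__main__":
    queue = ToyQueue(visibility_timeout=30, max_receive_count=2)
    queue.send("order-1")
    for now in (0, 10, 30, 60):
        received = [m.body for m in queue.receive(now)]
        print(now, received, [m.body for m in queue.dead_letters])
```

A consumer that processes a message but never deletes it sees the message again once the timeout expires. After the allowed number of receives, the message moves to the dead-letter list instead of being delivered again.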

SNS

  • A fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication
  • A Pub/Sub model
  • Subscribers can be SQS, Lambda, Email, SMS, HTTP/HTTPS, Mobile endpoints.
  • Has Access Policy
  • Has Subscription Filter
  • Can define Message Attributes
  • Has DLQ
  • Has Message Group ID and Deduplication ID (FIFO topics)
  • Has TTL (only for Mobile Endpoints)
  • Can be coupled with SQS for the fan-out pattern, but the SQS access policy must allow SNS to write to the queue
  • Has 2 types (similar to SQS):
    • Standard
    • FIFO – only allows SQS for subscription
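
The fan-out pattern with subscription filters can be illustrated with a toy model. Here queues are plain Python lists, and ToyTopic and matches are hypothetical names rather than SNS API calls. The sketch covers only exact string matching: every attribute named in a filter policy must be present with one of the listed values, and a subscription with no policy receives every message.

```python
def matches(filter_policy, attributes):
    """True if every attribute named in the policy has one of its allowed values."""
    if not filter_policy:
        return True  # no filter policy: the subscription receives everything
    return all(
        attributes.get(name) in allowed
        for name, allowed in filter_policy.items()
    )


class ToyTopic:
    def __init__(self):
        self.subscriptions = []  # list of (queue, filter_policy) pairs

    def subscribe(self, queue, filter_policy=None):
        self.subscriptions.append((queue, filter_policy))

    def publish(self, body, attributes=None):
        """Deliver one copy of the message to every subscription whose filter matches."""
        attributes = attributes or {}
        for queue, filter_policy in self.subscriptions:
            if matches(filter_policy, attributes):
                queue.append(body)


if __name__ == "__main__":
    all_events, refunds = [], []
    topic = ToyTopic()
    topic.subscribe(all_events)
    topic.subscribe(refunds, filter_policy={"event_type": ["refund"]})
    topic.publish("order placed", {"event_type": "order"})
    topic.publish("refund issued", {"event_type": "refund"})
    print(all_events)
    print(refunds)
```

The publisher sends each message once and does not need to know which subscribers exist. Each subscribing queue gets its own copy and can be consumed at its own pace.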

Amazon MQ

  • A managed message broker service for Apache ActiveMQ and RabbitMQ.
  • Supports open/standard protocols such as MQTT, AMQP, STOMP
  • HA (active/standby) is available; for ActiveMQ brokers it relies on Amazon EFS for shared storage.
