A fully managed service that allows you to ingest, buffer, and process streaming data in real-time.
Can handle any amount of streaming data and process data from hundreds of thousands of sources with very low latencies
A Producer/Consumer model
Kinesis Capabilities:
Kinesis Data Streams(KDS)
A serverless streaming data service that makes it easy to capture, process, and store data streams at any scale.
Producers write data using the AWS SDK, the Kinesis Producer Library (KPL), or the Kinesis Agent
Producers emit data records that contain partition keys. Partition keys ultimately determine which shard ingests the data record for a data stream
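The partition key routing described above can be sketched in a few lines. Kinesis Data Streams hashes the partition key with MD5 into a 128-bit integer and sends the record to the shard whose hash key range contains that value. The sketch below assumes the hash space is split evenly across shards (a simplification; real ranges can differ after resharding), and shard_for_key is a hypothetical helper name, not part of any AWS SDK.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key, shard_count):
    """Return the index of the shard whose hash range contains the key.

    Assumes the hash key space is split evenly across shards, which is a
    simplification for illustration.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # The last shard absorbs any remainder of the hash space.
    return min(hash_key // range_size, shard_count - 1)


if __name__ == "__main__":
    for key in ("device-1", "device-2", "device-3"):
        print(key, "-> shard", shard_for_key(key, 4))
```

Records with the same partition key always land on the same shard, which is why a low-cardinality key can create a hot shard.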
Shard
the base throughput unit of an Amazon Kinesis data stream.
can ingest up to 1,000 data records per second or 1 MB/sec, whichever limit is reached first.
grouped into Data Streams which will retain data for 24 hours by default, or optionally up to 365 days.
scaling is done through shards
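Since scaling is done through shards, a rough shard count can be estimated from the per-shard ingest limits listed above. This is a minimal sizing sketch, assuming a steady write rate and a known average record size; shards_needed and its parameters are hypothetical names used only for illustration.

```python
import math

RECORDS_PER_SHARD = 1000  # records/sec ingest limit per shard
MB_PER_SHARD = 1.0        # MB/sec ingest limit per shard


def shards_needed(records_per_sec, avg_record_kb):
    """Estimate the shards required to absorb a steady write load."""
    mb_per_sec = records_per_sec * avg_record_kb / 1024
    by_count = math.ceil(records_per_sec / RECORDS_PER_SHARD)
    by_bytes = math.ceil(mb_per_sec / MB_PER_SHARD)
    # Whichever limit is tighter decides the shard count.
    return max(1, by_count, by_bytes)


if __name__ == "__main__":
    print(shards_needed(5000, 0.1))  # many small records: record count dominates
    print(shards_needed(2000, 2))    # fewer, larger records: bytes dominate
```

The estimate ignores uneven partition key distribution, so real streams usually need some headroom.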
Consumers can be an app (SDK, KCL) or other AWS services such as Lambda, Kinesis Data Firehose, or Kinesis Data Analytics
KCL:
Uses leases so that a consumer (worker) can lock a shard.
Two consumers in the same application cannot hold the lease for the same shard at the same time
Consumed records carry a sequence number, which is assigned by Kinesis Data Streams
2 types of consumption mechanisms:
Shared – 2 MB/sec/shard, shared across all consumers
Enhanced (fan-out) – 2 MB/sec/shard for each consumer
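The difference between the two consumption mechanisms is easiest to see as arithmetic. The sketch below assumes consumers split shared throughput evenly, which is a simplification; per_consumer_read_mb is a hypothetical helper.

```python
READ_MB_PER_SHARD = 2.0  # MB/sec read limit per shard


def per_consumer_read_mb(shards, consumers, enhanced):
    """Approximate read throughput available to each consumer, in MB/sec."""
    total = shards * READ_MB_PER_SHARD
    if enhanced:
        # Each consumer gets its own dedicated 2 MB/sec per shard.
        return total
    # Shared: all consumers compete for the same 2 MB/sec per shard.
    return total / consumers


if __name__ == "__main__":
    print(per_consumer_read_mb(1, 4, enhanced=False))
    print(per_consumer_read_mb(1, 4, enhanced=True))
```

With four consumers on one shard, shared mode leaves each roughly a quarter of the shard's read capacity, while enhanced mode gives each the full amount.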
Billed per shard (in provisioned capacity mode)
Records are immutable.
Realtime, replay capability
Kinesis Agent is a stand-alone Java software application that offers an easy way to collect and send data to Kinesis Data Streams.
Kinesis Data Firehose (KDF)
A fully managed service that makes it easy to capture, transform, and load massive volumes of streaming data into a data store or analytics tool.
Can optionally invoke a Lambda function to transform data before delivery
Writes to the destination in batches, i.e. near real-time
Supported destinations include Amazon S3, Amazon Redshift, Amazon OpenSearch Service, HTTP endpoints, Datadog, New Relic, MongoDB, and Splunk.
Data that fails transformation or delivery can be backed up to an S3 bucket
Kinesis Data Analytics (KDA)
A fully managed service for analyzing streaming data in real-time.
You get a console-based editor to build SQL queries.
Kinesis Data Analytics Studio supports sub-second queries with built-in visualizations.
Automatic Scaling
Real-time
Integrates with KDS and KDF
Use cases: Real-time dashboard, metrics, time-series.
Kinesis Video Streams
makes it easy to securely stream video from connected devices to AWS for analytics, machine learning (ML), playback, and other processing.
SQS
A fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications.
A Producer/Consumer model.
Unlimited throughput and unlimited number of messages per queue (Standard queues)
Consumers must delete the message.
Has 2 types:
Standard
FIFO
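Standard queues offer best-effort ordering and at-least-once delivery, so messages can arrive out of order or more than once; use FIFO to avoid this.
A single ReceiveMessage call can return up to 10 messages.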
FIFO Queue:
Ordered
No duplicates
Limited throughput (compared to Standard)
Queue name must end with .fifo
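With batching, supports up to 3,000 messages/sec (300 API calls/sec, each carrying a batch of up to 10 messages)
Without batching, supports up to 300 API calls/sec (send, receive, or delete)
S3 event notifications cannot be sent directly to this type of queue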
Has access policy (like S3)
Message Group ID
A tag that specifies that a message belongs to a specific message group.
Messages in the same message group are always processed one by one, in strict order relative to the message group.
Message Deduplication ID
A token that is used for deduplication of sent messages.
Messages sent with the same message deduplication ID are accepted successfully but aren’t delivered during the 5-minute deduplication interval.
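The interplay of Message Group ID and Message Deduplication ID can be illustrated with a small in-memory model. This is only a sketch of the behavior described above, not the SQS API: FifoQueueSketch and its methods are hypothetical, and time is passed in explicitly (in seconds) so the 5-minute interval is easy to follow.

```python
from collections import deque

DEDUP_INTERVAL_SEC = 5 * 60  # the 5-minute deduplication interval


class FifoQueueSketch:
    """In-memory stand-in for a FIFO queue (illustration only)."""

    def __init__(self):
        self._groups = {}  # message group id -> deque of message bodies
        self._seen = {}    # deduplication id -> time it was first accepted

    def send(self, body, group_id, dedup_id, now):
        """Return True if the message was enqueued, False if deduplicated."""
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_INTERVAL_SEC:
            # The send call still succeeds, but nothing new is delivered.
            return False
        self._seen[dedup_id] = now
        self._groups.setdefault(group_id, deque()).append(body)
        return True

    def receive(self, group_id):
        """Return the next message of a group in send order, or None."""
        queue = self._groups.get(group_id)
        return queue.popleft() if queue else None


if __name__ == "__main__":
    q = FifoQueueSketch()
    q.send("create order", "customer-1", "evt-1", now=0)
    q.send("create order (retry)", "customer-1", "evt-1", now=10)  # dropped
    q.send("pay order", "customer-1", "evt-2", now=20)
    print(q.receive("customer-1"), "|", q.receive("customer-1"))
```

A retry with the same deduplication ID inside the interval is silently ignored, while messages in one group come back in the order they were sent.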
Request-Response Model:
Used when a producer requires responses from consumers
A producer sends a message containing a ‘Correlation Id’ and a ‘Response Queue Name’ to the request queue.
A consumer will respond by sending a message containing the same ‘Correlation Id’ to the queue specified in the request.
Implemented through SQS Temporary Queue Client.
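The request-response flow above can be sketched with in-memory queues standing in for SQS. This is not the Temporary Queue Client itself; the queue names, message fields, and helper functions are all hypothetical and exist only to show how the correlation ID ties a response to its request.

```python
import uuid
from collections import deque

# Queue name -> in-memory stand-in for an SQS queue (illustration only).
queues = {"requests": deque()}


def send_request(body):
    """Producer side: create a response queue and enqueue the request."""
    correlation_id = str(uuid.uuid4())
    reply_queue = "responses-" + correlation_id  # temporary response queue
    queues[reply_queue] = deque()
    queues["requests"].append(
        {"correlation_id": correlation_id, "reply_to": reply_queue, "body": body}
    )
    return correlation_id, reply_queue


def handle_one_request():
    """Consumer side: reply to the queue named in the request."""
    request = queues["requests"].popleft()
    queues[request["reply_to"]].append(
        {
            "correlation_id": request["correlation_id"],  # echoed back unchanged
            "body": request["body"].upper(),
        }
    )


if __name__ == "__main__":
    cid, reply_queue = send_request("ping")
    handle_one_request()
    response = queues[reply_queue].popleft()
    print(response["correlation_id"] == cid, response["body"])
```

The producer matches each response to its request by comparing correlation IDs, so several requests can be in flight at once.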
Important Configurations:
Visibility Timeout – the period during which a received message is hidden from other consumers. 0 seconds – 12 hours. Can be changed per message programmatically (ChangeMessageVisibility API)
Delivery Delay – time to delay the first delivery of each message added to the queue. 0 – 15 minutes
Receive Message Wait Time – how long a ReceiveMessage call waits for messages to become available (long polling). 0 – 20 seconds
Message Retention Period – time that Amazon SQS retains a message that does not get deleted. 1 minute to 14 days
Maximum message size – maximum message size for your queue. 1 – 256 KB.
Dead Letter Queue(DLQ)
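Messages that can’t be consumed successfully can be sent to a DLQ
A DLQ is another SQS queue of the same type as the source queue; the source queue's Redrive Policy points to it, and the DLQ's Redrive Allow Policy controls which source queues may use it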
Maximum Receives determines when a message will be sent to the DLQ. If the ReceiveCount exceeds this value then the message will go to the DLQ
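How the visibility timeout and Maximum Receives interact can be shown with a small simulation. The class below is an in-memory sketch under simplifying assumptions (explicit clock in seconds, one message returned per receive); QueueWithDlqSketch and its fields are hypothetical and do not mirror the SQS API.

```python
class QueueWithDlqSketch:
    """In-memory model of a queue with a dead-letter queue (illustration only)."""

    def __init__(self, max_receives, visibility_timeout):
        self.max_receives = max_receives
        self.visibility_timeout = visibility_timeout
        self.messages = []  # dicts with body, receive_count, visible_at
        self.dlq = []       # bodies of messages moved to the DLQ

    def send(self, body):
        self.messages.append({"body": body, "receive_count": 0, "visible_at": 0})

    def receive(self, now):
        """Return the next visible message, or None if there is none."""
        for msg in list(self.messages):
            if msg["visible_at"] > now:
                continue  # still hidden by the visibility timeout
            msg["receive_count"] += 1
            if msg["receive_count"] > self.max_receives:
                # Receive count exceeded Maximum Receives: move to the DLQ.
                self.messages.remove(msg)
                self.dlq.append(msg["body"])
                continue
            msg["visible_at"] = now + self.visibility_timeout
            return msg
        return None

    def delete(self, msg):
        """Consumers must delete a message once it is processed."""
        self.messages.remove(msg)


if __name__ == "__main__":
    q = QueueWithDlqSketch(max_receives=2, visibility_timeout=30)
    q.send("job")
    q.receive(now=0)   # first attempt, never deleted
    q.receive(now=30)  # second attempt, never deleted
    q.receive(now=60)  # third receive exceeds the limit
    print(q.dlq)
```

A message that is received but never deleted reappears after each visibility timeout until its receive count exceeds the limit, at which point it lands in the DLQ.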
SNS
A fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication
A Pub/Sub model
Subscribers can be SQS, Lambda, Email, SMS, HTTP/HTTPS, Mobile endpoints.
Has Access Policy
Has Subscription Filter
Can define Message Attributes
Has DLQ
Has Message Group ID and Deduplication ID (FIFO topics only)
Has TTL (only for Mobile Endpoints)
Can be coupled with SQS for the Fan Out pattern. But SQS access policy must allow SNS to write to SQS
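For the fan-out pattern, the SQS access policy typically grants the SNS service permission to call sqs:SendMessage, restricted to one topic. The sketch below only builds that policy document as a Python dict; sns_to_sqs_policy is a hypothetical helper, and the ARNs and account ID are made-up placeholders.

```python
import json


def sns_to_sqs_policy(queue_arn, topic_arn):
    """Build an SQS access policy that lets one SNS topic send to the queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Only messages originating from this topic are allowed.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    policy = sns_to_sqs_policy(
        "arn:aws:sqs:us-east-1:123456789012:orders",
        "arn:aws:sns:us-east-1:123456789012:order-events",
    )
    print(json.dumps(policy, indent=2))
```

The resulting JSON would be attached to each subscribed queue, one policy per queue, so that a single published message can reach all of them.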