A managed service that makes it easy to deploy, operate, and scale OpenSearch clusters in the AWS Cloud.
An OpenSearch Service domain is synonymous with an OpenSearch cluster.
Automatically detects and replaces failed OpenSearch Service nodes.
Has the option for a Managed or Serverless cluster.
A domain can scale out (more nodes) or scale up/down (larger or smaller instances) without downtime.
It can be placed inside a VPC or made public.
Managing Indexes
Storage tiers:
UltraWarm:
A cost-effective way to store large amounts of read-only data.
It uses Amazon S3 and a sophisticated caching solution to improve performance.
Best-suited to immutable data, such as logs.
Standard:
Uses “hot” storage, which takes the form of instance stores or Amazon EBS volumes attached to each node.
Hot storage provides the fastest possible performance for indexing and searching new data.
Cold:
Backed by Amazon S3.
Suitable for storing infrequently accessed or historical data.
Data suitable for cold storage includes infrequently accessed logs, data that must be preserved to meet compliance requirements, and logs that have historical value.
OR1:
An instance family for Amazon OpenSearch Service that provides a cost-effective way to store large amounts of data.
It uses Amazon Elastic Block Store (Amazon EBS) gp3 or io1 volumes for primary storage, with data copied synchronously to Amazon S3 as it arrives.
Suitable for running indexing heavy operational analytics workloads such as log analytics, observability, or security analytics.
OR1 instances offer an automatic data recovery option, which improves your domain’s overall reliability.
Index State Management (ISM):
It lets you define custom management policies that automate routine tasks and apply them to indexes and index patterns.
This is done through a policy that is attached to an index.
Examples of policies are:
Hot to warm to cold storage
Reduce replica count
Take an index snapshot
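As a rough illustration of the "hot to warm to cold" example, the sketch below builds an ISM policy document as a Python dictionary. The index pattern `logs-*`, the age thresholds, and the `@timestamp` field are assumptions chosen for illustration, not values from this cheat sheet. A policy like this is typically submitted with `PUT _plugins/_ism/policies/<policy_id>`.

```python
import json

# Hypothetical thresholds and index pattern; adjust to your own retention needs.
ism_policy = {
    "policy": {
        "description": "Sketch: move indexes from hot to warm to cold storage",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [],
                # Leave hot storage once the index is 7 days old (assumed value).
                "transitions": [
                    {"state_name": "warm", "conditions": {"min_index_age": "7d"}}
                ],
            },
            {
                "name": "warm",
                # Migrate the index to UltraWarm storage.
                "actions": [{"warm_migration": {}}],
                "transitions": [
                    {"state_name": "cold", "conditions": {"min_index_age": "30d"}}
                ],
            },
            {
                "name": "cold",
                # Migrate the index to cold storage; the timestamp field is assumed.
                "actions": [{"cold_migration": {"timestamp_field": "@timestamp"}}],
                "transitions": [],
            },
        ],
        # Attach the policy automatically to new indexes matching the pattern.
        "ism_template": {"index_patterns": ["logs-*"], "priority": 100},
    }
}

state_names = [state["name"] for state in ism_policy["policy"]["states"]]
policy_json = json.dumps(ism_policy, indent=2)
print(policy_json)
```

The `policy_json` string is what you would send as the request body; the other examples in the list (replica count, snapshot) would be additional entries in a state's `actions` list.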
Index Rollup:
It reduces storage costs by periodically rolling up old data into summarized indexes.
With index rollup, you create a new index with selected fields aggregated into coarser time buckets.
The trade-off is reduced data granularity: the condensed index keeps only the aggregated values, not the individual documents.
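To make the idea of "coarser time buckets" concrete, here is a minimal pure-Python sketch of what a rollup computes. It does not call the OpenSearch rollup API; the field names (`timestamp`, `service`, `latency_ms`) and the hourly bucket size are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def roll_up_hourly(docs):
    """Collapse raw documents into one summary row per (hour, service)."""
    buckets = defaultdict(lambda: {"count": 0, "latency_sum": 0.0})
    for doc in docs:
        ts = datetime.fromisoformat(doc["timestamp"])
        # Truncate the timestamp to the start of its hour (the coarser bucket).
        hour = ts.replace(minute=0, second=0, microsecond=0).isoformat()
        bucket = buckets[(hour, doc["service"])]
        bucket["count"] += 1
        bucket["latency_sum"] += doc["latency_ms"]
    return [
        {
            "hour": hour,
            "service": service,
            "count": b["count"],
            "latency_avg": b["latency_sum"] / b["count"],
        }
        for (hour, service), b in sorted(buckets.items())
    ]


raw_docs = [
    {"timestamp": "2024-01-01T10:05:00", "service": "api", "latency_ms": 100},
    {"timestamp": "2024-01-01T10:45:00", "service": "api", "latency_ms": 300},
    {"timestamp": "2024-01-01T11:10:00", "service": "api", "latency_ms": 50},
]
summary = roll_up_hourly(raw_docs)
print(summary)
```

Three raw documents become two summary rows, which is where the storage saving comes from; the individual latencies can no longer be recovered from the summary.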
Index Transform:
You create a different, summarized view of your data centered around certain fields so you can visualize or analyze the data in different ways.
Cross-cluster replication:
Replicate user indexes, mappings, and metadata from one OpenSearch Service domain to another.
It can be used for disaster recovery or to reduce latency.
Replication follows an active-passive model: the local (follower) index pulls data from the remote (leader) index.
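The pull model above can be sketched as the request that the follower domain receives to start replication. The helper below only builds the request; the index names, the connection alias, and the use of the `all_access` role are assumptions for illustration, and a cross-cluster connection between the two domains is assumed to exist already.

```python
def build_start_replication_request(follower_index, leader_index, connection_alias):
    """Return (method, path, body) for starting replication on the follower domain."""
    path = f"/_plugins/_replication/{follower_index}/_start"
    body = {
        # Alias of the cross-cluster connection that points at the leader domain.
        "leader_alias": connection_alias,
        "leader_index": leader_index,
        # Roles used on each side; all_access is a placeholder, not a recommendation.
        "use_roles": {
            "leader_cluster_role": "all_access",
            "follower_cluster_role": "all_access",
        },
    }
    return "PUT", path, body


method, path, body = build_start_replication_request(
    follower_index="follower-01",
    leader_index="leader-01",
    connection_alias="my-connection-alias",
)
print(method, path, body)
```

The request is sent to the follower domain, which matches the description that the follower pulls from the leader.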
Remote reindex:
It lets you copy indexes from one Amazon OpenSearch Service domain to another, for example to migrate them to a new domain.
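A remote reindex is issued on the destination domain as a `POST _reindex` request whose source points at the remote domain. The sketch below builds such a body; the endpoint and index names are hypothetical placeholders.

```python
def build_remote_reindex_body(remote_endpoint, source_index, dest_index):
    """Build a _reindex body that copies an index from a remote domain."""
    return {
        "source": {
            # The remote (source) domain endpoint, reached over HTTPS.
            "remote": {"host": f"https://{remote_endpoint}:443"},
            "index": source_index,
        },
        # The index to create or fill on the local (destination) domain.
        "dest": {"index": dest_index},
    }


reindex_body = build_remote_reindex_body(
    remote_endpoint="remote-domain.example.com",  # hypothetical endpoint
    source_index="logs-2023",
    dest_index="logs-2023-migrated",
)
print(reindex_body)
```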
Security
Encryption at rest (except for manual snapshots)
Encryption in flight, i.e. node-to-node encryption
Resource-based policy – specify which actions a principal can perform on the domain’s subresources
Identity-based policy
IP-based policy – restrict access to a domain to one or more IP addresses or CIDR blocks
Dashboard access control via:
Cognito
SAML
Fine-grained access control with HTTP basic authentication
IP-based policy
A domain inside a VPC can be reached through a reverse proxy, Direct Connect, a VPN, or Cognito.
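For the IP-based policy listed above, a minimal sketch of the access policy document might look like the following. The domain ARN and CIDR ranges are placeholders, and the `es:ESHttp*` action is one common choice rather than the only option.

```python
import ipaddress
import json


def build_ip_based_policy(domain_arn, allowed_cidrs):
    """Build a domain access policy that allows HTTP calls only from given CIDRs."""
    for cidr in allowed_cidrs:
        # Raises ValueError for malformed addresses or CIDR blocks.
        ipaddress.ip_network(cidr)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": ["es:ESHttp*"],
                # Only requests from these source IPs match the statement.
                "Condition": {"IpAddress": {"aws:SourceIp": list(allowed_cidrs)}},
                "Resource": f"{domain_arn}/*",
            }
        ],
    }


ip_policy = build_ip_based_policy(
    domain_arn="arn:aws:es:us-east-1:123456789012:domain/example-domain",  # placeholder
    allowed_cidrs=["192.0.2.0/24", "203.0.113.7"],
)
print(json.dumps(ip_policy, indent=2))
```

Validating the CIDR strings before building the policy catches typos early, before the document is submitted.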
Hands-On
Stream a CloudWatch Log to Amazon OpenSearch
In this hands-on, we will stream a CloudWatch log from a Lambda function to an Amazon OpenSearch Service domain.
Create a domain on a managed cluster:
Use the instance type t3.small.search to stay eligible for the AWS Free Tier.
Place the cluster inside a VPC.
For the Access Policy, change to ‘Allow All’.
Since the OpenSearch nodes are inside a VPC, we need a reverse proxy server that forwards requests from outside the VPC to the OpenSearch nodes.
Launch an EC2 instance with a public IP in the same VPC and subnet as the OpenSearch nodes.
Install Nginx on the EC2 instance.
Configure Nginx as a reverse proxy by editing its configuration file (nginx.conf): set the value of proxy_pass to the OpenSearch ‘Domain endpoint’.
Start the Nginx server.
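The reverse proxy configuration from the steps above can be sketched as a small helper that renders a minimal Nginx `server` block. This is an assumption about what a bare-bones setup might look like, not the exact configuration used in the hands-on; the endpoint shown is a made-up placeholder, and the block is meant to sit inside the `http` section of nginx.conf.

```python
def render_nginx_proxy_conf(domain_endpoint):
    """Render a minimal Nginx server block that forwards port 80 to the domain."""
    return (
        "server {\n"
        "    listen 80;\n"
        "    location / {\n"
        # Forward every request to the OpenSearch domain endpoint over HTTPS.
        f"        proxy_pass https://{domain_endpoint};\n"
        "    }\n"
        "}\n"
    )


# Hypothetical VPC domain endpoint; copy the real one from the console.
nginx_conf = render_nginx_proxy_conf("vpc-example-domain-abc123.us-east-1.es.amazonaws.com")
print(nginx_conf)
```

Listening on plain HTTP keeps the example short and matches the port 80 check in the next step; a real deployment would normally terminate TLS on the proxy as well.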
Test to see if you can connect to the OpenSearch dashboard:
Ensure that the Nginx EC2 instance security group allows access to port 80 (HTTP).
From your browser, connect to the URL http://<ec2_public_ip>/_dashboards.
Create an Amazon OpenSearch Service subscription filter on a CloudWatch log group.
For this hands-on, I used a Lambda CloudWatch log group.
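If you prefer to script this step rather than use the console, the sketch below builds the arguments for the CloudWatch Logs `put_subscription_filter` call. It assumes a forwarder Lambda function that writes to OpenSearch already exists and that CloudWatch Logs is allowed to invoke it; the log group name, filter name, and ARN are hypothetical.

```python
def build_subscription_filter_kwargs(log_group_name, forwarder_lambda_arn):
    """Build keyword arguments for CloudWatch Logs put_subscription_filter."""
    return {
        "logGroupName": log_group_name,
        "filterName": "stream-to-opensearch",  # hypothetical filter name
        "filterPattern": "",  # an empty pattern matches every log event
        "destinationArn": forwarder_lambda_arn,
    }


filter_kwargs = build_subscription_filter_kwargs(
    log_group_name="/aws/lambda/my-function",  # hypothetical Lambda log group
    forwarder_lambda_arn="arn:aws:lambda:us-east-1:123456789012:function:my-forwarder",
)
print(filter_kwargs)

# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   boto3.client("logs").put_subscription_filter(**filter_kwargs)
```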
Test the streaming:
Generate a new log in the log group.
Check if a new index is created in the OpenSearch domain.
Query the streamed data from the Dashboard and compare it with the CloudWatch log.
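The index check in the steps above can be partly automated. The sketch below assumes the forwarder names its indexes `cwl-YYYY.MM.DD` based on the UTC date of the log event, which is the convention I have seen from the console-generated function; verify it against your own domain. It also assumes you have already fetched `GET _cat/indices?format=json` through the proxy and parsed it into a list of dictionaries.

```python
from datetime import datetime, timezone


def expected_index_name(event_time):
    """Return the assumed daily index name for a timezone-aware event time."""
    utc = event_time.astimezone(timezone.utc)
    return f"cwl-{utc.year}.{utc.month:02d}.{utc.day:02d}"


def index_exists(cat_indices, index_name):
    """Check a parsed `_cat/indices?format=json` response for an index name."""
    return any(row.get("index") == index_name for row in cat_indices)


# Sample response rows, invented for illustration.
sample_cat_indices = [
    {"index": ".kibana_1", "health": "green"},
    {"index": "cwl-2024.01.05", "health": "yellow"},
]
name = expected_index_name(datetime(2024, 1, 5, 23, 30, tzinfo=timezone.utc))
print(name, index_exists(sample_cat_indices, name))
```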