{"id":265,"date":"2022-02-04T01:47:38","date_gmt":"2022-02-04T01:47:38","guid":{"rendered":"https:\/\/192.168.1.3\/wordpress\/?p=265"},"modified":"2025-02-11T11:39:44","modified_gmt":"2025-02-11T11:39:44","slug":"aws-solution-architect-associate-saac02-review-material-other-data-database-services","status":"publish","type":"post","link":"https:\/\/mylinuxsite.com\/wordpress\/?p=265","title":{"rendered":"AWS Solution Architect Associate (SAA-C02) Review Material  &#8211; Other Data\/Database Services"},"content":{"rendered":"\n<!--more continue reading-->\n\n\n\n<h4 class=\"wp-block-heading\">DynamoDB<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>General<\/strong><ul><li>Low-latency NoSQL database<\/li><li>Supports document (JSON, XML, HTML) or key-value data models<\/li><li>Supports transactions<\/li><li>Serverless, fully managed, and replicated across AZs<\/li><li>Can provide an <strong>Eventual<\/strong>, <strong>Strong<\/strong> or <strong>Transactional<\/strong> consistency model<\/li><li>Low latency (single-digit milliseconds)<\/li><li>Data is queried through keys<\/li><li>Uses IAM for authentication<\/li><li>Maximum item size is 400 KB<\/li><\/ul><\/li><li><strong>Provisioned Throughput (Read\/Write Capacities)<\/strong><ul><li>How much data can be read from or written to a table per second<\/li><li><strong>Transactional<\/strong> requires 2x the capacity of the strong consistency model.<\/li><li>RCU\/WCU are spread across partitions.<\/li><li>Capacity Units:<ul><li>1 <strong>WCU (Write Capacity Unit)<\/strong> = 1 KB write\/sec<ul><li>e.g. Need to write 5 items in 1 sec with 4 KB per item<ul><li>5 x 4 KB = 20 KB \/ (1 KB write\/sec) = 20 WCU required, or 40 WCU (if <strong>transactional<\/strong>)<\/li><\/ul><\/li><li>e.g. 
Need to write 2 items in 1 sec with 2.5 KB per item<ul><li>2 x 3 KB (each item rounds up to the next KB) = 6 KB \/ (1 KB write\/sec) = 6 WCU required<\/li><\/ul><\/li><\/ul><\/li><li><strong>1 RCU (Read Capacity Unit)<\/strong> = one <span style=\"color:#ec0909\" class=\"has-inline-color\">strong<\/span> read\/sec of up to <strong>4 KB<\/strong>, or two <span style=\"color:#ef1906\" class=\"has-inline-color\">eventual<\/span> reads\/sec of up to 4 KB each (eventual consistency gives <strong>2x the reads of strong<\/strong>)<ul><li>Trick:<ul><li>Think in terms of strongly consistent reads, i.e. one strongly consistent RCU covers 4 KB<\/li><li>Work out how many RCUs you need for 1 item. Think of a strongly consistent RCU as a box that can hold 4 KB.<\/li><li>Round the item size up to the next multiple of 4 KB.<\/li><\/ul><\/li><li>e.g. 10 <em>strong<\/em> reads\/sec with a size of 4 KB per item<ul><li>(4\/4) x 10 = 10 RCU (10 x 4 KB = 40 KB \/ (4 KB read\/sec))<\/li><\/ul><\/li><li>e.g. 16 <em>eventual<\/em> reads\/sec with a size of 12 KB per item<ul><li>(12\/4) x 16 = 48, then 48 \/ 2 (since this is eventual) = 24 RCU; transactional reads would need 48 x 2 = 96 RCU<\/li><\/ul><\/li><li>e.g. 12 <em>strong<\/em> reads\/sec with a size of 10 KB per item<ul><li>(10\/4 round up) x 12 = 36 RCU<\/li><li>Note:<ul><li>You need to read 10 KB per item, so you need 3 boxes of 4 KB, i.e. 4 + 4 + 4 = 12 KB.<\/li><li>But you have 12 items to read<\/li><li>So 12 x 3 = 36 RCU<\/li><\/ul><\/li><\/ul><\/li><li>e.g. 10 <em>eventual<\/em> reads\/sec with a size of 13 KB per item<ul><li>(13\/4 round up) x 10 = 40<strong> \/ 2 = 20 RCU<\/strong><\/li><li>Note:<ul><li>You need to read 13 KB per item, so you need 4 boxes of 4 KB, i.e. 4 x 4 = 16 KB to hold 13 KB.<\/li><li>But you have 10 items to read<\/li><li>So 10 x 4 = <strong>40 RCU<\/strong> if the reads were strong<\/li><li>But this is <strong>eventual<\/strong> so you only need half i.e. 
40 \/ 2 = 20 RCU<\/li><\/ul><\/li><\/ul><\/li><\/ul><\/li><\/ul><\/li><\/ul><\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li>Capacity Modes:<ul><li><strong>Provisioned<\/strong><ul><li>Provision the WCU and RCU ahead of time<\/li><li>Pay for the provisioned WCU and RCU<\/li><li>Can enable auto-scaling.<\/li><\/ul><\/li><li><strong>On-Demand<\/strong><ul><li>Scales up or down based on the workload<\/li><li>Pay-per-request model (suits unknown or spiky workloads)<\/li><li>More expensive<\/li><\/ul><\/li><\/ul><\/li><li><strong>DAX<\/strong><ul><li>Write-through cache. Data is written to both DAX and DynamoDB<\/li><li>Microsecond read latency<\/li><li>Reads are eventually consistent. <strong>Not suitable if you require strong consistency.<\/strong><\/li><li>Default TTL is 5 minutes. After the TTL expires, the next read goes to the DB again.<\/li><li><strong>Not suitable for write-intensive workloads<\/strong><\/li><li>Runs inside a VPC<\/li><\/ul><\/li><li><strong>Keys and Indices:<\/strong><ul><li>Two (2) types of <strong>Primary key<\/strong>:<ol><li>Partition Key<\/li><li>Composite Key (Partition Key + Sort Key)<\/li><\/ol><\/li><li>Indices:<ul><li><strong>Local Secondary Index<\/strong><ul><li>Can be created only when the table is created. Cannot be modified later on<\/li><li>Uses the <span class=\"has-inline-color has-vivid-cyan-blue-color\">same Partition Key<\/span> but a <span class=\"has-inline-color has-vivid-cyan-blue-color\">different<\/span> Sort Key<\/li><\/ul><\/li><li><strong>Global Secondary Index<\/strong><ul><li>Can be created at any time<\/li><li>Can use a <span class=\"has-inline-color has-vivid-cyan-blue-color\">different Partition Key or Sort Key<\/span><\/li><li>Has its own RCU\/WCU. 
But if writes to the index are throttled, writes to the main table are throttled as well.<\/li><li><strong>Only supports eventual consistency<\/strong><\/li><\/ul><\/li><\/ul><\/li><li>A hot partition can cause throttling if the partition limits of 3000 RCU or 1000 WCU (or a combination of both) per second are exceeded.<\/li><\/ul><\/li><\/ul>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>DB Streams<\/strong><ul><li>Time-ordered sequence or stream<\/li><li>Records CRUD operations in the stream.<\/li><li>Stream records are kept in a log for <strong>24 hours<\/strong><\/li><li>Mainly used to trigger events (e.g. trigger Lambda)<\/li><li>Has a separate endpoint<\/li><li>Can store the item before and after the change<\/li><\/ul><\/li><li><strong>Global Table<\/strong><ul><li>Multi-way replication across regions<\/li><li><strong>All copies are Active<\/strong> i.e. the application can read or write in any region<\/li><li>Requires DB streams<\/li><\/ul><\/li><li><strong>TTL<\/strong><ul><li>Defines the expiry time of the data<\/li><li>Once the expiry time has passed, the data is marked for deletion<\/li><li>Typically deleted within 48 hours of expiration<\/li><li>Good for removing old or irrelevant data<\/li><li>Helps reduce the storage requirement (and cost)<\/li><\/ul><\/li><li>API<ul><li><strong>Items<\/strong>:<ul><li><code>PutItem<\/code>&nbsp;\u2014 Creates a new item, or <strong>replaces<\/strong> an old item with a new item<\/li><li><code>GetItem<\/code>&nbsp;\u2014 Returns a set of attributes for the item with the given primary key<\/li><li><code>UpdateItem<\/code>&nbsp;\u2014 Edits an existing item&#8217;s attributes, or adds a new item to the table if it does not already exist.<\/li><li><code>DeleteItem<\/code>&nbsp;\u2014 Deletes a single item in a table by primary key.&nbsp;<\/li><li><code>BatchGetItem<\/code>&nbsp;\u2014 Reads up to 100 items from one or more tables.<\/li><li><code>BatchWriteItem<\/code>&nbsp;\u2014 Creates or deletes up to 25 items in one or more tables.<\/li><li><em><strong>Projection 
Expression<\/strong><\/em>&nbsp;is a string that identifies the attributes that you want (SELECT <strong>&lt;projection expression &#8211; list of columns&gt;<\/strong> FROM ...)<\/li><\/ul><\/li><li><strong>Query<\/strong> (<strong>Collections<\/strong>)<ul><li>The&nbsp;<code><strong>Query<\/strong><\/code>&nbsp;operation in Amazon DynamoDB finds items <strong>based on primary key values<\/strong>.<\/li><li>Has <strong>Filter Expression<\/strong> &#8211; determines which items within the&nbsp;<code>Query<\/code>&nbsp;results should be returned. (SELECT &lt;projection expression &#8211; list of columns&gt; FROM X WHERE <strong>&lt;Filter Expression&gt;<\/strong> ...)<\/li><li>Can <strong>Limit<\/strong> the number of items that it reads.<ul><li>Returns <strong>LastEvaluatedKey<\/strong><\/li><\/ul><\/li><li>Has <strong>Pagination<\/strong> &#8211; <code>Query<\/code>&nbsp;results are divided into &#8220;pages&#8221; of data that are 1 MB in size (or less)<\/li><\/ul><\/li><\/ul><ul><li><strong>Scans<\/strong><ul><li>A&nbsp;<code>Scan<\/code>&nbsp;operation in Amazon DynamoDB <strong>reads every item in a table or a secondary index<\/strong>.<\/li><li>Can use ProjectionExpression to limit the attributes returned<\/li><li>Has <strong>Filter Expression (see Query)<\/strong><\/li><li>Has <strong>Limit<\/strong><ul><li>Returns <strong>LastEvaluatedKey<\/strong><\/li><\/ul><\/li><li>Has <strong>Pagination<\/strong><ul><li>Returns <strong>NextToken<\/strong> if <code>--max-items<\/code> is used (AWS CLI)<\/li><\/ul><\/li><\/ul><\/li><\/ul><\/li><\/ul>\n\n\n\n<p><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">ElastiCache<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Managed Redis\/Memcached<\/li><li>In-memory key\/value store<\/li><li>Sub-millisecond latency<\/li><li>Supports clustering and Multi-AZ<\/li><li>Sharding<ul><li>Also known as partitioning: splits the data up by key, whereas replication (also known as mirroring) copies all data.<\/li><li>Useful to increase performance, 
reducing the hit and memory load on any one resource. Replication is useful for highly available reads.<\/li><\/ul><\/li><li>Replication<ul><li>Also known as mirroring: copies all data<\/li><\/ul><\/li><li>Cluster Mode Disabled<ul><li>Has a <strong>single<\/strong> shard, inside of which is a collection of Redis nodes: one primary read\/write node and up to five secondary, read-only replica nodes.<\/li><li>Each read replica maintains a copy of the data from the cluster&#8217;s primary node.<\/li><li>Asynchronous replication mechanisms are used to keep the read replicas synchronized with the primary.<\/li><li>Applications can read from any node in the cluster.<\/li><li>Applications can write only to the primary node. Read replicas improve read throughput and guard against data loss in case of a node failure.<\/li><li>Cannot be converted to Cluster Mode Enabled<\/li><\/ul><\/li><li>Cluster Mode Enabled<ul><li>1 to 500 shards<\/li><li>Each shard has a primary node and up to five read-only replica nodes.<\/li><li>You cannot manually promote any of the replica nodes to primary.<\/li><li>You can only change the structure of a cluster, the node type, and the number of nodes by restoring from a backup.<\/li><li>Multi-AZ is required.<\/li><\/ul><\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Redshift<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>For Data Warehousing, Analytics and BI (OLAP)<\/li><li>Based on PostgreSQL.<\/li><li>Data can be loaded via:<ol><li>Kinesis Data Firehose<\/li><li>S3 (<code>COPY<\/code> command)<\/li><li>An application using JDBC<\/li><\/ol><\/li><li>Uses columnar storage and columnar compression.<\/li><li>MPP (Massively Parallel Processing) query execution<\/li><li>Can have up to 128 nodes<\/li><li><span class=\"has-inline-color has-vivid-cyan-blue-color\">Backup<\/span> is enabled by default and backups can be kept for up to 35 days. 
It tries to keep 3 copies of the data (original, replica, and a backup in S3)<\/li><li>Runs in only 1 AZ.<\/li><li>Encrypted at rest and uses SSL for data in flight<\/li><li>For DR, take incremental snapshots, which are stored in S3.<\/li><li>Can configure snapshots to be copied to another region<\/li><li><strong>Redshift Spectrum<\/strong> &#8211; query data in S3 without loading it into Redshift<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Glue<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Managed ETL service<\/li><li>Serverless<\/li><li><strong>Data Crawler<\/strong> automates the discovery of your data schema. The discovered schema can be stored in the <strong>Glue Data Catalog<\/strong>, which is used when authoring your ETL jobs.<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Neptune<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Fully managed <strong>Graph Database <\/strong>(like Neo4j)<\/li><li>HA across 3 AZs, with clustering<\/li><li>Has IAM authentication<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">Athena<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>An <strong>interactive query service<\/strong> that makes it easy to analyze data in <strong>Amazon S3 <\/strong>using <strong>standard SQL<\/strong><\/li><li>Serverless<\/li><li>No need to perform ETL to analyze data<\/li><li>The data format can be CSV, JSON, ORC, Apache Parquet or Avro<\/li><li>Uses the Presto engine<\/li><li>Query results are written to S3, so S3 security needs to be in place<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">OpenSearch (previously Elasticsearch Service)<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>A distributed, open-source search and analytics suite.<\/li><li>Based on Apache Lucene<\/li><li>OpenSearch and OpenSearch Dashboards were originally derived from Elasticsearch 7.10.2 and Kibana 7.10.2<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">EMR (Elastic MapReduce)<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Big data platform for data processing<\/li><li>Helps create a Hadoop cluster with hundreds of EC2 
instances.<\/li><li>Deploy workloads to EMR using Amazon EC2, Amazon Elastic Kubernetes Service (EKS), or on-premises AWS Outposts.<\/li><li>Has auto-scaling and integrates with Spot Instances<\/li><li>Uses open-source frameworks such as Apache Spark, Apache Hive, and Presto.<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">DMS (Database Migration Service)<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Can perform homogeneous (same DB type) and heterogeneous (different DB type) migrations<\/li><li>Source and destination can be in AWS or on-prem<\/li><li>Requires a replication instance (EC2) to run the migration task<\/li><li>Can create a task that captures ongoing changes after you complete your initial (full-load) migration to a supported target data store (<strong>CDC<\/strong> &#8211; Change Data Capture)<\/li><li>Can use <strong>SCT<\/strong> (Schema Conversion Tool) if the source and destination DBs use different engines. SCT is a separate program<\/li><\/ul>\n\n\n\n<h4 class=\"wp-block-heading\">AWS DataSync<\/h4>\n\n\n\n<ul class=\"wp-block-list\"><li>Automates and accelerates moving data between<strong> on-premises<\/strong> and <strong>AWS<\/strong> storage services<ul><li>Uses NFS\/SMB, the S3 API or HDFS <strong>via an agent <\/strong>running on a VM, Snowcone or S3 Outpost to move data<\/li><\/ul><\/li><li>Transfers data between AWS storage services so you can replicate, archive, or share application data easily.<\/li><li>Synchronization is scheduled (e.g. hourly, daily, 
weekly)<\/li><\/ul>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[11],"tags":[],"class_list":["post-265","post","type-post","status-publish","format-standard","hentry","category-aws-review-notes"],"_links":{"self":[{"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/265","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=265"}],"version-history":[{"count":60,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/265\/revisions"}],"predecessor-version":[{"id":1473,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/265\/revisions\/1473"}],"wp:attachment":[{"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=265"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=265"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=265"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}