{"id":861,"date":"2024-10-03T10:32:41","date_gmt":"2024-10-03T10:32:41","guid":{"rendered":"https:\/\/192.168.1.3\/wordpress\/?p=861"},"modified":"2024-12-16T07:11:27","modified_gmt":"2024-12-16T07:11:27","slug":"aws-certified-data-engineer-associate-dea-c01-review-material-glue","status":"publish","type":"post","link":"https:\/\/mylinuxsite.com\/wordpress\/?p=861","title":{"rendered":"AWS Certified Data Engineer Associate (DEA-C01) Review Material \u2013 Glue"},"content":{"rendered":"\n<!--more continue reading-->\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Overview<\/strong><\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>A\u00a0serverless data integration service primarily used for ETL.<\/li><li><span style=\"color: var(--ast-global-color-3); font-size: 1rem; font-weight: inherit;\">Key Components are:<\/span><ul><li><strong>Data Catalog<\/strong> &#8211; A central metadata repository of your data lake<\/li><li><strong>ETL Jobs<\/strong> &#8211; Perform processing on the data.  Run on-demand, on a schedule or triggered by an event<\/li><\/ul><ul><li><strong>Workflows<\/strong> &#8211; An orchestration tool within Glue.<\/li><\/ul><\/li><li><strong>Data Catalog:<\/strong><ul><li><strong>Databases<\/strong> &#8211; used to organize metadata tables.<\/li><li><strong>Tables<\/strong> &#8211; metadata definition that represents the data in a data store.  <ul><li>The definitions include:<ul><li>Schema or data structure<\/li><li>Data Format, e.g. CSV, Parquet, ORC, JSON, AVRO, XML<\/li><li>Datastore, e.g. 
S3, Kinesis, Kafka<\/li><li>Partitions<ul><li>All of the following conditions must be true for AWS Glue to create a partitioned table for an Amazon S3 folder:<ul><li>The schemas of the files are similar, as determined by AWS Glue.<\/li><li>The data format of the files is the same.<\/li><li>The compression format of the files is the same.<\/li><\/ul><\/li><\/ul><\/li><li>Partition Index<ul><li>A list of partition keys that already exist on a given table.<\/li><li>A partition index is a sublist of the partition keys defined in the table. It speeds up the <code>GetPartitions<\/code> API call (the API used to identify which partitions to read).<\/li><\/ul><\/li><\/ul><\/li><li>Tables belong to a Database.<\/li><li>Schema can be manually created or populated by Crawlers.<\/li><\/ul><\/li><\/ul><ul><li><strong>Crawlers<\/strong> &#8211; populate the AWS Glue Data Catalog with databases and tables<ul><li>The crawler connects to the data store.<\/li><li>Can crawl multiple data stores in a single run.\u00a0<\/li><li>It has built-in and custom classifiers.<\/li><li>Some data stores require connection properties for crawler access.<\/li><\/ul><\/li><li><strong>Connections<\/strong> &#8211; an object that stores login credentials, URI strings, virtual private cloud (VPC) information, and more for a particular data store.<ul><li>Can be used for both sources and targets.<\/li><li>Built-in connectors to AWS resources:<ul><li>Aurora<\/li><li>Redshift<\/li><li>Kafka<\/li><li>Amazon DocumentDB<\/li><li>Amazon OpenSearch Service<\/li><\/ul><\/li><\/ul><\/li><\/ul><ul><li>The Data Catalog can be used as a metadata store for Hive (Hive allows you to run SQL-like queries on Amazon EMR).<\/li><\/ul><\/li><li><strong>ETL Jobs<\/strong>:<ul><li>Perform a set of ETL operations from various data sources.<\/li><li>Encapsulates a script that connects to your source data, processes it, and then writes it out to your data target. 
Typically, a job runs extract, transform, and load (ETL) scripts.\u00a0<\/li><li>Visual ETL automatically generates the script in <strong>Python<\/strong> or <strong>Scala<\/strong>, which you can then modify. You can also provide your own Spark or PySpark scripts.<\/li><li>You can configure your AWS Glue ETL jobs to run within a VPC when using connectors.<\/li><li>Uses a Spark platform under the hood.<\/li><li>You are charged an hourly rate, billed by the second, based on the number of data processing units (<strong>DPUs<\/strong>) used to run your ETL job. A single standard DPU provides 4 vCPU and 16 GB of memory, whereas a high-memory DPU (M-DPU) provides 4 vCPU and 32 GB of memory.<\/li><li>You can set the number of workers or the worker type (i.e., more CPU\/memory).<\/li><li>Run on a schedule or triggered by an event.<\/li><li>You can set properties of your tables to enable an AWS Glue ETL job to group files when they are read from an Amazon S3 data store (<strong>dynamic frame file-grouping<\/strong>).<\/li><li><strong>Dynamic Frame:<\/strong><\/li><li><strong>Transformations<\/strong>:<ul><li>Bundled<\/li><li>Machine Learning<\/li><li>Format Conversion<\/li><li>Apache Spark Transformation<\/li><\/ul><\/li><li><strong>Execution Class:<\/strong><ul><li>Standard &#8211; ideal for time-sensitive workloads that require fast job startup and dedicated resources.<\/li><li>Flex &#8211; appropriate for time-<em>insensitive<\/em> jobs whose start and completion times may vary.<\/li><\/ul><\/li><li>ETL scripts can modify the Data Catalog and partition keys:<ul><li>New Partitions: pass\u00a0<code>enableUpdateCatalog<\/code>\u00a0and\u00a0<code>partitionKeys<\/code><\/li><li>Update Table Schema: pass <code>enableUpdateCatalog<\/code>\u00a0and\u00a0<code>updateBehavior<\/code><\/li><li>Create New Tables: use <code>setCatalogInfo<\/code>, <code>updateBehavior<\/code>, and <code>enableUpdateCatalog<\/code><\/li><\/ul><\/li><li><strong>Job bookmarks<\/strong> <ul><li>It helps AWS Glue maintain state information and prevent the reprocessing of old 
data. By persisting state information from the previous run of an ETL job, AWS Glue tracks data that has already been processed.\u00a0<\/li><li>The maximum concurrency must be set to 1.<\/li><li><strong>The script must end with\u00a0<code>job.commit()<\/code>.<\/strong><\/li><\/ul><\/li><li><strong>Glue Studio<\/strong> &#8211; a graphical interface that makes it easy to create, run, and monitor data integration jobs in AWS Glue. You can visually compose data transformation workflows and seamlessly run them on the Apache Spark\u2013based serverless ETL engine in AWS Glue.<ul><li><strong>Data Quality<\/strong>:<ul><li>Allows you to measure and monitor the quality of your data so that you can make good business decisions.<\/li><li>Works with Data Quality Definition Language (DQDL), a domain-specific language that you use to define data quality rules.<\/li><\/ul><\/li><\/ul><\/li><li>Can be coded using:<ol><li>PySpark &#8211; more features and\u00a0has reduced wait times.<\/li><li>Python Shell &#8211; to run small to medium-sized generic tasks that are often part of an ETL workflow. 
Cannot be used with Job bookmarks.<\/li><li>Scala<\/li><\/ol><\/li><\/ul><\/li><li><strong>Workflows<\/strong>:<ul><li>For orchestrating complex ETL operations.<\/li><li>Used to create and visualize complex extract, transform, and load (ETL) activities involving multiple crawlers, jobs, and triggers.<\/li><\/ul><ul><li>It can be triggered by:<ol><li>Schedule<\/li><li>On-Demand<\/li><li>EventBridge<\/li><\/ol><\/li><\/ul><\/li><li><strong>DataBrew<\/strong><ul><li>A visual data preparation tool that makes it <em>easier<\/em> for <em>data analysts and data scientists<\/em> to <em>clean<\/em> and <em>normalize<\/em> data to prepare it for analytics and machine learning (ML).<\/li><\/ul><\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Security<\/strong><\/h3>\n\n\n\n<ul class=\"wp-block-list\"><li>IAM policies to access Glue.<\/li><li>KMS encryption on:<ul><li>Metadata stored in the Data Catalog<\/li><li>Connection settings<\/li><li>S3<\/li><li>CloudWatch logs<\/li><li>Job bookmarks<\/li><li>Data Quality<\/li><\/ul><\/li><li>Resource policy<\/li><li>SSL on a connection from the client to Glue or from Glue to target\/source.<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><span style=\"color:#2ceb0e\" class=\"has-inline-color\">Hands-On<\/span><\/strong><\/h3>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>Test Data<\/strong><\/h6>\n\n\n\n<p>We will use two sets of test data. The first set will be in XML format and copied to an S3 bucket, and the second set will be a MySQL RDS database.<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li><strong>XML File<\/strong><ul><li>The data is located at this link: <a href=\"https:\/\/data.gov.hk\/en-data\/dataset\/hk-rvd-tsinfo_rvd-names-of-buildings\" data-type=\"URL\" data-id=\"https:\/\/data.gov.hk\/en-data\/dataset\/hk-rvd-tsinfo_rvd-names-of-buildings\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/data.gov.hk\/en-data\/dataset\/hk-rvd-tsinfo_rvd-names-of-buildings<\/a>. 
The data is a list of building names in Hong Kong.<\/li><li>Download the files named (1) Names of Buildings (Volume 1) Hong Kong Island and Kowloon, and (2) Names of Buildings (Volume 2) The New Territories.<\/li><li>The content of the <span style=\"color:#98401a\" class=\"has-inline-color\">file has no line breaks, so all the records appear as one long line. Because of this, the crawler will not be able to identify the structure of the file. You therefore need to <em>pretty-print<\/em> the document<\/span>. I used xmllint for that:<ul><li><code>$ xmllint --format &lt;the_ugly_file.xml&gt; &gt; &lt;the_pretty_file.xml&gt;<\/code><\/li><\/ul><\/li><li>Copy the files to an S3 bucket. <ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"284\" class=\"wp-image-882\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.22.56.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.22.56.png 2868w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.22.56-300x142.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.22.56-1024x485.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.22.56-768x364.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.22.56-1536x727.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.22.56-2048x970.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><\/ul><\/li><li>MySQL RDS<ul><li>TODO<\/li><\/ul><\/li><\/ol>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><span class=\"has-inline-color has-ast-global-color-0-color\">Crawlers<\/span><\/strong><\/h5>\n\n\n\n<ul class=\"wp-block-list\"><li>Create a 
database<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"285\" class=\"wp-image-885\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.49.40.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.49.40.png 2880w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.49.40-300x143.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.49.40-1024x486.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.49.40-768x365.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.49.40-1536x730.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.49.40-2048x973.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><li>Note: The table depicted in this image will be created by the crawler. 
Do not create any table.<\/li><\/ul><\/li><\/ul>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>XML Crawlers<\/strong><\/h6>\n\n\n\n<ul class=\"wp-block-list\"><li>Create the XML crawler:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"239\" class=\"wp-image-887\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.53.22.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.53.22.png 2874w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.53.22-300x119.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.53.22-1024x408.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.53.22-768x306.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.53.22-1536x611.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-17.53.22-2048x815.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><li>Create a custom classifier for the XML crawler. 
Use &#8216;<strong>Record<\/strong>&#8216; as the &#8216;Row Tag&#8217;.<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"158\" class=\"wp-image-889\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.23.39.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.23.39.png 2240w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.23.39-300x79.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.23.39-1024x269.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.23.39-768x202.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.23.39-1536x403.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.23.39-2048x538.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"259\" class=\"wp-image-913\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-36-23.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-36-23.png 1575w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-36-23-300x129.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-36-23-1024x441.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-36-23-768x331.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-36-23-1536x662.png 1536w\" sizes=\"auto, (max-width: 
600px) 100vw, 600px\" \/><\/li><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"323\" class=\"wp-image-890\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.24.10.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.24.10.png 2246w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.24.10-300x161.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.24.10-1024x551.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.24.10-768x413.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.24.10-1536x826.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.24.10-2048x1102.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><li>Run the crawler to generate the table and its schema:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"280\" class=\"wp-image-894\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.27.17.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.27.17.png 2212w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.27.17-300x140.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.27.17-1024x479.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.27.17-768x359.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.27.17-1536x718.png 1536w, 
https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.27.17-2048x957.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"302\" class=\"wp-image-893\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.13.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.13.png 2322w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.13-300x151.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.13-1024x515.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.13-768x386.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.13-1536x773.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.13-2048x1030.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"356\" class=\"wp-image-892\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.25.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.25.png 2326w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.25-300x178.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.25-1024x608.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.25-768x456.png 768w, 
https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.25-1536x913.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-03-at-18.28.25-2048x1217.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><\/ul>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><span class=\"has-inline-color has-ast-global-color-0-color\">ETL Jobs<\/span><\/strong><\/h5>\n\n\n\n<h6 class=\"wp-block-heading\"><strong>S3 to Redshift<\/strong><\/h6>\n\n\n\n<p>We will use an ETL job to load the data from S3 into Redshift, removing some of the columns along the way. For this hands-on, we will use Serverless Redshift because it has a free tier.<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Create the Serverless Redshift<ul><li>If this is the first time your account uses Serverless Redshift, the dashboard will prompt you for the information needed to set up a <em>Namespace<\/em> and <em>Workgroup<\/em>. 
<\/li><li>If your Serverless Redshift was already activated, then create a Namespace and Workgroup for this hands-on:<ul><li>Namespace:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"241\" class=\"wp-image-899\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.37.07.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.37.07.png 2238w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.37.07-300x120.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.37.07-1024x411.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.37.07-768x308.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.37.07-1536x616.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.37.07-2048x822.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><li>Note the &#8216;Admin user name&#8217; and the &#8216;Database name&#8217;.<\/li><\/ul><\/li><li>Workgroup:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"287\" class=\"wp-image-984\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.39.22-1024x489-2.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.39.22-1024x489-2.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.39.22-1024x489-2-300x143.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.39.22-1024x489-2-768x367.png 768w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" 
\/><\/li><li><em>Note<\/em>: Your Endpoint, JDBC Url and ODBC Url should point to the correct database.<\/li><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"232\" class=\"wp-image-901\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.11.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.11.png 2090w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.11-300x116.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.11-1024x396.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.11-768x297.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.11-1536x594.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.11-2048x792.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><li><em>Note<\/em>: Place your Redshift in the right VPC with correct Security Group.<\/li><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"144\" class=\"wp-image-902\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.38.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.38.png 2090w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.38-300x72.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.38-1024x245.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.38-768x184.png 768w, 
https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.38-1536x367.png 1536w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-16.40.38-2048x490.png 2048w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><li><em>Note:<\/em> Choose the lowest RPU capacity available to reduce cost.<\/li><\/ul><\/li><\/ul><\/li><li>Once you have set up your <em>Namespace<\/em> and <em>Workgroup<\/em>, connect to your database and create a table using the &#8216;admin&#8217; user. If you forgot its password, go to your <em>Namespace<\/em> and, under &#8216;Actions&#8217;, choose &#8216;Edit admin credentials&#8217;.<ul><li>Create a table named &#8216;hk_buildings&#8217; with the following columns:<ul><li><code>CREATE TABLE IF NOT EXISTS public.hk_buildings (EnglishAddress1 VARCHAR(512), EnglishAddress2 VARCHAR(512), EnglishAddress3 VARCHAR(512), EnglishBuildingName1 VARCHAR(512), EnglishBuildingName2 VARCHAR(512), EnglishBuildingName3 VARCHAR(512), EnglishPublicHousingType VARCHAR(512), OwnersCorporation VARCHAR(512), YearBuild VARCHAR(512));<\/code><\/li><\/ul><\/li><\/ul><\/li><\/ul><\/li><li>Create a new AWS Glue Connection:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"405\" class=\"wp-image-986\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-17.18.00-1024x691-2.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-17.18.00-1024x691-2.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-17.18.00-1024x691-2-300x202.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-2024-10-04-at-17.18.00-1024x691-2-768x518.png 768w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><ul><li><em>Note<\/em>: Use the admin username and password 
to connect. Also, make sure that you are in the right Subnet\/VPC and that your Security Group allows outbound connections to Redshift.<\/li><\/ul><\/li><li>Create the ETL job using the Visual ETL tool (<span style=\"color:#eb1109\" class=\"has-inline-color\">Note: DO NOT enable the <em>Data Preview<\/em>, as this will start an <em>Interactive Session<\/em>, for which you will be charged<\/span>)<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"319\" class=\"wp-image-914\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-44-58.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-44-58.png 1587w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-44-58-300x159.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-44-58-1024x544.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-44-58-768x408.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-44-58-1536x816.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><ul><li>Your job will have four nodes:<ol><li><strong>AWS Glue Data Catalog<\/strong> &#8211; This will be your data source.<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"308\" class=\"wp-image-918\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-53-47.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-53-47.png 1599w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-53-47-300x154.png 300w, 
https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-53-47-1024x526.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-53-47-768x394.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-53-47-1536x789.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><li><strong>Change Schema<\/strong> &#8211; You need this transformation node to change the field names to camel case. The next transformation node (<strong>Drop Fields<\/strong>) is case-sensitive, so it won&#8217;t recognize the fields if you don&#8217;t transform them.<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"322\" class=\"wp-image-917\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-18.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-18.png 1591w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-18-300x161.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-18-1024x549.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-18-768x412.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-18-1536x824.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><li><strong>Drop Fields<\/strong> &#8211; This transformation node will drop fields that contain Chinese characters.<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"317\" class=\"wp-image-916\" style=\"width: 600px;\" 
src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-37.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-37.png 1614w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-37-300x159.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-37-1024x541.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-37-768x406.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-37-1536x812.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><li><strong>Amazon Redshift<\/strong> &#8211; This will write the transformed record to the Amazon RedShift.<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"306\" class=\"wp-image-915\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-56.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-56.png 1605w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-56-300x153.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-56-1024x523.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-56-768x392.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-16-54-56-1536x784.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><\/ol><\/li><\/ul><\/li><li>Run the Job:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"277\" class=\"wp-image-922\" 
style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-05-19.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-05-19.png 1603w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-05-19-300x138.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-05-19-1024x472.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-05-19-768x354.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-05-19-1536x708.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><li>Launch your Amazon Redshift Query Editor to check that the records were loaded into your table:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"257\" class=\"wp-image-921\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-06-03.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-06-03.png 1912w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-06-03-300x128.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-06-03-1024x438.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-06-03-768x329.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-08-17-06-03-1536x657.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><\/ul><\/li><\/ul>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><span class=\"has-inline-color 
has-ast-global-color-0-color\">Workflow<\/span><\/strong><\/h5>\n\n\n\n<p>We will create a workflow triggered by a new file uploaded to the S3 bucket. The workflow will first execute the crawler, followed by the ETL job we created earlier in this hands-on.<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Create a workflow:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"326\" class=\"wp-image-931\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-25.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-25.png 1580w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-25-300x163.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-25-1024x557.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-25-768x418.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-25-1536x835.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><li>Start with an EVENT trigger that is fired by the EventBridge rule, followed by a crawler, and then an ETL job.<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"318\" class=\"wp-image-932\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-43.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-43.png 1584w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-43-300x159.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-43-1024x542.png 1024w, 
https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-43-768x407.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-10-43-1536x814.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><\/ul><\/li><li>Create an EventBridge rule:<ul><li>The event must be triggered by the creation of an object in an S3 folder:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"295\" class=\"wp-image-988\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-37-2.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-37-2.png 1552w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-37-2-300x147.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-37-2-1024x503.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-37-2-768x378.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-37-2-1536x755.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><\/ul><ul><li>Select the Glue ETL job created in the previous hands-on as the target:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"318\" class=\"wp-image-990\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-58-2.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-58-2.png 1599w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-58-2-300x159.png 300w, 
https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-58-2-1024x542.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-58-2-768x407.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-09-58-2-1536x814.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><\/ul><\/li><li>Trigger the EventBridge rule:<ul><li>Upload one of the files to the S3 bucket (clear the folder first so that only the file you upload is processed):<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"150\" class=\"wp-image-935\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-24-17.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-24-17.png 1841w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-24-17-300x75.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-24-17-1024x255.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-24-17-768x191.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-24-17-1536x383.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><li>After uploading the file, the workflow will be triggered:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"319\" class=\"wp-image-936\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-29-54.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-29-54.png 
1592w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-29-54-300x160.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-29-54-1024x545.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-29-54-768x409.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-29-54-1536x817.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/li><\/ul><\/li><li>Check that the data was loaded into the Redshift cluster:<ul><li><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"261\" class=\"wp-image-938\" style=\"width: 600px;\" src=\"http:\/\/192.168.1.3\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-30-17.png\" alt=\"\" srcset=\"https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-30-17.png 1909w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-30-17-300x130.png 300w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-30-17-1024x445.png 1024w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-30-17-768x334.png 768w, https:\/\/mylinuxsite.com\/wordpress\/wp-content\/uploads\/2024\/10\/Screenshot-from-2024-10-14-20-30-17-1536x668.png 1536w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" 
\/><br><\/li><\/ul><\/li><\/ul><\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[11],"tags":[],"class_list":["post-861","post","type-post","status-publish","format-standard","hentry","category-aws-review-notes"],"_links":{"self":[{"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/861","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=861"}],"version-history":[{"count":64,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/861\/revisions"}],"predecessor-version":[{"id":1385,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/861\/revisions\/1385"}],"wp:attachment":[{"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=861"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=861"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mylinuxsite.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=861"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}