
Manage Amazon Redshift provisioned clusters with Terraform


Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it straightforward and cost-effective to analyze all your data using standard SQL and your existing extract, transform, and load (ETL); business intelligence (BI); and reporting tools. Tens of thousands of customers use Amazon Redshift to process exabytes of data per day and power analytics workloads such as BI, predictive analytics, and real-time streaming analytics.

HashiCorp Terraform is an infrastructure as code (IaC) tool that lets you define cloud resources in human-readable configuration files that you can version, reuse, and share. You can then use a consistent workflow to provision and manage your infrastructure throughout its lifecycle.

In this post, we demonstrate how to use Terraform to manage common Redshift cluster operations, such as:

  • Creating a new provisioned Redshift cluster using Terraform code and adding an AWS Identity and Access Management (IAM) role to it
  • Scheduling pause, resume, and resize operations for the Redshift cluster

Solution overview

The following diagram illustrates the solution architecture for provisioning a Redshift cluster using Terraform.

Solution architecture diagram

In addition to Amazon Redshift, the solution uses the following AWS services:

  • Amazon Elastic Compute Cloud (Amazon EC2) offers the broadest and deepest compute platform, with over 750 instances and choice of the latest processors, storage, networking, operating system (OS), and purchase model to help you best match the needs of your workload. For this post, we use an m5.xlarge instance with Windows Server 2022 Datacenter Edition. The choice of instance type and Windows OS is flexible; you can choose a configuration that suits your use case.
  • IAM enables you to securely manage identities and access to AWS services and resources. We use IAM roles and policies to securely access services and perform the relevant operations. An IAM role is an AWS identity that you can assume to gain temporary access to AWS services and resources. Each IAM role has a set of permissions defined by IAM policies. These policies determine the actions and resources the role can access.
  • AWS Secrets Manager enables you to securely store the user name and password needed to log in to Amazon Redshift.

In this post, we demonstrate how to set up an environment that connects AWS and Terraform. The following are the high-level tasks involved:

  1. Set up an EC2 instance with Windows OS in AWS.
  2. Install Terraform on the instance.
  3. Configure your environment variables (Windows OS).
  4. Define an IAM policy with minimal access to perform actions on a Redshift cluster, including pause, resume, and resize.
  5. Establish an IAM role using the policy you created.
  6. Create a provisioned Redshift cluster using Terraform code.
  7. Attach the IAM role you created to the Redshift cluster.
  8. Write the Terraform code to schedule cluster operations like pause, resume, and resize.

Prerequisites

To complete the actions described in this post, you need an AWS account and administrator privileges on the account to use the key AWS services and create the required IAM roles.

Create an EC2 instance

We begin with creating an EC2 instance. Complete the following steps to create a Windows OS EC2 instance:

  1. On the Amazon EC2 console, choose Launch instance.
  2. Choose a Windows Server Amazon Machine Image (AMI) that suits your requirements.
  3. Select an appropriate instance type for your use case.
  4. Configure the instance details:
    1. Choose the VPC and subnet where you want to launch the instance.
    2. Enable Auto-assign Public IP.
    3. For Add storage, configure the desired storage options for your instance.
    4. Add any necessary tags to the instance.
  5. For Configure security group, select or create a security group that allows the required inbound and outbound traffic to your instance.
  6. Review the instance configuration and choose Launch to start the instance creation process.
  7. For Select an existing key pair or create a new key pair, choose an existing key pair or create a new one.
  8. Choose Launch instance.
  9. When the instance is running, you can connect to it using the Remote Desktop Protocol (RDP) and the administrator password obtained from the Get Windows password option.

Install Terraform on the EC2 instance

Install Terraform on the Windows EC2 instance using the following steps:

  1. RDP into the EC2 instance you created.
  2. Install Terraform on the EC2 instance.

You need to update the environment variables to point to the directory where the Terraform executable is available.

  3. Under System Properties, on the Advanced tab, choose Environment Variables.

Environment Variables

  4. Choose the Path variable.

Path Variables

  5. Choose New and enter the path where Terraform is installed. For this post, it's in the C: directory.

Add Terraform to path variable

  6. Confirm Terraform is installed by entering the following command:

terraform -v

Check Terraform version

Optionally, you can use an editor like Visual Studio Code (VS Code) and add the Terraform extension to it.

Create a user for accessing AWS through code (AWS CLI and Terraform)

Next, we create an administrator user in IAM, which performs the operations on AWS through Terraform and the AWS Command Line Interface (AWS CLI). Complete the following steps:

  1. Create a new IAM user.
  2. On the IAM console, download and save the access key and secret key.

Create New IAM User

  3. Install the AWS CLI.
  4. Launch the AWS CLI, run aws configure, and pass the access key ID, secret access key, and default AWS Region.

This prevents the AWS user name and password from being visible in plain text in the Terraform code and prevents accidental sharing when the code is committed to a code repository.
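As a reference, aws configure writes these values to the AWS CLI's shared config files, which Terraform's AWS provider also reads automatically; a sketch of the resulting files (all values here are placeholders):

```ini
# ~/.aws/credentials (values are placeholders)
[default]
aws_access_key_id     = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# ~/.aws/config
[default]
region = us-east-1
```

Because the provider reads these files, no credentials need to appear in the Terraform code itself.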

AWS Configure

Create a user for accessing Redshift through code (Terraform)

Because we're creating a Redshift cluster and scheduling subsequent operations, the administrator user name and password required for these processes (different from the admin user we created earlier for logging in to the AWS Management Console) need to be invoked in the code. To do this securely, we use Secrets Manager to store the user name and password. We write code in Terraform to access these credentials during the cluster create operation. Complete the following steps:

  1. On the Secrets Manager console, choose Secrets in the navigation pane.
  2. Choose Store a new secret.

Store a New Secret

  3. For Secret type, select Credentials for Amazon Redshift data warehouse.
  4. Enter your credentials.

Choose Secret Type
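The Terraform code later in this post parses this secret with jsondecode and references its username and password keys, so the stored secret value is assumed to be a JSON object along these lines (the values are placeholders):

```json
{
  "username": "awsadmin",
  "password": "YourStrongPassword1"
}
```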

Set up Terraform

Complete the following steps to set up Terraform:

  1. Create a folder or directory for storing all your Terraform code.
  2. Open the VS Code editor and browse to your folder.
  3. Choose New File and enter a name for the file using the .tf extension.

Now we're ready to start writing our code, beginning with defining providers. The provider definition is the way for Terraform to get the APIs it needs to interact with AWS.

  4. Configure a provider for Terraform:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.53.0"
    }
  }
}

# Configure the AWS Provider
provider "aws" {
  region = "us-east-1"
}

  5. Access the admin credentials for the Amazon Redshift admin user:

data "aws_secretsmanager_secret_version" "creds" {
  # Fill in the name you gave to your secret
  secret_id = "terraform-creds"
}

/*json decode to parse the secret*/
locals {
  terraform-creds = jsondecode(
    data.aws_secretsmanager_secret_version.creds.secret_string
  )
}

Create a Redshift cluster

To create a Redshift cluster, use the aws_redshift_cluster resource:

# Create an encrypted Amazon Redshift cluster
resource "aws_redshift_cluster" "dw_cluster" {
  cluster_identifier   = "tf-example-redshift-cluster"
  database_name        = "dev"
  master_username      = local.terraform-creds.username
  master_password      = local.terraform-creds.password
  node_type            = "ra3.xlplus"
  cluster_type         = "multi-node"
  publicly_accessible  = false
  number_of_nodes      = 2
  encrypted            = true
  kms_key_id           = local.RedshiftClusterEncryptionKeySecret.arn
  enhanced_vpc_routing = true
  cluster_subnet_group_name = "<<your-cluster-subnet-groupname>>"
}

In this example, we create a two-node Redshift cluster called tf-example-redshift-cluster using the ra3.xlplus node type. We use the credentials from Secrets Manager and jsondecode to access these values. This makes sure the user name and password aren't passed in plain text.

Add an IAM role to the cluster

Because we didn't have the option to associate an IAM role during cluster creation, we do so now with the following code:

resource "aws_redshift_cluster_iam_roles" "cluster_iam_role" {
  cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
  iam_role_arns      = ["arn:aws:iam::yourawsaccountId:role/service-role/yourIAMrolename"]
}
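As an optional refinement (not part of the original walkthrough), you can avoid hardcoding the account ID in that ARN by looking it up with the aws_caller_identity data source; the role name here is still a placeholder:

```hcl
# optional sketch: derive the current account ID instead of hardcoding it
data "aws_caller_identity" "current" {}

resource "aws_redshift_cluster_iam_roles" "cluster_iam_role" {
  cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
  iam_role_arns      = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/service-role/yourIAMrolename"]
}
```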

Enable Redshift cluster operations

Performing operations on the Redshift cluster such as resize, pause, and resume on a schedule offers a more practical use of these operations. Therefore, we create two policies: one that allows the Amazon Redshift scheduler service and one that allows the cluster pause, resume, and resize operations. Then we create a role that has both policies attached to it.

You can perform these steps directly from the console and then reference them in Terraform code. The following example demonstrates the code snippets to create the policies and a role, and then to attach the policy to the role.

  1. Create the Amazon Redshift scheduler policy document and create the role that assumes this policy:

#define policy document to establish the Trust Relationship between the role and the entity (Redshift scheduler)
data "aws_iam_policy_document" "assume_role_scheduling" {
  statement {
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["scheduler.redshift.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
}

#create a role that has the above trust relationship attached to it, so that it can invoke the redshift scheduling service
resource "aws_iam_role" "scheduling_role" {
  name               = "redshift_scheduled_action_role"
  assume_role_policy = data.aws_iam_policy_document.assume_role_scheduling.json
}

  2. Create a policy document and policy for Amazon Redshift operations:

/*define the policy document for the redshift operations*/
data "aws_iam_policy_document" "redshift_operations_policy_definition" {
  statement {
    effect = "Allow"
    actions = [
      "redshift:PauseCluster",
      "redshift:ResumeCluster",
      "redshift:ResizeCluster",
    ]
    resources = ["arn:aws:redshift:*:youraccountid:cluster:*"]
  }
}

/*create the policy and add the above data (json) to the policy*/
resource "aws_iam_policy" "scheduling_actions_policy" {
  name   = "redshift_scheduled_action_policy"
  policy = data.aws_iam_policy_document.redshift_operations_policy_definition.json
}

  3. Attach the policy to the IAM role:

/*connect the policy and the role*/
resource "aws_iam_role_policy_attachment" "role_policy_attach" {
  policy_arn = aws_iam_policy.scheduling_actions_policy.arn
  role       = aws_iam_role.scheduling_role.name
}

  4. Pause the Redshift cluster:

#pause a cluster
resource "aws_redshift_scheduled_action" "pause_operation" {
  name     = "tf-redshift-scheduled-action-pause"
  schedule = "cron(00 22 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    pause_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
    }
  }
}

In the preceding example, we created a scheduled action called tf-redshift-scheduled-action-pause that pauses the cluster at 10:00 PM every day as a cost-saving action.

  5. Resume the Redshift cluster:

#resume a cluster
resource "aws_redshift_scheduled_action" "resume_operation" {
  name     = "tf-redshift-scheduled-action-resume"
  schedule = "cron(15 07 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    resume_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
    }
  }
}

In the preceding example, we created a scheduled action called tf-redshift-scheduled-action-resume that resumes the cluster at 7:15 AM every day, in time for business operations to start using the Redshift cluster.

  6. Resize the Redshift cluster:

#resize a cluster
resource "aws_redshift_scheduled_action" "resize_operation" {
  name     = "tf-redshift-scheduled-action-resize"
  schedule = "cron(15 14 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    resize_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
      cluster_type       = "multi-node"
      node_type          = "ra3.xlplus"
      number_of_nodes    = 4    /*increase the number of nodes using the resize operation*/
      classic            = true /*default behavior is elastic resize; set this boolean value if you want to use classic resize*/
    }
  }
}

In the preceding example, we created a scheduled action called tf-redshift-scheduled-action-resize that increases the nodes from 2 to 4. You can do other operations, like changing the node type, as well. By default, elastic resize is used, but if you want to use classic resize, you have to pass the parameter classic = true as shown in the preceding code. This can be a scheduled action to anticipate the needs of peak periods and resize appropriately for that duration. You can then downsize using similar code during non-peak times.
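A hypothetical sketch of that downsize, as a second scheduled action (the name and schedule here are illustrative assumptions, not from the original configuration), could shrink the cluster back to 2 nodes after the peak window:

```hcl
# hypothetical scheduled action to downsize back to 2 nodes after peak hours
resource "aws_redshift_scheduled_action" "downsize_operation" {
  name     = "tf-redshift-scheduled-action-downsize" # illustrative name
  schedule = "cron(00 20 * * ? *)"                   # illustrative time
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    resize_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
      cluster_type       = "multi-node"
      node_type          = "ra3.xlplus"
      number_of_nodes    = 2 # back to the original size
    }
  }
}
```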

Test the solution

We apply the following code to test the solution. Change the resource details accordingly, such as the account ID and Region name.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.53.0"
    }
  }
}

# Configure the AWS Provider
provider "aws" {
  region = "us-east-1"
}

# access secrets stored in secrets manager
data "aws_secretsmanager_secret_version" "creds" {
  # Fill in the name you gave to your secret
  secret_id = "terraform-creds"
}

/*json decode to parse the secret*/
locals {
  terraform-creds = jsondecode(
    data.aws_secretsmanager_secret_version.creds.secret_string
  )
}

#Store the arn of the KMS key to be used for encrypting the redshift cluster

data "aws_secretsmanager_secret_version" "encryptioncreds" {
  secret_id = "RedshiftClusterEncryptionKeySecret"
}
locals {
  RedshiftClusterEncryptionKeySecret = jsondecode(
    data.aws_secretsmanager_secret_version.encryptioncreds.secret_string
  )
}

# Create an encrypted Amazon Redshift cluster
resource "aws_redshift_cluster" "dw_cluster" {
  cluster_identifier   = "tf-example-redshift-cluster"
  database_name        = "dev"
  master_username      = local.terraform-creds.username
  master_password      = local.terraform-creds.password
  node_type            = "ra3.xlplus"
  cluster_type         = "multi-node"
  publicly_accessible  = false
  number_of_nodes      = 2
  encrypted            = true
  kms_key_id           = local.RedshiftClusterEncryptionKeySecret.arn
  enhanced_vpc_routing = true
  cluster_subnet_group_name = "redshiftclustersubnetgroup-yuu4sywme0bk"
}

#add IAM Role to the Redshift cluster

resource "aws_redshift_cluster_iam_roles" "cluster_iam_role" {
  cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
  iam_role_arns      = ["arn:aws:iam::youraccountid:role/service-role/yourrolename"]
}

#for audit logging please create an S3 bucket which has read/write privileges for the Redshift service; this example does not include S3 bucket creation.

resource "aws_redshift_logging" "redshiftauditlogging" {
  cluster_identifier   = aws_redshift_cluster.dw_cluster.cluster_identifier
  log_destination_type = "s3"
  bucket_name          = "your-s3-bucket-name"
}

#to do operations like pause, resume, resize on a schedule we need to first create a role that has permissions to perform these operations on the cluster

#define policy document to establish the Trust Relationship between the role and the entity (Redshift scheduler)

data "aws_iam_policy_document" "assume_role_scheduling" {
  statement {
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["scheduler.redshift.amazonaws.com"]
    }

    actions = ["sts:AssumeRole"]
  }
}

#create a role that has the above trust relationship attached to it, so that it can invoke the redshift scheduling service
resource "aws_iam_role" "scheduling_role" {
  name               = "redshift_scheduled_action_role"
  assume_role_policy = data.aws_iam_policy_document.assume_role_scheduling.json
}

/*define the policy document for the redshift operations*/

data "aws_iam_policy_document" "redshift_operations_policy_definition" {
  statement {
    effect = "Allow"
    actions = [
      "redshift:PauseCluster",
      "redshift:ResumeCluster",
      "redshift:ResizeCluster",
    ]

    resources = ["arn:aws:redshift:*:youraccountid:cluster:*"]
  }
}

/*create the policy and add the above data (json) to the policy*/

resource "aws_iam_policy" "scheduling_actions_policy" {
  name   = "redshift_scheduled_action_policy"
  policy = data.aws_iam_policy_document.redshift_operations_policy_definition.json
}

/*connect the policy and the role*/

resource "aws_iam_role_policy_attachment" "role_policy_attach" {
  policy_arn = aws_iam_policy.scheduling_actions_policy.arn
  role       = aws_iam_role.scheduling_role.name
}

#pause a cluster

resource "aws_redshift_scheduled_action" "pause_operation" {
  name     = "tf-redshift-scheduled-action-pause"
  schedule = "cron(00 14 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    pause_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
    }
  }
}

#resume a cluster

resource "aws_redshift_scheduled_action" "resume_operation" {
  name     = "tf-redshift-scheduled-action-resume"
  schedule = "cron(15 14 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    resume_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
    }
  }
}

#resize a cluster

resource "aws_redshift_scheduled_action" "resize_operation" {
  name     = "tf-redshift-scheduled-action-resize"
  schedule = "cron(15 14 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    resize_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
      cluster_type       = "multi-node"
      node_type          = "ra3.xlplus"
      number_of_nodes    = 4    /*increase the number of nodes using the resize operation*/
      classic            = true /*default behavior is elastic resize; set this boolean value if you want to use classic resize*/
    }
  }
}

Run terraform init to initialize the working directory and download the AWS provider, then run terraform plan to see a list of changes that will be made, as shown in the following screenshot.

Terraform plan

After you have reviewed the changes, use terraform apply to create the resources you defined.

Terraform Apply

You will be asked to enter yes or no before Terraform starts creating the resources.

Confirmation of apply

You can confirm that the cluster is being created on the Amazon Redshift console.

redshift cluster creation

After the cluster is created, the IAM roles and schedules for the pause, resume, and resize operations are added, as shown in the following screenshot.

Terraform actions

You can also view these scheduled operations on the Amazon Redshift console.

Scheduled Actions

Clean up

If you deployed resources such as the Redshift cluster and IAM roles, or any of the other associated resources, by running terraform apply, then to avoid incurring charges on your AWS account, run terraform destroy to tear those resources down and clean up your environment.

Conclusion

Terraform offers a powerful and flexible solution for managing your infrastructure as code using a declarative approach, with a cloud-agnostic nature, resource orchestration capabilities, and strong community support. This post provided a comprehensive guide to using Terraform to deploy a Redshift cluster and perform important operations such as resize, resume, and pause on the cluster. Embracing IaC and using the right tools, such as Workflow Studio, VS Code, and Terraform, will help you build scalable and maintainable distributed applications and automate processes.


About the Authors

Amit Ghodke is an Analytics Specialist Solutions Architect based out of Austin. He has worked with databases, data warehouses, and analytical applications for the past 16 years. He loves to help customers implement analytical solutions at scale to derive maximum business value.

Ritesh Kumar Sinha is an Analytics Specialist Solutions Architect based out of San Francisco. He has helped customers build scalable data warehousing and big data solutions for over 16 years. He loves to design and build efficient end-to-end solutions on AWS. In his spare time, he loves reading, walking, and doing yoga.
