Amazon Managed Workflows for Apache Airflow (Amazon MWAA) provides a secure and managed environment to run Apache Airflow on AWS. Airflow is often used in highly regulated industries, such as finance and healthcare. These customers might need to restrict access and traffic further than the Amazon MWAA default configurations do in order to strengthen their security posture. This post covers some recommended practices.
The principle of least privilege is a fundamental tenet that should be followed diligently. When configuring AWS services, it’s essential to grant only the minimum required permissions to resources, avoiding overly broad or permissive policies.
In this post, we explore how to apply the principle of least privilege to your Amazon MWAA environment by tightening network security using security groups, network access control lists (ACLs), and virtual private cloud (VPC) endpoints. We also discuss the Amazon MWAA execution and deployment roles and their respective permissions.
Understanding the Amazon MWAA environment
When an Amazon MWAA environment is created, resources are created in an AWS managed service VPC and your customer managed VPC. In the customer VPC provided at environment creation, the necessary resources to run the Airflow environment are deployed, including schedulers and workers running on Amazon Elastic Container Service (Amazon ECS) clusters. These clusters are deployed in your VPC, and they use elastic network interfaces (ENIs) with private IP addresses in the customer account. These ENIs span private subnets across two Availability Zones to connect to the Airflow database and web server, which reside in the service-owned account (if in private access mode). The following diagram illustrates this architecture.
VPC security groups act as virtual firewalls that control network traffic at the ENI level, or instance level. Security groups are stateful, meaning that return traffic for an allowed inbound connection is automatically allowed outbound, and vice versa. A newly created security group starts with no inbound rules and an outbound rule allowing all traffic. By definition, a security group with no inbound rules denies all ingress traffic except responses to traffic that was allowed out through the 0.0.0.0/0 outbound rule.
Amazon MWAA offers two web server access modes inside the customer VPC: public and private. Public web server mode must have a way for traffic to reach the web servers in the customer-owned VPC through the public internet. This requires routing to the public internet using public subnets and a NAT gateway. A NAT gateway can be used to provide internet access for resources in private subnets. With private access mode, the security group for the Amazon MWAA environment doesn’t need to allow traffic to and from the NAT gateway; it only grants access to the Airflow UI to users with appropriate permissions from within the VPC. An Application Load Balancer is only provisioned in public mode to route traffic to the public web servers. The customer must provision the rest of the networking components.
If your Amazon MWAA environment needs to communicate with resources outside your VPC (such as external data sources or APIs), you might need to configure appropriate security group rules and routing to allow the necessary traffic. In such cases, you’ll typically use a NAT gateway or VPN connection for communication between your Amazon MWAA environment and the external resources, and VPC endpoints for AWS resources.
For tighter security restrictions, you can run an environment with private routing and no internet access, apply finer-grained security group rules, and use VPC endpoint policies. Because this post is about least privilege, we focus on the minimum security requirements needed for an Amazon MWAA environment.
Security groups: Minimizing permissions
Your Amazon MWAA environment has a security group associated with the environment’s resources in your VPC. This security group is also used by the ENIs created by the interface VPC endpoint that’s used to communicate with the database and web server. By default, security groups deny all inbound traffic, and security group rules must be explicitly stated, denoting the ports and source that the instance will allow network traffic from. At a minimum, the Amazon MWAA environment must allow traffic to and from the Amazon Aurora PostgreSQL-Compatible Edition metadata database that’s owned and managed by Amazon MWAA. The metadata database is a crucial component of Airflow that acts as a centralized source of truth for task execution, configuration, and monitoring. Both the scheduler and workers require access to this database to perform their respective roles in orchestrating and running tasks. This database listens on TCP port 5432. Additionally, the web server traffic can be restricted to HTTPS through TCP port 443. At a minimum, the Amazon MWAA security group must have the two inbound rules detailed in the following table.
Type | Protocol | Port Range | Source Type | Source |
Custom TCP | TCP | 5432 | Custom | sg-xxxxx / my-mwaa-vpc-security-group |
HTTPS | TCP | 443 | Custom | sg-xxxxx / my-mwaa-vpc-security-group |
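As a minimal sketch of the two rules in the table, the following Python builds them as self-referencing inbound rules in the shape that the EC2 AuthorizeSecurityGroupIngress API expects for its IpPermissions parameter. The function name and the group ID are hypothetical placeholders, and the snippet only constructs the data; it doesn't call AWS.

```python
import json


def mwaa_ingress_rules(group_id: str) -> list:
    """Build self-referencing inbound rules for the two required ports.

    The source of each rule is the MWAA security group itself, so only
    members of that group can reach the database and web server ports.
    """
    required_ports = {
        5432: "Aurora PostgreSQL metadata database",
        443: "Airflow web server (HTTPS)",
    }
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "UserIdGroupPairs": [
                {"GroupId": group_id, "Description": description}
            ],
        }
        for port, description in required_ports.items()
    ]


if __name__ == "__main__":
    # "sg-xxxxx" is a placeholder, as in the table above.
    print(json.dumps(mwaa_ingress_rules("sg-xxxxx"), indent=2))
```

The resulting list could be passed as IpPermissions when authorizing ingress on the group, for example through an AWS SDK.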
Many customers have other AWS resources residing in VPCs to which the Amazon MWAA workers need access. These resources can also be granted network access in a private routing configuration using security groups. If the resource sits in the same security group, add an additional inbound rule with the port needed. For example, if an Amazon Redshift cluster sits in the same security group, add the following rule.
Type | Protocol | Port Range | Source Type | Source |
Custom TCP | TCP | 5439 | Custom | sg-xxxxx / my-mwaa-vpc-security-group |
If the Redshift cluster is in a different security group, change the source to the Redshift security group.
Type | Protocol | Port Range | Source Type | Source |
Custom TCP | TCP | 5439 | Custom | sg-xxxxx / redshift-security-group |
If the resources are in another VPC, then VPC peering must be enabled before referencing that other VPC’s security group. For resources that don’t reside in a subnet, a VPC endpoint also provides private routing between the Amazon MWAA environment and those resources. For example, a VPC endpoint for Amazon Simple Storage Service (Amazon S3) can provide enhanced security, improved performance, and lower costs.
Network ACLs: Minimizing permissions
Network ACLs manage inbound and outbound traffic at the subnet level through allow and deny rules. An ACL is stateless, which means that inbound and outbound rules must be specified separately and explicitly. It’s used to specify the types of network traffic that are allowed in or out of the instances in a VPC network.
Every Amazon VPC has a default ACL that allows all inbound and outbound traffic, with rules as follows.
Rule number | Type | Protocol | Port Range | Source | Allow/Deny |
100 | All IPv4 traffic | All | All | 0.0.0.0/0 | Allow |
* | All IPv4 traffic | All | All | 0.0.0.0/0 | Deny |
You can edit the default ACL rules or create a custom ACL and attach it to your subnets. A subnet can only have one ACL attached to it at any time, but one ACL can be attached to multiple subnets. To implement least privilege in your Amazon MWAA environment, restrict the inbound ACL to allow traffic from the metadata database and web server, and restrict the outbound ACL to allow traffic only to the clients in the private subnet. Note that the following examples use example private IP ranges for the subnets.
Inbound NACL
Rule number | Type | Protocol | Port Range | Source | Allow/Deny | Comments |
100 | Custom TCP | TCP | 5432 | 10.192.21.0/24 | Allow | Allows inbound database traffic from private subnet |
110 | HTTPS | TCP | 443 | 10.192.21.0/24 | Allow | Allows inbound HTTPS traffic from private subnet |
* | All traffic | All | All | 0.0.0.0/0 | Deny | Denies all inbound IPv4 traffic not already handled by a preceding rule (not modifiable) |
Outbound NACL
Rule number | Type | Protocol | Port Range | Destination | Allow/Deny | Comments |
100 | Custom TCP | TCP | 1024-65535 | 10.192.21.0/24 | Allow | Allows outbound return IPv4 traffic to clients in private subnet |
* | All traffic | All | All | 0.0.0.0/0 | Deny | Denies all outbound IPv4 traffic not already handled by a preceding rule (not modifiable) |
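To make the stateless behavior concrete, here is a small Python sketch that mimics how a network ACL evaluates TCP traffic: rules are checked in ascending rule number order, the first match decides, and anything unmatched falls through to the implicit deny. The AclRule class and evaluate function are hypothetical helpers for illustration, and the sketch assumes the example private subnet is 10.192.21.0/24.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AclRule:
    number: int      # lower numbers are evaluated first
    cidr: str        # source (inbound) or destination (outbound) range
    port_from: int
    port_to: int
    allow: bool


def evaluate(rules, peer_ip: str, port: int) -> bool:
    """Return True if TCP traffic for this peer IP and port is allowed."""
    for rule in sorted(rules, key=lambda r: r.number):
        in_range = ip_address(peer_ip) in ip_network(rule.cidr)
        if in_range and rule.port_from <= port <= rule.port_to:
            return rule.allow  # first matching rule wins
    return False  # the implicit "*" rule denies everything else


# Assumed example subnet; mirrors the inbound and outbound tables above.
INBOUND = [
    AclRule(100, "10.192.21.0/24", 5432, 5432, True),
    AclRule(110, "10.192.21.0/24", 443, 443, True),
]
OUTBOUND = [AclRule(100, "10.192.21.0/24", 1024, 65535, True)]

if __name__ == "__main__":
    print(evaluate(INBOUND, "10.192.21.15", 5432))    # database traffic
    print(evaluate(INBOUND, "203.0.113.5", 443))      # outside the subnet
    print(evaluate(OUTBOUND, "10.192.21.15", 50000))  # ephemeral return port
```

Because the two directions are evaluated independently, a response to an allowed inbound request only leaves the subnet if the outbound list also has a matching rule, which is why the outbound table opens the ephemeral port range.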
VPC endpoints: Minimizing permissions
When you create an Amazon MWAA environment, it’s deployed inside a VPC. This allows you to control the network access and security of your Airflow deployment. However, some customer workloads running in the Amazon MWAA environment might need to orchestrate tasks using other AWS services that reside outside of your VPC, such as Amazon S3 to access files, AWS Glue to start ETL (extract, transform, and load) jobs, or Amazon Redshift to run data warehouse queries. To establish a secure and private connection between your Amazon MWAA environment and these external AWS services, you can use VPC endpoints. VPC endpoints are virtual devices that are provisioned inside your VPC and act as an entry point for the specified AWS service, allowing your Amazon MWAA environment to communicate with the service using a private IP address, without going through the public internet. The following diagram illustrates this architecture.
VPC endpoints allow you to keep your Amazon MWAA environment’s network traffic within the AWS network, reducing exposure to the public internet and improving the overall security of your Airflow deployment. Although private VPC endpoints are automatically created for the database and web server, a least privileged environment without internet access needs additional VPC endpoints for the other resources that Amazon MWAA requires. Amazon S3, Amazon Simple Queue Service (Amazon SQS), Amazon CloudWatch, and optionally AWS Key Management Service (AWS KMS) need VPC endpoints created. For more details, see Creating the required VPC service endpoints in an Amazon VPC with private routing. Beyond the required services, many customers run Amazon MWAA workflows that orchestrate additional AWS services, such as Amazon Redshift, Amazon EMR, and AWS Glue. Let’s look at an example VPC endpoint for connecting to Amazon Redshift, which is commonly called in Airflow DAGs using the Redshift Operator for workflows that interact with Amazon Redshift as a data warehouse. For more information on creating Amazon VPC interface endpoints, see Access an AWS service using an interface VPC endpoint.
Create a VPC endpoint
Complete the following steps to create a VPC endpoint using Amazon Virtual Private Cloud (Amazon VPC):
- On the Amazon VPC console, create a new VPC endpoint for the com.amazonaws.region.redshift service, where region is the AWS Region where your Amazon MWAA environment and Redshift cluster are located. Make sure that private DNS is enabled.
- Create a VPC endpoint policy. This can be used to limit access to the Redshift cluster only to the Amazon MWAA environment, preventing unauthorized access from other sources. The example policy has the following fields:
  - The Version field specifies the policy language version.
  - The Statement section contains a single statement that allows the specified actions on the Redshift cluster.
  - The Effect field is set to Allow, which means the policy grants the specified permissions.
  - The Principal field specifies the AWS Identity and Access Management (IAM) role used as your Amazon MWAA execution role, which is allowed to access the Redshift cluster.
  - The Action field lists the specific Redshift actions that the Amazon MWAA execution role is allowed to perform, such as describing the cluster, getting cluster credentials, and restoring from a snapshot.
  - The Resource field specifies the Amazon Resource Name (ARN) of the Redshift cluster that the policy applies to.
- Associate the VPC endpoint with the correct route table. This route table should be used by the subnets where your Amazon MWAA environment is deployed. If using a VPC interface endpoint, associate the endpoint with the two private subnets and security group used by Amazon MWAA.
- Make sure that the security groups associated with the Amazon MWAA environment and the Redshift cluster allow the necessary inbound and outbound traffic between them. This typically includes allowing access on the Redshift port (usually 5439) from the Amazon MWAA environment’s security group.
- In the Airflow UI, which you open from the Amazon MWAA console, under Admin, Connections, update the Redshift connection details to use the VPC endpoint address instead of the public Redshift endpoint. This makes sure that the connection between Amazon MWAA and Amazon Redshift is secure and stays within the VPC.
By configuring VPC endpoints for the AWS services your Amazon MWAA environment needs to access, you can provide secure, private, and efficient communication between your Airflow deployment and AWS resources.
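The endpoint policy described in the steps above can be sketched as follows. This is an illustration only: the function name, the role and cluster ARNs, and the exact action list are assumptions chosen to match the field descriptions, not a policy taken from a specific deployment. Some Redshift actions authorize against resource types other than the cluster, so check each action against the service authorization reference before relying on a policy like this.

```python
import json


def redshift_endpoint_policy(execution_role_arn: str, cluster_arn: str) -> dict:
    """Build a VPC endpoint policy that admits only the MWAA execution role."""
    return {
        "Version": "2012-10-17",  # policy language version
        "Statement": [
            {
                "Effect": "Allow",
                # Only the MWAA execution role may use the endpoint.
                "Principal": {"AWS": execution_role_arn},
                # Assumed actions: describe, get credentials, restore.
                "Action": [
                    "redshift:DescribeClusters",
                    "redshift:GetClusterCredentials",
                    "redshift:RestoreFromClusterSnapshot",
                ],
                "Resource": cluster_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical account ID, role name, and cluster name.
    policy = redshift_endpoint_policy(
        "arn:aws:iam::123456789012:role/my-mwaa-execution-role",
        "arn:aws:redshift:us-east-1:123456789012:cluster:my-cluster",
    )
    print(json.dumps(policy, indent=2))
```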
Restricting traffic within AWS with customer managed endpoints for Amazon MWAA resources
As mentioned earlier, Amazon MWAA integrates with various AWS services, such as CloudWatch for logging, Amazon S3 for DAGs and requirements, Amazon SQS as messaging middleware, and optionally AWS KMS for encryption. You can create VPC endpoints for these services to make sure traffic stays within the AWS network. Access to these endpoints can be restricted by allowing only the Amazon MWAA security group as the ingress source. For details on how to create these endpoints and policies, see Introducing shared VPC support on Amazon MWAA. If the Amazon MWAA environment was updated after April 2, 2024, it will be on AWS Fargate v1.4 and will not use Amazon Elastic Container Registry (Amazon ECR), so you will not need to create a VPC endpoint for it.
Managing permissions to deploy an Amazon MWAA environment
To create and deploy an Amazon MWAA environment, you must have the appropriate permissions granted to your IAM user or role. The required permissions can be granted through an IAM policy attached to your user or role. When you create an Amazon MWAA environment, you specify an execution role that will be assumed by the Airflow workers to perform tasks. The execution role should have the necessary permissions to access the required AWS services and resources based on your workflow requirements. It’s important to follow the principle of least privilege when granting permissions to IAM roles and users. You should only grant the minimum permissions required for your Amazon MWAA environment and Airflow workflows to function correctly.
Amazon MWAA trust policy
Amazon MWAA needs to be able to assume the execution role in order to perform actions on your behalf. To do this, create a trust policy that allows the Amazon MWAA service to call AssumeRole. To avoid the confused deputy problem, we add a condition to the trust policy; replace the AWS account number and Region as needed. The following is an example policy:
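As a minimal sketch, a trust policy with a confused deputy condition could be assembled like this. The function name, account ID, and Region are placeholders. The two service principals and the aws:SourceArn and aws:SourceAccount condition keys follow the pattern in the Amazon MWAA user guide; verify them against the current documentation for your setup.

```python
import json


def mwaa_trust_policy(region: str, account_id: str) -> dict:
    """Build a trust policy that lets Amazon MWAA assume the execution role.

    The Condition block limits the trust to environments in one account
    and Region, which guards against the confused deputy problem.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": [
                        "airflow.amazonaws.com",
                        "airflow-env.amazonaws.com",
                    ]
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "ArnLike": {
                        "aws:SourceArn": (
                            f"arn:aws:airflow:{region}:{account_id}:environment/*"
                        )
                    },
                    "StringEquals": {"aws:SourceAccount": account_id},
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder Region and account ID; replace with your own.
    print(json.dumps(mwaa_trust_policy("us-east-1", "123456789012"), indent=2))
```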
VPC endpoint permissions for the deployer role
Although the service-linked role creates the VPC endpoints, the deployer role requires permissions to create VPC endpoints and perform a dry run. You can limit these permissions by allowing the ec2:CreateVpcEndpoint action and specifying resource ARNs for VPC endpoints, VPCs, subnets, and security groups. Additionally, you can use the aws:CalledVia condition key to restrict access to the airflow.amazonaws.com service.
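One way to express this restriction is a single policy statement scoped to specific resources and gated on aws:CalledVia. The sketch below is illustrative: the function name and all IDs are hypothetical, and the statement would still need to be combined with whatever other permissions your deployer role requires.

```python
import json


def create_vpc_endpoint_statement(region, account_id, vpc_id, subnet_ids, security_group_ids):
    """Build a statement that allows ec2:CreateVpcEndpoint on named resources
    only, and only when the request is made through Amazon MWAA."""
    prefix = f"arn:aws:ec2:{region}:{account_id}"
    resources = [
        f"{prefix}:vpc-endpoint/*",  # the endpoint does not exist yet
        f"{prefix}:vpc/{vpc_id}",
    ]
    resources += [f"{prefix}:subnet/{subnet_id}" for subnet_id in subnet_ids]
    resources += [f"{prefix}:security-group/{sg_id}" for sg_id in security_group_ids]
    return {
        "Effect": "Allow",
        "Action": "ec2:CreateVpcEndpoint",
        "Resource": resources,
        "Condition": {
            # aws:CalledVia is multivalued, so it needs a set operator.
            "ForAnyValue:StringEquals": {
                "aws:CalledVia": ["airflow.amazonaws.com"]
            }
        },
    }


if __name__ == "__main__":
    # All IDs below are made-up placeholders.
    statement = create_vpc_endpoint_statement(
        "us-east-1",
        "123456789012",
        "vpc-0abc",
        ["subnet-0aaa", "subnet-0bbb"],
        ["sg-0ccc"],
    )
    print(json.dumps(statement, indent=2))
```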
Amazon MWAA execution role: Required permissions
When creating an Amazon MWAA environment, you must specify an execution role that grants the necessary permissions for Airflow to interact with other AWS services. Instead of using a wildcard policy, you can create a custom policy with the minimum required permissions.
The following is an example of an execution role policy that allows Amazon MWAA to interact with various services using an AWS managed key:
This policy grants Amazon MWAA the necessary permissions to interact with CloudWatch Logs, Amazon S3, Amazon SQS, and AWS KMS when using the AWS managed key offering, while explicitly specifying the resources it can access. You can further refine this policy based on your specific requirements.
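For illustration, a trimmed execution role policy for the case where you don't bring your own KMS key might be built as shown below. This is a sketch modeled on the policy pattern in the Amazon MWAA user guide, not a complete policy: the function name and all identifiers are placeholders, and a working environment needs further statements that are omitted here for brevity.

```python
import json


def execution_role_policy(region: str, account_id: str, env_name: str, bucket: str) -> dict:
    """Assemble a scoped execution role policy (no customer managed key)."""
    bucket_arns = [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]
    statements = [
        {   # publish environment health metrics
            "Effect": "Allow",
            "Action": "airflow:PublishMetrics",
            "Resource": f"arn:aws:airflow:{region}:{account_id}:environment/{env_name}",
        },
        {   # read DAGs, plugins, and requirements from one bucket only
            "Effect": "Allow",
            "Action": ["s3:GetObject*", "s3:GetBucket*", "s3:List*"],
            "Resource": bucket_arns,
        },
        {   # write and read this environment's Airflow log groups
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:CreateLogGroup",
                "logs:PutLogEvents",
                "logs:GetLogEvents",
            ],
            "Resource": f"arn:aws:logs:{region}:{account_id}:log-group:airflow-{env_name}-*",
        },
        {   # Celery queue that MWAA creates in the service account
            "Effect": "Allow",
            "Action": [
                "sqs:ChangeMessageVisibility",
                "sqs:DeleteMessage",
                "sqs:GetQueueAttributes",
                "sqs:GetQueueUrl",
                "sqs:ReceiveMessage",
                "sqs:SendMessage",
            ],
            "Resource": f"arn:aws:sqs:{region}:*:airflow-celery-*",
        },
        {   # KMS use only via SQS, and never on keys in your own account
            "Effect": "Allow",
            "Action": ["kms:Decrypt", "kms:DescribeKey", "kms:GenerateDataKey*", "kms:Encrypt"],
            "NotResource": f"arn:aws:kms:*:{account_id}:key/*",
            "Condition": {"StringLike": {"kms:ViaService": [f"sqs.{region}.amazonaws.com"]}},
        },
    ]
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Placeholder Region, account ID, environment name, and bucket.
    policy = execution_role_policy("us-east-1", "123456789012", "my-env", "my-mwaa-bucket")
    print(json.dumps(policy, indent=2))
```

Every statement names specific resources (or excludes them with NotResource) rather than using a bare wildcard, which is the property to preserve as you extend it.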
The following is an example of an execution policy that allows Amazon MWAA to interact with various services using a KMS customer managed key:
For the customer managed key use case, attach the following JSON policy to the key to provide access to the Airflow logs in CloudWatch Logs:
You can attach multiple policies to the execution role as needed to allow your workers to access additional AWS resources. For example, let’s explore how to enable Amazon EMR access. You can create a JSON policy that contains the narrowest permissions you can configure, as in the following example:
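A key policy statement of this kind can be sketched as follows. It follows the pattern AWS documents for letting CloudWatch Logs use a KMS key, with the encryption context condition limiting use to log groups in one account and Region. The function name, Sid, Region, and account ID are placeholders.

```python
import json


def cloudwatch_logs_key_statement(region: str, account_id: str) -> dict:
    """Build a key policy statement that lets CloudWatch Logs use the key."""
    return {
        "Sid": "AllowCloudWatchLogs",
        "Effect": "Allow",
        "Principal": {"Service": f"logs.{region}.amazonaws.com"},
        "Action": [
            "kms:Encrypt*",
            "kms:Decrypt*",
            "kms:ReEncrypt*",
            "kms:GenerateDataKey*",
            "kms:Describe*",
        ],
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
        "Condition": {
            "ArnLike": {
                "kms:EncryptionContext:aws:logs:arn": (
                    f"arn:aws:logs:{region}:{account_id}:*"
                )
            }
        },
    }


if __name__ == "__main__":
    # Placeholder Region and account ID.
    print(json.dumps(cloudwatch_logs_key_statement("us-east-1", "123456789012"), indent=2))
```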
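As a rough sketch of what a narrow Amazon EMR policy could look like, the following limits the workers to submitting and monitoring steps on one existing cluster. The function name, the choice of actions, and the cluster ID are assumptions for illustration; a workflow that creates clusters or passes roles would need different permissions.

```python
import json


def emr_step_policy(region: str, account_id: str, cluster_id: str) -> dict:
    """Build a policy that allows step submission on a single EMR cluster."""
    cluster_arn = f"arn:aws:elasticmapreduce:{region}:{account_id}:cluster/{cluster_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed minimal set: add steps and poll their status.
                "Action": [
                    "elasticmapreduce:AddJobFlowSteps",
                    "elasticmapreduce:DescribeStep",
                    "elasticmapreduce:DescribeCluster",
                ],
                "Resource": cluster_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder Region, account ID, and cluster ID.
    print(json.dumps(emr_step_policy("us-east-1", "123456789012", "j-EXAMPLE"), indent=2))
```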
Conclusion
In this post, we discussed best practices for least privilege configuration in Amazon MWAA. By following these approaches, you can adhere to the principle of least privilege and maintain a secure posture within your Amazon MWAA environment, without compromising functionality or relying on overly permissive policies. Security is always top priority; to learn more about security in Amazon MWAA, see Security in Amazon Managed Workflows for Apache Airflow and Security best practices on Amazon MWAA.
About the Authors
Elizabeth Davis is a Sr Solutions Architect at Amazon Web Services (AWS). She currently works with educational technology companies and has a passion for serverless and data orchestration technologies. She has been an Amazon MWAA subject matter expert (SME) for the last 3+ years.
Mark Richman is a Principal Solutions Architect at Amazon Web Services with 30 years of experience building complex web and enterprise software. He contributes to Apache Airflow, bringing his expertise in cloud computing and serverless technologies to the open-source platform. Mark is also an accomplished writer and speaker who has authored industry publications and AWS courses while regularly presenting at industry events.