
Introducing Amazon MWAA Serverless | AWS Big Data Blog


Today, AWS announced Amazon Managed Workflows for Apache Airflow (MWAA) Serverless, a new deployment option for MWAA that eliminates the operational overhead of managing Apache Airflow environments while optimizing costs through serverless scaling. This new offering addresses key challenges that data engineers and DevOps teams face when orchestrating workflows: operational scalability, cost optimization, and access management.

With MWAA Serverless you can focus on your workflow logic rather than monitoring provisioned capacity. You can now submit your Airflow workflows for execution on a schedule or on demand, paying only for the actual compute time used during each task's execution. The service automatically handles all infrastructure scaling so that your workflows run efficiently regardless of load.

Beyond simplified operations, MWAA Serverless introduces an updated security model for granular control through AWS Identity and Access Management (IAM). Each workflow can now have its own IAM permissions and run in a VPC of your choosing, so you can implement precise security controls without creating separate Airflow environments. This approach significantly reduces security management overhead while strengthening your security posture.

In this post, we demonstrate how to use MWAA Serverless to build and deploy scalable workflow automation solutions. We walk through practical examples of creating and deploying workflows, setting up observability through Amazon CloudWatch, and converting existing Apache Airflow DAGs (Directed Acyclic Graphs) to the serverless format. We also explore best practices for managing serverless workflows and show you how to implement monitoring and logging.

How does MWAA Serverless work?

MWAA Serverless processes your workflow definitions and executes them efficiently in service-managed Airflow environments, automatically scaling resources based on workflow demands. MWAA Serverless uses the Amazon Elastic Container Service (Amazon ECS) executor to run each individual task in its own ECS Fargate container, in either your VPC or a service-managed VPC. These containers then communicate back to their assigned Airflow cluster using the Airflow 3 Task API.


Figure 1: Amazon MWAA architecture

MWAA Serverless uses declarative YAML configuration files based on the popular open source DAG Factory format to enhance security through task isolation. You have two options for creating these workflow definitions: author the YAML directly, or convert existing Python DAGs with the conversion tool covered later in this post.

This declarative approach provides two key benefits. First, because MWAA Serverless reads workflow definitions from YAML, it can determine task scheduling without running any workflow code. Second, it allows MWAA Serverless to grant execution permissions only when tasks run, rather than requiring broad permissions at the workflow level. The result is a more secure environment where task permissions are precisely scoped and time limited.

Service considerations for MWAA Serverless

MWAA Serverless has the following limitations that you should consider when deciding between serverless and provisioned MWAA deployments:

  • Operator support
    • MWAA Serverless only supports operators from the Amazon provider package.
    • To execute custom code or scripts, you need to hand off to other AWS services, for example AWS Lambda or Amazon ECS (see the sketch after this list).
  • User interface
    • MWAA Serverless operates without the Airflow web interface.
    • For workflow monitoring and management, we provide integration with Amazon CloudWatch and AWS CloudTrail.
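
As an illustration of the custom-code handoff, a task can delegate logic to an AWS Lambda function using the Lambda operator from the Amazon provider package. The following is a minimal sketch of such a task in the YAML format described above; the function name my-custom-logic is a placeholder for your own Lambda function:

run_custom_logic:
  operator: airflow.providers.amazon.aws.operators.lambda_function.LambdaInvokeFunctionOperator
  function_name: 'my-custom-logic'
  payload: '{"source": "mwaa-serverless"}'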

Working with MWAA Serverless

Complete the following prerequisites and steps to use MWAA Serverless.

Prerequisites

Before you begin, verify that you have the following requirements in place:

  • Access and permissions
    • An AWS account
    • AWS Command Line Interface (AWS CLI) version 2.31.38 or later, installed and configured
    • The appropriate permissions to create and modify IAM roles and policies, including the following required IAM permissions:
      • airflow-serverless:CreateWorkflow
      • airflow-serverless:DeleteWorkflow
      • airflow-serverless:GetTaskInstance
      • airflow-serverless:GetWorkflowRun
      • airflow-serverless:ListTaskInstances
      • airflow-serverless:ListWorkflowRuns
      • airflow-serverless:ListWorkflows
      • airflow-serverless:StartWorkflowRun
      • airflow-serverless:UpdateWorkflow
      • iam:CreateRole
      • iam:DeleteRole
      • iam:DeleteRolePolicy
      • iam:GetRole
      • iam:PutRolePolicy
      • iam:UpdateAssumeRolePolicy
      • logs:CreateLogGroup
      • logs:CreateLogStream
      • logs:PutLogEvents
      • airflow:GetEnvironment
      • airflow:ListEnvironments
      • s3:DeleteObject
      • s3:GetObject
      • s3:ListBucket
      • s3:PutObject
      • s3:Sync
    • Access to an Amazon Virtual Private Cloud (VPC) with internet connectivity
  • Required AWS services – In addition to MWAA Serverless, you need access to the following AWS services:
    • Amazon MWAA to access your existing Airflow environment(s)
    • Amazon CloudWatch to view logs
    • Amazon S3 for DAG and YAML file management
    • AWS IAM to control permissions
  • Development environment
  • Additional requirements
    • Basic familiarity with Apache Airflow concepts
    • Understanding of YAML syntax
    • Knowledge of AWS CLI commands

Note: Throughout this post, we use example values that you'll need to replace with your own:

  • Replace amzn-s3-demo-bucket with your S3 bucket name
  • Replace 111122223333 with your AWS account number
  • Replace us-east-2 with your AWS Region. MWAA Serverless is available in multiple AWS Regions. Check the List of AWS Services Available by Region for current availability.

Creating your first serverless workflow

Let's start by defining a simple workflow that gets a list of S3 objects and writes that list to a file in the same bucket. Create a new file called simple_s3_test.yaml with the following content:

simples3test:
  dag_id: simples3test
  schedule: 0 0 * * *
  tasks:
    list_objects:
      operator: airflow.providers.amazon.aws.operators.s3.S3ListOperator
      bucket: 'amzn-s3-demo-bucket'
      prefix: ''
      retries: 0
    create_object_list:
      operator: airflow.providers.amazon.aws.operators.s3.S3CreateObjectOperator
      data: '{{ ti.xcom_pull(task_ids="list_objects", key="return_value") }}'
      s3_bucket: 'amzn-s3-demo-bucket'
      s3_key: 'filelist.txt'
      dependencies: [list_objects]

For this workflow to run, you must create an execution role that has permission to list and write to the bucket above. The role also needs to be assumable by MWAA Serverless. The following CLI commands create this role and its associated policy:

aws iam create-role \
--role-name mwaa-serverless-access-role \
--assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "Service": [
            "airflow-serverless.amazonaws.com"
          ]
        },
        "Action": "sts:AssumeRole"
      },
      {
        "Sid": "AllowAirflowServerlessAssumeRole",
        "Effect": "Allow",
        "Principal": {
          "Service": "airflow-serverless.amazonaws.com"
        },
        "Action": "sts:AssumeRole",
        "Condition": {
          "StringEquals": {
            "aws:SourceAccount": "${aws:PrincipalAccount}"
          },
          "ArnLike": {
            "aws:SourceArn": "arn:aws:*:*:${aws:PrincipalAccount}:workflow/*"
          }
        }
      }
    ]
  }'

aws iam put-role-policy \
  --role-name mwaa-serverless-access-role \
  --policy-name mwaa-serverless-policy \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "CloudWatchLogsAccess",
        "Effect": "Allow",
        "Action": [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ],
        "Resource": "*"
      },
      {
        "Sid": "S3DataAccess",
        "Effect": "Allow",
        "Action": [
          "s3:ListBucket",
          "s3:GetObject",
          "s3:PutObject"
        ],
        "Resource": [
          "arn:aws:s3:::amzn-s3-demo-bucket",
          "arn:aws:s3:::amzn-s3-demo-bucket/*"
        ]
      }
    ]
}'

You then copy your YAML DAG to the same S3 bucket and create your workflow using the role ARN returned by the preceding create-role command:

aws s3 cp "simple_s3_test.yaml" \
s3://amzn-s3-demo-bucket/yaml/simple_s3_test.yaml

aws mwaa-serverless create-workflow \
--name simple_s3_test \
--definition-s3-location '{ "Bucket": "amzn-s3-demo-bucket", "ObjectKey": "yaml/simple_s3_test.yaml" }' \
--role-arn arn:aws:iam::111122223333:role/mwaa-serverless-access-role \
--region us-east-2

The output of the last command returns a WorkflowArn value, which you then use to run the workflow:

aws mwaa-serverless start-workflow-run \
--workflow-arn arn:aws:airflow-serverless:us-east-2:111122223333:workflow/simple_s3_test-abc1234def \
--region us-east-2

The output returns a RunId value, which you then use to check the status of the workflow run that you just started:

aws mwaa-serverless get-workflow-run \
--workflow-arn arn:aws:airflow-serverless:us-east-2:111122223333:workflow/simple_s3_test-abc1234def \
--run-id ABC123456789def \
--region us-east-2
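
If you script against this API, you can poll get-workflow-run until the run reaches a terminal state. The following is a minimal sketch; treating SUCCESS and FAILED as the only terminal RunState values is an assumption based on the run outputs shown later in this post:

while true; do
  STATE=$(aws mwaa-serverless get-workflow-run \
    --workflow-arn arn:aws:airflow-serverless:us-east-2:111122223333:workflow/simple_s3_test-abc1234def \
    --run-id ABC123456789def \
    --region us-east-2 \
    --query 'RunDetail.RunState' \
    --output text)
  echo "RunState: $STATE"
  case $STATE in SUCCESS|FAILED) break ;; esac
  sleep 15
done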

If you need to make a change to your YAML, copy it back to S3 and run the update-workflow command:

aws s3 cp "simple_s3_test.yaml" \
s3://amzn-s3-demo-bucket/yaml/simple_s3_test.yaml

aws mwaa-serverless update-workflow \
--workflow-arn arn:aws:airflow-serverless:us-east-2:111122223333:workflow/simple_s3_test-abc1234def \
--definition-s3-location '{ "Bucket": "amzn-s3-demo-bucket", "ObjectKey": "yaml/simple_s3_test.yaml" }' \
--role-arn arn:aws:iam::111122223333:role/mwaa-serverless-access-role \
--region us-east-2

Converting Python DAGs to YAML format

AWS has published a conversion tool that uses the open source Airflow DAG processor to serialize Python DAGs into the YAML DAG Factory format. To install it, run the following:

pip3 install python-to-yaml-dag-converter-mwaa-serverless
dag-converter convert source_dag.py --output output_yaml_folder

For example, create the following DAG and name it create_s3_objects.py:

from datetime import datetime
from airflow import DAG
from airflow.models.param import Param
from airflow.providers.amazon.aws.operators.s3 import S3CreateObjectOperator

default_args = {
    'start_date': datetime(2024, 1, 1),
    'retries': 0,
}

dag = DAG(
    'create_s3_objects',
    default_args=default_args,
    description='Create multiple S3 objects in a loop',
    schedule=None
)

# Set the number of files to create
LOOP_COUNT = 3
s3_bucket = "md-workflows-mwaa-bucket"
s3_prefix = 'test-files'

# Create multiple S3 objects using a loop
last_task = None
for i in range(1, LOOP_COUNT + 1):
    create_object = S3CreateObjectOperator(
        task_id=f'create_object_{i}',
        s3_bucket=s3_bucket,
        s3_key=f'{s3_prefix}/{i}.txt',
        data="{{ ds_nodash }}-{ lower }",
        replace=True,
        dag=dag
    )
    if last_task:
        last_task >> create_object
    last_task = create_object

Once you have installed python-to-yaml-dag-converter-mwaa-serverless, run:

dag-converter convert "/path_to/create_s3_objects.py" --output "/path_to/yaml/"

The output will end with:

YAML validation successful, no errors found

YAML written to /path_to/yaml/create_s3_objects.yaml

The resulting YAML will look like the following:

create_s3_objects:
  dag_id: create_s3_objects
  params: {}
  default_args:
    start_date: '2024-01-01'
    retries: 0
  schedule: None
  tasks:
    create_object_1:
      operator: airflow.providers.amazon.aws.operators.s3.S3CreateObjectOperator
      aws_conn_id: aws_default
      data: '{{ ds_nodash }}-{ lower }'
      encrypt: false
      outlets: []
      params: {}
      priority_weight: 1
      replace: true
      retries: 0
      retry_delay: 300.0
      retry_exponential_backoff: false
      s3_bucket: md-workflows-mwaa-bucket
      s3_key: test-files/1.txt
      task_id: create_object_1
      trigger_rule: all_success
      wait_for_downstream: false
      dependencies: []
    create_object_2:
      operator: airflow.providers.amazon.aws.operators.s3.S3CreateObjectOperator
      aws_conn_id: aws_default
      data: '{{ ds_nodash }}-{ lower }'
      encrypt: false
      outlets: []
      params: {}
      priority_weight: 1
      replace: true
      retries: 0
      retry_delay: 300.0
      retry_exponential_backoff: false
      s3_bucket: md-workflows-mwaa-bucket
      s3_key: test-files/2.txt
      task_id: create_object_2
      trigger_rule: all_success
      wait_for_downstream: false
      dependencies: [create_object_1]
    create_object_3:
      operator: airflow.providers.amazon.aws.operators.s3.S3CreateObjectOperator
      aws_conn_id: aws_default
      data: '{{ ds_nodash }}-{ lower }'
      encrypt: false
      outlets: []
      params: {}
      priority_weight: 1
      replace: true
      retries: 0
      retry_delay: 300.0
      retry_exponential_backoff: false
      s3_bucket: md-workflows-mwaa-bucket
      s3_key: test-files/3.txt
      task_id: create_object_3
      trigger_rule: all_success
      wait_for_downstream: false
      dependencies: [create_object_2]
  catchup: false
  description: Create multiple S3 objects in a loop
  max_active_runs: 16
  max_active_tasks: 16
  max_consecutive_failed_dag_runs: 0

Note that, because the YAML conversion happens after DAG parsing, the loop that creates the tasks runs first, and the resulting static list of tasks is written to the YAML document along with their dependencies.

Migrating an MWAA environment's DAGs to MWAA Serverless

You can take advantage of a provisioned MWAA environment to develop and test your workflows and then move them to serverless to run efficiently at scale. Further, if your MWAA environment uses only MWAA Serverless-compatible operators, you can convert the entire environment's DAGs at once. The first step is to allow MWAA Serverless to assume the MWAA execution role via a trust relationship. This is a one-time operation for each MWAA execution role, and can be performed manually in the IAM console or with AWS CLI commands as follows:

MWAA_ENVIRONMENT_NAME="MyAirflowEnvironment"
MWAA_REGION=us-east-2

MWAA_EXECUTION_ROLE_ARN=$(aws mwaa get-environment --region $MWAA_REGION --name $MWAA_ENVIRONMENT_NAME --query 'Environment.ExecutionRoleArn' --output text)
MWAA_EXECUTION_ROLE_NAME=$(echo $MWAA_EXECUTION_ROLE_ARN | xargs basename)
MWAA_EXECUTION_ROLE_POLICY=$(aws iam get-role --role-name $MWAA_EXECUTION_ROLE_NAME --query 'Role.AssumeRolePolicyDocument' --output json | jq '.Statement[0].Principal.Service += ["airflow-serverless.amazonaws.com"] | .Statement[0].Principal.Service |= unique | .Statement += [{"Sid": "AllowAirflowServerlessAssumeRole", "Effect": "Allow", "Principal": {"Service": "airflow-serverless.amazonaws.com"}, "Action": "sts:AssumeRole", "Condition": {"StringEquals": {"aws:SourceAccount": "${aws:PrincipalAccount}"}, "ArnLike": {"aws:SourceArn": "arn:aws:*:*:${aws:PrincipalAccount}:workflow/*"}}}]')

aws iam update-assume-role-policy --role-name $MWAA_EXECUTION_ROLE_NAME --policy-document "$MWAA_EXECUTION_ROLE_POLICY"
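
With the trust relationship in place, convert the environment's DAGs to YAML. The following is a sketch that assumes your DAGs live under the standard dags/ prefix of the environment's source bucket and writes the converted files to /tmp/yaml/, the folder used in the loop below:

# Download the environment's DAGs locally and convert each one to YAML.
DAGS_BUCKET=$(aws mwaa get-environment --name $MWAA_ENVIRONMENT_NAME --region $MWAA_REGION \
  --query 'Environment.SourceBucketArn' --output text | cut -d':' -f6)
aws s3 sync s3://$DAGS_BUCKET/dags /tmp/dags --region $MWAA_REGION
for dag in /tmp/dags/*.py; do
  dag-converter convert "$dag" --output /tmp/yaml/
done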

Now we can loop through each successfully converted DAG and create a serverless workflow for each:

S3_BUCKET=$(aws mwaa get-environment --name $MWAA_ENVIRONMENT_NAME --query 'Environment.SourceBucketArn' --output text --region us-east-2 | cut -d':' -f6)

for file in /tmp/yaml/*.yaml; do MWAA_WORKFLOW_NAME=$(basename "$file" .yaml); \
      aws s3 cp "$file" s3://$S3_BUCKET/yaml/$MWAA_WORKFLOW_NAME.yaml --region us-east-2; \
      aws mwaa-serverless create-workflow --name $MWAA_WORKFLOW_NAME \
      --definition-s3-location "{\"Bucket\": \"$S3_BUCKET\", \"ObjectKey\": \"yaml/$MWAA_WORKFLOW_NAME.yaml\"}" --role-arn $MWAA_EXECUTION_ROLE_ARN \
      --region us-east-2; \
done

To see a list of your created workflows, run:

aws mwaa-serverless list-workflows --region us-east-2

Monitoring and observability

MWAA Serverless workflow run status is returned by the GetWorkflowRun API, which returns details for that particular run. If there are errors in the workflow definition, they are returned under RunDetail in the ErrorMessage field, as in the following example:

{
  "WorkflowVersion": "7bcd36ce4d42f5cf23bfee67a0f816c6",
  "RunId": "d58cxqdClpTVjeN",
  "RunType": "SCHEDULE",
  "RunDetail": {
    "ModifiedAt": "2025-11-03T08:02:47.625851+00:00",
    "ErrorMessage": "anticipated token ',', bought 'create_test_table'",
    "TaskInstances": [],
    "RunState": "FAILED"
  }
}

Workflows that are properly defined, but whose tasks fail, return "ErrorMessage": "Workflow execution failed":

{
  "WorkflowVersion": "0ad517eb5e33deca45a2514c0569079d",
  "RunId": "ABC123456789def",
  "RunType": "SCHEDULE",
  "RunDetail": {
    "StartedOn": "2025-11-03T13:12:09.904466+00:00",
    "CompletedOn": "2025-11-03T13:13:57.620605+00:00",
    "ModifiedAt": "2025-11-03T13:16:08.888182+00:00",
    "Period": 107,
    "ErrorMessage": "Workflow execution failed",
    "TaskInstances": [
      "ex_5496697b-900d-4008-8d6f-5e43767d6e36_create_bucket_1"
    ],
    "RunState": "FAILED"
  }
}

MWAA Serverless task logs are stored in the CloudWatch log group /aws/mwaa-serverless/<workflow id>/ (where <workflow id> is the same string as the unique workflow ID in the workflow's ARN). To find specific task log streams, you need to list the task instances for the workflow run and then get each task's details. You can combine these operations into a single CLI command:

aws mwaa-serverless list-task-instances \
  --workflow-arn arn:aws:airflow-serverless:us-east-2:111122223333:workflow/simple_s3_test-abc1234def \
  --run-id ABC123456789def \
  --region us-east-2 \
  --query 'TaskInstances[].TaskInstanceId' \
  --output text | xargs -n 1 -I {} aws mwaa-serverless get-task-instance \
  --workflow-arn arn:aws:airflow-serverless:us-east-2:111122223333:workflow/simple_s3_test-abc1234def \
  --run-id ABC123456789def \
  --task-instance-id {} \
  --region us-east-2 \
  --query '{Status: Status, StartedAt: StartedAt, LogStream: LogStream}'

This results in output like the following:

{
    "Status": "SUCCESS",
    "StartedAt": "2025-10-28T21:21:31.753447+00:00",
    "LogStream": "//aws/mwaa-serverless/simple_s3_test_3-abc1234def//workflow_id=simple_s3_test-abc1234def/run_id=ABC123456789def/task_id=list_objects/attempt=1.log"
}
{
    "Status": "FAILED",
    "StartedAt": "2025-10-28T21:23:13.446256+00:00",
    "LogStream": "//aws/mwaa-serverless/simple_s3_test_3-abc1234def//workflow_id=simple_s3_test-abc1234def/run_id=ABC123456789def/task_id=create_object_list/attempt=1.log"
}

You can then use the CloudWatch LogStream output to debug your workflow.
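
From there, you can retrieve the log events directly with CloudWatch Logs. The following is a sketch; splitting the LogStream value above into the /aws/mwaa-serverless/<workflow id>/ log group and the workflow_id=.../attempt=1.log stream name is an assumption you should verify against your own output:

aws logs get-log-events \
  --log-group-name /aws/mwaa-serverless/simple_s3_test-abc1234def/ \
  --log-stream-name "workflow_id=simple_s3_test-abc1234def/run_id=ABC123456789def/task_id=create_object_list/attempt=1.log" \
  --region us-east-2 \
  --query 'events[].message' \
  --output text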

You can also view and manage your workflows in the Amazon MWAA Serverless console.

For an example that creates detailed metrics and a monitoring dashboard using AWS Lambda, Amazon CloudWatch, Amazon DynamoDB, and Amazon EventBridge, review the example in this GitHub repository.

Clean up resources

To avoid incurring ongoing charges, follow these steps to clean up all resources created during this tutorial:

  1. Delete MWAA Serverless workflows – Run this AWS CLI command to delete all workflows:
    aws mwaa-serverless list-workflows --query 'Workflows[*].WorkflowArn' --output text | tr '\t' '\n' | while read -r workflow; do aws mwaa-serverless delete-workflow --workflow-arn "$workflow"; done

  2. Remove the IAM role and policy created for this tutorial:
    aws iam delete-role-policy --role-name mwaa-serverless-access-role --policy-name mwaa-serverless-policy
    aws iam delete-role --role-name mwaa-serverless-access-role

  3. Remove the YAML workflow definitions from your S3 bucket:
    aws s3 rm s3://amzn-s3-demo-bucket/yaml/ --recursive

After completing these steps, verify in the AWS Management Console that all resources have been removed. Remember that CloudWatch logs are retained by default and may need to be deleted separately if you want to remove all traces of your workflow executions.
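
If you want to remove those log groups as well, a sketch like the following finds and deletes them; review the list before deleting, since the prefix matches every MWAA Serverless workflow's logs:

aws logs describe-log-groups \
  --log-group-name-prefix /aws/mwaa-serverless/ \
  --query 'logGroups[].logGroupName' \
  --output text | tr '\t' '\n' | \
  xargs -n 1 aws logs delete-log-group --log-group-name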

If you encounter errors during cleanup, verify that you have the required permissions and that the resources exist before attempting to delete them. Some resources may have dependencies that require deletion in a specific order.

Conclusion

In this post, we explored Amazon MWAA Serverless, a new deployment option that simplifies Apache Airflow workflow management. We demonstrated how to create workflows using YAML definitions, convert existing Python DAGs to the serverless format, and monitor your workflows.

MWAA Serverless offers several key advantages:

  • No provisioning overhead
  • Pay-per-use pricing model
  • Automatic scaling based on workflow demands
  • Enhanced security through granular IAM permissions
  • Simplified workflow definitions using YAML

To learn more about MWAA Serverless, review the documentation.


About the author

John Jackson

John has over 25 years of software experience as a developer, systems architect, and product manager in both startups and large corporations, and is the AWS Principal Product Manager responsible for Amazon MWAA.
