New Amazon DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse

30 December 2024


Amazon DynamoDB, a serverless NoSQL database, has been a go-to solution for over one million customers to build low-latency and high-scale applications. As data grows, organizations are constantly seeking ways to extract valuable insights from operational data, which is often stored in DynamoDB. However, to make the most of this data in Amazon DynamoDB for analytics and machine learning (ML) use cases, customers often build custom data pipelines, a time-consuming infrastructure task that adds little unique value to their core business.

Starting today, you can use Amazon DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse to run analytics and ML workloads in just a few clicks without consuming your DynamoDB table capacity. Amazon SageMaker Lakehouse unifies all your data across Amazon S3 data lakes and Amazon Redshift data warehouses, helping you build powerful analytics and AI/ML applications on a single copy of data.

Zero-ETL is a set of integrations that eliminates or minimizes the need to build ETL data pipelines. This zero-ETL integration reduces the engineering effort required to build and maintain data pipelines, so users can run analytics and ML workloads on operational data in Amazon DynamoDB without impacting production workflows.

Let’s get started
For the following demo, I need to set up zero-ETL integration for my data in Amazon DynamoDB with an Amazon Simple Storage Service (Amazon S3) data lake managed by Amazon SageMaker Lakehouse. Before setting up the zero-ETL integration, there are prerequisites to complete. If you want to learn more about how to set it up, refer to this Amazon DynamoDB documentation page.

With all the prerequisites completed, I can get started with this integration. I navigate to the AWS Glue console and select Zero-ETL integrations under Data Integration and ETL. Then, I choose Create zero-ETL integration.

Here, I have options to select my data source. I choose Amazon DynamoDB and choose Next.

Next, I need to configure the source and target details. In the Source details section, I select my Amazon DynamoDB table. In the Target details section, I specify the S3 bucket that I’ve set up in the AWS Glue Data Catalog.
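The same source and target choice can be written down as ARNs, which is how a scripted setup would refer to them. The following is a minimal sketch, not part of the console walkthrough: the helper names, account ID, table name, and database name are all hypothetical placeholders, and I’m assuming the target is identified by the ARN of a Glue Data Catalog database.

```python
def dynamodb_table_arn(region: str, account_id: str, table_name: str) -> str:
    """Build the ARN of a DynamoDB table (the integration source)."""
    return f"arn:aws:dynamodb:{region}:{account_id}:table/{table_name}"


def glue_database_arn(region: str, account_id: str, database_name: str) -> str:
    """Build the ARN of a Glue Data Catalog database (assumed integration target)."""
    return f"arn:aws:glue:{region}:{account_id}:database/{database_name}"


def build_integration_request(name: str, source_arn: str, target_arn: str) -> dict:
    """Assemble keyword arguments shaped like Glue's CreateIntegration request.

    The field names below are an assumption based on the Glue API; check them
    against the current API reference before relying on them.
    """
    return {
        "IntegrationName": name,
        "SourceArn": source_arn,
        "TargetArn": target_arn,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    request = build_integration_request(
        "orders-to-lakehouse",
        dynamodb_table_arn("us-east-1", "123456789012", "Orders"),
        glue_database_arn("us-east-1", "123456789012", "orders_lake"),
    )
    print(request)
```

If the installed boto3 version exposes the Glue `create_integration` operation, a dictionary like this could presumably be passed as `boto3.client("glue").create_integration(**request)`. I have not shown that call here because it needs credentials and the completed prerequisites.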

To set up this integration, I need an IAM role that grants AWS Glue the necessary permissions. For guidance on configuring IAM permissions, visit the Amazon DynamoDB documentation page. Also, if I haven’t configured a resource policy for my AWS Glue Data Catalog, I can select Fix it for me to automatically add the required resource policies.
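For a role that AWS Glue can use, the role’s trust policy has to let the Glue service assume it. Here is a minimal sketch of that trust policy as a Python dictionary. The function name is hypothetical, and this covers only the trust relationship; the permissions the role actually needs for the integration are listed in the documentation and are not reproduced here.

```python
import json


def glue_trust_policy() -> dict:
    """Return an IAM trust policy that lets the AWS Glue service assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "glue.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document in the JSON form that IAM expects.
    print(json.dumps(glue_trust_policy(), indent=2))
```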

Here, I have options to configure the output. Under Data partitioning, I can either use DynamoDB table keys for partitioning or specify custom partition keys. After completing the configuration, I choose Next.

Because I selected the Fix it for me checkbox, I need to review the required changes and choose Continue before I can proceed to the next step.

On the next page, I have the flexibility to configure data encryption. I can use AWS Key Management Service (AWS KMS) or a custom encryption key. Then, I assign a name to the integration and choose Next.

On the last step, I need to review the configurations. When I’m happy, I choose Next to create the zero-ETL integration.

After the initial data ingestion completes, my zero-ETL integration will be ready for use. The completion time varies depending on the size of my source DynamoDB table.
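Because the wait depends on table size, a script would typically poll rather than sleep for a fixed time. The sketch below shows one way to do that, under several assumptions: `wait_until_active` is a hypothetical helper, the client is expected to offer a `describe_integrations` call that returns an `Integrations` list with a `Status` field, and `ACTIVE` and `FAILED` are assumed status values. Verify all of these against the current Glue API reference, including whether an active status implies that the initial ingestion has finished.

```python
import time


def wait_until_active(glue_client, integration_arn, poll_seconds=30,
                      max_polls=120, sleep=time.sleep):
    """Poll an integration's status until it is ACTIVE, FAILED, or we give up.

    glue_client is any object with a describe_integrations method (assumed to
    behave like the boto3 Glue client). sleep is injectable so the loop can be
    exercised without real waiting.
    """
    for _ in range(max_polls):
        response = glue_client.describe_integrations(
            IntegrationIdentifier=integration_arn
        )
        integrations = response.get("Integrations", [])
        status = integrations[0]["Status"] if integrations else "UNKNOWN"
        if status == "ACTIVE":
            return status
        if status == "FAILED":
            raise RuntimeError(f"Integration {integration_arn} failed")
        sleep(poll_seconds)
    raise TimeoutError(f"Integration {integration_arn} not active after {max_polls} polls")
```

Passing the client and the sleep function in as arguments keeps the helper easy to test with a stub, and avoids hard-coding how credentials or Regions are configured.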

If I navigate to Tables under Data Catalog in the left navigation panel, I can observe more details including Schema. Under the hood, this zero-ETL integration uses Apache Iceberg to transform the format and structure of my DynamoDB data as it is written to Amazon S3.
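To give a feel for the kind of format change involved, here is a small sketch that converts an item from DynamoDB’s typed JSON representation (where each value is wrapped in a type tag such as `S` or `N`) into plain Python values, which is closer to what a table schema describes. This is only an illustration of the format gap, not how the integration itself is implemented. The `unmarshal` helper and the sample item are hypothetical, and binary types are left out for brevity.

```python
from decimal import Decimal


def unmarshal(attribute: dict):
    """Convert one DynamoDB-typed value, e.g. {"N": "19.99"}, to a plain value."""
    (type_tag, value), = attribute.items()  # each attribute has exactly one tag
    if type_tag == "S":
        return value
    if type_tag == "N":
        return Decimal(value)  # DynamoDB sends numbers as strings
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [unmarshal(v) for v in value]
    if type_tag == "M":
        return {k: unmarshal(v) for k, v in value.items()}
    if type_tag == "SS":
        return set(value)
    if type_tag == "NS":
        return {Decimal(v) for v in value}
    raise ValueError(f"Unsupported DynamoDB type tag: {type_tag}")


def unmarshal_item(item: dict) -> dict:
    """Convert a whole DynamoDB item to a flat dictionary of plain values."""
    return {name: unmarshal(attr) for name, attr in item.items()}


if __name__ == "__main__":
    sample = {
        "order_id": {"S": "order#1"},
        "total": {"N": "19.99"},
        "shipped": {"BOOL": False},
        "tags": {"L": [{"S": "gift"}, {"S": "priority"}]},
    }
    print(unmarshal_item(sample))
```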

Finally, I can see that all my data is available in my S3 bucket.

This zero-ETL integration significantly reduces the complexity and operational burden of data movement, so I can focus on extracting insights rather than managing pipelines.

Available now
This new zero-ETL capability is available in the following AWS Regions: US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Hong Kong, Singapore, Sydney, Tokyo), Europe (Frankfurt, Ireland, Stockholm).

Explore how to streamline your data analytics workflows using Amazon DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse. Learn more about how to get started on the Amazon DynamoDB documentation page.

Happy building!
— Donnie
