
Lower your storage costs with Amazon OpenSearch Service index rollups


Amazon OpenSearch Service is a fully managed service that supports search, log analytics, and generative AI Retrieval Augmented Generation (RAG) workloads in the AWS Cloud. It simplifies the deployment, security, and scaling of OpenSearch clusters. As organizations scale their log analytics workloads by continuously collecting and analyzing vast amounts of data, they often struggle to maintain quick access to historical information while managing costs effectively. OpenSearch Service addresses these challenges through its tiered storage options: hot, UltraWarm, and cold storage. These storage tiers are great options to help optimize costs and provide a balance between performance and affordability, so organizations can manage their data more efficiently. Organizations can choose between these different storage tiers by keeping data in expensive hot storage for quick access or moving it to cheaper cold storage with limited accessibility. This trade-off becomes particularly challenging when organizations need to analyze both recent and historical data for compliance, trend analysis, or business intelligence.

In this post, we explore how to use index rollups in Amazon OpenSearch Service to address this challenge. This feature helps organizations efficiently manage their historical data by automatically summarizing and compressing older data while maintaining its analytical value, significantly reducing storage costs in any storage tier without sacrificing the ability to query historical information effectively.

Index rollups overview

Index rollups provide a mechanism to aggregate historical data into summarized indexes at specified time intervals. This feature is particularly useful for time series data, where the granularity of older data can be reduced while maintaining meaningful analytics capabilities.

Key benefits include:

  • Reduced storage costs (varies by granularity level), for example:
    • Larger savings when aggregating from seconds to hours
    • Moderate savings when aggregating from seconds to minutes
  • Improved query performance for historical data
  • Maintained data accessibility for long-term analytics
  • Automated data summarization process

Index rollups are part of a comprehensive data management strategy. The real cost savings come from properly managing your data lifecycle together with rollups. To achieve meaningful cost reductions, you must remove the original data or move it to a lower-cost storage tier after creating the rollup.

For customers already using Index State Management (ISM) to move older data to UltraWarm or cold tiers, rollups can provide significant additional benefits. By aggregating data at higher time intervals before moving it to lower-cost tiers, you can dramatically reduce the volume of data in those tiers, leading to further cost savings. This strategy is particularly effective for workloads with large amounts of time series data, typically measured in terabytes or petabytes. The larger your data volume, the more impactful your savings will be when implementing rollups correctly.
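
As an illustrative fragment of this combined strategy (not a complete policy; the state name and the 90-day threshold are assumptions for this sketch, and UltraWarm must be enabled on the domain), an ISM state can migrate an index to UltraWarm and later transition it toward deletion:

{
  "name": "warm",
  "actions": [
    {
      "warm_migration": {}
    }
  ],
  "transitions": [
    {
      "state_name": "delete",
      "conditions": {
        "min_index_age": "90d"
      }
    }
  ]
}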

Index rollups can be implemented using ISM policies through the OpenSearch Dashboards UI or the OpenSearch API. Index rollups require OpenSearch or Elasticsearch 7.9 or later.

The decision to use different storage tiers requires careful consideration of an organization's specific needs, balancing the desire for cost savings with the requirement for data accessibility and performance. As data volumes continue to grow and analytics become increasingly important, finding the right storage strategy becomes crucial for businesses to remain competitive and compliant while managing their budgets effectively.

In this post, we consider a scenario with a large volume of time series data that can be aggregated using the Rollup API. With rollups, you have the flexibility to either store aggregated data in the hot tier for quick access or aggregate it and move it to cheaper tiers such as UltraWarm or cold storage. This approach allows for efficient data and index lifecycle management while optimizing both performance and cost.

Index rollups are often confused with index rollovers, which are automated OpenSearch Service operations that create new indexes when specified thresholds are met, for example by age, size, or document count. That feature maintains raw data while optimizing cluster performance through controlled index growth; for example, rolling over when an index reaches 50 GB or is 30 days old.
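
For comparison, the following ISM action fragment expresses that rollover condition. This is a sketch only; it assumes the index is written to through an alias configured as the rollover target:

{
  "rollover": {
    "min_size": "50gb",
    "min_index_age": "30d"
  }
}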

Use cases for index rollups

Index rollups are ideal for scenarios where you need to balance storage costs with data granularity, such as:

  • Time series data that requires different granularity levels over time – For example, Internet of Things (IoT) sensor data where real-time precision matters only for the most recent data.
    • Traditional approach – It is common for users to keep all data in expensive hot storage for instant accessibility. However, this is not optimal for cost.
    • Recommended – Retain recent (per-second) data in hot storage for immediate access. For older periods, store aggregated (hourly or daily) data using index rollups. Move or delete the higher-granularity old data from the hot tier. This balances accessibility and cost-effectiveness.
  • Historical data with cost-optimization needs – For example, system performance metrics where overall trends are more valuable than precise values over time.
    • Traditional approach – It is common for users to store all performance metrics at full granularity indefinitely, consuming excessive storage space. We don't recommend storing data indefinitely. Implement a data retention policy based on your specific business needs and compliance requirements.
    • Recommended – Maintain detailed metrics for recent monitoring (the last 30 days) and aggregate older data into hourly or daily summaries. This preserves trend analysis capability while significantly reducing storage costs.
  • Log data with infrequent historical access and low value – For example, application error logs where detailed investigation is primarily needed for recent incidents.
    • Traditional approach – It is common for users to keep all log entries at full detail, regardless of age or access frequency.
    • Recommended – Preserve detailed logs for an active troubleshooting period (for example, 1 week) and keep summarized error patterns and statistics for older periods. This enables historical pattern analysis while reducing storage overhead.

Schema design

A well-planned schema is crucial for a successful rollup implementation. Proper schema design makes sure your rolled-up data remains valuable for analysis while maximizing storage savings. Consider the following key elements:

  • Identify fields required for long-term analysis – Carefully select fields that provide meaningful insights over time, avoiding unnecessary data retention.
  • Define aggregation types for each field, such as min, max, sum, and average – Choose appropriate aggregation methods that preserve the analytical value of your data.
  • Determine which fields can be excluded from rollups – Reduce storage costs by omitting fields that don't contribute to long-term analysis.
  • Consider mapping compatibility between source and target indexes – Ensure a successful data transition without mapping conflicts (see the mapping sketch after this list). This involves:
    • Matching data types (for example, date fields remain as date in rollups)
    • Handling nested fields appropriately
    • Ensuring all required fields are included in the rollup
    • Considering the impact of analyzed vs. non-analyzed fields
    • Incompatible mappings can lead to failed rollup jobs or incorrect data aggregation.
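
The following is a minimal mapping sketch for the sensor scenario used later in this post. The index name is hypothetical, and the field names and types are assumptions chosen to match the example record; the point is that timestamp must be a date, device_id a keyword, and the metric fields numeric so the rollup job can aggregate them:

PUT sensor-2024-01-01
{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "device_id": { "type": "keyword" },
      "temperature": { "type": "float" },
      "humidity": { "type": "float" },
      "pressure": { "type": "float" },
      "battery": { "type": "integer" }
    }
  }
}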

Functional and non-functional requirements

Before implementing index rollups, consider the following:

  • Data access patterns – When implementing data rollup strategies, it's crucial to first analyze data access patterns, including query frequency and usage periods, to determine optimal rollup intervals. This analysis should lead to specific granularity choices, such as deciding between hourly or daily aggregations, while establishing clear thresholds based on both data volume and query requirements. These decisions should be documented alongside specific aggregation rules for each data type.
  • Data growth rate – Storage optimization begins with calculating your current dataset size and its growth rate. This information helps quantify potential space reductions across different rollup strategies. Performance metrics, particularly expected query response times, should be defined upfront. Additionally, establish monitoring KPIs focusing on latency, throughput, and resource utilization to make sure the system meets performance expectations.
  • Compliance or data retention requirements – Retention planning requires careful consideration of regulatory requirements and business needs. Develop a clear retention policy that specifies how long to keep different types of data at various granularity levels. Implement systematic processes for archiving or deleting older data and maintain detailed documentation of storage costs across different retention periods.
  • Resource utilization and planning – For a successful implementation, proper cluster capacity planning is essential. This involves accurately sizing compute resources, including CPU, RAM, and storage requirements. Define specific time windows for executing rollup jobs to minimize the impact on regular operations. Set clear resource utilization thresholds and implement proactive capacity monitoring. Finally, develop a scalability plan that accounts for both horizontal and vertical growth to accommodate future needs.

Operational requirements

Proper operational planning facilitates smooth ongoing management of your rollup implementation. This is essential for maintaining data reliability and system health:

  • Monitoring – It is important to monitor rollup jobs for accuracy and desired outcomes. This means implementing automated checks that validate data completeness, aggregation accuracy, and job execution status. Set up alerts for failed jobs, data inconsistencies, or when aggregation results fall outside expected ranges.
  • Scheduling hours – Schedule rollup operations during periods of low system utilization, typically during off-peak hours. Document these maintenance windows clearly and communicate them to all stakeholders. Include buffer time for potential issues and establish clear procedures for what happens if a maintenance window needs to be extended.
  • Backup and recovery – OpenSearch Service takes automated snapshots of your data at 1-hour intervals, but you can also define and implement comprehensive backup procedures using snapshot management functionality to support your Recovery Time Objective (RTO) and Recovery Point Objective (RPO); a manual snapshot sketch follows this list.
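
The following is a sketch of a manual snapshot of a rolled-up index. The repository name my-snapshot-repo is an assumption, and registering a snapshot repository on OpenSearch Service requires additional IAM configuration that is not shown here:

PUT _snapshot/my-snapshot-repo/sensor-rollup-backup-001
{
  "indices": "sensor_rolled_hour",
  "include_global_state": false
}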

Your RPO can be customized through different rollup schedules based on index patterns. This flexibility helps you define different data loss tolerance levels according to your data's criticality. For mission-critical indexes, you can configure more frequent rollups, while maintaining less frequent schedules for analytical data.

You can tailor RTO management in OpenSearch per index pattern through backup and replication options. For critical rollup indexes, implementing cross-cluster replication maintains up-to-date copies, significantly reducing recovery time. Other indexes might use standard backup procedures, balancing recovery speed with operational costs. This flexible approach helps you optimize both storage costs and recovery objectives based on your specific business requirements for different types of data within your OpenSearch deployment.

Before implementing rollups, audit all applications and dashboards that use the data being aggregated. Update queries and visualizations to accommodate the new data structure. Test these changes thoroughly in a staging environment to confirm they continue to provide accurate results with the rolled-up data. Create a rollback plan in case of unexpected issues with dependent applications.

In the following sections, we walk through the steps to create, run, and monitor a rollup job.

Create a rollup job

As discussed in the earlier sections, there are several considerations when choosing good candidates for index rollups. Building on this, identify the indexes whose data you want to roll up and create the jobs. The following code is an example of creating a basic rollup job:

PUT /_plugins/_rollup/jobs/sensor_hourly_rollup
{
  "rollup": {
    "rollup_id": "sensor_1_hour_rollup",
    "enabled": true,
    "schedule": {
      "interval": {
        "start_time": 1746632400,        
        "interval": 1,
        "unit": "hours",
        "schedule_delay": 0
      }
    },
    "description": "Rolls up sensor knowledge 1 hourly per device_id",
    "source_index": "sensor-*",           
    "target_index": "sensor_rolled_hour",
    "page_size": 1000,
    "delay": 0,
    "steady": true,
    "dimensions": [
      {
        "date_histogram": {
          "fixed_interval": "1h",
          "source_field": "timestamp",
          "target_field": "timestamp",
          "timezone": "UTC"
        }
      },
      {
        "terms": {
          "source_field": "device_id",
          "target_field": "device_id"
        }
      }
    ],
    "metrics": [
      {
        "source_field": "temperature",
        "metrics": [
          { "avg": {} },
          { "min": {} },
          { "max": {} }
        ]
      },
      {
        "source_field": "humidity",
        "metrics": [
          { "avg": {} },
          { "min": {} },
          { "max": {} }
        ]
      },
      {
        "source_field": "stress",
        "metrics": [
          { "avg": {} },
          { "min": {} },
          { "max": {} }
        ]
      },
      {
        "source_field": "battery",
        "metrics": [
          { "avg": {} },
          { "min": {} },
          { "max": {} }
        ]
      }
    ]
  }
}

This rollup job processes IoT sensor data, aggregating readings from the sensor-* index pattern into hourly summaries stored in sensor_rolled_hour. It maintains device-level granularity while calculating average, minimum, and maximum values for temperature, humidity, pressure, and battery levels. The job runs every hour, processing 1,000 documents per batch.

The preceding code assumes that the device_id field is of type keyword; note that this aggregation can't be performed on a text field.
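
If you are unsure of a field's type, you can verify it before creating the job. The following call, using the index pattern from this example, returns the mapping for device_id only:

GET sensor-*/_mapping/field/device_id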

Start the rollup job

After you create the job, it will automatically be scheduled based on the job's configuration (refer to the schedule section of the job example code in the previous section). However, you can also trigger the job manually using the following API call:

POST _plugins/_rollup/jobs/sensor_hourly_rollup/_start

If the call succeeds, the response is typically a simple acknowledgment similar to the following:
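
{
  "acknowledged": true
}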

Monitor progress

Using Dev Tools, run the following command to monitor the progress:

GET _plugins/_rollup/jobs/sensor_hourly_rollup/_explain

The following is an example of the results:

{
  "sensor_hourly_rollup": {
    "metadata_id": "pCDjMZcBgTxYF90dWEfP",
    "rollup_metadata": {
      "rollup_id": "sensor_hourly_rollup",
      "last_updated_time": 1749043472416,
      "steady": {
        "next_window_start_time": 1749043440000,
        "next_window_end_time": 1749043560000
      },
      "standing": "began",
      "failure_reason": null,
      "stats": {
        "pages_processed": 374603,
        "documents_processed": 390,
        "rollups_indexed": 200,
        "index_time_in_millis": 789,
        "search_time_in_millis": 402202
      }
    }
  }
}  

The GET _plugins/_rollup/jobs/sensor_hourly_rollup/_explain command shows the current status and statistics of the sensor_hourly_rollup job. The response includes important statistics such as the number of processed documents, indexed rollups, time spent on indexing and searching, and records of any failures. The status indicates whether the job is active (started) or stopped (stopped) and shows the last processed timestamp. This information is crucial for monitoring the efficiency and health of the rollup process, helping administrators track progress, identify potential issues or bottlenecks, and make sure the job is running as expected. Regular checks of these statistics can help in optimizing the rollup job's performance and maintaining data integrity.
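
If you need to pause the job, for example during a maintenance window, a matching stop endpoint is available; calling _start again later resumes the schedule:

POST _plugins/_rollup/jobs/sensor_hourly_rollup/_stop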

Real-world example

Let's consider a scenario where a company collects IoT sensor data, ingesting 240 GB of data per day into an OpenSearch cluster, which totals 7.2 TB per month.

The following is an example record:

"_source": {
          "timestamp": "2024-01-01T10:00:00Z",
          "device_id": "sensor_001",
          "temperature": 26.1,
          "humidity": 43,
          "stress": 1009.3,
          "battery": 90
}

Assume you have a time series index with the following configuration:

  • Ingest rate: 10 million documents per hour
  • Retention period: 30 days
  • Each document size: Approximately 1 KB

The total storage without rollups is as follows:

  • Per-day storage size: 10,000,000 docs per hour × ~1 KB × 24 hours per day = ~240 GB
  • Per-month storage size: 240 GB × 30 days = ~7.2 TB

The decision to implement rollups should be based on a cost-benefit analysis. Consider the following:

  • Current storage costs vs. potential savings
  • Compute costs for running rollup jobs
  • Value of granular data over time
  • Frequency of historical data access

For smaller datasets (for example, less than 50 GB/day), the benefits might be less significant. As data volumes grow, the cost savings become more compelling.

Rollup configuration

Let's roll up the data with the following configuration:

  • From 1-minute granularity to 1-hour granularity
  • Aggregating average, min, and max, grouped by device_id
  • Reducing 60 per-minute documents to 1 rollup document per hour

The new document count per hour is as follows:

  • Per-hour documents: 10,000,000/60 ≈ 166,667 docs per hour
  • Assuming each rollup document is 2 KB (extra metadata), total rollup storage: 166,667 docs per hour × 24 hours per day × 30 days × 2 KB ≈ 240 GB/month

Verify that all required data exists in the new rolled-up index, then delete the original index to remove the raw data, either manually or by using ISM policies (as discussed in the next section); a verification sketch follows.
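
The following is a sketch of that verification. The daily index name sensor-2024-01-01 is hypothetical, and the delete is irreversible, so compare the counts against your expectations before running it:

GET sensor_rolled_hour/_count

DELETE sensor-2024-01-01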

Execute the rollup job following the preceding instructions to aggregate data into the new rolled-up index. To view your aggregated results, run the following code:

GET sensor_rolled_hour/_search
{
  "size": 0,
  "aggs": {
    "per_device": {
      "terms": {
        "field": "device_id",
        "size": 200,
        "shard_size": 200
      },
      "aggs": {
        "temperature_avg": {
          "avg": {
            "field": "temperature"
          }
        },
        "temperature_min": {
          "min": {
            "field": "temperature"
          }
        },
        "temperature_max": {
          "max": {
            "field": "temperature"
          }
        }
      }
    }
  }
}

The following code shows the example results:

"aggregations": {
    "per_device": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "sensor_001",
          "doc_count": 98,
          "temperature_min": {
            "value": 24.100000381469727
          },
          "temperature_avg": {
            "value": 26.287754603794642
          },
          "temperature_max": {
            "value": 27.5
          }
        },
        {
          "key": "sensor_002",
          "doc_count": 98,
          "temperature_min": {
            "value": 20.600000381469727
          },
          "temperature_avg": {
            "value": 22.192856146364797
          },
          "temperature_max": {
            "value": 22.799999237060547
          }
        },...]

These documents represent the rolled-up data for sensor_001 and sensor_002 during a 1-hour period. Each aggregates 1 hour of sensor readings into a single record per device, storing minimum, average, and maximum values for temperature. The record includes metadata about the rollup process and timestamps for data tracking. This aggregated format significantly reduces storage requirements while maintaining essential statistical information about each sensor's behavior during that hour.

We can calculate the storage savings as follows:

  • Original storage: 7.2 TB (or 7,200 GB)
  • Post-rollup storage: 240 GB
  • Storage savings: ((7,200 GB – 240 GB)/7,200 GB) × 100 ≈ 96.67% savings

Using OpenSearch rollups as demonstrated in this example, you can achieve roughly 96% storage savings while preserving important aggregate insights.

The aggregation levels and document sizes can be customized according to your specific use case requirements.

Automate rollups with ISM

To fully realize the benefits of index rollups, automate the process using ISM policies. The following code is an example that implements a rollup strategy based on the given scenario:

PUT _plugins/_ism/policies/sensor_rollup_policy
{
  "coverage": {
    "description": "Roll up sensor knowledge and delete unique",
    "default_state": "scorching",
    "ism_template": {
      "index_patterns": ["sensor-*"],
      "precedence": 100
    },
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          {
            "state_name": "rollup",
            "conditions": {
              "min_index_age": "1d"
            }
          }
        ]
      },
      {
        "title": "rollup",
        "actions": [
          {
            "rollup": {
              "ism_rollup": {
                "target_index": "sensor_rolled_minutely",
                "description": "Rollup sensor data to minutely aggregations",
                "page_size": 1000,
                "dimensions": [
                  {
                    "date_histogram": {
                      "fixed_interval": "1m",
                      "source_field": "timestamp",
                      "target_field": "timestamp"
                    }
                  },
                  {
                    "terms": {
                      "source_field": "device_id",
                      "target_field": "device_id"
                    }
                  }
                ],
                "metrics": [
                  {
                    "source_field": "temperature",
                    "metrics": [{ "avg": {} }, { "min": {} }, { "max": {} }]
                  },
                  {
                    "source_field": "humidity",
                    "metrics": [{ "avg": {} }, { "min": {} }, { "max": {} }]
                  }
                ]
              }
            }
          }
        ],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": {
              "min_index_age": "2d"
            }
          }
        ]
      },
      {
        "title": "delete",
        "actions": [
          {
            "delete": {}
          }
        ]
      }
    ]
  }
}

This ISM policy automates the rollup process and data lifecycle:

    1. Applies to all indexes matching the sensor-* pattern (the ism_template attaches the policy to newly created indexes; see the sketch after this list for existing indexes).
    2. Keeps original data in the hot state for 1 day.
    3. After 1 day, rolls up the data into minutely aggregations. Aggregates by device_id and calculates average, minimum, and maximum for temperature and humidity.
    4. Stores rolled-up data in the sensor_rolled_minutely index.
    5. Deletes the original index when it is 2 days old, roughly 1 day after the rollup.
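
For indexes that already exist when you create the policy, you can attach it explicitly with the ISM add API; the following is a sketch using the index pattern from this scenario:

POST _plugins/_ism/add/sensor-*
{
  "policy_id": "sensor_rollup_policy"
}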

This strategy provides the following benefits:

  • Recent data is accessible at full granularity
  • Historical data is efficiently summarized
  • Storage is optimized by removing original data after rollup

You can review the policy definition using the following command:

GET _plugins/_ism/policies/sensor_rollup_policy
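
To monitor how the policy is actually progressing on the managed indexes (current state, running action, and any failures), the ISM explain API is more informative; the following is a sketch using the index pattern from this scenario:

GET _plugins/_ism/explain/sensor-*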

Remember to adjust the timeframes, metrics, and aggregation intervals based on your specific requirements and data patterns.

Conclusion

Index rollups in OpenSearch Service provide a powerful way to manage storage costs while maintaining valuable access to historical data. By implementing a well-planned rollup strategy, organizations can achieve significant cost savings while making sure their data remains available for analysis.

To get started, take the following next steps:

  • Review your current index patterns and data retention requirements
  • Analyze your historical data volumes and access patterns
  • Start with a proof-of-concept rollup implementation in a test environment
  • Monitor performance and storage metrics to optimize your rollup strategy
  • Move infrequently accessed data between storage tiers:
    • Delete data you will not use
    • Automate the process using ISM policies



About the authors

Luis Tiani

Luis is a Sr. Solutions Architect at AWS. He focuses on data and analytics topics, with an extensive focus on Amazon OpenSearch Service for search, log analytics, and vector environments. Tiani has helped numerous customers across financial services, DNB, SMB, and enterprise segments in their OpenSearch adoption journey, reviewing use cases and providing architecture design and cluster sizing guidance. As a Solutions Architect, he has worked with FSI customers on developing and implementing big data and data lake solutions, app modernization, cloud migrations, and AI/ML initiatives.

Muhammad Ali

Muhammad is a Principal Analytics (APJ Tech Lead) at AWS with over 20 years of experience in the industry. He focuses on information retrieval, data analytics, and artificial intelligence, advocating an AI-first approach while helping organizations build data-driven mindsets through technology modernization and process transformation.

Srikanth Daggumalli

Srikanth is a Senior Analytics & AI Specialist Solutions Architect at AWS. He has over a decade of experience in architecting cost-effective, performant, and secure enterprise applications that improve customer reachability and experience, using big data, AI/ML, cloud, and security technologies. He has built high-performing data platforms for major financial institutions, enabling improved customer reach and exceptional experiences. He has also built many real-time streaming log analytics, SIEM, observability, and monitoring solutions for AWS customers, including major financial institutions, enterprises, ISVs, DNBs, and more.
