How to Do Load Testing with Rockset

What is load testing and why does it matter?

Load testing is a crucial process for any database or data service, including Rockset. By doing load testing, we aim to assess the system’s behavior under both normal and peak conditions. This process helps in evaluating important metrics like Queries Per Second (QPS), concurrency, and query latency. Understanding these metrics is essential for sizing your compute resources correctly and ensuring that they can handle the expected load. This, in turn, helps in meeting Service Level Agreements (SLAs) and ensures a smooth, uninterrupted user experience. This is especially important for customer-facing use cases, where end users expect a snappy user experience. Load testing is sometimes also called performance or stress testing.

“53% of visits are likely to be abandoned if pages take longer than 3 seconds to load” (Google)

Rockset compute resources (called virtual instances or VIs) come in different sizes, ranging from Small to 16XL, and each size has a predefined number of vCPUs and amount of memory available. Choosing an appropriate size depends on your query complexity, the dataset size and selectivity of your queries, the number of queries that are expected to run concurrently, and your target query latency. Additionally, if your VI is also used for ingestion, you should factor in the resources needed to handle ingestion and indexing in parallel to query execution. Luckily, we offer two features that can help with this:

Auto-scaling: with this feature, Rockset will automatically scale the VI up and down depending on the current load. This is important if you have some variability in your load and/or use your VI to do both ingestion and querying.
Compute-compute separation: this is useful because you can create VIs that are dedicated solely to running queries, which ensures that all of the available resources are geared towards executing those queries efficiently. This means you can isolate queries from ingest, or isolate different apps on different VIs, to ensure scalability and performance.

We recommend doing load testing on at least two virtual instances: the main VI, where ingestion is running, and a separate query VI. This helps with deciding on a single or multi-VI architecture.

Load testing helps us identify the limits of the chosen VI for our particular use case and helps us pick an appropriate VI size to handle our desired load.

Tools for load testing

When it comes to load testing tools, a few popular options are JMeter, k6, Gatling and Locust. Each of these tools has its strengths and weaknesses:

JMeter: A versatile and user-friendly tool with a GUI, ideal for various types of load testing, but can be resource-intensive.
k6: Optimized for high performance and cloud environments, using JavaScript for scripting, suitable for developers and CI/CD workflows.
Gatling: High-performance tool using Scala, best for complex, advanced scripting scenarios.
Locust: Python-based, offering simplicity and rapid script development, great for straightforward testing needs.

Each tool offers a unique set of features, and the choice depends on the specific requirements of the load test being performed. Whichever tool you use, be sure to read through the documentation and understand how it works and how it measures latencies and response times. Another good tip is not to mix and match tools in your testing: if you are load testing a use case with JMeter, stick with it to get reproducible and trustworthy results that you can share with your team or stakeholders.

Rockset has a REST API that can be used to execute queries, and all tools listed above can be used to load test REST API endpoints. For this blog, I’ll focus on load testing Rockset with Locust, but I’ll provide some useful resources for JMeter, k6 and Gatling as well.

Setting up Rockset and Locust for load testing

Let’s say we have a sample SQL query that we want to test and our data is ingested into Rockset. The first thing we usually do is convert that query into a Query Lambda, which makes it very easy to test that SQL query as a REST endpoint. It can be parametrized, and the SQL can be versioned and kept in one place, instead of going back and forth and changing your load testing scripts every time you need to change something in the query.

Step 1 – Identify the query you want to load test

In our scenario, we want to find the most popular product on our webshop for a particular day. This is what our SQL query looks like (note that :date is a parameter which we can supply when executing the query):

-- top product for a particular day
SELECT
    s.Date,
    MAX_BY(p.ProductName, s.Count) AS ProductName,
    MAX(s.Count) AS NumberOfClicks
FROM
    "Demo-Ecommerce".ProductStatsAlias s
    INNER JOIN "Demo-Ecommerce".ProductsAlias p ON s.ProductID = CAST(p._id AS INT)
WHERE
    s.Date = :date
GROUP BY
    1
ORDER BY
    1 DESC;

Step 2 – Save your query as a Query Lambda

We’ll save this query as a Query Lambda called LoadTestQueryLambda, which will then be available as a REST endpoint:

https://api.usw2a1.rockset.com/v1/orgs/self/ws/sandbox/lambdas/LoadTestQueryLambda/tags/latest

curl --request POST \
  --url https://api.usw2a1.rockset.com/v1/orgs/self/ws/sandbox/lambdas/LoadTestQueryLambda/tags/latest \
  -H "Authorization: ApiKey $ROCKSET_APIKEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "parameters": [
      {
        "name": "days",
        "type": "int",
        "value": "1"
      }
    ],
    "virtual_instance_id": "<your virtual instance ID>"
  }' \
  | python -m json.tool
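
The same request can be built from Python, which is handy for checking the endpoint before wiring it into a load testing tool. This is a minimal sketch using only the standard library; the helper names build_request and run_query are hypothetical, and the URL, headers and parameters are copied from the curl call above.

```python
import json
import urllib.request

# Values taken from the curl example; replace the region and workspace with your own.
BASE_URL = "https://api.usw2a1.rockset.com/v1/orgs/self"
LAMBDA_PATH = "/ws/sandbox/lambdas/LoadTestQueryLambda/tags/latest"


def build_request(api_key, vi_id, parameters=None):
    """Build (but do not send) the POST request that executes the Query Lambda."""
    body = {"virtual_instance_id": vi_id}
    if parameters:
        body["parameters"] = parameters
    return urllib.request.Request(
        BASE_URL + LAMBDA_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "ApiKey " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling run_query(build_request(api_key, vi_id, [{"name": "days", "type": "int", "value": "1"}])) with a real API key and virtual instance ID should return the same JSON that the curl command prints.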

Step 3 – Generate your API key

Now we need to generate an API key, which our Locust script will use to authenticate itself to Rockset and run the test. You can create an API key easily through our console or through the API.

Step 4 – Create a virtual instance for load testing

Next, we need the ID of the virtual instance we want to load test. In our scenario, we want to run a load test against a Rockset virtual instance that’s dedicated solely to querying. We spin up an additional Medium virtual instance for this:

load-test-3

Once the VI is created, we can get its ID from the console:

load-test-4

Step 5 – Install Locust

Next, we’ll install and set up Locust. You can do this on your local machine or a dedicated instance (think EC2 in AWS).

$ pip install locust

Step 6 – Create your Locust test script

Once that’s done, we’ll create a Python script for the Locust load test (note that it expects a ROCKSET_APIKEY environment variable to be set, which is our API key from step 3).

We can use the script below as a template:

import os
from locust import HttpUser, task, tag
from random import randrange


class query_runner(HttpUser):
    # The API key is read from the ROCKSET_APIKEY environment variable
    ROCKSET_APIKEY = os.getenv('ROCKSET_APIKEY')

    def on_start(self):
        # Headers sent with every request made by this simulated user
        self.headers = {
            "Authorization": "ApiKey " + self.ROCKSET_APIKEY,
            "Content-Type": "application/json"
        }
        self.client.headers = self.headers
        self.host = "https://api.usw2a1.rockset.com/v1/orgs/self"  # replace this with your region's URI
        self.client.base_url = self.host
        self.vi_id = '<your virtual instance ID>'  # replace this with your VI ID

    @tag('LoadTestQueryLambda')
    @task(1)
    def LoadTestQueryLambda(self):
        # using default params for now
        data = {
            "virtual_instance_id": self.vi_id
        }
        target_service = "/ws/sandbox/lambdas/LoadTestQueryLambda/tags/latest"  # replace this with your query lambda
        result = self.client.post(
            target_service,
            json=data
        )
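
The template runs the Query Lambda with its default parameters. To vary the :date parameter between requests, one option is a small helper that builds the request body. The sketch below is only an illustration: the helper name random_date_payload, the date range, the ISO date format and the "string" parameter type are all assumptions, so adjust them to match how Date is stored in your collection.

```python
from datetime import date, timedelta
from random import randrange


def random_date_payload(vi_id, first_day, num_days, param_type="string"):
    """Return a Query Lambda request body with a randomly chosen :date value.

    The day is picked uniformly from first_day .. first_day + num_days - 1.
    The parameter type and the ISO date format are assumptions.
    """
    day = first_day + timedelta(days=randrange(num_days))
    return {
        "virtual_instance_id": vi_id,
        "parameters": [
            {"name": "date", "type": param_type, "value": day.isoformat()}
        ],
    }
```

Inside the Locust task, the returned dictionary could be passed as the json argument of self.client.post in place of the fixed request body, so that each simulated request asks for a different day.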

Step 7 – Run the load test

Once we set the API key environment variable, we can run the Locust environment:

export ROCKSET_APIKEY=<your api key>
locust -f my_locust_load_test.py --host https://api.usw2a1.rockset.com/v1/orgs/self

Then navigate to http://localhost:8089, where we can start our Locust load test:

load-test-5

Let’s explore what happens once we hit the Start swarming button:

Initialization of simulated users: Locust starts creating virtual users (up to the number you specified) at the rate you defined (the spawn rate). These users are instances of the user class defined in your Locust script. In our case, we’re starting with a single user, but we’ll then manually increase it to 5 and 10 users, and then go down to 5 and 1 again.
Task execution: Each virtual user starts executing the tasks defined in the script. In Locust, tasks are typically HTTP requests, but they can be any Python code. The tasks are picked randomly or based on the weights assigned to them (if any). We have just one query that we’re executing (our LoadTestQueryLambda).
Performance metrics collection: As the virtual users perform tasks, Locust collects and calculates performance metrics. These metrics include the number of requests made, the number of requests per second, response times, and the number of failures.
Real-time statistics update: The Locust web interface updates in real time, showing these statistics. This includes the number of users currently swarming, the request rate, failure rate, and response times.
Test scalability: Locust will continue to spawn users until it reaches the total number specified. It ensures the load is increased gradually as per the specified spawn rate, allowing you to observe how the system performance changes as the load increases. You can see this in the graph below, where the number of users grows to 5 and 10 and then goes down again.
User behavior simulation: Virtual users will wait for a random time between tasks, as defined by the wait_time in the script. This simulates more realistic user behavior. We didn’t do this in our case, but you can do this and more advanced things in Locust, like custom load shapes.
Continuous test execution: The test will continue running until you decide to stop it, or until it reaches a predefined duration if you’ve set one.
Resource utilization: During this process, Locust uses your machine’s resources to simulate the users and make requests. It’s important to note that the performance of the Locust test can also depend on the resources of the machine it’s running on.
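
The 1, 5, 10, 5, 1 user pattern we stepped through manually can also be described as data. Below is a rough sketch of such a stepped profile in plain Python; the stage durations and the names STAGES and users_at are assumptions made for illustration, not values from our test run.

```python
# Hypothetical stages as (end_time_in_seconds, user_count) pairs,
# mirroring the 1 -> 5 -> 10 -> 5 -> 1 pattern used in this test.
STAGES = [(60, 1), (120, 5), (180, 10), (240, 5), (300, 1)]


def users_at(elapsed_seconds, stages=STAGES):
    """Return the target user count at the given elapsed time.

    Returns None once the last stage has finished, meaning the test should stop.
    """
    for end_time, users in stages:
        if elapsed_seconds < end_time:
            return users
    return None
```

Locust’s custom load shapes work along similar lines: a LoadTestShape subclass defines a tick() method that returns a (user_count, spawn_rate) tuple, or None to stop the test, so a function like users_at could be called from tick() with the value of self.get_run_time().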

Let’s now interpret the results we’re seeing.

Interpreting and validating load testing results

Interpreting results from a Locust run involves understanding key metrics and what they indicate about the performance of the system under test. Here are some of the main metrics provided by Locust and how to interpret them:

Number of users: The total number of simulated users at any given point in the test. This helps you understand the load level on your system. You can correlate system performance with the number of users to determine at what point performance degrades.
Requests per second (RPS): The number of requests (queries) made to your system per second. A higher RPS indicates a higher load. Compare this with response times and error rates to assess if the system can handle concurrency and high traffic smoothly.
Response time: Usually displayed as average, median, and percentile (e.g., 90th and 99th percentile) response times. You’ll likely look at the median and the 90th/99th percentiles, as these give you the experience for “most” users: only 10% or 1% of requests will have a worse experience.
Failure rate: The percentage or number of requests that resulted in an error. A high failure rate indicates problems with the system under test. It’s crucial to analyze the nature of these errors.
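
To make these metrics concrete, here is a small sketch that derives them from raw samples. It uses the nearest-rank definition of a percentile, which is one common convention and not necessarily the exact method Locust uses internally; the function names and the sample format are assumptions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples, duration_seconds):
    """Summarize a list of (response_time_ms, succeeded) tuples collected over duration_seconds."""
    times = [ms for ms, _ in samples]
    failures = sum(1 for _, ok in samples if not ok)
    return {
        "rps": len(samples) / duration_seconds,
        "median_ms": percentile(times, 50),
        "p90_ms": percentile(times, 90),
        "p99_ms": percentile(times, 99),
        "failure_rate": failures / len(samples),
    }
```

For example, ten requests over five seconds give an RPS of 2, and if one of them failed the failure rate is 0.1. With so few samples the P99 is simply the slowest request, which is why percentiles only become meaningful on longer runs.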

Below you can see the total RPS and response times we achieved under different loads for our load test, going from a single user to 10 users and then down again.

load-test-6

Our RPS went up to about 20 while maintaining median query latency below 300 milliseconds and P99 of 700 milliseconds.

load-test-7

We can now correlate these data points with the available virtual instance metrics in Rockset. Below, you can see how the virtual instance handles the load in terms of CPU, memory and query latency. There is a correlation between the number of users from Locust and the peaks we see on the VI utilization graphs. You can also see the query latency starting to rise and the concurrency (requests or queries per second) go up. The CPU is below 75% at the peak and memory usage looks stable. We also don’t see any significant queueing happening in Rockset.

load-test-8

Apart from viewing these metrics in the Rockset console or through our metrics endpoint, you can also analyze the actual SQL queries that were running: their individual performance, queue time, and so on. To do this, we must first enable query logs, and then we can run something like this to figure out our median run and queue times:

SELECT
    query_sql,
    COUNT(*) as count,
    ARRAY_SORT(ARRAY_AGG(runtime_ms)) [(COUNT(*) + 1) / 2] as median_runtime,
    ARRAY_SORT(ARRAY_AGG(queued_time_ms)) [(COUNT(*) + 1) / 2] as median_queue_time
FROM
    commons."QueryLogs"
WHERE
    vi_id = '<your virtual instance ID>'
    AND _event_time > TIMESTAMP '2023-11-24 09:40:00'
GROUP BY
    query_sql
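
The median in this query comes from sorting the collected values and picking the element at position (COUNT(*) + 1) / 2. The Python sketch below mirrors that arithmetic, assuming the SQL array index is 1-based and the division is integer division; under those assumptions an even number of rows yields the lower of the two middle values. The function name sql_style_median is made up for this example.

```python
import statistics


def sql_style_median(values):
    """Mirror ARRAY_SORT(ARRAY_AGG(x))[(COUNT(*) + 1) / 2] on a Python list."""
    ordered = sorted(values)          # ARRAY_SORT(ARRAY_AGG(...))
    index = (len(ordered) + 1) // 2   # (COUNT(*) + 1) / 2, assuming integer division
    return ordered[index - 1]         # SQL index assumed 1-based; Python lists are 0-based


runtimes_ms = [40, 10, 30, 20]
# With four values the index is 2, so the lower middle value (20) is returned,
# which matches statistics.median_low.
assert sql_style_median(runtimes_ms) == statistics.median_low(runtimes_ms)
```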

We can repeat this load test on the main VI as well, to see how the system performs ingestion and runs queries under load. The process would be the same; we’d just use a different VI identifier in our Locust script in Step 6.

Conclusion

In summary, load testing is an essential part of ensuring the reliability and performance of any database solution, including Rockset. By selecting the right load testing tool and setting up Rockset appropriately for load testing, you can gain valuable insights into how your system will perform under various conditions.

Locust is easy enough to get started with quickly, but because Rockset has REST API support for executing queries and query lambdas, it’s easy to hook up any load testing tool.

Remember, the goal of load testing is not just to identify the maximum load your system can handle, but also to understand how it behaves under different stress levels and to ensure that it meets the required performance standards.

Quick load testing tips before we end the blog:

Always load test your system before going to production
Use query lambdas in Rockset to easily parametrize, version-control and expose your queries as REST endpoints
Use compute-compute separation to perform load testing on a virtual instance dedicated to queries, as well as on your main (ingestion) VI
Enable query logs in Rockset to keep statistics of executed queries
Analyze the results you’re getting and compare them against your SLAs. If you need better performance, there are several strategies for tackling this, and we’ll go through these in a future blog.

Have fun testing 💪

Useful resources

Here are some useful resources for JMeter, Gatling and k6. The process is very similar to what we’re doing with Locust: you need to have an API key and authenticate against Rockset, and then hit the Query Lambda REST endpoint for a particular virtual instance.
