Autoscaling Overview

SaladCloud offers two main methods for autoscaling your containerized workloads:

Job Queue Autoscaling: This method scales your workloads based on the number of jobs in your Salad Job Queue. It suits workloads that can be processed in parallel and whose job volume varies significantly over time.
Programmatic Autoscaling: This method scales your workloads through the API based on any custom metric you choose. We have detailed guides on setting up programmatic autoscaling based on:
- Time of Day: Scale your workloads based on the time of day. This is useful for workloads that have predictable usage patterns, such as aligning with business hours in a particular region.
- Hardware Metrics: Scale your workloads based on hardware metrics such as GPU utilization. This is useful for workloads that require specific hardware resources and where the demand for those resources can vary over time.
- Job Queue Volume: Scale your workloads based on the number of messages waiting in your Amazon SQS queue. This is useful for workloads that consume messages from a queue and need to scale with the current backlog.
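
As a rough illustration of the decision logic behind programmatic autoscaling, the sketch below computes a target replica count from the time of day and from a queue backlog. It is a minimal example under stated assumptions, not SaladCloud's implementation: the function names, business hours, replica limits, and the jobs-per-replica ratio are all hypothetical, and the step that sends the result to the API is deliberately left out.

```python
import math
from datetime import time


def replicas_for_time(
    now: time,
    business_start: time = time(9, 0),   # hypothetical start of peak hours
    business_end: time = time(17, 0),    # hypothetical end of peak hours
    peak: int = 10,                      # assumed replica count during peak hours
    off_peak: int = 2,                   # assumed replica count outside peak hours
) -> int:
    """Return the target replica count for a given local time of day."""
    if business_start <= now < business_end:
        return peak
    return off_peak


def replicas_for_backlog(
    backlog: int,
    jobs_per_replica: int = 20,  # assumed backlog one replica can absorb
    min_replicas: int = 1,       # assumed lower bound
    max_replicas: int = 50,      # assumed upper bound
) -> int:
    """Return the target replica count for a given number of queued messages."""
    wanted = math.ceil(backlog / jobs_per_replica)
    # Clamp the result so the workload never scales outside the allowed range.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    print(replicas_for_time(time(10, 30)))
    print(replicas_for_backlog(100))
```

A scheduler would call one of these functions periodically and pass the result to whatever API call updates the replica count of your container group. Clamping to a minimum and maximum keeps a metric spike or an empty queue from scaling the workload to an unintended size.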
