Autoscaling your Container Services
When you run your services in a production environment, you typically have multiple instances of each server for high availability.
Flightcontrol lets you set the minimum and maximum number of instances for each service (web service or worker). In addition, you can configure the instance size to use with each service.
Autoscaling Rules
Flightcontrol sets up autoscaling rules on ECS for each type of service (web service and worker).
The default autoscaling rules used by both types are:
- 70% CPU utilization
- 70% Memory utilization
Web Services
For web services, Flightcontrol also sets a request threshold: 500 requests over the course of 60 seconds, measured twice in a three-minute period. If this threshold is exceeded, another instance is added to your cluster, up to the maximum number of instances.
Configuring Autoscaling in the Dashboard
The autoscaling configuration is available in the dashboard. For an existing service, you will find the options under the Config tab.
Number of Instances
The minimum and maximum number of instances are options in the Instance configuration section of the service configuration. If these numbers differ, Flightcontrol tells ECS to use autoscaling. If they are the same, the service runs a fixed number of instances and autoscaling is not needed.
Both minimum and maximum instances may be specified for each service. These values can also vary between environments, as each environment is configured separately.
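For readers who configure services in code rather than in the dashboard, the fixed-size case can be sketched in flightcontrol.json as follows. This is a single entry of a services array, trimmed to the fields relevant here. The id and name are hypothetical, and the remaining service fields (target, build settings, and so on) are omitted for brevity.

```json
{
  "id": "fixed-size-web",
  "name": "Fixed Size Web",
  "type": "web",
  "cpu": 0.5,
  "memory": 1,
  "minInstances": 2,
  "maxInstances": 2
}
```

Because minInstances and maxInstances are equal, this service would always run 2 instances. Raising maxInstances above minInstances is what gives autoscaling room to add instances.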
Thresholds
The autoscaling thresholds are configured in the Autoscaling section of the service configuration.
You can set different thresholds for different environments. For example, you may want to set a higher threshold for your staging environment, and a lower threshold for your production environment.
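The same per-environment split can be written in flightcontrol.json, since each environment carries its own service definitions. The sketch below is illustrative only: the staging environment, its branch name, and the specific threshold values are assumptions, and the field names are taken from the code example later on this page.

```json
{
  "$schema": "https://app.flightcontrol.dev/schema.json",
  "environments": [
    {
      "id": "production",
      "name": "Production",
      "region": "us-west-2",
      "source": { "branch": "main" },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "web",
          "target": { "type": "fargate" },
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 2,
          "maxInstances": 5,
          "autoscaling": {
            "cpuThreshold": 60,
            "memoryThreshold": 60,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          }
        }
      ]
    },
    {
      "id": "staging",
      "name": "Staging",
      "region": "us-west-2",
      "source": { "branch": "staging" },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "web",
          "target": { "type": "fargate" },
          "buildType": "nixpacks",
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 1,
          "maxInstances": 2,
          "autoscaling": {
            "cpuThreshold": 80,
            "memoryThreshold": 80,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          }
        }
      ]
    }
  ]
}
```

Here staging tolerates higher utilization before scaling and is capped at fewer instances, while production scales out earlier.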
Configuring Autoscaling in Code
With flightcontrol.json as your configuration option, the minInstances and maxInstances attributes can be set for each individual service.
The following example shows a Flask web application that runs a minimum of 2 and a maximum of 5 instances at any one time.
The example uses the following autoscaling parameters:
- CPU Threshold of 60%
- Memory Threshold of 60%
- Cooldown Timer of 300 seconds
- Requests per Target of 1000
{
  "$schema": "https://app.flightcontrol.dev/schema.json",
  "environments": [
    {
      "id": "production",
      "name": "Production",
      "region": "us-west-2",
      "source": {
        "branch": "main"
      },
      "services": [
        {
          "id": "flask-web",
          "name": "Flask Web",
          "type": "web",
          "target": {
            "type": "fargate"
          },
          "buildType": "nixpacks",
          "ci": {
            "type": "ec2"
          },
          "cpu": 0.5,
          "memory": 1,
          "minInstances": 2,
          "maxInstances": 5,
          "autoscaling": {
            "cpuThreshold": 60,
            "memoryThreshold": 60,
            "cooldownTimerSecs": 300,
            "requestsPerTarget": 1000
          },
          "envVariables": {},
          "healthCheckPath": "/healthcheck"
        }
      ]
    }
  ]
}
Conclusion
We encourage everyone who uses cloud environments to monitor their costs and to consider how instance counts and sizes affect cost and performance.
You may also need to adjust the autoscaling parameters to suit your application's performance needs. For example, if your application is CPU-intensive, you may want to increase the CPU threshold to 80%.
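As a rough illustration of that adjustment, the autoscaling block of the Flask example above could be changed as follows. Only cpuThreshold differs from the earlier example; the other values are carried over unchanged, and the right numbers for your own workload are something to determine from monitoring rather than from this sketch.

```json
{
  "autoscaling": {
    "cpuThreshold": 80,
    "memoryThreshold": 60,
    "cooldownTimerSecs": 300,
    "requestsPerTarget": 1000
  }
}
```

This fragment would sit inside the service entry in flightcontrol.json, in place of its existing autoscaling block.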