We are using t3.smalls for the bulk of the cluster and want to keep it that way. Most of our services fit in a small memory budget just fine, but we have some large render jobs that need 4+ GB of memory. We are coming from Heroku, where we could specify that only the render service runs on the performance dynos, which have plenty of memory available.
I tried adding a t3.large to the cluster by manually creating one from the AWS console, using one of the existing instances as a template. Convox appears to have picked up the instance, but we ran into some challenges:
- Convox doesn’t know anything about its memory and displays blanks for most of the stats in `convox instances`.
- Autoscaling breaks completely. With the t3.large in the cluster, the autoscaler sometimes goes haywire and keeps adding new instances until it hits the AWS limit. We had to turn autoscaling off entirely, but we would like to keep using it.
- We can’t keep other services off the t3.large so that the high-memory service can always land on it. We mostly worked around this by setting `singleton: true` on the service so it doesn’t attempt rolling deployments; most of the time it now deletes the service and re-creates it on the same instance during deploys (see the sketch after this list). This doesn’t always work, though, and sometimes the deployment gets stuck for hours before it finally rolls back.
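For reference, this is roughly what the workaround looks like in our convox.yml. It’s a minimal sketch, not our exact file: the `render` service name and the 5120 MB figure are placeholders, and the `scale.memory` reservation only tells the scheduler how much the job needs, it doesn’t by itself pin the service to the t3.large.

```yaml
services:
  render:
    build: .
    command: bin/render
    # singleton: stop the old process before starting the new one,
    # so a deploy doesn't need room for two copies at once
    singleton: true
    scale:
      count: 1
      # reserve enough memory for the largest render jobs;
      # nothing here actually forces placement onto the t3.large
      memory: 5120
```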
My questions are mostly about how to handle this kind of mixed workload better. I didn’t see anything in the docs, but it would be nice to be able to:
- Mix and match instance sizes with auto scaling
- Set an affinity between a service and a specific instance type (sketched below)
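In case it helps frame the second request: since the rack runs on ECS, raw ECS services can already express this kind of affinity with a placement constraint on the built-in `ecs.instance-type` attribute. Here is a sketch of the CloudFormation fragment I’d hope Convox could expose; the resource names are hypothetical, not something Convox generates today.

```yaml
# Hypothetical ECS service definition with instance-type affinity
RenderService:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref Cluster
    TaskDefinition: !Ref RenderTasks
    DesiredCount: 1
    PlacementConstraints:
      # only place render tasks on t3.large instances
      - Type: memberOf
        Expression: attribute:ecs.instance-type == t3.large
```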
Does anyone have suggestions for running high-memory services in the same cluster as services that fit fine on smaller instances? I don’t really want to move to t3.larges across the board, because that would quadruple my costs for similar performance just to work around this one issue.
PS: t3.large = 2 vCPU x 8 GB and t3.small = 2 vCPU x 2 GB. My render service can use up to 5 GB for large projects.