The subnet IDs subnet-********,subnet-******** are in use. (Service: AmazonElastiCache; Status Code: 400; Error Code: SubnetInUse; Request ID: ********; Proxy: null)
I’m still running an older version of the Convox rack (gen 2): 20191120232059
$ convox rack resources info redis-****
Name     redis-****
Type     redis
Status   running
Options  AutomaticFailoverEnabled=false
         Database=0
         Encrypted=false
         InstanceType=cache.t2.medium
         NumCacheClusters=1
Does anyone have any idea how I might be able to fix this?
I’ve considered creating a new app-level Redis resource and swapping the REDIS_URL during a deploy, but that will cause some downtime, since my worker processes always update much faster than the web processes.
Hi @matt, yes fortunately I was able to resolve this issue. I reached out to AWS support and got a few more details about the cause:
---> To identify the cause behind the aforementioned error, I checked the latest update call made on the production stack using our internal tools. I noticed that an API call, ‘ModifyCacheSubnetGroup’, was initiated during the latest stack update to modify the subnetIds in the CloudFormation resource ‘CacheSubnetGroup’. Please refer to [1] for more information on ‘ModifyCacheSubnetGroup’.
Old subnetIds :
****************************
subnet-ab6f****, subnet-57d3****, subnet-7cb3****
****************************
New subnetIds :
****************************
subnet-9b3a****, subnet-6de8****, subnet-2022****
****************************
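For anyone hitting the same error, here is a rough sketch of how to see which subnets an update like this would drop from the cache subnet group. The function names and the example IDs are made up for illustration (the real IDs are masked above), and `current_subnets` assumes the AWS CLI is installed and configured; the group name you pass to it is whatever your rack's CloudFormation stack created.

```shell
#!/bin/sh
# Print every subnet ID that appears in the old list but not in the new one.
# Both arguments are comma-, space-, or tab-separated lists of subnet IDs.
removed_subnets() {
  old_list=$(printf '%s' "$1" | tr ',\t\n' '   ')
  new_list=$(printf '%s' "$2" | tr ',\t\n' '   ')
  for id in $old_list; do
    case " $new_list " in
      *" $id "*) ;;                 # still present after the update
      *) printf '%s\n' "$id" ;;     # would be dropped by the update
    esac
  done
}

# Hypothetical helper: list the subnets a cache subnet group holds right now.
# Needs the AWS CLI and credentials; "$1" is the cache subnet group name.
current_subnets() {
  aws elasticache describe-cache-subnet-groups \
    --cache-subnet-group-name "$1" \
    --query 'CacheSubnetGroups[0].Subnets[].SubnetIdentifier' \
    --output text
}

# Example with made-up IDs: prints the old subnets missing from the new list.
removed_subnets "subnet-aaaa1111, subnet-bbbb2222, subnet-cccc3333" \
                "subnet-bbbb2222, subnet-dddd4444, subnet-eeee5555"
```

If a subnet printed here still hosts a cache node, that would at least be consistent with the SubnetInUse error, though I can't say for certain that this is the only way to trigger it.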
I was not able to resolve the subnet problem for the Convox Redis resource, so I ended up migrating to a new Redis cluster.
I figured out how I can perform a zero-downtime migration by running two sets of worker containers at the same time, for both Redis clusters. So I did something like this:
1. Set OLD_REDIS_URL to the current value of REDIS_URL. (Don’t promote.)
2. a) Add another set of worker containers (a worker_old service) that always perform jobs from OLD_REDIS_URL.
   b) Add a new Redis resource to convox.yml.
   c) Deploy.
3. Set REDIS_URL to the new Redis URL and promote the release.
4. Remove the worker_old service from convox.yml and deploy the change. Now all web and worker containers are using the new Redis.
5. Delete the old redis rack resource.
6. Unset OLD_REDIS_URL.
This way, I was running two sets of workers that were using both Redis instances at the same time, so I could perform a zero-downtime migration without losing any background jobs.
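In case it helps, the CLI side of the steps above could look roughly like this. It is only a sketch: the app name, URLs, and resource name are placeholders, the function names are my own, and it assumes your CLI version supports `--promote` on `convox env set` (if not, run `convox releases promote` with the release ID that `env set` prints). It defaults to a dry run that only prints the commands. The convox.yml edits (adding and later removing worker_old, adding the new Redis resource) still have to be made by hand between the calls.

```shell
#!/bin/sh
APP="${APP:-myapp}"        # placeholder app name
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Keep the old URL around. Without --promote this only creates a release.
save_old_url() { run convox env set "OLD_REDIS_URL=$1" -a "$APP"; }

# Deploy after editing convox.yml (worker_old added or removed).
deploy() { run convox deploy -a "$APP"; }

# Point the app at the new Redis and promote the resulting release.
# Assumption: --promote is available in your CLI version.
switch_to_new() { run convox env set "REDIS_URL=$1" -a "$APP" --promote; }

# Delete the old rack-level Redis resource once nothing uses it any more.
delete_old_resource() { run convox rack resources delete "$1"; }

# Drop the temporary variable.
unset_old_url() { run convox env unset OLD_REDIS_URL -a "$APP"; }

# Print the whole plan with placeholder values.
save_old_url "redis://old.example:6379/0"
deploy                                   # convox.yml now has worker_old + new Redis
switch_to_new "redis://new.example:6379/0"
deploy                                   # convox.yml no longer has worker_old
delete_old_resource "redis-old"
unset_old_url
```

Run it as-is to review the plan, then set DRY_RUN=0 (and real values) once you are happy with it.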