Convox Rack Repeatedly in Unknown State

I am running a v3 rack on AWS that was previously functional but has now twice dropped into “Unknown” status without any change on my part. Attempts to reach the rack via the CLI return “ERROR: response status 502”, and listing the racks (via convox racks) shows the rack in the “Unknown” state. The first time, the rack came back on its own after a few hours. It’s down again now, and I don’t know when it will recover or how to bring it back.

Has this happened to others, and are there any suggestions for how to diagnose and address it?

Good morning @kalquaddoomi

While I have seen v3 racks return sporadic 502s, those blips last only seconds at a time and are typically caused by convergence around scaling.

Is this rack running with high_availability=false? That setting reduces the redundancy of rack components such as the API, and can additionally lead to interruptions during pod or node scaling/rescheduling.

I would also ask you to double-check the strength of your connection. I didn’t see you mention checking this directly in the Console, but I have seen folks hit extended 502 errors via the CLI when on weak Wi-Fi. Could that be the case here? Have you checked whether the rack is responsive in the Convox Console?
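
If it helps to tell a seconds-long blip from an extended outage, a rough way is to probe the rack on a schedule and record when it fails. The sketch below is only an illustration: it assumes the convox CLI is on your PATH, that convox rack exits with a non-zero status when it prints the 502 error, and the function names (probe, longest_outage, watch) are made up for this example. Running it from a second network as well would help rule out the local connection.

```python
import subprocess
import time
from datetime import datetime, timedelta


def probe(cmd=("convox", "rack"), timeout=30):
    """Return True if the CLI command exits with status 0.

    Assumption: the CLI returns a non-zero exit status on a 502.
    """
    try:
        result = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def longest_outage(samples):
    """Given (timestamp, ok) pairs in time order, return the longest failing stretch.

    A stretch runs from the first failed probe to the next successful one,
    or to the last sample if the rack never recovered.
    """
    longest = timedelta(0)
    start = None
    last = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            longest = max(longest, ts - start)
            start = None
        last = ts
    if start is not None and last is not None:
        longest = max(longest, last - start)
    return longest


def watch(interval=60, rounds=60):
    """Probe the rack every `interval` seconds and print one line per probe."""
    samples = []
    for _ in range(rounds):
        now = datetime.now()
        ok = probe()
        samples.append((now, ok))
        print(now.isoformat(timespec="seconds"), "ok" if ok else "FAILED")
        time.sleep(interval)
    return samples
```

Calling longest_outage(watch()) after a monitoring run gives a single duration to compare against the "seconds at a time" blips described above.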

If you’re still experiencing this problem, could you share the output of convox rack params, redacting any sensitive information?
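
For the redaction step, here is a minimal sketch of one way to mask values before pasting them into a post. It assumes the params output is one whitespace-separated name/value pair per line, and the keyword list for "sensitive" names is a guess on my part, so please still read the result over by hand before sharing it. The parameter some_api_token in the usage note is hypothetical.

```python
import re

# Assumption: parameter names containing these words may hold secrets.
SENSITIVE = re.compile(r"key|secret|token|password", re.IGNORECASE)

# Leading whitespace, name, column gap, value (trailing whitespace dropped).
LINE = re.compile(r"^(\s*)(\S+)(\s+)(.*\S)\s*$")


def redact_params(text, pattern=SENSITIVE):
    """Replace the value of any parameter whose name matches `pattern`."""
    out = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m and pattern.search(m.group(2)):
            # Keep the name and column spacing, mask only the value.
            out.append(m.group(1) + m.group(2) + m.group(3) + "<redacted>")
        else:
            out.append(line)
    return "\n".join(out)
```

For example, given the lines "high_availability  false" and "some_api_token     abc123", the first passes through unchanged and the second has its value replaced with <redacted>.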