Hi,
We believe we have hit a bug with the new “VPC endpoints” feature while attempting to update two of our racks from 20220310121318 to 20220427210019.
We believe the bug was introduced in the series of releases starting with 20220328135225, released by heronrs (Heron Rossi) on GitHub (is that you, @heron?).
Relevant CloudFormation events highlighting the issue are:
convox-rack-rtc UPDATE_ROLLBACK_IN_PROGRESS The following resource(s) failed to create: [CFEndpoint, KMSEndpoint, ECSEndpoint]. The following resource(s) failed to update: [InstancesLifecycleHandler].
InstancesLifecycleHandler UPDATE_FAILED Resource update cancelled
ECSEndpoint CREATE_FAILED private-dns-enabled cannot be set because there is already a conflicting DNS domain for ecs.us-east-1.amazonaws.com in the VPC vpc-dfa0a*** (Service: AmazonEC2; Status Code: 400; Error Code: InvalidParameter; Request ID: 6e5ebffe-d686-4891-adbb-f0873ba6e***; Proxy: null)
KMSEndpoint CREATE_FAILED private-dns-enabled cannot be set because there is already a conflicting DNS domain for kms.us-east-1.amazonaws.com in the VPC vpc-dfa0a*** (Service: AmazonEC2; Status Code: 400; Error Code: InvalidParameter; Request ID: 92e592d5-8af1-42c6-bbaf-cda7ee05b***; Proxy: null)
CFEndpoint CREATE_FAILED private-dns-enabled cannot be set because there is already a conflicting DNS domain for cloudformation.us-east-1.amazonaws.com in the VPC vpc-dfa0a*** (Service: AmazonEC2; Status Code: 400; Error Code: InvalidParameter; Request ID: da32833d-99fd-42f5-9040-4e6151a27***; Proxy: null)
A little bit of context so that you can better understand how our racks are set up:
There are two racks (A and B), and they share a single VPC (i.e., B's ExistingVpc is set to the ID of A's VPC, and both racks use the same VPCCIDR).
A was created a long while before B, and when we created B, this is the command we used:
convox rack install aws \
--name <rack-name> \
ExistingVpc="<vpc-id>" \
VPCCIDR="10.0.0.0/16" \
Subnet0CIDR="10.0.7.0/24" \
Subnet1CIDR="10.0.8.0/24" \
Subnet2CIDR="10.0.9.0/24" \
SubnetPrivate0CIDR="10.x.10.0/24" \
SubnetPrivate1CIDR="10.x.11.0/24" \
SubnetPrivate2CIDR="10.x.12.0/24" \
InternetGateway="<igw-id>"
Up until this update, however, we had always updated B first, and then A. On this update cycle, for whatever reason, we updated A first (successfully), but then failed to update B due to the errors mentioned above.
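In case it helps with triage: our working assumption (not confirmed) is that A's successful update already created the ECS, KMS and CloudFormation endpoints with private DNS enabled in the shared VPC, so B's attempt to create its own conflicts with them. A rough sketch of how this could be checked with the AWS CLI is below. The helper name list_vpc_endpoints is ours and purely illustrative, and <vpc-id> is a placeholder for the shared VPC's ID.

```shell
# Hypothetical helper: list the endpoints that already exist in a VPC,
# together with their private-DNS setting and state.
list_vpc_endpoints() {
  vpc_id="$1"
  aws ec2 describe-vpc-endpoints \
    --filters "Name=vpc-id,Values=${vpc_id}" \
    --query 'VpcEndpoints[].[VpcEndpointId,ServiceName,PrivateDnsEnabled,State]' \
    --output table
}

# Usage (replace the placeholder with the shared VPC's ID):
# list_vpc_endpoints "<vpc-id>"
```

If endpoints for ecs, kms and cloudformation show up with PrivateDnsEnabled set to true, that would be consistent with the CREATE_FAILED messages above.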
I guess the new “VPC endpoints” feature still has some rough edges.
Would it be possible to issue a bug fix relatively quickly? This is blocking our production environment’s ability to update to newer releases.
Thank you!