RESOLVED: vSphere vCenter and NSX may not have been available for a short period of time in us-central1

Incident began at 2022-06-23 00:13 and ended at 2022-06-23 06:36 (all times are US/Pacific).

SUMMARY:

On Thursday, 16 June 2022, customers were unable to create or edit Google Compute Engine (GCE) instances via the Google Cloud Console for 2 days, 20 hours, 31 minutes. To our customers who were impacted during this outage, we sincerely apologize. We are conducting an internal investigation and are taking steps to improve our service.

ROOT CAUSE:

Customers can specify organization policies to limit which instances can use external IP addresses (compute.instanceExternalIpAccess). Setting an external IP address on an instance that the policy does not allow causes the operation to fail.
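The policy behavior described above can be illustrated with a small Python model. This is a minimal sketch and not the actual enforcement code: the function name, the PolicyViolation exception, and the allowed_instances set are hypothetical stand-ins, and only the error text is taken from this report.

```python
class PolicyViolation(Exception):
    """Raised when a request conflicts with the organization policy."""


def check_external_ip_access(project_id, instance_name, wants_external_ip,
                             allowed_instances):
    """Reject an external IP on any instance the policy does not allow.

    allowed_instances is a hypothetical stand-in for the policy's allow list.
    """
    if wants_external_ip and instance_name not in allowed_instances:
        raise PolicyViolation(
            "Constraint constraints/compute.instanceExternalIpAccess "
            f"violated for project {project_id}."
        )


# A request without an external IP passes the check silently.
check_external_ip_access("demo-project", "vm-1", False, allowed_instances=set())

# A request for an external IP on a non-allowed instance is rejected.
try:
    check_external_ip_access("demo-project", "vm-1", True, allowed_instances=set())
except PolicyViolation as err:
    print(err)
```

In this model a request fails only when it asks for an external IP on an instance outside the allow list, which is why an instance with no external IP would normally save without error.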

The incident was triggered by a compute frontend UI release that made it impossible for certain users to modify instances. The failure came from an interaction between org policies and a bug that forced an ephemeral IP address onto the instance on both the edit and create pages. As a result, users subject to the compute.instanceExternalIpAccess policy could not create or edit instances that had no public IP.

A bug was identified where customers with a policy restricting external IP addresses were not able to select one during instance creation. An attempt to fix this bug created a regression: changing any field on the Edit instance page would change the IP address to ephemeral for machines that had no IP address selected. Because the org policy blocks assigning the instance a public IP, the save operation would fail. The release containing this change included a fix for the original bug; however, once it was in production, several customers reported issues.
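The regression can be pictured with a toy Python model of the Edit page state. This is a sketch under stated assumptions and not the console's actual code: InstanceForm, edit_field_buggy, edit_field_expected, and save are hypothetical names, and the machine types are only illustrative values.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class InstanceForm:
    """Hypothetical Edit-page state; external_ip None means no external IP."""
    name: str
    machine_type: str
    external_ip: Optional[str] = None


def edit_field_buggy(form, **changes):
    """Regressed behavior: any edit turns a missing IP into an ephemeral one."""
    updated = replace(form, **changes)
    if updated.external_ip is None:
        updated = replace(updated, external_ip="ephemeral")
    return updated


def edit_field_expected(form, **changes):
    """Expected behavior: only the fields the user changed are modified."""
    return replace(form, **changes)


def save(form, external_ip_allowed):
    """Return True if the save passes the simulated org policy check."""
    return form.external_ip is None or external_ip_allowed


original = InstanceForm(name="vm-1", machine_type="e2-small")
buggy = edit_field_buggy(original, machine_type="e2-medium")
expected = edit_field_expected(original, machine_type="e2-medium")

print(save(buggy, external_ip_allowed=False))     # False: policy blocks the forced IP
print(save(expected, external_ip_allowed=False))  # True: no IP was requested
```

The user in this model only changes the machine type, yet the regressed path submits an ephemeral external IP, so the policy rejects the save.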

REMEDIATION AND PREVENTION:

Google engineers were alerted to the issue via a customer support case on Wednesday, 14 June 2022 at 04:15 and started an investigation. At 06:11, Google engineers were able to reproduce the issue, and they escalated the incident at 09:08. At 09:50, Google engineers initiated a rollback of the release, which was completed at 11:26, fully mitigating the issue.

Google is committed to improving our service in the future and will be completing the following actions:
Improve unit testing for org policies to identify issues of this type.
Improve alerting to quickly detect configuration failures.

DETAILED DESCRIPTION OF IMPACT:

On Wednesday, 14 June 2022, from 04:15 to 11:26 US/Pacific:

Google Compute Engine

Affected customers experienced failures creating or editing GCE instances via the Google Cloud Console and may have received the error “Constraint constraints/compute.instanceExternalIpAccess violated for project [project ID].”

ADDITIONAL INFORMATION FOR CUSTOMERS:

As a workaround, customers were still able to create or edit GCE instances via the gcloud CLI, or via the Google Cloud Console by disabling the constraints/compute.instanceExternalIpAccess policy.
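As an illustration of the CLI route, the Python sketch below builds a gcloud command that creates an instance with no external IP. The report does not say which commands customers ran, so the helper name, instance name, and zone are assumptions; --zone and --no-address are existing flags of gcloud compute instances create. The snippet only prints the command and does not execute it.

```python
import shlex


def gcloud_create_without_external_ip(name, zone):
    """Build a gcloud command that creates an instance with no external IP."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        "--no-address",  # do not assign an external IP address
    ]


# Instance name and zone are placeholder values for illustration.
cmd = gcloud_create_without_external_ip("vm-1", "us-central1-a")
print(shlex.join(cmd))
```

Passing the list to subprocess.run would execute the command, provided the gcloud CLI is installed and authenticated.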


Affected products: VMware Engine, Google Compute Engine

Affected locations: Iowa (us-central1)
