ssh fails with ERROR: (-iap-tunnel) Error while connecting

Incident began at 17:30 and ended at 20:50 (all times are US/Pacific).

Taiwan (asia-east1), Hong Kong (asia-east2), Tokyo (asia-northeast1), Osaka (asia-northeast2), Seoul (asia-northeast3), Mumbai (asia-south1), Delhi (asia-south2), Singapore (asia-southeast1), Jakarta (asia-southeast2), Sydney (australia-southeast1), Melbourne (australia-southeast2), Warsaw (europe-central2), Finland (europe-north1), Madrid (europe-southwest1), Belgium (europe-west1), London (europe-west2), Frankfurt (europe-west3), Netherlands (europe-west4), Zurich (europe-west6), Milan (europe-west8), Paris (europe-west9), Montréal (northamerica-northeast1), Toronto (northamerica-northeast2), São Paulo (southamerica-east1), Santiago (southamerica-west1), Iowa (us-central1), South Carolina (us-east1), Northern Virginia (us-east4), Columbus (us-east5), Dallas (us-south1), Oregon (us-west1), Los Angeles (us-west2), Salt Lake City (us-west3), Las Vegas (us-west4)

On 1 December 2022 at 17:30 PT, Google Compute Engine (GCE) customers experienced issues connecting to instances via Secure Shell Protocol (SSH) through Identity Aware Proxy (IAP) for a total duration of 22 hours and 46 minutes. To our GCE customers whose instances were impacted during this outage, we sincerely apologize. This is not the level of quality and reliability we strive to offer you, and we are taking immediate steps to improve the platform’s performance and availability. We have conducted an internal investigation and are taking steps to improve our service.

IAP assists in securing user access to registered web applications and cloud resources by providing centralized access control. IAP is the default method of connectivity when customers SSH through the Google Cloud Console or gcloud tool to GCE instances which do not have external IP addresses. The IAP connection method also can be specified by organization policy or gcloud invocation flags.

Root Cause

Before allowing a connection through SSH, IAP validates the user and connects to a backend system database. That database is used to collect instance and networking information for GCE instances for validation that the instance exists. This database is updated in real-time, as GCE instances are added, changed, or removed.

Due to an issue with a global index associated with this backend system database, writes to the database were unable to keep up with instance updates from GCE and eventually caused the updates of GCE IAP information to lag. This meant that as new GCE instances were created, they were not being added to the database that IAP accessed. Therefore, IAP was unaware of the GCE instances and thus denied access to them when customers attempted to use SSH through the IAP Proxy.

Google engineers were initially notified of the issue through a support case on 1 December 2022 at 22:44 PT and immediately started an investigation. At 01:55 PT on 2 December, engineers determined the problem to be an internal system backend database used for instance and networking information for GCE, and the issue was escalated to our database team. Database engineers attempted several mitigations starting at 03:01 PT but experienced limited success.

At 11:01 PT, engineers redirected IAP traffic to a standby replica of the system database, which temporarily mitigated the issue at 13:38 PT. At 17:45 PT, the issue recurred, as an existing rollout mistakenly redirected traffic back to the primary database, overriding the non-standard configuration being used to mitigate the issue at the time. Engineers again redirected traffic to the database replica, which fully mitigated and resolved the issue at 20:50 PT.

Google is committed to preventing a repeat of this issue in the future and is completing the following actions:

- Make modifications to our alerting practices for IAP SSH to closely monitor the backlog of unprocessed instance creations and updates, enabling us to take corrective actions to mitigate any customer impact.
- Refactor our rollout vehicles to provide more fine-grained change deployment, so that unrelated configuration changes do not also deploy core service configuration or binaries for the IAP infrastructure. This will prevent the secondary issue recurrence, which occurred at 17:45 PT.
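
The failure mode described in this report (IAP denying SSH to instances whose records had not yet reached the backend database) and the planned alerting on the backlog of unprocessed updates can be illustrated with a toy model. The Python sketch below is not Google's implementation. The class `InstanceRegistry`, the functions `iap_allows_ssh` and `backlog_alert`, the tick-based write limit, and the threshold value are all hypothetical names and assumptions chosen only to make the mechanism concrete.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class InstanceRegistry:
    """Toy stand-in (hypothetical) for the backend database that IAP consults."""

    writes_per_tick: int  # assumed cap on how many queued updates are applied per tick
    known: set = field(default_factory=set)        # instances already visible to IAP
    pending: deque = field(default_factory=deque)  # updates from GCE not yet applied

    def submit(self, instance_id: str) -> None:
        """GCE reports a newly created instance; the write is queued, not yet visible."""
        self.pending.append(instance_id)

    def tick(self) -> None:
        """Apply at most writes_per_tick queued updates, oldest first."""
        for _ in range(min(self.writes_per_tick, len(self.pending))):
            self.known.add(self.pending.popleft())

    def backlog(self) -> int:
        """Number of instance updates still waiting to be written."""
        return len(self.pending)


def iap_allows_ssh(registry: InstanceRegistry, instance_id: str,
                   user_authorized: bool = True) -> bool:
    """Allow the tunnel only if the user is valid and the instance is known."""
    return user_authorized and instance_id in registry.known


def backlog_alert(registry: InstanceRegistry, threshold: int) -> bool:
    """Fire when the count of unprocessed updates exceeds the (assumed) threshold."""
    return registry.backlog() > threshold


if __name__ == "__main__":
    registry = InstanceRegistry(writes_per_tick=1)  # writes cannot keep up
    for i in range(5):
        registry.submit(f"vm-{i}")
    registry.tick()  # only vm-0 becomes visible

    print(iap_allows_ssh(registry, "vm-0"))      # True: record has been written
    print(iap_allows_ssh(registry, "vm-4"))      # False: instance exists but is unknown to IAP
    print(backlog_alert(registry, threshold=3))  # True: 4 updates are still pending
```

In this model the instance `vm-4` exists from the customer's point of view, yet the lookup fails because its record is still queued. A backlog metric of this kind surfaces the lag directly, which is the signal the first remediation item proposes to monitor.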