Unplanned Downtime: QMetry Test Management Cloud Not Accessible (for SAML users) on 11th March 2024

Unplanned Downtime: QMetry Test Management Cloud was not accessible to SAML users, who received the error message “503 Service Temporarily Unavailable”, on March 11th, 2024 from 02.00 AM PST to 02.20 AM PST. This downtime affected all QMetry Test Management Cloud customers who log in using SAML.

Users received the error below when following QMetry > Login with SAML > Enter “org code” > Submit.

image-20240311-091920.png

All events related to this downtime are recorded on this page.

Day

Time (PST)

Event Details

11th March

02.20 AM

  • QMetry services are restored. The root cause analysis and applicable preventive actions will be shared soon.

11th March

02.00 AM

  • QMetry Test Management returns the error “503 Service Temporarily Unavailable” when users access the QMetry URL > Login with SAML.

  • QMetry teams are working on this as a high priority to restore the services. Based on the initial analysis, QMetry services should be restored soon.

Root Cause Analysis

The Cloud instance was de-registered from the target SAML group because a configuration step was missed, through user error, during the previous deployment.

Preventive Action

An auto-scaling group has now been created. It automatically adds new instances to the target SAML group whenever a failure or de-registration is observed.