Unplanned Downtime: QMetry Test Management Cloud Not Accessible, 19-20 Oct 2021

Unplanned Downtime: QMetry Test Management Cloud was not accessible and returned the error message “502 Bad Gateway” from 19th October 2021, 11:40 PM to 20th October 2021, 3:58 AM PST. This downtime affected all QMetry Test Management Cloud customers.

Users received the error “502 Bad Gateway” while accessing the QMetry URL.

All updates regarding this downtime will be posted on this page.

Day

Time (PST)

Event Details

20th October

03:58 AM

  • QMetry services have been restored.

20th October

01:30 AM

  • Our teams continue to investigate this issue. More updates will follow soon.

19th October

11:40 PM

  • QMetry Test Management returns a 502 Bad Gateway error on accessing the URL.

  • QMetry teams are working with high priority to restore the services to normal. Based on the initial analysis, QMetry services should be restored soon.

 

Root Cause (RCA)

The incident was caused by high memory utilization, triggered by frequent resource-intensive imports of automation test result files through the Automation API, which resulted in the instance failure.

Fix

The required architecture-level changes have been made on the QMetry server to fix this issue, and additional alerts have been set up for any future incidents.