At 06:25 UTC on September 28, 2022, we began receiving notifications of users seeing an error page when accessing their MURAL workspaces. Initial investigation indicated that the cause was an issue with our cloud provider. Shortly afterward, our cloud provider confirmed that routine server maintenance had unexpectedly affected the availability of data hosted in one of their facilities. At 06:55 UTC, MURAL implemented a workaround to re-route traffic and restore service via a secondary server. At 07:25 UTC, our cloud provider confirmed that full service had been restored on the primary server, and we were able to revert the earlier change.
This incident impacted service availability from 06:25 UTC to 06:55 UTC, for a total of 30 minutes of downtime. No data was lost; however, MURAL was not accessible during this time.
What we've done to avoid this happening again
We have improved our automated monitoring and notifications to cover this particular scenario, so that we can restore service faster if it occurs again.