Connectivity issues in Oslo
Incident Report for Servebolt
Postmortem

On Saturday, February 15th, at approximately 11:43 CET, we experienced a network outage at our Oslo data center. This disrupted service for all websites hosted in this data center.

With any significant event that affects our customers, we conduct an extensive examination to understand the root cause and develop a course of action to improve our systems and procedures. To that end, we want to provide a synopsis of what occurred, along with our reassurance that we are working diligently to mitigate and prevent future outages.

Here's what happened

The main router for the Oslo data center failed, so no network traffic could be routed to the servers. Our systems are built with integrated redundancy, but unfortunately, in this case, the failover system didn't kick in, which resulted in the outage.

We replaced the router at 13:10, which brought part of the network back online. The rest followed at 13:34, at which point we were back at full capacity.

Here's what we're doing

We are in the process of modifying the current network architecture to prevent or reduce the impact of any device failure by improving our monitoring and failover triggering.

We are also investing in router upgrades to support more robust redundancy throughout our entire infrastructure.

Outages disrupt your life and your business. We understand and we take our responsibility to you very seriously.

Please allow me to take this opportunity to thank you for your business and provide my personal assurance that we are dedicated to meeting our commitment to you.

Sincerely,

Erlend Eide
CEO
Servebolt.com

Posted Feb 17, 2020 - 14:21 CET

Resolved
This incident has been resolved.
Posted Feb 15, 2020 - 15:48 CET
Update
All servers are back online, and we are monitoring the fix.
Posted Feb 15, 2020 - 13:36 CET
Update
We are still working on getting the last servers back online.
Posted Feb 15, 2020 - 13:32 CET
Monitoring
Servers are coming back up again now.
Some servers are already up and running normally, and some will come back online during the next few minutes.
Posted Feb 15, 2020 - 13:12 CET
Identified
We have identified a possible issue and will start implementing the fix.
We do not have an ETA currently but will update as soon as we have a timeframe for the fix.
Posted Feb 15, 2020 - 12:49 CET
Update
We are still investigating, and have also sent an engineer to the location to investigate on-site.
Posted Feb 15, 2020 - 12:18 CET
Update
We are continuing to investigate this issue.
Posted Feb 15, 2020 - 11:48 CET
Investigating
We are currently investigating this issue.
Posted Feb 15, 2020 - 11:47 CET
This incident affected: Servebolt Cloud OSL.