Goal in sight: better uptime

TL;DR: We’re planning to host our CDN and an InterMine registry on AWS in the future. For more specifics, read on…

Over the course of 2015, we had several noteworthy service outages.

First of all, there was a downpour in Cambridge that caused massive flooding, which took out the university’s network connection. The server room actually had to be pumped out. Yikes.

Next, the machine room had a broken chiller – this was so severe that the servers began to heat up almost immediately and we had to go into shutdown straight away whilst repair people worked on it. Services came back up the next morning, but only just in time for our training workshop.

Then came the big, nasty DDoS attack against Janet, a higher education network which spans the UK. From the university, we could access some websites, but many more were down. Crucially, no one could reach our servers from the outside, even though this time they were running.

Apart from taking down HumanMine, FlyMine, and our websites, these outages also took down our CDN, meaning that anyone who hosted their own InterMine but relied upon us for their CDN was affected too.

Amazon Web Services to the rescue

We’re always working our hardest to make sure things run as smoothly as possible, but sometimes things are beyond our control. This number of outages is obviously less than ideal, and we’d like to do better in 2016.

One solution is to set up your own InterMine CDN, but could we do something to help you on our end, something that doesn’t require extra work for others? After a bit of thinking, we realised we probably could.

So as part of a plan to offer a more reliable service, our CDN will be migrated to live on AWS in the next few months. AWS has a pretty darned good uptime track record. What’s more, Amazon serves a massive chunk of the web – so if AWS is ever down, the likelihood is that much of the web will be down as well, and the outage won’t be surprising to any potential users.

An InterMine registry

Another feature that’s been in the works for a while is an InterMine registry. Like the CDN, it’s pretty important that a dynamic, discoverable registry have near-perfect uptime. So hopefully you can expect to see the hardcoded mine lists disappearing as we start to take advantage of a hosting service with high availability.

Watch this space for the official announcement and release dates for the CDN upgrade and registry release!

A final note about outages

While email was okay when Janet was playing up, the machine room breakdowns took out our mail server, and hence the mailing list – so the “Hey guys, is your server down?” queries didn’t reach us until after everything had recovered.

If you’re aware of a problem, the best place to find out more will be our Twitter account, @intermineorg. We’ll always make an effort to update our Twitter feed – by mobile network if necessary – with details of outages, whether planned or emergency.

Featured image courtesy of Torkild Retvedt via Flickr. Licence CC-BY-SA 2.0