Dear DigiServ Customers,

As many of you are aware by now, we experienced a prolonged outage on our local hosting network with MTN Business in Cape Town, at data centre number 3. I would like to personally apologize for this extended outage and reassure everyone that we have already taken steps to ensure that downtime is kept to a minimum in future.

Our day shift technicians and I worked throughout Thursday night to ensure that services were restored as soon as possible.

This outage comes shortly after a network interface controller failed on Wednesday evening, which required a data centre technician to replace the card.

In an effort to keep things in simple English, I don't want to confuse you with the technical terms and checklist procedures we followed to get services restored. I do, however, want to reassure everyone that all necessary steps and checks were done from our side to bring services back online as soon as possible.

During diagnostic testing we suffered a hard disk drive failure, which required us to order a replacement drive from the data centre. This request was actioned by the data centre technician within 2 hours; however, it was not long before we suffered a catastrophic failure of drive 2 on the RAID protected storage.

It took a further 2 hours for the data centre technician to install the new drives, and we wasted no time in reinstalling the operating system as well as the control panel software, cPanel. By late evening we were already ahead of schedule in bringing the server back online. In cases like this it normally takes a minimum of 24 hours to restore a server completely.

By 6AM on Friday morning services were back up and running and the server restore had completed successfully. Although we restored services in record time, the outage was still too long. This directly influenced my decision to bring our annual server and network maintenance program forward from December (it is generally performed over the holiday season) so that it is completed as soon as possible on our local South African network.

We suffered a very similar drive (RAID) failure in August on one of our dedicated servers with our data centre provider in Germany. At that time I planned to bring our annual maintenance program forward, and by late September we had already acquired the new servers and slowly started getting them up and running. During the past week we were doing final installations and planning our server migrations, which were due to start on Friday the 12th of October 2012 at 7PM.

Given the recent downtime on our local SA network, we are planning to replicate the upgrades we are performing on our German network on our local network here in South Africa with MTN Business. The scheduled maintenance program for our German servers is expected to complete next Monday morning at 4AM on 22 October 2012. We will then commence with upgrades on our local network with MTN Business in Cape Town by the end of this month, if not sooner.

One of the upgrades we will be implementing is RAID10 protected storage, which will give us the greatest protection against downtime and drive failures. We will, however, not be immune to drive failures and never will be. The hosting industry suffers hardware failures quite often and it's quite a common thing for us; however, the amount of downtime experienced depends on various factors. For us, moving over to RAID10 protected storage will allow the server to stay online even if 1 drive fails. We will be able to rebuild the RAID protection without having an extended outage such as the one on Thursday. Having a minimum of 4 hard disks in such a configuration allows us to drastically reduce the amount of downtime our customers experience in case of a drive failure. The extra benefit is that it will also increase reliability and performance.

We will also be upgrading other hardware on our servers, such as RAM and CPUs, to allow for future growth.

Unfortunately, our own domain was also affected by the recent outage, as we host the DNS on this server. Steps have already been taken to move the DNS and our domain over to a third party provider outside of the MTN Business network, to give customers a contact point in case of an outage. Customers will be able to stay in contact with us at all times, and we with them. We will also be able to update the network notice section on our website with updates regarding progress and further issues.

I would like to reassure customers that we are taking this matter extremely seriously. We are known for high uptime and reliability. The slightest amount of downtime concerns us, as it means our customers' businesses are down too. It is just as worrying for us, if not more so, because it directly affects our customers and their experience with us.

Our technicians and I have been up working since Wednesday and Thursday without any sleep up to this point. We are working extremely hard to ensure that the service we offer is of high value and meets your requirements.

I would also like to thank everyone for their patience and for the encouraging emails we received during this difficult outage.

Thank you for your time.


Rudi Burger
DigiServ Technologies CC - Director

Sunday, October 14, 2012
