I take care of several dedicated servers hosted at different providers. As these servers run 24/7 and do a lot of disk reads and writes, from time to time a disk fails and has to be replaced. As these servers have RAIDs, this is no problem. By coincidence, three disks at three different providers failed within a short time, and this is the story of their replacement:
- Server4You: I informed the support of the bad drive and asked what I need to do for a replacement. After a short time I was told to show part of the syslog, note the serial number of the faulty device and tell them when the server might be switched off (the drives are not hot pluggable). At the given time Nagios complained about a missing host. About 15 minutes later everything was fine again and the RAID was syncing.
downtime of host: 15min, total working time spent: 20min, only two people involved
Great service!
- Hetzner: I informed the support of the bad drive and asked what I need to do for a replacement. After a short time I was told to show part of the syslog, note the serial number of the faulty device and tell them when the server might be switched off (the drives are not hot pluggable). At the given time Nagios complained about a missing host. About 15 minutes later everything was fine again and the RAID was syncing.
downtime of host: 15min, total working time spent: 20min, only two people involved
Great service! (Both are really almost identical.)
- Strato: I informed the support of the bad drive and asked what I need to do for a replacement. After a short time employee1 told me to show part of the syslog and note the serial number of the faulty device. In response to that data, employee2 told me that it is not possible to replace a single disk of the RAID. Instead, the complete server(!!) needs to be replaced. I asked whether he was joking, but he confirmed that the answer of employee1 was wrong. I would really have to click here and there on the customer service webpage to request a new installation of the server and activate a checkbox to request the exchange of the hardware.
OK, after thinking about my options I returned to the webpage and wanted to activate that checkbox. It was gone! My next email was answered by employee1: she was very sorry, but she could not answer my email because I had sent it from an unauthorized address. By the way, it was the same address that I had used before and that employee1 had already replied to!
Anyway, maybe their web interface could be used to send authenticated emails. Indeed, I got an answer from employee3 saying that I needed to run a hardware test to get my checkbox back. There are two versions, one lasting 2 hours and the other lasting up to 12 hours. During that time the server is not reachable. OK, I needed that checkbox, so I started the test. The next morning I was told that everything was fine with the hardware. Strangely enough, the checkbox had appeared again. So I was finally able to use the new hardware and start installing the new system.
downtime of host: about 12 hours, total working time spent: 6 hours, four people involved
Maybe there are good reasons for such a procedure. From the customer's point of view this is a total disaster. I think you can guess which provider I will not rent my next servers from.