Fun with broken harddisks

Today I needed to replace a faulty harddisk, which had a GPT, in a software RAID1. A GPT is a Guid Partition Table and is normally needed for partitions > 2TB. But wait, my external harddisk has 4TB and it uses an MBR (Master Boot Record)!?

In an MBR the partition size is stored in four bytes, which could have 0xFFFFFFFF as a maximum value. This would be 4294967295 in decimal. But the partition size is not given in bytes but in sectors. On Linux systems the sector size of an attached harddisk can be found in /sys/block/sd[X]/queue/hw_sector_size.

root@server:~ # cat /sys/block/sdd/queue/hw_sector_size

This is the normal sector size of a harddisk, so 4294967295 sectors of 512 bytes result in 2TB.

Luckily some external harddisks have a sector size of 4096 bytes.

root@server:~ # cat /sys/block/sda/queue/hw_sector_size

This results in a partition size of 16TB.

Anyway, my disk had a GPT and after installing the new harddisk, it had to get a copy of the GPT of the first one. This can be done with sgdisk, that is part of package gdisk on Debian systems. So after doing apt-get install gdisk one can:

sgdisk --replicate=/dev/sdb /dev/sda

In this case /dev/sda is the source disk and /dev/sdb is the new one.

You can see the GPT with:

sgdisk -p /dev/sda
sgdisk -p /dev/sdb

Due to the cloning, both disks have the same GUID and to avoid hassle, the new one needs a new GUID. This is done with:

sgdisk -G /dev/sdb

The structure of the software raid can be seen in /proc/mdstat. In my case I have three md devices: md0, md1 and md2
On my system md0 currently has only one active member /dev/sda2. So /dev/sdb2 has to be added:

mdadm /dev/md0 --manage --add /dev/sdb2

As this is just a small partition, it took only a few seconds and syslog showed:

[ 5881.551829] md: bind
[ 5881.581014] RAID1 conf printout:
[ 5881.581020] --- wd:1 rd:2
[ 5881.581026] disk 0, wo:0, o:1, dev:sda2
[ 5881.581030] disk 1, wo:1, o:1, dev:sdb2
[ 5881.581174] md: recovery of RAID array md0
[ 5881.581180] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 5881.581186] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[ 5881.581195] md: using 128k window, over a total of 499988k.
[ 5889.511049] md: md0: recovery done.
[ 5889.614014] RAID1 conf printout:
[ 5889.614020] --- wd:2 rd:2
[ 5889.614026] disk 0, wo:0, o:1, dev:sda2
[ 5889.614031] disk 1, wo:0, o:1, dev:sdb2

The same needs to be done for the other partitions:
mdadm /dev/md1 --manage --add /dev/sdb3
mdadm /dev/md2 --manage --add /dev/sdb4

They are way bigger and recovery of the RAID lasts a bit longer. But finally everything is done and nagios switches back from red to green. Mission accomplished!