SMEServer RAID Rebuild
* | Warning: |
Get it right or you will lose data. Take a backup first! Then let the RAID sync to completion; this can take quite a while. |
SME Server's RAID options are largely automated. If you built your system with a single hard disk, simply log on as admin and select Disk Redundancy to add a new drive to your RAID1 array. The same procedure applies if a disk in a RAID array has failed and you have replaced it.
Even the best-laid plans do not always work out, however, so the steps below describe how to do it manually.
See also: Hard Disk Partitioning and Raid#Resynchronising_a_Failed_RAID
HowTo: Manage/Check a RAID1 Array from the command Line
What is the Status of the Array
[root@ ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sdb2[2] sda2[0]
      488279488 blocks [2/1] [U_]
      [=>...................]  recovery =  6.3% (31179264/488279488) finish=91.3min speed=83358K/sec
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
unused devices: <none>
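When several arrays are present, it can help to pick out the degraded ones automatically. The following is a minimal sketch, assuming the mdstat layout shown above; the function name `degraded_arrays` is only illustrative. It reads the file passed as its first argument, or /proc/mdstat if none is given.

```shell
# List md arrays whose member-status field (e.g. [U_]) shows a missing member.
# $1: file to read (defaults to /proc/mdstat). The function name is hypothetical.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }        # remember the array being described
        # The "blocks" line ends with the status field, e.g. [UU] or [U_]
        /blocks/ && $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name }
    ' "${1:-/proc/mdstat}"
}

# Example (on a live system):
#   degraded_arrays
```

An array being rebuilt still counts as degraded here, because its status field keeps the underscore until the recovery has finished.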
HowTo: Reinstate a disk from the RAID1 Array with the command Line
Look at the mdstat
First we must determine which drive has failed.
[root@ ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
md2 : active raid1 sdb2[2](F) sda2[0]
      52323584 blocks [2/1] [U_]
unused devices: <none>
(S) = Spare, (F) = Failed, [0] = number of the disk
* | Note: |
As we can see, the partition sdb2 has failed; it carries the flag sdb2[2](F). We need to resynchronize the disk sdb with the existing array md2. |
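Spotting the (F) flag by eye is easy to get wrong on a busy system. As a rough sketch, assuming the mdstat format shown above, the hypothetical helper `failed_members` below prints each array together with any member marked as failed.

```shell
# Print "array member" for every member flagged (F) in mdstat.
# $1: file to read (defaults to /proc/mdstat). The function name is hypothetical.
failed_members() {
    awk '
        /^md[0-9]+ :/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /\[[0-9]+\]\(F\)$/) {   # e.g. sdb2[2](F)
                    dev = $i
                    sub(/\[.*/, "", dev)         # strip "[2](F)" -> sdb2
                    print $1, dev
                }
        }
    ' "${1:-/proc/mdstat}"
}

# Example (on a live system):
#   failed_members
```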
Fail and remove the disk, sdb in this case
[root@ ~]# mdadm --manage /dev/md2 --fail /dev/sdb2
mdadm: set /dev/sdb2 faulty in /dev/md2
[root@ ~]# mdadm --manage /dev/md2 --remove /dev/sdb2
mdadm: hot removed /dev/sdb2
[root@ ~]# mdadm --manage /dev/md1 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md1
[root@ ~]# mdadm --manage /dev/md1 --remove /dev/sdb1
mdadm: hot removed /dev/sdb1
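Because every partition of the disk has to be failed and removed from its own array, it may be safer to generate the command list first and review it before running anything. The sketch below only prints the commands; it assumes member names of the form disk plus partition number (sdb1, sdb2), and `plan_disk_removal` is a hypothetical name.

```shell
# Print (do NOT run) the mdadm commands that fail and remove every partition
# of one disk from the arrays it belongs to.
# $1: disk name without /dev/ (e.g. sdb); $2: mdstat file (default /proc/mdstat).
plan_disk_removal() {
    awk -v disk="$1" '
        /^md[0-9]+ :/ {
            for (i = 1; i <= NF; i++) {
                dev = $i
                sub(/\[.*/, "", dev)                  # sdb2[2](F) -> sdb2
                if (dev ~ ("^" disk "[0-9]+$")) {
                    printf "mdadm --manage /dev/%s --fail /dev/%s\n", $1, dev
                    printf "mdadm --manage /dev/%s --remove /dev/%s\n", $1, dev
                }
            }
        }
    ' "${2:-/proc/mdstat}"
}

# Example: review the plan for disk sdb before typing the commands by hand
#   plan_disk_removal sdb
```

Printing instead of executing is deliberate: a mistyped disk name then shows up as a wrong or empty plan rather than as a degraded array.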
Do your Disk Maintenance here
At this point the disk is idle.
[root@ ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]
md2 : active raid1 sda2[0]
      52323584 blocks [2/1] [U_]
unused devices: <none>
* | Note: |
You will have to determine whether your disk can be reinstated in the array. A RAID can get out of sync after a power failure, but also because of physical faults in the hard disk. If this occurs repeatedly, the hard disk needs to be tested. For this we will use smartctl. |
To display all the SMART details available for the disk:
[root@ ~]# smartctl -a /dev/sdb
At least two types of tests are possible, short (~ 1 min) and long (~ 10 min to 90 min).
[root@ ~]# smartctl -t short /dev/sdb #short test
[root@ ~]# smartctl -t long /dev/sdb #long test
To view the results and statistics for these tests:
[root@ ~]# smartctl -l selftest /dev/sdb
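The self-test log can be long, so a small filter that shows only the entries that did not finish cleanly may be useful. This is a sketch under assumptions: it expects the usual ATA log layout in which each entry starts with "# " and a number and a clean run reads "Completed without error"; other drive types may format the log differently. The name `selftest_problems` is hypothetical.

```shell
# Print self-test log entries that did not complete cleanly.
# Reads the output of "smartctl -l selftest" on standard input.
# Assumes entries look like "# 1  Short offline  Completed without error ...".
selftest_problems() {
    awk '/^# *[0-9]+/ && !/Completed without error/'
}

# Example (on a live system, as root):
#   smartctl -l selftest /dev/sdb | selftest_problems
```

An empty result means every logged test completed without error. A test that is still running is also listed, so wait for it to finish before drawing conclusions.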
For more information on how to activate and interpret Self-Monitoring, Analysis and Reporting Technology (SMART), see Monitor_Disk_Health.
* | Note: |
If you need to replace the disk because smartctl found a physical failure, install a new disk of the same capacity (or more) and enter the following commands to recreate the partitions by copying them from the healthy disk sda. |
[root@ ~]# sfdisk -d /dev/sda > sfdisk_sda.output
[root@ ~]# sfdisk /dev/sdb < sfdisk_sda.output
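Since the new disk must have the same capacity or more, a quick size comparison before copying the partition table can catch a disk that is slightly too small. The sketch below compares two sizes in bytes; on a live system they could come from `blockdev --getsize64`. The function name `disk_big_enough` is hypothetical.

```shell
# Succeed (exit status 0) when the replacement disk is at least as large
# as the source disk.
# $1: size of the source disk in bytes; $2: size of the new disk in bytes.
disk_big_enough() {
    [ "$2" -ge "$1" ]
}

# Example (on a live system, as root):
#   if disk_big_enough "$(blockdev --getsize64 /dev/sda)" \
#                      "$(blockdev --getsize64 /dev/sdb)"; then
#       echo "sdb is large enough"
#   else
#       echo "sdb is too small, do not copy the partition table"
#   fi
```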
GPT Disks
Larger disks will be GPT disks, on which sfdisk will not work. You will need to use gdisk and partx (parted) instead.
[root@ ~]# yum install gdisk
Then copy the partition table from a good disk to the new disk. The first command copies the partition table from disk sda to sdd; the second randomizes the GUIDs on sdd.
[root@ ~]# sgdisk /dev/sda -R /dev/sdd
[root@ ~]# sgdisk -G /dev/sdd
To view the partitions use partx
[root@ ~]# partx -l /dev/sdd
If you want to reinstate the same disk without replacing it, go to the next step.
Add the partitions back
[root@ ~]# mdadm --manage /dev/md1 --add /dev/sdb1
mdadm: hot added /dev/sdb1
[root@ ~]# mdadm --manage /dev/md2 --add /dev/sdb2
mdadm: hot added /dev/sdb2
Another Look at the mdstat
[root@sme8-64-dev ~]# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
md2 : active raid1 sdb2[2] sda2[0]
      52323584 blocks [2/1] [U_]
      [>....................]  recovery =  1.9% (1041600/52323584) finish=14.7min speed=57866K/sec
unused devices: <none>
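To follow the rebuild without reading the whole file each time, the progress figures can be extracted from the recovery line. This is a minimal sketch assuming the mdstat layout shown above; `resync_progress` is a hypothetical name.

```shell
# Print "array percent-done time-remaining" for every array being rebuilt.
# $1: file to read (defaults to /proc/mdstat). The function name is hypothetical.
resync_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }               # remember the current array
        /(recovery|resync) *=/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/)       pct = $i             # e.g. 1.9%
                if ($i ~ /^finish=/) eta = substr($i, 8)  # drop "finish="
            }
            print name, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}

# Example: check once a minute until nothing is printed any more
#   while [ -n "$(resync_progress)" ]; do resync_progress; sleep 60; done
```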
* | Note: |
With a new disk, it may be worthwhile to reinstall GRUB to avoid errors at startup. GRUB is the boot loader that launches the operating system. Enter the commands in the following section. |
HowTo: Write the GRUB boot sector
* | Warning: |
The dd command is nicknamed "data destroyer", so be extremely careful and make sure you have the right source and destination partition names. Skip the dd command (Step 1 below) at first and try to install GRUB without it (Step 2 below). If GRUB installs without dd, Step 1 is not needed. |
- 1. dd
[root@ ~]# dd if=/dev/sda1 of=/dev/sdb1
- 2. grub
[root@ ~]# grub
    GNU GRUB  version 0.95  (640K lower / 3072K upper memory)
 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename.]
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"...  16 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd0) (hd1)1+16 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
Done.
grub> quit
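The same GRUB commands can also be fed to the grub shell non-interactively, which avoids typing mistakes at the prompt. The sketch below only prints the command sequence used in the session above for a given device. Piping it into `grub --batch` assumes that your GRUB Legacy build supports the `--batch` option, and the helper name `grub_setup_commands` is hypothetical.

```shell
# Print the GRUB Legacy commands that install the boot sector on one disk,
# mirroring the interactive session: map the disk as (hd0), use its first
# partition as root, then run setup.
# $1: device node of the disk (e.g. /dev/sdb).
grub_setup_commands() {
    printf 'device (hd0) %s\n' "$1"
    printf 'root (hd0,0)\n'
    printf 'setup (hd0)\n'
    printf 'quit\n'
}

# Example (as root; assumes "grub --batch" is available on your system):
#   grub_setup_commands /dev/sdb | grub --batch
```

Review the printed commands before piping them into grub, since a wrong device name would write the boot sector to the wrong disk.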