INT 21h

Hi, I am Vladimir Smagin, SysAdmin and Kaptain.

Another raid6 recovery story

No. 11223, posted in "Sysadmin" on February 14th, 2021

Ahh… Again… You think the affected server is just a backup of a backup server, but somehow this old-as-dinosaur-shit server contains part of production with no copy in git or anywhere else.

60000 power-on hours on each hard drive. Yeeeaah.

Load a rescue OS and check mdstat. Two disks are already dead, a third is failing, and the FS is already corrupted. Everything just the way we love it.

>$ cat /proc/mdstat 
md2 : active raid6 sda3[6] sdb3[5] sdd3[4] sdf3[3] sdg3[0]
      1073085440 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/5] [U__UUUU]
      bitmap: 2/2 pages [8KB], 65536KB chunk
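
To read that status line: each `_` in the member map is a missing device, and raid6 survives at most two of them. A minimal sketch for counting them, assuming you paste the map by hand (the function name is hypothetical, not part of mdadm):

```shell
# Count failed slots in an mdstat member map such as "[U__UUUU]".
missing_members() {
    printf '%s' "$1" | tr -cd '_' | wc -c | tr -d ' '
}

map='[U__UUUU]'
missing=$(missing_members "$map")

# raid6 carries two parity blocks per stripe, so two lost members is the limit.
if [ "$missing" -ge 2 ]; then
    echo "raid6 has no redundancy left: $missing members missing"
else
    echo "raid6 can still lose $((2 - missing)) more member(s)"
fi
```

With two members already gone, one more dead disk means the array is lost, which is why the failing third disk makes this urgent.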

You can try to replace the dead disks, reassemble the raid in place and fix the FS, but who knows, the result may be much worse than the current state. So I decided to create images of the partitions and work with those.

Copy the images over the network to a new server; if dd doesn't work, use ddrescue.

dd if=/dev/sda3 bs=1M | gzip | ssh root@new_server 'gzip -d | dd of=/sda3 bs=1M'
dd if=/dev/sdb3 bs=1M | gzip | ssh root@new_server 'gzip -d | dd of=/sdb3 bs=1M'
dd if=/dev/sdd3 bs=1M | gzip | ssh root@new_server 'gzip -d | dd of=/sdd3 bs=1M'
dd if=/dev/sdf3 bs=1M | gzip | ssh root@new_server 'gzip -d | dd of=/sdf3 bs=1M'
dd if=/dev/sdg3 bs=1M | gzip | ssh root@new_server 'gzip -d | dd of=/sdg3 bs=1M'
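
If dd gives up on read errors, ddrescue is the usual fallback. A rough sketch, assuming GNU ddrescue is installed in the rescue OS and that /mnt/images is a hypothetical mount point (local disk or network share) with enough free space. ddrescue wants a seekable output file, so it cannot simply replace dd in the pipes above. The .img and .map file names are also just illustrative.

```shell
# Preview by default; run with RUN= (set but empty) to execute for real.
RUN=${RUN-echo}

rescue_member() {
    part=$1      # partition name, e.g. sda3
    outdir=$2    # hypothetical directory that will hold the images
    # -d   : direct disc access for the input, bypassing the kernel cache
    # -r3  : retry bad sectors up to 3 times
    # The third argument is the map file that lets ddrescue resume later.
    $RUN ddrescue -d -r3 "/dev/$part" "$outdir/$part.img" "$outdir/$part.map"
}

for part in sda3 sdb3 sdd3 sdf3 sdg3; do
    rescue_member "$part" /mnt/images
done
```

Start with the disk that is about to die: every extra hour of reading may be its last.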

Now you have all the images needed to reassemble the array, so use losetup to present each image file as a block device, because mdadm does not work with image files directly.

losetup -P /dev/loop0 /sda3
losetup -P /dev/loop1 /sdb3
losetup -P /dev/loop2 /sdd3
losetup -P /dev/loop3 /sdf3
losetup -P /dev/loop4 /sdg3
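
If loop0..loop4 are already taken on the new server, you can let losetup pick free devices instead of hardcoding them. A small sketch, assuming the image paths from the dd commands above (the function name is hypothetical):

```shell
# Preview by default; run with RUN= (set but empty) to execute for real.
RUN=${RUN-echo}

attach_images() {
    for img in "$@"; do
        # -f uses the first free loop device, --show prints which one it was.
        $RUN losetup -f --show "$img"
    done
}

attach_images /sda3 /sdb3 /sdd3 /sdf3 /sdg3
```

When run for real, the printed /dev/loopN names are the ones to pass to mdadm.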

Now run cat /proc/mdstat

md127 : active raid6 loop4[6] loop3[5] loop2[4] loop1[3] loop0[0]
      1073085440 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/5] [U__UUUU]
      bitmap: 2/2 pages [8KB], 65536KB chunk

unused devices: <none>

If the raid is not assembled automatically, run mdadm -A /dev/md127 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4.

Yes, you assembled the raid from images! Let's try to fix the filesystem.
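
Before assembling by hand it can help to look at what each image thinks about the array. A hedged sketch, assuming the loop device names above; `--examine` prints the md superblock of a member and `--run` asks mdadm to start the array even though members are missing (the function names are hypothetical):

```shell
# Preview by default; run with RUN= (set but empty) to execute for real.
RUN=${RUN-echo}

# Show each member's md superblock so event counters and roles can be compared.
examine_members() {
    for dev in "$@"; do
        $RUN mdadm --examine "$dev"
    done
}

# Assemble the array; --run starts it even with members missing.
assemble_degraded() {
    md=$1
    shift
    $RUN mdadm --assemble --run "$md" "$@"
}

examine_members /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4
assemble_degraded /dev/md127 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4
```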

fsck -y /dev/md127

If fsck fails on the primary superblock, this is not the right time to give up.

Try another superblock: fsck -b 32768 -y /dev/md127

AAaaaand fsck can't write changes to the first superblock, lol. BUT! You can use a surviving backup superblock to mount the FS!
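
Where does 32768 come from? A sketch of the arithmetic, assuming an ext2/3/4 filesystem with 4 KiB blocks (32768 blocks per group) and the sparse_super feature, which keeps backup superblocks in group 1 and in groups that are powers of 3, 5 and 7. The function name is hypothetical. If you would rather ask the tools, `mke2fs -n /dev/md127` should list the same locations, but be very careful not to forget the `-n`.

```shell
# List candidate backup superblock locations, in filesystem blocks.
backup_superblocks() {
    bpg=$1        # blocks per group, 32768 for 4 KiB blocks
    max_group=$2  # highest block group number to consider
    list="1"
    for base in 3 5 7; do
        g=$base
        while [ "$g" -le "$max_group" ]; do
            list="$list $g"
            g=$((g * base))
        done
    done
    # Sort group numbers, then convert each to its first block.
    for g in $(printf '%s\n' $list | sort -n); do
        echo $((g * bpg))
    done
}

backup_superblocks 32768 30
```

Any of the printed numbers is a candidate for `fsck -b`, as long as the filesystem is big enough to contain that group.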

mount -o sb=131072 /dev/md127 /mnt
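
Note that the number is different from the one given to fsck: mount's sb= option counts in 1 KiB units, while fsck -b counts in filesystem blocks. A tiny sketch of the conversion, assuming a 4 KiB block size (the function name is hypothetical):

```shell
# Convert an fsck -b block number into the value mount's sb= option expects.
sb_for_mount() {
    fs_block=$1      # backup superblock in filesystem blocks, e.g. 32768
    block_size=$2    # filesystem block size in bytes, e.g. 4096
    echo $((fs_block * block_size / 1024))
}

sb_for_mount 32768 4096   # prints 131072
```

So block 32768 with 4 KiB blocks becomes sb=131072. Adding ro to the mount options is probably a good idea until the data is copied off.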
