Bug#287415: mdadm: degraded md devices not assembled on boot

Mau mavog at hotmail.com
Mon Jul 24 20:38:08 UTC 2006


martin f krafft wrote:
> [...]
> To debug this, I need more output. Specifically, I need to know
> pretty much exactly what mdadm says during initramfs. To make your
> life easier, try the following:
> 
>   boot with "break=mount" appended to the kernel command line
>   when a shell appears, run the following (the leading dot is
>   needed):
>     . conf/initramfs.conf
>     . scripts/functions
>     scripts/local-top/udev_helper
>     scripts/local-top/md
> 
> I hope I am remembering this correctly. Try it and let me know.

I tried to reproduce the issue by disconnecting /dev/sdb and setting up
mdadm to assemble only the root partition /dev/md0 in the initrd. As I
wrote, everything works fine in this phase: /dev/md0 is assembled.

The only message I get from scripts/local-top/mdadm is:

mdadm: /dev/md0 has been started with 1 drive (out of 2)

When the boot process continues this is what happens:

[...]
Loading modules...
	bcm5700
	ide-cd
	ide-disk
	ide-generic
	psmouse
	ahci
	sd_mod
	raid1
All modules loaded.
Setting the system clock again..
System clock set. Local time [...]
Loading device-mapper support.
Assembling RAID array md0...done (already running).
Will now check all file systems.
fsck 1.39 (29-May-2006)
Checking all file systems.
[/sbin/fsck.ext3 (1) -- /tmp] fsck.ext3 -a -C0 /dev/md2
fsck.ext3: Invalid argument while trying to open /dev/md2
/dev/md2:
The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
	e2fsck -b 8193 <device>

[/sbin/fsck.ext3 (1) -- /var] fsck.ext3 -a -C0 /dev/md4
fsck.ext3: Invalid argument while trying to open /dev/md4

[ ... same as /dev/md2, and the same is repeated also for md5 ... ]

fsck died with exit status 8
* File system check failed.
A log is being saved in /var/log/fsck/checkfs if that location is writable.
Please repair file system manually.
A maintenance shell will now be started.
CONTROL-D will terminate this shell and resume system boot.
Give root password for maintenance
(or type Control-D to continue):

[ ... I give the root password, then ... ]

root@(none):~# /etc/init.d/mdadm-raid start
Assembling RAID array md0...done (already running).
Assembling RAID array md2...failed (no devices found).
Assembling RAID array md4...failed (no devices found).
Assembling RAID array md5...failed (no devices found).
root@(none):~#

[ ... and md1? It's a swap partition; trying something else ... ]

root@(none):~# mdadm -Av /dev/md2
mdadm: looking for devices for /dev/md2
mdadm: /dev/sda6 had wrong uuid.
mdadm: /dev/sda5 had wrong uuid.
mdadm: no recogniseable superblock on /dev/sda4
mdadm: /dev/sda4 had wrong uuid.
mdadm: cannot open device /dev/sda2: Device or resource busy
mdadm: /dev/sda2 had wrong uuid.
mdadm: cannot open device /dev/sda1: Device or resource busy
mdadm: /dev/sda1 had wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda had wrong uuid.
mdadm: /dev/sda3 is identified as a member of /dev/md2, slot 0.
mdadm: no uptodate device for slot 1 of /dev/md2
mdadm: added /dev/sda3 to /dev/md2 as 0
mdadm: /dev/md2 has been started with 1 drive (out of 2).
root@(none):~#

[ ... hmmm... wanna see more ... ]

root@(none):~# mount
/dev/md0 on / type ext3 (rw,noatime,errors=remount-ro)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
usbfs on /proc/bus/usb type usbfs (rw)
tmpfs on /dev/shm type tmpfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
root@(none):~# lsmod
Module                  Size  Used by
dm_mod                 49976  0
ahci                   14468  0
ide_disk               15072  0
8250_pnp                8704  0
evdev                   9088  0
mousedev               10788  0
tsdev                   7392  0
8250_pci               19840  0
pcspkr                  3040  0
serio_raw               6596  0
psmouse                34600  0
shpchp                 34272  0
pci_hotplug            27196  1 shpchp
parport_pc             32132  0
parport                33160  1 parport_pc
bcm5700               133196  0
ext3                  118152  1
jbd                    50260  1 ext3
mbcache                 8324  1 ext3
ide_generic             1376  0 [permanent]
sd_mod                 18592  4
ide_cd                 35680  0
cdrom                  32448  1 ide_cd
usbhid                 35520  0
ata_piix               11556  3
piix                    9476  0 [permanent]
libata                 61420  2 ahci,ata_piix
scsi_mod              123080  3 ahci,sd_mod,libata
ehci_hcd               28008  0
generic                 4420  0 [permanent]
ide_core              110888  5 ide_disk,ide_generic,ide_cd,piix,generic
uhci_hcd               20392  0
usbcore               111616  4 usbhid,ehci_hcd,uhci_hcd
thermal                12904  0
processor              25512  1 thermal
fan                     4516  0
raid1                  20480  2
md_mod                 68884  2 raid1
root@(none):~#

[ ... let's try to assemble md1, the swap device ... ]


root@(none):~# mdadm -Av /dev/md1
mdadm: looking for devices for /dev/md1
mdadm: /dev/sda6 had wrong uuid.
mdadm: /dev/sda5 had wrong uuid.
mdadm: no recogniseable superblock on /dev/sda4
mdadm: /dev/sda4 had wrong uuid.
mdadm: cannot open device /dev/sda3: Device or resource busy
mdadm: /dev/sda3 had wrong uuid.
mdadm: cannot open device /dev/sda1: Device or resource busy
mdadm: /dev/sda1 had wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda had wrong uuid.
mdadm: /dev/sda2 is identified as a member of /dev/md1, slot 0.
mdadm: no uptodate device for slot 1 of /dev/md1
mdadm: added /dev/sda2 to /dev/md1 as 0
mdadm: /dev/md1 has been started with 1 drive (out of 2).
root@(none):~#

Ok, let's reboot with both drives...

I'm wondering (please see my /etc/mdadm/mdadm.conf in my first message):

- why can't mdadm-raid assemble the arrays?
- why can't mdadm-raid even find /dev/md1?
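
One way to narrow these questions down is to compare the arrays declared
in mdadm.conf with the arrays the kernel reports as running. The
following is only a minimal sketch: the function name missing_arrays is
made up for illustration, and it assumes the usual "ARRAY /dev/mdN ..."
lines in mdadm.conf and the usual "mdN : active ..." lines in
/proc/mdstat.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm): print every array declared in
# an mdadm.conf-style file that is not listed as active in an
# mdstat-style file. Both paths can be overridden for testing.
missing_arrays() {
    conf=${1:-/etc/mdadm/mdadm.conf}
    mdstat=${2:-/proc/mdstat}
    # The second field of an ARRAY line is the device node, e.g. /dev/md2.
    awk '$1 == "ARRAY" { print $2 }' "$conf" | while read -r dev; do
        name=${dev##*/}    # strip the directory part: /dev/md2 -> md2
        # Running arrays appear in mdstat as "md2 : active raid1 ...".
        grep -q "^$name : active" "$mdstat" || echo "$dev"
    done
}
```

Calling missing_arrays with no arguments from the maintenance shell
would list the devices that are configured but not running. An array
that has no ARRAY line at all will not appear in the output.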

I also noticed that /etc/init.d/mdadm-raid sets
RUNDIR=/var/run/mdadm
AUTOSTARTED_DEVICES=$RUNDIR/autostarted-devices
but in my case /var is still unmounted when mdadm-raid is called, so once
the box is up, no /var/run/mdadm/autostarted-devices will be found.
I really can't tell whether this is sane.
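
To illustrate the concern: on this box /var lives on its own array
(/dev/md4), so anything written under /var/run before /var is mounted
lands on the root filesystem and is hidden once the real /var is mounted
over it. A minimal sketch of a guard follows. The function names and the
fallback directory /dev/.mdadm are assumptions made for illustration,
not what the init script actually does.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 is listed as a mount point in the
# mounts table $2 (defaults to /proc/mounts).
is_mounted() {
    target=$1
    table=${2:-/proc/mounts}
    # Each line reads: device mountpoint fstype options dump pass
    while read -r _dev mnt _rest; do
        [ "$mnt" = "$target" ] && return 0
    done < "$table"
    return 1
}

# Hypothetical helper: print a state directory that stays visible for
# the rest of the boot. The fallback path is an assumption.
pick_rundir() {
    if is_mounted /var "${1:-/proc/mounts}"; then
        echo /var/run/mdadm
    else
        echo /dev/.mdadm
    fi
}
```

With something like RUNDIR=$(pick_rundir), the list of autostarted
devices would go to a location that is not shadowed when /var is mounted
later.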

Thank you!

Mau



