Bug#287415: mdadm: degraded md devices not assembled on boot
Mau
mavog at hotmail.com
Mon Jul 24 20:38:08 UTC 2006
martin f krafft wrote:
> [...]
> To debug this, I need more output. Specifically, I need to know
> pretty much exactly what mdadm says during initramfs. To make your
> life easier, try the following:
>
> boot with "break=mount" appended to the kernel command line
> when a shell appears, run the following (the leading dot is
> needed):
> . conf/initramfs.conf
> . scripts/functions
> scripts/local-top/udev_helper
> scripts/local-top/md
>
> I hope I am remembering this correctly. Try it and let me know.
I tried to reproduce the issue by disconnecting /dev/sdb and setting up
mdadm to assemble only the root partition /dev/md0 in the initrd. As I
wrote, everything works fine in this phase: /dev/md0 is assembled.
The only message I get from scripts/local-top/mdadm is:
mdadm: /dev/md0 has been started with 1 drive (out of 2)
When the boot process continues this is what happens:
[...]
Loading modules...
bcm5700
ide-cd
ide-disk
ide-generic
psmouse
ahci
sd_mod
raid1
All modules loaded.
Setting the system clock again..
System clock set. Local time [...]
Loading device-mapper support.
Assembling RAID array md0...done (already running).
Will now check all file systems.
fsck 1.39 (29-May-2006)
Checking all file systems.
[/sbin/fsck.ext3 (1) -- /tmp] fsck.ext3 -a -C0 /dev/md2
fsck.ext3: Invalid argument while trying to open /dev/md2
/dev/md2:
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
[/sbin/fsck.ext3 (1) -- /var] fsck.ext3 -a -C0 /dev/md4
fsck.ext3: Invalid argument while trying to open /dev/md4
[ ... same as /dev/md2, and the same is repeated also for md5 ... ]
fsck died with exit status 8
* File system check failed.
A log is being saved in /var/log/fsck/checkfs if that location is writable.
Please repair file system manually.
A maintenance shell will now be started.
CONTROL-D will terminate this shell and resume system boot.
Give root password for maintenance
(or type Control-D to continue):
[ ... I give the root password, then ... ]
root@(none):~# /etc/init.d/mdadm-raid start
Assembling RAID array md0...done (already running).
Assembling RAID array md2...failed (no devices found).
Assembling RAID array md4...failed (no devices found).
Assembling RAID array md5...failed (no devices found).
root@(none):~#
[ ... and md1? it's a swap partition; trying something else ... ]
root@(none):~# mdadm -Av /dev/md2
mdadm: looking for devices for /dev/md2
mdadm: /dev/sda6 had wrong uuid.
mdadm: /dev/sda5 had wrong uuid.
mdadm: no recogniseable superblock on /dev/sda4
mdadm: /dev/sda4 had wrong uuid.
mdadm: cannot open device /dev/sda2: Device or resource busy
mdadm: /dev/sda2 had wrong uuid.
mdadm: cannot open device /dev/sda1: Device or resource busy
mdadm: /dev/sda1 had wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda had wrong uuid.
mdadm: /dev/sda3 is identified as a member of /dev/md2, slot 0.
mdadm: no uptodate device for slot 1 of /dev/md2
mdadm: added /dev/sda3 to /dev/md2 as 0
mdadm: /dev/md2 has been started with 1 drive (out of 2).
root@(none):~#
[ ... hmmm... wanna see more ... ]
root@(none):~# mount
/dev/md0 on / type ext3 (rw,noatime,errors=remount-ro)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
usbfs on /proc/bus/usb type usbfs (rw)
tmpfs on /dev/shm type tmpfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
root@(none):~# lsmod
Module Size Used by
dm_mod 49976 0
ahci 14468 0
ide_disk 15072 0
8250_pnp 8704 0
evdev 9088 0
mousedev 10788 0
tsdev 7392 0
8250_pci 19840 0
pcspkr 3040 0
serio_raw 6596 0
psmouse 34600 0
shpchp 34272 0
pci_hotplug 27196 1 shpchp
parport_pc 32132 0
parport 33160 1 parport_pc
bcm5700 133196 0
ext3 118152 1
jbd 50260 1 ext3
mbcache 8324 1 ext3
ide_generic 1376 0 [permanent]
sd_mod 18592 4
ide_cd 35680 0
cdrom 32448 1 ide_cd
usbhid 35520 0
ata_piix 11556 3
piix 9476 0 [permanent]
libata 61420 2 ahci,ata_piix
scsi_mod 123080 3 ahci,sd_mod,libata
ehci_hcd 28008 0
generic 4420 0 [permanent]
ide_core 110888 5 ide_disk,ide_generic,ide_cd,piix,generic
uhci_hcd 20392 0
usbcore 111616 4 usbhid,ehci_hcd,uhci_hcd
thermal 12904 0
processor 25512 1 thermal
fan 4516 0
raid1 20480 2
md_mod 68884 2 raid1
root@(none):~#
[ ... let's try to assemble md1, the swap device ... ]
root@(none):~# mdadm -Av /dev/md1
mdadm: looking for devices for /dev/md1
mdadm: /dev/sda6 had wrong uuid.
mdadm: /dev/sda5 had wrong uuid.
mdadm: no recogniseable superblock on /dev/sda4
mdadm: /dev/sda4 had wrong uuid.
mdadm: cannot open device /dev/sda3: Device or resource busy
mdadm: /dev/sda3 had wrong uuid.
mdadm: cannot open device /dev/sda1: Device or resource busy
mdadm: /dev/sda1 had wrong uuid.
mdadm: cannot open device /dev/sda: Device or resource busy
mdadm: /dev/sda had wrong uuid.
mdadm: /dev/sda2 is identified as a member of /dev/md1, slot 0.
mdadm: no uptodate device for slot 1 of /dev/md1
mdadm: added /dev/sda2 to /dev/md1 as 0
mdadm: /dev/md1 has been started with 1 drive (out of 2).
root@(none):~#
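
To see at a glance which arrays are running degraded after steps like the ones above, /proc/mdstat can be scanned. This is only a minimal sketch: degraded_arrays is a hypothetical helper (not part of mdadm), and it assumes the usual status layout in which the line after each "mdN :" line carries a "[total/active]" counter such as "[2/1]".

```shell
# Hypothetical helper, not part of mdadm.
# Print the name of every md array whose "[total/active]" counter shows
# fewer active members than the array expects.
# The optional argument defaults to /proc/mdstat; it exists only so the
# function can be tried against a saved copy of that file.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }              # remember the current array
        match($0, /\[[0-9]+\/[0-9]+\]/) {       # e.g. "[2/1]"
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument reads the live /proc/mdstat.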
Ok, let's reboot with both drives...
I'm wondering (please see my /etc/mdadm/mdadm.conf in my first message):
- why can't mdadm-raid assemble the arrays?
- why can't mdadm-raid even find /dev/md1?
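
One way to narrow these questions down might be to compare the ARRAY entries in mdadm.conf with what the superblocks on disk report. A minimal sketch, assuming each ARRAY entry sits on a single line (continuation lines are not handled); array_uuids is a hypothetical helper, not something shipped with mdadm:

```shell
# Hypothetical helper, not part of mdadm.
# Read mdadm.conf-style text on stdin and print "device uuid" for each
# ARRAY line, sorted. A "-" is printed when the line carries no UUID.
array_uuids() {
    awk '$1 == "ARRAY" {
        uuid = "-"
        for (i = 3; i <= NF; i++)
            if (tolower($i) ~ /^uuid=/) uuid = substr($i, 6)
        print $2, uuid
    }' | sort
}

# Possible use on the affected box (needs root; not run here):
#   array_uuids < /etc/mdadm/mdadm.conf > /tmp/declared
#   mdadm --examine --scan | array_uuids > /tmp/on-disk
#   diff /tmp/declared /tmp/on-disk
```

An empty diff would suggest the config and the superblocks agree, so the cause would have to be looked for elsewhere (for example in which devices are scanned).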
I also noticed that in /etc/init.d/mdadm-raid there is
RUNDIR=/var/run/mdadm
AUTOSTARTED_DEVICES=$RUNDIR/autostarted-devices
but in my case /var is not yet mounted when mdadm-raid is called, so once
the box is up, no /var/run/mdadm/autostarted-devices will be found.
I really can't figure out if this is sane.
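
To illustrate the concern about RUNDIR: a script could check whether /var is actually a mount point before relying on anything under /var/run. The following is only a sketch of such a check, not what mdadm-raid does; is_mounted is a hypothetical helper that looks the directory up in a /proc/mounts-style table.

```shell
# Hypothetical helper, not taken from /etc/init.d/mdadm-raid.
# Succeed if DIR ($1) is listed as a mount point. The table to read
# defaults to /proc/mounts; the second argument exists only so the
# function can be tried against sample data.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Possible use (not run here):
#   if is_mounted /var; then
#       echo "/var is mounted, /var/run/mdadm is usable"
#   else
#       echo "/var is not mounted yet, state written now may be hidden later"
#   fi
```

On a setup like mine, where /var lives on its own md device, the check would fail at the point where mdadm-raid runs.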
Thank you!
Mau