Bug#496334: mdadm segfault on --assemble --force with raid10
Török Edwin
edwintorok at gmail.com
Sun Aug 24 14:57:24 UTC 2008
Package: mdadm
Version: 2.6.7-3
Severity: critical
Justification: breaks the whole system
My raid10 array got marked faulty, and mdadm refused to assemble it
(it said it only got 2 devices but needs 6).
I tried to run mdadm --assemble --force /dev/md4, but I got a segfault.
I couldn't reassemble the array using mdadm 2.6.7 in any way.
However, I found a workaround:
I booted my laptop, downloaded the latest mdadm git from
git://neil.brown.name/mdadm, and compiled it (the laptop is 32-bit), then
transferred the binary to my 64-bit box using netcat:
On the 64-bit box with the failed raid array: nc -l -p 1000 >x
On the laptop: nc 192.168.0.199 1000 <mdadm
Then chmod +x x; ./x --assemble --force /dev/md4, and it WORKED!
I have the array up now, and a resync is in progress.
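For reference, here is the whole workaround condensed into commands. This is
only a sketch of what I did: the IP address and port are specific to my
network, and the build step is shown as a plain make, which is how mdadm
normally builds from source.

  # On the 32-bit laptop: fetch and build the latest upstream mdadm
  git clone git://neil.brown.name/mdadm
  cd mdadm && make

  # On the 64-bit box with the failed array: wait for the binary
  nc -l -p 1000 > x

  # On the laptop: send the freshly built binary to the box
  nc 192.168.0.199 1000 < mdadm

  # Back on the 64-bit box: make it executable and force-assemble
  chmod +x x
  ./x --assemble --force /dev/md4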
dmesg:
[ 1306.867850] md: bind<sdb3>
[ 1306.868912] md: bind<sdd3>
[ 1306.871850] md: bind<sde3>
[ 1306.871850] md: bind<sdc3>
[ 1306.867850] md: bind<sdf3>
[ 1306.871850] md: bind<sda3>
[ 1322.306694] md: kicking non-fresh sdc3 from array!
[ 1322.306694] md: unbind<sdc3>
[ 1322.306694] md: export_rdev(sdc3)
[ 1322.306694] md: kicking non-fresh sde3 from array!
[ 1322.306694] md: unbind<sde3>
[ 1322.306694] md: export_rdev(sde3)
[ 1322.306694] md: kicking non-fresh sdd3 from array!
[ 1322.306694] md: unbind<sdd3>
[ 1322.306694] md: export_rdev(sdd3)
[ 1322.306694] md: kicking non-fresh sdb3 from array!
[ 1322.306694] md: unbind<sdb3>
[ 1322.306694] md: export_rdev(sdb3)
[ 1322.306694] md: md4: raid array is not clean -- starting background reconstruction
[ 1322.838707] raid10: not enough operational mirrors for md4
[ 1322.838707] md: pers->run() failed ...
[ 1410.510714] md: md4 stopped.
[ 1410.510714] md: unbind<sda3>
[ 1410.510714] md: export_rdev(sda3)
[ 1410.518705] md: unbind<sdf3>
[ 1410.518705] md: export_rdev(sdf3)
[ 1623.820925] md: md4 stopped.
[ 1624.015384] md: bind<sdb3>
[ 1624.015384] md: bind<sdd3>
[ 1624.015511] md: bind<sde3>
[ 1624.015632] md: bind<sdc3>
[ 1624.016165] md: bind<sdf3>
[ 1624.016165] md: bind<sda3>
[ 1866.359643] md: md4 stopped.
[ 1866.359643] md: unbind<sda3>
[ 1866.359643] md: export_rdev(sda3)
[ 1866.359643] md: unbind<sdf3>
[ 1866.359643] md: export_rdev(sdf3)
[ 1866.359643] md: unbind<sdc3>
[ 1866.359643] md: export_rdev(sdc3)
[ 1866.359643] md: unbind<sde3>
[ 1866.359643] md: export_rdev(sde3)
[ 1866.359643] md: unbind<sdd3>
[ 1866.359643] md: export_rdev(sdd3)
[ 1866.359643] md: unbind<sdb3>
[ 1866.359643] md: export_rdev(sdb3)
[ 1866.599690] mdadm[3205]: segfault at a0 ip 417db5 sp 7fff5f3a6580 error 4 in mdadm[400000+29000]
^^^^^ This is where I ran mdadm --assemble --force /dev/md4
[ 3117.017725] md: md4 stopped.
[ 3117.185350] md: bind<sdb3>
[ 3117.186526] md: bind<sdd3>
[ 3117.186526] md: bind<sde3>
[ 3117.189911] md: bind<sdc3>
[ 3117.185350] md: bind<sdf3>
[ 3117.186526] md: bind<sda3>
[ 3122.470735] md: md4 stopped.
[ 3122.470735] md: unbind<sda3>
[ 3122.470735] md: export_rdev(sda3)
[ 3122.470735] md: unbind<sdf3>
[ 3122.470735] md: export_rdev(sdf3)
[ 3122.470735] md: unbind<sdc3>
[ 3122.470735] md: export_rdev(sdc3)
[ 3122.470735] md: unbind<sde3>
[ 3122.470735] md: export_rdev(sde3)
[ 3122.470735] md: unbind<sdd3>
[ 3122.470735] md: export_rdev(sdd3)
[ 3122.470735] md: unbind<sdb3>
[ 3122.470735] md: export_rdev(sdb3)
[ 3122.624492] md: bind<sdb3>
[ 3122.625244] md: bind<sdd3>
[ 3122.626323] md: bind<sde3>
[ 3122.626323] md: bind<sdc3>
[ 3122.626323] md: bind<sdf3>
[ 3122.626833] md: bind<sda3>
[ 3122.626898] md: md4: raid array is not clean -- starting background reconstruction
This is where I ran the new mdadm from git; it was able to start the array:
[ 3122.655983] raid10: raid set md4 active with 6 out of 6 devices
[ 3161.721339] md: resync of RAID array md4
[ 3161.721339] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 3161.721339] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
[ 3161.721339] md: using 128k window, over a total of 2159617728 blocks.
I don't know whether the mdadm from git worked because the bug in 2.6.7 has
been fixed upstream, or simply because it was a 32-bit build, but I suggest
updating mdadm in Debian to include the upstream git patches relevant to this
problem.
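If a backtrace of the segfault would help, something along these lines should
produce one. This is just a sketch: it needs mdadm 2.6.7 rebuilt with debug
symbols, and the array would have to be in the same degraded, non-fresh state,
which I can no longer reproduce now that it has resynced.

  # hypothetical debugging session against the 2.6.7 binary
  gdb --args mdadm --assemble --force /dev/md4
  (gdb) run
  (gdb) bt full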
-- Package-specific info:
--- mount output
/dev/md3 on / type ext3 (rw,noatime,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
procbususb on /proc/bus/usb type usbfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
tmpfs on /tmp type tmpfs (rw)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
/dev/mapper/vg--all-lv--usr on /usr type xfs (rw)
/dev/mapper/vg--all-lv--var on /var type xfs (rw)
/dev/mapper/vg--all-lv--home on /home type reiserfs (rw,noatime,notail)
--- mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#
# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions
# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes
# automatically tag new arrays as belonging to the local system
HOMEHOST <system>
# instruct the monitoring daemon where to send mail alerts
MAILADDR root
# definitions of existing MD arrays
ARRAY /dev/md3 level=raid1 num-devices=6 UUID=8b4b5bca:f8e579d9:318cfef2:eb9ed6d4
ARRAY /dev/md4 level=raid10 num-devices=6 UUID=1156c578:5138c7ee:318cfef2:eb9ed6d4
   devices=/dev/sdb3,/dev/sdc3,/dev/sdd3,/dev/sde3,/dev/sdf3,/dev/sda3
# This file was auto-generated on Sat, 23 Aug 2008 13:25:32 +0000
# by mkconf $Id$
--- /proc/mdstat:
Personalities : [raid1] [raid10]
md4 : active raid10 sda3[0] sdf3[5] sdc3[4] sde3[3] sdd3[2] sdb3[1]
2159617728 blocks 64K chunks 2 near-copies [6/6] [UUUUUU]
[>....................] resync = 1.4% (31960768/2159617728) finish=174.9min speed=202727K/sec
md3 : active raid1 sda1[0] sdf1[5] sdc1[4] sde1[3] sdd1[2] sdb1[1]
9767424 blocks [6/6] [UUUUUU]
unused devices: <none>
--- /proc/partitions:
major minor #blocks name
8 0 732574584 sda
8 1 9767488 sda1
8 2 2931862 sda2
8 3 719872650 sda3
8 16 732574584 sdb
8 17 9767488 sdb1
8 18 2931862 sdb2
8 19 719872650 sdb3
8 32 732574584 sdc
8 33 9767488 sdc1
8 34 2931862 sdc2
8 35 719872650 sdc3
8 48 732574584 sdd
8 49 9767488 sdd1
8 50 2931862 sdd2
8 51 719872650 sdd3
8 64 732574584 sde
8 65 9767488 sde1
8 66 2931862 sde2
8 67 719872650 sde3
8 80 732574584 sdf
8 81 9767488 sdf1
8 82 2931862 sdf2
8 83 719872650 sdf3
9 3 9767424 md3
9 4 2159617728 md4
253 0 104857600 dm-0
253 1 1363148800 dm-1
253 2 629145600 dm-2
253 3 10485760 dm-3
--- initrd.img-2.6.26-1-amd64:
45881 blocks
etc/mdadm
etc/mdadm/mdadm.conf
scripts/local-top/mdadm
sbin/mdadm
lib/modules/2.6.26-1-amd64/kernel/drivers/md/multipath.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/linear.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/dm-log.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/raid456.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/dm-snapshot.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/dm-mirror.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/md-mod.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/dm-mod.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/raid10.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/raid1.ko
lib/modules/2.6.26-1-amd64/kernel/drivers/md/raid0.ko
--- /proc/modules:
dm_mirror 20608 0 - Live 0xffffffffa0165000
dm_log 13956 1 dm_mirror, Live 0xffffffffa0160000
dm_snapshot 19400 0 - Live 0xffffffffa015a000
dm_mod 58864 11 dm_mirror,dm_log,dm_snapshot, Live 0xffffffffa014a000
raid10 23680 1 - Live 0xffffffffa0143000
raid1 24192 1 - Live 0xffffffffa013c000
md_mod 80164 4 raid10,raid1, Live 0xffffffffa0127000
--- volume detail:
--- /proc/cmdline
root=/dev/md3 ro single
--- grub:
kernel /boot/vmlinuz-2.6.27-rc4 root=/dev/md3 ro
kernel /boot/vmlinuz-2.6.27-rc4 root=/dev/md3 ro single
kernel /boot/vmlinuz-2.6.26-1-amd64 root=/dev/md3 ro
kernel /boot/vmlinuz-2.6.26-1-amd64 root=/dev/md3 ro single
-- System Information:
Debian Release: lenny/sid
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'testing')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.26-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages mdadm depends on:
ii debconf 1.5.23 Debian configuration management sy
ii libc6 2.7-13 GNU C Library: Shared libraries
ii lsb-base 3.2-20 Linux Standard Base 3.2 init scrip
ii makedev 2.3.1-88 creates device files in /dev
ii udev 0.125-5 /dev/ and hotplug management daemo
Versions of packages mdadm recommends:
ii module-init-tools 3.4-1 tools for managing Linux kernel mo
ii postfix [mail-transport-agent 2.5.4-1 High-performance mail transport ag
mdadm suggests no packages.
-- debconf information:
mdadm/autostart: true
* mdadm/mail_to: root
mdadm/initrdstart_msg_errmd:
* mdadm/initrdstart: all
mdadm/initrdstart_msg_errconf:
mdadm/initrdstart_notinconf: false
mdadm/initrdstart_msg_errexist:
mdadm/initrdstart_msg_intro:
* mdadm/autocheck: true
mdadm/initrdstart_msg_errblock:
* mdadm/start_daemon: true