Bug#599352: mdadm: Starting up duplicate degraded arrays
Dominic Hargreaves
dom at earth.li
Wed Oct 6 20:01:48 UTC 2010
Package: mdadm
Version: 3.1.4-1+8efb9d1
Severity: normal
My system has two md arrays, normally named md0 and md1. Each normally
has component devices:
/dev/md0: /dev/sda1 /dev/sdb1
/dev/md1: /dev/sda2 /dev/sdb2
This evening I booted up with one drive missing (how I managed this
is a long story -- I am debugging a potential power supply problem
which may be the cause of SATA bus resets. I suppose it's possible
that some bus reset event actually caused some corrupt data somewhere,
but none were observed during this incident).
When I realised my mistake I shut down the system cleanly and then
started up with both drives present.
I expected to get both arrays starting in degraded mode. What I actually
got was four arrays, md0, md1, md126 and md127, all with one partition
each.
/dev/md0 was mounted as /, as per fstab. LVM had discovered /dev/md127
as its PV, and started using this.
I don't have a copy of the /proc/mdstat from then, but I did collect
the following output:
dmesg excerpt:
[ 20.328590] md: raid1 personality registered for level 1
[ 21.010971] md: md0 stopped.
[ 21.013357] md: bind<sda1>
[ 21.013507] md: bind<sdb1>
[ 21.013557] md: kicking non-fresh sda1 from array!
[ 21.013594] md: unbind<sda1>
[ 21.024068] md: export_rdev(sda1)
[ 21.025521] raid1: raid set md0 active with 1 out of 2 mirrors
[ 21.025576] md0: detected capacity change from 0 to 2047934464
[ 21.026458] md0: unknown partition table
[ 21.042339] md: md1 stopped.
[ 21.043208] md: bind<sda2>
[ 21.043385] md: bind<sdb2>
[ 21.043432] md: kicking non-fresh sda2 from array!
[ 21.043469] md: unbind<sda2>
[ 21.056102] md: export_rdev(sda2)
[ 21.057538] raid1: raid set md1 active with 1 out of 2 mirrors
[ 21.057592] md1: detected capacity change from 0 to 998154108928
[ 21.058468] md1: unknown partition table
[ 24.540527] md: md127 stopped.
[ 24.541724] md: bind<sda2>
[ 24.584505] raid1: raid set md127 active with 1 out of 2 mirrors
[ 24.584593] md127: detected capacity change from 0 to 998154108928
[ 24.586077] md127: unknown partition table
[ 29.618612] md: md126 stopped.
[ 29.619969] md: bind<sda1>
[ 29.622283] raid1: raid set md126 active with 1 out of 2 mirrors
[ 29.622355] md126: detected capacity change from 0 to 2047934464
[ 29.623605] md126: unknown partition table
mdadm --examine:
/dev/sda1:
Magic : a92b4efc
Version : 0.90.00
UUID : 577cf208:5b05d3c9:dc677657:389f812d
Creation Time : Tue Oct 7 21:20:37 2008
Raid Level : raid1
Used Dev Size : 1999936 (1953.39 MiB 2047.93 MB)
Array Size : 1999936 (1953.39 MiB 2047.93 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Tue Oct 5 23:08:22 2010
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 66b7f88 - correct
Events : 17394
Number Major Minor RaidDevice State
this 1 8 1 1 active sync /dev/sda1
0 0 8 17 0 active sync /dev/sdb1
1 1 8 1 1 active sync /dev/sda1
/dev/sda2:
Magic : a92b4efc
Version : 0.90.00
UUID : 52c89d31:ddcb3e86:dc677657:389f812d
Creation Time : Tue Oct 7 21:20:51 2008
Raid Level : raid1
Used Dev Size : 974759872 (929.60 GiB 998.15 GB)
Array Size : 974759872 (929.60 GiB 998.15 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 127
Update Time : Wed Oct 6 19:35:51 2010
State : active
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Checksum : be855454 - correct
Events : 852773
Number Major Minor RaidDevice State
this 1 8 2 1 active sync /dev/sda2
0 0 0 0 0 removed
1 1 8 2 1 active sync /dev/sda2
/dev/sdb1:
Magic : a92b4efc
Version : 0.90.00
UUID : 577cf208:5b05d3c9:dc677657:389f812d
Creation Time : Tue Oct 7 21:20:37 2008
Raid Level : raid1
Used Dev Size : 1999936 (1953.39 MiB 2047.93 MB)
Array Size : 1999936 (1953.39 MiB 2047.93 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Update Time : Wed Oct 6 19:35:34 2010
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 66c9fe6 - correct
Events : 17486
Number Major Minor RaidDevice State
this 0 8 17 0 active sync /dev/sdb1
0 0 8 17 0 active sync /dev/sdb1
1 1 0 0 1 faulty removed
/dev/sdb2:
Magic : a92b4efc
Version : 0.90.00
UUID : 52c89d31:ddcb3e86:dc677657:389f812d
Creation Time : Tue Oct 7 21:20:51 2008
Raid Level : raid1
Used Dev Size : 974759872 (929.60 GiB 998.15 GB)
Array Size : 974759872 (929.60 GiB 998.15 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 1
Update Time : Wed Oct 6 19:23:26 2010
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : be9253b1 - correct
Events : 852724
Number Major Minor RaidDevice State
this 0 8 2 0 active sync /dev/sda2
0 0 8 2 0 active sync /dev/sda2
1 1 0 0 1 faulty removed
mdadm --detail:
/dev/md0:
Version : 0.90
Creation Time : Tue Oct 7 21:20:37 2008
Raid Level : raid1
Array Size : 1999936 (1953.39 MiB 2047.93 MB)
Used Dev Size : 1999936 (1953.39 MiB 2047.93 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Wed Oct 6 19:35:34 2010
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : 577cf208:5b05d3c9:dc677657:389f812d
Events : 0.17486
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 0 0 1 removed
/dev/md1:
Version : 0.90
Creation Time : Tue Oct 7 21:20:51 2008
Raid Level : raid1
Array Size : 974759872 (929.60 GiB 998.15 GB)
Used Dev Size : 974759872 (929.60 GiB 998.15 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Wed Oct 6 19:23:26 2010
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : 52c89d31:ddcb3e86:dc677657:389f812d
Events : 0.852724
Number Major Minor RaidDevice State
0 8 18 0 active sync /dev/sdb2
1 0 0 1 removed
/dev/md126:
Version : 0.90
Creation Time : Tue Oct 7 21:20:37 2008
Raid Level : raid1
Array Size : 1999936 (1953.39 MiB 2047.93 MB)
Used Dev Size : 1999936 (1953.39 MiB 2047.93 MB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 126
Persistence : Superblock is persistent
Update Time : Tue Oct 5 23:08:22 2010
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : 577cf208:5b05d3c9:dc677657:389f812d
Events : 0.17394
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 1 1 active sync /dev/sda1
/dev/md127:
Version : 0.90
Creation Time : Tue Oct 7 21:20:51 2008
Raid Level : raid1
Array Size : 974759872 (929.60 GiB 998.15 GB)
Used Dev Size : 974759872 (929.60 GiB 998.15 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 127
Persistence : Superblock is persistent
Update Time : Wed Oct 6 19:37:07 2010
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
UUID : 52c89d31:ddcb3e86:dc677657:389f812d
Events : 0.852806
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 2 1 active sync /dev/sda2
You can see the oddities in the update times of these, and also that
there are two pairs of md arrays with identical UUIDs (md0 and md126
share one UUID, md1 and md127 the other).
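
For illustration only, the comparison described above can be sketched in Python. This is not part of mdadm or of the bugscript. The function names (parse_examine, group_by_uuid, stale_members) are hypothetical, and the sample text is trimmed from the --examine output above to just the UUID and Events lines. The sketch assumes the 0.90 superblock --examine layout shown in this report, where Events is a plain integer.

```python
import re
from collections import defaultdict

# Trimmed from the `mdadm --examine` output in this report:
# only the device header, UUID and Events lines are kept.
SAMPLE = """\
/dev/sda1:
UUID : 577cf208:5b05d3c9:dc677657:389f812d
Events : 17394
/dev/sda2:
UUID : 52c89d31:ddcb3e86:dc677657:389f812d
Events : 852773
/dev/sdb1:
UUID : 577cf208:5b05d3c9:dc677657:389f812d
Events : 17486
/dev/sdb2:
UUID : 52c89d31:ddcb3e86:dc677657:389f812d
Events : 852724
"""


def parse_examine(text):
    """Return {device: {"uuid": str, "events": int}} from --examine text."""
    members = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"^(/dev/\S+):$", line)
        if header:
            current = header.group(1)
            members[current] = {}
            continue
        if current is None:
            continue
        field = re.match(r"^(UUID|Events)\s*:\s*(\S+)$", line)
        if field:
            key, value = field.groups()
            # Events is an integer in 0.90 --examine output (assumption).
            members[current][key.lower()] = int(value) if key == "Events" else value
    return members


def group_by_uuid(members):
    """Group devices by array UUID: {uuid: {device: events}}."""
    groups = defaultdict(dict)
    for dev, info in members.items():
        if "uuid" in info and "events" in info:
            groups[info["uuid"]][dev] = info["events"]
    return dict(groups)


def stale_members(groups):
    """For each UUID, list the devices whose event count is below the newest."""
    stale = {}
    for uuid, devs in groups.items():
        newest = max(devs.values())
        behind = sorted(d for d, ev in devs.items() if ev < newest)
        if behind:
            stale[uuid] = behind
    return stale


if __name__ == "__main__":
    for uuid, devs in stale_members(group_by_uuid(parse_examine(SAMPLE))).items():
        print(uuid, "stale:", ", ".join(devs))
```

On the values above, this flags /dev/sda1 as behind /dev/sdb1, and /dev/sdb2 as behind /dev/sda2, i.e. each UUID has members with diverging event counts.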
Because I wanted to get back to a stable state ASAP, I then stopped the
two "bogus" arrays (md1 and md126, as it happened), zeroed the two
component devices and re-added them to the other two arrays. Rebuilding
appeared to proceed as normal.
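
As a rough sketch of that sequence, the Python below only builds and prints the command lines; it does not run anything. The exact commands used at the time were not recorded in this report, so this is an assumption: "zeroed" is taken to mean mdadm --zero-superblock, and the device pairing is inferred from the --detail output and /proc/mdstat in this report. The helper name recovery_plan is hypothetical.

```python
def recovery_plan(bogus_array, stale_member, target_array):
    """Build (but do not run) the mdadm commands for one stale member.

    Assumed sequence: stop the duplicate array, wipe the md superblock
    on its only member, then add that member to the surviving array.
    """
    return [
        ["mdadm", "--stop", bogus_array],
        ["mdadm", "--zero-superblock", stale_member],
        ["mdadm", target_array, "--add", stale_member],
    ]


# Pairing inferred from this report: md1 held sdb2, md126 held sda1.
PLANS = [
    ("/dev/md1", "/dev/sdb2", "/dev/md127"),
    ("/dev/md126", "/dev/sda1", "/dev/md0"),
]

if __name__ == "__main__":
    for args in PLANS:
        for cmd in recovery_plan(*args):
            print(" ".join(cmd))
```

Printing rather than executing is deliberate: --zero-superblock is destructive, so the stale member should be double-checked against the event counts before anything is run.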
I realise that this may be very difficult to debug, but I thought it
was worth including as a bug report just in case.
Cheers,
Dominic.
-- Package-specific info:
--- mdadm.conf
DEVICE partitions
CREATE owner=root group=disk mode=0660 auto=yes
HOMEHOST <system>
MAILADDR root
--- /etc/default/mdadm
INITRDSTART='all'
AUTOSTART=true
AUTOCHECK=true
START_DAEMON=true
DAEMON_OPTIONS="--syslog"
VERBOSE=false
--- /proc/mdstat:
Personalities : [raid1]
md127 : active raid1 sdb2[2] sda2[1]
974759872 blocks [2/1] [_U]
[>....................] recovery = 3.1% (30845888/974759872) finish=197.5min speed=79648K/sec
md0 : active raid1 sda1[1] sdb1[0]
1999936 blocks [2/2] [UU]
unused devices: <none>
--- /proc/partitions:
major minor #blocks name
8 16 1465138584 sdb
8 17 2000061 sdb1
8 18 974759940 sdb2
8 0 976762584 sda
8 1 2000061 sda1
8 2 974759940 sda2
9 0 1999936 md0
9 127 974759872 md127
253 0 419430400 dm-0
253 1 104857600 dm-1
253 2 5242880 dm-2
253 3 15728640 dm-3
253 4 10485760 dm-4
253 5 2097152 dm-5
253 6 133169152 dm-6
253 7 31457280 dm-7
--- LVM physical volumes:
PV VG Fmt Attr PSize PFree
/dev/md127 tera_vg lvm2 a- 929.60g 240.60g
--- initrd.img-2.6.32-5-686:
47583 blocks
f4fbd9099399ab08ba9b9f6c71d77595 ./scripts/local-top/mdadm
56c6e36e747cc9852c4a1fde6e86b5f7 ./etc/mdadm/mdadm.conf
ee6eabe5fb44714ca6be61409a762103 ./sbin/mdadm
fee4fa9f9b6c97da3645aa194fedb4d8 ./lib/modules/2.6.32-5-686/kernel/drivers/md/dm-region-hash.ko
480a00dc400fb607225384a983c1c88c ./lib/modules/2.6.32-5-686/kernel/drivers/md/dm-crypt.ko
812b7a6605a158373d676806dbf3497d ./lib/modules/2.6.32-5-686/kernel/drivers/md/linear.ko
00c5a97b4c68ee0df7b3b529276011ba ./lib/modules/2.6.32-5-686/kernel/drivers/md/md-mod.ko
b7aaf10cf94e53fa747f2ea0d41274e5 ./lib/modules/2.6.32-5-686/kernel/drivers/md/multipath.ko
1ee514b06f6d99546d879b0c3269f60e ./lib/modules/2.6.32-5-686/kernel/drivers/md/raid0.ko
5865f1339279a74a5500f989fa5a5280 ./lib/modules/2.6.32-5-686/kernel/drivers/md/dm-snapshot.ko
0bf30c721b69c5631c217aefe374b945 ./lib/modules/2.6.32-5-686/kernel/drivers/md/raid1.ko
ce8ee08fdfd672db37665cb0632260a9 ./lib/modules/2.6.32-5-686/kernel/drivers/md/dm-log.ko
08935e602ce1a98d24cd59767b57e73b ./lib/modules/2.6.32-5-686/kernel/drivers/md/raid456.ko
12f810355fe6cb14ceccff8a9e97d6bc ./lib/modules/2.6.32-5-686/kernel/drivers/md/dm-mirror.ko
fc02f256aba4f796b2ff0d0a23aa5c49 ./lib/modules/2.6.32-5-686/kernel/drivers/md/raid6_pq.ko
0f462f810d9f42eaca0ff31edf097778 ./lib/modules/2.6.32-5-686/kernel/drivers/md/raid10.ko
d0c88e4a79d78ef163428115da9b47fc ./lib/modules/2.6.32-5-686/kernel/drivers/md/dm-mod.ko
--- initrd's /conf/conf.d/md:
MD_HOMEHOST='callisto'
MD_DEVPAIRS='/dev/md0:raid1 /dev/md1:raid1'
MD_LEVELS='raid1 raid1'
MD_DEVS=all
MD_MODULES='raid1'
--- /proc/modules:
dm_crypt 9127 0 - Live 0xf7c29000
dm_mod 46094 24 dm_crypt, Live 0xf7c49000
raid1 16367 2 - Live 0xf81c9000
md_mod 67329 3 raid1, Live 0xf81a5000
--- /var/log/syslog:
--- volume detail:
/dev/sda is not recognised by mdadm.
/dev/sda1:
Magic : a92b4efc
Version : 0.90.00
UUID : 577cf208:5b05d3c9:dc677657:389f812d
Creation Time : Tue Oct 7 21:20:37 2008
Raid Level : raid1
Used Dev Size : 1999936 (1953.39 MiB 2047.93 MB)
Array Size : 1999936 (1953.39 MiB 2047.93 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Wed Oct 6 20:00:19 2010
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 66ca615 - correct
Events : 17538
Number Major Minor RaidDevice State
this 1 8 1 1 active sync /dev/sda1
0 0 8 17 0 active sync /dev/sdb1
1 1 8 1 1 active sync /dev/sda1
--
/dev/sda2:
Magic : a92b4efc
Version : 0.90.00
UUID : 52c89d31:ddcb3e86:dc677657:389f812d
Creation Time : Tue Oct 7 21:20:51 2008
Raid Level : raid1
Used Dev Size : 974759872 (929.60 GiB 998.15 GB)
Array Size : 974759872 (929.60 GiB 998.15 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 127
Update Time : Wed Oct 6 20:02:24 2010
State : active
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1
Checksum : be855cd6 - correct
Events : 853325
Number Major Minor RaidDevice State
this 1 8 2 1 active sync /dev/sda2
0 0 0 0 0 removed
1 1 8 2 1 active sync /dev/sda2
2 2 8 18 2 spare /dev/sdb2
--
/dev/sdb is not recognised by mdadm.
/dev/sdb1:
Magic : a92b4efc
Version : 0.90.00
UUID : 577cf208:5b05d3c9:dc677657:389f812d
Creation Time : Tue Oct 7 21:20:37 2008
Raid Level : raid1
Used Dev Size : 1999936 (1953.39 MiB 2047.93 MB)
Array Size : 1999936 (1953.39 MiB 2047.93 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Wed Oct 6 20:00:19 2010
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 66ca623 - correct
Events : 17538
Number Major Minor RaidDevice State
this 0 8 17 0 active sync /dev/sdb1
0 0 8 17 0 active sync /dev/sdb1
1 1 8 1 1 active sync /dev/sda1
--
/dev/sdb2:
Magic : a92b4efc
Version : 0.90.00
UUID : 52c89d31:ddcb3e86:dc677657:389f812d
Creation Time : Tue Oct 7 21:20:51 2008
Raid Level : raid1
Used Dev Size : 974759872 (929.60 GiB 998.15 GB)
Array Size : 974759872 (929.60 GiB 998.15 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 127
Update Time : Wed Oct 6 20:02:24 2010
State : clean
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1
Checksum : be926232 - correct
Events : 853326
Number Major Minor RaidDevice State
this 2 8 18 2 spare /dev/sdb2
0 0 0 0 0 removed
1 1 8 2 1 active sync /dev/sda2
2 2 8 18 2 spare /dev/sdb2
--
--- /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-2.6.32-5-686 root=UUID=b286d1cf-e54e-4045-811f-86ef2064ed70 ro
--- grub2:
insmod raid
set root='(tera_vg-usr)'
insmod raid
set root='(md0)'
insmod raid
set root='(md0)'
insmod raid
set root='(md0)'
linux /boot/vmlinuz-2.6.32-5-686 root=UUID=b286d1cf-e54e-4045-811f-86ef2064ed70 ro
insmod raid
set root='(md0)'
linux /boot/vmlinuz-2.6.32-5-686 root=UUID=b286d1cf-e54e-4045-811f-86ef2064ed70 ro single
--- grub legacy:
kernel /boot/vmlinuz-2.6.26-1-686 root=/dev/md0 ro pci=routeirq
kernel /boot/vmlinuz-2.6.24-etchnhalf.1-686 root=/dev/md0 ro pci=routeirq
--- udev:
ii udev 160-1 /dev/ and hotplug management daemon
4a574fcd059040d33ea18a8aa605a184 /lib/udev/rules.d/64-md-raid.rules
--- /dev:
brw-rw---- 1 root disk 9, 0 Oct 6 19:25 /dev/md0
brw-rw---- 1 root disk 9, 127 Oct 6 19:25 /dev/md127
/dev/disk/by-path:
total 0
lrwxrwxrwx 1 root root 9 Oct 6 19:25 pci-0000:00:1f.2-scsi-0:0:0:0 -> ../../sda
lrwxrwxrwx 1 root root 10 Oct 6 19:47 pci-0000:00:1f.2-scsi-0:0:0:0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Oct 6 19:25 pci-0000:00:1f.2-scsi-0:0:0:0-part2 -> ../../sda2
lrwxrwxrwx 1 root root 9 Oct 6 19:25 pci-0000:00:1f.2-scsi-1:0:0:0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Oct 6 19:25 pci-0000:00:1f.2-scsi-1:0:0:0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Oct 6 19:53 pci-0000:00:1f.2-scsi-1:0:0:0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 9 Oct 6 19:25 pci-0000:00:1f.2-scsi-2:0:0:0 -> ../../sr0
lrwxrwxrwx 1 root root 9 Oct 6 19:25 pci-0000:01:09.0-scsi-0:0:4:0 -> ../../sr1
/dev/disk/by-uuid:
total 0
lrwxrwxrwx 1 root root 10 Oct 6 19:25 12e0adf9-2607-4698-8594-6597eb876820 -> ../../dm-5
lrwxrwxrwx 1 root root 10 Oct 6 19:25 1bb23705-5c94-48ce-80b3-6e83212d41a6 -> ../../dm-3
lrwxrwxrwx 1 root root 10 Oct 6 19:25 2849efe6-6251-41ed-b78d-900c89ebc3b1 -> ../../dm-7
lrwxrwxrwx 1 root root 10 Oct 6 19:25 2f5132f4-da3e-4bbb-85cb-891235c26391 -> ../../dm-0
lrwxrwxrwx 1 root root 10 Oct 6 19:25 590917ee-29c4-4447-875a-9a93f52360d6 -> ../../dm-2
lrwxrwxrwx 1 root root 10 Oct 6 19:25 62e9d705-1323-4d3c-9e1f-296e12e5bab8 -> ../../dm-1
lrwxrwxrwx 1 root root 10 Oct 6 19:25 999ea6f1-2522-44ec-a20a-52168293db3b -> ../../dm-4
lrwxrwxrwx 1 root root 9 Oct 6 19:46 b286d1cf-e54e-4045-811f-86ef2064ed70 -> ../../md0
lrwxrwxrwx 1 root root 10 Oct 6 19:25 b98f7614-9f07-4b50-98dc-e4a303b910d4 -> ../../dm-6
/dev/md:
total 0
lrwxrwxrwx 1 root root 8 Oct 6 19:39 1_0 -> ../md127
Auto-generated on Wed, 06 Oct 2010 20:02:25 +0100
by mdadm bugscript 3.1.4-1+8efb9d1
-- System Information:
Debian Release: squeeze/sid
APT prefers testing
APT policy: (500, 'testing'), (10, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.32-5-686 (SMP w/2 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages mdadm depends on:
ii debconf 1.5.35 Debian configuration management sy
ii libc6 2.11.2-6 Embedded GNU C Library: Shared lib
ii lsb-base 3.2-23.1 Linux Standard Base 3.2 init scrip
ii makedev 2.3.1-89 creates device files in /dev
ii udev 160-1 /dev/ and hotplug management daemo
Versions of packages mdadm recommends:
ii exim4-daemon-light [mail-tran 4.72-1 lightweight Exim MTA (v4) daemon
ii module-init-tools 3.12-1 tools for managing Linux kernel mo
mdadm suggests no packages.
-- debconf information:
* mdadm/autostart: true
* mdadm/initrdstart: all
mdadm/initrdstart_notinconf: false
mdadm/initrdstart_msg_errexist:
mdadm/initrdstart_msg_intro:
mdadm/initrdstart_msg_errblock:
* mdadm/warning:
* mdadm/start_daemon: true
* mdadm/mail_to: root
mdadm/initrdstart_msg_errmd:
mdadm/initrdstart_msg_errconf:
* mdadm/autocheck: true
--
Dominic Hargreaves | http://www.larted.org.uk/~dom/
PGP key 5178E2A5 from the.earth.li (keyserver,web,email)