Bug#510911: multipath-tools: bad side effects with FC devices
Vincent McIntyre
Vince.McIntyre at atnf.csiro.au
Mon Jan 5 21:50:28 UTC 2009
Package: multipath-tools
Version: 0.4.7-1.1etch1
Severity: normal
Not sure what to do about this bug, except to say "this behaviour
shouldn't happen". If someone can reproduce it, that is...
I have a system with a single internal disk, set up as follows:
The internal disk is actually 3 disks in a RAID5 configuration,
using hardware RAID, controlled by a Dell Perc/4i.
The disk is partitioned as follows:
# fdisk -l /dev/sda
Disk /dev/sda: 293.3 GB, 293391564800 bytes
255 heads, 63 sectors/track, 35669 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 62 497983+ 83 Linux
/dev/sda2 63 35669 286013227+ 8e Linux LVM
The first partition is /. The LVM contains /usr, /var, etc.
There are two filesystems attached to the machine via FC connections.
These are Apple Xserve RAIDs connected to the host via a Qlogic
FC switch.
Installing multipath-tools from 'etch' gives the following behaviour:
- after the next reboot, the system is unable to see the data files on
the Xserve RAID filesystems.
- 'mount' does not show them as mounted.
- ls -al yields:
# ls -al /data/foo_1
. ..
- yet 'umount' gives this result:
# umount /data/foo_1
umount: /data/foo_1: not mounted
- and 'mount' gives this:
# mount /data/foo_1
mount: /dev/sdc1 already mounted or /data/foo_1 busy
- the devices are not in /etc/mtab
# grep foo_ /etc/mtab
<nothing>
- but they show in /etc/blkid.tab (see dmsetup notes below)
- I'm not familiar enough with lsof, but tried:
# lsof /data/foo_1
<nothing>
# lsof /dev/sdc1
<nothing>
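One reading of "already mounted or busy" with nothing reported by lsof is that the disk is claimed by a device-mapper map rather than by a process. The following is a minimal sketch for checking that, assuming the running kernel exposes /sys/block/<dev>/holders (not confirmed for 2.6.18 here). list_holders is a hypothetical helper name, and its optional second argument exists only so the function can be pointed at a different sysfs root.

```shell
#!/bin/sh
# list_holders DEV [SYSFS_ROOT]
# Print the kernel names (e.g. dm-8) of devices that hold the whole-disk
# device DEV. Returns non-zero if DEV has no holders directory.
# Assumption: the kernel provides <sysfs>/block/DEV/holders.
list_holders() {
    dev=$1
    sysfs=${2:-/sys}
    dir="$sysfs/block/$dev/holders"
    [ -d "$dir" ] || return 1
    for h in "$dir"/*; do
        # Skip the unexpanded glob when the directory is empty.
        [ -e "$h" ] || [ -L "$h" ] || continue
        basename "$h"
    done
}
```

For example, `list_holders sdc` printing a dm-N name would suggest that a map, not a mount, is what keeps the disk busy.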
Other things I noticed:
- fdisk works ok
# fdisk -l /dev/sdc
Disk /dev/sdc: 1505.9 GB, 1505973239808 bytes
64 heads, 32 sectors/track, 1436208 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 1 1436208 1470676976 83 Linux
- so does tune2fs -l.
- during boot, I see
Setting up DMRAID devices....
Starting multipathdevice-mapper: multipath: version 1.0.4 loaded
error calling out /sbin/scsi_id -g -u -s sda
device-mapper: multipath round-robin: version 1.0.0 loaded
...
Checking root file system...fsck 1.40-WIP (14-Nov-2006)
root: clean, 123EXT3 FS on sda1, 08/124928 files,internal journal
172019/497980 blocks
done.
...
Loading device-mapper support.
Setting up LVM Volume Groups...
Reading all physical volumes. This may take a while...
Found volume group "horus_1" using metadata type lvm2
7 logical volume(s) in volume group "horus_1" now active
...
mount: /dev/sdb1 already mounted or /data/foo_1 busy
mount: /dev/sdb1 already mounted
- once booted, /dev/mapper is left in a very odd state indeed:
# ls -l /dev/mapper
brw-rw---- 1 root disk 254, 8 2009-01-05 14:35 36000393000007d3901000000fef00a2d
brw-rw---- 1 root disk 254, 9 2009-01-05 14:35 36000393000007d3901000000fef00a2d1
brw-rw---- 1 root disk 254, 7 2009-01-05 14:35 36000393000007da901000000fef00a2d
crw-rw---- 1 root root 10, 63 2009-01-05 14:35 control
brw-rw---- 1 root disk 254, 6 2009-01-05 14:35 horus_1-HORUS_1
brw-rw---- 1 root disk 254, 4 2009-01-05 14:35 horus_1-local
brw-rw---- 1 root disk 254, 5 2009-01-05 14:35 horus_1-srv
brw-rw---- 1 root disk 254, 2 2009-01-05 14:35 horus_1-swap0
brw-rw---- 1 root disk 254, 3 2009-01-05 14:35 horus_1-tmp
brw-rw---- 1 root disk 254, 0 2009-01-05 14:35 horus_1-usr
brw-rw---- 1 root disk 254, 1 2009-01-05 14:35 horus_1-var
# dmsetup ls
horus_1-usr (254, 0)
horus_1-var (254, 1)
horus_1-srv (254, 5)
36000393000007d3901000000fef00a2d1 (254, 9)
horus_1-swap0 (254, 2)
36000393000007da901000000fef00a2d (254, 7)
36000393000007d3901000000fef00a2d (254, 8)
horus_1-tmp (254, 3)
horus_1-HORUS_1 (254, 6)
horus_1-local (254, 4)
# dmsetup info 36000393000007d3901000000fef00a2d1
Name: 36000393000007d3901000000fef00a2d1
State: ACTIVE
Tables present: LIVE
Open count: 0
Event number: 0
Major, minor: 254, 9
Number of targets: 1
UUID: part1-mpath-36000393000007d3901000000fef00a2d
# dmsetup info 36000393000007d3901000000fef00a2d
Name: 36000393000007d3901000000fef00a2d
State: ACTIVE
Tables present: LIVE
Open count: 1
Event number: 1
Major, minor: 254, 8
Number of targets: 1
UUID: mpath-36000393000007d3901000000fef00a2d
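The UUIDs above distinguish the maps created by multipath (prefix "mpath-") and by kpartx ("part1-mpath-") from the LVM volumes. The following is a rough sketch for picking those maps out of a list, assuming input lines of the form "name:uuid". filter_mpath_maps is a hypothetical helper name, and whether dmsetup 1.02.08 accepts uuid as a column for -o is an assumption.

```shell
#!/bin/sh
# filter_mpath_maps: read "name:uuid" lines on stdin and print the names of
# maps whose UUID marks them as a multipath map ("mpath-...") or as a
# kpartx partition on top of one ("partN-mpath-...").
filter_mpath_maps() {
    while IFS=: read -r name uuid; do
        case $uuid in
            mpath-*|part*-mpath-*) printf '%s\n' "$name" ;;
        esac
    done
}
```

Possible usage, under the assumption stated above:

    dmsetup info -c --noheadings -o name,uuid | filter_mpath_maps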
Multipath-tools was installed but I did no configuration of it to
set up any multipathing - there are no multipaths on this machine.
The FC HBA (LSI FC929X) has two ports, but only one is being used.
Things I tried:
- /etc/fstab mounts the affected (ext3) filesystems by label.
I tried mounting by UUID instead; no change in behaviour.
- Edited /etc/udev/multipath.rules, as per #484823.
A small improvement: one of the strange device names
(36000393000007d3901000000fef00a2d1) goes away.
The diff is:
--- multipath.rules.orig 2007-10-28 12:15:34.000000000 +1100
+++ multipath.rules 2009-01-05 15:56:24.000000000 +1100
@@ -4,7 +4,7 @@
#
# take care of devmap partitioning
-ACTION=="add", SUBSYSTEM=="block", KERNEL=="dm-*", \
+ACTION=="change", SUBSYSTEM=="block", KERNEL=="dm-*", \
PROGRAM="/sbin/dmsetup -j %M -m %m --noopencount --noheadings -c -o name info", \
RUN+="/sbin/kpartx -a /dev/mapper/%c"
- Uninstalled multipath-tools.
This resolves the problem completely. I left dmsetup installed.
I also ran some straces on 'mount' when it was failing. Let me know
if those would be of any use.
-- System Information:
Debian Release: 4.0
APT prefers stable
APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-6-686
Locale: LANG=C, LC_CTYPE=en_AU.iso88591 (charmap=ISO-8859-1)
Versions of packages multipath-tools depends on:
ii dmsetup 2:1.02.08-1 The Linux Kernel Device Mapper use
ii initscripts 2.86.ds1-38+etchnhalf.1 Scripts for initializing and shutt
ii libc6 2.3.6.ds1-13etch8 GNU C Library: Shared libraries
ii libdevmapper1.02 2:1.02.08-1 The Linux Kernel Device Mapper use
ii libncurses5 5.5-5 Shared libraries for terminal hand
ii libreadline5 5.2-2 GNU readline and history libraries
ii libsysfs2 2.1.0-1 interface library to sysfs
ii udev 0.105-4 /dev/ and hotplug management daemon
multipath-tools recommends no packages.
lsmod (with the machine back in normal state)
Module Size Used by
autofs4 19748 18
nfsd 197936 17
exportfs 5600 1 nfsd
button 6672 0
ac 5188 0
battery 9636 0
ipv6 226272 44
nfs 202860 2
lockd 54344 3 nfsd,nfs
nfs_acl 3584 2 nfsd,nfs
sunrpc 138812 13 nfsd,nfs,lockd,nfs_acl
loop 15048 0
tsdev 7520 0
i2c_piix4 8268 0
serio_raw 6660 0
i2c_core 19680 1 i2c_piix4
floppy 53124 0
psmouse 35016 0
sworks_agp 9152 0
rtc 12372 0
agpgart 29896 1 sworks_agp
pcspkr 3072 0
sg 31292 0
evdev 9088 0
ext3 119336 9
jbd 52456 1 ext3
mbcache 8356 1 ext3
dm_snapshot 15552 0
dm_mirror 19152 0
dm_mod 50200 17 dm_snapshot,dm_mirror
ide_cd 36064 0
cdrom 32544 1 ide_cd
sd_mod 19040 7
generic 4868 0 [permanent]
megaraid_mbox 29168 2
mptfc 14468 2
mptscsih 21664 1 mptfc
aic7xxx 150932 0
serverworks 8328 0 [permanent]
mptbase 46176 2 mptfc,mptscsih
megaraid_mm 10560 1 megaraid_mbox
scsi_transport_fc 28544 1 mptfc
ohci_hcd 18276 0
scsi_transport_spi 22336 1 aic7xxx
usbcore 112644 2 ohci_hcd
ide_core 110504 3 ide_cd,generic,serverworks
tg3 94948 0
scsi_mod 124168 8 sg,sd_mod,megaraid_mbox,mptfc,mptscsih,aic7xxx,scsi_transport_fc,scsi_transport_spi
thermal 13608 0
processor 28840 1 thermal
fan 4804 0
# (lspci ;lspci -n )|sort
00:00.0 0600: 1166:0014 (rev 33)
00:00.0 Host bridge: Broadcom CMIC-LE Host Bridge (GC-LE chipset) (rev 33)
00:00.1 0600: 1166:0014
00:00.1 Host bridge: Broadcom CMIC-LE Host Bridge (GC-LE chipset)
00:00.2 0600: 1166:0014
00:00.2 Host bridge: Broadcom CMIC-LE Host Bridge (GC-LE chipset)
00:0e.0 0300: 1002:4752 (rev 27)
00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 0600: 1166:0201 (rev 93)
00:0f.0 Host bridge: Broadcom CSB5 South Bridge (rev 93)
00:0f.1 0101: 1166:0212 (rev 93)
00:0f.1 IDE interface: Broadcom CSB5 IDE Controller (rev 93)
00:0f.2 0c03: 1166:0220 (rev 05)
00:0f.2 USB Controller: Broadcom OSB4/CSB5 OHCI USB Controller (rev 05)
00:0f.3 0601: 1166:0225
00:0f.3 ISA bridge: Broadcom CSB5 LPC bridge
00:10.0 0600: 1166:0110 (rev 12)
00:10.0 Host bridge: Broadcom CIOB-E I/O Bridge with Gigabit Ethernet (rev 12)
00:10.2 0600: 1166:0110 (rev 12)
00:10.2 Host bridge: Broadcom CIOB-E I/O Bridge with Gigabit Ethernet (rev 12)
00:11.0 0600: 1166:0101 (rev 05)
00:11.0 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 05)
00:11.2 0600: 1166:0101 (rev 05)
00:11.2 Host bridge: Broadcom CIOB-X2 PCI-X I/O Bridge (rev 05)
01:04.0 0100: 9005:00c0 (rev 01)
01:04.0 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
01:04.1 0100: 9005:00c0 (rev 01)
01:04.1 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
02:00.0 0200: 14e4:1648 (rev 02)
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 02)
02:00.1 0200: 14e4:1648 (rev 02)
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 02)
03:06.0 0c04: 1000:0626
03:06.0 Fibre Channel: LSI Logic / Symbios Logic FC929X Fibre Channel Adapter
03:06.1 0c04: 1000:0626
03:06.1 Fibre Channel: LSI Logic / Symbios Logic FC929X Fibre Channel Adapter
04:03.0 0104: 1028:000f (rev 02)
04:03.0 RAID bus controller: Dell PowerEdge Expandable RAID controller 4/Di (rev 02)