Bug#843953: multipath-tools: LVM on internal disks + multipath-tools = no multipaths
Vincent McIntyre
vincent.mcintyre at csiro.au
Fri Nov 11 06:55:54 UTC 2016
Package: multipath-tools
Version: 0.5.0-6+deb8u2
Severity: important
* What led up to the situation?
Upgrade working wheezy system to jessie
Apply patch to fix multipath segfault, see #751993
The system boots off an internal physical disk
That disk has one / partition and an LVM partition
/usr is one of the LVs on that disk
There are two other internal disks, LVM is not used on these.
The multipath devices are connected via a QLogic ISP2432-based card,
through an FC switch to two Promise VTrak units.
LVM is not used on the multipath devices.
* What exactly did you do (or not do) that was effective (or
ineffective)?
Cold-plug FC connection to external storage, boot system.
* What was the outcome of this action?
multipath -l shows no devices, and there are no related device maps in /dev/mapper.
The FC and SCSI layers all worked fine; I see lots of /dev/sdX devices.
multipath -l -v3 shows the /dev/sdX devices as blacklisted:
...
sdd: blacklisted, udev property missing
sde: blacklisted, udev property missing
sdp: blacklisted, udev property missing
...etc
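The "udev property missing" message suggests that multipath found none of the udev properties it expects on these paths. A minimal sketch for checking one path by hand follows. The helper name `mpath_props` and the default property pattern are assumptions for illustration only; the pattern multipath really checks depends on its version and on any `property` setting in multipath.conf.

```shell
# Hypothetical helper: filter udev properties (KEY=value lines on stdin)
# down to the identifier properties that multipath may look for.
# The default pattern is an assumption, not taken from this system.
mpath_props() {
    pattern="${1:-^(ID_WWN|ID_SERIAL|ID_SCSI_VPD|SCSI_IDENT_)}"
    grep -E "$pattern"
}

# Example use on a live system (not run here):
#   udevadm info --query=property --name=/dev/sdd | mpath_props
# No output (exit status 1) would mean none of these properties is set.
```

A different pattern can be passed as the first argument if the local configuration whitelists other properties.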
* What outcome did you expect instead?
usable multipaths to the configured devices
/dev/mapper populated, including kpartx partitions
Related notes:
On some reboots the system log shows multipath timing out. Below is the log for 'sdd'.
The timeout is logged 32 seconds after the disk was attached, and the process is killed one second later.
I was unable to determine the cause of this or to reproduce it consistently.
Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] 25769805824 512-byte logical blocks: (13.1 TB/12.0 TiB)
Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Write Protect is off
Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Mode Sense: 97 00 10 08
Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
Nov 11 11:11:53 kernel: sdd: sdd1
Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Attached SCSI disk
...
Nov 11 11:12:25 systemd-udevd[346]: timeout '/sbin/multipath -v0 /dev/sdd'
Nov 11 11:12:26 systemd-udevd[346]: timeout: killing '/sbin/multipath -v0 /dev/sdd' [459]
Nov 11 11:12:26 systemd-udevd[346]: '/sbin/multipath -v0 /dev/sdd' [459] terminated by signal 9 (Killed)
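The interval between the attach message and the udev messages in the log above can be checked with a little arithmetic. A minimal sketch, where `elapsed` is a hypothetical helper that assumes both timestamps fall on the same day:

```shell
# Hypothetical helper: seconds between two HH:MM:SS timestamps
# (start first, end second), assuming both are on the same day.
elapsed() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, ":"); split(b, y, ":")
        print (y[1]*3600 + y[2]*60 + y[3]) - (x[1]*3600 + x[2]*60 + x[3])
    }'
}

elapsed 11:11:53 11:12:25   # attach -> timeout message, prints 32
elapsed 11:11:53 11:12:26   # attach -> process killed, prints 33
```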
On some reboots, there was a bad interaction between LVM and multipathd
and/or udev. On the console systemd showed it was waiting for tasks to
complete for both of these.
device-mapper would try to handle the multipath devices before the LVM
ones, which sometimes caused the system to fail to boot; it went into
emergency mode.
I was unable to determine the cause of this or reproduce it consistently.
I don't know where the multipath-tools-boot line comes from; that
package is not even installed.
You can, however, see LVM and multipath running at the same time:
# journalctl |egrep -i -e '(multipath|lvm|-udev)'
Nov 11 16:52:57 systemd[1]: Starting LVM2 metadata daemon socket.
Nov 11 16:52:57 systemd[1]: Listening on LVM2 metadata daemon socket.
Nov 11 16:52:57 systemd-udevd[317]: starting version 215
Nov 11 16:52:58 systemd[1]: Starting LSB: early multipath boot script...
Nov 11 16:52:58 kernel: device-mapper: multipath: version 1.7.0 loaded
Nov 11 16:52:59 multipath-tools-boot[633]: Discovering and coalescing multipaths...done.
Nov 11 16:52:59 systemd[1]: Started LSB: early multipath boot script.
Nov 11 16:52:59 systemd[1]: Starting system-lvm2\x2dpvscan.slice.
Nov 11 16:52:59 systemd[1]: Created slice system-lvm2\x2dpvscan.slice.
Nov 11 16:52:59 systemd[1]: Starting LVM2 PV scan on device 8:2...
Nov 11 16:52:59 systemd[1]: Starting Activation of LVM2 logical volumes...
Nov 11 16:52:59 systemd[1]: Started LVM2 PV scan on device 8:2.
Nov 11 16:52:59 lvm[676]: 10 logical volume(s) in volume group "testbox" now active
Nov 11 16:53:00 systemd[1]: Started Activation of LVM2 logical volumes.
Nov 11 16:53:00 systemd[1]: Starting Activation of LVM2 logical volumes...
Nov 11 16:53:01 lvm[854]: 10 logical volume(s) in volume group "testbox" now active
Nov 11 16:53:01 systemd[1]: Started Activation of LVM2 logical volumes.
Nov 11 16:53:01 systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
Nov 11 16:53:01 lvm[928]: 10 logical volume(s) in volume group "testbox" monitored
Nov 11 16:53:01 systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Nov 11 16:53:09 systemd[1]: Starting LSB: multipath daemon...
Nov 11 16:53:10 multipath-tools[1190]: Starting multipath daemon: multipathd.
Nov 11 16:53:10 systemd[1]: Started LSB: multipath daemon.
Nov 11 16:53:10 multipathd[1288]: path checkers start up
Further work:
First I applied the patch discussed in #799781 (shared lock with udev).
This didn't fix things but may have helped.
Then after reviewing #782487, I built sg3-utils v1.42 and installed it,
including sg3-utils-udev. This got the system working.
Hot-plugging the fibre doesn't work properly, but that will have to wait
for another bug.
I reverted the shared lock patch to test whether it is essential.
It appears not: I was able to boot the system fine as long as I
had the sg3-utils packages installed. Nonetheless it seems worth
including, as I did notice that multipath-tools and LVM were
trying to do things at the same time (systemd was waiting for both).
Notice also the dm ordering:
# dmsetup ls |sort -t: -k2,2 -n |column -t
testbox-swap_1 (254:0)
testbox-usr (254:1)
vt04-ld4 (254:2)
vt05-ld3-atoa (254:3)
vt05-ld4-atoa (254:4)
vt05-ld5-atoa (254:5)
vt04-ld4-part1 (254:6)
vt05-ld4-atoa-part1 (254:7)
vt05-ld5-atoa-part1 (254:8)
testbox-var (254:9)
testbox-var+log (254:10)
testbox-tmp (254:11)
testbox-opt (254:12)
testbox-local (254:13)
testbox-srv (254:14)
testbox-data (254:15)
testbox-srv+jenkins (254:16)
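To make the interleaving in the listing above easier to see, each map can be labelled by whether its name starts with the volume group name. This is only a sketch: `dm_order` is a hypothetical helper, matching on a name prefix is a heuristic rather than a real check of the device-mapper target type, and the input is assumed to use the "name (major:minor)" format shown above.

```shell
# Hypothetical helper: read `dmsetup ls` style lines on stdin and print
# "minor kind name" ordered by minor number. A map counts as "lvm" when
# its name starts with the given prefix (a heuristic, not a type check).
dm_order() {
    vg_prefix="$1"
    awk -v p="$vg_prefix" '{
        dev = $2
        gsub(/[()]/, "", dev)      # "(254:3)" -> "254:3"
        split(dev, mm, ":")        # mm[1] = major, mm[2] = minor
        kind = (index($1, p) == 1) ? "lvm" : "other"
        print mm[2], kind, $1
    }' | sort -n
}

# Example use on a live system (not run here):
#   dmsetup ls | dm_order testbox-
```

With the listing above as input, minors 2 to 8 would be labelled "other", sitting between the "lvm" entries at minors 0 to 1 and 9 to 16.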
Requests:
Can we please have sg3-utils v1.42 added to a stable point release?
Also, multipath-tools needs to depend on sg3-utils-udev.
It seems a shame not to include the shared lock patch, as it avoids
a known deadlock and the system still works fine with it included.
-- Package-specific info:
Contents of /etc/multipath.conf:
blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z][[0-9]*]"
    devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
    device {
        vendor MegaRAID
    }
    device {
        vendor APPLE
    }
    device {
        vendor ATA
    }
    device {
        vendor DELL
    }
    device {
        vendor Dell
    }
}
devices {
    device {
        vendor "Promise"
        product "VTrak"
        path_grouping_policy multibus
        getuid_callout "/lib/udev/scsi_id --whitelisted --replace-whitespace --device /dev/%n"
        path_checker readsector0
        path_selector "round-robin 0"
        hardware_handler "0"
        failback immediate
        rr_weight uniform
        rr_min_io 100
        no_path_retry 20
        features "1 queue_if_no_path"
        product_blacklist "VTrak V-LUN"
    }
}
multipaths {
    multipath {
        wwid 22258000155e916fb
        alias vt05-ld3-atoa
    }
    multipath {
        wwid 22268000155a61f7d
        alias vt05-ld4-atoa
    }
    multipath {
        wwid 222e8000155286d15
        alias vt05-ld5-atoa
    }
    multipath {
        wwid 22290000155c7e34a
        alias vt04-ld4
    }
}
-- System Information:
Debian Release: 8.6
APT prefers stable
APT policy: (990, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 3.16.0-4-amd64 (SMP w/12 CPU cores)
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages multipath-tools depends on:
ii  initscripts         2.88dsf-59
ii  kpartx              0.5.0-6+deb8u2
ii  libaio1             0.3.110-1
ii  libc6               2.19-18+deb8u6
ii  libdevmapper1.02.1  2:1.02.90-2.2+deb8u1
ii  libgcc1             1:4.9.2-10
ii  libreadline6        6.3-8+b3
ii  libudev1            215-17+deb8u5
ii  lsb-base            4.1+Debian13+nmu1
ii  udev                215-17+deb8u5
multipath-tools recommends no packages.
Versions of packages multipath-tools suggests:
pn multipath-tools-boot <none>
-- no debconf information