Bug#843953: multipath-tools: LVM on internal disks + multipath-tools = no multipaths
Ritesh Raj Sarraf
rrs at debian.org
Fri Nov 11 10:36:44 UTC 2016
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
That's a lot of information. Let's try to cover as much of it as we can.
On Fri, 2016-11-11 at 17:55 +1100, Vincent McIntyre wrote:
> Package: multipath-tools
> Version: 0.5.0-6+deb8u2
> Severity: important
>
> * What led up to the situation?
>
> Upgrade working wheezy system to jessie
> Apply patch to fix multipath segfault, see #751993
>
> The system boots off an internal physical disk
> That disk has one / partition and an LVM partition
> /usr is one of the LVs on that disk
> There are two other internal disks, LVM is not used on these.
> The multipath devices are connected via a qlogic ISP2432-based card,
> through a FC switch to two Promise VTrak units.
> LVM is not used on the multipath devices.
>
>
> * What exactly did you do (or not do) that was effective (or
> ineffective)?
>
> Cold-plug FC connection to external storage, boot system.
>
> * What was the outcome of this action?
>
> multipath -l shows no devices. no related device maps in /dev/mapper.
>
> The FC and SCSI layers all worked fine, I see lots of /dev/sdX devices.
>
> multipath -l -v3 shows the /dev/sdX devices as blacklisted
> ...
> sdd: blacklisted, udev property missing
> sde: blacklisted, udev property missing
> sdp: blacklisted, udev property missing
> ...etc
>
Okay! I think we fixed this somewhere.
Ah, yes. Mauricio found this issue a while back, in #782487.
Since that fix did not make it into Jessie, you have hit this problem.
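To check this on an affected box, look at what udev exports for one of the
paths. Below is a minimal sketch, assuming the text comes from
`udevadm info --query=property --name=/dev/sdd`. The pattern is only a
placeholder of mine: the property names multipath looks for differ between
multipath-tools versions, so check your version's default before relying on it.
The sample values are made up and are not from your system.

```python
import re

# Assumption: placeholder pattern, not confirmed for 0.5.0-6+deb8u2.
ASSUMED_PROPERTY_PATTERN = r"(ID_WWN|SCSI_IDENT_.*)"


def parse_udev_properties(text):
    """Parse KEY=VALUE lines into a dict, ignoring anything else."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            props[key] = value
    return props


def has_identifying_property(text, pattern=ASSUMED_PROPERTY_PATTERN):
    """Return True if any property name fully matches the pattern."""
    regex = re.compile(pattern)
    return any(regex.fullmatch(key) for key in parse_udev_properties(text))


# Hypothetical samples: one path without the property, one with it.
sample_without = "DEVNAME=/dev/sdd\nDEVTYPE=disk\nID_BUS=scsi\n"
sample_with = sample_without + "ID_WWN=0xEXAMPLE\n"

print(has_identifying_property(sample_without))  # False
print(has_identifying_property(sample_with))     # True
```

If the first case is what you see for every sdX path, the "udev property
missing" lines in your -v3 output are what I would expect.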
> * What outcome did you expect instead?
>
> usable multipaths to the configured devices
> /dev/mapper populated, including kpartx partitions
>
> Related notes:
>
> On some reboots the system log shows multipath timing out. Below is 'sdd'.
> The timeout occurs 33 seconds after the disk was attached.
> I was unable to determine the cause of this or reproduce consistently.
>
> Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] 25769805824 512-byte logical
> blocks: (13.1 TB/12.0 TiB)
> Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Write Protect is off
> Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Mode Sense: 97 00 10 08
> Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Write cache: enabled, read
> cache: enabled, supports DPO and FUA
> Nov 11 11:11:53 kernel: sdd: sdd1
> Nov 11 11:11:53 kernel: sd 1:0:0:4: [sdd] Attached SCSI disk
> ...
> Nov 11 11:12:25 systemd-udevd[346]: timeout '/sbin/multipath -v0 /dev/sdd'
> Nov 11 11:12:26 systemd-udevd[346]: timeout: killing '/sbin/multipath -v0 /dev/sdd' [459]
> Nov 11 11:12:26 systemd-udevd[346]: '/sbin/multipath -v0 /dev/sdd' [459] terminated by signal 9 (Killed)
>
This is most likely the locking issue which you mentioned in the other bug.
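The 33 seconds can be read straight off the two timestamps in the log you
quoted. A small sketch, assuming syslog-style timestamps and taking the year
(2016) from the date of this report, since the log lines do not carry one:

```python
from datetime import datetime


def seconds_between(first, second, year=2016):
    """Seconds between two syslog-style stamps such as 'Nov 11 11:11:53'."""
    fmt = "%Y %b %d %H:%M:%S"
    start = datetime.strptime(f"{year} {first}", fmt)
    end = datetime.strptime(f"{year} {second}", fmt)
    return int((end - start).total_seconds())


attached = "Nov 11 11:11:53"  # "[sdd] Attached SCSI disk"
killed = "Nov 11 11:12:26"    # udev kills '/sbin/multipath -v0 /dev/sdd'

print(seconds_between(attached, killed))  # 33
```

The same helper can be pointed at other boots to see whether the gap is
constant, which would point at a fixed timeout rather than a slow device.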
> On some reboots, there was a bad interaction between LVM and multipathd
> and/or udev. On the console systemd showed it was waiting for tasks to
> complete for both of these.
> device-mapper would try to handle the multipath devices before the LVM
> ones, which sometimes caused the system to fail to boot; it went into
> emergency mode.
> I was unable to determine the cause of this or reproduce it consistently.
> I don't know where the multipath-tools-boot line comes from, that
> package is not even installed.
>
> You can however see the two running contemporaneously
> # journalctl |egrep -i -e '(multipath|lvm|-udev)'
> Nov 11 16:52:57 systemd[1]: Starting LVM2 metadata daemon socket.
> Nov 11 16:52:57 systemd[1]: Listening on LVM2 metadata daemon socket.
> Nov 11 16:52:57 systemd-udevd[317]: starting version 215
> Nov 11 16:52:58 systemd[1]: Starting LSB: early multipath boot script...
> Nov 11 16:52:58 kernel: device-mapper: multipath: version 1.7.0 loaded
> Nov 11 16:52:59 multipath-tools-boot[633]: Discovering and coalescing
> multipaths...done.
> Nov 11 16:52:59 systemd[1]: Started LSB: early multipath boot script.
> Nov 11 16:52:59 systemd[1]: Starting system-lvm2\x2dpvscan.slice.
> Nov 11 16:52:59 systemd[1]: Created slice system-lvm2\x2dpvscan.slice.
> Nov 11 16:52:59 systemd[1]: Starting LVM2 PV scan on device 8:2...
> Nov 11 16:52:59 systemd[1]: Starting Activation of LVM2 logical volumes...
> Nov 11 16:52:59 systemd[1]: Started LVM2 PV scan on device 8:2.
> Nov 11 16:52:59 lvm[676]: 10 logical volume(s) in volume group "testbox"
> now active
> Nov 11 16:53:00 systemd[1]: Started Activation of LVM2 logical volumes.
> Nov 11 16:53:00 systemd[1]: Starting Activation of LVM2 logical volumes...
> Nov 11 16:53:01 lvm[854]: 10 logical volume(s) in volume group "testbox"
> now active
> Nov 11 16:53:01 systemd[1]: Started Activation of LVM2 logical volumes.
> Nov 11 16:53:01 systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots
> etc. using dmeventd or progress polling...
> Nov 11 16:53:01 lvm[928]: 10 logical volume(s) in volume group "testbox"
> monitored
> Nov 11 16:53:01 systemd[1]: Started Monitoring of LVM2 mirrors, snapshots
> etc. using dmeventd or progress polling.
> Nov 11 16:53:09 systemd[1]: Starting LSB: multipath daemon...
> Nov 11 16:53:10 multipath-tools[1190]: Starting multipath daemon:
> multipathd.
> Nov 11 16:53:10 systemd[1]: Started LSB: multipath daemon.
> Nov 11 16:53:10 multipathd[1288]: path checkers start up
>
>
>
> Further work:
>
> First I applied the patch discussed in #799781 (shared lock with udev).
> This didn't fix things but may have helped.
>
"Didn't fix things" as in the maps not appearing. That is because of the
missing VPD data, which is only provided by the newer version of sg3-utils,
available only in Stretch.
> Then after reviewing #782487, I built sg3-utils v1.42 and installed it,
> including sg3-utils-udev. This got the system working.
> hot-plugging the fibre doesn't work properly but that will have to wait
> for another bug.
>
Okay, great. So you have confirmed it here. :-)
> I reverted the shared lock patch, to test if that is essential.
> It appears not - I was able to boot the system fine as long as I
> had the sg3-utils packages installed. Nonetheless it seems worth
> including it as I did notice that multipath-tools and LVM were
> trying to do things at the same time (systemd was waiting for both).
Yes. Agreed.
> Notice also the dm ordering:
>
> # dmsetup ls |sort -t: -k2,2 -n |column -t
> testbox-swap_1 (254:0)
> testbox-usr (254:1)
> vt04-ld4 (254:2)
> vt05-ld3-atoa (254:3)
> vt05-ld4-atoa (254:4)
> vt05-ld5-atoa (254:5)
> vt04-ld4-part1 (254:6)
> vt05-ld4-atoa-part1 (254:7)
> vt05-ld5-atoa-part1 (254:8)
> testbox-var (254:9)
> testbox-var+log (254:10)
> testbox-tmp (254:11)
> testbox-opt (254:12)
> testbox-local (254:13)
> testbox-srv (254:14)
> testbox-data (254:15)
> testbox-srv+jenkins (254:16)
>
>
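The interleaving in that listing can be pulled out mechanically. A rough
sketch, assuming that the minor number reflects creation order and that the
LVM maps are the ones prefixed with the volume group name ("testbox-", taken
from the lvm lines in your journal); both are my assumptions, not something
dmsetup reports:

```python
import re

DM_LINE = re.compile(r"^(\S+)\s+\((\d+):(\d+)\)$")

DMSETUP_LS = """\
testbox-swap_1 (254:0)
testbox-usr (254:1)
vt04-ld4 (254:2)
vt05-ld3-atoa (254:3)
vt05-ld4-atoa (254:4)
vt05-ld5-atoa (254:5)
vt04-ld4-part1 (254:6)
vt05-ld4-atoa-part1 (254:7)
vt05-ld5-atoa-part1 (254:8)
testbox-var (254:9)
testbox-var+log (254:10)
testbox-tmp (254:11)
testbox-opt (254:12)
testbox-local (254:13)
testbox-srv (254:14)
testbox-data (254:15)
testbox-srv+jenkins (254:16)
"""


def parse_dmsetup_ls(text):
    """Return (minor, name) tuples sorted by minor number."""
    maps = []
    for line in text.splitlines():
        match = DM_LINE.match(line.strip())
        if match:
            maps.append((int(match.group(3)), match.group(1)))
    return sorted(maps)


def lvm_maps_after_first_foreign(maps, vg_prefix):
    """LVM maps whose minor is higher than the first non-LVM map."""
    foreign = [minor for minor, name in maps if not name.startswith(vg_prefix)]
    if not foreign:
        return []
    first = min(foreign)
    return [name for minor, name in maps
            if name.startswith(vg_prefix) and minor > first]


late = lvm_maps_after_first_foreign(parse_dmsetup_ls(DMSETUP_LS), "testbox-")
print(len(late), late[0])  # 8 testbox-var
```

On your listing, only swap_1 and usr come up before the multipath maps; the
other eight LVs land after them, which fits with the two running at the same
time during boot.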
> Requests:
>
> Can we please have sg3-utils v1.42 added to a stable point release?
> Also multipath-tools needs to depend on sg3-utils-udev.
>
So, as I understand it, you have already picked up the fixed versions from
testing for your setup.
This issue doesn't need any code change in multipath-tools. The change is
required in sg3-utils. For the stable release, I'm not sure what to do:
1. New releases are not allowed in stable.
2. Backports could cover this case, but I really can't commit right now to
when I can get this done.
> It seems a shame to not include the shared lock patch as it avoids
> a known deadlock and the system still works fine with it included.
>
Indeed. But, as I mentioned in the other bug report, it was submitted upstream
only very recently. And unless something is committed upstream, I don't pick it
as a fix for Debian Stable.
I'd suggest you pick the contents of sg3-utils-udev for now. There's nothing in
that package other than the udev rules.
PS: You may also want to start making plans for Stretch now. There are many
more changes to multipath in the Stretch version. Having a test setup and
reporting bugs during the development phase helps much more.
> -- Package-specific info:
> Contents of /etc/multipath.conf:
> blacklist {
> devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
> devnode "^hd[a-z][[0-9]*]"
> devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
> device {
> vendor MegaRAID
> }
> device {
> vendor APPLE
> }
> device {
> vendor ATA
> }
> device {
> vendor DELL
> }
> device {
> vendor Dell
> }
> }
> devices {
> device {
> vendor "Promise"
> product "VTrak"
> path_grouping_policy multibus
> getuid_callout "/lib/udev/scsi_id --whitelisted --replace-whitespace --device /dev/%n"
> path_checker readsector0
> path_selector "round-robin 0"
> hardware_handler "0"
> failback immediate
> rr_weight uniform
> rr_min_io 100
> no_path_retry 20
> features "1 queue_if_no_path"
> product_blacklist "VTrak V-LUN"
> }
> }
> multipaths {
> multipath {
> wwid 22258000155e916fb
> alias vt05-ld3-atoa
> }
> multipath {
> wwid 22268000155a61f7d
> alias vt05-ld4-atoa
> }
> multipath {
> wwid 222e8000155286d15
> alias vt05-ld5-atoa
> }
> multipath {
> wwid 22290000155c7e34a
> alias vt04-ld4
> }
> }
>
>
> -- System Information:
> Debian Release: 8.6
> APT prefers stable
> APT policy: (990, 'stable')
> Architecture: amd64 (x86_64)
>
> Kernel: Linux 3.16.0-4-amd64 (SMP w/12 CPU cores)
> Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8)
> Shell: /bin/sh linked to /bin/dash
> Init: systemd (via /run/systemd/system)
>
> Versions of packages multipath-tools depends on:
> ii initscripts 2.88dsf-59
> ii kpartx 0.5.0-6+deb8u2
> ii libaio1 0.3.110-1
> ii libc6 2.19-18+deb8u6
> ii libdevmapper1.02.1 2:1.02.90-2.2+deb8u1
> ii libgcc1 1:4.9.2-10
> ii libreadline6 6.3-8+b3
> ii libudev1 215-17+deb8u5
> ii lsb-base 4.1+Debian13+nmu1
> ii udev 215-17+deb8u5
>
> multipath-tools recommends no packages.
>
> Versions of packages multipath-tools suggests:
> pn multipath-tools-boot <none>
>
> -- no debconf information
>
- --
Ritesh Raj Sarraf | http://people.debian.org/~rrs
Debian - The Universal Operating System
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCgAdFiEEQCVDstmIVAB/Yn02pjpYo/LhdWkFAlglnzwACgkQpjpYo/Lh
dWljLRAAqbnOduUJuUy5GSmreLtPWgAPT3/f4xwvp2c9UnwrNlcYVQ+APhUDbVYV
gbuAiVCa8mNbpXeM4Dto9jr2+3Vi4aESuf4nbgJ62X8biMSR06vkL8CGZDzGp4LM
j3gh51mwi/xfU4VsTsQP46HXXdV91CWfgfJ3dMqX7egOSZlv77zdg+UVNHSQYCtO
h4Xne0T00ceJWiMVrgkMTOV8xyLaAXqPG0ddJc0epVtmx7N+F39l8hdpQ6Za3McY
K3bKRF/hxBN1IxhBPFgJ8RbbimZML/G+1V7Xggc3f8AfADDxA/rKKEzciiFRbFWZ
ImqDOSOFKRJFxAsbsS7W5Tlj1Da1BAlRWYaQT+QHpojK+daCihm6Bzsf0jXLmmfc
liwRm8XtWTo6Xofzsv3nX6tNc0lkT7VDxDJMVf/SPTa47Bf4uufLRU5QdAWQ3GUg
zuaVf07KgDCg2CDO4qpd18T3ltsHrUzny+RjRpkO4k54RPaApT3fLCLRUK8sDDEM
1HK1FRryLDzt0bh6cE/0ZF87JJiL9fuW+3bvbpd58gevuvVer2i1LmTG4rvAE7Hj
9KSI59Zg/fJESAaZOL2mZ0EgzDWn/TdYfvew0nl4zqJBRDRF3tDqcCyJzVNlTtLr
E/JkXZDLsfNkGULZEkXFdRzW8t0BVAJd3ex+/0KXq2hsoh28bP8=
=qv8n
-----END PGP SIGNATURE-----