Bug#843953: multipath-tools: LVM on internal disks + multipath-tools = no multipaths

Ritesh Raj Sarraf rrs at debian.org
Fri Nov 11 10:36:44 UTC 2016


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

That's a lot of information. Let's try to cover as much as we can.


On Fri, 2016-11-11 at 17:55 +1100, Vincent McIntyre wrote:
> Package: multipath-tools
> Version: 0.5.0-6+deb8u2
> Severity: important
> 
>    * What led up to the situation?
> 
>    Upgrade working wheezy system to jessie
>    Apply patch to fix multipath segfault, see #751993
> 
>    The system boots off an internal physical disk
>    That disk has one / partition and an LVM partition
>    /usr is one of the LVs on that disk
>    There are two other internal disks, LVM is not used on these.
>    The multipath devices are connected via a qlogic ISP2432-based card,
>    through a FC switch to two Promise VTrak units.
>    LVM is not used on the multipath devices.
> 
> 
>    * What exactly did you do (or not do) that was effective (or
>      ineffective)?
> 
>    Cold-plug FC connection to external storage, boot system.
> 
>    * What was the outcome of this action?
> 
>    multipath -l shows no devices. no related device maps in /dev/mapper.
> 
>    The FC and SCSI layers all worked fine, I see lots of /dev/sdX devices.
> 
>    multipath -l -v3 shows the /dev/sdX devices as blacklisted
>      ...
>      sdd: blacklisted, udev property missing
>      sde: blacklisted, udev property missing
>      sdp: blacklisted, udev property missing
>      ...etc
> 

Okay! I think we fixed this already. Mauricio found this issue a while back,
in #782487.

Since that fix did not make it into Jessie, you've hit this problem.
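
For anyone else seeing "blacklisted, udev property missing", a quick check is
whether udev exports any identifier property for the path device. Below is a
minimal sketch that parses the output of
`udevadm info --query=property --name=/dev/sdX`. The property prefixes in
WWID_PROPERTY are an assumption (they depend on the multipath-tools version
and its property blacklist settings), and the sample values are made up for
illustration.

```python
import re
import subprocess

# Assumed prefixes for identifier properties; adjust for your
# multipath-tools version and configuration.
WWID_PROPERTY = re.compile(r"^(ID_WWN|ID_SCSI_VPD|SCSI_IDENT_)")

def wwid_properties(udev_output):
    """Return the KEY=VALUE lines whose key looks like a WWID property."""
    return [line for line in udev_output.splitlines()
            if WWID_PROPERTY.match(line.strip())]

def query_udev(devnode):
    """Run udevadm for one device node, e.g. /dev/sdd (needs udev installed)."""
    return subprocess.run(
        ["udevadm", "info", "--query=property", "--name=" + devnode],
        check=True, capture_output=True, text=True).stdout

# Hypothetical output for a path device without any identifier property
sample_missing = "DEVNAME=/dev/sdd\nDEVTYPE=disk\nID_BUS=scsi\n"
# Hypothetical output once an identifier property is exported
sample_present = sample_missing + "ID_WWN=0x2225800015500000\n"

print(wwid_properties(sample_missing))
print(wwid_properties(sample_present))
```

On a live system, `wwid_properties(query_udev("/dev/sdd"))` returning an empty
list would be consistent with the blacklisting shown above.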

>    * What outcome did you expect instead?
> 
>    usable multipaths to the configured devices
>    /dev/mapper populated, including kpartx partitions
> 
> Related notes:
> 
>    On some reboots the system log shows multipath timing out. Below is 'sdd'.
>    The timeout occurs 33 seconds after the disk was attached.
>    I was unable to determine the cause of this or reproduce consistently.
> 
>    Nov 11 11:11:53  kernel: sd 1:0:0:4: [sdd] 25769805824 512-byte logical blocks: (13.1 TB/12.0 TiB)
>    Nov 11 11:11:53  kernel: sd 1:0:0:4: [sdd] Write Protect is off
>    Nov 11 11:11:53  kernel: sd 1:0:0:4: [sdd] Mode Sense: 97 00 10 08
>    Nov 11 11:11:53  kernel: sd 1:0:0:4: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
>    Nov 11 11:11:53  kernel:  sdd: sdd1
>    Nov 11 11:11:53  kernel: sd 1:0:0:4: [sdd] Attached SCSI disk
>    ...
>    Nov 11 11:12:25  systemd-udevd[346]: timeout '/sbin/multipath -v0 /dev/sdd'
>    Nov 11 11:12:26  systemd-udevd[346]: timeout: killing '/sbin/multipath -v0 /dev/sdd' [459]
>    Nov 11 11:12:26  systemd-udevd[346]: '/sbin/multipath -v0 /dev/sdd' [459] terminated by signal 9 (Killed)
> 

This is most likely the locking issue that you mentioned in the other bug.
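
The timing quoted above can be checked straight from the log timestamps. A
small sketch, using only the three timestamps from the quoted log (the helper
name is mine, and it assumes both timestamps fall in the same year):

```python
from datetime import datetime

def seconds_between(start, end, fmt="%b %d %H:%M:%S"):
    """Elapsed seconds between two syslog-style timestamps (same year assumed)."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

attached = "Nov 11 11:11:53"   # [sdd] Attached SCSI disk
timed_out = "Nov 11 11:12:25"  # systemd-udevd: timeout '/sbin/multipath ...'
killed = "Nov 11 11:12:26"     # systemd-udevd: timeout: killing ...

print(seconds_between(attached, timed_out))  # 32
print(seconds_between(attached, killed))     # 33
```

So the timeout is logged 32 seconds after the disk was attached, and the
process is killed one second later, which matches the 33 seconds reported.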

>    On some reboots, there was a bad interaction between LVM and multipathd
>    and/or udev. On the console systemd showed it was waiting for tasks to
>    complete for both of these.
>    device-mapper would try to handle the multipath devices before the LVM
>    ones, which sometimes caused the system to fail to boot; it went into
>    emergency mode.
>    I was unable to determine the cause of this or reproduce it consistently.
>    I don't know where the multipath-tools-boot line comes from, that
>    package is not even installed.
> 
>    You can however see the two running contemporaneously
>    # journalctl |egrep -i -e '(multipath|lvm|-udev)'
>    Nov 11 16:52:57 systemd[1]: Starting LVM2 metadata daemon socket.
>    Nov 11 16:52:57 systemd[1]: Listening on LVM2 metadata daemon socket.
>    Nov 11 16:52:57 systemd-udevd[317]: starting version 215
>    Nov 11 16:52:58 systemd[1]: Starting LSB: early multipath boot script...
>    Nov 11 16:52:58 kernel: device-mapper: multipath: version 1.7.0 loaded
>    Nov 11 16:52:59 multipath-tools-boot[633]: Discovering and coalescing multipaths...done.
>    Nov 11 16:52:59 systemd[1]: Started LSB: early multipath boot script.
>    Nov 11 16:52:59 systemd[1]: Starting system-lvm2\x2dpvscan.slice.
>    Nov 11 16:52:59 systemd[1]: Created slice system-lvm2\x2dpvscan.slice.
>    Nov 11 16:52:59 systemd[1]: Starting LVM2 PV scan on device 8:2...
>    Nov 11 16:52:59 systemd[1]: Starting Activation of LVM2 logical volumes...
>    Nov 11 16:52:59 systemd[1]: Started LVM2 PV scan on device 8:2.
>    Nov 11 16:52:59 lvm[676]: 10 logical volume(s) in volume group "testbox" now active
>    Nov 11 16:53:00 systemd[1]: Started Activation of LVM2 logical volumes.
>    Nov 11 16:53:00 systemd[1]: Starting Activation of LVM2 logical volumes...
>    Nov 11 16:53:01 lvm[854]: 10 logical volume(s) in volume group "testbox" now active
>    Nov 11 16:53:01 systemd[1]: Started Activation of LVM2 logical volumes.
>    Nov 11 16:53:01 systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
>    Nov 11 16:53:01 lvm[928]: 10 logical volume(s) in volume group "testbox" monitored
>    Nov 11 16:53:01 systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
>    Nov 11 16:53:09 systemd[1]: Starting LSB: multipath daemon...
>    Nov 11 16:53:10 multipath-tools[1190]: Starting multipath daemon: multipathd.
>    Nov 11 16:53:10 systemd[1]: Started LSB: multipath daemon.
>    Nov 11 16:53:10 multipathd[1288]: path checkers start up
> 
> 
> 
> Further work:
> 
> First I applied the patch discussed in #799781 (shared lock with udev).
> This didn't fix things but may have helped.
> 

"Didn't fix things" as in the maps not appearing. That is because of the
missing VPD data, which is only provided by the newer version of sg3-utils,
available only in Stretch.

> Then after reviewing #782487, I built sg3-utils v1.42 and installed it,
> including sg3-utils-udev. This got the system working.
> hot-plugging the fibre doesn't work properly but that will have to wait
> for another bug.
> 
Okay, great. So you have confirmed it here. :-)

> I reverted the shared lock patch, to test if that is essential.
> It appears not - I was able to boot the system fine as long as I
> had the sg3-utils packages installed. Nonetheless it seems worth
> including it as I did notice that multipath-tools and LVM were
> trying to do things at the same time (systemd was waiting for both).

Yes. Agreed.

> Notice also the dm ordering:
> 
> # dmsetup ls |sort -t: -k2,2 -n |column -t
> testbox-swap_1        (254:0)
> testbox-usr           (254:1)
> vt04-ld4              (254:2)
> vt05-ld3-atoa         (254:3)
> vt05-ld4-atoa         (254:4)
> vt05-ld5-atoa         (254:5)
> vt04-ld4-part1        (254:6)
> vt05-ld4-atoa-part1   (254:7)
> vt05-ld5-atoa-part1   (254:8)
> testbox-var           (254:9)
> testbox-var+log       (254:10)
> testbox-tmp           (254:11)
> testbox-opt           (254:12)
> testbox-local         (254:13)
> testbox-srv           (254:14)
> testbox-data          (254:15)
> testbox-srv+jenkins   (254:16)
> 
> 
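
The interleaving in that listing can be picked out mechanically. A rough
sketch, assuming the `name (major:minor)` layout shown above and assuming
that minor numbers roughly follow creation order (which need not hold if
maps were removed and re-created); the function names and the shortened
sample are mine:

```python
import re

DMSETUP_LINE = re.compile(r"^(\S+)\s+\((\d+):(\d+)\)$")

def parse_dmsetup_ls(text):
    """Return (minor, name) pairs sorted by minor number."""
    maps = []
    for line in text.splitlines():
        m = DMSETUP_LINE.match(line.strip())
        if m:
            maps.append((int(m.group(3)), m.group(1)))
    return sorted(maps)

def interleaved(maps, lvm_prefix):
    """Minors of non-LVM maps that sit between the first and last LVM map."""
    lvm_minors = [minor for minor, name in maps if name.startswith(lvm_prefix)]
    if not lvm_minors:
        return []
    return [minor for minor, name in maps
            if not name.startswith(lvm_prefix)
            and min(lvm_minors) < minor < max(lvm_minors)]

# Shortened excerpt of the listing quoted above
sample = """\
testbox-swap_1        (254:0)
testbox-usr           (254:1)
vt04-ld4              (254:2)
vt04-ld4-part1        (254:6)
testbox-var           (254:9)
"""

maps = parse_dmsetup_ls(sample)
print(interleaved(maps, "testbox-"))  # [2, 6]
```

For the full listing, every vt* map (minors 2 through 8) lands between
testbox-usr and testbox-var, which is the ordering being pointed out.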
> Requests:
> 
> Can we please have sg3-utils v1.42 added to a stable point release?
> Also multipath-tools needs to depend on sg3-utils-udev.
> 

So, as I understand it, for now you've already picked up the fixed versions
from testing for your setup.

For this issue, multipath-tools doesn't really need any code change. The change
is required in sg3-utils. For the stable release, I'm not sure what to do.

1. New upstream versions are not allowed in stable.
2. Backports could cover this case, but I really can't commit right now to when
I can get this done.

> It seems a shame to not include the shared lock patch as it avoids
> a known deadlock and the system still works fine with it included.
> 

Indeed. But, as I mentioned in the other bug report, it was submitted upstream
only very recently. And unless something is committed upstream, I don't pick it
as a fix for Debian Stable.


I'd suggest you pick up the contents of sg3-utils-udev for now. There's nothing
other than the udev rules in that package.


PS: You may also want to start planning for Stretch now. There are many more
changes to multipath in the Stretch version. Having a test setup and reporting
bugs during the development phase helps much more.


> -- Package-specific info:
> Contents of /etc/multipath.conf:
> blacklist {
> 	devnode	"^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
> 	devnode	"^hd[a-z][[0-9]*]"
> 	devnode	"^cciss!c[0-9]d[0-9]*[p[0-9]*]"
> 	device {
> 		vendor MegaRAID
> 	}
> 	device {
> 		vendor APPLE
> 	}
> 	device {
> 		vendor ATA
> 	}
> 	device {
> 		vendor DELL
> 	}
> 	device {
> 		vendor Dell
> 	}
> }
> devices {
> 	device {
> 		vendor			"Promise"
> 		product			"VTrak"
> 		path_grouping_policy	multibus
> 		getuid_callout          "/lib/udev/scsi_id --whitelisted --replace-whitespace --device /dev/%n"
> 		path_checker		readsector0
> 		path_selector		"round-robin 0"
> 		hardware_handler	"0"
> 		failback		immediate
> 		rr_weight		uniform
> 		rr_min_io		100
> 		no_path_retry		20
> 		features		"1 queue_if_no_path"
> 		product_blacklist	"VTrak V-LUN"
> 	}
> }
> multipaths {
>    multipath {
>      wwid  22258000155e916fb
>      alias vt05-ld3-atoa
>    }
>    multipath {
>      wwid  22268000155a61f7d
>      alias vt05-ld4-atoa
>    }
>    multipath {
>      wwid  222e8000155286d15
>      alias vt05-ld5-atoa
>    }
>    multipath {
>      wwid  22290000155c7e34a
>      alias vt04-ld4
>    }
> }
> 
> 
> -- System Information:
> Debian Release: 8.6
>   APT prefers stable
>   APT policy: (990, 'stable')
> Architecture: amd64 (x86_64)
> 
> Kernel: Linux 3.16.0-4-amd64 (SMP w/12 CPU cores)
> Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8)
> Shell: /bin/sh linked to /bin/dash
> Init: systemd (via /run/systemd/system)
> 
> Versions of packages multipath-tools depends on:
> ii  initscripts         2.88dsf-59
> ii  kpartx              0.5.0-6+deb8u2
> ii  libaio1             0.3.110-1
> ii  libc6               2.19-18+deb8u6
> ii  libdevmapper1.02.1  2:1.02.90-2.2+deb8u1
> ii  libgcc1             1:4.9.2-10
> ii  libreadline6        6.3-8+b3
> ii  libudev1            215-17+deb8u5
> ii  lsb-base            4.1+Debian13+nmu1
> ii  udev                215-17+deb8u5
> 
> multipath-tools recommends no packages.
> 
> Versions of packages multipath-tools suggests:
> pn  multipath-tools-boot  <none>
> 
> -- no debconf information
> 
- -- 
Ritesh Raj Sarraf | http://people.debian.org/~rrs
Debian - The Universal Operating System
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEQCVDstmIVAB/Yn02pjpYo/LhdWkFAlglnzwACgkQpjpYo/Lh
dWljLRAAqbnOduUJuUy5GSmreLtPWgAPT3/f4xwvp2c9UnwrNlcYVQ+APhUDbVYV
gbuAiVCa8mNbpXeM4Dto9jr2+3Vi4aESuf4nbgJ62X8biMSR06vkL8CGZDzGp4LM
j3gh51mwi/xfU4VsTsQP46HXXdV91CWfgfJ3dMqX7egOSZlv77zdg+UVNHSQYCtO
h4Xne0T00ceJWiMVrgkMTOV8xyLaAXqPG0ddJc0epVtmx7N+F39l8hdpQ6Za3McY
K3bKRF/hxBN1IxhBPFgJ8RbbimZML/G+1V7Xggc3f8AfADDxA/rKKEzciiFRbFWZ
ImqDOSOFKRJFxAsbsS7W5Tlj1Da1BAlRWYaQT+QHpojK+daCihm6Bzsf0jXLmmfc
liwRm8XtWTo6Xofzsv3nX6tNc0lkT7VDxDJMVf/SPTa47Bf4uufLRU5QdAWQ3GUg
zuaVf07KgDCg2CDO4qpd18T3ltsHrUzny+RjRpkO4k54RPaApT3fLCLRUK8sDDEM
1HK1FRryLDzt0bh6cE/0ZF87JJiL9fuW+3bvbpd58gevuvVer2i1LmTG4rvAE7Hj
9KSI59Zg/fJESAaZOL2mZ0EgzDWn/TdYfvew0nl4zqJBRDRF3tDqcCyJzVNlTtLr
E/JkXZDLsfNkGULZEkXFdRzW8t0BVAJd3ex+/0KXq2hsoh28bP8=
=qv8n
-----END PGP SIGNATURE-----


