Bug#623613: Removing SAN mapping spawn uninterruptible process

Ritesh Raj Sarraf rrs at debian.org
Wed Apr 27 17:59:40 UTC 2011


On Wed, Apr 27, 2011 at 5:01 PM, Laurent Bigonville <bigon at debian.org> wrote:
>> It is already packaged in scsitools. Can you try that script?
>
> I am probably mixing up several issues here.
>
> 1) In the version in stable, when all paths are deleted, multipath -ll
> still lists the mapping; with the version from unstable, the mapping is
> removed automatically. This created some confusion in my mind, as I had
> read that the mapping should be removed automatically when all the
> paths are deleted.
>

Yes. The flush_on_last_del feature was added recently. Can you check
whether it is active by default?
(Running multipath -v3 should show you all the applied options.)
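If it turns out not to be on by default, it can be forced in
/etc/multipath.conf; a minimal sketch (the option name is real, the
surrounding file content is only illustrative), followed by a
multipathd reconfigure so the daemon picks it up:

    defaults {
            # drop the multipath map as soon as its last path is deleted
            flush_on_last_del yes
    }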

> 2) As soon as the mapping on the SAN is removed,
>
> all the paths become faulty:
>
> cc_fleet_otf_test_4_1 (3600a0b80004725120000152a4d5cdd9e) dm-0
> IBM,1815      FAStT size=500G features='0' hwhandler='1 rdac' wp=rw
> |-+- policy='round-robin 0' prio=0 status=active
> | |- 0:0:1:1 sdb 8:16 active faulty running
> | `- 1:0:1:1 sdd 8:48 active faulty running
> `-+- policy='round-robin 0' prio=0 status=enabled
>  |- 0:0:0:1 sda 8:0  active faulty running
>  `- 1:0:0:1 sdc 8:32 active faulty running
>
> And /var/log/daemon.log gets spammed with:
>
> multipathd: sda: rdac checker reports path is ghost
> multipathd: checker failed path 8:0 in map cc_fleet_otf_test_4_1
> multipathd: cc_fleet_otf_test_4_1: remaining active paths: 3
> multipathd: dm-0: add map (uevent)
> multipathd: dm-0: devmap already registered
> multipathd: 8:16: mark as failed
> multipathd: cc_fleet_otf_test_4_1: remaining active paths: 2
> multipathd: 8:48: mark as failed
> multipathd: cc_fleet_otf_test_4_1: remaining active paths: 1
> multipathd: 8:32: mark as failed
> multipathd: cc_fleet_otf_test_4_1: remaining active paths: 0
> multipathd: sdc: rdac checker reports path is ghost
> multipathd: sdd: rdac checker reports path is up
> multipathd: sda: rdac checker reports path is ghost
> multipathd: sdb: rdac checker reports path is up
> multipathd: sdc: rdac checker reports path is ghost
> multipathd: sdd: rdac checker reports path is up
> multipathd: sda: rdac checker reports path is ghost
> multipathd: sdb: rdac checker reports path is up
> multipathd: sdc: rdac checker reports path is ghost
> multipathd: sdd: rdac checker reports path is up
> multipathd: sda: rdac checker reports path is ghost
> multipathd: sdb: rdac checker reports path is up
> multipathd: sdc: rdac checker reports path is ghost
> multipathd: sdd: rdac checker reports path is up
>
> so it seems that the path status is flapping (the up and ghost paths seem
> to be the same as when the LUN was still mapped); is this expected?
>
>

That does look odd. The path checker reports that sdd and sdb are up,
yet the number of active paths still drops to 0.
At this point, if you run multipath -v3, does the status change to
what the path checker is reporting?
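For cross-checking, the daemon's own view of the paths can be dumped as
well; a quick sketch (the exact output format varies between versions):

    # per-path state as seen by the running daemon
    multipathd -k"show paths"
    # topology as seen by the multipath tool, for this one map
    multipath -ll cc_fleet_otf_test_4_1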

> 3) The whole problem I first described here could be due to the
> queue_if_no_path feature and the fact that some udev rules call kpartx
> and blkid, because when I issue dmsetup message cc_fleet_otf_test_4_1 0 fail_if_no_path,
> the processes that were stuck exit and I can then remove the previously
> stuck mapping.

queue_if_no_path is used to ensure that applications don't fail when
all paths go down (commonly seen during target cluster failovers, where
there is a window in which all paths are unavailable). All it does is
queue the I/O, so the affected processes block (notice in the ps output
that they will all be in the 'D', uninterruptible sleep, state).
Now, _if_ the udev rules touch the device-mapper devices at that moment,
yes, you will see all of those processes stuck.
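For context, the same behaviour can be bounded in the configuration
instead of being toggled at run time with dmsetup; a sketch, with an
illustrative retry count:

    defaults {
            # queue I/O for up to 12 checker intervals after the last
            # path fails, then return errors instead of queueing forever
            no_path_retry 12
    }

    # run-time equivalents on an existing map:
    #   dmsetup message cc_fleet_otf_test_4_1 0 fail_if_no_path
    #   dmsetup message cc_fleet_otf_test_4_1 0 queue_if_no_path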

When those processes are stuck, can you capture a ps output to see what
those commands (kpartx/blkid) are trying to access?
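Something along these lines should be enough (a sketch; <pid> stands
for whichever stuck kpartx/blkid process you find):

    # uninterruptible (D state) processes and their kernel wait channel
    ps -eo pid,stat,wchan:32,args | awk '$2 ~ /^D/'
    # which device nodes the stuck process has open
    ls -l /proc/<pid>/fd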

-- 
Ritesh Raj Sarraf
RESEARCHUT - http://www.researchut.com
"Necessity is the mother of invention."




