Bug#740701: multipath-tools: mkfs fails "Add. Sense: Incompatible medium installed"
Hans van Kranenburg
hans.van.kranenburg at mendix.com
Mon Jun 23 22:41:54 UTC 2014
On 06/23/2014 06:31 PM, Hans van Kranenburg wrote:
> On 06/23/2014 01:30 AM, Hans van Kranenburg wrote:
>>
>> If there's no obvious way to be found to trigger the same error in the
>> test environment, I think I'm going to propose to trigger the same again
>> while having the test physical server attached to the production luns.
>> From the past occurrence, I know that the only thing that breaks is
>> the storage connection on the physical server that executes the UNMAP.
>> It's still not the most reassuring choice, but a kind of a calculated
>> risk.
>>
>> If that's possible I can do a couple of tcpdumps on the iscsi and
>> blktrace dumps to capture what's going on and post them here. Doing so
>> will prove whether the SCSI error was actually being sent by the NetApp
>> device or not.
>
> And that's what I just did, together with a colleague of mine. On one
> lun, the NetApp box accepts unmap, on another lun it throws up with
> Incompatible Medium Installed. All other iSCSI connections from other
> physical servers to the same production lun are not impacted, only the
> connection to this server.
>
> [...]
For netapp-linux-community folks: my previous mail is still in the
moderation queue, but you can also read it in the Debian bug report,
including interesting tcpdump attachments with iSCSI traffic captured
while the errors occur: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=740701#87
Ok, for the sake of completeness, I installed the 3.14 kernel from
wheezy-backports (linux-image-3.14-0.bpo.1-amd64 3.14.7-1~bpo70+1) and
reran the test, which gives exactly the same results. Doing unmap on
the test lun succeeds; doing unmap on the other lun results in the same
behaviour and the same errors, in slightly different formatting than
when using the 3.2 kernel:
[...]
Jun 23 23:29:51 jolteon kernel: [ 678.219033] sd 9:0:0:0: [sdl] Unhandled sense code
Jun 23 23:29:51 jolteon kernel: [ 678.219142] sd 9:0:0:0: [sdl]
Jun 23 23:29:51 jolteon kernel: [ 678.219234] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jun 23 23:29:51 jolteon kernel: [ 678.219331] sd 9:0:0:0: [sdl]
Jun 23 23:29:51 jolteon kernel: [ 678.219423] Sense Key : Medium Error [current]
Jun 23 23:29:51 jolteon kernel: [ 678.219653] sd 9:0:0:0: [sdl]
Jun 23 23:29:51 jolteon kernel: [ 678.219753] Add. Sense: Incompatible medium installed
Jun 23 23:29:51 jolteon kernel: [ 678.219926] sd 9:0:0:0: [sdl] CDB:
Jun 23 23:29:51 jolteon kernel: [ 678.220019] Unmap/Read sub-channel: 42 00 00 00 00 00 00 00 18 00
Jun 23 23:29:51 jolteon kernel: [ 678.220946] device-mapper: multipath: Failing path 8:176.
[...]
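For readers who want to interpret the CDB bytes in the log above, here is a
minimal Python sketch that decodes them according to the 10-byte UNMAP CDB
layout from SBC-3. The function name decode_unmap_cdb is made up for this
illustration; it is not part of any kernel or multipath tooling. Opcode 0x42
is shared between UNMAP (block devices) and READ SUB-CHANNEL (MMC devices),
which is why the kernel prints both names.

```python
import struct


def decode_unmap_cdb(hex_bytes: str) -> dict:
    """Decode a 10-byte UNMAP CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(hex_bytes)  # fromhex ignores the spaces between bytes
    if len(cdb) != 10 or cdb[0] != 0x42:
        raise ValueError("not a 10-byte CDB with opcode 0x42")

    # Bytes 7-8 hold the big-endian PARAMETER LIST LENGTH.
    (param_len,) = struct.unpack(">H", cdb[7:9])

    # The UNMAP parameter list is an 8-byte header followed by
    # 16-byte block descriptors (LBA + number of blocks + reserved).
    header_len, descriptor_len = 8, 16
    descriptors = max(param_len - header_len, 0) // descriptor_len

    return {
        "opcode": cdb[0],
        "anchor": bool(cdb[1] & 0x01),      # ANCHOR bit, byte 1 bit 0
        "group_number": cdb[6] & 0x1F,      # low 5 bits of byte 6
        "parameter_list_length": param_len,
        "block_descriptors": descriptors,
    }


if __name__ == "__main__":
    print(decode_unmap_cdb("42 00 00 00 00 00 00 00 18 00"))
```

For the CDB in the log this gives a parameter list length of 24 bytes, i.e.
the header plus a single block descriptor, so the failing command is an
ordinary UNMAP of one LBA range.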
By the way, the first message on this Debian bug report, from Bill
MacAllister, already listed the output of a very recent Linux kernel
when using the test case 'mkfs on jessie'.
That concludes the discussion about older or newer Linux kernels. The
real problem here is the NetApp, which returns SCSI errors when UNMAP
commands are issued to it.
Questions left:
- Is it intended that Linux kernel multipathing fails an I/O
operation instead of retrying when it receives the combination of
sense key Medium Error and additional sense Incompatible Medium Installed?
- Now I'm left with my broken NetApp, and I'd like to start using
UNMAP on it... Any comments from NetApp people reading this? There must
be some reason why this is happening only on this specific lun, and
not on the test lun or on several of the other NetApp filers we use.
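For anyone comparing a working and a failing lun, it may help to first check
what the Linux host itself believes about discard support on each disk. The
following is a rough Python sketch, assuming the usual sysfs attributes
(queue/discard_max_bytes, queue/discard_granularity and the scsi_disk
provisioning_mode file). The function name discard_settings and the
sysfs_root parameter are invented for this example. It only reports the
host-side view and does not explain why the filer rejects the command.

```python
from pathlib import Path


def discard_settings(disk: str, sysfs_root: str = "/sys") -> dict:
    """Collect the kernel's discard-related settings for one disk, e.g. 'sdl'."""
    base = Path(sysfs_root) / "block" / disk
    result = {}

    # Block-layer limits; a discard_max_bytes of 0 means discard is disabled.
    for name in ("discard_max_bytes", "discard_granularity"):
        attr = base / "queue" / name
        result[name] = int(attr.read_text()) if attr.exists() else None

    # The sd driver exposes the chosen provisioning mode (for example
    # "unmap", "writesame_16" or "disabled") under the scsi_disk device.
    modes = list((base / "device" / "scsi_disk").glob("*/provisioning_mode"))
    result["provisioning_mode"] = modes[0].read_text().strip() if modes else None
    return result


if __name__ == "__main__":
    # 'sdl' is the path device seen in the log; missing attributes yield None.
    print(discard_settings("sdl"))
```

Running this for the path devices behind both the test lun and the failing
lun shows whether the host treats them differently before any UNMAP is sent.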
--
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenburg at mendix.com | www.mendix.com