Bug#740701: multipath-tools: mkfs fails "Add. Sense: Incompatible medium installed"

Hans van Kranenburg hans.van.kranenburg at mendix.com
Tue Jun 17 17:43:59 UTC 2014


Hi,

On 06/17/2014 07:24 AM, Ritesh Raj Sarraf wrote:
>
> Okay!! Let me check in the lab. I will try to reproduce it. In case you
> do not hear back from me, please feel free to ping back.
>
> Meanwhile, can you run sg_inq on the SCSI device ??

Sure:

# sg_inq /dev/sg6
standard INQUIRY:
   PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]
   [AERC=0]  [TrmTsk=0]  NormACA=1  HiSUP=1  Resp_data_format=2
   SCCS=0  ACC=0  TPGS=0  3PC=1  Protect=0  BQue=0
   EncServ=0  MultiP=1 (VS=0)  [MChngr=0]  [ACKREQQ=0]  Addr16=0
   [RelAdr=0]  WBus16=0  Sync=0  Linked=0  [TranDis=0]  CmdQue=1
   [SPI: Clocking=0x0  QAS=0  IUS=0]
     length=117 (0x75)   Peripheral device type: disk
  Vendor identification: NETAPP
  Product identification: LUN
  Product revision level: 811a
  Unit serial number: BWr95?E2aJc9

Besides this, I'm trying to isolate a test case that is as small as 
possible to reproduce this behaviour, using an emptied server and only 
giving this server access to a small test lun.

Test case 1:

No multipath or anything else in between: operate directly on the lun
(well, on one of its four paths). It's a small 10GB lun.

/mnt 0-# mkfs.ext4 -E nodiscard /dev/sdf
[...]
/mnt 0-# mkdir discard
/mnt 0-# mount /dev/sdf discard/
/mnt 0-# cd discard/
/mnt/discard 0-# fstrim -v -o 0 -l 128MB ./
./: 0 bytes were trimmed

Ok, that went fine. There's no data yet, so this was expected. Also, no
errors in dmesg. Let's create some random data on the lun, remove the
file and fstrim again:

/mnt/discard 0-# dd if=/dev/urandom of=bla bs=1028476 count=128
128+0 records in
128+0 records out
131644928 bytes (132 MB) copied, 14.7496 s, 8.9 MB/s

/mnt/discard 0-# fstrim -v -o 0 -l 256MB ./
./: 117063680 bytes were trimmed

So far so good.
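
To put fstrim results like the ones above in context, it can help to
record what the kernel reports as discard limits for the device under
test. The following is only a minimal sketch: the function name
show_discard_limits and the SYSFS_ROOT override are made up for
illustration, and it only reads the discard_granularity and
discard_max_bytes attributes under /sys/block/<dev>/queue.

```shell
#!/bin/sh
# Sketch: print the discard limits the kernel exposes for one block device.
# SYSFS_ROOT is a hypothetical override so the function can be tried against
# a fake directory tree; on a real system it defaults to /sys.
show_discard_limits() {
    dev="$1"
    queue="${SYSFS_ROOT:-/sys}/block/$dev/queue"
    for attr in discard_granularity discard_max_bytes; do
        if [ -r "$queue/$attr" ]; then
            # Attribute exists: print its current value.
            printf '%s %s=%s\n' "$dev" "$attr" "$(cat "$queue/$attr")"
        else
            # Attribute missing, e.g. unknown device name.
            printf '%s %s=unavailable\n' "$dev" "$attr"
        fi
    done
}
```

Calling it as "show_discard_limits sdf" would cover the path used in
test case 1. A discard_max_bytes of 0 means the kernel does not consider
the device capable of discard.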

The next test cases will work towards a setup identical to the one in
which the issue occurred yesterday: a striped lvm logical volume on top
of encryption and multipath... I hope that somewhere in between it will
break in a way that gives a clear pointer to where the issue seen
yesterday originates.

The way we use netapp with linux might sound a bit unusual, but it
works great in practice:

                xvda in a domU
                      |
                     xen
                      |
              lv (striped, -i 2)
                      |
               lvm volume group
              /                \
             /                  \
        lvm pv                 lvm pv
       dm-crypt               dm-crypt
        mpath1                 mpath2
         ||||                   ||||
       a,b  c,d               e,f  g,h
       ||    ||               ||    ||
    --xxxx---||--------------xxxx---||--- switch
      |  |   ||              |  |   ||
    --|--|--xxxx-------------|--|--xxxx----- switch
      |  |  |  |             |  |  |  |
      |  |  |  |             |  |  |  |
      |  |  |  |             |  |  |  |
      a  b  c  d             e  f  g  h
   NetApp controller 1   NetApp controller 2

Because we don't care about snapshots and other fancy NetApp
functionality (sorry 'bout that :) ), we create two multipath devices
(each to a lun on a different one of the two disk controllers in a
netapp device), put encryption on each of them, create an lvm pv out of
each, add both to a volume group, and take striped logical volumes out
of that. It's even usable on multiple attached servers, as long as you
get your locking on metadata operations done right.
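
The layering described above can be sketched as a dry run that only
prints the commands it would take. This is an illustration under
assumptions, not the exact procedure used here: the use of LUKS, the
"_crypt" suffix, the vg0/lv0 names and the 10G size are all made up, and
plan_stack is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: print (dry run) the commands that would layer dm-crypt and LVM on
# top of multipath devices. Nothing is executed here. LUKS, the "_crypt"
# suffix and all names passed in are assumptions for illustration only.
plan_stack() {
    vg="$1"; lv="$2"; size="$3"; shift 3
    pv_list=""
    for mp in "$@"; do
        # One encrypted mapping and one LVM PV per multipath device.
        echo "cryptsetup luksFormat /dev/mapper/$mp"
        echo "cryptsetup luksOpen /dev/mapper/$mp ${mp}_crypt"
        echo "pvcreate /dev/mapper/${mp}_crypt"
        pv_list="$pv_list /dev/mapper/${mp}_crypt"
    done
    # Both PVs go into a single volume group.
    echo "vgcreate $vg$pv_list"
    # -i is the stripe count: one stripe per PV, i.e. "-i 2" for two PVs.
    echo "lvcreate -i $# -L $size -n $lv $vg"
}
```

Running "plan_stack vg0 lv0 10G mpath1 mpath2" prints eight commands,
ending in "lvcreate -i 2 -L 10G -n lv0 vg0". Note that dm-crypt does not
pass discards down to the underlying device unless the mapping is opened
with --allow-discards, which matters when testing fstrim through this
stack.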

But I have to leave now, will continue later.


-- 
Hans van Kranenburg - System / Network Engineer
+31 (0)10 2760434 | hans.van.kranenburg at mendix.com | www.mendix.com



More information about the pkg-lvm-maintainers mailing list