Bug#837964: mdadm --add fails, can not add spares (regression)

Anthony DeRobertis anthony at derobert.net
Fri Sep 16 18:12:04 UTC 2016


Package: mdadm
Version: 3.4-4
Followup-For: Bug #837964

I've tried to reproduce it using some (small) logical volumes and could
not. However, I've stared at strace output on trying to add sdc3 with
both the old (working) and new (broken) versions long enough to have
some idea what's going on—it appears mdadm is writing too much bitmap,
and thus overwriting the superblock.

I'm attaching the diff between the two runs. I'm happy to send along the
full strace runs (I actually have four: two for each version, to see
how much two runs vary and to eliminate those differences from the
comparison), but I'd prefer to send them somewhere not publicly archived
in case they contain something sensitive.

The commands run to make these files were:

    mdadm -r /dev/md/pv0 /dev/sdc3    # clean up
    mdadm --zero-superblock /dev/sdc3 # clean up
    strace -o add-1-v3.4-2 -y -e write=6 -- mdadm -a /dev/md/pv0 /dev/sdc3 # fails
    strace -o add-1-v3.3.2 -y -e write=6 -- /root/ugh/sbin/mdadm -a /dev/md/pv0 /dev/sdc3 # succeeds
    diff -dbU10 add-1-v3.3.2 add-1-v3.4 > /tmp/mdadm-diff

Observations:

  - line 417: leftover data from the 3.4 test; I don't think this
    matters. Adding twice in a row with 3.4 fails both times; trying
    3.3.2 first (after zero-superblock) works.

  - line 496 appears to be the superblock write: 1024 bytes starting at
    999939055616. The data differs, but it does between two runs of
    3.3.2 as well. I'm guessing it's things like UUIDs differing and
    not important (but I haven't bothered to confirm).

  - line 556. This looks like part of the bitmap write-out. There is a
    convenient (for us) lseek to check position, giving 999939054592.
    3.3.2 writes 1024 bytes here, ending at 999939055615, leaving the
    position at the start of the superblock. 3.4 instead writes 4k,
    which of course overwrites the superblock.

Not sure *why* it's writing too much bitmap, but it's overwriting the
superblock thus causing the failure.
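
The overlap can be checked with a bit of shell arithmetic. This is only
a sketch using the offsets quoted in the observations above; the
variable names and the overlap helper are made up for illustration and
are not anything from mdadm.

```shell
# Offsets taken from the strace observations above (all in bytes).
bitmap_start=999939054592   # lseek position before the bitmap write
sb_start=999939055616       # start of the 1024-byte superblock write
sb_len=1024

# End (exclusive) of the bitmap write for each version.
end_332=$((bitmap_start + 1024))   # 3.3.2 writes 1024 bytes
end_34=$((bitmap_start + 4096))    # 3.4 writes 4k

# Print how many bytes of the superblock region are covered by a write
# that starts at bitmap_start and ends (exclusive) at $1.
overlap() {
    end=$1
    sb_end=$((sb_start + sb_len))
    [ "$end" -gt "$sb_end" ] && end=$sb_end
    if [ "$end" -gt "$sb_start" ]; then
        echo $((end - sb_start))
    else
        echo 0
    fi
}

echo "3.3.2: write ends at $end_332, overlaps $(overlap "$end_332") bytes"
echo "3.4:   write ends at $end_34, overlaps $(overlap "$end_34") bytes"
```

The 3.3.2 write stops exactly where the superblock begins, while the 4k
write from 3.4 runs past the end of the 1024-byte superblock region and
so covers all of it.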

In case it helps:

    # blockdev --getsize64 /dev/sdc3 
    999939064320
    # blockdev --getsz /dev/sdb3 
    1953005985
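
As a sanity check on those numbers (again just a sketch, with variable
names of my own choosing): blockdev --getsz reports 512-byte sectors, so
the two figures above describe the same size even though they were taken
from sdc3 and sdb3 respectively. The superblock offset from the strace
can also be placed relative to the end of the device.

```shell
size_bytes=999939064320     # blockdev --getsize64 output quoted above
size_sectors=1953005985     # blockdev --getsz output (512-byte sectors)
sb_start=999939055616       # superblock write offset seen in the strace

# Convert the sector count to bytes and compare with the byte count.
sectors_as_bytes=$((size_sectors * 512))
echo "sectors * 512 = $sectors_as_bytes"

# Distance from the start of the superblock to the end of the device.
sb_from_end=$((size_bytes - sb_start))
echo "superblock starts $sb_from_end bytes before the end of the device"

# Alignment of the superblock offset relative to 4k.
sb_mod_4k=$((sb_start % 4096))
echo "superblock offset mod 4096 = $sb_mod_4k"
```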


