Bug#583917: My analysis of the bug

Matteo Frigo athena at fftw.org
Tue Aug 31 23:39:07 UTC 2010


I have been hit by this bug.  My analysis of the situation is
the following.

mdadm-3.1.2 introduced a change that stores a ``mapfile'' in
/lib/init/rw.  The problem is that this directory does not exist in the
standard initramfs.  Previous versions of mdadm stored the file in
/dev/.mdadm (which exists), and mdadm-3.1.3 also stores the file in
/dev/.mdadm.

What happens seems to be the following.  When assembling an array, mdadm
invokes udev to create symlinks to the new file in /dev; in turn, udev
invokes mdadm to obtain information about the array.  For reasons that I
don't understand, the lack of the mapfile causes mdadm to scan for the
array again, causing another notification to udev.  The loop repeats
forever.  Presumably the mapfile is a cache of known arrays, and
normally mdadm breaks the loop after noticing that the array is already
present in the mapfile.  This safety net disappears if the mapfile
cannot be written.
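
To make the suspected loop concrete, here is a toy model in Python.  It is
only a sketch of the behaviour I am guessing at, not mdadm or udev code:
the function name, the array name "md0", and the event cap are all
made up for illustration.

```python
def count_udev_events(mapfile_writable, max_events=10):
    """Toy model of the suspected mdadm/udev feedback loop.

    Returns the number of udev notifications generated before the loop
    stops, capped at max_events so the model itself terminates.
    Everything here is hypothetical; it only mirrors the guess above.
    """
    mapfile = set()      # stands in for the on-disk mapfile
    pending = ["md0"]    # arrays mdadm has been asked to look at
    events = 0
    while pending and events < max_events:
        array = pending.pop()
        if array in mapfile:
            # Array already recorded: mdadm has nothing to do and the
            # loop is broken.
            continue
        # mdadm assembles the array and tries to record it.
        if mapfile_writable:
            mapfile.add(array)
        # Assembly notifies udev, which calls back into mdadm.
        events += 1
        pending.append(array)
    return events
```

With a writable mapfile the model stops after a single notification; with
an unwritable one it runs until the artificial cap is reached, which is
the analogue of the endless loop seen at boot.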

It appears that all problems go away by recompiling mdadm after changing
a definition in the Makefile from

   ALT_RUN = /lib/init/rw

to 

   ALT_RUN = /dev/.mdadm
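
For anyone who wants to script the change rather than edit the Makefile
by hand, a minimal sketch in Python follows.  It assumes the Makefile
contains a single-line assignment of the form shown above; the function
name set_alt_run is made up for this example.

```python
import re

def set_alt_run(makefile_text, new_dir="/dev/.mdadm"):
    """Return makefile_text with the ALT_RUN assignment pointing at new_dir.

    Assumes a single-line 'ALT_RUN = ...' assignment, as quoted above.
    Raises ValueError if no such assignment is found, so a silent no-op
    cannot be mistaken for a successful patch.
    """
    pattern = re.compile(r"^([ \t]*ALT_RUN[ \t]*=[ \t]*).*$", re.MULTILINE)
    # Keep the original 'ALT_RUN = ' prefix and replace only the value.
    new_text, count = pattern.subn(lambda m: m.group(1) + new_dir,
                                   makefile_text)
    if count == 0:
        raise ValueError("no ALT_RUN assignment found")
    return new_text
```

The function works on the file contents as a string, so the caller
decides how to read and write the Makefile; mdadm still has to be
rebuilt afterwards.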

I recommend the following:

1) This bug ought to be reclassified as release-critical.  It makes many
   machines unbootable and it must not be allowed to reach squeeze.

2) As a workaround for debugging, change ALT_RUN as described above.
   However, I would strongly recommend that the Debian maintainer switch
   to mdadm 3.1.3 for squeeze, which seems to fix a few other scary
   bugs.

Regards,
Matteo Frigo 





