Bug#658701: mdadm: should send email if mismatches are reported by a check

Russell Coker russell at coker.com.au
Sun Feb 5 14:58:13 UTC 2012


On Mon, 6 Feb 2012, Michael Tokarev <mjt at tls.msk.ru> wrote:
> > I believe that this is a serious bug.  It seems to me that one of the most
> > significant conditions it can encounter, and one that should be immediately
> > reported to the sysadmin, is that the contents of disks are changing and
> > breaking RAID consistency!
> 
> Yes, that's the condition it may encounter indeed.  The question is WHY -
> under normal conditions there should be no such errors.

http://www.zdnet.com/blog/storage/why-raid-6-stops-working-in-2019/805

Disks just have errors sometimes, even under normal conditions.  The article
above has some calculations of the probabilities.
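
To give a rough idea of that kind of calculation, here is a minimal sketch in
Python.  The rate of one unrecoverable read error per 10^14 bits and the 12 TB
read size are assumptions chosen for illustration, not figures taken from this
thread, and the function name is made up for the example.

```python
import math


def p_read_error(bytes_read, bit_error_rate=1e-14):
    """Chance of at least one unrecoverable read error in bytes_read bytes.

    Assumes independent errors at a fixed per-bit rate; the default rate
    is an illustrative assumption, not a measured value.
    """
    bits = bytes_read * 8
    # 1 - (1 - rate)**bits, computed via log1p/expm1 to limit rounding loss
    return -math.expm1(bits * math.log1p(-bit_error_rate))


if __name__ == "__main__":
    # Hypothetical case: reading 12 TB in full, e.g. during a rebuild
    print(f"{p_read_error(12e12):.2f}")
```

Under those assumptions the chance of hitting an error grows quickly with the
amount of data that has to be read.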

> There are two points there.
> 
> First, a formal one.  Would it be a serious issue if such a check weren't
> done at all?  I think that in this case this bug report wouldn't exist to
> start with.

http://etbe.coker.com.au/2012/02/06/reliability-raid/

If there were no checks at all then we would migrate to BTRFS even sooner.  At
the above URL I've written up some of my thoughts about BTRFS vs software RAID.

> And second, more to the point, Neil gave a very good writeup of these
> checks and repairs of raid arrays, about deciding which part/component of
> the array is "more right".  Unfortunately I can't find it right now.

Unfortunately at the moment it seems impossible to determine which disk had 
the error, if you even know that there was an error.
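
Knowing that there was an error is at least something a script can find out.
Below is a minimal sketch in Python, assuming the kernel exposes the mismatch
count from the last check in /sys/block/<array>/md/mismatch_cnt.  The function
names and the wording of the report are hypothetical, and actually mailing the
report is left out.  It says nothing about which disk is wrong.

```python
import os


def read_mismatch_cnt(md_dir):
    """Return the integer stored in the mismatch_cnt file under md_dir."""
    with open(os.path.join(md_dir, "mismatch_cnt")) as f:
        return int(f.read().strip())


def mismatch_report(array, md_dir):
    """Return a one-line warning if mismatches were counted, else None."""
    count = read_mismatch_cnt(md_dir)
    if count == 0:
        return None
    return f"{array}: last check reported mismatch_cnt={count}"


if __name__ == "__main__":
    md_dir = "/sys/block/md0/md"  # md0 is only an example array name
    if os.path.isdir(md_dir):
        report = mismatch_report("md0", md_dir)
        if report:
            print(report)  # a real script would mail this to the sysadmin
```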

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/




