Bug#759063: mdadm RAID5 array intermittently stalls during a write operation

NeilBrown neilb at suse.de
Sun Aug 24 02:54:23 UTC 2014


On Sun, 24 Aug 2014 12:16:03 +1000 Tim Boundy <gigaplex at gmail.com> wrote:

> Package: mdadm
> Version: 3.3-2
> Severity: normal
> Tags: upstream
> 
> I have a RAID5 array using 4x 3TB WD Red drives. On occasion while writing to 
> the array, the write operation will stall. Using iostat, I can see that one 
> of the member disks shows >95% utilisation performing reads at approximately 
> 0.5MB/s. This stall usually takes about 10 seconds to recover, sometimes up to 
> a minute. Lengthy stalls cause Samba network transfers to fail, shorter stalls 
> are mostly harmless, if annoying. The SMART output for the offending member 
> drive looks clean. I did notice that I can reliably reproduce the stall by 
> restarting the array - the first write operation appears to trigger it. I've 
> had the array running for over a year and I've only just started noticing this 
> recently. All drives are using native SATA ports on the Asrock A75M-HVS 
> motherboard.
> 
> 
> 
> # iostat during a write stall
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdd               0.00     0.00  135.00    0.00   540.00     0.00     8.00     0.98    7.23    7.23    0.00   7.23  97.60
> sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> md0               0.00     0.00  135.00    0.00   540.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00

That's very distinctive, isn't it!

I cannot see the md driver causing this.  My guess is that it is caused by
the filesystem - which I see is ext4.
Maybe on first write, the filesystem loads some tables off disk, and all the
tables are on the first drive.
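
To check whether every stall pins the same single member, the pattern could be
flagged automatically from extended iostat output. The following is only a
rough sketch: the function names and thresholds are illustrative, and it
assumes the column layout of the report quoted above (r/s in the 4th column,
rkB/s in the 6th, %util last), which other sysstat versions may not share.
The report does not say which four of the five sdX devices belong to md0, so
the sample simply checks all of them.

```python
# Sketch: flag the "one busy disk, everything else idle" pattern in iostat -x output.
# Assumption: 14 whitespace-separated columns, laid out as in the quoted report.

SAMPLE = """\
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00  135.00    0.00   540.00     0.00     8.00     0.98    7.23    7.23    0.00   7.23  97.60
sde               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
md0               0.00     0.00  135.00    0.00   540.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
"""


def parse_iostat(text):
    """Return {device: (reads_per_s, read_kb_per_s, util_percent)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows with an unexpected shape.
        if len(fields) != 14 or fields[0].startswith("Device"):
            continue
        stats[fields[0]] = (float(fields[3]), float(fields[5]), float(fields[-1]))
    return stats


def lone_busy_disk(stats, disks, busy=95.0, idle=1.0):
    """Return the one disk at or above `busy` %util if all others are at or below `idle`."""
    hot = [d for d in disks if stats[d][2] >= busy]
    cold = [d for d in disks if stats[d][2] <= idle]
    if len(hot) == 1 and len(cold) == len(disks) - 1:
        return hot[0]
    return None


stats = parse_iostat(SAMPLE)
disks = [d for d in stats if d.startswith("sd")]  # physical disks only, not md0
busy_disk = lone_busy_disk(stats, disks)
reads_per_s, read_kb_per_s, _ = stats[busy_disk]
avg_read_kb = read_kb_per_s / reads_per_s  # average size of each read, in kB
print(busy_disk, avg_read_kb)
```

On the quoted sample this picks out sdd, and the average read works out to
4 kB per request, which matches the avgrq-sz of 8 sectors: lots of small
reads from one disk rather than a streaming read.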

It would be worth asking on ext3-users at redhat.com

   http://www.redhat.com/mailman/listinfo/ext3-users

NeilBrown

