[Virtual-pkg-base-maintainers] Bug#759706: base: bad write performance on software RAID 1

Dominique Barton dbarton at confirm.ch
Fri Aug 29 15:28:29 UTC 2014


Package: base
Severity: normal
Tags: lfs

Hi

Sorry, I don't know whether this bug belongs to the base package or the mdadm package.
I hope base is OK, because I thought mdadm covers only the mdadm utility and not md/dm itself.

I have some serious performance issues w/ my software RAID 1, especially write performance.
This is my setup:


      LV/ext4           (mounted w/ noatime)
--------------------
        lvm             (VG ssdvg on PV /dev/md0)
--------------------
      /dev/md0          (RAID 1)
--------------------
/dev/sda2   /dev/sdb2   (identical SSDs)
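
For reference, this is how the stack can be inspected (standard tools, nothing specific to my setup):

```shell
# Block-device stack: partitions -> md -> LVM -> filesystems
lsblk
# md arrays, their member devices and sync state
cat /proc/mdstat
# LVM view: physical volumes, volume groups, logical volumes
pvs; vgs; lvs
```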

The monitoring sent a lot of alerts in the past because of high I/O-wait CPU usage,
so I started to have a look at the RAID by testing its performance via iozone.
The results were awful: approx. 4-9 MB/s write throughput (w/ direct I/O enabled).

I thought one of the SSDs was failing and tested them both separately (S.M.A.R.T. and iozone),
but they seem OK.

There is something odd w/ the RAID /dev/md0, but I don't have any test scenarios left :(

Here's my test setup:

~snip~
 # remove one SSD from the active RAID
 mdadm --manage /dev/md0 --set-faulty /dev/sda2
 mdadm --manage /dev/md0 --remove /dev/sda2

 # create new RAID w/ only one SSD
 mdadm --create /dev/md1 --level=mirror --raid-devices=2 /dev/sda2 missing

 # create new PV / VG
 pvcreate /dev/md1
 vgcreate testvg /dev/md1

 # now I've two RAID 1 devices, each one w/ a single SSD device

 # create some LVs for testing
 lvcreate -L5G -niozone ssdvg   # this is the origin VG/PV/RAID w/ performance issues
 mkfs.ext4 /dev/mapper/ssdvg-iozone
 lvcreate -L5G -niozone testvg  # this is the new VG/PV/RAID
 mkfs.ext4 /dev/mapper/testvg-iozone
 
 # mount the LVs (!$ expands to the last argument of the previous command)
 mkdir -p /mnt/iozone-ssdvg
 mount /dev/mapper/ssdvg-iozone !$
 mkdir -p /mnt/iozone-testvg
 mount /dev/mapper/testvg-iozone !$
 
~snap~
 
With the test setup prepared, I was ready to run iozone (-t 2 = two writer processes, -F = test files, -s 1g = 1 GB per file, -r 4m = 4 MB record size, -i 0 = write/rewrite test, -I = direct I/O):

~snip~
iozone -t 2 -F /mnt/iozone-ssdvg/file1 /mnt/iozone-ssdvg/file2 -s 1g -r 4m -i 0 -I
...
    Children see throughput for  2 initial writers  =   19015.39 KB/sec
    Parent sees throughput for  2 initial writers   =   18931.88 KB/sec
    Min throughput per process          =    9385.20 KB/sec 
    Max throughput per process          =    9630.19 KB/sec
    Avg throughput per process          =    9507.69 KB/sec
    Min xfer                    = 1024000.00 KB
...

iozone -t 2 -F /mnt/iozone-testvg/file1 /mnt/iozone-testvg/file2 -s 1g -r 4m -i 0 -I
...
    Children see throughput for  2 initial writers  =  267037.86 KB/sec
    Parent sees throughput for  2 initial writers   =  266448.39 KB/sec
    Min throughput per process          =  133126.77 KB/sec 
    Max throughput per process          =  133911.09 KB/sec
    Avg throughput per process          =  133518.93 KB/sec
    Min xfer                    = 1044480.00 KB
...
~snap~

As you can see, write performance on /dev/md0 is ~9 MB/s per process, while /dev/md1 reaches ~130 MB/s per process.
I switched the disks (/dev/sda2 back into /dev/md0, resync, removed /dev/sdb2 from /dev/md0, created /dev/md1 with /dev/sdb2) and ran the same tests again.
To my surprise I saw the same results: /dev/md0 was still slow, while /dev/md1 rocked!

So the problem is not the SSDs themselves, because /dev/md0 is always slow as hell, no matter which disk backs it.
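
One thing I haven't ruled out yet (just a guess on my part): a write-intent bitmap on /dev/md0. As far as I know, an internal bitmap with a small chunk size can cost a lot of write throughput on md RAID 1, and /dev/md1 was freshly created without one. Something like this should show whether the arrays differ there:

```shell
# A bitmap shows up as "Intent Bitmap : Internal" in --detail
# and as a "bitmap: ..." line in /proc/mdstat
mdadm --detail /dev/md0 | grep -i bitmap
cat /proc/mdstat
# For a quick A/B test the bitmap can be dropped and re-added on the fly:
#   mdadm --grow --bitmap=none /dev/md0
#   mdadm --grow --bitmap=internal /dev/md0
```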

I also checked both RAIDs for differences:

~snip~
 mdadm --detail /dev/md0 > /tmp/details.md0
 mdadm --detail /dev/md1 > /tmp/details.md1
 
root at asteria:/mnt# diff /tmp/details.md0 /tmp/details.md1 
1c1
< /dev/md0:
---
> /dev/md1:
3c3
<   Creation Time : Wed Jan 15 22:09:46 2014
---
>   Creation Time : Fri Aug 29 15:55:22 2014
11,12c11,12
<     Update Time : Fri Aug 29 16:01:57 2014
<           State : active, degraded 
---
>     Update Time : Fri Aug 29 15:59:39 2014
>           State : clean, degraded 
18,20c18,20
<            Name : asteria:0  (local to host asteria)
<            UUID : fd0c6a27:fddfb55b:3cc9c9d7:c9b59e8f
<          Events : 61833
---
>            Name : asteria:1  (local to host asteria)
>            UUID : 88f5f694:c2e0e9ba:82c24871:06368106
>          Events : 34
23,24c23,24
<        0       0        0        0      removed
<        2       8       18        1      active sync   /dev/sdb2
---
>        0       8        2        0      active sync   /dev/sda2
>        1       0        0        1      removed
~snap~

Nothing there...

Disks are looking good too:

~snip~
root at asteria:/mnt# smartctl -l selftest /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5407         -
# 2  Short offline       Completed without error       00%      5383         -
# 3  Short offline       Completed without error       00%      5359         -
# 4  Short offline       Completed without error       00%      5335         -
# 5  Extended offline    Completed without error       00%      5292         -
# 6  Short offline       Completed without error       00%      5288         -
# 7  Short offline       Completed without error       00%      5264         -
# 8  Short offline       Completed without error       00%      5217         -
# 9  Short offline       Completed without error       00%      5193         -
#10  Short offline       Completed without error       00%      5169         -
#11  Short offline       Completed without error       00%      5145         -
#12  Extended offline    Completed without error       00%      5126         -
#13  Short offline       Completed without error       00%      5121         -
#14  Extended offline    Completed without error       00%      5108         -
#15  Short offline       Completed without error       00%      5104         -
#16  Short offline       Aborted by host               90%      5104         -

root at asteria:/mnt# smartctl -l selftest /dev/sdb
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5407         -
# 2  Short offline       Completed without error       00%      5383         -
# 3  Short offline       Completed without error       00%      5359         -
# 4  Short offline       Completed without error       00%      5335         -
# 5  Extended offline    Completed without error       00%      5297         -
# 6  Short offline       Completed without error       00%      5288         -
# 7  Short offline       Completed without error       00%      5264         -
# 8  Short offline       Completed without error       00%      5217         -
# 9  Short offline       Completed without error       00%      5193         -
#10  Short offline       Completed without error       00%      5169         -
#11  Short offline       Completed without error       00%      5145         -
#12  Extended offline    Completed without error       00%      5132         -
#13  Short offline       Completed without error       00%      5121         -
#14  Extended offline    Completed without error       00%      5114         -
#15  Short offline       Completed without error       00%      5105         -
#16  Short offline       Aborted by host               70%      5105         -

root at asteria:/mnt# parted /dev/sda print
Model: ATA Samsung SSD 840 (scsi)
Disk /dev/sda: 256GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End    Size   File system  Name  Flags
 1      1049kB  128MB  127MB  fat32              boot
 2      128MB   256GB  256GB  ext4               raid

root at asteria:/mnt# parted /dev/sdb print
Model: ATA Samsung SSD 840 (scsi)
Disk /dev/sdb: 256GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End    Size   File system  Name  Flags
 1      1049kB  128MB  127MB  fat32              boot
 2      128MB   256GB  256GB                     raid
~snap~

I/O usage is also very low (dm-28 is an iSCSI LUN and is not stored on /dev/md0):

~snip~
root at asteria:/mnt# iostat -d -m 10
...
sda               0.00         0.00         0.00          0          0
sdb              24.30         0.00         0.15          0          1
md0              48.00         0.00         0.15          0          1
dm-0              0.00         0.00         0.00          0          0
dm-1              0.00         0.00         0.00          0          0
dm-2              0.00         0.00         0.00          0          0
dm-3              0.00         0.00         0.00          0          0
dm-4              0.00         0.00         0.00          0          0
dm-5              0.00         0.00         0.00          0          0
dm-6              0.00         0.00         0.00          0          0
dm-7              0.00         0.00         0.00          0          0
dm-8              5.70         0.00         0.01          0          0
dm-9              1.00         0.00         0.00          0          0
dm-10             0.00         0.00         0.00          0          0
dm-11            33.50         0.00         0.11          0          1
dm-12             2.40         0.00         0.01          0          0
dm-13             0.00         0.00         0.00          0          0
dm-14             5.40         0.00         0.02          0          0
dm-15             0.00         0.00         0.00          0          0
dm-16             0.00         0.00         0.00          0          0
dm-17             0.00         0.00         0.00          0          0
sdc              17.30         0.00         0.82          0          8
sdd               0.00         0.00         0.00          0          0
dm-18             0.00         0.00         0.00          0          0
dm-19             0.00         0.00         0.00          0          0
dm-20             0.00         0.00         0.00          0          0
dm-21            28.50         0.00         0.11          0          1
dm-22             0.00         0.00         0.00          0          0
dm-23             0.00         0.00         0.00          0          0
dm-24             0.00         0.00         0.00          0          0
dm-25             0.00         0.00         0.00          0          0
dm-26             0.00         0.00         0.00          0          0
dm-27             0.00         0.00         0.00          0          0
dm-28           182.30         0.00         0.71          0          7
md1               0.00         0.00         0.00          0          0
dm-29             0.00         0.00         0.00          0          0
dm-30             0.00         0.00         0.00          0          0
~snap~

Any ideas or hints where the write issues are coming from, or what I can test next?

Cheers
Domi

-- System Information:
Debian Release: 7.6
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.2.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash


