[Pkg-ofed-devel] Infiniband performance: Maturity of kernel and tools in debian wheezy?
Wolfgang Rosner
wrosner at tirnet.de
Fri Apr 3 15:36:15 UTC 2015
Hello, debianic InfiniBand pros,
can I consider the InfiniBand tools in wheezy and the kernel in
wheezy-backports as "state of the art"? Or can I expect considerable
performance improvements by building from recent sources?
There are some HowTos on the web saying "always use the latest versions", but
they are all more than 5 years old.
Can I conclude that InfiniBand development has settled down, and that there is
no point in chasing the latest upgrades?
On the other hand,
http://downloads.openfabrics.org/downloads/
shows considerably higher version numbers than those I get from the Debian packages.
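By "Debian packages" I mean the installed ones; I list them with something like
this (package names from memory, so the list may well be incomplete):
$ dpkg -l 'libibverbs*' 'librdmacm*' infiniband-diags perftest opensm | grep '^ii'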
Right now I'm working my way through the InfiniBand Howto.
I'm stuck at the performance chapter:
http://pkg-ofed.alioth.debian.org/howto/infiniband-howto-4.html#ss4.9
because I find the raw performance of ib_rdma_bw and its companions
disappointing. Not to mention ibping, which can barely keep up with Ethernet
latency values (~ 0.120-0.150 ms).
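For reference, a minimal ibping round trip looks roughly like this (standard
ibping from infiniband-diags; <lid> is the remote port's LID as shown by ibstat):
$ ibping -S               # on the remote node
$ ibping -c 1000 <lid>    # on the local node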
I don't get closer than 55 % of the theoretical throughput value.
OK, I've learned that there is 8b/10b encoding on the wire, but
between 55 % and 80 % there is still quite a gap left, I thought.
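Spelling out that arithmetic (assuming the links really negotiated 4X DDR,
i.e. 20 GBit/s signalling per direction, which is where my theoretical
reference number comes from):
$ echo $(( 2 * 20 ))         # 40 GBit/s raw signalling, both directions together
$ echo $(( 40 * 8 / 10 ))    # 32 GBit/s usable payload ceiling after 8b/10b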
Before trying InfiniBand, I experimented with teql layer-3 bonding of 6 x
1-GBit Ethernet links, which yielded 5.7 GBit/s, i.e. 95 % of the theoretical
maximum. But admittedly, that setup was nowhere near the PCIe bus limit.
I reconfigured my blades to make sure the IB HCA gets PCIe x8 bandwidth.
It was x4 before, and going to x8 doubled the throughput (as I expected).
I also upgraded the firmware on my HCAs, but that had no effect.
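For completeness, the negotiated rate and width per port can be double-checked
with ibstatus (from infiniband-diags); the "rate:" line should read something
like "20 Gb/sec (4X DDR)" if the links really came up at DDR:
$ ibstatus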
A parallel bidirectional RDMA bandwidth test like this
for i in 10 11 14 15 16 ; do ( ib_rdma_bw -b 192.168.130.${i} & ) ; done
yields
712+712+572+570+575 = 3141 MB/s, which is ~ 25 GBit/s, i.e. ~ 62 % of 40 GBit/s
same thing unidirectional (w/o -b option) is precisely half:
354+354+288+286+287 = 1569 MB/s = 12552 MBit/s
Running the tests sequentially (without the &) gives
2820 MB/sec ~ 22 GBit bidirectional
1410 MB/sec ~ 11 GBit unidirectional
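For clarity, the GBit figures are simply MB/s x 8 / 1000:
$ echo $(( 3141 * 8 ))    # 25128 MBit/s, ~ 25 GBit/s
$ echo $(( 1410 * 8 ))    # 11280 MBit/s, ~ 11 GBit/s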
So it does not look like a bottleneck on the blade side or on the physical
pathway.
Blade<->blade bandwidth is somewhat lower across all setups
(e.g. 2675.37 MB/sec for the sequential bidirectional test).
Are there any significant rewards to be expected from further tuning,
or did I hit the hardware ceiling already?
Wolfgang Rosner
=============================================
System details
"poor man's beowulf cluster" from "ebay'd old server hardware" (HP Blade
center) and a headnode on a recent "premium consumer grade" mainboard:
wheezy backport kernel
$ uname -a
Linux cruncher 3.16.0-0.bpo.4-amd64 #1 SMP Debian 3.16.7-ckt4-3~bpo70+1 (2015-02-12) x86_64 GNU/Linux
Head node:
Sabertooth 990FX 2.0, AMD FX 8320 Eight-Core,
debian wheezy 7.7
Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode)
(rev 20)
lspci LnkSta: Speed 2.5GT/s, Width x8,
module 'mthca0' Firmware version: 4.8.930
cluster nodes :
HP Blades BL460c G1, Dual Intel Xeon QuadCore (mixed X5355 and E5430)
debian wheezy 7.8
Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev 20)
lspci LnkSta: Speed 2.5GT/s, Width x8,
switch
"MT47396 Infiniscale-III Mellanox Technologies" base port 0 lid 3 lmc 0
(HP Blade center switch)
(no clue how to read or even upgrade its firmware level)
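The only candidate I have found so far is vendstat from infiniband-diags, which
is supposed to report InfiniScale-III general information (including firmware)
when queried by LID; an untested guess on my side:
$ vendstat -N 3    # 3 = LID of the switch, -N = IS3 general info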