[Pkg-ofed-devel] lenny - openmpi problems

Yann JOBIC jobic at polytech.univ-mrs.fr
Tue Sep 15 13:07:41 UTC 2009


Hello,

I connected two Opteron SMP machines with InfiniBand cards (CA type: MT25418).
Both run Debian lenny (5.0.2).
I used the deb http://pkg-ofed.alioth.debian.org/apt/ofed repository and 
recompiled the modules for the kernel, just to be sure.
The two InfiniBand cards are connected directly, with opensm running on both 
(one is master, the other silent).

The ping test is fine.

However, the included benchmarks are not working:
Lidia:~# ib_rdma_lat Lilou
mlx4: There is a mismatch between the kernel and the userspace 
libraries: Kernel does not support XRC. Exiting.
7312:pp_init_ctx: Couldn't get context for mlx4_0
I read on this mailing list that I shouldn't worry about that, as XRC is 
only used by the benchmark codes.

Open MPI is also not working.

Here is my config:
On the first computer:
Network : Lidia : 193.49.33.178
IPoIB : Lidiai : 10.10.11.2
On the second one:
Network : Lilou : 193.49.33.183
IPoIB : Liloui : 10.10.11.1

The hostfile :
Liloui slots=1
Lidiai slots=1

The error :
Lidia-jobic% mpirun --mca btl_openib_verbose 1 --mca btl ^tcp -n 2 
-hostfile /home/jobic/test/parallel/benchs/hosts ./exe
mlx4: There is a mismatch between the kernel and the userspace 
libraries: Kernel does not support XRC. Exiting.
CMA: unable to open RDMA device
[....]

I first thought that it came from the permissions on the RDMA devices. 
However, I am in the rdma group:
homard-jobic% groups
hpc calabiyau sysadmin code_vf rdma

And the device files :
Lidia-jobic% ls -l  /dev/infiniband/uverbs*
crw-rw---- 1 root rdma 231, 192 2009-09-15 13:42 /dev/infiniband/uverbs0
Lidia-jobic% ls -l /dev/infiniband/rdma_cm*
crw-rw---- 1 root rdma 10, 60 2009-09-15 13:42 /dev/infiniband/rdma_cm
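
The group check above could also be scripted, so it can be repeated on each
node in the same session that launches mpirun. This is only a minimal sketch:
it assumes GNU stat (for -c %G), the function names has_group and
dev_group_ok are made up for illustration, and it looks at group ownership
only, not at the mode bits.

```shell
#!/bin/sh
# has_group GROUP "LIST": succeed if GROUP is one of the
# whitespace-separated names in LIST.
has_group() {
    wanted=$1
    for g in $2; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# dev_group_ok DEVICE: succeed if the group owning DEVICE is among the
# groups of the current process, as reported by id -Gn.
# Assumption: GNU stat is available for -c %G.
dev_group_ok() {
    dev_group=$(stat -c %G "$1") || return 2
    has_group "$dev_group" "$(id -Gn)"
}

# Example, on the node itself:
#   dev_group_ok /dev/infiniband/uverbs0 && echo "group OK"
#   dev_group_ok /dev/infiniband/rdma_cm && echo "group OK"
```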

The modules seem to be there:
Lidia:~# lsmod | grep rdma
rdma_ucm               15680  0
rdma_cm                31732  1 rdma_ucm
ib_cm                  37800  1 rdma_cm
iw_cm                  13448  1 rdma_cm
ib_sa                  24832  3 ib_ipoib,rdma_cm,ib_cm
ib_addr                10888  1 rdma_cm
ib_uverbs              34736  1 rdma_ucm
ib_core                59264  11 
ib_ipoib,ib_umad,rdma_ucm,rdma_cm,ib_cm,iw_cm,ib_sa,ib_uverbs,ib_mthca,mlx4_ib,ib_mad
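
Because the lsmod output above is filtered with grep rdma, mlx4_ib only shows
up in the "used by" column of ib_core. A check against the full lsmod output
could look like the sketch below. The function name missing_modules is made
up for illustration, and the module list in the example is an assumption
about what this setup needs, not a confirmed requirement.

```shell
#!/bin/sh
# missing_modules MODULE...: read lsmod-style text on stdin and print
# each named module that does not appear in the first column.
missing_modules() {
    # Keep only the module names, skipping the "Module Size Used by" header.
    loaded=$(awk '$1 != "Module" { print $1 }')
    for m in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string.
        printf '%s\n' "$loaded" | grep -qxF "$m" || echo "$m"
    done
}

# Example, on the node itself (module list is an assumption):
#   lsmod | missing_modules mlx4_core mlx4_ib ib_uverbs rdma_ucm
```

No output means every listed module has its own line in lsmod.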


Do you have any idea what could be wrong?

Many thanks,

Yann



-- 
___________________________

Yann JOBIC
HPC engineer
Polytech Marseille DME
IUSTI-CNRS UMR 6595
Technopôle de Château Gombert
5 rue Enrico Fermi
13453 Marseille cedex 13
Tel : (33) 4 91 10 69 39
  ou  (33) 4 91 10 69 43
Fax : (33) 4 91 10 69 69 



