[Pkg-openmpi-maintainers] Bug#592326: Bug#592326: Failure of AZTEC test case run.

Ralf Wildenhues Ralf.Wildenhues at gmx.de
Thu Sep 2 16:59:38 UTC 2010


Hello Rachel, Jeff,

* Rachel Gordon wrote on Thu, Sep 02, 2010 at 01:35:37PM CEST:
> The cluster I am trying to run on has only the openmpi MPI version.
> So, mpif77 is equivalent to mpif77.openmpi and mpicc is equivalent
> to mpicc.openmpi
> 
> I changed the Makefile, replacing gfortran by mpif77 and gcc by mpicc.
> The compilation and linkage stage ran with no problem:
> 
> mpif77 -O   -I../lib -DMAX_MEM_SIZE=16731136 -DCOMM_BUFF_SIZE=200000
> -DMAX_CHUNK_SIZE=200000  -c -o az_tutorial_with_MPI.o
> az_tutorial_with_MPI.f
> mpif77 az_tutorial_with_MPI.o -O -L../lib -laztec      -o sample

Can you retry, but this time add -pthread to both the compile and the
link commands?
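
Concretely, taking the two commands from your log and changing nothing
else, I would expect something like the following.  This is only a
sketch: the variable names are mine (they are not taken from the AZTEC
Makefile), and the script just prints the commands so you can compare
them with what your Makefile actually runs.

```shell
#!/bin/sh
# Flags copied from the quoted build log; the only addition is -pthread.
# FFLAGS, LDFLAGS, COMPILE_CMD and LINK_CMD are illustrative names,
# not variables from the AZTEC Makefile.
FFLAGS="-O -pthread -I../lib -DMAX_MEM_SIZE=16731136 -DCOMM_BUFF_SIZE=200000 -DMAX_CHUNK_SIZE=200000"
LDFLAGS="-O -pthread -L../lib"

COMPILE_CMD="mpif77 $FFLAGS -c -o az_tutorial_with_MPI.o az_tutorial_with_MPI.f"
LINK_CMD="mpif77 az_tutorial_with_MPI.o $LDFLAGS -laztec -o sample"

# Dry run: print the commands instead of executing them.
echo "$COMPILE_CMD"
echo "$LINK_CMD"
```

If the printed lines look right, add -pthread to the corresponding
Makefile variables and rebuild.  If the aztec library in ../lib was
built separately, it probably makes sense to rebuild it with the same
flag as well.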

There have been other reports on the OpenMPI devel list that some
pthread flags have gone missing somewhere.  That may well have caused
the OpenMPI libraries themselves to be built wrongly, or it may affect
only the application; I'm not sure.  Either way, the segfault inside
libpthread is suspicious.
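
One way to see whether the wrapper itself passes the flag is to ask it
what it would run.  The snippet below is a rough sketch that assumes
mpif77 is the Open MPI wrapper (as you say it is on your cluster) and
uses its --showme:compile and --showme:link options; the has_pthread
helper is a name I made up for illustration.

```shell
#!/bin/sh
# has_pthread is a hypothetical helper: it succeeds if the flag string
# passed as $1 contains -pthread as a separate, space-delimited word.
has_pthread() {
    case " $1 " in
        *" -pthread "*) return 0 ;;
        *) return 1 ;;
    esac
}

# --showme:compile and --showme:link are Open MPI wrapper options.
# With a different MPI's mpif77 they will fail and the result below
# would be meaningless, so treat "missing" with care in that case.
if command -v mpif77 >/dev/null 2>&1; then
    for stage in compile link; do
        flags=$(mpif77 --showme:$stage 2>/dev/null)
        if has_pthread "$flags"; then
            echo "$stage: -pthread present"
        else
            echo "$stage: -pthread missing"
        fi
    done
fi
```

If either stage reports the flag as missing, that would fit the
reports mentioned above, though it does not tell us whether the
installed libmpi itself was built correctly.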

Thanks,
Ralf

> But again when I try to run 'sample' I get:
> 
> mpirun -np 1 sample
> 
> 
> [cluster:24989] *** Process received signal ***
> [cluster:24989] Signal: Segmentation fault (11)
> [cluster:24989] Signal code: Address not mapped (1)
> [cluster:24989] Failing at address: 0x100000098
> [cluster:24989] [ 0] /lib/libpthread.so.0 [0x7f5058036a80]
> [cluster:24989] [ 1] /shared/lib/libmpi.so.0(MPI_Comm_size+0x6e)
> [0x7f50594ce34e]
> [cluster:24989] [ 2] sample(parallel_info+0x24) [0x41d2ba]
> [cluster:24989] [ 3] sample(AZ_set_proc_config+0x2d) [0x408417]
> [cluster:24989] [ 4] sample(az_set_proc_config_+0xc) [0x407b85]
> [cluster:24989] [ 5] sample(MAIN__+0x54) [0x407662]
> [cluster:24989] [ 6] sample(main+0x2c) [0x44e8ec]
> [cluster:24989] [ 7] /lib/libc.so.6(__libc_start_main+0xe6)
> [0x7f5057cf31a6]
> [cluster:24989] [ 8] sample [0x407459]
> [cluster:24989] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 24989 on node cluster
> exited on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------





