[Pkg-openmpi-maintainers] Bug#873450: openmpi: MPI_init in fortran fails on kfreebsd-amd64
Boud Roukema
boud-debian at cosmo.torun.pl
Sun Aug 27 22:13:56 UTC 2017
Source: openmpi
Version: 2.1.1-6
Severity: important
Dear Maintainer,
DESCRIPTION:
Compiling a minimal MPI Fortran program with openmpi-2.1.1-6 on
kfreebsd-amd64 gives a libgfortran.so.3 vs .so.4 linker warning, and
running it fails with PMIX ERROR: UNREACHABLE and PMIX ERROR: NOT-SUPPORTED.
CONTEXT:
I found this bug while trying to find out why mpgrafic builds have recently
failed on kfreebsd-*:
https://buildd.debian.org/status/fetch.php?pkg=mpgrafic&arch=kfreebsd-amd64&ver=0.3.15-1&stamp=1503147201&raw=0
EXAMPLE CODE:
(1) ====mpi_include_mpif_h_detect.f90====
program mpi_include_mpif_h_detect
  implicit none
  include 'mpif.h'
  integer :: ierr   ! error code returned by each MPI call
  call MPI_INIT(ierr)
  call MPI_FINALIZE(ierr)
end program mpi_include_mpif_h_detect
(2) ====mpi_use_f08_detect.f90====
program mpi_use_f08_detect
  use mpi_f08
  implicit none
  integer :: ierr   ! error code returned by each MPI call
  call MPI_INIT(ierr)
  call MPI_FINALIZE(ierr)
end program mpi_use_f08_detect
COMPILE/RUN OUTPUT:
(1)
mpifort mpi_include_mpif_h_detect.f90
./a.out
/usr/bin/ld: warning: libgfortran.so.3, needed by /usr/lib/x86_64-kfreebsd-gnu/openmpi/lib/libmpi_usempif08.so, may conflict with libgfortran.so.4
[k3bsd:17030] PMIX ERROR: UNREACHABLE in file src/client/pmix_client.c at line 1017
[k3bsd:17031] PMIX ERROR: NOT-SUPPORTED in file src/server/pmix_server_listener.c at line 540
[k3bsd:17030] PMIX ERROR: UNREACHABLE in file src/client/pmix_client.c at line 205
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
init pmix failed
--> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_ess_init failed
--> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_mpi_init: ompi_rte_init failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[k3bsd:17030] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
(2)
mpifort mpi_use_f08_detect.f90
./a.out
/usr/bin/ld: warning: libgfortran.so.3, needed by /usr/lib/x86_64-kfreebsd-gnu/openmpi/lib/libmpi_usempif08.so, may conflict with libgfortran.so.4
[k3bsd:17039] PMIX ERROR: UNREACHABLE in file src/client/pmix_client.c at line 1017
[k3bsd:17040] PMIX ERROR: NOT-SUPPORTED in file src/server/pmix_server_listener.c at line 540
[k3bsd:17039] PMIX ERROR: UNREACHABLE in file src/client/pmix_client.c at line 205
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
init pmix failed
--> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_ess_init failed
--> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_mpi_init: ompi_rte_init failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[k3bsd:17039] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
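Both runs above reduce to the same short chain of failures (init pmix, then orte_ess_init, then ompi_rte_init). As a rough sketch for comparing such runs, assuming the output of ./a.out was saved to a file, the helper below keeps only the PMIX ERROR lines and the lines naming the layer that failed. The function name summarize_mpi_log is hypothetical and not part of Open MPI.

```shell
# Hypothetical helper: reduce a saved Open MPI error log to its key lines.
# Keeps the "PMIX ERROR" lines and the lines that end in "failed"
# (for example "init pmix failed"), and strips the leading [host:pid] tag.
summarize_mpi_log() {
    grep -E 'PMIX ERROR|failed[[:space:]]*$' "$1" | sed -E 's/^\[[^]]*\] //'
}
```

For example, after `./a.out > run.log 2>&1` (run.log is an assumed file name), `summarize_mpi_log run.log` would print the three PMIX ERROR lines followed by the three "failed" lines.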
INSTALLED PACKAGES:
ii  gfortran                          4:7.1.0-2  kfreebsd-amd64  GNU Fortran 95 compiler
ii  gfortran-7                        7.2.0-1    kfreebsd-amd64  GNU Fortran compiler
ii  libgfortran-7-dev:kfreebsd-amd64  7.2.0-1    kfreebsd-amd64  Runtime library for GNU Fortran applications (development files)
ii  libgfortran3:kfreebsd-amd64       6.4.0-4    kfreebsd-amd64  Runtime library for GNU Fortran applications
ii  libgfortran4:kfreebsd-amd64       7.2.0-1    kfreebsd-amd64  Runtime library for GNU Fortran applications
ii  libopenmpi-dev                    2.1.1-6    kfreebsd-amd64  high performance message passing library -- header files
ii  libopenmpi2:kfreebsd-amd64        2.1.1-6    kfreebsd-amd64  high performance message passing library -- shared library
ii  openmpi-bin                       2.1.1-6    kfreebsd-amd64  high performance message passing library -- binaries
ii  openmpi-common                    2.1.1-6    all             high performance message passing library -- common files
MPIFORT ANALYSIS:
$ mpifort --showme mpi_include_mpif_h_detect.f90
gfortran mpi_include_mpif_h_detect.f90
-I/usr/lib/x86_64-kfreebsd-gnu/openmpi/include -pthread
-I/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -L/usr//lib
-L/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -lmpi_usempif08
-lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
$ mpifort --showme mpi_use_f08_detect.f90
gfortran mpi_use_f08_detect.f90
-I/usr/lib/x86_64-kfreebsd-gnu/openmpi/include -pthread
-I/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -L/usr//lib
-L/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -lmpi_usempif08
-lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
$ dpkg -L libopenmpi-dev |grep mpifort
/usr/share/man/man1/mpifort.openmpi.1.gz
/usr/bin/mpifort.openmpi
Could the libgfortran.so.3 vs .so.4 conflict explain this bug?
No. Linking without -lmpi_usempif08 and -lmpi_usempi_ignore_tkr:
gfortran mpi_include_mpif_h_detect.f90
-I/usr/lib/x86_64-kfreebsd-gnu/openmpi/include -pthread
-I/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -L/usr//lib
-L/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -lmpi_mpifh -lmpi
compiles as above, but without the warning, and running the resulting
a.out gives the same error messages as above.
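The reduced link line above can be derived mechanically from the mpifort --showme output by dropping the two libraries. This is a minimal sketch, assuming the wrapper prints its command on a single line. The function name strip_f08_libs is hypothetical and not part of Open MPI.

```shell
# Hypothetical helper: remove the two Fortran module libraries
# (-lmpi_usempif08 and -lmpi_usempi_ignore_tkr) from a compiler command
# line passed as the first argument, leaving all other flags untouched.
strip_f08_libs() {
    printf '%s\n' "$1" | sed -e 's/ -lmpi_usempif08//' -e 's/ -lmpi_usempi_ignore_tkr//'
}
```

On the affected system, the argument would be the output of `mpifort --showme mpi_include_mpif_h_detect.f90`, and the printed command could then be run by hand to repeat the test without the libgfortran warning.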
RELATED BUGS?
Bug #846635 looks superficially similar to this bug, but a comment dated
2016-12-03 in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=846635
states that "This is fixed with 2.0.2~git.2015125-1, currently in
experimental", so that fix is presumably already included in 2.1.1-6.
SYSTEM:
-- System Information:
Debian Release: buster/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: kfreebsd-amd64 (x86_64)
Kernel: kFreeBSD 10.3-0-amd64
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)