[Pkg-openmpi-maintainers] Bug#873450: openmpi: MPI_init in fortran fails on kfreebsd-amd64

Boud Roukema boud-debian at cosmo.torun.pl
Sun Aug 27 22:13:56 UTC 2017


Source: openmpi
Version: 2.1.1-6
Severity: important

Dear Maintainer,

DESCRIPTION:

Building a minimal MPI Fortran program against openmpi 2.1.1-6 on
kfreebsd-amd64 gives a linker warning about libgfortran.so.3 vs
libgfortran.so.4, and running it fails with PMIX ERROR: UNREACHABLE
and PMIX ERROR: NOT-SUPPORTED.


CONTEXT:

I found this bug while trying to find out why mpgrafic builds have recently
failed on kfreebsd-*:
https://buildd.debian.org/status/fetch.php?pkg=mpgrafic&arch=kfreebsd-amd64&ver=0.3.15-1&stamp=1503147201&raw=0


EXAMPLE CODE:

(1) ====mpi_include_mpif_h_detect.f90====

program mpi_include_mpif_h_detect
   implicit none
   include 'mpif.h'
   integer :: ierr   ! MPI error status
   call MPI_INIT(ierr)
   call MPI_FINALIZE(ierr)
end program mpi_include_mpif_h_detect

(2) ====mpi_use_f08_detect.f90====
program mpi_use_f08_detect
   use mpi_f08
   implicit none
   integer :: ierr   ! MPI error status
   call MPI_INIT(ierr)
   call MPI_FINALIZE(ierr)
end program mpi_use_f08_detect


COMPILE/RUN OUTPUT:

(1)
mpifort mpi_include_mpif_h_detect.f90
./a.out


/usr/bin/ld: warning: libgfortran.so.3, needed by /usr/lib/x86_64-kfreebsd-gnu/openmpi/lib/libmpi_usempif08.so, may conflict with libgfortran.so.4
[k3bsd:17030] PMIX ERROR: UNREACHABLE in file src/client/pmix_client.c at line 1017
[k3bsd:17031] PMIX ERROR: NOT-SUPPORTED in file src/server/pmix_server_listener.c at line 540
[k3bsd:17030] PMIX ERROR: UNREACHABLE in file src/client/pmix_client.c at line 205
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

   init pmix failed
   --> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

   orte_ess_init failed
   --> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

   ompi_mpi_init: ompi_rte_init failed
   --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[k3bsd:17030] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!


(2) 
mpifort mpi_use_f08_detect.f90
./a.out

/usr/bin/ld: warning: libgfortran.so.3, needed by /usr/lib/x86_64-kfreebsd-gnu/openmpi/lib/libmpi_usempif08.so, may conflict with libgfortran.so.4
[k3bsd:17039] PMIX ERROR: UNREACHABLE in file src/client/pmix_client.c at line 1017
[k3bsd:17040] PMIX ERROR: NOT-SUPPORTED in file src/server/pmix_server_listener.c at line 540
[k3bsd:17039] PMIX ERROR: UNREACHABLE in file src/client/pmix_client.c at line 205
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

   init pmix failed
   --> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

   orte_ess_init failed
   --> Returned value Unreachable (-12) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

   ompi_mpi_init: ompi_rte_init failed
   --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[k3bsd:17039] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
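
Both runs above fail with the same PMIX errors; only the process IDs
differ. As a minimal sketch for comparing such logs, the following
shell function prints each distinct PMIX error once, with the
"[host:pid]" prefix removed. The function name pmix_errors is
hypothetical and not part of Open MPI.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): read a log on stdin and
# print each distinct "PMIX ERROR" message once, in order of first
# appearance, with the leading "[host:pid] PMIX ERROR: " removed.
pmix_errors() {
    sed -n 's/^\[[^]]*] PMIX ERROR: //p' | awk '!seen[$0]++'
}
```

Possible usage, assuming the test program was built as a.out:

   ./a.out 2>&1 | pmix_errors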


INSTALLED PACKAGES:

ii  gfortran                                 4:7.1.0-2                      kfreebsd-amd64 GNU Fortran 95 compiler
ii  gfortran-7                               7.2.0-1                        kfreebsd-amd64 GNU Fortran compiler
ii  libgfortran-7-dev:kfreebsd-amd64         7.2.0-1                        kfreebsd-amd64 Runtime library for GNU Fortran applications (development files)
ii  libgfortran3:kfreebsd-amd64              6.4.0-4                        kfreebsd-amd64 Runtime library for GNU Fortran applications
ii  libgfortran4:kfreebsd-amd64              7.2.0-1                        kfreebsd-amd64 Runtime library for GNU Fortran applications
ii  libopenmpi-dev                           2.1.1-6                        kfreebsd-amd64 high performance message passing library -- header files
ii  libopenmpi2:kfreebsd-amd64               2.1.1-6                        kfreebsd-amd64 high performance message passing library -- shared library
ii  openmpi-bin                              2.1.1-6                        kfreebsd-amd64 high performance message passing library -- binaries
ii  openmpi-common                           2.1.1-6                        all            high performance message passing library -- common files


MPIFORT ANALYSIS:

$ mpifort --showme mpi_include_mpif_h_detect.f90

   gfortran mpi_include_mpif_h_detect.f90
   -I/usr/lib/x86_64-kfreebsd-gnu/openmpi/include -pthread
   -I/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -L/usr//lib
   -L/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -lmpi_usempif08
   -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi


$ mpifort --showme mpi_use_f08_detect.f90

   gfortran mpi_use_f08_detect.f90
   -I/usr/lib/x86_64-kfreebsd-gnu/openmpi/include -pthread
   -I/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -L/usr//lib
   -L/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -lmpi_usempif08
   -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
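
The two --showme outputs above differ only in the source file name. As
a minimal sketch for listing just the libraries that the wrapper links
in, the following shell function keeps the -l flags from a compiler
command line read on stdin. The function name link_libs is
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a compiler command line on stdin (it may
# span several lines) and print each "-l<name>" flag on its own line.
# The match is case-sensitive, so "-L<dir>" search paths are skipped.
link_libs() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^-l/) print $i }'
}
```

Possible usage:

   mpifort --showme mpi_use_f08_detect.f90 | link_libs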


$ dpkg -L libopenmpi-dev |grep mpifort

/usr/share/man/man1/mpifort.openmpi.1.gz
/usr/bin/mpifort.openmpi


Could the libgfortran.so.3 vs libgfortran.so.4 conflict explain this bug?

No. Linking without -lmpi_usempif08 and -lmpi_usempi_ignore_tkr:

   gfortran mpi_include_mpif_h_detect.f90
   -I/usr/lib/x86_64-kfreebsd-gnu/openmpi/include -pthread
   -I/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -L/usr//lib
   -L/usr/lib/x86_64-kfreebsd-gnu/openmpi/lib -lmpi_mpifh -lmpi

compiles without the linker warning, but the resulting a.out fails
with the same error messages as above.
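
The linker warning names libgfortran.so.3 as needed by
libmpi_usempif08.so. As a minimal sketch for checking which shared
libraries a given library requests, the following shell function
extracts the NEEDED entries from the output of "objdump -p". The
function name needed_sonames is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "objdump -p <library>" output on stdin and
# print the soname from each NEEDED entry of the dynamic section.
needed_sonames() {
    awk '$1 == "NEEDED" { print $2 }'
}
```

Possible usage, with the library path taken from the warning above:

   objdump -p /usr/lib/x86_64-kfreebsd-gnu/openmpi/lib/libmpi_usempif08.so | needed_sonames | grep libgfortran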


RELATED BUGS?

Bug #846635 looks superficially similar to this bug, but a comment
dated 2016-12-03 in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=846635
states that "This is fixed with 2.0.2~git.2015125-1, currently in
experimental".


SYSTEM:

-- System Information:
Debian Release: buster/sid
   APT prefers unstable
   APT policy: (500, 'unstable')
Architecture: kfreebsd-amd64 (x86_64)

Kernel: kFreeBSD 10.3-0-amd64
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)



