[Pkg-octave-devel] Bug#708191: octave-openmpi-ext will sometimes run successfully, error out or crash

Brian Woods bwoods288 at gmail.com
Mon May 13 21:28:56 UTC 2013


Package: octave-openmpi-ext
Version: 1.0.2-2
Severity: normal

Dear Maintainer,

   I was running the provided hellocell.m and hellostruct.m scripts. From one run to the next, the same script would either finish successfully, abort with an MPI error, or crash outright.

command:  mpirun -x LD_PRELOAD=libmpi.so.0 -np 4 octave -q --eval hellocell

when it errors out
==============
==============
warning: X11 DISPLAY environment variable not set
warning: X11 DISPLAY environment variable not set
warning: X11 DISPLAY environment variable not set
warning: X11 DISPLAY environment variable not set
warning: dispatch is obsolete and will be removed from a future version of Octave; please use classes instead
warning: dispatch is obsolete and will be removed from a future version of Octave; please use classes instead
warning: dispatch is obsolete and will be removed from a future version of Octave; please use classes instead
warning: dispatch is obsolete and will be removed from a future version of Octave; please use classes instead
We are at rank 0 that is master etc..
[aurora:29333] *** An error occurred in MPI_Send
[aurora:29333] *** on communicator MPI_COMM_WORLD
[aurora:29333] *** MPI_ERR_RANK: invalid rank
[aurora:29333] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
warning: octave_chunk_buffer::clear: 1 active allocations remain!
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 29333 on
node aurora exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[aurora:29330] 2 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[aurora:29330] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
================
================

when it crashes
================
================
*** glibc detected *** octave: double free or corruption (!prev): 0x0000000001040590 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x76d76)[0x7fc06b968d76]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x6c)[0x7fc06b96daac]
/usr/lib/x86_64-linux-gnu/liboctinterp.so.1(_ZNSt8_Rb_treeISsSt4pairIKSsN12symbol_table8fcn_infoEESt10_Select1stIS4_ESt4lessISsESaIS4_EE8_M_eraseEPSt13_Rb_tree_nodeIS4_E+0x62)[0x7fc06de49262]
/usr/lib/x86_64-linux-gnu/liboctinterp.so.1(_ZNSt8_Rb_treeISsSt4pairIKSsN12symbol_table8fcn_infoEESt10_Select1stIS4_ESt4lessISsESaIS4_EE8_M_eraseEPSt13_Rb_tree_nodeIS4_E+0x24)[0x7fc06de49224]
======= Memory map: ========
00400000-00401000 r-xp 00000000 08:06 135826                             /usr/bin/octave
00600000-00601000 r--p 00000000 08:06 135826                             /usr/bin/octave
00601000-00602000 rw-p 00001000 08:06 135826                             /usr/bin/octave
00fc7000-01cd5000 rw-p 00000000 00:00 0                                  [heap]
7fc050000000-7fc050021000 rw-p 00000000 00:00 0 
7fc050021000-7fc054000000 ---p 00000000 00:00 0 
7fc0561f2000-7fc056223000 r-xp 00000000 08:06 136540                     /usr/lib/x86_64-linux-gnu/octave/packages/openmpi_ext-1.0.2/x86_64-pc-linux-gnu-api-v48+/MPI_Send.oct
7fc056223000-7fc056422000 ---p 00031000 08:06 136540                     /usr/lib/x86_64-linux-gnu/octave/packages/openmpi_ext-1.0.2/x86_64-pc-linux-gnu-api-v48+/MPI_Send.oct
7fc056422000-7fc056425000 r--p 00030000 08:06 136540                     /usr/lib/x86_64-linux-gnu/octave/packages/openmpi_ext-1.0.2/x86_64-pc-linux-gnu-api-v48+/MPI_Send.oct
7fc056425000-7fc056426000 rw-p 00033000 08:06 136540                     /usr/lib/x86_64-linux-gnu/octave/packages/openmpi_ext-1.0.2/x86_64-pc-linux-gnu-api-v48+/MPI_Send.oct
7fc056426000-7fc05a428000 rw-s 00000000 00:12 55549                      /tmp/openmpi-sessions-larks@aurora_0/6592/1/shared_mem_pool.aurora (deleted)
7fc05a428000-7fc05a435000 r-xp 00000000 08:06 135698                     /usr/lib/openmpi/lib/openmpi/mca_osc_rdma.so
7fc05a435000-7fc05a635000 ---p 0000d000 08:06 135698                     /usr/lib/openmpi/lib/openmpi/mca_osc_rdma.so
7fc05a635000-7fc05a636000 rw-p 0000d000 08:06 135698                     /usr/lib/openmpi/lib/openmpi/mca_osc_rdma.so
7fc05a636000-7fc05a640000 r-xp 00000000 08:06 135697                     /usr/lib/openmpi/lib/openmpi/mca_osc_pt2pt.so
7fc05a640000-7fc05a83f000 ---p 0000a000 08:06 135697                     /usr/lib/openmpi/lib/openmpi/mca_osc_pt2pt.so
7fc05a83f000-7fc05a841000 rw-p 00009000 08:06 135697                     /usr/lib/openmpi/lib/openmpi/mca_osc_pt2pt.so
7fc05a841000-7fc05a85a000 r-xp 00000000 08:06 135671                     /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so
7fc05a85a000-7fc05aa5a000 ---p 00019000 08:06 135671                     /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so
7fc05aa5a000-7fc05aa5b000 rw-p 00019000 08:06 135671                     /usr/lib/openmpi/lib/openmpi/mca_coll_tuned.so
7fc05aa5b000-7fc05aa5e000 r-xp 00000000 08:06 135670                     /usr/lib/openmpi/lib/openmpi/mca_coll_sync.so
7fc05aa5e000-7fc05ac5d000 ---p 00003000 08:06 135670                     /usr/lib/openmpi/lib/openmpi/mca_coll_sync.so
7fc05ac5d000-7fc05ac5e000 rw-p 00002000 08:06 135670                     /usr/lib/openmpi/lib/openmpi/mca_coll_sync.so
==============
==============
The memory map keeps going, but I thought I'd end it there. If you want, I can send you the complete output. The behavior was not consistent between runs: it would work, then error out, then work, then crash, then work again, and so on.
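
Since the failure is intermittent, a repeat-and-count loop may help anyone trying to reproduce it. The following is only a minimal sketch and not part of the package: tally_runs is a hypothetical helper name, and it assumes that mpirun exits with a non-zero status both when MPI aborts and when a process crashes, so those two cases are counted together as failures.

```shell
#!/bin/sh
# tally_runs N CMD [ARGS...]
# Hypothetical helper: runs CMD N times, discards its output, and prints
# how many runs exited with status 0 (ok) and how many did not (fail).
tally_runs() {
    n=$1
    shift
    ok=0
    fail=0
    i=0
    while [ "$i" -lt "$n" ]; do
        if "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            fail=$((fail + 1))
        fi
        i=$((i + 1))
    done
    echo "ok=$ok fail=$fail"
}
```

With the command from this report, it could be called as: tally_runs 10 mpirun -x LD_PRELOAD=libmpi.so.0 -np 4 octave -q --eval hellocell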

What fixed it for me was uninstalling the octave-openmpi-ext package and then using Octave to install version 1.1.0 from SourceForge. I haven't tested it a lot, but it ran without erroring out or crashing in the 10 or so times I tried it.

If you need any more information please let me know and I'll be happy to provide you anything I can. 

Brian

-- System Information:
Debian Release: jessie/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 3.2.0-4-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages octave-openmpi-ext depends on:
ii  libc6          2.13-38
ii  libgcc1        1:4.7.2-5
ii  liboctave1     3.6.2-5
ii  libopenmpi1.3  1.4.5-1
ii  libstdc++6     4.7.2-5
ii  octave         3.6.2-5

Versions of packages octave-openmpi-ext recommends:
ii  openmpi-bin  1.4.5-1

octave-openmpi-ext suggests no packages.

-- no debconf information


