[Pkg-openmpi-maintainers] Bug#598553: r-cran-rmpi: slave processes eat CPU when they have nothing to do

Manuel Prinz manuel at debian.org
Sat Oct 2 23:30:15 UTC 2010


On Sat, Oct 02, 2010 at 01:37:42PM -0700, Zack Weinberg wrote:
> I wrote a test MPI program that just calls MPI_Probe() once - this
> should block forever, since there are no sends happening.  When run
> with
> 
> $ mpirun -np 2 ./a.out
> 
> MPI_Probe never returns and the processes spin through poll(), which
> is what I originally reported.  So far so good.  If I change the
> invocation to
> 
> $ mpirun -np 2 --mca mpi_yield_when_idle 1 ./a.out
> 
> the behavior is the same, except that the processes alternate between
> poll() and sched_yield().  This doesn't help anything; the scheduler
> is still being thrashed, and the CPU is not allowed to go idle.  [In
> fact, my understanding of the Linux scheduler is that a zero-timeout
> poll() counts as a yield, so "Aggressive" mode isn't even doing
> anything constructive!]
> 
> The desired behavior is for an idle cluster's processes to BLOCK in
> poll().  So mpi_yield_when_idle does not do what I want.
> 
> Also, putting "mpi_yield_when_idle = 1" into
> ~/.openmpi/mca-params.conf has no effect, contra the documentation --
> this perhaps ought to be its own bug.  (I can set MCA parameters for R
> with environment variables, but that's not nearly as convenient as the
> host file.)
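
For reference, the environment-variable route mentioned in the quoted
text can be sketched as follows. Open MPI reads MCA parameters from
variables named OMPI_MCA_<parameter>; the POSIX shell and the ./a.out
binary from the report are assumptions here, and, per the report,
this only changes how the idle processes spin and does not make them
block.

```shell
# Sketch only: set the MCA parameter through the environment instead of
# ~/.openmpi/mca-params.conf. Open MPI picks up OMPI_MCA_<name> variables.
export OMPI_MCA_mpi_yield_when_idle=1

# Print the value so it can be checked before launching anything.
echo "mpi_yield_when_idle=${OMPI_MCA_mpi_yield_when_idle}"
```

With the variable exported, "mpirun -np 2 ./a.out" should behave like
the second invocation in the quoted text, without needing the --mca
option on the command line.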

I'm out of ideas here. Jeff, could you please comment on the issue?
You can find the full log here:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=598553

Thanks in advance!

Best regards,
Manuel





