[Pkg-openmpi-maintainers] Bug#598553: Bug#598553: r-cran-rmpi: slave processes eat CPU when they have nothing to do

Manuel Prinz manuel at debian.org
Sat Oct 2 19:51:20 UTC 2010


Hi Zack!

On Sat, Oct 02, 2010 at 08:39:06AM -0700, Zack Weinberg wrote:
> On Sat, Oct 2, 2010 at 6:01 AM, Manuel Prinz <manuel at debian.org> wrote:
> >> On 29 September 2010 at 18:22, Zack Weinberg wrote:
> >> | (on an 8-core machine), CPU utilization jumps *immediately* from 98% idle
> >> | to 20% user, 70% system, 12% idle.  strace reveals that each slave is
> >> | spinning through poll() calls with timeout zero, rather than blocking
> >> | until a message arrives, as the documentation for mpi.probe() suggests
> >> | should happen.
> ...
> > Well, no. Actually, this behavior is by design. I'm not sure about the details
> > exactly but can get back to Jeff if you're interested in those. This is coming
> > up every now and then in the BTS or the user list. Open MPI is basically burning
> > every free cycle that is not used for computation (busy wait). There are no
> > immediate plans of changing that, as far as I know.

I did some reading, and it seems that Open MPI does indeed support two modes
of waiting: aggressive and degraded. The default behavior is "aggressive",
but you can switch to "degraded" by setting the mpi_yield_when_idle MCA
parameter to 1. See the following FAQ entries (and the links therein):

http://www.open-mpi.org/faq/?category=running#force-aggressive-degraded
http://www.open-mpi.org/faq/?category=running#oversubscribing
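
In case it helps with testing, here is a minimal sketch of one way to request
degraded mode without touching the mpirun command line. It relies on Open MPI
reading MCA parameters from environment variables named OMPI_MCA_<parameter>.
The helper names and the use of Rscript as the launcher are my own assumptions
for illustration, and I have not checked that slaves spawned by Rmpi pick up
the setting this way.

```python
import os
import subprocess

# Open MPI reads MCA parameters from environment variables of the form
# OMPI_MCA_<parameter name>.
MCA_ENV_VAR = "OMPI_MCA_mpi_yield_when_idle"


def degraded_mode_env(base_env=None):
    """Return a copy of the environment with mpi_yield_when_idle set to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env[MCA_ENV_VAR] = "1"  # non-zero requests "degraded" (yielding) mode
    return env


def run_r_script(script_path):
    """Hypothetical launcher: run an Rmpi script with degraded mode requested.

    Assumes Rscript is on PATH; the script path is supplied by the caller.
    """
    return subprocess.run(
        ["Rscript", script_path],
        env=degraded_mode_env(),
        check=True,
    )
```

The same parameter can also be passed as "--mca mpi_yield_when_idle 1" on the
mpirun command line. Note that, as I understand the FAQ, degraded mode still
polls but yields the processor between polls, so the slaves may still show up
as busy when nothing else wants the CPU.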

I guess this is basically the behaviour you want. It would be great if you
could give it a try and report back whether it works for you. If it doesn't
do what you (and I) expect, I'll forward this issue upstream.

Best regards,
Manuel





