[Pkg-uml-pkgs] Bug#544225: I suspect I am running into the same bug
Anton Ivanov
anton.ivanov at kot-begemot.co.uk
Fri Feb 24 13:44:04 UTC 2012
First of all, the bug should be upgraded above "normal". There is some
serious breakage here which under some circumstances can lead to data
corruption, data loss and some major entertainment for anyone running it.
This bug can be triggered by udev or any other process during boot-time.
It is specific to SKAS0 mode of operation.
Under SKAS0, UML userland is invoked multiple times asynchronously and, to
add insult to injury, it also reinitialises the periodic timer on every
invocation.
The offending code is after line 85 in arch/um/kernel/skas/mmu.c:

    else {
        if (from_mm)
            to_mm->id.u.pid = copy_context_skas0(stack,
                                                 from_mm->id.u.pid);
        else to_mm->id.u.pid = start_userspace(stack);
        if (to_mm->id.u.pid < 0) {
            ret = to_mm->id.u.pid;
            goto out_free;
        }
    }

This causes 6-8 start_userspace() invocations which, depending on the
number of CPUs in the system, load, phase of the moon, etc., end with one
of the processes in the UML instance hanging. The most common places are
udev and the new-style (parallelized) rc2 boot. On UMLs with lots of
memory, however, this can happen later on.
In most cases the hanging process inside the UML is in Z state, which is
not surprising: it is in a thread which has finished, but it is no
longer being ptraced, so there is no one to clean up.
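
For anyone not familiar with what Z state means on the host: below is a
minimal sketch in plain userspace C (not UML code) of the general Linux
behaviour I am referring to. A child that has exited stays a zombie until
whoever is responsible for it calls waitpid(). The helper names proc_state()
and state_of_unreaped_child() are made up for this illustration, and the
sketch assumes a Linux host with /proc mounted. It does not reproduce the
SKAS0 race itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the state letter from /proc/<pid>/stat,
 * or '?' if the entry cannot be read (e.g. the pid no longer exists). */
char proc_state(pid_t pid)
{
    char path[64];
    char buf[512];
    char state = '?';
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
    f = fopen(path, "r");
    if (!f)
        return '?';

    if (fgets(buf, sizeof(buf), f)) {
        /* Format is "pid (comm) state ..."; comm may contain spaces or
         * parentheses, so locate the last ')' and read the field after it. */
        char *p = strrchr(buf, ')');
        if (p && p[1] == ' ' && p[2] != '\0')
            state = p[2];
    }
    fclose(f);
    return state;
}

/* Hypothetical helper: fork a child that exits immediately and report the
 * state it is in while nobody has waited for it yet. */
char state_of_unreaped_child(void)
{
    struct timespec delay = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    char state = '?';
    pid_t pid;
    int i;

    pid = fork();
    if (pid < 0)
        return '?';
    if (pid == 0)
        _exit(0); /* child: finish straight away */

    /* Parent: poll for up to ~2 s until the child has exited.
     * No wait call has been made yet, so it cannot be cleaned up. */
    for (i = 0; i < 200; i++) {
        state = proc_state(pid);
        if (state == 'Z')
            break;
        nanosleep(&delay, NULL);
    }

    /* Only now is the zombie reaped; without this call it would stay in
     * Z state for as long as the parent lives. */
    waitpid(pid, NULL, 0);
    return state;
}
```

Calling state_of_unreaped_child() from a small main() and printing the
result should show the child sitting in Z until the waitpid() call. In the
UML case the party that should be doing that wait is gone, which matches
what I see.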
Once all of the memory in the UML instance has been initialized and all
the LDTs have been set up the instance will continue to run without
problems.
As with most race conditions, this one is not 100% reproducible. However,
booting a memory-constrained VM under heavy load will show it in
approximately 1-2% of all boots.
All in all, this is seriously broken. If UML is to be shipped, this
either needs to be fixed (I apologize, but VM is beyond my fixing
capability so I cannot fix it) or SKAS3 shipped instead (the above piece
of code is not invoked under SKAS3).
Brgds,
A.