[Buildd-tools-devel] Re: sarti build failures

Roger Leigh rleigh at whinlatter.ukfsn.org
Wed Apr 5 21:46:38 UTC 2006


Philip Hands <phil at hands.com> writes:

> James Vega wrote:
>> All builds on sarti are currently failing when invoking fakeroot:
>> 
>>      /usr/bin/fakeroot debian/rules clean
>>    fakeroot, while creating message channels: No space left on device
>>    This may be due to a lack of SYSV IPC support.
>>    fakeroot: error while starting the `faked' daemon.
>>    kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
>> 
>> James
>
> It seems that there are 16 message queues belonging to buildd that are
> lying around.  My guess is that something is kill -9ing fakeroots.

This can happen if the stalled package timeout is exceeded, but it
should send SIGTERM first, and then SIGKILL every 5 minutes after.  It
should never kill outright.
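
As a rough illustration of that escalation (a sketch only, not sbuild's
actual code; the function name, the kill_interval parameter and the
max_kills cap are invented for the example):

```python
import signal
import subprocess


def terminate_with_escalation(proc, kill_interval=300.0, max_kills=3):
    """Ask proc to exit with SIGTERM, then send SIGKILL once per interval.

    proc is a subprocess.Popen object.  Returns the list of signals sent.
    kill_interval defaults to 5 minutes; max_kills is a hypothetical cap
    so the sketch cannot loop forever.
    """
    sent = [signal.SIGTERM]
    proc.send_signal(signal.SIGTERM)  # polite request first, never SIGKILL
    for _ in range(max_kills):
        try:
            proc.wait(timeout=kill_interval)
            return sent  # the child exited; nothing more to send
        except subprocess.TimeoutExpired:
            proc.send_signal(signal.SIGKILL)
            sent.append(signal.SIGKILL)
    try:
        # Give the last SIGKILL one more interval to take effect.
        proc.wait(timeout=kill_interval)
    except subprocess.TimeoutExpired:
        pass  # still not reaped (e.g. stuck in the kernel); give up here
    return sent
```

A child that exits on SIGTERM is never sent SIGKILL, so it gets the
chance to clean up its own IPC resources.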

Note that I did find a bug in this code, which is fixed in the
packaged version of sbuild.  The wanna-build sbuild places some child
processes (including dpkg-buildpackage) in a separate process group by
calling setpgid(2).  This can make the build hang because, if the
child attempts any terminal I/O, it gets sent SIGTTIN or SIGTTOU,
which stop the process by default.  sbuild doesn't handle SIGTTIN or
SIGTTOU (since it's not a shell), and the build hangs until the
timeout.
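
If you want to check for this, a supervising process can ask waitpid(2)
to report stopped children as well as exited ones.  A minimal sketch,
assuming the caller is the parent of the process in question (the
function name is hypothetical, and this is not something sbuild does
as far as I know):

```python
import os
import signal


def poll_stop_signal(pid):
    """Return the signal that stopped child pid, or None.

    Uses WUNTRACED so that stops are reported, and WNOHANG so the call
    never blocks.  Each stop is reported only once.  Note that if the
    child has already exited, this call reaps it and returns None.
    """
    waited_pid, status = os.waitpid(pid, os.WUNTRACED | os.WNOHANG)
    if waited_pid == pid and os.WIFSTOPPED(status):
        return signal.Signals(os.WSTOPSIG(status))
    return None
```

A result of SIGTTIN or SIGTTOU would point at the process group
problem described above.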

This might not be occurring in your case, but it might be worth
checking.  I removed the process group stuff in buildd-tools sbuild,
and it hasn't caused any problems so far.

Where did the kill usage message come from?  sbuild never calls
/bin/kill; it uses kill(2).

> A brief look at sbuild's source reveals that it seems to be willing to
> do a kill("KILL") if a couple of timeouts occur, so I'm assuming that's
> what happened and removing the dead message queues.

I'm not sure that this is the root cause.  Assuming the timeout was
exceeded, why did the build not terminate cleanly on receiving
SIGTERM?  That might be the real problem.

> A question for the buildd-utils folks:
>
>   I realise this is difficult, since message queues are not tied to
>   processes, but would it be possible to notice that this has happened,
>   and tidy up the dead queues after the SIGKILL was sent?

This looks rather dangerous.  AFAICT, there's no reliable way to
determine whether a queue is dead or which process was using it, so
active queues could be removed.
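
To show why, here is a sketch of the closest heuristic I can think of
on Linux.  /proc/sysvipc/msg lists each queue with the pids of the
last sender (lspid) and last receiver (lrpid), and nothing more about
its users.  The function names are hypothetical, and the "stale" test
is only a guess: a queue that has just been created shows 0 for both
pids, and pids get reused.

```python
import os


def parse_sysvipc_msg(text):
    """Parse the table format of /proc/sysvipc/msg into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    columns = lines[0].split()  # header row names the columns
    queues = []
    for line in lines[1:]:
        row = {}
        for name, value in zip(columns, line.split()):
            # perms is printed in octal; every other column is decimal.
            row[name] = int(value, 8) if name == "perms" else int(value)
        queues.append(row)
    return queues


def pid_exists(pid):
    """True if a process with this pid currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence check
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def maybe_stale(queues, uid, alive=pid_exists):
    """Return msqids owned by uid whose last sender and receiver are gone.

    Heuristic only: it also flags queues that are new and not yet used.
    """
    return [q["msqid"] for q in queues
            if q["uid"] == uid
            and not alive(q["lspid"]) and not alive(q["lrpid"])]
```

It would be fed the contents of /proc/sysvipc/msg and the buildd uid.
Anything it returns is a candidate for a human to look at, not
something safe to remove automatically.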

If this is the only solution, perhaps it would be better for fakeroot
to move to an anonymous POSIX mqueue (one that is unlinked from the
filesystem once it has been opened, so it gets cleaned up
automatically on process termination).


Regards,
Roger

-- 
Roger Leigh
                Printing on GNU/Linux?  http://gutenprint.sourceforge.net/
                Debian GNU/Linux        http://www.debian.org/
                GPG Public Key: 0x25BFB848.  Please sign and encrypt your mail.