[pkg-s48-maint] Bug#450948: scsh-0.6: Logic error allows infinite block in event.c:queue_ready_ports()
Derek Upham
sand at blarg.net
Mon Nov 12 16:00:56 UTC 2007
Package: scsh-0.6
Version: 0.6.7-4
Severity: normal
When using scsh-0.6.7 with the SUnet web server, the VM regularly
hangs. strace shows that the VM (either the parent process, or a
child process spawned to handle a request) is blocked in the select()
system call:
    select(0, [], [], [], NULL)
If the final parameter were a real struct timeval instead of NULL,
then select() would block for that duration and return. But with a
NULL struct timeval and empty fdset arguments, this just blocks
forever (or until a signal interrupts it, I suppose).
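To illustrate the difference the timeout makes, here is a minimal sketch. It is not code from scsh; the helper name select_empty_with_timeout is made up for this example. It issues the same kind of call strace shows (nfds of 0, three empty fd sets), but with a non-NULL struct timeval, so it should simply act as a short sleep and return 0, assuming no signal arrives in the meantime.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper, not part of scsh: call select() with empty fd
   sets and a finite timeout given in milliseconds.  Returns whatever
   select() returns (0 on timeout, -1 on error such as EINTR). */
int select_empty_with_timeout(long ms)
{
    fd_set reads, writes, alls;
    struct timeval tv;

    FD_ZERO(&reads);
    FD_ZERO(&writes);
    FD_ZERO(&alls);

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;

    /* Same shape as the hanging call, except the last argument is a
       real timeout instead of NULL, so the call cannot block forever. */
    return select(0, &reads, &writes, &alls, &tv);
}
```

Passing NULL instead of &tv in the same call is what removes the upper bound on the wait.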
Looking at the code, the only select() call that seems to allow this
combination of parameters is in queue_ready_ports() in event.c. A
combination of 'wait' being true, 'seconds' being -1, and the global
'pending' variable being empty would trigger it:
    if ((! wait)
        && (pending.first == NULL))
      return (NO_ERRORS);
    FD_ZERO(&reads);
    FD_ZERO(&writes);
    FD_ZERO(&alls);
    limfd = 0;
    for (fdp = pending.first; fdp != NULL; fdp = fdp->next) {
      FD_SET(fdp->fd, fdp->is_input ? &reads : &writes);
      FD_SET(fdp->fd, &alls);
      if (limfd <= fdp->fd)
        limfd = fdp->fd + 1;
    }
    tvp = &tv;
    if (wait)
      if (seconds == -1){
        tvp = NULL;
      }
      else {
        tv.tv_sec = seconds;
        tv.tv_usec = ticks * (1000000 / TICKS_PER_SECOND);
      }
    else
      timerclear(&tv);
'wait' is true so we skip the 'return (NO_ERRORS)'. 'pending.first' is
NULL, so we don't hit any of the FD_SET calls. And 'seconds' is -1 so
we set 'tvp' to NULL. Then this call
    /* time gap */
    left = select(limfd, &reads, &writes, &alls, tvp);
has 0, [], [], [] and NULL.
It looks like this is the same problem that causes the SUnet site's
own web-server to lock up periodically.
I'm not sure what the correct solution is for this bug. We could put
a special check in at the top, and return immediately for that
combination of parameters. Or this may be a "should never happen"
combination, pointing to a real issue elsewhere: it seems like the
real problem is the lack of 'pending' ports.
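As a rough sketch of the first option (the special check at the top), the condition could be factored out as below. This is only an illustration under assumptions: the struct name fd_entry and the function name would_block_forever are hypothetical, the struct fields are just the ones visible in the excerpt above, and I have not tested this against the real event.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for an entry in the VM's 'pending' list; only
   the fields used in the excerpt from event.c are shown. */
struct fd_entry {
    int fd;
    int is_input;
    struct fd_entry *next;
};

/* Returns 1 for the combination that leads to
   select(0, [], [], [], NULL): the caller wants to wait, there is no
   timeout (seconds == -1), and no ports are pending.  Returns 0
   otherwise. */
int would_block_forever(int wait, long seconds,
                        const struct fd_entry *first)
{
    return wait && seconds == -1 && first == NULL;
}
```

In queue_ready_ports() this would amount to something like "if (would_block_forever(wait, seconds, pending.first)) return (NO_ERRORS);" near the top. Whether returning NO_ERRORS is the right response there, rather than reporting an error, is exactly the open question.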
It may be worth checking the latest Scheme48 sources for the same bug,
as well.
Derek
-- System Information:
Debian Release: lenny/sid
APT prefers oldstable
APT policy: (500, 'oldstable'), (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.22 (PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages scsh-0.6 depends on:
ii  libc6            2.6.1-6  GNU C Library: Shared libraries
ii  libelfg0         0.8.6-4  an ELF object file access library
ii  scsh-common-0.6  0.6.7-4  A `scheme' interpreter designed fo
scsh-0.6 recommends no packages.
-- no debconf information