[Buildd-tools-devel] Logging issues with sbuild

Roger Leigh rleigh at whinlatter.ukfsn.org
Tue Mar 14 23:06:57 UTC 2006


Hi folks,

There are a number of odd issues with the sbuild log files, which I'd
like to discuss.

1) sbuild is supposed to maintain two separate logfiles.

   (This may only affect the buildd-tools fork of sbuild; I haven't
   checked upstream for this issue.  It's certainly present in sbuild
   0.42 in unstable)

   1) The build log, LOG, in build-$date.log ($main::main_logfile)
      See open_log() and close_log().

   2) The package log, PLOG, in $conf::log_dir/${pkg}_$date
      ($main::pkg_logfile)
      See open_pkg_log() and close_pkg_log().

   If you check, you will see that both logs for a given build are
   /identical/, but looking at the code, PLOG does not forward to LOG
   in any way, so how come build.log contains the package log?

   Ideally, the build log should only contain errors (AFAICT), not the
   actual package log.  I can't see anything in the code to explain
   why both logs should the same (why does PLOG also forward to LOG?),
   so any assistance here would be appreciated.

2) There are race conditions when opening the build log.  The log file
   is only unique to the minute, so if multiple builds are going on,
   the same log file may be opened.  This truncates the log, and wipes
   the original log, possibly leaving "holes" in the file depending on
   the timing:

   - build1 opens and truncates build-xxxxx.log
   - build1 writes some logs
   - build2 opens and truncates build-xxxxx.log
     (build1's log is wiped)
   - build1 writes some more logs at the old seek offset, creating a
     logfile with a hole in
   - build2 may or may not overwrite the build1 log and its hole

   This causes a mess.

   Solutions include:
   i)   Going to second resolution for the date part
   ii)  Adding .1, .2, ... .n suffixes
   iii)  Adding the distribution (build-$distribution-$date)
   These are all racy to different degrees.
   Another solution would be to
   iv) create the log with mktemp or equivalent, to ensure uniqueness.

   I personally favour a combination of iii and iv, but I don't want
   to break any existing tools which might rely on the naming scheme
   of the current build-$date.log format.  This would give a format
   like:
      build-$distribution-$date-XXXXXX.log
   (where XXXXXX is the random mktemp template).

   When running with chroot_mode=split (sudo), the chance of this
   happening is not too high.  It only happens if running concurrent
   builds for multiple suites, since each chroot only has a single
   sbuild running at once.  However, when running with
   chroot_mode=schroot with a session-capable chroot type such as
   lvm-snapshot, the system can support as many concurrent sbuilds as
   you have free disk space, and the chances are considerably higher.
   I'd like to fix both use cases.

3) When running in verbose mode, tee is used to pipe to log to the
   logfile and the tty.  Prior to closing LOG and PLOG, the tee
   process is killed with SIGTERM.  However, this is racy, leading to
   intermittent dataloss in the pipeline (the last few lines of the
   log are missing roughly 1 time in 3).  In the current CVS (and 0.42
   release), this is worked around by flushing the pipeline and
   sleeping for a second, but this is inefficient and still racy
   (albeit much less so):

@@ -2596,7 +2596,11 @@
 sub close_log {
 	my $date = strftime("%Y%m%d-%H%M",localtime);
 
-	kill( 15, $main::tee_pid ) if $main::verbose;
+	if ($main::verbose) {
+		LOG->flush();
+		sleep 1;
+		kill( "TERM", $main::tee_pid );
+	}
 	close( LOG );
 	if (!$main::nolog && !$main::verbose &&
 		-s $main::main_logfile && $conf::mailto) {
@@ -2644,7 +2648,7 @@
 	}
 	select( (select( PLOG ), $| = 1)[0] );
 
-	my $revision = '$Revision: 1.87 $';
+	my $revision = '$Revision: 1.88 $';
 	$revision =~ /([\d.]+)/;
 	$revision = $1;
 
@@ -2669,7 +2673,11 @@
 	print PLOG "*"x78, "\n";
 	printf PLOG "Finished at ${date}\nBuild needed %02d:%02d:%02d, %dk disk space\n",
 		   int($t/3600), int(($t%3600)/60), int($t%60), $main::this_space;
-	kill( 15, $main::pkg_tee_pid ) if $main::verbose && !$main::nolog;
+	if ($main::verbose) {
+		PLOG->flush();
+		sleep 1;
+		kill( "TERM", $main::pkg_tee_pid );
+	}
 	close( PLOG );
 	open( PLOG, ">&LOG" ) or warn "Can't redirect PLOG\n";
 	send_mail( $conf::mailto,

   According to Brendan O'Dea, the kill("TERM", $pid) is unnecessary:
   close(LOG) should result in tee exiting normally.  This is observed
   for PLOG, but not for LOG, which leaves the tee process hanging
   around.  I'm uncertain why the two differ (may be related to
   problem (1)).

   To solve this, I'd like to propose
   - Getting rid of the kill, so close works properly and is free of
     dataloss-causing racing.
   - Possibly eliminating tee by handling the multiplexing in perl
     directly.


Regards,
Roger

-- 
Roger Leigh
                Printing on GNU/Linux?  http://gutenprint.sourceforge.net/
                Debian GNU/Linux        http://www.debian.org/
                GPG Public Key: 0x25BFB848.  Please sign and encrypt your mail.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 190 bytes
Desc: not available
Url : http://lists.alioth.debian.org/pipermail/buildd-tools-devel/attachments/20060314/2bda614e/attachment.pgp


More information about the Buildd-tools-devel mailing list